Posts

Showing posts from November, 2025

Practice dataset for Python

 https://docs.google.com/spreadsheets/d/1N5teiOf8TCDzlGFlhmhUXTQKKs_XDoDH/edit?usp=sharing&ouid=105173447805752617719&rtpof=true&sd=true

Time series dataset for practice in R

library(tibble)  # provides tibble()

set.seed(123)  # make the random noise reproducible

# Monthly sales from January 2015 to December 2023
time_series_data <- tibble(
  date = seq.Date(from = as.Date("2015-01-01"),
                  to   = as.Date("2023-12-01"),
                  by   = "month"),
  sales = round(
    500 + 20 * seq_along(date) +                 # Trend
      200 * sin(2 * pi * seq_along(date) / 12) + # Seasonality
      rnorm(length(date), sd = 80),              # Noise
    0
  )
)

print(time_series_data)
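One way to practice with this series is to split it back into the trend, seasonal, and noise parts it was built from. The following is a minimal sketch in base R, assuming an additive model and a 12-month cycle; the names `sales_ts` and `parts` are illustrative and not part of the original post.

```r
# Rebuild the same monthly series with base R only (no tibble needed)
set.seed(123)
dates <- seq.Date(from = as.Date("2015-01-01"),
                  to   = as.Date("2023-12-01"),
                  by   = "month")
idx   <- seq_along(dates)
sales <- round(500 + 20 * idx +
                 200 * sin(2 * pi * idx / 12) +
                 rnorm(length(dates), sd = 80), 0)

# Wrap the values in a monthly ts object starting in January 2015
sales_ts <- ts(sales, start = c(2015, 1), frequency = 12)

# Classical additive decomposition into trend, seasonal, and random parts
parts <- decompose(sales_ts, type = "additive")

# Seasonal effect estimated for each of the 12 calendar months
print(round(parts$seasonal[1:12], 1))
```

Calling `plot(parts)` afterwards draws the observed series together with the three estimated components.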

Cleaning practice dataset for R

 https://docs.google.com/spreadsheets/d/1AAhhE6VPJ1fnoz7sUJ4QhWEXcLj2c0gx/edit?usp=sharing&ouid=105173447805752617719&rtpof=true&sd=true

# Simple business dataset with 4 columns and 7 rows

# Simple business dataset with 4 columns and 7 rows
df <- data.frame(
  Customer_ID = c("C001","C002","C003","C004","C005","C006","C007"),
  Product     = c("laptop","Phone","tablet","Laptop","phone","Monitor",""),
  Sales       = c("1200","","950","N/A","700","1100","800"),  # note: character on purpose
  Region      = c("North","east","South","west","","North","East")
)
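The dataset above is messy on purpose: mixed capitalization, empty strings, an "N/A" placeholder, and sales stored as text. A minimal cleaning sketch in base R might look like the following. It assumes that both empty strings and "N/A" should count as missing and that labels should be capitalized on the first letter only; the helper `to_title` is hypothetical and not from the original post.

```r
# Same practice data as in the post
df <- data.frame(
  Customer_ID = c("C001","C002","C003","C004","C005","C006","C007"),
  Product     = c("laptop","Phone","tablet","Laptop","phone","Monitor",""),
  Sales       = c("1200","","950","N/A","700","1100","800"),
  Region      = c("North","east","South","west","","North","East"),
  stringsAsFactors = FALSE
)

# Assumption: empty strings and "N/A" both mean "missing"
df[df == "" | df == "N/A"] <- NA

# Hypothetical helper: upper-case the first letter, lower-case the rest
to_title <- function(x) {
  ifelse(is.na(x), NA_character_,
         paste0(toupper(substr(x, 1, 1)), tolower(substring(x, 2))))
}

df$Product <- to_title(df$Product)
df$Region  <- to_title(df$Region)

# Sales was stored as character; convert it now that placeholders are NA
df$Sales <- as.numeric(df$Sales)

str(df)
print(mean(df$Sales, na.rm = TRUE))  # average over the non-missing sales
```

Replacing the placeholders with `NA` before calling `as.numeric()` avoids the coercion warning that "N/A" would otherwise trigger.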

Star schema dataset

Flat schema: https://docs.google.com/spreadsheets/d/1H_T9kETtO_2Je0AYdgLoywKayCnwaF5_/edit?usp=sharing&ouid=105173447805752617719&rtpof=true&sd=true

Star schema: https://docs.google.com/spreadsheets/d/1dbBY-A-hH4KaoNlDI0r8vZYpYLYwVZff/edit?usp=sharing&ouid=105173447805752617719&rtpof=true&sd=true