Posts

SQL handouts

https://drive.google.com/file/d/1vyDyIKJP_83tlHc7uYOnVj0zS1C7HEEn/view?usp=sharing
CSV Dataset: https://drive.google.com/file/d/1EnaYj5VCTLIn9dVAOsIFlhNt9V1Vww_Z/view?usp=sharing

1. Basic SELECT and WHERE
   Show all columns for orders from the North region.
   List Order_ID and Profit for orders where Profit is greater than 500.
   Retrieve all orders for Product_ID = 'P02'.
2. Calculations and Aliases
   Select Order_ID, Units_Sold, Unit_Price, and calculate Total_Amount as Units_Sold * Unit_Price.
   Find Revenue minus Cost for each order and show it as Calculated_Profit. Check whether it matches the Profit column.
3. Filtering by Date and Numeric Ranges
   Select all orders placed on or after '2025-02-01'.
   Get orders where Units_Sold is between 2 and 7 (inclusive).
4. GROUP BY and Aggregation
   Find total Revenue per Region.
   Find total Profit per Salesperson_ID.
   Find total Units_Sold for each Product_ID.
5. HAVING and Aggregated Filters
   Show Region and total Profit, but only for regions wh...
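As a rough sketch of how a few of the exercises in sections 1 to 4 might be approached, the snippet below runs them against an in-memory SQLite table from Python. The table name `orders`, the `Order_Date` column name, and the four sample rows are made up for illustration and are not taken from the CSV dataset; only the other column names come from the exercise text.

```python
import sqlite3

# Hypothetical sample rows for illustration only; NOT the real CSV data.
# Column order: Order_ID, Order_Date, Region, Product_ID, Salesperson_ID,
#               Units_Sold, Unit_Price, Revenue, Cost, Profit
SAMPLE_ROWS = [
    ("O1", "2025-01-15", "North", "P01", "S1", 3, 400.0, 1200.0, 600.0, 600.0),
    ("O2", "2025-02-03", "South", "P02", "S2", 8, 50.0, 400.0, 300.0, 100.0),
    ("O3", "2025-02-10", "North", "P02", "S1", 5, 100.0, 500.0, 350.0, 150.0),
    ("O4", "2025-03-01", "East", "P03", "S3", 2, 900.0, 1800.0, 1000.0, 800.0),
]

conn = sqlite3.connect(":memory:")
# "orders" and "Order_Date" are assumed names; the handout does not state them.
conn.execute(
    """CREATE TABLE orders (
           Order_ID TEXT, Order_Date TEXT, Region TEXT, Product_ID TEXT,
           Salesperson_ID TEXT, Units_Sold INTEGER, Unit_Price REAL,
           Revenue REAL, Cost REAL, Profit REAL)"""
)
conn.executemany("INSERT INTO orders VALUES (?,?,?,?,?,?,?,?,?,?)", SAMPLE_ROWS)

# 1. Basic SELECT and WHERE: orders with Profit greater than 500.
high_profit = conn.execute(
    "SELECT Order_ID, Profit FROM orders WHERE Profit > 500 ORDER BY Order_ID"
).fetchall()

# 2. Calculations and aliases: compare Revenue - Cost with the Profit column.
calculated = conn.execute(
    "SELECT Order_ID, Revenue - Cost AS Calculated_Profit, Profit "
    "FROM orders ORDER BY Order_ID"
).fetchall()

# 3. Date and numeric ranges. ISO-formatted dates stored as text compare
#    correctly as strings; BETWEEN includes both endpoints.
recent = conn.execute(
    "SELECT Order_ID FROM orders WHERE Order_Date >= '2025-02-01' ORDER BY Order_ID"
).fetchall()
mid_units = conn.execute(
    "SELECT Order_ID FROM orders WHERE Units_Sold BETWEEN 2 AND 7 ORDER BY Order_ID"
).fetchall()

# 4. GROUP BY and aggregation: total Revenue per Region.
revenue_by_region = conn.execute(
    "SELECT Region, SUM(Revenue) AS Total_Revenue "
    "FROM orders GROUP BY Region ORDER BY Region"
).fetchall()

for row in revenue_by_region:
    print(row)
```

To try the same queries on the actual dataset, the CSV would first need to be loaded into a table, and the assumed table and date column names would need to be replaced with the real ones.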

Practice dataset for Python

 https://docs.google.com/spreadsheets/d/1N5teiOf8TCDzlGFlhmhUXTQKKs_XDoDH/edit?usp=sharing&ouid=105173447805752617719&rtpof=true&sd=true

Time series dataset for practice in R

library(tibble)  # tibble() comes from the tibble package (also loaded by tidyverse)

set.seed(123)  # fixed seed so the noise term is reproducible
time_series_data <- tibble(
  date = seq.Date(from = as.Date("2015-01-01"),
                  to   = as.Date("2023-12-01"),
                  by   = "month"),
  sales = round(
    500 + 20 * seq_along(date) +                  # Trend
      200 * sin(2 * pi * seq_along(date) / 12) +  # Seasonality
      rnorm(length(date), sd = 80),               # Noise
    0
  )
)
print(time_series_data)

Cleaning practice dataset for R

 https://docs.google.com/spreadsheets/d/1AAhhE6VPJ1fnoz7sUJ4QhWEXcLj2c0gx/edit?usp=sharing&ouid=105173447805752617719&rtpof=true&sd=true

# Simple business dataset with 4 columns and 7 rows

# Simple business dataset with 4 columns and 7 rows
df <- data.frame(
  Customer_ID = c("C001","C002","C003","C004","C005","C006","C007"),
  Product     = c("laptop","Phone","tablet","Laptop","phone","Monitor",""),
  Sales       = c("1200","","950","N/A","700","1100","800"),  # note: character on purpose
  Region      = c("North","east","South","west","","North","East")
)

Star schema dataset

Flat schema: https://docs.google.com/spreadsheets/d/1H_T9kETtO_2Je0AYdgLoywKayCnwaF5_/edit?usp=sharing&ouid=105173447805752617719&rtpof=true&sd=true
Star schema: https://docs.google.com/spreadsheets/d/1dbBY-A-hH4KaoNlDI0r8vZYpYLYwVZff/edit?usp=sharing&ouid=105173447805752617719&rtpof=true&sd=true

Practice dataset for R

 https://docs.google.com/spreadsheets/d/13OIM_hK_KM8kVSmwn7-TdQxkMKnFtYwZ/edit?usp=sharing&ouid=105173447805752617719&rtpof=true&sd=true