Please go through this top 5 SQL projects with Datasets that you can practice and can add in your resume
π1. Web Analytics:
(https://www.kaggle.com/zynicide/wine-reviews)
π2. Healthcare Data Analysis:
(https://www.kaggle.com/cdc/mortality)
π3. E-commerce Analysis:
(https://www.kaggle.com/olistbr/brazilian-ecommerce)
π4. Inventory Management:
(https://www.kaggle.com/code/govindji/inventory-management)
π 5. Analysis of Sales Data:
(https://www.kaggle.com/kyanyoga/sample-sales-data)
Small suggestion from my side for non tech students: kindly pick those datasets which you like the subject in general, that way you will be more excited to practice it, instead of just doing it for the sake of resume, you will learn SQL more passionately, since itβs a programming language try to make it more exciting for yourself.
Hope this piece of information helps you
Join for more -> https://t.iss.one/addlist/4q2PYC0pH_VjZDk5
ENJOY LEARNING ππ
π1. Web Analytics:
(https://www.kaggle.com/zynicide/wine-reviews)
π2. Healthcare Data Analysis:
(https://www.kaggle.com/cdc/mortality)
π3. E-commerce Analysis:
(https://www.kaggle.com/olistbr/brazilian-ecommerce)
π4. Inventory Management:
(https://www.kaggle.com/code/govindji/inventory-management)
π 5. Analysis of Sales Data:
(https://www.kaggle.com/kyanyoga/sample-sales-data)
Small suggestion from my side for non tech students: kindly pick those datasets which you like the subject in general, that way you will be more excited to practice it, instead of just doing it for the sake of resume, you will learn SQL more passionately, since itβs a programming language try to make it more exciting for yourself.
Hope this piece of information helps you
Join for more -> https://t.iss.one/addlist/4q2PYC0pH_VjZDk5
ENJOY LEARNING ππ
π2β€1
Today, let's move to the next topic of SQL Roadmap:
β SQL JOINS
What a JOIN is
β’ A JOIN combines data from two or more tables
β’ Tables connect using a common column
β’ That column is usually an ID
β’ JOIN answers questions one table cannot answer
Why JOINs exist
β’ Customer details sit in one table
β’ Orders sit in another table
β’ JOIN links customers to their orders
Example tables
customers
customer_id | name | city
orders
order_id | customer_id | amount
Connection
β’ customers.customer_id is primary key
β’ orders.customer_id is foreign key
β’ This shared column enables JOIN
Types of JOINs you must know
β’ INNER JOIN
β’ LEFT JOIN
β’ RIGHT JOIN
β’ FULL JOIN
β’ SELF JOIN
INNER JOIN
β’ Returns only matching rows from both tables
β’ Drops anything without a match
β’ Matches customers with their orders
β’ Shows only customers who placed orders
β’ Removes customers with no orders
β’ Removes orders without customers
LEFT JOIN
β’ Returns all rows from left table
β’ Matches data from right table
β’ Shows NULL when no match exists
β’ Returns every customer
β’ Shows order amount if available
β’ Shows NULL if customer never ordered
RIGHT JOIN
β’ Returns all rows from right table
β’ Matches data from left table
β’ Opposite of LEFT JOIN
β’ Returns all orders
β’ Shows customer name if exists
β’ Shows NULL for missing customer data
FULL JOIN
β’ Returns all rows from both tables
β’ Matches where possible
β’ Shows NULL when no match
β’ Shows all customers
β’ Shows all orders
β’ Includes unmatched data from both sides
SELF JOIN
β’ Table joins with itself
β’ Used for hierarchy or comparison
β’ Matches employee with manager
β’ Uses same table twice
β’ Shows reporting hierarchy
JOIN Comparison Summary
β’ INNER JOIN: Only matching data
β’ LEFT JOIN: All left table rows
β’ RIGHT JOIN: All right table rows
β’ FULL JOIN: Everything from both tables
β’ SELF JOIN: Table joins itself
Double Tap β₯οΈ For More
β SQL JOINS
What a JOIN is
β’ A JOIN combines data from two or more tables
β’ Tables connect using a common column
β’ That column is usually an ID
β’ JOIN answers questions one table cannot answer
Why JOINs exist
β’ Customer details sit in one table
β’ Orders sit in another table
β’ JOIN links customers to their orders
Example tables
customers
customer_id | name | city
orders
order_id | customer_id | amount
Connection
β’ customers.customer_id is primary key
β’ orders.customer_id is foreign key
β’ This shared column enables JOIN
Types of JOINs you must know
β’ INNER JOIN
β’ LEFT JOIN
β’ RIGHT JOIN
β’ FULL JOIN
β’ SELF JOIN
INNER JOIN
β’ Returns only matching rows from both tables
β’ Drops anything without a match
SELECT name, amount
FROM customers
INNER JOIN orders
ON customers.customer_id = orders.customer_id;
β’ Matches customers with their orders
β’ Shows only customers who placed orders
β’ Removes customers with no orders
β’ Removes orders without customers
LEFT JOIN
β’ Returns all rows from left table
β’ Matches data from right table
β’ Shows NULL when no match exists
SELECT name, amount
FROM customers
LEFT JOIN orders
ON customers.customer_id = orders.customer_id;
β’ Returns every customer
β’ Shows order amount if available
β’ Shows NULL if customer never ordered
RIGHT JOIN
β’ Returns all rows from right table
β’ Matches data from left table
β’ Opposite of LEFT JOIN
SELECT name, amount
FROM customers
RIGHT JOIN orders
ON customers.customer_id = orders.customer_id;
β’ Returns all orders
β’ Shows customer name if exists
β’ Shows NULL for missing customer data
FULL JOIN
β’ Returns all rows from both tables
β’ Matches where possible
β’ Shows NULL when no match
SELECT name, amount
FROM customers
FULL JOIN orders
ON customers.customer_id = orders.customer_id;
β’ Shows all customers
β’ Shows all orders
β’ Includes unmatched data from both sides
SELF JOIN
β’ Table joins with itself
β’ Used for hierarchy or comparison
SELECT e.name, m.name AS manager_name
FROM employees e
LEFT JOIN employees m
ON e.manager_id = m.employee_id;
β’ Matches employee with manager
β’ Uses same table twice
β’ Shows reporting hierarchy
JOIN Comparison Summary
β’ INNER JOIN: Only matching data
β’ LEFT JOIN: All left table rows
β’ RIGHT JOIN: All right table rows
β’ FULL JOIN: Everything from both tables
β’ SELF JOIN: Table joins itself
Double Tap β₯οΈ For More
β€8
If you want to Excel at using the most used database language in the world, learn these powerful SQL features:
β’ Wildcards (%, _) β Flexible pattern matching
β’ Window Functions β ROW_NUMBER(), RANK(), DENSE_RANK(), LEAD(), LAG()
β’ Common Table Expressions (CTEs) β WITH for better readability
β’ Recursive Queries β Handle hierarchical data
β’ STRING Functions β LEFT(), RIGHT(), LEN(), TRIM(), UPPER(), LOWER()
β’ Date Functions β DATEDIFF(), DATEADD(), FORMAT()
β’ Pivot & Unpivot β Transform row data into columns
β’ Aggregate Functions β SUM(), AVG(), COUNT(), MIN(), MAX()
β’ Joins & Self Joins β Master INNER, LEFT, RIGHT, FULL, SELF JOIN
β’ Indexing β Speed up queries with CREATE INDEX
Like it if you need a complete tutorial on all these topics! πβ€οΈ
#sql
β’ Wildcards (%, _) β Flexible pattern matching
β’ Window Functions β ROW_NUMBER(), RANK(), DENSE_RANK(), LEAD(), LAG()
β’ Common Table Expressions (CTEs) β WITH for better readability
β’ Recursive Queries β Handle hierarchical data
β’ STRING Functions β LEFT(), RIGHT(), LEN(), TRIM(), UPPER(), LOWER()
β’ Date Functions β DATEDIFF(), DATEADD(), FORMAT()
β’ Pivot & Unpivot β Transform row data into columns
β’ Aggregate Functions β SUM(), AVG(), COUNT(), MIN(), MAX()
β’ Joins & Self Joins β Master INNER, LEFT, RIGHT, FULL, SELF JOIN
β’ Indexing β Speed up queries with CREATE INDEX
Like it if you need a complete tutorial on all these topics! πβ€οΈ
#sql
π6β€4
Useful WhatsApp Channels to Boost Your Career in 2026
ChatGPT: https://whatsapp.com/channel/0029VapThS265yDAfwe97c23
Artificial Intelligence: https://whatsapp.com/channel/0029Va4QUHa6rsQjhITHK82y
Web Development: https://whatsapp.com/channel/0029VaiSdWu4NVis9yNEE72z
Stock Marketing: https://whatsapp.com/channel/0029VatOdpD2f3EPbBlLYW0h
Finance: https://whatsapp.com/channel/0029Vax0HTt7Noa40kNI2B1P
Marketing: https://whatsapp.com/channel/0029VbB4goz6rsR1YtmiFV3f
Crypto: https://whatsapp.com/channel/0029Vb3H903DOQIUyaFTuw3P
Generative AI: https://whatsapp.com/channel/0029VazaRBY2UPBNj1aCrN0U
Sales: https://whatsapp.com/channel/0029VbC3NVX4dTnEv8IYCs3U
Digital Marketing: https://whatsapp.com/channel/0029VbAuBjwLSmbjUbItjM1t
Data Engineering: https://whatsapp.com/channel/0029Vaovs0ZKbYMKXvKRYi3C
Data Science: https://whatsapp.com/channel/0029Va8v3eo1NCrQfGMseL2D
UI/UX Design: https://whatsapp.com/channel/0029Vb5dho06LwHmgMLYci1P
Project Management: https://whatsapp.com/channel/0029Vb6QIAUJUM2SwC03jn2W
Entrepreneurs: https://whatsapp.com/channel/0029Vb2N3YA2phHJfsMrHZ0b
Content Creation: https://whatsapp.com/channel/0029VbC7n5FLo4hdy90kVx34
Freelancers: https://whatsapp.com/channel/0029Vb1U4wG9sBI22PXhSy0r
AI Tools: https://whatsapp.com/channel/0029VaojSv9LCoX0gBZUxX3B
Data Analysts: https://whatsapp.com/channel/0029VaGgzAk72WTmQFERKh02
Jobs: https://whatsapp.com/channel/0029VaI5CV93AzNUiZ5Tt226
Science Facts: https://whatsapp.com/channel/0029Vb5m9UR6xCSQo1YXTA0O
Psychology: https://whatsapp.com/channel/0029Vb62WgKG8l5KlJpcIe2r
Prompt Engineering: https://whatsapp.com/channel/0029Vb6ISO1Fsn0kEemhE03b
Coding: https://whatsapp.com/channel/0029VamhFMt7j6fx4bYsX908
Double Tap β₯οΈ For More
ChatGPT: https://whatsapp.com/channel/0029VapThS265yDAfwe97c23
Artificial Intelligence: https://whatsapp.com/channel/0029Va4QUHa6rsQjhITHK82y
Web Development: https://whatsapp.com/channel/0029VaiSdWu4NVis9yNEE72z
Stock Marketing: https://whatsapp.com/channel/0029VatOdpD2f3EPbBlLYW0h
Finance: https://whatsapp.com/channel/0029Vax0HTt7Noa40kNI2B1P
Marketing: https://whatsapp.com/channel/0029VbB4goz6rsR1YtmiFV3f
Crypto: https://whatsapp.com/channel/0029Vb3H903DOQIUyaFTuw3P
Generative AI: https://whatsapp.com/channel/0029VazaRBY2UPBNj1aCrN0U
Sales: https://whatsapp.com/channel/0029VbC3NVX4dTnEv8IYCs3U
Digital Marketing: https://whatsapp.com/channel/0029VbAuBjwLSmbjUbItjM1t
Data Engineering: https://whatsapp.com/channel/0029Vaovs0ZKbYMKXvKRYi3C
Data Science: https://whatsapp.com/channel/0029Va8v3eo1NCrQfGMseL2D
UI/UX Design: https://whatsapp.com/channel/0029Vb5dho06LwHmgMLYci1P
Project Management: https://whatsapp.com/channel/0029Vb6QIAUJUM2SwC03jn2W
Entrepreneurs: https://whatsapp.com/channel/0029Vb2N3YA2phHJfsMrHZ0b
Content Creation: https://whatsapp.com/channel/0029VbC7n5FLo4hdy90kVx34
Freelancers: https://whatsapp.com/channel/0029Vb1U4wG9sBI22PXhSy0r
AI Tools: https://whatsapp.com/channel/0029VaojSv9LCoX0gBZUxX3B
Data Analysts: https://whatsapp.com/channel/0029VaGgzAk72WTmQFERKh02
Jobs: https://whatsapp.com/channel/0029VaI5CV93AzNUiZ5Tt226
Science Facts: https://whatsapp.com/channel/0029Vb5m9UR6xCSQo1YXTA0O
Psychology: https://whatsapp.com/channel/0029Vb62WgKG8l5KlJpcIe2r
Prompt Engineering: https://whatsapp.com/channel/0029Vb6ISO1Fsn0kEemhE03b
Coding: https://whatsapp.com/channel/0029VamhFMt7j6fx4bYsX908
Double Tap β₯οΈ For More
β€7π2
β
SQL JOINS β Scenario-Based Interview Questions with Answers
Scenario 1: Find customers who have never placed an order
Tables: customers(customer_id, name) orders(order_id, customer_id);
Question: Business wants a list of customers with zero orders.
Answer:
SELECT c.customer_id, c.name FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id WHERE o.customer_id IS NULL;
Why this works
β’ LEFT JOIN keeps all customers
β’ Orders missing β NULL
β’ WHERE filters only non-ordering customers
Scenario 2: Get total revenue per customer, including customers with no orders
Question: Show every customer and their total spend. If no orders, show 0.
Answer:
SELECT c.customer_id, c.name, COALESCE(SUM(o.amount), 0) AS total_spend FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id GROUP BY c.customer_id, c.name;
Explanation
β’ LEFT JOIN keeps all customers
β’ SUM aggregates orders
β’ COALESCE converts NULL to 0
Scenario 3: Find orders that donβt have a matching customer
Question: Audit data to find orphan orders.
Answer:
SELECT o.order_id, o.customer_id FROM orders o LEFT JOIN customers c ON o.customer_id = c.customer_id WHERE c.customer_id IS NULL;
Explanation
β’ LEFT JOIN from orders
β’ Missing customers become NULL
β’ Filters invalid data
Scenario 4: Get only customers who have placed at least one order
Question: Marketing wants only active customers.
Answer:
SELECT DISTINCT c.customer_id, c.name FROM customers c INNER JOIN orders o ON c.customer_id = o.customer_id;
Explanation
β’ INNER JOIN keeps only matching rows
β’ Customers without orders are excluded
Scenario 5: Find customers with more than 3 orders
Answer:
SELECT c.customer_id, c.name, COUNT(o.order_id) AS order_count FROM customers c INNER JOIN orders o ON c.customer_id = o.customer_id GROUP BY c.customer_id, c.name HAVING COUNT(o.order_id) > 3;
Explanation
β’ JOIN combines data
β’ GROUP BY customer
β’ HAVING filters aggregated count
Scenario 6: Show latest order for each customer
Answer:
SELECT c.customer_id, c.name, MAX(o.order_date) AS last_order_date FROM customers c INNER JOIN orders o ON c.customer_id = o.customer_id GROUP BY c.customer_id, c.name;
Explanation
β’ JOIN connects customers and orders
β’ MAX finds latest order per customer
Scenario 7: Find customers who ordered in 2024 but not in 2025
Answer:
SELECT DISTINCT c.customer_id, c.name FROM customers c INNER JOIN orders o2024 ON c.customer_id = o2024.customer_id LEFT JOIN orders o2025 ON c.customer_id = o2025.customer_id AND o2025.order_date >= '2025-01-01' WHERE o2024.order_date BETWEEN '2024-01-01' AND '2024-12-31' AND o2025.customer_id IS NULL;
Explanation
β’ INNER JOIN ensures 2024 orders
β’ LEFT JOIN checks absence in 2025
β’ NULL filter removes 2025 buyers
Scenario 8: Employee-Manager hierarchy (SELF JOIN)
Table: employees(employee_id, name, manager_id);
Answer:
SELECT e.name AS employee, m.name AS manager FROM employees e LEFT JOIN employees m ON e.manager_id = m.employee_id;
Explanation
β’ Same table joined twice
β’ Shows reporting structure
Scenario 9: Revenue by city
Answer:
SELECT c.city, SUM(o.amount) AS revenue FROM customers c INNER JOIN orders o ON c.customer_id = o.customer_id GROUP BY c.city;
Explanation
β’ JOIN links customers to orders
β’ GROUP BY city
β’ SUM calculates revenue
Scenario 10: Duplicate explosion after JOIN (classic trap)
Question: Why does this query show inflated revenue?
SELECT SUM(o.amount) FROM customers c JOIN orders o ON c.customer_id = o.customer_id;
Answer:
β’ Customer table may have duplicates
β’ JOIN multiplies rows
β’ Revenue gets inflated
Fix:
SELECT SUM(amount) FROM orders; or deduplicate customers before joining.
Interview golden rule for JOINS
Always explain:
1οΈβ£ Which table is LEFT
2οΈβ£ Which table is RIGHT
3οΈβ£ What rows are kept
4οΈβ£ Where NULLs appear
Double Tap β₯οΈ For More
Scenario 1: Find customers who have never placed an order
Tables: customers(customer_id, name) orders(order_id, customer_id);
Question: Business wants a list of customers with zero orders.
Answer:
SELECT c.customer_id, c.name FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id WHERE o.customer_id IS NULL;
Why this works
β’ LEFT JOIN keeps all customers
β’ Orders missing β NULL
β’ WHERE filters only non-ordering customers
Scenario 2: Get total revenue per customer, including customers with no orders
Question: Show every customer and their total spend. If no orders, show 0.
Answer:
SELECT c.customer_id, c.name, COALESCE(SUM(o.amount), 0) AS total_spend FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id GROUP BY c.customer_id, c.name;
Explanation
β’ LEFT JOIN keeps all customers
β’ SUM aggregates orders
β’ COALESCE converts NULL to 0
Scenario 3: Find orders that donβt have a matching customer
Question: Audit data to find orphan orders.
Answer:
SELECT o.order_id, o.customer_id FROM orders o LEFT JOIN customers c ON o.customer_id = c.customer_id WHERE c.customer_id IS NULL;
Explanation
β’ LEFT JOIN from orders
β’ Missing customers become NULL
β’ Filters invalid data
Scenario 4: Get only customers who have placed at least one order
Question: Marketing wants only active customers.
Answer:
SELECT DISTINCT c.customer_id, c.name FROM customers c INNER JOIN orders o ON c.customer_id = o.customer_id;
Explanation
β’ INNER JOIN keeps only matching rows
β’ Customers without orders are excluded
Scenario 5: Find customers with more than 3 orders
Answer:
SELECT c.customer_id, c.name, COUNT(o.order_id) AS order_count FROM customers c INNER JOIN orders o ON c.customer_id = o.customer_id GROUP BY c.customer_id, c.name HAVING COUNT(o.order_id) > 3;
Explanation
β’ JOIN combines data
β’ GROUP BY customer
β’ HAVING filters aggregated count
Scenario 6: Show latest order for each customer
Answer:
SELECT c.customer_id, c.name, MAX(o.order_date) AS last_order_date FROM customers c INNER JOIN orders o ON c.customer_id = o.customer_id GROUP BY c.customer_id, c.name;
Explanation
β’ JOIN connects customers and orders
β’ MAX finds latest order per customer
Scenario 7: Find customers who ordered in 2024 but not in 2025
Answer:
SELECT DISTINCT c.customer_id, c.name FROM customers c INNER JOIN orders o2024 ON c.customer_id = o2024.customer_id LEFT JOIN orders o2025 ON c.customer_id = o2025.customer_id AND o2025.order_date >= '2025-01-01' WHERE o2024.order_date BETWEEN '2024-01-01' AND '2024-12-31' AND o2025.customer_id IS NULL;
Explanation
β’ INNER JOIN ensures 2024 orders
β’ LEFT JOIN checks absence in 2025
β’ NULL filter removes 2025 buyers
Scenario 8: Employee-Manager hierarchy (SELF JOIN)
Table: employees(employee_id, name, manager_id);
Answer:
SELECT e.name AS employee, m.name AS manager FROM employees e LEFT JOIN employees m ON e.manager_id = m.employee_id;
Explanation
β’ Same table joined twice
β’ Shows reporting structure
Scenario 9: Revenue by city
Answer:
SELECT c.city, SUM(o.amount) AS revenue FROM customers c INNER JOIN orders o ON c.customer_id = o.customer_id GROUP BY c.city;
Explanation
β’ JOIN links customers to orders
β’ GROUP BY city
β’ SUM calculates revenue
Scenario 10: Duplicate explosion after JOIN (classic trap)
Question: Why does this query show inflated revenue?
SELECT SUM(o.amount) FROM customers c JOIN orders o ON c.customer_id = o.customer_id;
Answer:
β’ Customer table may have duplicates
β’ JOIN multiplies rows
β’ Revenue gets inflated
Fix:
SELECT SUM(amount) FROM orders; or deduplicate customers before joining.
Interview golden rule for JOINS
Always explain:
1οΈβ£ Which table is LEFT
2οΈβ£ Which table is RIGHT
3οΈβ£ What rows are kept
4οΈβ£ Where NULLs appear
Double Tap β₯οΈ For More
β€7π1π1
SQL Detailed Roadmap
|
| | |-- Fundamentals
| |-- Introduction to Databases
| | |-- What SQL does
| | |-- Relational model
| | |-- Tables, rows, columns
| |-- Keys and Constraints
| | |-- Primary keys
| | |-- Foreign keys
| | |-- Unique and check constraints
| |-- Normalization
| | |-- 1NF, 2NF, 3NF
| | |-- ER diagrams
| | |-- Core SQL
| |-- SQL Basics
| | |-- SELECT, WHERE, ORDER BY
| | |-- GROUP BY and HAVING
| | |-- JOINS: INNER, LEFT, RIGHT, FULL
| |-- Intermediate SQL
| | |-- Subqueries
| | |-- CTEs
| | |-- CASE statements
| | |-- Aggregations
| |-- Advanced SQL
| | |-- Window functions
| | |-- Analytical functions
| | |-- Ranking, moving averages, lag and lead
| | |-- UNION, INTERSECT, EXCEPT
| | |-- Data Management
| |-- Data Types
| | |-- Numeric, text, date, JSON
| |-- Indexes
| | |-- B tree and hash indexes
| | |-- When to create indexes
| |-- Transactions
| | |-- ACID properties
| |-- Views
| | |-- Standard views
| | |-- Materialized views
| | |-- Database Design
| |-- Schema Design
| | |-- Star schema
| | |-- Snowflake schema
| |-- Fact and Dimension Tables
| |-- Constraints for clean data
| | |-- Performance Tuning
| |-- Query Optimization
| | |-- Execution plans
| | |-- Index usage
| | |-- Reducing scans
| |-- Partitioning
| | |-- Horizontal partitioning
| | |-- Sharding basics
| | |-- SQL for Analytics
| |-- KPI calculations
| |-- Cohort analysis
| |-- Funnel analysis
| |-- Churn and retention tables
| |-- Time based aggregations
| |-- Window functions for metrics
| | |-- SQL for Data Engineering
| |-- ETL Workflows
| | |-- Staging tables
| | |-- Transformations
| | |-- Incremental loads
| |-- Data Warehousing
| | |-- Snowflake
| | |-- Redshift
| | |-- BigQuery
| |-- dbt Basics
| | |-- Models
| | |-- Tests
| | |-- Lineage
| | |-- Tools and Platforms
| |-- PostgreSQL
| |-- MySQL
| |-- SQL Server
| |-- Oracle
| |-- SQLite
| |-- Cloud SQL
| |-- BigQuery UI
| |-- Snowflake Worksheets
| | |-- Projects
| |-- Build a sales reporting system
| |-- Create a star schema from raw CSV files
| |-- Design a customer segmentation query
| |-- Build a churn dashboard dataset
| |-- Optimize slow queries in a sample DB
| |-- Create an analytics pipeline with dbt
| | |-- Soft Skills and Career Prep
| |-- SQL interview patterns
| |-- Joins practice
| |-- Window function drills
| |-- Query writing speed
| |-- Git and GitHub
| |-- Data storytelling
| | |-- Bonus Topics
| |-- NoSQL intro
| |-- Working with JSON fields
| |-- Spatial SQL
| |-- Time series tables
| |-- CDC concepts
| |-- Real time analytics
| | |-- Community and Growth
| |-- LeetCode SQL
| |-- Kaggle datasets with SQL
| |-- GitHub projects
| |-- LinkedIn posts
| |-- Open source contributions
Free Resources to learn SQL
β’ W3Schools SQL
https://www.w3schools.com/sql/
β’ SQL Programming
https://whatsapp.com/channel/0029VanC5rODzgT6TiTGoa1v
β’ SQL Notes
https://whatsapp.com/channel/0029Vb6hJmM9hXFCWNtQX944
β’ Mode Analytics SQL tutorials
https://mode.com/sql-tutorial/
β’ Data Analytics Resources
https://t.iss.one/sqlspecialist
β’ HackerRank SQL practice
https://www.hackerrank.com/domains/sql
β’ LeetCode SQL problems
https://leetcode.com/problemset/database/
β’ Data Engineering Resources
https://whatsapp.com/channel/0029Vaovs0ZKbYMKXvKRYi3C
β’ Khan Academy SQL basics
https://www.khanacademy.org/computing/computer-programming/sql
β’ PostgreSQL official docs
https://www.postgresql.org/docs/
β’ MySQL official docs
https://dev.mysql.com/doc/
β’ NoSQL Resources
https://whatsapp.com/channel/0029VaxA2hTHgZWe5FpFjm3p
Double Tap β€οΈ For More
|
| | |-- Fundamentals
| |-- Introduction to Databases
| | |-- What SQL does
| | |-- Relational model
| | |-- Tables, rows, columns
| |-- Keys and Constraints
| | |-- Primary keys
| | |-- Foreign keys
| | |-- Unique and check constraints
| |-- Normalization
| | |-- 1NF, 2NF, 3NF
| | |-- ER diagrams
| | |-- Core SQL
| |-- SQL Basics
| | |-- SELECT, WHERE, ORDER BY
| | |-- GROUP BY and HAVING
| | |-- JOINS: INNER, LEFT, RIGHT, FULL
| |-- Intermediate SQL
| | |-- Subqueries
| | |-- CTEs
| | |-- CASE statements
| | |-- Aggregations
| |-- Advanced SQL
| | |-- Window functions
| | |-- Analytical functions
| | |-- Ranking, moving averages, lag and lead
| | |-- UNION, INTERSECT, EXCEPT
| | |-- Data Management
| |-- Data Types
| | |-- Numeric, text, date, JSON
| |-- Indexes
| | |-- B tree and hash indexes
| | |-- When to create indexes
| |-- Transactions
| | |-- ACID properties
| |-- Views
| | |-- Standard views
| | |-- Materialized views
| | |-- Database Design
| |-- Schema Design
| | |-- Star schema
| | |-- Snowflake schema
| |-- Fact and Dimension Tables
| |-- Constraints for clean data
| | |-- Performance Tuning
| |-- Query Optimization
| | |-- Execution plans
| | |-- Index usage
| | |-- Reducing scans
| |-- Partitioning
| | |-- Horizontal partitioning
| | |-- Sharding basics
| | |-- SQL for Analytics
| |-- KPI calculations
| |-- Cohort analysis
| |-- Funnel analysis
| |-- Churn and retention tables
| |-- Time based aggregations
| |-- Window functions for metrics
| | |-- SQL for Data Engineering
| |-- ETL Workflows
| | |-- Staging tables
| | |-- Transformations
| | |-- Incremental loads
| |-- Data Warehousing
| | |-- Snowflake
| | |-- Redshift
| | |-- BigQuery
| |-- dbt Basics
| | |-- Models
| | |-- Tests
| | |-- Lineage
| | |-- Tools and Platforms
| |-- PostgreSQL
| |-- MySQL
| |-- SQL Server
| |-- Oracle
| |-- SQLite
| |-- Cloud SQL
| |-- BigQuery UI
| |-- Snowflake Worksheets
| | |-- Projects
| |-- Build a sales reporting system
| |-- Create a star schema from raw CSV files
| |-- Design a customer segmentation query
| |-- Build a churn dashboard dataset
| |-- Optimize slow queries in a sample DB
| |-- Create an analytics pipeline with dbt
| | |-- Soft Skills and Career Prep
| |-- SQL interview patterns
| |-- Joins practice
| |-- Window function drills
| |-- Query writing speed
| |-- Git and GitHub
| |-- Data storytelling
| | |-- Bonus Topics
| |-- NoSQL intro
| |-- Working with JSON fields
| |-- Spatial SQL
| |-- Time series tables
| |-- CDC concepts
| |-- Real time analytics
| | |-- Community and Growth
| |-- LeetCode SQL
| |-- Kaggle datasets with SQL
| |-- GitHub projects
| |-- LinkedIn posts
| |-- Open source contributions
Free Resources to learn SQL
β’ W3Schools SQL
https://www.w3schools.com/sql/
β’ SQL Programming
https://whatsapp.com/channel/0029VanC5rODzgT6TiTGoa1v
β’ SQL Notes
https://whatsapp.com/channel/0029Vb6hJmM9hXFCWNtQX944
β’ Mode Analytics SQL tutorials
https://mode.com/sql-tutorial/
β’ Data Analytics Resources
https://t.iss.one/sqlspecialist
β’ HackerRank SQL practice
https://www.hackerrank.com/domains/sql
β’ LeetCode SQL problems
https://leetcode.com/problemset/database/
β’ Data Engineering Resources
https://whatsapp.com/channel/0029Vaovs0ZKbYMKXvKRYi3C
β’ Khan Academy SQL basics
https://www.khanacademy.org/computing/computer-programming/sql
β’ PostgreSQL official docs
https://www.postgresql.org/docs/
β’ MySQL official docs
https://dev.mysql.com/doc/
β’ NoSQL Resources
https://whatsapp.com/channel/0029VaxA2hTHgZWe5FpFjm3p
Double Tap β€οΈ For More
β€8
β
Handling NULL Values in SQL
What is NULL in SQL?
NULL means missing or unknown data.
It does NOT mean:
- 0
- Empty string ''
- False
π NULL = no value at all
Why NULLs exist in real data
Real business data is messy:
- Customer didnβt provide city
- Order amount not updated yet
- Employee not assigned a manager
So databases allow NULLs.
Example Table
Data: customers
customer_id: 1, name: Rahul, city: Pune, email: [email protected]
customer_id: 2, name: Neha, city: NULL, email: [email protected]
customer_id: 3, name: Aman, city: Delhi, email: NULL
π« Biggest Beginner Mistake
β This is WRONG
π This will return no rows
Why? Because NULL cannot be compared using = or !=
β Correct Way to Handle NULL
1οΈβ£ IS NULL: Used to find missing values
What this query does:
- Scans all rows
- Returns customers where city is missing
2οΈβ£ IS NOT NULL: Used to exclude missing values
What this query does:
- Returns customers who have email
- Removes rows with NULL email
NULLs in JOINs (Very Important)
Customers data:
customer_id: 1, name: Rahul
customer_id: 2, name: Neha
customer_id: 3, name: Aman
Orders data:
order_id: 101, customer_id: 1, amount: 5000
order_id: 102, customer_id: 1, amount: 3000
LEFT JOIN with NULL check
π Find customers with NO orders
What this query does:
- Keeps all customers
- Matches orders where possible
- Filters customers without orders
NULLs in Aggregations
COUNT behavior (INTERVIEW FAVORITE)
π Counts all rows
π Counts only non-NULL cities
Example
β Counts customers who have email
β Ignores NULL emails
Handling NULL using COALESCE
What this query does:
- If city exists β show city
- If city is NULL β show βUnknownβ
NULLs in SUM / AVG
π NULL values are ignored, not treated as 0
But if all rows are NULL, result is NULL.
Safe approach:
NULL vs Empty String
- NULL: No value
- '': Empty value
Common NULL Mistakes (Must Avoid)
β Using = NULL
β Forgetting NULLs in LEFT JOIN
β Assuming COUNT(column) counts NULL
β Ignoring NULL replacement in reports
Interview One-Liner π‘
> NULL represents missing data and must be handled using IS NULL, IS NOT NULL, or COALESCE, not with =.
Double Tap β₯οΈ For More
What is NULL in SQL?
NULL means missing or unknown data.
It does NOT mean:
- 0
- Empty string ''
- False
π NULL = no value at all
Why NULLs exist in real data
Real business data is messy:
- Customer didnβt provide city
- Order amount not updated yet
- Employee not assigned a manager
So databases allow NULLs.
Example Table
Data: customers
customer_id: 1, name: Rahul, city: Pune, email: [email protected]
customer_id: 2, name: Neha, city: NULL, email: [email protected]
customer_id: 3, name: Aman, city: Delhi, email: NULL
π« Biggest Beginner Mistake
β This is WRONG
SELECT * FROM customers WHERE city = NULL;
π This will return no rows
Why? Because NULL cannot be compared using = or !=
β Correct Way to Handle NULL
1οΈβ£ IS NULL: Used to find missing values
SELECT * FROM customers WHERE city IS NULL;
What this query does:
- Scans all rows
- Returns customers where city is missing
2οΈβ£ IS NOT NULL: Used to exclude missing values
SELECT * FROM customers WHERE email IS NOT NULL;
What this query does:
- Returns customers who have email
- Removes rows with NULL email
NULLs in JOINs (Very Important)
Customers data:
customer_id: 1, name: Rahul
customer_id: 2, name: Neha
customer_id: 3, name: Aman
Orders data:
order_id: 101, customer_id: 1, amount: 5000
order_id: 102, customer_id: 1, amount: 3000
LEFT JOIN with NULL check
π Find customers with NO orders
SELECT c.customer_id, c.name
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.customer_id IS NULL;
What this query does:
- Keeps all customers
- Matches orders where possible
- Filters customers without orders
NULLs in Aggregations
COUNT behavior (INTERVIEW FAVORITE)
SELECT COUNT(*) FROM customers;
π Counts all rows
SELECT COUNT(city) FROM customers;
π Counts only non-NULL cities
Example
SELECT COUNT(email) FROM customers;
β Counts customers who have email
β Ignores NULL emails
Handling NULL using COALESCE
SELECT name, COALESCE(city, 'Unknown') AS city
FROM customers;
What this query does:
- If city exists β show city
- If city is NULL β show βUnknownβ
NULLs in SUM / AVG
SELECT SUM(amount) FROM orders;
π NULL values are ignored, not treated as 0
But if all rows are NULL, result is NULL.
Safe approach:
SELECT COALESCE(SUM(amount), 0) FROM orders;
NULL vs Empty String
- NULL: No value
- '': Empty value
WHERE email IS NULL -- missing
WHERE email = '' -- empty but exists
Common NULL Mistakes (Must Avoid)
β Using = NULL
β Forgetting NULLs in LEFT JOIN
β Assuming COUNT(column) counts NULL
β Ignoring NULL replacement in reports
Interview One-Liner π‘
> NULL represents missing data and must be handled using IS NULL, IS NOT NULL, or COALESCE, not with =.
Double Tap β₯οΈ For More
β€10
What does NULL represent in SQL?
Anonymous Quiz
13%
A. Zero
40%
B. Empty string
3%
C. False value
44%
D. Missing or unknown value
π5β€2
Here are some commonly asked SQL interview questions along with brief answers:
1. What is SQL?
- SQL stands for Structured Query Language, used for managing and manipulating relational databases.
2. What are the types of SQL commands?
- SQL commands can be broadly categorized into four types: Data Definition Language (DDL), Data Manipulation Language (DML), Data Control Language (DCL), and Transaction Control Language (TCL).
3. What is the difference between CHAR and VARCHAR data types?
- CHAR is a fixed-length character data type, while VARCHAR is a variable-length character data type. CHAR will always occupy the same amount of storage space, while VARCHAR will only use the necessary space to store the actual data.
4. What is a primary key?
- A primary key is a column or a set of columns that uniquely identifies each row in a table. It ensures data integrity by enforcing uniqueness and can be used to establish relationships between tables.
5. What is a foreign key?
- A foreign key is a column or a set of columns in one table that refers to the primary key in another table. It establishes a relationship between two tables and ensures referential integrity.
6. What is a JOIN in SQL?
- JOIN is used to combine rows from two or more tables based on a related column between them. There are different types of JOINs, including INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.
7. What is the difference between INNER JOIN and OUTER JOIN?
- INNER JOIN returns only the rows that have matching values in both tables, while OUTER JOIN (LEFT, RIGHT, FULL) returns all rows from one or both tables, with NULL values in columns where there is no match.
8. What is the difference between GROUP BY and ORDER BY?
- GROUP BY is used to group rows that have the same values into summary rows, typically used with aggregate functions like SUM, COUNT, AVG, etc., while ORDER BY is used to sort the result set based on one or more columns.
9. What is a subquery?
- A subquery is a query nested within another query, used to return data that will be used in the main query. Subqueries can be used in SELECT, INSERT, UPDATE, and DELETE statements.
10. What is normalization in SQL?
- Normalization is the process of organizing data in a database to reduce redundancy and dependency. It involves dividing large tables into smaller tables and defining relationships between them to improve data integrity and efficiency.
Around 90% questions will be asked from sql in data analytics interview, so please make sure to practice SQL skills using websites like stratascratch. βΊοΈπͺ
1. What is SQL?
- SQL stands for Structured Query Language, used for managing and manipulating relational databases.
2. What are the types of SQL commands?
- SQL commands can be broadly categorized into four types: Data Definition Language (DDL), Data Manipulation Language (DML), Data Control Language (DCL), and Transaction Control Language (TCL).
3. What is the difference between CHAR and VARCHAR data types?
- CHAR is a fixed-length character data type, while VARCHAR is a variable-length character data type. CHAR will always occupy the same amount of storage space, while VARCHAR will only use the necessary space to store the actual data.
4. What is a primary key?
- A primary key is a column or a set of columns that uniquely identifies each row in a table. It ensures data integrity by enforcing uniqueness and can be used to establish relationships between tables.
5. What is a foreign key?
- A foreign key is a column or a set of columns in one table that refers to the primary key in another table. It establishes a relationship between two tables and ensures referential integrity.
6. What is a JOIN in SQL?
- JOIN is used to combine rows from two or more tables based on a related column between them. There are different types of JOINs, including INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.
7. What is the difference between INNER JOIN and OUTER JOIN?
- INNER JOIN returns only the rows that have matching values in both tables, while OUTER JOIN (LEFT, RIGHT, FULL) returns all rows from one or both tables, with NULL values in columns where there is no match.
8. What is the difference between GROUP BY and ORDER BY?
- GROUP BY is used to group rows that have the same values into summary rows, typically used with aggregate functions like SUM, COUNT, AVG, etc., while ORDER BY is used to sort the result set based on one or more columns.
9. What is a subquery?
- A subquery is a query nested within another query, used to return data that will be used in the main query. Subqueries can be used in SELECT, INSERT, UPDATE, and DELETE statements.
10. What is normalization in SQL?
- Normalization is the process of organizing data in a database to reduce redundancy and dependency. It involves dividing large tables into smaller tables and defining relationships between them to improve data integrity and efficiency.
Around 90% questions will be asked from sql in data analytics interview, so please make sure to practice SQL skills using websites like stratascratch. βΊοΈπͺ
β€4
Which condition correctly finds rows where city is missing?
Anonymous Quiz
27%
A. city = NULL
11%
B. city != NULL
56%
C. city IS NULL
6%
D. city = ''
β€3
What is the result of this query?
SELECT COUNT(city) FROM customers;
SELECT COUNT(city) FROM customers;
Anonymous Quiz
28%
A. Counts all rows
12%
B. Counts only rows where city is NULL
55%
C. Counts only rows where city is NOT NULL
4%
D. Returns error
β€2
You want to show 0 instead of NULL for total sales. Which function should you use?
Anonymous Quiz
33%
A. ISNULL
8%
B. NVL
46%
C. COALESCE
13%
D. NULLIF
β€3
In a LEFT JOIN, why do unmatched rows show NULL values?
Anonymous Quiz
4%
A. Because data is deleted
5%
B. Because INNER JOIN is used
78%
C. Because matching rows donβt exist in the right table
14%
D. Because NULL is a default value
β€3
SQL Interview Questions for 0-1 year of Experience (Asked in Top Product-Based Companies).
Sharpen your SQL skills with these real interview questions!
Q1. Customer Purchase Patterns -
You have two tables, Customers and Purchases: CREATE TABLE Customers ( customer_id INT PRIMARY KEY, customer_name VARCHAR(255) ); CREATE TABLE Purchases ( purchase_id INT PRIMARY KEY, customer_id INT, product_id INT, purchase_date DATE );
Assume necessary INSERT statements are already executed.
Write an SQL query to find the names of customers who have purchased more than 5 different products within the last month. Order the result by customer_name.
Q2. Call Log Analysis -
Suppose you have a CallLogs table: CREATE TABLE CallLogs ( log_id INT PRIMARY KEY, caller_id INT, receiver_id INT, call_start_time TIMESTAMP, call_end_time TIMESTAMP );
Assume necessary INSERT statements are already executed.
Write a query to find the average call duration per user. Include only users who have made more than 10 calls in total. Order the result by average duration descending.
Q3. Employee Project Allocation - Consider two tables, Employees and Projects:
CREATE TABLE Employees ( employee_id INT PRIMARY KEY, employee_name VARCHAR(255), department VARCHAR(255) ); CREATE TABLE Projects ( project_id INT PRIMARY KEY, lead_employee_id INT, project_name VARCHAR(255), start_date DATE, end_date DATE );
Assume necessary INSERT statements are already executed.
The goal is to write an SQL query to find the names of employees who have led more than 3 projects in the last year. The result should be ordered by the number of projects led.
Sharpen your SQL skills with these real interview questions!
Q1. Customer Purchase Patterns -
You have two tables, Customers and Purchases: CREATE TABLE Customers ( customer_id INT PRIMARY KEY, customer_name VARCHAR(255) ); CREATE TABLE Purchases ( purchase_id INT PRIMARY KEY, customer_id INT, product_id INT, purchase_date DATE );
Assume necessary INSERT statements are already executed.
Write an SQL query to find the names of customers who have purchased more than 5 different products within the last month. Order the result by customer_name.
Q2. Call Log Analysis -
Suppose you have a CallLogs table: CREATE TABLE CallLogs ( log_id INT PRIMARY KEY, caller_id INT, receiver_id INT, call_start_time TIMESTAMP, call_end_time TIMESTAMP );
Assume necessary INSERT statements are already executed.
Write a query to find the average call duration per user. Include only users who have made more than 10 calls in total. Order the result by average duration descending.
Q3. Employee Project Allocation - Consider two tables, Employees and Projects:
CREATE TABLE Employees ( employee_id INT PRIMARY KEY, employee_name VARCHAR(255), department VARCHAR(255) ); CREATE TABLE Projects ( project_id INT PRIMARY KEY, lead_employee_id INT, project_name VARCHAR(255), start_date DATE, end_date DATE );
Assume necessary INSERT statements are already executed.
The goal is to write an SQL query to find the names of employees who have led more than 3 projects in the last year. The result should be ordered by the number of projects led.
β€4
SQL Interview Trap π¨ Consecutive Orders Logic
You have a table:
orders
order_id | customer_id | order_date | amount
π Question:
Find customers who placed orders on 3 or more consecutive days,
but return only the first date of each such streak per customer.
β οΈ No temp tables.
β οΈ Assume multiple orders per day are possible.
π§ Most candidates fail because they:
- Forget to handle multiple orders on the same day
- Misuse ROW_NUMBER()
- Miss the date gap logic
β Correct SQL Solution:
WITH distinct_orders AS (
SELECT DISTINCT customer_id, order_date
FROM orders
),
grp AS (
SELECT
customer_id,
order_date,
order_date - INTERVAL '1 day' *
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS grp_id
FROM distinct_orders
)
SELECT
customer_id,
MIN(order_date) AS streak_start_date
FROM grp
GROUP BY customer_id, grp_id
HAVING COUNT(*) >= 3;
π‘ Why this works (Interview Gold):
- DISTINCT removes same-day duplicates
- ROW_NUMBER() creates a sequence
- Date minus row number groups consecutive dates
- HAVING COUNT(*) >= 3 filters valid streaks
π₯ React with π₯ if this bent your brain
π Follow the channel for REAL interview-level SQL Content
You have a table:
orders
order_id | customer_id | order_date | amount
π Question:
Find customers who placed orders on 3 or more consecutive days,
but return only the first date of each such streak per customer.
β οΈ No temp tables.
β οΈ Assume multiple orders per day are possible.
π§ Most candidates fail because they:
- Forget to handle multiple orders on the same day
- Misuse ROW_NUMBER()
- Miss the date gap logic
β Correct SQL Solution:
WITH distinct_orders AS (
SELECT DISTINCT customer_id, order_date
FROM orders
),
grp AS (
SELECT
customer_id,
order_date,
order_date - INTERVAL '1 day' *
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS grp_id
FROM distinct_orders
)
SELECT
customer_id,
MIN(order_date) AS streak_start_date
FROM grp
GROUP BY customer_id, grp_id
HAVING COUNT(*) >= 3;
π‘ Why this works (Interview Gold):
- DISTINCT removes same-day duplicates
- ROW_NUMBER() creates a sequence
- Date minus row number groups consecutive dates
- HAVING COUNT(*) >= 3 filters valid streaks
π₯ React with π₯ if this bent your brain
π Follow the channel for REAL interview-level SQL Content
1β€6
β
Essential Tools for Data Analytics ππ οΈ
π£ 1οΈβ£ Excel / Google Sheets
β’ Quick data entry & analysis
β’ Pivot tables, charts, functions
β’ Good for early-stage exploration
π» 2οΈβ£ SQL (Structured Query Language)
β’ Work with databases (MySQL, PostgreSQL, etc.)
β’ Query, filter, join, and aggregate data
β’ Must-know for data from large systems
π 3οΈβ£ Python (with Libraries)
β’ Pandas β Data manipulation
β’ NumPy β Numerical analysis
β’ Matplotlib / Seaborn β Data visualization
β’ OpenPyXL / xlrd β Work with Excel files
π 4οΈβ£ Power BI / Tableau
β’ Create dashboards and visual reports
β’ Drag-and-drop interface for non-coders
β’ Ideal for business insights & presentations
π 5οΈβ£ Google Data Studio
β’ Free dashboard tool
β’ Connects easily to Google Sheets, BigQuery
β’ Great for real-time reporting
π§ͺ 6οΈβ£ Jupyter Notebook
β’ Interactive Python coding
β’ Combine code, text, and visuals in one place
β’ Perfect for storytelling with data
π οΈ 7οΈβ£ R Programming (Optional)
β’ Popular in statistical analysis
β’ Strong in academic and research settings
βοΈ 8οΈβ£ Cloud & Big Data Tools
β’ Google BigQuery, Snowflake β Large-scale analysis
β’ Excel + SQL + Python still work as a base
π‘ Tip:
Start with Excel + SQL + Python (Pandas) β Add BI tools for reporting.
π¬ Tap β€οΈ for more!
π£ 1οΈβ£ Excel / Google Sheets
β’ Quick data entry & analysis
β’ Pivot tables, charts, functions
β’ Good for early-stage exploration
π» 2οΈβ£ SQL (Structured Query Language)
β’ Work with databases (MySQL, PostgreSQL, etc.)
β’ Query, filter, join, and aggregate data
β’ Must-know for data from large systems
π 3οΈβ£ Python (with Libraries)
β’ Pandas β Data manipulation
β’ NumPy β Numerical analysis
β’ Matplotlib / Seaborn β Data visualization
β’ OpenPyXL / xlrd β Work with Excel files
π 4οΈβ£ Power BI / Tableau
β’ Create dashboards and visual reports
β’ Drag-and-drop interface for non-coders
β’ Ideal for business insights & presentations
π 5οΈβ£ Google Data Studio
β’ Free dashboard tool
β’ Connects easily to Google Sheets, BigQuery
β’ Great for real-time reporting
π§ͺ 6οΈβ£ Jupyter Notebook
β’ Interactive Python coding
β’ Combine code, text, and visuals in one place
β’ Perfect for storytelling with data
π οΈ 7οΈβ£ R Programming (Optional)
β’ Popular in statistical analysis
β’ Strong in academic and research settings
βοΈ 8οΈβ£ Cloud & Big Data Tools
β’ Google BigQuery, Snowflake β Large-scale analysis
β’ Excel + SQL + Python still work as a base
π‘ Tip:
Start with Excel + SQL + Python (Pandas) β Add BI tools for reporting.
π¬ Tap β€οΈ for more!
β€4
π SQL Subqueries CTEs
1οΈβ£ What is a Subquery?
A subquery is a query inside another query. It runs first and passes its result to the outer query.
Think like this π
> βFirst find something β then use it to filter or calculate something elseβ
Why Subqueries exist (business thinking)
Real questions like:
β’ Find customers who spent more than average
β’ Find products with highest sales
β’ Find employees earning more than their manager
These need one queryβs result inside another query.
β Basic Subquery Structure
SELECT column
FROM text
WHERE column OPERATOR (
SELECT column
FROM text
);
Example Tables: orders
order_id | customer_id | amount
1 | 101 | 5000
2 | 102 | 8000
3 | 103 | 3000
2οΈβ£ Subquery in WHERE clause (Most Common)
πΉ Scenario: Find orders with amount greater than average order value
SELECT *
FROM orders
WHERE amount > (
SELECT AVG(amount)
FROM orders
);
What this query does
1. Inner query calculates average order amount
2. Outer query keeps only orders above that average
β Very common interview question
3οΈβ£ Subquery with IN
πΉ Scenario: Find customers who have placed at least one order
Tables: customers(customer_id, name) orders(customer_id)
SELECT name
FROM customers
WHERE customer_id IN (
SELECT customer_id
FROM orders
);
What this query does
β’ Inner query gets customers who ordered
β’ Outer query fetches their names
4οΈβ£ Subquery in SELECT clause
πΉ Scenario: Show each order with total number of orders
SELECT order_id, amount, (
SELECT COUNT(*)
FROM orders
) AS total_orders
FROM orders;
What this query does
β’ Inner query runs once
β’ Adds total order count to every row
β οΈ Use carefully β can be inefficient
5οΈβ£ Correlated Subquery (Important)
A correlated subquery depends on the outer query. It runs once per row.
πΉ Scenario: Find customers who spent more than their cityβs average
Tables: customers(customer_id, city) orders(customer_id, amount)
SELECT c.customer_id
FROM customers c
WHERE (
SELECT AVG(o.amount)
FROM orders o
WHERE o.customer_id = c.customer_id
) > 5000;
What this query does
β’ For each customer
β’ Calculates their average spend
β’ Filters based on condition
β οΈ Powerful but slower on large data
6οΈβ£ Problems with Subqueries
β Hard to read
β Hard to debug
β Performance issues
β Nested logic becomes messy
π Thatβs why CTEs exist
7οΈβ£ What is a CTE (Common Table Expression)?
A CTE is a named temporary result.
It makes complex queries readable and reusable.
CTE Syntax
WITH cte_name AS (
SELECT ...
)
SELECT *
FROM cte_name;
8οΈβ£ Same Problem Solved Using CTE (Cleaner)
πΉ Find customers with total spend > 10,000
WITH customer_spend AS (
SELECT customer_id, SUM(amount) AS total_spend
FROM orders
GROUP BY customer_id
)
SELECT *
FROM customer_spend
WHERE total_spend > 10000;
What this does
β’ First block calculates spend
β’ Second block filters results
β’ Very readable
9οΈβ£ CTE vs Subquery
β’ Readability: CTE is excellent, Subquery is poor
β’ Reusability: CTE is yes, Subquery is no
β’ Debugging: CTE is easy, Subquery is hard
β’ Performance: Both depend on usage
π When to Use What?
Use Subquery when:
β Logic is small
β Used only once
Use CTE when:
β Logic is complex
β Multiple steps
β Interview or production query
Common Beginner Mistakes
β Writing very deep nested subqueries
β Using correlated subqueries unnecessarily
β Forgetting CTE scope (only valid for one query)
Interview Tip π‘
> Subqueries solve problems inside queries, while CTEs solve readability and maintainability.
Double Tap β₯οΈ For More
1οΈβ£ What is a Subquery?
A subquery is a query inside another query. It runs first and passes its result to the outer query.
Think like this π
> βFirst find something β then use it to filter or calculate something elseβ
Why Subqueries exist (business thinking)
Real questions like:
β’ Find customers who spent more than average
β’ Find products with highest sales
β’ Find employees earning more than their manager
These need one queryβs result inside another query.
β Basic Subquery Structure
SELECT column
FROM text
WHERE column OPERATOR (
SELECT column
FROM text
);
Example Tables: orders
order_id | customer_id | amount
1 | 101 | 5000
2 | 102 | 8000
3 | 103 | 3000
2οΈβ£ Subquery in WHERE clause (Most Common)
πΉ Scenario: Find orders with amount greater than average order value
SELECT *
FROM orders
WHERE amount > (
SELECT AVG(amount)
FROM orders
);
What this query does
1. Inner query calculates average order amount
2. Outer query keeps only orders above that average
β Very common interview question
3οΈβ£ Subquery with IN
πΉ Scenario: Find customers who have placed at least one order
Tables: customers(customer_id, name) orders(customer_id)
SELECT name
FROM customers
WHERE customer_id IN (
SELECT customer_id
FROM orders
);
What this query does
β’ Inner query gets customers who ordered
β’ Outer query fetches their names
4οΈβ£ Subquery in SELECT clause
πΉ Scenario: Show each order with total number of orders
SELECT order_id, amount, (
SELECT COUNT(*)
FROM orders
) AS total_orders
FROM orders;
What this query does
β’ Inner query runs once
β’ Adds total order count to every row
β οΈ Use carefully β can be inefficient
5οΈβ£ Correlated Subquery (Important)
A correlated subquery depends on the outer query. It runs once per row.
πΉ Scenario: Find customers who spent more than their cityβs average
Tables: customers(customer_id, city) orders(customer_id, amount)
SELECT c.customer_id
FROM customers c
WHERE (
SELECT AVG(o.amount)
FROM orders o
WHERE o.customer_id = c.customer_id
) > 5000;
What this query does
β’ For each customer
β’ Calculates their average spend
β’ Filters based on condition
β οΈ Powerful but slower on large data
6οΈβ£ Problems with Subqueries
β Hard to read
β Hard to debug
β Performance issues
β Nested logic becomes messy
π Thatβs why CTEs exist
7οΈβ£ What is a CTE (Common Table Expression)?
A CTE is a named temporary result.
It makes complex queries readable and reusable.
CTE Syntax
WITH cte_name AS (
SELECT ...
)
SELECT *
FROM cte_name;
8οΈβ£ Same Problem Solved Using CTE (Cleaner)
πΉ Find customers with total spend > 10,000
WITH customer_spend AS (
SELECT customer_id, SUM(amount) AS total_spend
FROM orders
GROUP BY customer_id
)
SELECT *
FROM customer_spend
WHERE total_spend > 10000;
What this does
β’ First block calculates spend
β’ Second block filters results
β’ Very readable
9οΈβ£ CTE vs Subquery
β’ Readability: CTE is excellent, Subquery is poor
β’ Reusability: CTE is yes, Subquery is no
β’ Debugging: CTE is easy, Subquery is hard
β’ Performance: Both depend on usage
π When to Use What?
Use Subquery when:
β Logic is small
β Used only once
Use CTE when:
β Logic is complex
β Multiple steps
β Interview or production query
Common Beginner Mistakes
β Writing very deep nested subqueries
β Using correlated subqueries unnecessarily
β Forgetting CTE scope (only valid for one query)
Interview Tip π‘
> Subqueries solve problems inside queries, while CTEs solve readability and maintainability.
Double Tap β₯οΈ For More
β€4
π‘ 10 SQL Projects You Can Start Today (With Datasets)
1) E-commerce Deep Dive π
Brazilian orders, payments, reviews, deliveries β the full package.
https://www.kaggle.com/datasets/olistbr/brazilian-ecommerce
2) Sales Performance Tracker π
Perfect for learning KPIs, revenue trends, and top products.
https://www.kaggle.com/datasets/kyanyoga/sample-sales-data
3) HR Analytics (Attrition + Employee Insights) π₯
Analyze why employees leave + build dashboards with SQL.
https://www.kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset
4) Banking + Financial Data π³
Great for segmentation, customer behavior, and risk analysis.
https://www.kaggle.com/datasets?tags=11129-Banking
5) Healthcare & Mortality Analysis π₯
Serious dataset for serious SQL practice (filters, joins, grouping).
https://www.kaggle.com/datasets/cdc/mortality
6) Marketing + Customer Value (CRM) π―
Customer lifetime value, retention, and segmentation projects.
https://www.kaggle.com/datasets/pankajjsh06/ibm-watson-marketing-customer-value-data
7) Supply Chain & Procurement Analytics π
Great for vendor performance + procurement cost tracking.
https://www.kaggle.com/datasets/shashwatwork/dataco-smart-supply-chain-for-big-data-analysis
8) Inventory Management π¦
Search and pick a dataset β tons of options here.
https://www.kaggle.com/datasets/fayez1/inventory-management
9) Web/Product Review Analytics βοΈ
Use SQL to analyze ratings, trends, and categories.
https://www.kaggle.com/datasets/zynicide/wine-reviews
10) Social Mediaβ Style Analytics (User Behavior / Health Trends) π
This one is more behavioral analytics than social media, but still great for SQL practice.
https://www.kaggle.com/datasets/aasheesh200/framingham-heart-study-dataset
1) E-commerce Deep Dive π
Brazilian orders, payments, reviews, deliveries β the full package.
https://www.kaggle.com/datasets/olistbr/brazilian-ecommerce
2) Sales Performance Tracker π
Perfect for learning KPIs, revenue trends, and top products.
https://www.kaggle.com/datasets/kyanyoga/sample-sales-data
3) HR Analytics (Attrition + Employee Insights) π₯
Analyze why employees leave + build dashboards with SQL.
https://www.kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset
4) Banking + Financial Data π³
Great for segmentation, customer behavior, and risk analysis.
https://www.kaggle.com/datasets?tags=11129-Banking
5) Healthcare & Mortality Analysis π₯
Serious dataset for serious SQL practice (filters, joins, grouping).
https://www.kaggle.com/datasets/cdc/mortality
6) Marketing + Customer Value (CRM) π―
Customer lifetime value, retention, and segmentation projects.
https://www.kaggle.com/datasets/pankajjsh06/ibm-watson-marketing-customer-value-data
7) Supply Chain & Procurement Analytics π
Great for vendor performance + procurement cost tracking.
https://www.kaggle.com/datasets/shashwatwork/dataco-smart-supply-chain-for-big-data-analysis
8) Inventory Management π¦
Search and pick a dataset β tons of options here.
https://www.kaggle.com/datasets/fayez1/inventory-management
9) Web/Product Review Analytics βοΈ
Use SQL to analyze ratings, trends, and categories.
https://www.kaggle.com/datasets/zynicide/wine-reviews
10) Social Mediaβ Style Analytics (User Behavior / Health Trends) π
This one is more behavioral analytics than social media, but still great for SQL practice.
https://www.kaggle.com/datasets/aasheesh200/framingham-heart-study-dataset
Kaggle
Brazilian E-Commerce Public Dataset by Olist
100,000 Orders with product, customer and reviews info
β€5
β
15 Power BI Interview Questions for Freshers ππ»
1οΈβ£ What is Power BI and what is it used for?
Answer: Power BI is a business analytics tool by Microsoft to visualize data, create reports, and share insights across organizations.
2οΈβ£ What are the main components of Power BI?
Answer: Power BI Desktop, Power BI Service (Cloud), Power BI Mobile, Power BI Gateway, and Power BI Report Server.
3οΈβ£ What is a DAX in Power BI?
Answer: Data Analysis Expressions (DAX) is a formula language used to create custom calculations in Power BI.
4οΈβ£ What is the difference between a calculated column and a measure?
Answer: Calculated columns are row-level computations stored in the table. Measures are aggregations computed at query time.
5οΈβ£ What is the difference between Power BI Desktop and Power BI Service?
Answer: Desktop is for building reports and data modeling. Service is for publishing, sharing, and collaboration online.
6οΈβ£ What is a data model in Power BI?
Answer: A data model organizes tables, relationships, and calculations to efficiently analyze and visualize data.
7οΈβ£ What is the difference between DirectQuery and Import mode?
Answer: Import loads data into Power BI, faster for analysis. DirectQuery queries the source directly, no data is imported.
8οΈβ£ What are slicers in Power BI?
Answer: Visual filters that allow users to dynamically filter report data.
9οΈβ£ What is Power Query?
Answer: A data connection and transformation tool in Power BI used for cleaning and shaping data before loading.
1οΈβ£0οΈβ£ What is the difference between a table visual and a matrix visual?
Answer: Table displays data in simple rows and columns. Matrix allows grouping, row/column hierarchies, and aggregations.
1οΈβ£1οΈβ£ What is a Power BI dashboard?
Answer: A single-page collection of visualizations from multiple reports for quick insights.
1οΈβ£2οΈβ£ What is a relationship in Power BI?
Answer: Links between tables that define how data is connected for accurate aggregations and filtering.
1οΈβ£3οΈβ£ What are filters in Power BI?
Answer: Visual-level, page-level, or report-level filters to restrict data shown in reports.
1οΈβ£4οΈβ£ What is Power BI Gateway?
Answer: A bridge between on-premise data sources and Power BI Service for scheduled refreshes.
1οΈβ£5οΈβ£ What is the difference between a report and a dashboard?
Answer: Reports can have multiple pages and visuals; dashboards are single-page, with pinned visuals from reports.
Power BI Resources: https://whatsapp.com/channel/0029Vai1xKf1dAvuk6s1v22c
π¬ React with β€οΈ for more!
1οΈβ£ What is Power BI and what is it used for?
Answer: Power BI is a business analytics tool by Microsoft to visualize data, create reports, and share insights across organizations.
2οΈβ£ What are the main components of Power BI?
Answer: Power BI Desktop, Power BI Service (Cloud), Power BI Mobile, Power BI Gateway, and Power BI Report Server.
3οΈβ£ What is a DAX in Power BI?
Answer: Data Analysis Expressions (DAX) is a formula language used to create custom calculations in Power BI.
4οΈβ£ What is the difference between a calculated column and a measure?
Answer: Calculated columns are row-level computations stored in the table. Measures are aggregations computed at query time.
5οΈβ£ What is the difference between Power BI Desktop and Power BI Service?
Answer: Desktop is for building reports and data modeling. Service is for publishing, sharing, and collaboration online.
6οΈβ£ What is a data model in Power BI?
Answer: A data model organizes tables, relationships, and calculations to efficiently analyze and visualize data.
7οΈβ£ What is the difference between DirectQuery and Import mode?
Answer: Import loads data into Power BI, faster for analysis. DirectQuery queries the source directly, no data is imported.
8οΈβ£ What are slicers in Power BI?
Answer: Visual filters that allow users to dynamically filter report data.
9οΈβ£ What is Power Query?
Answer: A data connection and transformation tool in Power BI used for cleaning and shaping data before loading.
1οΈβ£0οΈβ£ What is the difference between a table visual and a matrix visual?
Answer: Table displays data in simple rows and columns. Matrix allows grouping, row/column hierarchies, and aggregations.
1οΈβ£1οΈβ£ What is a Power BI dashboard?
Answer: A single-page collection of visualizations from multiple reports for quick insights.
1οΈβ£2οΈβ£ What is a relationship in Power BI?
Answer: Links between tables that define how data is connected for accurate aggregations and filtering.
1οΈβ£3οΈβ£ What are filters in Power BI?
Answer: Visual-level, page-level, or report-level filters to restrict data shown in reports.
1οΈβ£4οΈβ£ What is Power BI Gateway?
Answer: A bridge between on-premise data sources and Power BI Service for scheduled refreshes.
1οΈβ£5οΈβ£ What is the difference between a report and a dashboard?
Answer: Reports can have multiple pages and visuals; dashboards are single-page, with pinned visuals from reports.
Power BI Resources: https://whatsapp.com/channel/0029Vai1xKf1dAvuk6s1v22c
π¬ React with β€οΈ for more!
β€2
Top 50 SQL Interview Questions
1. What is SQL?
2. Differentiate between SQL and NoSQL databases.
3. What are the different types of SQL commands?
4. Explain the difference between WHERE and HAVING clauses.
5. Write a SQL query to find the second highest salary in a table.
6. What is a JOIN? Explain different types of JOINs.
7. How do you optimize slow-performing SQL queries?
8. What is a primary key? What is a foreign key?
9. What are indexes? Explain clustered and non-clustered indexes.
10. Write a SQL query to fetch the top 5 records from a table.
11. What is a subquery? Give an example.
12. Explain the concept of normalization.
13. What is denormalization? When is it used?
14. Describe transactions and their properties (ACID).
15. What is a stored procedure?
16. How do you handle NULL values in SQL?
17. Explain the difference between UNION and UNION ALL.
18. What are views? How are they useful?
19. What is a trigger? Give use cases.
20. How do you perform aggregate functions in SQL?
21. What is data partitioning?
22. How do you find duplicates in a table?
23. What is the difference between DELETE and TRUNCATE?
24. Explain window functions with examples.
25. What is the difference between correlated and non-correlated subqueries?
26. How do you enforce data integrity?
27. What are CTEs (Common Table Expressions)?
28. Explain EXISTS and NOT EXISTS operators.
29. How do SQL constraints work?
30. What is an execution plan? How do you use it?
31. Describe how to handle errors in SQL.
32. What are temporary tables?
33. Explain the difference between CHAR and VARCHAR.
34. How do you perform pagination in SQL?
35. What is a composite key?
36. How do you convert data types in SQL?
37. Explain locking and isolation levels in SQL.
38. How do you write recursive queries?
39. What are the advantages of using prepared statements?
40. How to debug SQL queries?
41. Differentiate between OLTP and OLAP databases.
42. What is schema in SQL?
43. How do you implement many-to-many relationships in SQL?
44. What is query optimization?
45. How do you handle large datasets in SQL?
46. Explain the difference between CROSS JOIN and INNER JOIN.
47. What is a materialized view?
48. How do you backup and restore a database?
49. Explain how indexing can degrade performance.
50. Can you write a query to find employees with no managers?
Double tap β€οΈ for detailed answers!
1. What is SQL?
2. Differentiate between SQL and NoSQL databases.
3. What are the different types of SQL commands?
4. Explain the difference between WHERE and HAVING clauses.
5. Write a SQL query to find the second highest salary in a table.
6. What is a JOIN? Explain different types of JOINs.
7. How do you optimize slow-performing SQL queries?
8. What is a primary key? What is a foreign key?
9. What are indexes? Explain clustered and non-clustered indexes.
10. Write a SQL query to fetch the top 5 records from a table.
11. What is a subquery? Give an example.
12. Explain the concept of normalization.
13. What is denormalization? When is it used?
14. Describe transactions and their properties (ACID).
15. What is a stored procedure?
16. How do you handle NULL values in SQL?
17. Explain the difference between UNION and UNION ALL.
18. What are views? How are they useful?
19. What is a trigger? Give use cases.
20. How do you perform aggregate functions in SQL?
21. What is data partitioning?
22. How do you find duplicates in a table?
23. What is the difference between DELETE and TRUNCATE?
24. Explain window functions with examples.
25. What is the difference between correlated and non-correlated subqueries?
26. How do you enforce data integrity?
27. What are CTEs (Common Table Expressions)?
28. Explain EXISTS and NOT EXISTS operators.
29. How do SQL constraints work?
30. What is an execution plan? How do you use it?
31. Describe how to handle errors in SQL.
32. What are temporary tables?
33. Explain the difference between CHAR and VARCHAR.
34. How do you perform pagination in SQL?
35. What is a composite key?
36. How do you convert data types in SQL?
37. Explain locking and isolation levels in SQL.
38. How do you write recursive queries?
39. What are the advantages of using prepared statements?
40. How to debug SQL queries?
41. Differentiate between OLTP and OLAP databases.
42. What is schema in SQL?
43. How do you implement many-to-many relationships in SQL?
44. What is query optimization?
45. How do you handle large datasets in SQL?
46. Explain the difference between CROSS JOIN and INNER JOIN.
47. What is a materialized view?
48. How do you backup and restore a database?
49. Explain how indexing can degrade performance.
50. Can you write a query to find employees with no managers?
Double tap β€οΈ for detailed answers!
β€25π1
ππ»ππ²πΏππΆπ²ππ²πΏ: You have 2 minutes to solve this SQL query.
Retrieve the department name and the highest salary in each department from the employees table, but only for departments where the highest salary is greater than $70,000.
π π²: Challenge accepted!
SELECT department, MAX(salary) AS highest_salary
FROM employees
GROUP BY department
HAVING MAX(salary) > 70000;
I used GROUP BY to group employees by department, MAX() to get the highest salary, and HAVING to filter the result based on the condition that the highest salary exceeds $70,000. This solution effectively shows my understanding of aggregation functions and how to apply conditions on the result of those aggregations.
π§πΆπ½ π³πΌπΏ π¦π€π ππΌπ― π¦π²π²πΈπ²πΏπ:
It's not about writing complex queries; it's about writing clean, efficient, and scalable code. Focus on mastering subqueries, joins, and aggregation functions to stand out!
React with β€οΈ for more
Retrieve the department name and the highest salary in each department from the employees table, but only for departments where the highest salary is greater than $70,000.
π π²: Challenge accepted!
SELECT department, MAX(salary) AS highest_salary
FROM employees
GROUP BY department
HAVING MAX(salary) > 70000;
I used GROUP BY to group employees by department, MAX() to get the highest salary, and HAVING to filter the result based on the condition that the highest salary exceeds $70,000. This solution effectively shows my understanding of aggregation functions and how to apply conditions on the result of those aggregations.
π§πΆπ½ π³πΌπΏ π¦π€π ππΌπ― π¦π²π²πΈπ²πΏπ:
It's not about writing complex queries; it's about writing clean, efficient, and scalable code. Focus on mastering subqueries, joins, and aggregation functions to stand out!
React with β€οΈ for more
β€10