Data Science & Machine Learning
Join this channel to learn data science, artificial intelligence and machine learning with fun quizzes, interesting projects and amazing resources for free.

For collaborations: @love_data
SQL Cheatsheet

This SQL cheatsheet is designed to be your quick reference guide for SQL programming. Whether you’re a beginner learning how to query databases or an experienced developer looking for a handy resource, this cheatsheet covers essential SQL topics.

1. Database Basics
- CREATE DATABASE db_name;
- USE db_name;

2. Tables
- Create Table: CREATE TABLE table_name (col1 datatype, col2 datatype);
- Drop Table: DROP TABLE table_name;
- Alter Table: ALTER TABLE table_name ADD column_name datatype;

3. Insert Data
- INSERT INTO table_name (col1, col2) VALUES (val1, val2);

4. Select Queries
- Basic Select: SELECT * FROM table_name;
- Select Specific Columns: SELECT col1, col2 FROM table_name;
- Select with Condition: SELECT * FROM table_name WHERE condition;

5. Update Data
- UPDATE table_name SET col1 = value1 WHERE condition;

6. Delete Data
- DELETE FROM table_name WHERE condition;

7. Joins
- Inner Join: SELECT * FROM table1 INNER JOIN table2 ON table1.col = table2.col;
- Left Join: SELECT * FROM table1 LEFT JOIN table2 ON table1.col = table2.col;
- Right Join: SELECT * FROM table1 RIGHT JOIN table2 ON table1.col = table2.col;

8. Aggregations
- Count: SELECT COUNT(*) FROM table_name;
- Sum: SELECT SUM(col) FROM table_name;
- Group By: SELECT col, COUNT(*) FROM table_name GROUP BY col;

9. Sorting & Limiting
- Order By: SELECT * FROM table_name ORDER BY col ASC|DESC;
- Limit Results: SELECT * FROM table_name LIMIT n; (MySQL, PostgreSQL, SQLite; SQL Server uses SELECT TOP n instead)

10. Indexes
- Create Index: CREATE INDEX idx_name ON table_name (col);
- Drop Index: DROP INDEX idx_name; (MySQL and SQL Server: DROP INDEX idx_name ON table_name;)

11. Subqueries
- SELECT * FROM table_name WHERE col IN (SELECT col FROM other_table);

12. Views
- Create View: CREATE VIEW view_name AS SELECT * FROM table_name;
- Drop View: DROP VIEW view_name;
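
To see several of these statements working together, here is a minimal sketch that runs them against an in-memory SQLite database from Python's standard library. The table name employees, its columns, and the sample rows are made up for illustration. SQLite has no CREATE DATABASE or USE, so section 1 is skipped, and other database engines may differ in small syntax details.

```python
import sqlite3

# In-memory SQLite database; "employees" and its columns are hypothetical names.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 2. Tables
cur.execute("CREATE TABLE employees (id INTEGER, name TEXT, dept TEXT, salary REAL)")

# 3. Insert data (placeholders keep values separate from the SQL text)
rows = [
    (1, "Ana", "Data", 5000.0),
    (2, "Ben", "Data", 4000.0),
    (3, "Cal", "Ops", 3000.0),
]
cur.executemany(
    "INSERT INTO employees (id, name, dept, salary) VALUES (?, ?, ?, ?)", rows
)

# 5. Update and 6. Delete, each restricted by a WHERE condition
cur.execute("UPDATE employees SET salary = 4500.0 WHERE id = 2")
cur.execute("DELETE FROM employees WHERE id = 3")

# 10. Index on the column used for grouping
cur.execute("CREATE INDEX idx_dept ON employees (dept)")

# 8. Aggregation with GROUP BY, then 9. sorting and limiting
dept_totals = cur.execute(
    "SELECT dept, COUNT(*), SUM(salary) FROM employees "
    "GROUP BY dept ORDER BY dept ASC LIMIT 5"
).fetchall()

# 12. View, queried through 11. a subquery
cur.execute("CREATE VIEW data_team AS SELECT * FROM employees WHERE dept = 'Data'")
names = cur.execute(
    "SELECT name FROM employees WHERE id IN (SELECT id FROM data_team) ORDER BY name"
).fetchall()

print(dept_totals)
print(names)
conn.close()
```

Note that UPDATE and DELETE without a WHERE clause would touch every row in the table, which is why both statements above are restricted by id.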
โค5๐Ÿ”ฅ1
🚀 Complete Roadmap to Become a Data Scientist in 5 Months

📅 Week 1-2: Fundamentals
✅ Day 1-3: Introduction to Data Science, its applications, and roles.
✅ Day 4-7: Brush up on Python programming.
✅ Day 8-10: Learn basic statistics 📊 and probability 🎲.

Week 3-4: Data Manipulation & Visualization
Day 11-15: Master Pandas for data manipulation.
📈 Day 16-20: Learn Matplotlib & Seaborn for data visualization.

🤖 Week 5-6: Machine Learning Foundations
🔬 Day 21-25: Introduction to scikit-learn.
📊 Day 26-30: Learn Linear & Logistic Regression.

Week 7-8: Advanced Machine Learning
🌳 Day 31-35: Explore Decision Trees & Random Forests.
📌 Day 36-40: Learn Clustering (K-Means, DBSCAN) & Dimensionality Reduction.

🧠 Week 9-10: Deep Learning
🤖 Day 41-45: Basics of Neural Networks with TensorFlow/Keras.
📸 Day 46-50: Learn CNNs & RNNs for image & text data.

Week 11-12: Data Engineering
🗄 Day 51-55: Learn SQL & Databases.
🧹 Day 56-60: Data Preprocessing & Cleaning.

📊 Week 13-14: Model Evaluation & Optimization
Day 61-65: Learn Cross-validation & Hyperparameter Tuning.
📉 Day 66-70: Understand Evaluation Metrics (Accuracy, Precision, Recall, F1-score).

Week 15-16: Big Data & Tools
Day 71-75: Introduction to Big Data Technologies (Hadoop, Spark).
Day 76-80: Learn Cloud Computing (AWS, GCP, Azure).

🚀 Week 17-18: Deployment & Production
🛠 Day 81-85: Deploy models using Flask or FastAPI.
📦 Day 86-90: Learn Docker & Cloud Deployment (AWS, Heroku).

🎯 Week 19-20: Specialization
Day 91-95: Choose NLP or Computer Vision, based on your interest.

Week 21-22: Projects & Portfolio
📂 Day 96-100: Work on Personal Data Science Projects.

💬 Week 23-24: Soft Skills & Networking
🎤 Day 101-105: Improve Communication & Presentation Skills.
Day 106-110: Attend Online Meetups & Forums.

🎯 Week 25-26: Interview Preparation
💻 Day 111-115: Practice Coding Interviews (LeetCode, HackerRank).
📂 Day 116-120: Review your projects & prepare for discussions.

👨‍💻 Week 27-28: Apply for Jobs
📩 Day 121-125: Start applying for Entry-Level Data Scientist positions.

🎤 Week 29-30: Interviews
Day 126-130: Attend Interviews & Practice Whiteboard Problems.

🔄 Week 31-32: Continuous Learning
📰 Day 131-135: Stay updated with the Latest Data Science Trends.

Week 33-34: Accepting Offers
Day 136-140: Evaluate job offers & Negotiate Your Salary.

Week 35-36: Settling In
🎯 Day 141-150: Start your New Data Science Job, adapt & keep learning!

🎉 Enjoy Learning & Build Your Dream Career in Data Science! 🚀🔥
โค7
SQL Joins: A Practical Cheatsheet for Professionals

If you’re working with relational data, whether you’re a business analyst, backend dev, or aspiring data scientist, mastering SQL joins isn’t optional. It’s fundamental.

Here’s a concise guide to the most important join types, with real-world use cases:


INNER JOIN

Returns records with matching keys from both tables.
Use case: Show only customers who’ve placed at least one order.


LEFT JOIN (OUTER)

Returns all rows from the left table, plus matched rows from the right. Where there is no match, the right-side columns are NULL.
Use case: List all customers, including those with zero orders.


RIGHT JOIN (OUTER)

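Returns all rows from the right table, plus matched rows from the left. Rarely used, because the same result can be written as a LEFT JOIN with the tables swapped, but powerful.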
Use case: Show all orders, even if the customer was deleted.


FULL OUTER JOIN

Returns all records from both tables, with NULLs wherever one side has no match.
Use case: Capture everything, matched and unmatched.


CROSS JOIN

Returns the Cartesian product: every row of one table paired with every row of the other.
Use case: Generate every possible product/supplier combo.


SELF JOIN

Joins a table to itself.
Use case: Show employees and their reporting managers.
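
As a rough illustration of the use cases above, the sketch below runs an INNER JOIN, a LEFT JOIN, a CROSS JOIN and a SELF JOIN against an in-memory SQLite database. The tables customers, orders and employees, their columns, and all rows are invented for this example. RIGHT JOIN and FULL OUTER JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and sample rows, used only for this sketch.
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cal');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
    INSERT INTO employees VALUES (1, 'Dee', NULL), (2, 'Eli', 1), (3, 'Fay', 1);
""")

# INNER JOIN: only customers who placed at least one order
inner = conn.execute("""
    SELECT DISTINCT c.name
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.name
""").fetchall()

# LEFT JOIN: every customer; COUNT(o.id) is 0 when no order matches
left = conn.execute("""
    SELECT c.name, COUNT(o.id)
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
""").fetchall()

# CROSS JOIN: every customer row paired with every order row (3 x 3 rows)
cross_count = conn.execute(
    "SELECT COUNT(*) FROM customers CROSS JOIN orders"
).fetchone()[0]

# SELF JOIN: employees paired with their reporting manager
self_join = conn.execute("""
    SELECT e.name, m.name
    FROM employees AS e
    INNER JOIN employees AS m ON e.manager_id = m.id
    ORDER BY e.name
""").fetchall()

print(inner, left, cross_count, self_join)
conn.close()
```

Cal has no orders, so Cal is missing from the INNER JOIN result but appears in the LEFT JOIN result with a count of 0. Dee has no manager, so the inner self join drops her row.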


Best Practices

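Use short table aliases (A, B) for clean code
Prefer explicit JOIN ... ON over listing tables in FROM and joining them in WHERE
Always test joins with LIMIT first, so a wrong join condition does not return millions of rows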
โค6๐Ÿ”ฅ3
Random Module in Python 👆
โค7
Data Cleaning Tips ✅
โค7
The Data Science skill no one talks about...

Every aspiring data scientist I talk to thinks their job starts when someone else gives them:
    1. a dataset, and
    2. a clearly defined metric to optimize for, e.g. accuracy

But it doesn’t.

It starts with a business problem you need to understand, frame, and solve. This is the key data science skill that separates senior from junior professionals.

Let’s go through an example.

Example

Imagine you are a data scientist at Uber, and your product lead tells you:

    👩‍💼: “We want to decrease user churn by 5% this quarter”


We say that a user churns when she decides to stop using Uber.

But why?

There are different reasons why a user would stop using Uber. For example:

   1. “Lyft is offering better prices for that geo” (pricing problem)
   2. “Car waiting times are too long” (supply problem)
   3. “The Android version of the app is very slow” (client-app performance problem)

You build this list ↑ by asking the right questions to the rest of the team. You need to understand the user’s experience using the app, from HER point of view.

Typically there is no single reason behind churn, but a combination of a few of these. The question is: which one should you focus on?

This is when you pull out your great data science skills and EXPLORE THE DATA 🔎.

You explore the data to understand how plausible each of the above explanations is. The output from this analysis is a single hypothesis you should consider further. Depending on the hypothesis, you will solve the data science problem differently.

For example…

Scenario 1: “Lyft Is Offering Better Prices” (Pricing Problem)

One solution would be to predict the segment of users who are likely to churn (possibly using an ML model) and send them personalized discounts via push notifications. To test that your solution works, you will need to run an A/B test, so you will split a percentage of Uber users into 2 groups:

    The A group. No user in this group will receive any discount.

    The B group. Users in this group whom the model flags as likely to churn will receive a price discount on their next trip.

You could add more groups (e.g. C, D, E…) to test different pricing points.
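
A minimal sketch of how the two-group test above could be set up and read out, using only the Python standard library. Everything here is an assumption for illustration: the function names, the 50/50 split, the seed, and the churn counts are made up, and a two-proportion z-test is just one common way to compare churn rates, not necessarily what a team at Uber would use.

```python
import math
import random


def assign_group(user_id: int, seed: int = 42) -> str:
    """Assign a user to group A (no discount) or B (discount eligible).

    Seeding a private generator with the user id makes the assignment
    reproducible: the same user always lands in the same group.
    """
    rng = random.Random(user_id * 1_000_003 + seed)
    return "A" if rng.random() < 0.5 else "B"


def two_proportion_z_test(churned_a: int, n_a: int, churned_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in churn rates."""
    p_a = churned_a / n_a
    p_b = churned_b / n_b
    # Pooled churn rate under the null hypothesis of no difference
    pooled = (churned_a + churned_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the normal approximation
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical outcome: 120 of 1000 users churned in A, 90 of 1000 in B.
z, p_value = two_proportion_z_test(120, 1000, 90, 1000)
print(f"group for user 7: {assign_group(7)}")
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A positive z means group A churned more than group B. A small p-value suggests the gap is unlikely to be noise, though whether the discount is worth its cost is a separate business question.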

In a nutshell

    1. Translating business problems into data science problems is the key data science skill that separates a senior from a junior data scientist.
    2. Ask the right questions, list possible solutions, and explore the data to narrow down the list to one.
    3. Solve this one data science problem.
โค10
📊 Data Science Essentials: What Every Data Enthusiast Should Know!

1️⃣ Understand Your Data
Always start with data exploration. Check for missing values, outliers, and overall distribution to avoid misleading insights.

2️⃣ Data Cleaning Matters
Noisy data leads to inaccurate predictions. Standardize formats, remove duplicates, and handle missing data effectively.

3️⃣ Use Descriptive & Inferential Statistics
Mean, median, mode, variance, standard deviation, correlation, hypothesis testing: these form the backbone of data interpretation.

4️⃣ Master Data Visualization
Bar charts, histograms, scatter plots, and heatmaps make insights more accessible and actionable.

5️⃣ Learn SQL for Efficient Data Extraction
Write optimized queries (SELECT, JOIN, GROUP BY, WHERE) to retrieve relevant data from databases.

6️⃣ Build Strong Programming Skills
Python (Pandas, NumPy, Scikit-learn) and R are essential for data manipulation and analysis.

7️⃣ Understand Machine Learning Basics
Know key algorithms (linear regression, decision trees, random forests, and clustering) to develop predictive models.

8️⃣ Learn Dashboarding & Storytelling
Power BI and Tableau help convert raw data into actionable insights for stakeholders.

🔥 Pro Tip: Always cross-check your results with different techniques to ensure accuracy!
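
Points 1 to 3 and the cross-checking tip can be tried on a tiny example with Python's built-in statistics module. The numbers below are invented (trip durations in minutes with one missing value and one extreme value), and the 1.5 x IQR rule is only one of several common outlier checks.

```python
import statistics

# Made-up sample: trip durations in minutes, one missing value, one extreme value.
raw = [12.0, 15.0, None, 14.0, 13.0, 15.0, 90.0, 16.0]

# 1. Understand your data: count missing values before computing anything.
missing = sum(1 for x in raw if x is None)

# 2. Data cleaning: here the missing value is simply dropped.
values = [x for x in raw if x is not None]

# 3. Descriptive statistics
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)
stdev = statistics.stdev(values)  # sample standard deviation

# Outlier check with the 1.5 * IQR rule
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
outliers = [x for x in values if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]

print(f"missing={missing} mean={mean} median={median} mode={mode}")
print(f"stdev={stdev:.1f} outliers={outliers}")
```

The mean (25.0) sits well above the median (15.0), which is a quick cross-check that a single extreme value is pulling the average up. The IQR rule flags that same value.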

Data Science Learning Series: https://whatsapp.com/channel/0029Va8v3eo1NCrQfGMseL2D

DOUBLE TAP ❤️ IF YOU FOUND THIS HELPFUL!
โค5๐Ÿ‘2