Data Science Projects
Perfect channel for Data Scientists

Learn Python, AI, R, Machine Learning, Data Science and many more

Admin: @love_data
Data Analyst Roadmap

Like if it helps ❤️

Here are some essential SQL tips for beginners 👇👇

◆ Primary Key = Unique Key + Not Null constraint
◆ To perform a case-insensitive search, apply UPPER() to the column and write the pattern in upper case, e.g. UPPER(customer_name) LIKE 'A%A'
◆ The LIKE operator is for string data types
◆ COUNT(*), COUNT(1), and COUNT(0) all return the same result
◆ Aggregate functions ignore NULL values, except COUNT(*), which counts every row
◆ SUM and AVG are for numeric data types; MIN, MAX, and COUNT also work on strings and dates; STRING_AGG is for string data types
◆ For row-level filtering use WHERE; for filtering on aggregates use HAVING
◆ UNION ALL keeps duplicates, whereas UNION removes them
◆ If the results cannot contain duplicates, use UNION ALL instead of UNION, since it skips the de-duplication step
◆ A subquery in the FROM clause needs an alias in most databases, and the outer SELECT refers to its columns through that alias
◆ A subquery's output can feed a NOT IN condition. If the subquery returns a NULL, NOT IN matches no rows, so filter NULLs out first.
◆ CTEs are easier to read than nested subqueries. Performance is usually similar, though it depends on how the database handles CTEs.
◆ When joining two tables, if one table has only one row, we can use 1=1 as the join condition. This behaves as a CROSS JOIN.
◆ Window functions return a value for every row instead of collapsing rows the way GROUP BY does.
◆ The difference between RANK() and DENSE_RANK() is that RANK() leaves gaps after ties (1, 1, 3), while DENSE_RANK() does not (1, 1, 2).
◆ EXISTS works on true/false conditions. If the subquery returns at least one row, the condition is TRUE and the outer row is kept.
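
A few of the tips above can be checked quickly with Python's built-in sqlite3 module. This is only a sketch: the customers table and its rows are made up for illustration, and other databases may differ in small details.

```python
import sqlite3

# Hypothetical customers table with one NULL city (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO customers (name, city) VALUES
        ('Asha', 'Pune'), ('Anna', NULL), ('Bob', 'Pune');
""")

# COUNT(*) counts every row; COUNT(city) skips the row whose city is NULL.
count_all, count_city = conn.execute(
    "SELECT COUNT(*), COUNT(city) FROM customers"
).fetchone()

# WHERE filters rows before grouping; HAVING filters the grouped result.
having_rows = conn.execute(
    "SELECT city, COUNT(*) FROM customers "
    "WHERE city IS NOT NULL GROUP BY city HAVING COUNT(*) > 1"
).fetchall()

# UNION removes duplicates; UNION ALL keeps them.
cities = "SELECT city FROM customers WHERE city IS NOT NULL"
union_rows = conn.execute(f"{cities} UNION {cities}").fetchall()
union_all_rows = conn.execute(f"{cities} UNION ALL {cities}").fetchall()

# NOT IN against a subquery that returns a NULL matches no rows at all.
not_in_with_null = conn.execute(
    "SELECT name FROM customers "
    "WHERE 'Delhi' NOT IN (SELECT city FROM customers)"
).fetchall()
not_in_filtered = conn.execute(
    "SELECT name FROM customers "
    "WHERE 'Delhi' NOT IN (SELECT city FROM customers WHERE city IS NOT NULL)"
).fetchall()

print(count_all, count_city)                 # 3 2
print(len(union_rows), len(union_all_rows))  # 1 4
print(not_in_with_null, len(not_in_filtered))
```

The NOT IN pair is the one worth remembering: the only difference between the two queries is the IS NOT NULL filter inside the subquery.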

Like for more 😄😄

You don't need to know everything about every data tool. Focus on what will help you land a job.

For Excel:
- IFS (all variations)
- XLOOKUP
- IMPORTRANGE (in GSheets)
- Pivot Tables
- Dynamic functions like TODAY()

For SQL:
- Sum
- Group By
- Window Functions
- CTEs
- Joins

For Tableau:
- Calculated Fields
- Sets
- Groups
- Formatting

For Power BI:
- Power Query for data transformation
- DAX (Data Analysis Expressions) for creating custom calculations
- Relationships between tables
- Creating interactive and dynamic dashboards
- Utilizing slicers and filters effectively

I have put together resources for Data Analysts 👇👇
https://whatsapp.com/channel/0029VaGgzAk72WTmQFERKh02

Hope it helps :)

Data Science Cheatsheet 💪

Most people learn SQL just enough to pull some data. But if you really understand it, you can analyze massive datasets without touching Excel or Python.

Here are 7 game-changing SQL concepts that will make you a data pro:

👇


1. Stop pulling raw data. Start pulling insights.

The biggest mistake? Running a query that gives you everything and then filtering it later.

Good analysts don’t pull raw data. They shape the data before it even reaches them.

2. “SELECT *” is a rookie move.

Pulling all columns is lazy and slow.

A pro only selects what they need.
✔️ Fewer columns = Faster queries
✔️ Less noise = Clearer insights

The more precise your query, the less time you waste cleaning data.

3. GROUP BY is your best friend.

You don’t need 100,000 rows of transactions. What you need is:
✔️ Sales per region
✔️ Average order size per customer
✔️ Number of signups per month

Grouping turns chaotic data into useful summaries.
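
As a rough sketch of the idea, here is a GROUP BY summary run through Python's sqlite3 module. The orders table, its columns, and the numbers are invented for the example.

```python
import sqlite3

# Hypothetical orders table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("North", 100.0), ("North", 50.0), ("South", 70.0)],
)

# One summary row per region instead of one row per transaction.
sales_per_region = conn.execute(
    "SELECT region, SUM(amount), AVG(amount), COUNT(*) "
    "FROM orders GROUP BY region ORDER BY region"
).fetchall()

for region, total, average, order_count in sales_per_region:
    print(region, total, average, order_count)
```

Only the named columns are selected, which also matches the earlier advice about avoiding SELECT *.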

4. Joins = Connecting the dots.

Your most important data is split across multiple tables.

Want to know how much each customer spent? You need to join:
✔️ Customer info
✔️ Order history
✔️ Payments

Joins = unlocking hidden insights.
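
A minimal sketch of that three-table join, using Python's sqlite3 module. The customers, orders, and payments tables and their contents are assumptions made up for the example, not a real schema.

```python
import sqlite3

# Hypothetical schema: customers place orders, and orders receive payments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE payments (order_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO payments VALUES (10, 20.0), (11, 30.0), (12, 5.0);
""")

# Follow the keys from customer to order to payment, then total per customer.
spend_per_customer = conn.execute("""
    SELECT c.name, SUM(p.amount) AS total_spent
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    JOIN payments AS p ON p.order_id = o.id
    GROUP BY c.id, c.name
    ORDER BY total_spent DESC
""").fetchall()

print(spend_per_customer)
```

An inner JOIN drops customers with no orders; a LEFT JOIN from customers would keep them with a NULL total.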

5. Window functions will blow your mind.

They let you:
✔️ Rank customers by total purchases
✔️ Calculate rolling averages
✔️ Compare each row to the overall trend

It’s like pivot tables, but way more powerful.
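
Here is a small sketch of ranking and a rolling average with window functions, run through Python's sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (window functions were added there), and the tables and values are invented for the example.

```python
import sqlite3

# Hypothetical tables (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (customer TEXT, total INTEGER);
    INSERT INTO purchases VALUES ('A', 300), ('B', 300), ('C', 100);
    CREATE TABLE daily_sales (day INTEGER, amount REAL);
    INSERT INTO daily_sales VALUES (1, 10), (2, 20), (3, 30), (4, 40);
""")

# Every input row stays in the output; the window adds a value next to it.
ranked = conn.execute("""
    SELECT customer, total,
           RANK() OVER (ORDER BY total DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY total DESC) AS dense_rnk
    FROM purchases
    ORDER BY total DESC, customer
""").fetchall()

# Two-day rolling average: the current row plus the one before it.
rolling = [row[1] for row in conn.execute("""
    SELECT day,
           AVG(amount) OVER (
               ORDER BY day ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
           ) AS rolling_avg
    FROM daily_sales
    ORDER BY day
""")]

print(ranked)   # the tie on 300 gives rank 1, 1, 3 but dense rank 1, 1, 2
print(rolling)
```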

6. CTEs will save you from spaghetti SQL.

Instead of writing a 50-line nested query, break it into steps.

CTEs (Common Table Expressions) make your SQL:
✔️ Easier to read
✔️ Easier to debug
✔️ Reusable within the same query

Good SQL is clean SQL.
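
A short sketch of breaking a query into CTE steps, again through Python's sqlite3 module. The orders table and the "big spender" rule are assumptions for illustration only.

```python
import sqlite3

# Hypothetical orders table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 100.0), (1, 50.0), (2, 20.0), (3, 40.0)],
)

# Step 1 totals each customer; step 2 reuses step 1 twice in the same query.
big_spenders = conn.execute("""
    WITH customer_totals AS (
        SELECT customer_id, SUM(amount) AS total
        FROM orders
        GROUP BY customer_id
    ),
    above_average AS (
        SELECT customer_id, total
        FROM customer_totals
        WHERE total > (SELECT AVG(total) FROM customer_totals)
    )
    SELECT customer_id, total FROM above_average ORDER BY customer_id
""").fetchall()

print(big_spenders)
```

Each named step can be run on its own while debugging, which is most of the appeal over one deeply nested query.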

7. Indexes = Speed.

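If your queries take forever, your database is probably doing unnecessary work, such as scanning every row of a table.

An index on the columns you filter or join on lets the database find matching rows without a full scan.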

If you work with large datasets, this is a game changer.
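
One way to see the effect is to ask the database for its query plan before and after adding an index. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; the table, index name, and data are made up, and other databases expose plans through different commands.

```python
import sqlite3

# Hypothetical orders table with 1,000 generated rows (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

query = "SELECT amount FROM orders WHERE customer_id = 42"


def plan(connection, sql):
    """Return SQLite's plan description for a query as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[3] for row in rows)  # the 4th column holds the detail text


plan_before = plan(conn, query)  # without an index: a scan of the whole table
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(conn, query)   # with the index: a search using idx_orders_customer

print(plan_before)
print(plan_after)
```

The exact wording of the plan varies between SQLite versions, so it is safer to look for the index name than to match the text exactly.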

SQL isn’t just about pulling data. It’s about analyzing, transforming, and optimizing it.

Master these 7 concepts, and you’ll never look at SQL the same way again.

Join us on WhatsApp: https://whatsapp.com/channel/0029VanC5rODzgT6TiTGoa1v

Can AI replace data scientists?

AI can automate many tasks that data scientists perform, but it is unlikely to completely replace them in the foreseeable future. Rather than replacing data scientists, AI will enhance their capabilities by automating repetitive tasks, allowing them to focus on higher-level strategy, decision-making, and ethical considerations.

What AI Can Automate in Data Science:

Data Cleaning & Preparation – AI can automate data wrangling tasks like handling missing values and detecting anomalies.

Feature Engineering – AI-driven tools can generate and select features automatically.

Model Selection & Hyperparameter Tuning – Automated Machine Learning (AutoML) can choose models, tune hyperparameters, and even optimize architectures.

Basic Data Visualization & Reporting – AI tools can generate dashboards and insights automatically.

What AI Cannot Replace:

Problem-Solving & Business Understanding – AI cannot define business problems, formulate hypotheses, or align analysis with strategic goals.

Interpretability & Decision-Making – AI-generated models can be complex, so a human expert is needed to interpret results and make decisions.

Innovation – AI lacks the ability to identify new opportunities or design novel experiments.

Ethical Considerations & Bias Handling – AI can introduce biases, and data scientists are needed to ensure fairness and ethical use.

Roadmap for Learning Machine Learning (ML)

Here’s a concise, point-by-point roadmap for learning ML:

1. Prerequisites
- Learn programming basics (e.g., Python).
- Understand mathematics:
1 - Linear Algebra (vectors, matrices).
2 - Probability and Statistics (distributions, Bayes’ theorem).
3 - Calculus (derivatives, gradients).
- Familiarize yourself with data structures and algorithms.

2. Basics of Machine Learning
- Understand ML concepts:
Supervised, unsupervised, and reinforcement learning.
Training, validation, and testing datasets.
- Learn how to preprocess and clean data.
- Get familiar with Python libraries:
NumPy, Pandas, Matplotlib, and Seaborn.

3. Supervised Learning
- Study regression techniques:
Linear Regression, plus Logistic Regression (used for classification despite its name).
- Explore classification algorithms:
Decision Trees, Support Vector Machines (SVM), k-NN.
- Learn model evaluation metrics:
Accuracy, Precision, Recall, F1 Score, ROC-AUC.
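
To make the evaluation metrics concrete, here is a small pure-Python sketch for a binary classifier. The labels and predictions are made-up sample data; in practice a library such as scikit-learn provides these metrics.

```python
# Made-up ground truth and predictions for a binary classifier (1 = positive).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

accuracy = (tp + tn) / len(pairs)   # share of all predictions that are right
precision = tp / (tp + fp)          # of predicted positives, how many are real
recall = tp / (tp + fn)             # of real positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(accuracy, precision, recall, f1)
```

This sketch skips the zero-division checks a real implementation needs when a model predicts no positives.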

4. Unsupervised Learning
- Learn clustering techniques:
k-Means, DBSCAN, Hierarchical Clustering.
- Understand Dimensionality Reduction:
PCA, t-SNE.
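
For a feel of how k-Means works, here is a bare-bones pure-Python sketch. The points are invented, and the initialisation (just taking the first k points) is a simplifying assumption; real libraries use smarter starts such as k-means++.

```python
import math


def kmeans(points, k, iterations=10):
    """Tiny k-means: returns (labels, centroids) for a list of coordinate tuples."""
    centroids = [points[i] for i in range(k)]  # naive init: the first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids
    return labels, centroids


# Two visually obvious groups (illustrative data only).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```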

5. Advanced Concepts
- Explore ensemble methods:
Random Forest, Gradient Boosting, XGBoost, LightGBM.
- Learn hyperparameter tuning techniques:
Grid Search, Random Search.
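
The difference between the two tuning strategies can be sketched in plain Python. Here validation_error is a hypothetical stand-in for "train a model with these settings and measure its validation error", and the parameter names and ranges are made up.

```python
import itertools
import random


def validation_error(learning_rate, depth):
    """Hypothetical stand-in for training a model and scoring it on validation data."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 5) ** 2


# Grid search: try every combination from fixed lists of values.
grid = {"learning_rate": [0.01, 0.1, 1.0], "depth": [3, 5, 7]}
grid_candidates = [
    dict(zip(grid, values)) for values in itertools.product(*grid.values())
]
best_grid = min(grid_candidates, key=lambda params: validation_error(**params))

# Random search: sample a fixed number of combinations from ranges.
rng = random.Random(0)  # seeded so the run is repeatable
random_candidates = [
    {"learning_rate": rng.uniform(0.01, 1.0), "depth": rng.randint(3, 7)}
    for _ in range(20)
]
best_random = min(random_candidates, key=lambda params: validation_error(**params))

print(best_grid)
print(best_random)
```

Grid search cost grows with every extra parameter, while random search keeps a fixed budget, which is why it is often preferred when there are many hyperparameters.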

6. Deep Learning (Optional for Advanced ML)
- Learn neural networks basics:
Forward and Backpropagation.
- Study Deep Learning libraries:
TensorFlow, PyTorch, Keras.
- Explore CNNs, RNNs, and Transformers.

7. Hands-on Practice
- Work on small projects like:
1 - Predicting house prices.
2 - Sentiment analysis on tweets.
3 - Image classification.
- Explore Kaggle competitions and datasets.

8. Deployment
- Learn how to deploy ML models:
Use Flask, FastAPI, or Django.
- Explore cloud platforms: AWS, Azure, Google Cloud.

9. Keep Learning
- Stay updated with new techniques:
Follow blogs, papers, and conferences (e.g., NeurIPS, ICML).
- Dive into specialized fields:
NLP, Computer Vision, Reinforcement Learning.

Join for more: https://t.iss.one/datalemur

Quick Power BI DAX Revision

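1. Measures: Measures in DAX are calculations that are used in Power BI to perform aggregations, calculations, and comparisons on data. A measure is written as a named DAX formula. CALCULATE is not a way to define a measure; it is a function used inside one to modify the filter context, and DEFINE MEASURE only applies to measures declared inside a DAX query.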

2. Calculated Columns: Calculated columns are columns that are created in a table by using DAX expressions. They are calculated row by row when the data is loaded or refreshed, and the results are stored in the model.

3. DAX Functions: DAX provides a wide range of functions for data manipulation and calculation. Some common functions include SUM, AVERAGE, COUNT, FILTER, CALCULATE, RELATED, ALL, ALLEXCEPT, and many more.

4. Context: DAX calculations are performed within a context, which can be row context or filter context. Understanding how context works is crucial for writing accurate DAX expressions.

5. Relationships: Power BI data models are built on relationships between tables. DAX expressions can leverage these relationships to perform calculations across related tables.

6. Time Intelligence Functions: DAX includes a set of time intelligence functions that enable you to perform calculations based on dates and time periods. Examples include TOTALYTD, SAMEPERIODLASTYEAR, DATESBETWEEN, etc.

7. Variables: DAX allows you to declare and use variables within expressions to improve readability and performance of complex calculations.

8. Aggregation Functions: Beyond simple aggregators, DAX provides iterator functions like SUMX, AVERAGEX, COUNTX that evaluate an expression for each row of a table and then aggregate the results.

9. Logical Functions: DAX includes logical functions such as IF, AND, OR, SWITCH that help in implementing conditional logic within calculations.

10. Error Handling: DAX provides functions like ISBLANK, IFERROR, BLANK, etc., for handling errors and missing data in calculations.

👩🏻‍💻 Why should one study Linear Algebra for ML?

👉🏼 Clearly, to develop a better intuition for machine learning and deep learning algorithms and not treat them as black boxes. This would allow you to choose proper hyper-parameters and develop a better model. You would also be able to code algorithms from scratch and make your own variations to them as well.

👉🏼 Learn Linear Algebra for Machine Learning with:

Khan Academy: https://www.khanacademy.org/math/linear-algebra

Udacity: https://www.udacity.com/course/linear-algebra-refresher-course--ud953

Coursera: https://www.coursera.org/learn/linear-algebra-machine-learning

Here are some amazing freely available ebooks on the same topic:

Mathematics for Machine Learning: https://mml-book.github.io/book/mml-book.pdf

An Introduction to Statistical Learning: https://faculty.marshall.usc.edu/gareth-james/ISL/

Happy machine learning! 🎉

Python Detailed Roadmap 🚀

📌 1. Basics
◼ Data Types & Variables
◼ Operators & Expressions
◼ Control Flow (if, loops)

📌 2. Functions & Modules
◼ Defining Functions
◼ Lambda Functions
◼ Importing & Creating Modules

📌 3. File Handling
◼ Reading & Writing Files
◼ Working with CSV & JSON

📌 4. Object-Oriented Programming (OOP)
◼ Classes & Objects
◼ Inheritance & Polymorphism
◼ Encapsulation

📌 5. Exception Handling
◼ Try-Except Blocks
◼ Custom Exceptions

📌 6. Advanced Python Concepts
◼ List & Dictionary Comprehensions
◼ Generators & Iterators
◼ Decorators
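
As a quick taste of the advanced concepts in step 6, here is a short sketch. The function names (countdown, count_calls, add) are made up for the example.

```python
import functools

# List and dictionary comprehensions build collections in one expression.
squares = [n * n for n in range(5)]
lengths = {word: len(word) for word in ["numpy", "pandas"]}


def countdown(n):
    """Generator: yields n, n-1, ..., 1 lazily, one value at a time."""
    while n > 0:
        yield n
        n -= 1


def count_calls(func):
    """Decorator: wraps a function and counts how often it is called."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def add(a, b):
    return a + b


print(squares)             # [0, 1, 4, 9, 16]
print(lengths)             # {'numpy': 5, 'pandas': 6}
print(list(countdown(3)))  # [3, 2, 1]
print(add(1, 2), add.calls)
```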

📌 7. Essential Libraries
◼ NumPy (Arrays & Computations)
◼ Pandas (Data Analysis)
◼ Matplotlib & Seaborn (Visualization)

📌 8. Web Development & APIs
◼ Web Scraping (BeautifulSoup, Scrapy)
◼ API Integration (Requests)
◼ Flask & Django (Backend Development)

📌 9. Automation & Scripting
◼ Automating Tasks with Python
◼ Working with Selenium & PyAutoGUI

📌 10. Data Science & Machine Learning
◼ Data Cleaning & Preprocessing
◼ Scikit-Learn (ML Algorithms)
◼ TensorFlow & PyTorch (Deep Learning)

📌 11. Projects
◼ Build Real-World Applications
◼ Showcase on GitHub

📌 12. ✅ Apply for Jobs
◼ Strengthen Resume & Portfolio
◼ Prepare for Technical Interviews

Like for more ❤️💪

Since many of you have been asking for a Data Science session,

📌 we have put one together for you!! 👨🏻‍💻 👩🏻‍💻

It will help you speed up your job hunt 💪

Register here
👇👇
https://go.acciojob.com/RYFvdU

Only limited free slots are available, so register now

Python Cheat Sheet.pdf
677.7 KB
This cheat sheet covers the basic Python required for data analysis, excluding pandas, NumPy, and other libraries