If you're thinking about building a data analytics project, you don't need another book, video, or blog post.
Just start.
You'll learn 10x more by failing big time than by reading someone else's advice
Starting exploratory data analysis (EDA) can be tricky. Many of us often feel lost at the beginning. Here's a simple way to get on track: start by creating hypothesis questions and defining KPIs based on your dataset and the field you are working in.
Follow these steps to guide your EDA:
1. Understand Your Field: Learn about the industry and the specific problems you're trying to solve. This will help you know what to look for in your data.
2. Identify Key Metrics: Decide on the most important KPIs for your analysis. These should align with your business goals and provide clear insights.
3. Create Hypotheses: Formulate questions that your EDA will try to answer. This keeps your analysis focused and purposeful.
Using these steps will make your EDA process smoother and ensure your results are valuable and relevant.
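As a rough illustration of steps 2 and 3, here is a minimal pandas sketch. The dataset, the column names (region, orders, revenue), the KPI and the hypothesis are all made up for the example; swap in whatever fits your own field.

```python
import pandas as pd

# Hypothetical sales data; every column name and value here is invented.
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South", "West"],
    "orders": [10, 20, 5, 15, 10],
    "revenue": [1000.0, 2400.0, 400.0, 1200.0, 1500.0],
})

# KPI: average order value (AOV) = total revenue / total orders, per region.
kpi = sales.groupby("region")[["orders", "revenue"]].sum()
kpi["avg_order_value"] = kpi["revenue"] / kpi["orders"]

# Hypothesis question: "Does the West region have the highest AOV?"
top_region = kpi["avg_order_value"].idxmax()

print(kpi)
print("Region with highest AOV:", top_region)
```

Writing the KPI and the question down before plotting anything gives each chart a purpose: it either supports the hypothesis or it doesn't.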
Business Analyst V/S Business Intelligence
Business Analyst (BA):
- Acts as a bridge between the business side and the IT side of an organization.
- Gathers and analyzes business requirements.
- Conducts stakeholder meetings.
Business Intelligence (BI):
- Focuses on data analysis, reporting, and data visualization using BI tools.
- Extracts and transforms data from various sources into meaningful insights to support decision-making.
- Builds dashboards and reports.
- Identifies trends and patterns in data.
Example:
Amazon: A BA might analyze customer feedback to improve delivery processes, while a BI professional could create dashboards to monitor sales trends and warehouse efficiency.
Google: A BA could work on improving user experience based on app usage data, whereas a BI expert might analyze advertising data to optimize ad campaigns.
When you get into data analytics and start your SQL journey, prioritize mastering the fundamental concepts that solve the majority of problems before moving on to other topics.
- Basic Aggregation Functions:
  1. AVG
  2. COUNT
  3. SUM
  4. MIN
  5. MAX
- JOINS
  1. Left
  2. Inner
  3. Self (Important, practice questions on self joins)
- Window Functions (Important)
  1. Learn how partitioning works
  2. Learn the different use cases for ranking/numbering functions (ROW_NUMBER, RANK, DENSE_RANK, NTILE)
  3. Use cases of LEAD & LAG functions
  4. Use cases of aggregate window functions
- GROUP BY
- WHERE vs HAVING
- CASE STATEMENT
- UNION vs UNION ALL
- LOGICAL OPERATORS
Other commonly used functions:
- IFNULL
- COALESCE
- ROUND
- Working with Date Functions
  1. Extracting YEAR/MONTH/WEEK/DAY
  2. Calculating date differences
- CTE
- Views & Triggers (optional)
Here is an amazing resource to learn & practice SQL: https://t.iss.one/sqlanalyst/195
Hope it helps in your SQL learning
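A few of the topics listed above can be tried without installing a database, using Python's built-in sqlite3 module. This is only a sketch: the employees and departments tables and their rows are invented, and the window function query assumes SQLite 3.25 or newer, which most recent Python builds bundle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER, salary REAL);
INSERT INTO departments VALUES (1, 'Sales'), (2, 'Finance'), (3, 'HR');
INSERT INTO employees VALUES
  (1, 'Asha', 1, 50000), (2, 'Ben', 1, 70000),
  (3, 'Chen', 2, 60000), (4, 'Dia', NULL, 40000);
""")

# GROUP BY with WHERE vs HAVING: WHERE filters rows before grouping,
# HAVING filters groups after aggregation.
avg_by_dept = conn.execute("""
    SELECT dept_id, COUNT(*) AS n, AVG(salary) AS avg_salary
    FROM employees
    WHERE dept_id IS NOT NULL
    GROUP BY dept_id
    HAVING COUNT(*) >= 2
""").fetchall()

# LEFT JOIN keeps every department, even one with no employees;
# COALESCE turns the resulting NULL total into 0.
totals = conn.execute("""
    SELECT d.name, COALESCE(SUM(e.salary), 0) AS total_salary
    FROM departments d
    LEFT JOIN employees e ON e.dept_id = d.id
    GROUP BY d.name
    ORDER BY d.name
""").fetchall()

# Window function: rank employees by salary within each department.
ranked = conn.execute("""
    SELECT name, dept_id,
           RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS salary_rank
    FROM employees
    WHERE dept_id IS NOT NULL
    ORDER BY dept_id, salary_rank
""").fetchall()

print(avg_by_dept)
print(totals)
print(ranked)
```

Unlike GROUP BY, the window function keeps one output row per employee, which is why it suits ranking and row-to-row comparisons.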
Will AI Tools for Data Analysis Replace Data Analysts?
AI and data analysis are two closely related fields that have been developing rapidly over the last several years. As technology continues to evolve, the question arises: Will AI tools for data analysis replace data analysts?
This article describes how AI relates to data analysis, what it can do, and whether AI tools for data analysis will replace data analysts. It starts with an introduction to AI and its fundamental aspects, looks at how AI may affect the world in the distant future, and focuses on how AI is used in data analysis.
The modern generation of AI comprises Machine Learning, Deep Learning, and Generative AI. While Generative AI is the capability to produce content such as images, sound, and music, Machine Learning is a specific type of AI that trains an algorithm on data so that it can make predictions.
Let's go back to the basics...!
Here's what you do to become a Data Analyst
- Learn SQL (best skill to have)
- Learn Excel (hidden requirement)
- Learn a BI tool (for nice portfolio projects)
Don't stop there, you still have work to do
- Create a portfolio
- Learn how to create an appealing resume
- Learn how to answer interview questions (STAR method)
After this, my favorite, networking
- Comment on posts
- Start posting yourself
- Reach out to all the recruiters
It can take you anywhere from a couple of months to a year!
It all depends on how much time you can dedicate each day!
But the longer you wait, the longer it will take!
Get after it...!
Complete Guide to Data Analytics for Beginners
https://youtu.be/1-T-VBjLpJo?si=fo_RhbXC46Hg-FVE
Data Analysis Books | Python | SQL | Excel | Artificial Intelligence | Power BI | Tableau | AI Resources
What should be the next topic for a YouTube video?
Anonymous Poll
- SQL: 34%
- Python: 21%
- Excel: 12%
- Power BI: 24%
- Tableau: 8%
Since most of you voted for SQL, I created this video, which covers essential SQL topics & free resources to practice SQL.
https://youtu.be/VCZxODefTIs?si=1XB44uv5DIpcJA4K
Please like this video & subscribe to my YouTube channel so that I can bring more awesome videos. I would really appreciate any feedback in the comments :)
Guesstimate questions can be scary because they weigh heavily on your performance in those all-important interviews, often for consulting, data analytics, or product management roles. No need to worry; you can do it! This guide looks at how to approach guesstimate questions with confidence and turn what sounds like a guessing game into an opportunity to showcase your analytical thinking.
https://datasimplifier.com/guesstimate-questions/
Reminder for all data analyst job seekers
DA + HR Knowledge → HR Analyst
DA + Sales Knowledge → Sales Analyst
DA + Supply Chain Knowledge → Supply Chain Analyst
DA + Finance Knowledge → Finance Analyst
DA + Research Knowledge → Research Analyst
DA + Marketing Knowledge → Marketing Analyst
What does it mean?
- Build more functional / domain knowledge
- By doing more projects & research
Why?
- To increase your chances of landing a DA job
10 Data Cleaning Techniques Every Data Analyst Should Master:
1. Handling Missing Data
Use methods like imputation (mean, median, mode) or deletion to handle missing values.
In Python, pandas functions like fillna() or dropna() are useful.
Example: df.fillna(df.mean(numeric_only=True)) replaces missing values in numeric columns with the column mean.
2. Removing Duplicates
Identify and remove duplicate records to ensure the dataset is accurate. Use drop_duplicates() in pandas.
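To see techniques 1 and 2 together, here is a small runnable sketch. The DataFrame and its column names are invented for illustration, and filling text gaps with "Unknown" is just one possible choice, not a rule.

```python
import numpy as np
import pandas as pd

# Invented example data: one exact duplicate row, one missing age, one missing city.
df = pd.DataFrame({
    "name": ["Ann", "Ann", "Bob", "Cy"],
    "age": [30.0, 30.0, np.nan, 50.0],
    "city": ["Pune", "Pune", "Delhi", None],
})

# Remove exact duplicate rows first so they do not skew the mean.
df = df.drop_duplicates().reset_index(drop=True)

# Impute the numeric gap with the column mean; fill the text gap with a placeholder.
df["age"] = df["age"].fillna(df["age"].mean())
df["city"] = df["city"].fillna("Unknown")

print(df)
```

Dropping duplicates before imputing matters here: with the duplicate row left in, the mean age would be pulled toward 30.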
3. Standardizing Data
Ensure consistency in formatting, such as dates and strings.
Use str.lower() or pd.to_datetime() for standardization.
4. Handling Outliers
Detect and manage outliers using statistical methods or by creating visuals like box plots. Methods include capping, flooring, or removing outliers.
Example: df = df[(df['column'] >= lower_limit) & (df['column'] <= upper_limit)]
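The outlier example above leaves lower_limit and upper_limit undefined. One common convention, assumed here rather than taken from the post, is the 1.5 × IQR rule. The sketch below combines it with the standardization step; the data and column names are made up.

```python
import pandas as pd

# Invented data: inconsistent casing/whitespace, dates as strings, one extreme amount.
df = pd.DataFrame({
    "city": [" Pune", "PUNE", "delhi ", "Delhi", "pune"],
    "order_date": ["2024-01-05", "2024-01-06", "2024-01-07", "2024-01-08", "2024-01-09"],
    "amount": [100.0, 110.0, 90.0, 105.0, 5000.0],
})

# Standardize strings and parse the date column.
df["city"] = df["city"].str.strip().str.lower()
df["order_date"] = pd.to_datetime(df["order_date"], format="%Y-%m-%d")

# Limits from the 1.5 * IQR rule (an assumed convention, not the only option).
q1, q3 = df["amount"].quantile([0.25, 0.75])
iqr = q3 - q1
lower_limit, upper_limit = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Option A: remove rows outside the limits.
filtered = df[(df["amount"] >= lower_limit) & (df["amount"] <= upper_limit)]

# Option B: cap and floor the values instead of dropping rows.
capped = df["amount"].clip(lower=lower_limit, upper=upper_limit)

print(filtered)
print(capped)
```

Whether to remove or cap depends on the analysis: removing loses rows, while capping keeps them but distorts the extreme values.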
5. Correcting Data Types
Check that all columns have the correct data types for analysis. Use astype() in pandas to convert data types.
6. Normalizing and Scaling Data
Normalize or scale data to bring all values into a similar range, which is important for algorithms like K-Means clustering.
Use StandardScaler or MinMaxScaler from scikit-learn.
Example: from sklearn.preprocessing import StandardScaler; df_scaled = StandardScaler().fit_transform(df)
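To show what the scalers compute, here is a sketch that does the same arithmetic in plain pandas instead of scikit-learn (a deliberate swap so the example needs only pandas). The data is invented. Scaling only makes sense on numeric columns, so the types are corrected first.

```python
import pandas as pd

# Invented data: numbers stored as strings, a common result of CSV imports.
df = pd.DataFrame({
    "age": ["20", "30", "40"],
    "income": ["1000", "2000", "3000"],
})

# Correct the data types before doing any arithmetic.
df = df.astype({"age": "int64", "income": "float64"})

# Z-score scaling: (x - mean) / std, using the population std (ddof=0)
# as StandardScaler does.
standardized = (df - df.mean()) / df.std(ddof=0)

# Min-max scaling to the [0, 1] range, the default range of MinMaxScaler.
min_max = (df - df.min()) / (df.max() - df.min())

print(standardized)
print(min_max)
```

After either transformation, age and income sit on comparable scales, so a distance-based algorithm such as K-Means is no longer dominated by the larger income values.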
7. Encoding Categorical Variables
Convert categorical data into numerical format using techniques like one-hot encoding or label encoding. Use pd.get_dummies() or LabelEncoder.
Example: df_encoded = pd.get_dummies(df, columns=['category'])
8. Dealing with Inconsistent Data
Identify and correct inconsistencies in data entries, such as typos or inconsistent naming conventions.
Example: df['column'] = df['column'].replace({'val1':'value1', 'val2':'value2'})
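Techniques 7 and 8 interact: if inconsistent spellings are not fixed first, each variant becomes its own encoded column. A minimal sketch with invented data, using pandas category codes as a stand-in for LabelEncoder:

```python
import pandas as pd

# Invented data with inconsistent spellings of the same category.
df = pd.DataFrame({
    "category": ["Electronics", "electronics", "Elec.", "Grocery", "grocery "],
    "units": [3, 1, 2, 5, 4],
})

# Fix inconsistencies first: normalise case and whitespace, then map known variants.
df["category"] = (
    df["category"].str.strip().str.lower().replace({"elec.": "electronics"})
)

# One-hot encoding: one 0/1 indicator column per remaining category.
df_encoded = pd.get_dummies(df, columns=["category"], dtype=int)

# Label-encoding alternative: integer codes in sorted category order.
df["category_code"] = df["category"].astype("category").cat.codes

print(df_encoded)
print(df)
```

Without the cleaning step, get_dummies would have produced five category columns here instead of two.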