Data Analyst Interview Resources
Join our Telegram channel to learn how data analysis can reveal fascinating patterns, trends, and stories hidden within the numbers! 📊

For ads & suggestions: @love_data
Here are some statistics interview questions for data analyst interviews:

Que 1. What Is the Difference Between Inferential Statistics and Descriptive Statistics?
Ans 1. Descriptive statistics summarize the data you’ve collected. Inferential statistics, in contrast, use that data to draw conclusions about the wider population it came from.

Que 2. What Is the Difference Between Quantitative Data and Qualitative Data?
Ans 2. Quantitative data is numerical data that can be measured or counted and expressed as a number or percentage. Qualitative data is non-numerical information that describes qualities or categories, such as subjective experiences or opinions about an event or topic.

Que 3. How Do You Calculate Range and Interquartile Range?
Ans 3. Range and interquartile range are two ways to calculate the spread of data. The range is the difference between the highest and lowest value in a set of data. The interquartile range is the difference between the 75th percentile and 25th percentile of a set of data.
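Here is a small Python sketch of both calculations. The sample values are made up, and percentile conventions differ between tools, so the quartiles from Excel or pandas may not match `statistics.quantiles` exactly on small datasets.

```python
import statistics

def value_range(values):
    """Range: highest value minus lowest value."""
    return max(values) - min(values)

def interquartile_range(values):
    """IQR: 75th percentile minus 25th percentile."""
    # method="inclusive" treats the data as the whole population of interest.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

scores = [2, 4, 4, 5, 7, 9, 10, 12, 15]  # hypothetical sample data

print(value_range(scores))          # 13
print(interquartile_range(scores))  # 6.0
```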

Que 4. Explain Pareto Principle
Ans 4. The Pareto Principle, also known as the 80-20 rule, states that roughly 20% of causes are responsible for about 80% of effects. It is a rule of thumb, not an exact law.
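To check whether a dataset roughly follows the 80-20 pattern, you can measure the share of the total that comes from the top 20% of causes. This is a minimal sketch; the function name `top_share` and the defect counts are hypothetical.

```python
def top_share(counts, top_fraction=0.2):
    """Share of the total contributed by the largest `top_fraction` of causes."""
    ordered = sorted(counts, reverse=True)
    top_n = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:top_n]) / sum(ordered)

# Hypothetical number of defects attributed to each of 10 causes.
defects_by_cause = [120, 40, 10, 8, 7, 5, 4, 3, 2, 1]

print(top_share(defects_by_cause))  # 0.8 -> the top 2 causes explain 80% of defects
```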

Que 5. What Are Left-Skewed Distribution and Right-Skewed Distribution?
Ans 5. Left-skewed distributions have a longer tail to the left (lower values), while right-skewed distributions have a longer tail to the right (higher values).

Que 6. What Is an Outlier, and How Can You Find One?
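Ans 6. An outlier is an observation that is distant from the other data points. It’s important to note that the term “outlier” doesn’t refer to the numerical value of a data point on its own but rather to the distance between it and the other values. Two common ways to find outliers are to flag points that lie more than 1.5 times the interquartile range below the 25th percentile or above the 75th percentile, or to flag points with an unusually large z-score.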
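One common rule of thumb for finding outliers is the 1.5 x IQR rule: anything more than 1.5 interquartile ranges outside the quartiles gets flagged. The sketch below assumes that rule; the multiplier `k` and the sample data are illustrative choices, not fixed standards.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

delivery_days = [3, 4, 4, 5, 5, 6, 6, 7, 30]  # hypothetical data

print(iqr_outliers(delivery_days))  # [30]
```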

Que 7. What Are Skewness and Kurtosis?
Ans 7. Skewness measures how asymmetric a distribution is, that is, whether one tail is longer than the other. With a symmetrical distribution, the mean and median coincide. If the data distribution isn’t symmetrical, it’s skewed.
There are two types of skewness:
Positive is when the right tail is longer. Most values are clustered on the left side, and the median is typically smaller than the mean.
Negative is when the left tail is longer. Most values are clustered on the right side, and the median is typically greater than the mean.

Kurtosis, on the other hand, reveals how heavy-tailed or light-tailed the data is compared to the normal distribution. There are three types of kurtosis:
Mesokurtic distributions approximate a normal distribution.
Leptokurtic distributions have a pointy shape and heavy tails, indicating a high probability of extreme events occurring.
Platykurtic distributions have a flat shape and light tails. They reveal a low probability of the occurrence of extreme events.
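Both measures can be computed from standardized moments. The sketch below uses the population formulas (libraries such as SciPy or pandas may apply bias corrections, so their numbers can differ slightly) and reports excess kurtosis, where a normal distribution scores about 0. The sample data is made up.

```python
import statistics

def skewness(values):
    """Population skewness: the third standardized moment."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (about 0 for a normal distribution)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 4 for v in values) / len(values) - 3

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]  # hypothetical data with a long right tail

print(skewness(right_skewed) > 0)                  # True: positive skew
print(statistics.fmean(right_skewed) >
      statistics.median(right_skewed))             # True: mean is pulled above the median
print(excess_kurtosis([1, 2, 3, 4, 5]) < 0)        # True: flat, light-tailed (platykurtic)
```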
Best practices for writing SQL queries:

Join for more: https://t.iss.one/learndataanalysis

1- Write SQL keywords in capital letters.

2- Use table aliases with columns when you are joining multiple tables.

3- Avoid SELECT *; always list the columns you need in the SELECT clause.

4- Add useful comments wherever you write complex logic. Avoid too many comments.

5- Use joins instead of subqueries when possible. Joins often perform better, although the difference depends on the database's query optimizer.

6- Create CTEs instead of multiple subqueries; it will make your query easier to read.

7- Join tables using JOIN keywords instead of writing the join condition in the WHERE clause, for better readability.

8- Avoid ORDER BY in subqueries; it will unnecessarily increase runtime.

9- If you know there are no duplicates in 2 tables, use UNION ALL instead of UNION for better performance.
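Here is a short sketch that puts several of these practices into one query. It runs against an in-memory SQLite database through Python's `sqlite3` module so you can try it anywhere; the `customers` and `orders` tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
""")

query = """
    -- CTE instead of a nested subquery: total spend per customer
    WITH customer_totals AS (
        SELECT o.customer_id, SUM(o.amount) AS total_amount
        FROM orders AS o
        GROUP BY o.customer_id
    )
    SELECT c.name, ct.total_amount          -- explicit columns, not SELECT *
    FROM customers AS c
    JOIN customer_totals AS ct              -- JOIN ... ON, not a WHERE-clause join
        ON ct.customer_id = c.customer_id
    ORDER BY ct.total_amount DESC;          -- ORDER BY only in the outer query
"""

rows = conn.execute(query).fetchall()
print(rows)  # [('Asha', 350.0), ('Ben', 75.0)]
```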

SQL Basics: https://t.iss.one/sqlanalyst/105
If you are a data analyst thinking of getting started with freelancing on Upwork, here's something you should know.

You should be ready to invest money if you want to get started with freelancing on Upwork.

Upwork has something called Connects. For simplicity, you can think of Connects as the currency of Upwork, which you spend when submitting a proposal for a freelancing task listed on the platform.

Previously, Upwork gave some free Connects to every new account, but these days it doesn't. So you have to buy Connects at the rate of 100 Connects for $15 + taxes (without upgrading to Upwork Plus), which comes to 1.3k + taxes in INR.

Let's say you submit proposals for jobs that ask for 20 Connects each. With 100 Connects, you can apply to at most 5 jobs, and whether you get a job depends on many other factors.
You may end up with no jobs even after spending 100 Connects, and then you have to repeat the cycle.

Everything looks shiny from the outside, but reality can be different.
Every platform requires investment in the form of time, dedication, money, or a combination of all three.
If you want to earn six figures working as a data analyst, learn these 6 important skills:

Excel - advanced Excel functions for data manipulation and interpretation.

Data Cleaning - mastering data preprocessing and cleaning techniques

Python/R - data analysis, preparation and manipulation

Statistical Analysis - understanding fundamental statistics for data analysis

Data Visualization - clear and effective visual representations of data

SQL - querying and managing databases efficiently
I'm sure you've had an idea, but something got in the way and you didn't develop it. The channel "Usual thing" is about exactly this: the author tries to implement different business ideas, runs into problems every day, and discusses them with you.
https://t.iss.one/usual_thing
1. What are the various types of refresh options provided in Power BI?

Package refresh - This synchronizes your Power BI Desktop or Excel file between the Power BI service and OneDrive, or SharePoint Online.
Model or data refresh - This refreshes the dataset within the Power BI service with data from the original data source.
Tile refresh - This updates the cache for tile visuals every 15 minutes on the dashboard once data changes.
Visual container refresh - This refreshes the visible container and updates the cached report visuals within a report once the data changes.

2. Explain some date manipulation functions in SQL.

These are SQL Server (T-SQL) functions:
GETDATE: Returns the current date and time.
DATEADD: Adds a time or date interval to a date.
DATEDIFF: Calculates the difference between two dates based on a given interval.
DATENAME: Extracts a part of a date as text, such as the month name.
YEAR, MONTH, DAY: Decompose a date into its year, month, and day parts.
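These functions need a SQL Server instance to run, so as a rough sketch, here are the closest Python `datetime` analogues of each one. The mapping in the comments is approximate (for example, DATEDIFF counts boundaries crossed rather than elapsed time), and the dates are made up.

```python
from datetime import date, datetime, timedelta

now = datetime.now()                          # ~ GETDATE(): current date and time

order_date = date(2024, 3, 15)                # hypothetical sample dates
ship_date = date(2024, 4, 2)

due_date = order_date + timedelta(days=30)    # ~ DATEADD(day, 30, order_date)
days_to_ship = (ship_date - order_date).days  # ~ DATEDIFF(day, order_date, ship_date)
month_name = order_date.strftime("%B")        # ~ DATENAME(month, order_date)
parts = (order_date.year,                     # ~ YEAR(order_date)
         order_date.month,                    # ~ MONTH(order_date)
         order_date.day)                      # ~ DAY(order_date)

print(due_date)      # 2024-04-14
print(days_to_ship)  # 18
print(parts)         # (2024, 3, 15)
```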


3. What is CTE in SQL?

A CTE (Common Table Expression) is a named, temporary result set that exists only for the duration of a single query. It can be referenced within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, CREATE VIEW, or MERGE statement. It is temporary because its result is not stored anywhere and is lost as soon as the query's execution is completed.
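A quick way to see the "exists only for one query" behaviour is to define a CTE and then try to use it again in a separate statement. This sketch uses SQLite through Python's `sqlite3` module; the `sales` table and the `region_totals` CTE name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES ('North', 100), ('North', 50), ('South', 80);
""")

# The CTE region_totals is defined and used inside one statement.
rows = conn.execute("""
    WITH region_totals AS (
        SELECT region, SUM(amount) AS total
        FROM sales
        GROUP BY region
    )
    SELECT region, total
    FROM region_totals
    WHERE total > 90
""").fetchall()
print(rows)  # [('North', 150.0)]

# In a new statement the CTE no longer exists.
try:
    conn.execute("SELECT region FROM region_totals")
    cte_persisted = True
except sqlite3.OperationalError:
    cte_persisted = False
print(cte_persisted)  # False
```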
TOP CONCEPTS FOR INTERVIEW PREPARATION!!

🚀TOP 10 SQL Concepts for Job Interview

1. Aggregate Functions (SUM/AVG)
2. Group By and Order By
3. JOINs (Inner/Left/Right)
4. Union and Union All
5. Date and Time processing
6. String processing
7. Window Functions (Partition by)
8. Subquery
9. View and Index
10. Common Table Expression (CTE)


🚀TOP 10 Statistics Concepts for Job Interview

1. Sampling
2. Experiments (A/B tests)
3. Descriptive Statistics
4. p-value
5. Probability Distributions
6. t-test
7. ANOVA
8. Correlation
9. Linear Regression
10. Logistic Regression


🚀TOP 10 Python Concepts for Job Interview

1. Reading data from file/table
2. Writing data to file/table
3. Data Types
4. Function
5. Data Preprocessing (numpy/pandas)
6. Data Visualisation (Matplotlib/seaborn/bokeh)
7. Machine Learning (sklearn)
8. Deep Learning (Tensorflow/Keras/PyTorch)
9. Distributed Processing (PySpark)
10. Functional and Object Oriented Programming

Like ❤️ the post if it was helpful to you!!!
Free Resources for NumPy, pandas, Matplotlib, and Seaborn:

Codebasics Numpy playlist: 
https://www.youtube.com/playlist?list=PLeo1K3hjS3uset9zIVzJWqplaWBiacTEU

Codebasics pandas playlist (first 9): 
https://www.youtube.com/playlist?list=PLeo1K3hjS3uuASpe-1LjfG5f14Bnozjwy

Freecodecamp matplotlib playlist: 
https://youtu.be/3Xc3CA655Y4

Seaborn tutorials: 
https://youtu.be/GcXcSZ0gQps

Pandas for beginners
https://t.iss.one/datasciencefun/660

Numpy for beginners
https://t.iss.one/datasciencefree/156
1. Define the term 'Data Wrangling'.

Data Wrangling is the process wherein raw data is cleaned, structured, and enriched into a desired usable format for better decision making. It involves discovering, structuring, cleaning, enriching, validating, and analyzing data. This process can turn and map out large amounts of data extracted from various sources into a more useful format.

2. What are the best methods for data cleaning?

Create a data cleaning plan by understanding where the common errors take place and keep all the communications open.
Before working with the data, identify and remove the duplicates. This will lead to an easy and effective data analysis process.
Focus on the accuracy of the data. Set cross-field validation, maintain the value types of data, and provide mandatory constraints.
Normalize the data at the entry point so that it is less chaotic. You will be able to ensure that all information is standardized, leading to fewer errors on entry.
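The checks described above can be sketched in plain Python. Everything here is illustrative: the field names (`email`, `birth_year`, `signup_year`), the rules, and the `clean_records` helper are assumptions, and in practice you would likely do the same with pandas or database constraints.

```python
def clean_records(records):
    """Normalize, validate, and de-duplicate a list of record dicts."""
    seen_emails = set()
    cleaned, rejected_reasons = [], []
    for rec in records:
        # Normalize at the entry point: trim whitespace and standardize case.
        email = str(rec.get("email") or "").strip().lower()
        birth, signup = rec.get("birth_year"), rec.get("signup_year")

        if not email:                                    # mandatory constraint
            rejected_reasons.append("missing email")
        elif not (isinstance(birth, int) and isinstance(signup, int)):
            rejected_reasons.append("wrong value type")  # value-type check
        elif signup < birth:                             # cross-field validation
            rejected_reasons.append("signup before birth")
        elif email not in seen_emails:                   # keep first copy, drop duplicates
            seen_emails.add(email)
            cleaned.append({"email": email, "birth_year": birth, "signup_year": signup})
    return cleaned, rejected_reasons

raw = [  # hypothetical raw input
    {"email": " Ana@Example.com ", "birth_year": 1990, "signup_year": 2021},
    {"email": "ana@example.com", "birth_year": 1990, "signup_year": 2021},   # duplicate
    {"email": "", "birth_year": 1985, "signup_year": 2020},                  # no email
    {"email": "bo@example.com", "birth_year": 2005, "signup_year": 1999},    # inconsistent
    {"email": "cy@example.com", "birth_year": "1992", "signup_year": 2022},  # year as text
]

cleaned, rejected_reasons = clean_records(raw)
print(cleaned)           # one clean record for ana@example.com
print(rejected_reasons)  # ['missing email', 'signup before birth', 'wrong value type']
```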


3. Explain Type I and Type II errors in statistics.

In hypothesis testing, a Type I error occurs when the null hypothesis is rejected even though it is true. It is also known as a false positive.

A Type II error occurs when the null hypothesis is not rejected even though it is false. It is also known as a false negative.
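A small simulation makes the two error types concrete. The sketch below assumes a two-sided z-test of "mean = 0" with a known standard deviation of 1; the sample size, effect size of 0.3, number of trials, and seed are arbitrary choices for illustration.

```python
import random
from statistics import NormalDist, fmean

def rejects_null(sample, alpha=0.05):
    """Two-sided z-test of H0: mean = 0, assuming a known standard deviation of 1."""
    z = fmean(sample) * len(sample) ** 0.5
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha

def rejection_rate(true_mean, trials=2000, n=30, seed=0):
    """Fraction of simulated samples in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = sum(
        rejects_null([rng.gauss(true_mean, 1) for _ in range(n)])
        for _ in range(trials)
    )
    return rejections / trials

# H0 is true (mean really is 0): every rejection is a Type I error (false positive).
type_1_rate = rejection_rate(true_mean=0.0)

# H0 is false (mean is 0.3): every non-rejection is a Type II error (false negative).
type_2_rate = 1 - rejection_rate(true_mean=0.3)

print(type_1_rate)  # close to alpha = 0.05
print(type_2_rate)  # well above 0.05 for this small effect and sample size
```

The Type I rate is controlled by the significance level you choose, while the Type II rate depends on the effect size and sample size.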

4. How do you make a dropdown list in MS Excel?

First, click on the Data tab that is present in the ribbon.
Under the Data Tools group, select Data Validation.
Then navigate to Settings > Allow > List.
Select the source you want to provide as a list array.

5. State some ways to improve the performance of Tableau.

Use an Extract to make workbooks run faster.
Reduce the scope of data to decrease the volume of data.
Reduce the number of marks on the view to avoid information overload.
Hide unused fields.
Use Context filters.
Use indexing in tables and use the same fields for filtering.
Remove unnecessary calculations and sheets.
1. What are a query and a query language?

A query is a request sent to a database to retrieve data or information. The required data can be retrieved from one table or from many tables in the database.

A query language is a language for writing such queries against a database. SQL, Datalog, and AQL are a few examples of query languages; SQL is the most widely used.



2. What are a super key and a candidate key?

A super key is a single attribute or a combination of attributes that uniquely identifies a record in a table. A super key can contain extra attributes that are not needed to identify the records.

A candidate key is a minimal super key: it uniquely identifies records in a table, and every one of its attributes is necessary to do so. Unlike a general super key, removing any attribute from a candidate key means it no longer identifies records uniquely.
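The difference can be sketched with a uniqueness check over sample rows. Keep in mind this is only an illustration: real keys are declared from the meaning of the data, not inferred from whichever rows happen to be in the table. The `employees` rows and helper names are hypothetical.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if the combined values of `attrs` are unique across all rows."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def is_candidate_key(rows, attrs):
    """A candidate key is a super key with no smaller subset that is also a super key."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

employees = [  # hypothetical sample rows
    {"emp_id": 1, "email": "a@x.com", "dept": "Sales"},
    {"emp_id": 2, "email": "b@x.com", "dept": "Sales"},
    {"emp_id": 3, "email": "c@x.com", "dept": "Ops"},
]

print(is_superkey(employees, ("emp_id", "dept")))       # True: unique, but dept is extra
print(is_candidate_key(employees, ("emp_id", "dept")))  # False: emp_id alone is enough
print(is_candidate_key(employees, ("emp_id",)))         # True: minimal
print(is_superkey(employees, ("dept",)))                # False: 'Sales' repeats
```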


3. What is a buffer pool, and what are its benefits?

A buffer pool in SQL Server is also known as a buffer cache. It is a shared area of memory in which all resources can store their cached data pages. The size of the buffer pool can be defined during the configuration of an instance of SQL Server.
The following are the benefits of a buffer pool:

Increase in I/O performance
Reduction in I/O latency
Increase in transaction throughput
Increase in reading performance


4. What is the difference between Zero and NULL values in SQL?

When a field in a column doesn’t have any value, it is said to have a NULL value. Simply put, NULL is a blank field in a table. It can be considered an unassigned, unknown, or unavailable value. In contrast, zero is a number, and it is an available, assigned, and known value.
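The practical difference shows up in aggregates and comparisons. This sketch uses SQLite through Python's `sqlite3` module with a hypothetical `payments` table; other databases follow the same NULL rules for these cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments (payment_id INTEGER PRIMARY KEY, discount REAL);
    INSERT INTO payments (payment_id, discount) VALUES (1, 0), (2, NULL), (3, 10);
""")

# Aggregates skip NULL but include zero.
count_all, count_discount, avg_discount = conn.execute(
    "SELECT COUNT(*), COUNT(discount), AVG(discount) FROM payments"
).fetchone()
print(count_all, count_discount, avg_discount)  # 3 2 5.0

# Zero can be compared with =, but NULL needs IS NULL.
zero_rows = conn.execute(
    "SELECT payment_id FROM payments WHERE discount = 0").fetchall()
null_eq_rows = conn.execute(
    "SELECT payment_id FROM payments WHERE discount = NULL").fetchall()
null_is_rows = conn.execute(
    "SELECT payment_id FROM payments WHERE discount IS NULL").fetchall()
print(zero_rows, null_eq_rows, null_is_rows)  # [(1,)] [] [(2,)]
```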
Data Analysis with Excel
👇👇
https://t.iss.one/excel_analyst/2

Power BI DAX Functions
👇👇
https://t.iss.one/PowerBI_analyst/2

All about SQL
👇👇
https://t.iss.one/sqlanalyst/29

Python for data analysis
👇👇
https://t.iss.one/pythonanalyst/26

Statistics Book and other useful resources
👇👇
https://t.iss.one/DataAnalystInterview/34
Hi Guys,

Here are some of the Telegram channels that may help you in your data analytics journey 👇👇

SQL: https://t.iss.one/sqlanalyst

Power BI & Tableau:
https://t.iss.one/PowerBI_analyst

Excel:
https://t.iss.one/excel_analyst

Python:
https://t.iss.one/dsabooks

Jobs:
https://t.iss.one/jobs_SQL

Data Science:
https://t.iss.one/datasciencefree

Artificial intelligence:
https://t.iss.one/machinelearning_deeplearning

Data Engineering:
https://t.iss.one/sql_engineer

Hope it helps :)
These are the top 5 skills (I think) you need as an entry-level data analyst:

1. Excel. It may not be fancy but it's still one of the most used tools in the business world. I can guarantee you will use it at some point.

2. SQL. You may not actually use SQL but it's worth learning. It's the language of databases and gives you a strong foundation for working with other data analysis tools.

3. A data viz tool. Look, I don't care if you learn Power BI, Tableau, or any other data viz tool. You need to be able to communicate insights in a way that makes sense to non-technical people.

4. Communication. This may actually be the most important skill. It doesn't matter if you can analyze data if you can't communicate why that analysis should matter.

5. Problem solving. You use data to answer business questions and...wait for it... solve problems. It's an absolutely essential skill to have.

The best part of this is that you very likely already have 2, if not 3, of these in a pretty good place.

Focus your efforts on the skills that will make a difference.