Data Analyst interview questions
1) Which joins are most commonly used in SQL?
2) What are the use cases of CROSS and SELF joins?
3) Write a query to exclude weekends from a table (see the sketch after this list).
4) What are window functions?
5) What is the difference between CTEs and subqueries?
6) How can you optimize SQL queries?
7) How can you convert data from rows into columns?
8) If 10 different KPIs are calculated from different tables on a daily basis, how would you compile them into a single report?
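Questions 3-5 lend themselves to a small runnable sketch (referenced in the list above). This is a minimal illustration rather than a canonical answer: it uses Python's built-in sqlite3 module, the orders table and its sample dates are invented, and the date logic is SQLite-specific (strftime('%w') returns '0' for Sunday and '6' for Saturday). Window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL);
    INSERT INTO orders (order_date, amount) VALUES
        ('2024-06-07', 100), ('2024-06-08', 250),   -- Friday, Saturday
        ('2024-06-09', 300), ('2024-06-10', 150);   -- Sunday, Monday
""")

query = """
WITH weekday_orders AS (                                    -- CTE (instead of a subquery)
    SELECT order_date, amount
    FROM orders
    WHERE strftime('%w', order_date) NOT IN ('0', '6')      -- exclude Sunday and Saturday
)
SELECT order_date,
       amount,
       SUM(amount) OVER (ORDER BY order_date) AS running_total   -- window function
FROM weekday_orders;
"""
for row in conn.execute(query):
    print(row)   # only the Friday and Monday rows survive
```

In other dialects the same filter uses that engine's weekday function (e.g., DATEPART in SQL Server or EXTRACT(DOW FROM ...) in PostgreSQL).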
I have curated 80+ top-notch Data Analytics resources:
https://whatsapp.com/channel/0029VaGgzAk72WTmQFERKh02
Hope it helps :)
Essential Topics to Master Data Science Interviews:
SQL:
1. Foundations
- Craft SELECT statements with WHERE, ORDER BY, GROUP BY, HAVING
- Embrace Basic JOINS (INNER, LEFT, RIGHT, FULL)
- Navigate through simple databases and tables
2. Intermediate SQL
- Utilize Aggregate functions (COUNT, SUM, AVG, MAX, MIN)
- Embrace Subqueries and nested queries
- Master Common Table Expressions (WITH clause)
- Implement CASE statements for logical queries
3. Advanced SQL
- Explore Advanced JOIN techniques (self-join, non-equi join)
- Dive into Window functions (OVER, PARTITION BY, ROW_NUMBER, RANK, DENSE_RANK, LEAD, LAG)
- Optimize queries with indexing
- Execute Data manipulation (INSERT, UPDATE, DELETE)
Python:
1. Python Basics
- Grasp Syntax, variables, and data types
- Command Control structures (if-else, for and while loops)
- Understand Basic data structures (lists, dictionaries, sets, tuples)
- Master Functions, lambda functions, and error handling (try-except)
- Explore Modules and packages
2. Pandas & Numpy
- Create and manipulate DataFrames and Series
- Perfect Indexing, selecting, and filtering data
- Handle missing data (fillna, dropna)
- Aggregate data with groupby, summarizing data
- Merge, join, and concatenate datasets
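A minimal pandas sketch of the points above (missing data, groupby aggregation, merging); the sales/targets data is invented purely for illustration.

```python
import numpy as np
import pandas as pd

sales = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "revenue": [100.0, np.nan, 250.0, 300.0],
})
targets = pd.DataFrame({"region": ["North", "South"], "target": [400, 250]})

sales["revenue"] = sales["revenue"].fillna(0)                           # handle missing data
by_region = sales.groupby("region", as_index=False)["revenue"].sum()    # aggregate with groupby
report = by_region.merge(targets, on="region", how="left")              # merge/join datasets
report["hit_target"] = report["revenue"] >= report["target"]
print(report)
```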
3. Data Visualization with Python
- Plot with Matplotlib (line plots, bar plots, histograms)
- Visualize with Seaborn (scatter plots, box plots, pair plots; see the sketch after this outline)
- Customize plots (sizes, labels, legends, color palettes)
- Introduction to interactive visualizations (e.g., Plotly)
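And a bare-bones Matplotlib/Seaborn sketch for the visualization topics; the numbers are made up, and Plotly would follow the same idea with interactive output.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "revenue": [120, 150, 90, 180],
    "orders": [30, 42, 25, 50],
})

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].bar(df["month"], df["revenue"])                        # Matplotlib bar plot
axes[0].set_title("Revenue by month")
sns.scatterplot(data=df, x="orders", y="revenue", ax=axes[1])  # Seaborn scatter plot
axes[1].set_title("Orders vs revenue")
fig.tight_layout()
plt.show()
```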
Excel:
1. Excel Essentials
- Conduct Cell operations, basic formulas (SUMIFS, COUNTIFS, AVERAGEIFS, IF, AND, OR, NOT & Nested Functions etc.)
- Dive into charts and basic data visualization
- Sort and filter data, use Conditional formatting
2. Intermediate Excel
- Master Advanced formulas (V/XLOOKUP, INDEX-MATCH, nested IF)
- Leverage PivotTables and PivotCharts for summarizing data
- Utilize data validation tools
- Employ What-if analysis tools (Data Tables, Goal Seek)
3. Advanced Excel
- Harness Array formulas and advanced functions
- Dive into Data Model & Power Pivot
- Explore Advanced Filter, Slicers, and Timelines in Pivot Tables
- Create dynamic charts and interactive dashboards
Power BI:
1. Data Modeling in Power BI
- Import data from various sources
- Establish and manage relationships between datasets
- Grasp Data modeling basics (star schema, snowflake schema)
2. Data Transformation in Power BI
- Use Power Query for data cleaning and transformation
- Apply advanced data shaping techniques
- Create Calculated columns and measures using DAX
3. Data Visualization and Reporting in Power BI
- Craft interactive reports and dashboards
- Utilize Visualizations (bar, line, pie charts, maps)
- Publish and share reports, schedule data refreshes
Statistics Fundamentals:
- Mean, Median, Mode
- Standard Deviation, Variance
- Probability Distributions, Hypothesis Testing
- P-values, Confidence Intervals
- Correlation, Simple Linear Regression (see the sketch after this list)
- Normal Distribution, Binomial Distribution, Poisson Distribution.
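A quick NumPy/SciPy sketch of several of these fundamentals (descriptive statistics, correlation with its p-value, and a simple linear fit); the data points are invented for illustration.

```python
import numpy as np
from scipy import stats

x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])

print("mean:", y.mean(), "median:", np.median(y), "sample std:", y.std(ddof=1))

r, p_value = stats.pearsonr(x, y)               # correlation and its p-value
print("Pearson r:", round(r, 3), "p-value:", round(p_value, 4))

fit = stats.linregress(x, y)                    # simple linear regression
print(f"y = {fit.slope:.2f} * x + {fit.intercept:.2f}")
```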
Show some ❤️ if you're ready to elevate your data science journey!
ENJOY LEARNING!
Data Analyst interview questions
Excel:
1. Explain the difference between the "COUNT", "COUNTA", "COUNTIF", and "COUNTIFS" functions in Excel. When would you use each of these functions? Provide examples. (A pandas analogue appears after this list.)
2. How do you create a pivot chart in Excel, and what are some advantages of using pivot charts for data visualization?
3. Describe the purpose and usage of Excel's "Solver" tool. Can you provide an example of a problem you could solve using the Solver tool?
4. How would you use Excel's "Data Validation" feature to ensure data integrity in a spreadsheet? Provide examples of different types of data validation rules you might implement.
5. What are Excel tables, and how do they differ from regular data ranges? What advantages do tables offer in terms of data management and analysis?
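These Excel questions are best answered in Excel itself, but for question 1 a pandas analogue (clearly not Excel, just an illustration on invented data) makes the distinctions concrete: COUNT counts numbers, COUNTA counts any non-blank value, COUNTIF/COUNTIFS count by one or several conditions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "region": ["North", "South", "North", None],
    "sales": [100, np.nan, 250, 300],
})

print(df["sales"].count())                       # roughly COUNT: non-blank numeric cells -> 3
print(df["region"].notna().sum())                # roughly COUNTA: any non-blank value -> 3
print((df["sales"] > 150).sum())                 # roughly COUNTIF: one condition -> 2
print(((df["sales"] > 150) & (df["region"] == "North")).sum())   # roughly COUNTIFS -> 1
```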
SQL:
1. Discuss the concept of data aggregation in SQL. How do you use aggregate functions such as SUM, AVG, MIN, and MAX to summarize data in a query?
2. Explain the difference between a primary key and a foreign key in SQL. Why are these constraints important in database design?
3. How do you handle duplicates in a SQL query result? Can you demonstrate how to remove duplicates using the DISTINCT keyword or other techniques?
4. Describe the purpose and benefits of using stored procedures in SQL databases. Provide an example of a scenario where you would use a stored procedure.
5. What is SQL injection, and how can you prevent it in your SQL queries or applications? Discuss best practices for writing secure SQL code.
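For question 5, parameterized queries are the standard defence against SQL injection. A minimal sketch using Python's sqlite3 (the users table and the input string are invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

user_input = "alice' OR '1'='1"    # a typical injection attempt

# Unsafe: string concatenation lets the input rewrite the query
unsafe = "SELECT * FROM users WHERE name = '" + user_input + "'"
print(conn.execute(unsafe).fetchall())           # returns every row

# Safe: a placeholder keeps the input as data, never as SQL
print(conn.execute("SELECT * FROM users WHERE name = ?", (user_input,)).fetchall())  # []
```

The same idea applies with any driver or ORM: never build SQL by pasting user input into strings.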
Power BI:
1. How does Power BI handle data refresh and scheduling for reports and dashboards? What options are available for configuring data refresh settings?
2. Describe the concept of row-level security in Power BI. How can you implement row-level security to restrict access to specific data based on user roles or permissions?
3. What is the Power Query Editor in Power BI, and how do you use it to transform and clean data imported from different sources?
4. Discuss the benefits of using Power BI's Direct Query mode versus Import mode for connecting to data sources. When would you choose one mode over the other?
5. How do you share reports and dashboards with other users in Power BI? What options are available for distributing and collaborating on Power BI content within an organization?
I have curated 80+ top-notch Data Analytics resources:
https://whatsapp.com/channel/0029VaGgzAk72WTmQFERKh02
Like if it helps :)
Data Analytics Interview Preparation Part-2
[Questions with Answers]
How did you get your job?
I was hired after an internship.
To get the internship, I prepared a bunch for general Python questions (LeetCode etc.) and studied the basics of machine learning (several different algorithms, how they work, when they're useful, metrics to measure their performance, how to train them in practice, etc.).
To get the internship I had to pass a technical interview as well as a take-home machine learning (ML) exercise. Then, it was just a question of doing a good job in the internship!
What are your data related responsibilities in your job?
I work on our recommendation system. It's deep learning based. I work on a lot of features to try and improve it (reinforcement learning, NLP, etc.). Since I'm in a start-up, it's also up to our team to put the models we design into production. So, after a phase of research & development and model design in notebooks, it's time to create a real pipeline by writing scripts.
This enables us to define, train, replace, compare and check the status of the models in production. It's basically all in Python, using Keras/TensorFlow, Pandas, Scikit-learn and NumPy. We also do a lot of analysis for the business team to help them compute metrics of interest (related to revenue, acquisition, etc.). For that, we use an external tool called Metabase, which is hooked up to our database; there we write SQL queries, visualize the results and create dashboards (alongside tools like Tableau/Looker).
I would say my role is quite "full-stack", since we are all involved from the R&D phase to deployment on our cluster.
Was it difficult to get this role?
I got hired after an internship. If you come from a scientific background, it's not that hard to transition into data science. All the math is something you will probably have seen already (especially if you're doing maths or physics). So, with some preparation and coding practice, you can start applying to internships.
It took me maybe a month or two of preparation to get some basic ideas of the typical Python data stack (Pandas, Keras, scikit-learn, etc.) before I started to send out CVs. Then, if you get an internship, do the best you can and maybe you'll be hired afterwards!
I have curated 80+ top-notch Data Analytics resources:
https://whatsapp.com/channel/0029VaGgzAk72WTmQFERKh02
Hope it helps :)
Statistics Interview Questions
Topics to Cover:
• Descriptive statistics
• Probability
• Hypothesis testing
• Regression analysis
Questions and Answers:
1 Q: What is the difference between descriptive and inferential statistics?
A: Descriptive statistics summarize the main features of a dataset (e.g., mean, median, mode), while inferential statistics use samples to make inferences about a larger population.
2 Q: Define p-value in hypothesis testing.
A: The p-value is the probability of obtaining test results at least as extreme as the observed results, assuming the null hypothesis is true. A low p-value (< 0.05) indicates strong evidence against the null hypothesis.
3 Q: What is the central limit theorem?
A: The central limit theorem states that the distribution of the sample mean approximates a normal distribution as the sample size becomes large, regardless of the population's distribution.
4 Q: Explain the concept of correlation.
A: Correlation measures the strength and direction of the relationship between two variables. It ranges from -1 (perfect negative) to +1 (perfect positive), with 0 indicating no correlation.
5 Q: What is linear regression?
A: Linear regression is a statistical method for modeling the relationship between a dependent variable and one or more independent variables by fitting a linear equation to observed data.
I have curated 80+ top-notch Data Analytics resources:
https://whatsapp.com/channel/0029VaGgzAk72WTmQFERKh02
Like if it helps :)
Here are some interview questions for both freshers and experienced candidates applying for a Data/SQL Analyst role:
For Freshers:
1. What is SQL, and why is it important in data analysis?
2. Explain the difference between a database and a table.
3. What are the basic SQL commands for data retrieval?
4. How do you retrieve all records from a table named "Employees"?
5. What is a primary key, and why is it important in a database?
6. What is a foreign key, and how is it used in SQL?
7. Describe the difference between SQL JOIN and SQL UNION.
8. How do you write a SQL query to find the second-highest salary in a table? (See the sketch after this list.)
9. What is the purpose of the GROUP BY clause in SQL?
10. Can you explain the concept of normalization in SQL databases?
11. What are the common aggregate functions in SQL, and how are they used?
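For question 8 above, here is a sketch of two common answers, run through Python's sqlite3 so it is self-contained; the employees table and salaries are invented, and the window-function variant needs SQLite 3.25+.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, salary INTEGER);
    INSERT INTO employees VALUES ('a', 90000), ('b', 120000), ('c', 120000), ('d', 75000);
""")

# Classic answer: the highest salary strictly below the maximum
q1 = "SELECT MAX(salary) FROM employees WHERE salary < (SELECT MAX(salary) FROM employees)"
print(conn.execute(q1).fetchone())   # (90000,)

# Window-function answer: DENSE_RANK handles ties cleanly
q2 = """
SELECT DISTINCT salary
FROM (SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk FROM employees) AS t
WHERE rnk = 2
"""
print(conn.execute(q2).fetchone())   # (90000,)
```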
For Experienced Candidates:
1. Describe a scenario where you had to optimize a slow-running SQL query. How did you approach it?
2. Explain the differences between SQL Server, MySQL, and Oracle databases.
3. Can you describe the process of creating an index in a SQL database and its impact on query performance?
4. How do you handle data quality issues when performing data analysis with SQL?
5. What is a subquery, and when would you use it in SQL? Give an example of a complex SQL query you've written to extract specific insights from a database.
6. How do you handle NULL values in SQL, and what are the challenges associated with them?
7. Explain the ACID properties of a database and their importance.
8. What are stored procedures and triggers in SQL, and when would you use them?
9. Describe your experience with ETL (Extract, Transform, Load) processes using SQL.
10. Can you explain the concept of query optimization in SQL, and what techniques have you used for optimization?
Enjoy Learning!
Top 10 Excel Interview Questions with Answers
Free Resources to learn Excel: https://t.iss.one/excel_analyst
1. Question: What is the difference between CONCATENATE and "&" in Excel?
Answer: CONCATENATE and "&" both combine text, but "&" is more concise. For example, =A1&B1 achieves the same result as =CONCATENATE(A1, B1).
2. Question: How can you freeze rows and columns simultaneously in Excel?
Answer: Use the "Freeze Panes" option under the "View" tab. Select the cell below and to the right of the rows and columns you want to freeze, and then click on "Freeze Panes."
3. Question: Explain the VLOOKUP function and when would you use it?
Answer: VLOOKUP searches for a value in the first column of a range and returns a corresponding value in the same row from another column. It's useful for looking up information in a table based on a specific criteria.
4. Question: What is the purpose of the IFERROR function?
Answer: IFERROR is used to handle errors in Excel formulas. It returns a specified value if a formula results in an error, and the actual result if there's no error.
5. Question: How do you create a PivotTable, and what is its purpose?
Answer: To create a PivotTable, select your data, go to the "Insert" tab, and choose "PivotTable." It summarizes and analyzes data in a spreadsheet, allowing you to make sense of large datasets.
6. Question: Explain the difference between relative and absolute cell references.
Answer: Relative references change when you copy a formula to another cell, while absolute references stay fixed. Use a $ symbol to make a reference absolute (e.g., $A$1).
7. Question: What is the purpose of the INDEX and MATCH functions?
Answer: INDEX returns a value in a specified range based on the row and column number, while MATCH searches for a value in a range and returns its relative position. Combined, they provide a flexible way to look up data.
8. Question: How can you find and remove duplicate values in Excel?
Answer: Use the "Remove Duplicates" feature under the "Data" tab. Select the range containing duplicates, go to "Data" -> "Remove Duplicates," and choose the columns to check for duplicates.
9. Question: Explain the difference between a workbook and a worksheet.
Answer: A workbook is the entire Excel file, while a worksheet is a single sheet within that file. Workbooks can contain multiple worksheets.
10. Question: What is the purpose of the COUNTIF function?
Answer: COUNTIF counts the number of cells within a range that meet a specified condition. For example, =COUNTIF(A1:A10, ">50") counts the cells in A1 to A10 that are greater than 50.
Share with credits: https://t.iss.one/sqlspecialist
Hope it helps :)
1. What do you understand by the term silhouette coefficient?
The silhouette coefficient measures how well a data point fits its assigned cluster: how similar it is to the points in its own cluster versus how dissimilar it is to the points in other clusters. It ranges from -1 to 1, where values near 1 are the best possible score and values near -1 are the worst.
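A minimal scikit-learn sketch (toy blobs, not real data) showing how the score is computed in practice:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)   # three well-separated blobs
labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)
print(round(silhouette_score(X, labels), 3))                   # close to 1 => well clustered
```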
2. What is the difference between trend and seasonality in time series?
Trends and seasonality are two characteristics of time series metrics that break many models. Trends are continuous increases or decreases in a metric's value. Seasonality, on the other hand, reflects periodic (cyclical) patterns that occur in a system, usually rising above a baseline and then decreasing again.
3. What is Bag of Words in NLP?
Bag of Words is a commonly used model that depends on word frequencies or occurrences to train a classifier. This model creates an occurrence matrix for documents or sentences irrespective of its grammatical structure or word order.
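A minimal sketch with scikit-learn's CountVectorizer (two toy sentences; get_feature_names_out needs a reasonably recent scikit-learn):

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the cat sat on the mat", "the dog sat on the log"]
vectorizer = CountVectorizer()
matrix = vectorizer.fit_transform(docs)        # occurrence matrix, word order ignored
print(vectorizer.get_feature_names_out())      # the learned vocabulary
print(matrix.toarray())                        # word counts per document
```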
4. What is the difference between bagging and boosting?
Bagging is a homogeneous weak learners' model in which the learners are trained independently, in parallel, and then combined (for example by averaging) to form the final prediction. Boosting also uses homogeneous weak learners but works differently: the learners are trained sequentially and adaptively, each one focusing on correcting the errors of the previous ones to improve the model's predictions.
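A small scikit-learn sketch contrasting the two on a synthetic dataset (bagging trains trees independently and averages them; boosting trains them sequentially, each correcting the previous):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

bagging = BaggingClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)            # parallel learners
boosting = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)  # sequential learners

print("bagging accuracy :", accuracy_score(y_te, bagging.predict(X_te)))
print("boosting accuracy:", accuracy_score(y_te, boosting.predict(X_te)))
```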
5. What do you understand by the F1 score?
The F1 score measures a model's performance. It is the harmonic mean of the model's precision and recall: scores close to 1 are the best and scores close to 0 are the worst. It is useful in classification tasks where true negatives don't matter much.
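A quick check with scikit-learn on a tiny made-up prediction set, confirming that F1 is the harmonic mean of precision and recall:

```python
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

p = precision_score(y_true, y_pred)
r = recall_score(y_true, y_pred)
print(f1_score(y_true, y_pred), 2 * p * r / (p + r))   # both print 0.75
```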
6. How do you create an ATS-friendly resume?
https://www.linkedin.com/posts/sql-analysts_resume-templates-activity-7137312110321057792-zxPh
Share for more: https://t.iss.one/datasciencefun
ENJOY LEARNING!
Prepare for GATE: The Right Time is NOW!
GeeksforGeeks brings you everything you need to crack GATE 2026: 900+ live hours, 300+ recorded sessions, and expert mentorship to keep you on track.
What's inside?
- Live & recorded classes with India's top educators
- 200+ mock tests to track your progress
- Study materials: PYQs, workbooks, formula book & more
- 1:1 mentorship & AI doubt resolution for instant support
- Interview prep for IITs & PSUs to help you land opportunities
Learn from Experts Like:
Satish Kumar Yadav - Trained 20K+ students
Dr. Khaleel - Ph.D. in CS, 29+ years of experience
Chandan Jha - Ex-ISRO, AIR 23 in GATE
Vijay Kumar Agarwal - M.Tech (NIT), 13+ years of experience
Sakshi Singhal - IIT Roorkee, AIR 56 CSIR-NET
Shailendra Singh - GATE 99.24 percentile
Devasane Mallesham - IIT Bombay, 13+ years of experience
Use code UPSKILL30 to get an extra 30% OFF (limited time only)
Enroll for a free counseling session now: https://gfgcdn.com/tu/UI2/
Most important topics to review before any Excel interview for a Data/Business Analyst role:
Data Handling: Cell formatting, rows/columns, basic functions (SUM, AVERAGE, COUNT etc).
Data Management Mastery: Sorting, filtering, data validation, diverse cell references.
Function Proficiency: Explore SUMIF, (V & X)LOOKUP, INDEX, MATCH, IF, and advanced function nesting.
Advanced Analytics: Master PivotTables for dynamic data analysis and various chart creation.
Advanced Analysis Techniques: Conditional formatting, goal-seeking, in-depth what-if analysis.
Advanced Functions: COUNTIF/IFS, SUMIFS, AVERAGEIF/IFS, CONCATENATE, date/time functions.
These are the most important ones, which I have tried to summarise in the best possible way; please let me know in the comments if I have missed something important.
Q1: How would you analyze data to understand user connection patterns on a professional network?
Ans: I'd use graph databases like Neo4j for social network analysis. By analyzing connection patterns, I can identify influencers or isolated communities.
Q2: Describe a challenging data visualization you created to represent user engagement metrics.
Ans: I visualized multi-dimensional data showing user engagement across features, regions, and time using tools like D3.js, creating an interactive dashboard with drill-down capabilities.
Q3: How would you identify and target passive job seekers on LinkedIn?
Ans: I'd analyze user behavior patterns, like increased profile updates, frequent visits to job postings, or engagement with career-related content, to identify potential passive job seekers.
Q4: How do you measure the effectiveness of a new feature launched on LinkedIn?
Ans: I'd set up A/B tests, comparing user engagement metrics between those who have access to the new feature and a control group. I'd then analyze metrics like time spent, feature usage frequency, and overall platform engagement to measure effectiveness.
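A minimal sketch of that comparison as a two-proportion z-test; statsmodels is an assumption here, and the engagement numbers are invented:

```python
from statsmodels.stats.proportion import proportions_ztest

engaged = [460, 420]     # users who engaged: new feature vs control
exposed = [5000, 5000]   # users in each group

z_stat, p_value = proportions_ztest(engaged, exposed)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")   # a small p-value suggests a real difference
```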
Join WhatsApp channels for more free resources: https://whatsapp.com/channel/0029VaGgzAk72WTmQFERKh02
Important Excel, Tableau, Statistics, and SQL-related questions with answers
1. What are the common problems that data analysts encounter during analysis?
The common problems encountered in any analytics project are:
- Handling duplicate data
- Collecting the right, meaningful data at the right time
- Handling data purging and storage problems
- Making data secure and dealing with compliance issues
2. Explain the Type I and Type II errors in Statistics?
In Hypothesis testing, a Type I error occurs when the null hypothesis is rejected even if it is true. It is also known as a false positive.
A Type II error occurs when the null hypothesis is not rejected, even if it is false. It is also known as a false negative.
3. How do you make a dropdown list in MS Excel?
First, click on the Data tab that is present in the ribbon.
Under the Data Tools group, select Data Validation.
Then navigate to Settings > Allow > List.
Select the source you want to provide as a list array.
4. How do you subset or filter data in SQL?
To subset or filter data in SQL, we use the WHERE and HAVING clauses, which let us include only the data matching certain conditions: WHERE filters individual rows before aggregation, while HAVING filters groups after aggregation.
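A minimal sketch of the WHERE vs HAVING distinction, run through Python's sqlite3 so it is self-contained (the sales table is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES ('North', 100), ('North', 300), ('South', 50), ('South', 80);
""")

query = """
SELECT region, SUM(amount) AS total
FROM sales
WHERE amount > 60            -- WHERE filters rows before aggregation
GROUP BY region
HAVING SUM(amount) > 200     -- HAVING filters groups after aggregation
"""
print(conn.execute(query).fetchall())   # [('North', 400.0)]
```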
5. What is a Gantt Chart in Tableau?
A Gantt chart in Tableau depicts the progress of value over the period, i.e., it shows the duration of events. It consists of bars along with the time axis. The Gantt chart is mostly used as a project management tool where each bar is a measure of a task in the project