Best Practices for writing SQL QUERIES ๐๐
https://www.linkedin.com/posts/sql-analysts_best-practices-for-writing-sql-queries-activity-7149138743667449856-rf53?utm_source=share&utm_medium=member_android
https://www.linkedin.com/posts/sql-analysts_best-practices-for-writing-sql-queries-activity-7149138743667449856-rf53?utm_source=share&utm_medium=member_android
๐12โค2
Numerical Methods with Python - 2023.pdf
21 MB
Numerical Methods with Python
William Miles, 2023
William Miles, 2023
๐10โค9๐1
If you have ever given an SQL interview some of the questions would be definitely from below list :
1- How to find duplicates in a table
2- How to delete duplicates from a table
3- Difference between union and union all
4- Difference between rank,row_number and dense_rank
5- Find records in a table which are not present in another table
6- Find second highest salary employees in each department
7- Find employees with salary more than their manager's salary
8- Difference between inner and left join
9- update a table and swap gender values.
If not exact at least flavor of these questions are always asked in interviews irrespective of your experience level
1- How to find duplicates in a table
2- How to delete duplicates from a table
3- Difference between union and union all
4- Difference between rank,row_number and dense_rank
5- Find records in a table which are not present in another table
6- Find second highest salary employees in each department
7- Find employees with salary more than their manager's salary
8- Difference between inner and left join
9- update a table and swap gender values.
If not exact at least flavor of these questions are always asked in interviews irrespective of your experience level
๐36โค8๐ค4
๐๐จ๐ฐ ๐ญ๐จ ๐ฉ๐ซ๐๐๐ญ๐ข๐๐ ๐๐๐ญ๐ ๐ฏ๐๐ฅ๐ข๐๐๐ญ๐ข๐จ๐ง ๐๐ฌ ๐๐ง ๐๐ฌ๐ฉ๐ข๐ซ๐ข๐ง๐ ๐๐๐ญ๐ ๐๐ง๐๐ฅ๐ฒ๐ฌ๐ญ?
Here's a step-by-step guide for the same:
Step 1๏ธโฃ - Download a practice dataset. I'd recommend the Codebasics resume project challenge dataset (as it contains multi-table datasets).
Step 2๏ธโฃ - Open your preferred RDBMS tool (SQL server/MySQL). Create a local database to load the dataset.
Step 3๏ธโฃ - Import the practice dataset (.xlsx/.csv) into this database by creating the tables (please google if you need help).
Step 4๏ธโฃ - Now open Power BI desktop and connect to the local database using the appropriate connector.
Step 5๏ธโฃ - Build the dashboard using the questions shared in the resume project challenge.
Step 6๏ธโฃ - Now, you can validate the output of your dashboard by writing SQL queries.
Step 7๏ธโฃ - Try to write an SQL query for a question asked in the challenge. You need to convert a natural language question into an SQL query.
Step 8๏ธโฃ - Compare the query output with the dashboard output and check if the numbers are matching. If they aren't matching, either the query is wrong or the dashboard numbers are wrong. Hence, try to identify the gap.
Step 9๏ธโฃ - Repeat the process for every question asked in the challenge.
Thus, you will learn and practice both SQL and Power BI simultaneously.
๐๐ก๐ฒ ๐ฌ๐ก๐จ๐ฎ๐ฅ๐ ๐ฒ๐จ๐ฎ ๐ญ๐ซ๐ฒ ๐ญ๐ก๐ข๐ฌ ๐ฆ๐๐ญ๐ก๐จ๐?
In real-world scenarios, ๐๐๐ญ๐ ๐ฏ๐๐ฅ๐ข๐๐๐ญ๐ข๐จ๐ง is a very important step in every analytics project. One needs to compare the output of the report/dashboard with the data source and then launch it for usage, to avoid discrepancies.
This will help you weed out any mistakes that you have applied in your report/dashboard logic.
Best Telegram Channel for Data Analysts: https://t.iss.one/sqlspecialist
Here's a step-by-step guide for the same:
Step 1๏ธโฃ - Download a practice dataset. I'd recommend the Codebasics resume project challenge dataset (as it contains multi-table datasets).
Step 2๏ธโฃ - Open your preferred RDBMS tool (SQL server/MySQL). Create a local database to load the dataset.
Step 3๏ธโฃ - Import the practice dataset (.xlsx/.csv) into this database by creating the tables (please google if you need help).
Step 4๏ธโฃ - Now open Power BI desktop and connect to the local database using the appropriate connector.
Step 5๏ธโฃ - Build the dashboard using the questions shared in the resume project challenge.
Step 6๏ธโฃ - Now, you can validate the output of your dashboard by writing SQL queries.
Step 7๏ธโฃ - Try to write an SQL query for a question asked in the challenge. You need to convert a natural language question into an SQL query.
Step 8๏ธโฃ - Compare the query output with the dashboard output and check if the numbers are matching. If they aren't matching, either the query is wrong or the dashboard numbers are wrong. Hence, try to identify the gap.
Step 9๏ธโฃ - Repeat the process for every question asked in the challenge.
Thus, you will learn and practice both SQL and Power BI simultaneously.
๐๐ก๐ฒ ๐ฌ๐ก๐จ๐ฎ๐ฅ๐ ๐ฒ๐จ๐ฎ ๐ญ๐ซ๐ฒ ๐ญ๐ก๐ข๐ฌ ๐ฆ๐๐ญ๐ก๐จ๐?
In real-world scenarios, ๐๐๐ญ๐ ๐ฏ๐๐ฅ๐ข๐๐๐ญ๐ข๐จ๐ง is a very important step in every analytics project. One needs to compare the output of the report/dashboard with the data source and then launch it for usage, to avoid discrepancies.
This will help you weed out any mistakes that you have applied in your report/dashboard logic.
Best Telegram Channel for Data Analysts: https://t.iss.one/sqlspecialist
๐29โค15
Data Analyst Roadmap for 2024
Comment your favourite tool
๐๐
https://www.instagram.com/p/C2kOgdeIV-L/?igsh=MWlnaXpydXh2YnFyOQ==
Comment your favourite tool
๐๐
https://www.instagram.com/p/C2kOgdeIV-L/?igsh=MWlnaXpydXh2YnFyOQ==
๐1
Free access to data analytics Resources
Comment "interested" to get the link
๐๐
https://www.instagram.com/reel/C2mzUr7Pyof/?igsh=YWo5YTN2MWpqMG41
Comment "interested" to get the link
๐๐
https://www.instagram.com/reel/C2mzUr7Pyof/?igsh=YWo5YTN2MWpqMG41
๐5
Data Analyst Interview Questions ๐
1.How to create filters in Power BI?
Filters are an integral part of Power BI reports. They are used to slice and dice the data as per the dimensions we want. Filters are created in a couple of ways.
Using Slicers: A slicer is a visual under Visualization Pane. This can be added to the design view to filter our reports. When a slicer is added to the design view, it requires a field to be added to it. For example- Slicer can be added for Country fields. Then the data can be filtered based on countries.
Using Filter Pane: The Power BI team has added a filter pane to the reports, which is a single space where we can add different fields as filters. And these fields can be added depending on whether you want to filter only one visual(Visual level filter), or all the visuals in the report page(Page level filters), or applicable to all the pages of the report(report level filters)
2.How to sort data in Power BI?
Sorting is available in multiple formats. In the data view, a common sorting option of alphabetical order is there. Apart from that, we have the option of Sort by column, where one can sort a column based on another column. The sorting option is available in visuals as well. Sort by ascending and descending option by the fields and measure present in the visual is also available.
3.How to convert pdf to excel?
Open the PDF document you want to convert in XLSX format in Acrobat DC.
Go to the right pane and click on the โExport PDFโ option.
Choose spreadsheet as the Export format.
Select โMicrosoft Excel Workbook.โ
Now click โExport.โ
Download the converted file or share it.
4. How to enable macros in excel?
Click the file tab and then click โOptions.โ
A dialog box will appear. In the โExcel Optionsโ dialog box, click on the โTrust Centerโ and then โTrust Center Settings.โ
Go to the โMacro Settingsโ and select โenable all macros.โ
Click OK to apply the macro settings.
1.How to create filters in Power BI?
Filters are an integral part of Power BI reports. They are used to slice and dice the data as per the dimensions we want. Filters are created in a couple of ways.
Using Slicers: A slicer is a visual under Visualization Pane. This can be added to the design view to filter our reports. When a slicer is added to the design view, it requires a field to be added to it. For example- Slicer can be added for Country fields. Then the data can be filtered based on countries.
Using Filter Pane: The Power BI team has added a filter pane to the reports, which is a single space where we can add different fields as filters. And these fields can be added depending on whether you want to filter only one visual(Visual level filter), or all the visuals in the report page(Page level filters), or applicable to all the pages of the report(report level filters)
2.How to sort data in Power BI?
Sorting is available in multiple formats. In the data view, a common sorting option of alphabetical order is there. Apart from that, we have the option of Sort by column, where one can sort a column based on another column. The sorting option is available in visuals as well. Sort by ascending and descending option by the fields and measure present in the visual is also available.
3.How to convert pdf to excel?
Open the PDF document you want to convert in XLSX format in Acrobat DC.
Go to the right pane and click on the โExport PDFโ option.
Choose spreadsheet as the Export format.
Select โMicrosoft Excel Workbook.โ
Now click โExport.โ
Download the converted file or share it.
4. How to enable macros in excel?
Click the file tab and then click โOptions.โ
A dialog box will appear. In the โExcel Optionsโ dialog box, click on the โTrust Centerโ and then โTrust Center Settings.โ
Go to the โMacro Settingsโ and select โenable all macros.โ
Click OK to apply the macro settings.
๐30โค8๐1๐1
10 Project Ideas for Data Analyst
๐๐
https://www.instagram.com/reel/C2yjxX8N09I/?igsh=MTZ0YnNkMnc4NnV4dA==
๐๐
https://www.instagram.com/reel/C2yjxX8N09I/?igsh=MTZ0YnNkMnc4NnV4dA==
๐4
Top 20 #SQL INTERVIEW QUESTIONS
1๏ธโฃ Explain Order of Execution of SQL query
2๏ธโฃ Provide a use case for each of the functions Rank, Dense_Rank & Row_Number ( ๐ก majority struggle )
3๏ธโฃ Write a query to find the cumulative sum/Running Total
4๏ธโฃ Find the Most selling product by sales/ highest Salary of employees
5๏ธโฃ Write a query to find the 2nd/nth highest Salary of employees
6๏ธโฃ Difference between union vs union all
7๏ธโฃ Identify if there any duplicates in a table
8๏ธโฃ Scenario based Joins question, understanding of Inner, Left and Outer Joins via simple yet tricky question
9๏ธโฃ LAG, write a query to find all those records where the transaction value is greater then previous transaction value
1๏ธโฃ 0๏ธโฃ Rank vs Dense Rank, query to find the 2nd highest Salary of employee
( Ideal soln should handle ties)
1๏ธโฃ 1๏ธโฃ Write a query to find the Running Difference (Ideal sol'n using windows function)
1๏ธโฃ 2๏ธโฃ Write a query to display year on year/month on month growth
1๏ธโฃ 3๏ธโฃ Write a query to find rolling average of daily sign-ups
1๏ธโฃ 4๏ธโฃ Write a query to find the running difference using self join (helps in understanding the logical approach, ideally this question is solved via windows function)
1๏ธโฃ 5๏ธโฃ Write a query to find the cumulative sum using self join
(you can use windows function to solve this question)
1๏ธโฃ6๏ธโฃ Differentiate between a clustered index and a non-clustered index?
1๏ธโฃ7๏ธโฃ What is a Candidate key?
1๏ธโฃ8๏ธโฃWhat is difference between Primary key and Unique key?
1๏ธโฃ9๏ธโฃWhat's the difference between RANK & DENSE_RANK in SQL?
2๏ธโฃ0๏ธโฃ Whats the difference between LAG & LEAD in SQL?
Access SQL Learning Series for Free: https://t.iss.one/sqlspecialist/523
Hope it helps :)
1๏ธโฃ Explain Order of Execution of SQL query
2๏ธโฃ Provide a use case for each of the functions Rank, Dense_Rank & Row_Number ( ๐ก majority struggle )
3๏ธโฃ Write a query to find the cumulative sum/Running Total
4๏ธโฃ Find the Most selling product by sales/ highest Salary of employees
5๏ธโฃ Write a query to find the 2nd/nth highest Salary of employees
6๏ธโฃ Difference between union vs union all
7๏ธโฃ Identify if there any duplicates in a table
8๏ธโฃ Scenario based Joins question, understanding of Inner, Left and Outer Joins via simple yet tricky question
9๏ธโฃ LAG, write a query to find all those records where the transaction value is greater then previous transaction value
1๏ธโฃ 0๏ธโฃ Rank vs Dense Rank, query to find the 2nd highest Salary of employee
( Ideal soln should handle ties)
1๏ธโฃ 1๏ธโฃ Write a query to find the Running Difference (Ideal sol'n using windows function)
1๏ธโฃ 2๏ธโฃ Write a query to display year on year/month on month growth
1๏ธโฃ 3๏ธโฃ Write a query to find rolling average of daily sign-ups
1๏ธโฃ 4๏ธโฃ Write a query to find the running difference using self join (helps in understanding the logical approach, ideally this question is solved via windows function)
1๏ธโฃ 5๏ธโฃ Write a query to find the cumulative sum using self join
(you can use windows function to solve this question)
1๏ธโฃ6๏ธโฃ Differentiate between a clustered index and a non-clustered index?
1๏ธโฃ7๏ธโฃ What is a Candidate key?
1๏ธโฃ8๏ธโฃWhat is difference between Primary key and Unique key?
1๏ธโฃ9๏ธโฃWhat's the difference between RANK & DENSE_RANK in SQL?
2๏ธโฃ0๏ธโฃ Whats the difference between LAG & LEAD in SQL?
Access SQL Learning Series for Free: https://t.iss.one/sqlspecialist/523
Hope it helps :)
๐28โค4๐2๐ค2๐1
SQL (Structured Query Language) is a standard programming language used to manage and manipulate relational databases. Here are some key concepts to understand the basics of SQL:
1. Database: A database is a structured collection of data organized in tables, which consist of rows and columns.
2. Table: A table is a collection of related data organized in rows and columns. Each row represents a record, and each column represents a specific attribute or field.
3. Query: A SQL query is a request for data or information from a database. Queries are used to retrieve, insert, update, or delete data in a database.
4. CRUD Operations: CRUD stands for Create, Read, Update, and Delete. These are the basic operations performed on data in a database using SQL:
- Create (INSERT): Adds new records to a table.
- Read (SELECT): Retrieves data from one or more tables.
- Update (UPDATE): Modifies existing records in a table.
- Delete (DELETE): Removes records from a table.
5. Data Types: SQL supports various data types to define the type of data that can be stored in each column of a table, such as integer, text, date, and decimal.
6. Constraints: Constraints are rules enforced on data columns to ensure data integrity and consistency. Common constraints include:
- Primary Key: Uniquely identifies each record in a table.
- Foreign Key: Establishes a relationship between two tables.
- Unique: Ensures that all values in a column are unique.
- Not Null: Specifies that a column cannot contain NULL values.
7. Joins: Joins are used to combine rows from two or more tables based on a related column between them. Common types of joins include INNER JOIN, LEFT JOIN (or LEFT OUTER JOIN), RIGHT JOIN (or RIGHT OUTER JOIN), and FULL JOIN (or FULL OUTER JOIN).
8. Aggregate Functions: SQL provides aggregate functions to perform calculations on sets of values. Common aggregate functions include SUM, AVG, COUNT, MIN, and MAX.
9. Group By: The GROUP BY clause is used to group rows that have the same values into summary rows. It is often used with aggregate functions to perform calculations on grouped data.
10. Order By: The ORDER BY clause is used to sort the result set of a query based on one or more columns in ascending or descending order.
Understanding these basic concepts of SQL will help you write queries to interact with databases effectively. Practice writing SQL queries and experimenting with different commands to become proficient in using SQL for database management and manipulation.
1. Database: A database is a structured collection of data organized in tables, which consist of rows and columns.
2. Table: A table is a collection of related data organized in rows and columns. Each row represents a record, and each column represents a specific attribute or field.
3. Query: A SQL query is a request for data or information from a database. Queries are used to retrieve, insert, update, or delete data in a database.
4. CRUD Operations: CRUD stands for Create, Read, Update, and Delete. These are the basic operations performed on data in a database using SQL:
- Create (INSERT): Adds new records to a table.
- Read (SELECT): Retrieves data from one or more tables.
- Update (UPDATE): Modifies existing records in a table.
- Delete (DELETE): Removes records from a table.
5. Data Types: SQL supports various data types to define the type of data that can be stored in each column of a table, such as integer, text, date, and decimal.
6. Constraints: Constraints are rules enforced on data columns to ensure data integrity and consistency. Common constraints include:
- Primary Key: Uniquely identifies each record in a table.
- Foreign Key: Establishes a relationship between two tables.
- Unique: Ensures that all values in a column are unique.
- Not Null: Specifies that a column cannot contain NULL values.
7. Joins: Joins are used to combine rows from two or more tables based on a related column between them. Common types of joins include INNER JOIN, LEFT JOIN (or LEFT OUTER JOIN), RIGHT JOIN (or RIGHT OUTER JOIN), and FULL JOIN (or FULL OUTER JOIN).
8. Aggregate Functions: SQL provides aggregate functions to perform calculations on sets of values. Common aggregate functions include SUM, AVG, COUNT, MIN, and MAX.
9. Group By: The GROUP BY clause is used to group rows that have the same values into summary rows. It is often used with aggregate functions to perform calculations on grouped data.
10. Order By: The ORDER BY clause is used to sort the result set of a query based on one or more columns in ascending or descending order.
Understanding these basic concepts of SQL will help you write queries to interact with databases effectively. Practice writing SQL queries and experimenting with different commands to become proficient in using SQL for database management and manipulation.
๐46โค12๐2
Data Analyst Checklist to review before your next interview in 2024 ๐๐
https://www.linkedin.com/posts/sqlspecialist_dataanalyst-dataanalytics-dataanalysis-activity-7161585338870710272--VGI?utm_source=share&utm_medium=member_android
https://www.linkedin.com/posts/sqlspecialist_dataanalyst-dataanalytics-dataanalysis-activity-7161585338870710272--VGI?utm_source=share&utm_medium=member_android
๐8โค2
Here are some commonly asked SQL interview questions along with brief answers:
1. What is SQL?
- SQL stands for Structured Query Language, used for managing and manipulating relational databases.
2. What are the types of SQL commands?
- SQL commands can be broadly categorized into four types: Data Definition Language (DDL), Data Manipulation Language (DML), Data Control Language (DCL), and Transaction Control Language (TCL).
3. What is the difference between CHAR and VARCHAR data types?
- CHAR is a fixed-length character data type, while VARCHAR is a variable-length character data type. CHAR will always occupy the same amount of storage space, while VARCHAR will only use the necessary space to store the actual data.
4. What is a primary key?
- A primary key is a column or a set of columns that uniquely identifies each row in a table. It ensures data integrity by enforcing uniqueness and can be used to establish relationships between tables.
5. What is a foreign key?
- A foreign key is a column or a set of columns in one table that refers to the primary key in another table. It establishes a relationship between two tables and ensures referential integrity.
6. What is a JOIN in SQL?
- JOIN is used to combine rows from two or more tables based on a related column between them. There are different types of JOINs, including INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.
7. What is the difference between INNER JOIN and OUTER JOIN?
- INNER JOIN returns only the rows that have matching values in both tables, while OUTER JOIN (LEFT, RIGHT, FULL) returns all rows from one or both tables, with NULL values in columns where there is no match.
8. What is the difference between GROUP BY and ORDER BY?
- GROUP BY is used to group rows that have the same values into summary rows, typically used with aggregate functions like SUM, COUNT, AVG, etc., while ORDER BY is used to sort the result set based on one or more columns.
9. What is a subquery?
- A subquery is a query nested within another query, used to return data that will be used in the main query. Subqueries can be used in SELECT, INSERT, UPDATE, and DELETE statements.
10. What is normalization in SQL?
- Normalization is the process of organizing data in a database to reduce redundancy and dependency. It involves dividing large tables into smaller tables and defining relationships between them to improve data integrity and efficiency.
Around 90% questions will be asked from sql in data analytics interview, so please make sure to practice SQL skills using websites like stratascratch. โบ๏ธ๐ช
1. What is SQL?
- SQL stands for Structured Query Language, used for managing and manipulating relational databases.
2. What are the types of SQL commands?
- SQL commands can be broadly categorized into four types: Data Definition Language (DDL), Data Manipulation Language (DML), Data Control Language (DCL), and Transaction Control Language (TCL).
3. What is the difference between CHAR and VARCHAR data types?
- CHAR is a fixed-length character data type, while VARCHAR is a variable-length character data type. CHAR will always occupy the same amount of storage space, while VARCHAR will only use the necessary space to store the actual data.
4. What is a primary key?
- A primary key is a column or a set of columns that uniquely identifies each row in a table. It ensures data integrity by enforcing uniqueness and can be used to establish relationships between tables.
5. What is a foreign key?
- A foreign key is a column or a set of columns in one table that refers to the primary key in another table. It establishes a relationship between two tables and ensures referential integrity.
6. What is a JOIN in SQL?
- JOIN is used to combine rows from two or more tables based on a related column between them. There are different types of JOINs, including INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.
7. What is the difference between INNER JOIN and OUTER JOIN?
- INNER JOIN returns only the rows that have matching values in both tables, while OUTER JOIN (LEFT, RIGHT, FULL) returns all rows from one or both tables, with NULL values in columns where there is no match.
8. What is the difference between GROUP BY and ORDER BY?
- GROUP BY is used to group rows that have the same values into summary rows, typically used with aggregate functions like SUM, COUNT, AVG, etc., while ORDER BY is used to sort the result set based on one or more columns.
9. What is a subquery?
- A subquery is a query nested within another query, used to return data that will be used in the main query. Subqueries can be used in SELECT, INSERT, UPDATE, and DELETE statements.
10. What is normalization in SQL?
- Normalization is the process of organizing data in a database to reduce redundancy and dependency. It involves dividing large tables into smaller tables and defining relationships between them to improve data integrity and efficiency.
Around 90% questions will be asked from sql in data analytics interview, so please make sure to practice SQL skills using websites like stratascratch. โบ๏ธ๐ช
๐45โค9๐1๐1
SQL Interview Questions with Answers
1. What is a primary key and why is it important in a database?
- A primary key is a unique identifier for each record in a database table. It is important because it ensures that each record can be uniquely identified and helps maintain data integrity by preventing duplicate or null values.
2. Can you explain the difference between INNER JOIN and OUTER JOIN in SQL?
- INNER JOIN returns only the rows that have matching values in both tables, while OUTER JOIN returns all rows from one table and the matched rows from the other table (or null values if there is no match).
3. How do you optimize a SQL query for better performance?
- To optimize a SQL query, you can use indexes, avoid using SELECT *, limit the number of columns selected, use appropriate data types, and avoid using functions in WHERE clauses.
4. What is normalization and why is it important in database design?
- Normalization is the process of organizing data in a database to reduce redundancy and dependency. It is important because it helps improve data integrity, reduce storage space, and make data maintenance easier.
5. How do you handle missing data in SQL queries?
- You can handle missing data in SQL queries by using functions like COALESCE or IFNULL to replace null values with a default value, or by using the IS NULL or IS NOT NULL operators to filter out records with missing data.
6. Can you explain the difference between GROUP BY and HAVING clauses in SQL?
- GROUP BY is used to group rows that have the same values into summary rows, while HAVING is used to filter groups based on specified conditions after the GROUP BY clause has been applied.
7. How do you identify and remove duplicate records from a database table?
- You can identify duplicate records by using the DISTINCT keyword or by using the GROUP BY clause with COUNT() function. To remove duplicate records, you can use the DELETE statement with a subquery that identifies the duplicates.
8. How do you write a subquery in SQL?
- A subquery is a query nested within another query. You can write a subquery by enclosing the inner query within parentheses and using it as a part of the outer query's WHERE, FROM, or SELECT clause.
9. What is the difference between a view and a table in SQL?
- A table stores actual data in a database, while a view is a virtual table that displays data from one or more tables based on a predefined query. Views do not store data themselves but provide a way to present data in a specific format.
10. How do you use indexes to improve query performance in SQL?
- Indexes are used to speed up data retrieval in SQL queries by creating an ordered list of values for one or more columns in a table. You can create indexes on columns frequently used in WHERE, JOIN, or ORDER BY clauses to improve query performance.
Hope it helps :)
1. What is a primary key and why is it important in a database?
- A primary key is a unique identifier for each record in a database table. It is important because it ensures that each record can be uniquely identified and helps maintain data integrity by preventing duplicate or null values.
2. Can you explain the difference between INNER JOIN and OUTER JOIN in SQL?
- INNER JOIN returns only the rows that have matching values in both tables, while OUTER JOIN returns all rows from one table and the matched rows from the other table (or null values if there is no match).
3. How do you optimize a SQL query for better performance?
- To optimize a SQL query, you can use indexes, avoid using SELECT *, limit the number of columns selected, use appropriate data types, and avoid using functions in WHERE clauses.
4. What is normalization and why is it important in database design?
- Normalization is the process of organizing data in a database to reduce redundancy and dependency. It is important because it helps improve data integrity, reduce storage space, and make data maintenance easier.
5. How do you handle missing data in SQL queries?
- You can handle missing data in SQL queries by using functions like COALESCE or IFNULL to replace null values with a default value, or by using the IS NULL or IS NOT NULL operators to filter out records with missing data.
6. Can you explain the difference between GROUP BY and HAVING clauses in SQL?
- GROUP BY is used to group rows that have the same values into summary rows, while HAVING is used to filter groups based on specified conditions after the GROUP BY clause has been applied.
7. How do you identify and remove duplicate records from a database table?
- You can identify duplicate records by using the DISTINCT keyword or by using the GROUP BY clause with COUNT() function. To remove duplicate records, you can use the DELETE statement with a subquery that identifies the duplicates.
8. How do you write a subquery in SQL?
- A subquery is a query nested within another query. You can write a subquery by enclosing the inner query within parentheses and using it as a part of the outer query's WHERE, FROM, or SELECT clause.
9. What is the difference between a view and a table in SQL?
- A table stores actual data in a database, while a view is a virtual table that displays data from one or more tables based on a predefined query. Views do not store data themselves but provide a way to present data in a specific format.
10. How do you use indexes to improve query performance in SQL?
- Indexes are used to speed up data retrieval in SQL queries by creating an ordered list of values for one or more columns in a table. You can create indexes on columns frequently used in WHERE, JOIN, or ORDER BY clauses to improve query performance.
Hope it helps :)
๐44โค13๐ค4๐2๐ฅฐ1
1. What are the ways to detect outliers?
Outliers are detected using two methods:
Box Plot Method: According to this method, the value is considered an outlier if it exceeds or falls below 1.5*IQR (interquartile range), that is, if it lies above the top quartile (Q3) or below the bottom quartile (Q1).
Standard Deviation Method: According to this method, an outlier is defined as a value that is greater or lower than the mean ยฑ (3*standard deviation).
2. What is a Recursive Stored Procedure?
A stored procedure that calls itself until a boundary condition is reached, is called a recursive stored procedure. This recursive function helps the programmers to deploy the same set of code several times as and when required.
3. What is the shortcut to add a filter to a table in EXCEL?
The filter mechanism is used when you want to display only specific data from the entire dataset. By doing so, there is no change being made to the data. The shortcut to add a filter to a table is Ctrl+Shift+L.
4. What is DAX in Power BI?
DAX stands for Data Analysis Expressions. It's a collection of functions, operators, and constants used in formulas to calculate and return values. In other words, it helps you create new info from data you already have.
Outliers are detected using two methods:
Box Plot Method: According to this method, the value is considered an outlier if it exceeds or falls below 1.5*IQR (interquartile range), that is, if it lies above the top quartile (Q3) or below the bottom quartile (Q1).
Standard Deviation Method: According to this method, an outlier is defined as a value that is greater or lower than the mean ยฑ (3*standard deviation).
2. What is a Recursive Stored Procedure?
A stored procedure that calls itself until a boundary condition is reached, is called a recursive stored procedure. This recursive function helps the programmers to deploy the same set of code several times as and when required.
3. What is the shortcut to add a filter to a table in EXCEL?
The filter mechanism is used when you want to display only specific data from the entire dataset. By doing so, there is no change being made to the data. The shortcut to add a filter to a table is Ctrl+Shift+L.
4. What is DAX in Power BI?
DAX stands for Data Analysis Expressions. It's a collection of functions, operators, and constants used in formulas to calculate and return values. In other words, it helps you create new info from data you already have.
๐29โค6
1. How can we deal with problems that arise when the data flows in from a variety of sources?
There are many ways to go about dealing with multi-source problems. However, these are done primarily to solve the problems of:
Identifying the presence of similar/same records and merging them into a single recordRe-structuring the schema to ensure there is good schema integration
2. Where is Time Series Analysis used?
Since time series analysis (TSA) has a wide scope of usage, it can be used in multiple domains. Here are some of the places where TSA plays an important role:
Statistics
Signal processing
Econometrics
Weather forecasting
Earthquake prediction
Astronomy
Applied science
3. What are the ideal situations in which t-test or z-test can be used?
It is a standard practice that a t-test is used when there is a sample size less than 30 and the z-test is considered when the sample size exceeds 30 in most cases.
4. What is the usage of the NVL() function?
The NVL() function is used to convert the NULL value to the other value. The function returns the value of the second parameter if the first parameter is NULL. If the first parameter is anything other than NULL, it is left unchanged. This function is used in Oracle, not in SQL and MySQL. Instead of NVL() function, MySQL have IFNULL() and SQL Server have ISNULL() function.
5. What is the difference between DROP and TRUNCATE commands?
If a table is dropped, all things associated with that table are dropped as well. This includes the relationships defined on the table with other tables, access privileges, and grants that the table has, as well as the integrity checks and constraints.
However, if a table is truncated, there are no such problems as mentioned above. The table retains its original structure and the data is dropped.
There are many ways to go about dealing with multi-source problems. However, these are done primarily to solve the problems of:
Identifying the presence of similar/same records and merging them into a single recordRe-structuring the schema to ensure there is good schema integration
2. Where is Time Series Analysis used?
Since time series analysis (TSA) has a wide scope of usage, it can be used in multiple domains. Here are some of the places where TSA plays an important role:
Statistics
Signal processing
Econometrics
Weather forecasting
Earthquake prediction
Astronomy
Applied science
3. What are the ideal situations in which t-test or z-test can be used?
It is a standard practice that a t-test is used when there is a sample size less than 30 and the z-test is considered when the sample size exceeds 30 in most cases.
4. What is the usage of the NVL() function?
The NVL() function is used to convert the NULL value to the other value. The function returns the value of the second parameter if the first parameter is NULL. If the first parameter is anything other than NULL, it is left unchanged. This function is used in Oracle, not in SQL and MySQL. Instead of NVL() function, MySQL have IFNULL() and SQL Server have ISNULL() function.
5. What is the difference between DROP and TRUNCATE commands?
If a table is dropped, all things associated with that table are dropped as well. This includes the relationships defined on the table with other tables, access privileges, and grants that the table has, as well as the integrity checks and constraints.
However, if a table is truncated, there are no such problems as mentioned above. The table retains its original structure and the data is dropped.
๐25โค5๐2
Important questions for data analyst interview๐๐
1. Can you walk me through a project where you had to analyze a large dataset and draw meaningful insights from it?
2. How do you ensure the accuracy and reliability of your analysis results?
3. What programming languages and tools are you proficient in for data analysis?
4. How do you approach data cleaning and preprocessing before conducting analysis?
5. Can you give an example of a time when you had to communicate complex data analysis results to non-technical stakeholders?
6. How do you stay current with industry trends and best practices in data analysis?
7. Have you ever worked with machine learning algorithms or predictive modeling? If so, can you provide an example of a project where you applied these techniques?
8. How do you handle missing or incomplete data in your analysis process?
9. Can you discuss a challenging problem you encountered during a data analysis project and how you overcame it?
10. How do you prioritize and manage multiple projects or tasks simultaneously as a data analyst?
1. Can you walk me through a project where you had to analyze a large dataset and draw meaningful insights from it?
2. How do you ensure the accuracy and reliability of your analysis results?
3. What programming languages and tools are you proficient in for data analysis?
4. How do you approach data cleaning and preprocessing before conducting analysis?
5. Can you give an example of a time when you had to communicate complex data analysis results to non-technical stakeholders?
6. How do you stay current with industry trends and best practices in data analysis?
7. Have you ever worked with machine learning algorithms or predictive modeling? If so, can you provide an example of a project where you applied these techniques?
8. How do you handle missing or incomplete data in your analysis process?
9. Can you discuss a challenging problem you encountered during a data analysis project and how you overcame it?
10. How do you prioritize and manage multiple projects or tasks simultaneously as a data analyst?
๐36โค5
1. What is a Self-Join?
A self-join is a type of join that can be used to connect two tables. As a result, it is a unary relationship. Each row of the table is attached to itself and all other rows of the same table in a self-join. As a result, a self-join is mostly used to combine and compare rows from the same database table.
2. What is OLTP?
OLTP, or online transactional processing, allows huge groups of people to execute massive amounts of database transactions in real time, usually via the internet. A database transaction occurs when data in a database is changed, inserted, deleted, or queried.
3. What is the difference between joining and blending in Tableau?
Joining term is used when you are combining data from the same source, for example, worksheet in an Excel file or tables in Oracle databaseWhile blending requires two completely defined data sources in your report.
4. How to prevent someone from copying the cell from your worksheet in excel?
If you want to protect your worksheet from being copied, go into Menu bar > Review > Protect sheet > Password.
By entering password you can prevent your worksheet from getting copied.
5. What is the difference between the RANK() and DENSE_RANK() functions?
The RANK() function in the result set defines the rank of each row within your ordered partition. If both rows have the same rank, the next number in the ranking will be the previous rank plus a number of duplicates. If we have three records at rank 4, for example, the next level indicated is 7. The DENSE_RANK() function assigns a distinct rank to each row within a partition based on the provided column value, with no gaps. If we have three records at rank 4, for example, the next level indicated is 5.
A self-join is a type of join that can be used to connect two tables. As a result, it is a unary relationship. Each row of the table is attached to itself and all other rows of the same table in a self-join. As a result, a self-join is mostly used to combine and compare rows from the same database table.
2. What is OLTP?
OLTP, or online transactional processing, allows huge groups of people to execute massive amounts of database transactions in real time, usually via the internet. A database transaction occurs when data in a database is changed, inserted, deleted, or queried.
3. What is the difference between joining and blending in Tableau?
Joining term is used when you are combining data from the same source, for example, worksheet in an Excel file or tables in Oracle databaseWhile blending requires two completely defined data sources in your report.
4. How to prevent someone from copying the cell from your worksheet in excel?
If you want to protect your worksheet from being copied, go into Menu bar > Review > Protect sheet > Password.
By entering password you can prevent your worksheet from getting copied.
5. What is the difference between the RANK() and DENSE_RANK() functions?
The RANK() function in the result set defines the rank of each row within your ordered partition. If both rows have the same rank, the next number in the ranking will be the previous rank plus a number of duplicates. If we have three records at rank 4, for example, the next level indicated is 7. The DENSE_RANK() function assigns a distinct rank to each row within a partition based on the provided column value, with no gaps. If we have three records at rank 4, for example, the next level indicated is 5.
๐34โค5๐2
My journey as a data analyst so far: https://www.linkedin.com/posts/sql-analysts_today-i-decided-to-share-my-journey-in-data-activity-7169341885914853376-GWnd?utm_source=share&utm_medium=member_android
Feel free to write down your opinion in the comments.
I usually don't share my personal journey online but sometimes even small things like this help others.
I hope you guys can relate with me โค๏ธ
Feel free to write down your opinion in the comments.
I usually don't share my personal journey online but sometimes even small things like this help others.
I hope you guys can relate with me โค๏ธ
โค13๐7๐ฅฐ4