Data Analytics

Requirements for a data analyst, based on job profiles from the @getjobss channel

👉 Strong data handling, data modeling & data flow understanding

👉 Ability to write complex SQL queries to manipulate and consolidate multiple data sources for dashboarding and analysis

👉 Intuition for data and ability to handle big data sources

👉 Strong working knowledge of Excel and visualization tools like Power BI, Tableau, QlikView

👉 Ability to work on ambiguous tasks, find suitable solutions, and seek help/advice where appropriate.

Data Analytics

What is the full form of TCL in SQL?

Anonymous Quiz

Transaction Control Language (89%)

Total Control Level (11%)

Data Analytics

Which of the following is not available in the Marks card of Tableau?

Anonymous Quiz

Data Analytics

Which of the following clauses is used to sort data in SQL?

Anonymous Quiz

Data Analytics

Which of the following clauses is not available in SQL?

Anonymous Quiz

Data Analytics

1. Define the term 'Data Wrangling'.

Data wrangling is the process in which raw data is cleaned, structured, and enriched into a usable format for better decision making. It involves discovering, structuring, cleaning, enriching, validating, and analyzing data. This process can transform and map large amounts of data extracted from various sources into a more useful format.

2. What are the best methods for data cleaning?

Create a data cleaning plan by understanding where the common errors take place, and keep all communications open. Before working with the data, identify and remove the duplicates. This will lead to an easy and effective data analysis process.
Focus on the accuracy of the data. Set cross-field validation, maintain the value types of data, and provide mandatory constraints.
Normalize the data at the entry point so that it is less chaotic. You will be able to ensure that all information is standardized, leading to fewer errors on entry.

3. Explain Type I and Type II errors in statistics.

In hypothesis testing, a Type I error occurs when the null hypothesis is rejected even though it is true. It is also known as a false positive.

A Type II error occurs when the null hypothesis is not rejected even though it is false. It is also known as a false negative.

4. How do you make a dropdown list in MS Excel?

First, click on the Data tab in the ribbon.
Under the Data Tools group, select Data Validation.
Then navigate to Settings > Allow > List.
Select the source you want to provide as a list array.

5. State some ways to improve the performance of Tableau?

Use an Extract to make workbooks run faster.
Reduce the scope of data to decrease the volume of data.
Reduce the number of marks on the view to avoid information overload.
Hide unused fields.
Use Context filters.
Use indexing in tables and use the same fields for filtering.
Remove unnecessary calculations and sheets.

Data Analytics

Which of the following is not a join type in SQL?

Anonymous Quiz

Data Analytics

Which of the following is not a data visualization tool?

Anonymous Quiz

Data Analytics

What is the full form of DAX in Power BI?

Anonymous Quiz

Data Analysis Expressions (87%)

Data Acronym Experts (13%)

Data Analytics

What is the full form of DML in database language?

Anonymous Quiz

Data manipulation language (87%)

Data munging language

Data management language

Data machine language

Data Analytics

🗂The order of operations used in MS Excel while evaluating formulas

MS Excel follows a standard math protocol to evaluate a formula.

This protocol is called “order of operations” – PEMDAS –

~Parentheses
~Exponents
~Multiplication
~Division
~Addition
~Subtraction

MS Excel also applies some customization to handle the formula syntax.

The order in which MS Excel performs calculations can affect the return value of the formula.

First of all, Excel evaluates any expressions in parentheses.

As in ordinary mathematical formulae, parentheses override the normal order of operations and give priority to the operations inside them.

Next, Excel resolves cell references like A1 (cell address). It evaluates range references like A1:A10, making them arrays of values.

It also performs range operations like a union (comma) and an intersection (space).

Next, Excel performs –

-Exponentiation
-Negation
-% conversions
-Multiplication and division
-Addition and subtraction
-Concatenation
-Logical operators
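One consequence of the list above is that Excel applies negation before exponentiation, which differs from most programming languages. The short Python sketch below shows the contrast for comparison only. Python is not Excel, and the Excel reading is reproduced here with explicit parentheses.

```python
# Python binds exponentiation tighter than unary minus,
# so -2 ** 2 is read as -(2 ** 2).
python_result = -2 ** 2

# Excel applies negation first, so =-2^2 is read as (-2)^2.
# Explicit parentheses reproduce that reading in Python.
excel_style_result = (-2) ** 2

print(python_result)       # -4
print(excel_style_result)  # 4
```

When a formula's intent matters, adding parentheses removes the ambiguity in both tools.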

Data Analytics

We are 10k+ now before the new year 💪

Here is a special channel where you will find FREE Data Analysis Books
👇👇
https://t.iss.one/learndataanalysis

You guys are amazing

Thanks for sharing and supporting the channel ❤️❤️

Data Analytics

1. What is the meaning of dropout in Deep Learning?

Dropout is a regularization technique used to reduce overfitting in deep learning. During training it randomly sets a fraction of a layer's units to zero on each pass, so the network cannot rely on any single unit. If the dropout rate is too low, it has little effect on learning. If it is too high, the model can underfit and perform poorly.
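A minimal sketch of the idea in plain Python, assuming the common "inverted dropout" variant, in which surviving values are scaled up so the expected sum stays the same. The function name `dropout` and the sample values are illustrative and not taken from any library.

```python
import random

def dropout(activations, rate, rng):
    """Zero each value with probability `rate` and scale the
    survivors by 1 / (1 - rate) (inverted dropout)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)                 # seeded so runs are repeatable
layer_output = [1.0, 2.0, 3.0, 4.0]    # hypothetical activations
dropped = dropout(layer_output, rate=0.5, rng=rng)
print(dropped)
```

Dropout is applied only during training. At prediction time the layer output is used unchanged.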

2. What are sets in Tableau?

Sets are custom fields that define a subset of data based on some conditions. A set can be based on a computed condition. For example, a set may contain customers with sales over a certain threshold. Computed sets update as your data changes. Alternatively, a set can be based on specific data points in your view.

3. What is the difference between DROP and TRUNCATE commands?

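The DROP command removes the table itself, including its structure and all of its data. The TRUNCATE command removes all the rows but keeps the table structure, so the table can be used again. In many database systems (for example Oracle and MySQL) DROP is committed automatically and cannot be rolled back.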
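The difference can be seen with a rough sketch using Python's built-in sqlite3 module. SQLite has no TRUNCATE statement, so an unqualified DELETE stands in for it here. That substitution, the table name `orders`, and the helper `table_exists` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
conn.executemany("INSERT INTO orders (item) VALUES (?)", [("pen",), ("ink",)])

def table_exists(name):
    """Return True if a table with this name is in the schema."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None

# Stand-in for TRUNCATE: every row is removed, the table stays.
conn.execute("DELETE FROM orders")
rows_after_delete = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
exists_after_delete = table_exists("orders")

# DROP removes the table itself.
conn.execute("DROP TABLE orders")
exists_after_drop = table_exists("orders")

print(rows_after_delete, exists_after_delete, exists_after_drop)  # 0 True False
```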

4. What is slicing in Python?

Ans: Slicing is used to access parts of sequences such as lists, tuples, and strings. The syntax is [start:end:step], and the step can be omitted. Writing [start:end] returns the elements from index start (inclusive) up to index end-1. A negative start or end of -i refers to the ith element from the end.
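A few slices on a sample list and string illustrate the rules above. The variable names are only for illustration.

```python
letters = ["a", "b", "c", "d", "e", "f"]

first_three = letters[0:3]     # start inclusive, end exclusive
every_second = letters[::2]    # step of 2 over the whole list
last_two = letters[-2:]        # negative index counts from the end
reversed_copy = letters[::-1]  # negative step walks backwards
word_part = "analytics"[1:4]   # slicing works on strings too

print(first_three)    # ['a', 'b', 'c']
print(every_second)   # ['a', 'c', 'e']
print(last_two)       # ['e', 'f']
print(reversed_copy)  # ['f', 'e', 'd', 'c', 'b', 'a']
print(word_part)      # nal
```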

Data Analytics

1. What is the difference between the RANK() and DENSE_RANK() functions?

The RANK() function assigns a rank to each row within an ordered partition. Rows with the same value share a rank, and the next rank is the shared rank plus the number of tied rows, so gaps appear. If three records share rank 4, for example, the next rank is 7. The DENSE_RANK() function also gives tied rows the same rank but leaves no gaps. If three records share rank 4, for example, the next rank is 5.
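The "three records at rank 4" case can be reproduced with a small sketch using Python's built-in sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added. The table `results` and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (player TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?)",
    [("A", 100), ("B", 90), ("C", 80), ("D", 70), ("E", 70), ("F", 70), ("G", 60)],
)

rows = conn.execute(
    """
    SELECT player,
           score,
           RANK()       OVER (ORDER BY score DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk
    FROM results
    ORDER BY score DESC, player
    """
).fetchall()

for row in rows:
    print(row)
# Players D, E, F tie at rank 4; G gets 7 from RANK() and 5 from DENSE_RANK().
```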

2. Explain One-hot encoding and Label Encoding. How do they affect the dimensionality of the given dataset?

One-hot encoding represents a categorical variable as binary vectors. Label encoding converts labels/words into numeric form. One-hot encoding increases the dimensionality of the data set, because it creates a new 0/1 variable for each level of the original variable. Label encoding does not affect the dimensionality, because the levels are encoded as integers (0, 1, 2, ...) within a single column.
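A hand-rolled sketch in plain Python makes the dimensionality difference visible. The sample column `colors` is hypothetical, and real projects would normally use a library encoder instead.

```python
colors = ["red", "green", "blue", "green"]
levels = sorted(set(colors))  # ['blue', 'green', 'red']

# Label encoding: one integer per level, still a single column.
label_of = {level: i for i, level in enumerate(levels)}
label_encoded = [label_of[c] for c in colors]

# One-hot encoding: one 0/1 column per level.
one_hot_encoded = [[1 if c == level else 0 for level in levels] for c in colors]

print(label_encoded)    # [2, 1, 0, 1]
print(one_hot_encoded)  # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

The label-encoded result is one column wide, while the one-hot result has as many columns as there are levels.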

3. Explain the difference between a worksheet, dashboard, story, and workbook in Tableau.

Tableau uses a workbook and sheet file structure, much like Microsoft Excel.
A workbook contains sheets, which can be a worksheet, dashboard, or a story.
A worksheet contains a single view along with shelves, legends, and the Data pane.
A dashboard is a collection of views from multiple worksheets.
A story contains a sequence of worksheets or dashboards that work together to convey information.

4. How can you split a column into 2 or more columns in Excel?

You can split a column into 2 or more columns by following the steps below:
1. Select the cells that you want to split. Then navigate to the Data tab and select Text to Columns.
2. Select the delimiter.
3. Choose the column data format and select the destination where the split data should appear.
4. The text is then split into multiple columns.

5. Do you want to build a career in Data Science & Analytics but don't know how to start?

https://t.iss.one/sqlspecialist/94

Here is a complete roadmap from scratch that will make you technically strong enough to crack Data Analyst interviews, along with career growth hacks to help you land your dream job.

Data Analytics

1. What do Tableau's sets and groups mean?

Sets and groups both group data according to predefined criteria. The primary distinction is that a set has only two options, in or out, whereas a group can divide the dataset into several groups. A user should decide whether to apply groups or sets based on the requirements.

2. What do you mean by a Bag of Words (BOW)?

It is a text representation used to train classifiers on word frequency or occurrence.
It describes how often words appear in a document.
It involves two things:
-A vocabulary of known words.
-A measure of the presence of known words.
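The two parts listed above can be sketched in a few lines of plain Python. The sample documents and the helper name `to_vector` are made up for illustration, and word counts are used as the measure.

```python
from collections import Counter

documents = ["the cat sat", "the cat saw the dog"]

# Part 1: vocabulary of known words, in a fixed order.
vocabulary = sorted({word for doc in documents for word in doc.split()})

# Part 2: score each document by how often each vocabulary word occurs.
def to_vector(doc):
    counts = Counter(doc.split())
    return [counts[word] for word in vocabulary]

vectors = [to_vector(doc) for doc in documents]
print(vocabulary)  # ['cat', 'dog', 'sat', 'saw', 'the']
print(vectors)     # [[1, 0, 1, 0, 1], [1, 1, 0, 1, 2]]
```

Word order is discarded, which is why the representation is called a "bag".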

3. What are Nested Triggers?

Triggers may run DML by using INSERT, UPDATE, and DELETE statements. When the DML inside one trigger fires another trigger, the triggers are called nested triggers.

4. What is a True positive rate and a false positive rate?

True positive rate (recall): the proportion of actual positives that the model correctly predicts as positive.

TPR = TP/ (TP+FN)

False positive rate: the proportion of actual negatives that the model incorrectly predicts as positive.

FPR = FP/(FP+TN)
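Both formulas can be checked with a short sketch. The function name `rates` and the sample labels are illustrative, and the code assumes binary labels with at least one actual positive and one actual negative.

```python
def rates(y_true, y_pred):
    """Return (TPR, FPR) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tpr, fpr = rates(y_true, y_pred)
print(tpr)  # 0.75 (3 of 4 actual positives found)
print(fpr)  # 1 of 6 actual negatives flagged, about 0.167
```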

Data Analytics

1. What are the uses of RNNs in NLP?
The RNN is a stateful neural network, which means that it not only retains information from the previous layer but also from the previous pass. Thus, this neuron is said to have connections between passes, and through time.
For the RNN the order of the input matters due to being stateful. The same words with different orders will yield different outputs.
RNN can be used for unsegmented, connected applications such as handwriting recognition or speech recognition.

2. How do you remove values from a Python array?
Ans: Elements can be removed using the pop() or remove() method. pop() removes the element at a given index (the last one by default) and returns it, whereas remove() deletes the first element equal to a given value and returns nothing.
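A short illustration on a Python list (the sample values are arbitrary):

```python
values = [10, 20, 30, 20]

popped = values.pop(1)       # removes the element at index 1 and returns it
after_pop = list(values)     # copy of the list at this point

result = values.remove(20)   # removes the first element equal to 20
after_remove = list(values)

print(popped)        # 20
print(after_pop)     # [10, 30, 20]
print(result)        # None
print(after_remove)  # [10, 30]
```

Calling remove() with a value that is not present raises ValueError, and pop() with an out-of-range index raises IndexError.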

3. What are the advantages and disadvantages of views in a database?
Answer: Advantages of views:
A standard view stores no data of its own, so it produces output without taking up extra storage.
Data access can be restricted: a view exposes only the selected rows and columns, and many views do not allow insertion, update, or deletion.
Disadvantages of views:
A view becomes unusable if a table it depends on is dropped.
Querying a view over large tables can consume a lot of memory, because the underlying query runs each time the view is read.

4. How to create a calculated field in Tableau?
Click the drop down to the right of Dimensions on the Data pane and select “Create > Calculated Field” to open the calculation editor.
Name the new field and create a formula.

Data Analytics

1. What is CASE WHEN in SQL Server?

The CASE expression is used to construct logic in which one column's value is determined by the values of other columns.
A CASE expression contains at least one WHEN ... THEN pair. The WHEN clause specifies the condition to be tested. If the condition returns TRUE, the THEN clause specifies the result.

When none of the WHEN conditions return true, the ELSE clause is used. The END keyword closes the CASE expression.
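A small sketch of a CASE expression, run through Python's built-in sqlite3 module rather than SQL Server. The searched-CASE syntax used here is standard SQL, but the table `orders`, its rows, and the size bands are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 50.0), (2, 250.0), (3, 1200.0)],
)

rows = conn.execute(
    """
    SELECT id,
           CASE
               WHEN amount >= 1000 THEN 'large'
               WHEN amount >= 100  THEN 'medium'
               ELSE 'small'
           END AS size_band
    FROM orders
    ORDER BY id
    """
).fetchall()

print(rows)  # [(1, 'small'), (2, 'medium'), (3, 'large')]
```

The WHEN branches are tested in order, so the first matching condition decides the result.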

2. What is a relationship in SQL, and what types are there?

A database relationship is the connection between the tables in a database. There are various database relationships, and they are as follows:

One to One Relationship.

One to Many Relationship.

Many to One Relationship.

Self-Referencing Relationship.

3. What is the use of Cycle Fields in Tableau?

Cycle Fields lets you switch between different colour combinations or views in cyclic order. It works only on charts that allow more than one measure, such as a stacked bar chart, and is useful when you cannot settle on a visualization. To use it, open the Analysis menu in the toolbar and select Cycle Fields to take a quick look at an alternative visualization.

4. What is the difference between a function and a formula in Excel?

A formula is a user-defined expression that calculates a value. A function is a predefined, built-in operation that takes a specified number of arguments. A formula can be complex and can contain multiple functions. For example, =A1+A2 is a formula with no functions, and =SUM(A1:A10) is a formula that calls the SUM function.

Data Analytics

Q1. What are sets and groups in Tableau?

Sets and groups are used to group data based on specific conditions. The main difference between the two is that a group can divide the dataset into multiple groups, whereas a set has only two options: in or out. A user should choose between groups and sets based on the requirements.

Q2. What is Power Pivot & Power Query?

Power Pivot is an add-on provided by Microsoft for Excel since 2010. Power Pivot was designed to extend the analytical capabilities and services of Microsoft Excel.

Power Query is a business intelligence tool designed by Microsoft for Excel. Power Query allows you to import data from various data sources and will enable you to clean, transform and reshape your data as per the requirements. Power Query allows you to write your query once and then run it with a simple refresh.

Q3. State some ways to improve the performance of Tableau?

Use an Extract to make workbooks run faster
Reduce the scope of data to decrease the volume of data
Reduce the number of marks on the view to avoid information overload
Try to use integers or Booleans in calculations as they are much faster than strings
Hide unused fields
Use Context filters
Reduce filter usage and use some alternative way to achieve the same result
Use indexing in tables and use the same fields for filtering
Remove unnecessary calculations and sheets.

Q4. What is a macro in Excel?

A macro is a set of actions that automates a task in Excel by recording and playing back the steps taken to complete that task. Once the steps are recorded, they are stored as a macro, which can be edited and played back as many times as the user wants.

Macros are great for repetitive tasks and also reduce errors. For example, suppose an account manager has to share reports about company employees with unpaid dues. That report can be automated using a macro, with minor changes made every month as needed.

Data Analytics

Which of the following is not a constraint in SQL?

Anonymous Quiz

Data Analytics

What is used to access parts of sequences like lists, tuples, and strings in Python?

Anonymous Quiz

Data Analytics

1. What data sources can Power BI connect to?

Ans: The list of data sources for Power BI is extensive, but it can be grouped into the following:
Files: Data can be imported from Excel (.xlsx, .xlsm), Power BI Desktop files (.pbix), and comma-separated values files (.csv).
Content Packs: A content pack is a collection of related documents or files that are stored as a group. In Power BI, there are two types of content packs: those from service providers like Google Analytics, Marketo, or Salesforce, and those created and shared by other users in your organization.
Connectors to databases and other datasets, such as Azure SQL Database and SQL Server Analysis Services tabular data.

2. What are the different integrity rules present in the DBMS?

The different integrity rules present in DBMS are as follows:
Entity Integrity: This rule states that the value of the primary key can never be NULL. So, every tuple must have a value in the column(s) identified as the primary key.
Referential Integrity: This rule states that the value of a foreign key must either be NULL or match a primary key value in the relation it references.

3. What are some common clauses used with SELECT query in SQL?

Some common SQL clauses used in conjunction with a SELECT query are as follows:
WHERE clause in SQL is used to filter records that are necessary, based on specific conditions.
ORDER BY clause in SQL is used to sort the records based on some field(s) in ascending (ASC) or descending order (DESC).
GROUP BY clause in SQL is used to group records with identical data and can be used in conjunction with some aggregation functions to produce summarized results from the database.
HAVING clause in SQL is used to filter records in combination with the GROUP BY clause. It is different from WHERE, since the WHERE clause cannot filter aggregated records.
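The four clauses can be combined in one query. The sketch below runs it through Python's built-in sqlite3 module. The `sales` table, its rows, and the thresholds are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 100), ("north", 300), ("south", 50),
     ("south", 40), ("east", 500), ("east", 0)],
)

rows = conn.execute(
    """
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount > 0            -- filter individual rows first
    GROUP BY region             -- then group the remaining rows
    HAVING SUM(amount) > 100    -- then filter the groups
    ORDER BY total DESC         -- finally sort the result
    """
).fetchall()

print(rows)  # [('east', 500), ('north', 400)]
```

The south region totals 90, so HAVING removes it even though each of its rows passed the WHERE filter.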

4. What is the difference between COUNT, COUNTA, and COUNTBLANK in Excel?

The COUNT function is very often used in Excel. Here, let's look at the difference between COUNT and its variants, COUNTA and COUNTBLANK.

1. COUNT
It counts the number of cells that contain numeric values only. Cells that have string values, special characters, and blank cells will not be counted.

2. COUNTA
It counts the number of cells that contain any form of content. Cells that have string values, special characters, and numeric values will be counted. However, a blank cell will not be counted.

3. COUNTBLANK
As the name suggests, it counts the number of blank cells only. Cells that have content will not be taken into consideration.
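The three rules can be mimicked in plain Python to see how they differ on the same range. This is only an emulation, not Excel itself: the list `cells` stands for a hypothetical range, and `None` is assumed to represent a truly empty cell.

```python
# Hypothetical range: numbers, text, a special character, and two empty cells.
cells = [10, 2.5, "text", "#", None, None, 7]

def is_number(cell):
    # bool is excluded because Python treats True/False as integers.
    return isinstance(cell, (int, float)) and not isinstance(cell, bool)

count = sum(1 for c in cells if is_number(c))        # like COUNT
counta = sum(1 for c in cells if c is not None)      # like COUNTA
countblank = sum(1 for c in cells if c is None)      # like COUNTBLANK

print(count, counta, countblank)  # 3 5 2
```

For a range like this one, COUNTA and COUNTBLANK together account for every cell.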
