Some important questions to crack data science interview
Q. Describe how Gradient Boosting works.
A. Gradient boosting is a type of machine learning boosting. It relies on the intuition that the best possible next model, when combined with previous models, minimizes the overall prediction error. If a small change in the prediction for a case causes no change in error, then next target outcome of the case is zero. Gradient boosting produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees.
Q. Describe the decision tree model.
A. Decision Trees are a type of Supervised Machine Learning where the data is continuously split according to a certain parameter. The leaves are the decisions or the final outcomes. A decision tree is a machine learning algorithm that partitions the data into subsets.
Q. What is a neural network?
A. Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed to recognize patterns. They interpret sensory data through a kind of machine perception, labeling or clustering raw input. They, also known as Artificial Neural Networks, are the subset of Deep Learning.
Q. Explain the Bias-Variance Tradeoff
A. The biasโvariance tradeoff is the property of a model that the variance of the parameter estimated across samples can be reduced by increasing the bias in the estimated parameters.
Q. Whatโs the difference between L1 and L2 regularization?
A. The main intuitive difference between the L1 and L2 regularization is that L1 regularization tries to estimate the median of the data while the L2 regularization tries to estimate the mean of the data to avoid overfitting. That value will also be the median of the data distribution mathematically.
ENJOY LEARNING ๐๐
Q. Describe how Gradient Boosting works.
A. Gradient boosting is a type of machine learning boosting. It relies on the intuition that the best possible next model, when combined with previous models, minimizes the overall prediction error. If a small change in the prediction for a case causes no change in error, then next target outcome of the case is zero. Gradient boosting produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees.
Q. Describe the decision tree model.
A. Decision Trees are a type of Supervised Machine Learning where the data is continuously split according to a certain parameter. The leaves are the decisions or the final outcomes. A decision tree is a machine learning algorithm that partitions the data into subsets.
Q. What is a neural network?
A. Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed to recognize patterns. They interpret sensory data through a kind of machine perception, labeling or clustering raw input. They, also known as Artificial Neural Networks, are the subset of Deep Learning.
Q. Explain the Bias-Variance Tradeoff
A. The biasโvariance tradeoff is the property of a model that the variance of the parameter estimated across samples can be reduced by increasing the bias in the estimated parameters.
Q. Whatโs the difference between L1 and L2 regularization?
A. The main intuitive difference between the L1 and L2 regularization is that L1 regularization tries to estimate the median of the data while the L2 regularization tries to estimate the mean of the data to avoid overfitting. That value will also be the median of the data distribution mathematically.
ENJOY LEARNING ๐๐
โค3
Some important questions to crack data science interview Part-2
๐1. ๐ฉ-๐ฏ๐๐ฅ๐ฎ๐?
๐ns. p-value is a measure of the probability that an observed difference could have occurred just by random chance. The lower the p-value, the greater the statistical significance of the observed difference. P-value can be used as an alternative to or in addition to pre-selected confidence levels for hypothesis testing.
๐2. ๐๐ง๐ญ๐๐ซ๐ฉ๐จ๐ฅ๐๐ญ๐ข๐จ๐ง ๐๐ง๐ ๐๐ฑ๐ญ๐ซ๐๐ฉ๐จ๐ฅ๐๐ญ๐ข๐จ๐ง?
๐ns. Interpolation is the process of calculating the unknown value from known given values whereas extrapolation is the process of calculating unknown values beyond the given data points.
๐3. ๐๐ง๐ข๐๐จ๐ซ๐ฆ๐๐ ๐๐ข๐ฌ๐ญ๐ซ๐ข๐๐ฎ๐ญ๐ข๐จ๐ง & ๐ง๐จ๐ซ๐ฆ๐๐ฅ ๐๐ข๐ฌ๐ญ๐ซ๐ข๐๐ฎ๐ญ๐ข๐จ๐ง?
๐ns. The normal distribution is bell-shaped, which means value near the center of the distribution are more likely to occur as opposed to values on the tails of the distribution. The uniform distribution is rectangular-shaped, which means every value in the distribution is equally likely to occur.
๐4. ๐๐๐๐จ๐ฆ๐ฆ๐๐ง๐๐๐ซ ๐๐ฒ๐ฌ๐ญ๐๐ฆ๐ฌ?
๐ns. The recommender system mainly deals with the likes and dislikes of the users. Its major objective is to recommend an item to a user which has a high chance of liking or is in need of a particular user based on his previous purchases. It is like having a personalized team who can understand our likes and dislikes and help us in making the decisions regarding a particular item without being biased by any means by making use of a large amount of data in the repositories which are generated day by day.
๐5. ๐๐๐๐ ๐๐ฎ๐ง๐๐ญ๐ข๐จ๐ง ๐ข๐ง ๐๐๐
๐ns. The SQL Joins clause is used to combine records from two or more tables in a database.
๐6. ๐๐ช๐ฎ๐๐ซ๐๐ ๐๐ซ๐ซ๐จ๐ซ ๐๐ง๐ ๐๐๐ฌ๐จ๐ฅ๐ฎ๐ญ๐ ๐๐ซ๐ซ๐จ๐ซ?
๐ns. mean squared error (MSE), and mean absolute error (MAE) are used to evaluate the regression problem's accuracy. The squared error is everywhere differentiable, while the absolute error is not (its derivative is undefined at 0). This makes the squared error more amenable to the techniques of mathematical optimization.
ENJOY LEARNING ๐๐
๐1. ๐ฉ-๐ฏ๐๐ฅ๐ฎ๐?
๐ns. p-value is a measure of the probability that an observed difference could have occurred just by random chance. The lower the p-value, the greater the statistical significance of the observed difference. P-value can be used as an alternative to or in addition to pre-selected confidence levels for hypothesis testing.
๐2. ๐๐ง๐ญ๐๐ซ๐ฉ๐จ๐ฅ๐๐ญ๐ข๐จ๐ง ๐๐ง๐ ๐๐ฑ๐ญ๐ซ๐๐ฉ๐จ๐ฅ๐๐ญ๐ข๐จ๐ง?
๐ns. Interpolation is the process of calculating the unknown value from known given values whereas extrapolation is the process of calculating unknown values beyond the given data points.
๐3. ๐๐ง๐ข๐๐จ๐ซ๐ฆ๐๐ ๐๐ข๐ฌ๐ญ๐ซ๐ข๐๐ฎ๐ญ๐ข๐จ๐ง & ๐ง๐จ๐ซ๐ฆ๐๐ฅ ๐๐ข๐ฌ๐ญ๐ซ๐ข๐๐ฎ๐ญ๐ข๐จ๐ง?
๐ns. The normal distribution is bell-shaped, which means value near the center of the distribution are more likely to occur as opposed to values on the tails of the distribution. The uniform distribution is rectangular-shaped, which means every value in the distribution is equally likely to occur.
๐4. ๐๐๐๐จ๐ฆ๐ฆ๐๐ง๐๐๐ซ ๐๐ฒ๐ฌ๐ญ๐๐ฆ๐ฌ?
๐ns. The recommender system mainly deals with the likes and dislikes of the users. Its major objective is to recommend an item to a user which has a high chance of liking or is in need of a particular user based on his previous purchases. It is like having a personalized team who can understand our likes and dislikes and help us in making the decisions regarding a particular item without being biased by any means by making use of a large amount of data in the repositories which are generated day by day.
๐5. ๐๐๐๐ ๐๐ฎ๐ง๐๐ญ๐ข๐จ๐ง ๐ข๐ง ๐๐๐
๐ns. The SQL Joins clause is used to combine records from two or more tables in a database.
๐6. ๐๐ช๐ฎ๐๐ซ๐๐ ๐๐ซ๐ซ๐จ๐ซ ๐๐ง๐ ๐๐๐ฌ๐จ๐ฅ๐ฎ๐ญ๐ ๐๐ซ๐ซ๐จ๐ซ?
๐ns. mean squared error (MSE), and mean absolute error (MAE) are used to evaluate the regression problem's accuracy. The squared error is everywhere differentiable, while the absolute error is not (its derivative is undefined at 0). This makes the squared error more amenable to the techniques of mathematical optimization.
ENJOY LEARNING ๐๐
โค4
Most Asked SQL Interview Questions at MAANG Companies๐ฅ๐ฅ
Preparing for an SQL Interview at MAANG Companies? Here are some crucial SQL Questions you should be ready to tackle:
1. How do you retrieve all columns from a table?
SELECT * FROM table_name;
2. What SQL statement is used to filter records?
SELECT * FROM table_name
WHERE condition;
The WHERE clause is used to filter records based on a specified condition.
3. How can you join multiple tables? Describe different types of JOINs.
SELECT columns
FROM table1
JOIN table2 ON table1.column = table2.column
JOIN table3 ON table2.column = table3.column;
Types of JOINs:
1. INNER JOIN: Returns records with matching values in both tables
SELECT * FROM table1
INNER JOIN table2 ON table1.column = table2.column;
2. LEFT JOIN: Returns all records from the left table & matched records from the right table. Unmatched records will have NULL values.
SELECT * FROM table1
LEFT JOIN table2 ON table1.column = table2.column;
3. RIGHT JOIN: Returns all records from the right table & matched records from the left table. Unmatched records will have NULL values.
SELECT * FROM table1
RIGHT JOIN table2 ON table1.column = table2.column;
4. FULL JOIN: Returns records when there is a match in either left or right table. Unmatched records will have NULL values.
SELECT * FROM table1
FULL JOIN table2 ON table1.column = table2.column;
4. What is the difference between WHERE & HAVING clauses?
WHERE: Filters records before any groupings are made.
SELECT * FROM table_name
WHERE condition;
HAVING: Filters records after groupings are made.
SELECT column, COUNT(*)
FROM table_name
GROUP BY column
HAVING COUNT(*) > value;
5. How do you calculate average, sum, minimum & maximum values in a column?
Average: SELECT AVG(column_name) FROM table_name;
Sum: SELECT SUM(column_name) FROM table_name;
Minimum: SELECT MIN(column_name) FROM table_name;
Maximum: SELECT MAX(column_name) FROM table_name;
Here you can find essential SQL Interview Resources๐
https://t.iss.one/mysqldata
Like this post if you need more ๐โค๏ธ
Hope it helps :)
Preparing for an SQL Interview at MAANG Companies? Here are some crucial SQL Questions you should be ready to tackle:
1. How do you retrieve all columns from a table?
SELECT * FROM table_name;
2. What SQL statement is used to filter records?
SELECT * FROM table_name
WHERE condition;
The WHERE clause is used to filter records based on a specified condition.
3. How can you join multiple tables? Describe different types of JOINs.
SELECT columns
FROM table1
JOIN table2 ON table1.column = table2.column
JOIN table3 ON table2.column = table3.column;
Types of JOINs:
1. INNER JOIN: Returns records with matching values in both tables
SELECT * FROM table1
INNER JOIN table2 ON table1.column = table2.column;
2. LEFT JOIN: Returns all records from the left table & matched records from the right table. Unmatched records will have NULL values.
SELECT * FROM table1
LEFT JOIN table2 ON table1.column = table2.column;
3. RIGHT JOIN: Returns all records from the right table & matched records from the left table. Unmatched records will have NULL values.
SELECT * FROM table1
RIGHT JOIN table2 ON table1.column = table2.column;
4. FULL JOIN: Returns records when there is a match in either left or right table. Unmatched records will have NULL values.
SELECT * FROM table1
FULL JOIN table2 ON table1.column = table2.column;
4. What is the difference between WHERE & HAVING clauses?
WHERE: Filters records before any groupings are made.
SELECT * FROM table_name
WHERE condition;
HAVING: Filters records after groupings are made.
SELECT column, COUNT(*)
FROM table_name
GROUP BY column
HAVING COUNT(*) > value;
5. How do you calculate average, sum, minimum & maximum values in a column?
Average: SELECT AVG(column_name) FROM table_name;
Sum: SELECT SUM(column_name) FROM table_name;
Minimum: SELECT MIN(column_name) FROM table_name;
Maximum: SELECT MAX(column_name) FROM table_name;
Here you can find essential SQL Interview Resources๐
https://t.iss.one/mysqldata
Like this post if you need more ๐โค๏ธ
Hope it helps :)
โค4๐1