Machine Learning & Artificial Intelligence | Data Science Free Courses
Perfect channel to learn Data Analytics, Data Science, Machine Learning & Artificial Intelligence

Admin: @coderfun
How do you handle null, 0, and blank values in your data during the cleaning process?

Interview questions are sometimes based on this topic. Many data aspirants, and even some professionals, make the mistake of simply deleting missing values or filling them without proper analysis. This can damage the integrity of the analysis. It's essential to ask or find out the reason behind missing values in the data, whether from the project head, the client, or through your own investigation.

Answer:

Handling null, 0, and blank values is crucial for ensuring the accuracy and reliability of data analysis. Here's how to approach it:

1. Identifying and Understanding the Context:
   - Null Values: These represent missing or undefined data. Identify them using functions like 'ISBLANK' in Excel or null filters in Power Query.
   - 0 Values: These can be legitimate data points but may also indicate missing data in some contexts. Understanding the context is important.
   - Blank Values: These can be spaces or empty strings. Identify them using 'LEN', 'TRIM', or filters.

2. Handling These Values Using Proper Techniques:
   - Null Values: Decide whether to impute, remove, or leave them, based on the dataset's context and the analysis requirements. Common imputation methods include using the mean, the median, or a placeholder.
   - 0 Values: If 0s are valid data, leave them as is. If they indicate missing data, treat them like null values.
   - Blank Values: Convert blanks to nulls or handle them as needed. This typically involves 'IF' statements or Power Query transformations.

3. Using Excel and Power Query:
   - Excel: Use formulas like 'IFERROR', 'IF', and 'VLOOKUP' to handle these values.
   - Power Query: Use transformations to filter, replace, or fill null and blank values. Steps like 'Fill Down', 'Replace Values', and custom columns help automate the process.

By carefully considering the context and using appropriate methods, the data cleaning process maintains the integrity and quality of the data.
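
The same approach can be sketched outside Excel too. Below is a minimal Python sketch using only the standard library. The sample rows, the column names, and the rule that an age of 0 means "not recorded" are all assumptions made up for illustration; in a real project that rule should come from whoever owns the data.

```python
from statistics import median

# Hypothetical sample rows; names and values are illustrative only.
rows = [
    {"customer": "Asha", "city": "Pune", "age": 34, "discount": 0},
    {"customer": "Ben", "city": "   ", "age": None, "discount": 5},
    {"customer": "Chen", "city": "", "age": 0, "discount": None},
    {"customer": "Dara", "city": "Oslo", "age": 28, "discount": 0},
]

def is_blank(value):
    """True for empty or whitespace-only strings (similar to a LEN/TRIM check)."""
    return isinstance(value, str) and value.strip() == ""

def blanks_to_none(row):
    """Convert blank strings to None so blanks and nulls are handled alike."""
    return {k: (None if is_blank(v) else v) for k, v in row.items()}

def zeros_to_none(row, columns):
    """Treat 0 as missing only in the columns where 0 cannot be a real value."""
    return {k: (None if k in columns and v == 0 else v) for k, v in row.items()}

def impute_median(data, column):
    """Fill None in one numeric column with the median of the observed values."""
    observed = [r[column] for r in data if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in data]

# Step 1: normalise blanks to nulls.
cleaned = [blanks_to_none(r) for r in rows]
# Step 2 (assumption): age 0 means "not recorded", while discount 0 is valid.
cleaned = [zeros_to_none(r, {"age"}) for r in cleaned]
# Step 3: count what is missing before deciding how to handle it.
missing_counts = {c: sum(r[c] is None for r in cleaned) for c in cleaned[0]}
# Step 4: impute only the column where a median makes sense.
cleaned = impute_median(cleaned, "age")

print(missing_counts)
print(cleaned)
```

Note that the missing discount is left as None on purpose: nothing here says why it is missing, so the sketch does not guess a value for it.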

Hope it helps :)
Will LLMs always hallucinate?

As large language models (LLMs) become more powerful and pervasive, it's crucial that we understand their limitations.

A new paper argues that hallucinations - where the model generates false or nonsensical information - are not just occasional mistakes, but an inherent property of these systems.

While the idea of hallucinations as features isn't new, the researchers' explanation is.

They draw on computational theory and Gödel's incompleteness theorems to show that hallucinations are baked into the very structure of LLMs.

In essence, they argue that the process of training and using these models involves undecidable problems - meaning there will always be some inputs that cause the model to go off the rails.

This would have big implications. It suggests that no amount of architectural tweaks, data cleaning, or fact-checking can fully eliminate hallucinations.

So what does this mean in practice? For one, it highlights the importance of using LLMs carefully, with an understanding of their limitations.

It also suggests that research into making models more robust and understanding their failure modes is crucial.

No matter how impressive the results, LLMs are not oracles - they're tools with inherent flaws and biases.

LLM & Generative AI Resources: https://whatsapp.com/channel/0029VaoePz73bbV94yTh6V2E
A high-level overview for becoming a machine learning engineer
Machine learning algorithms
Data Scientist vs Data Analyst 👆
Preparing for a machine learning interview as a data analyst is a great step.

Here are some common machine learning interview questions:

1. Explain the steps involved in a machine learning project lifecycle.

2. What is the difference between supervised and unsupervised learning? Give examples of each.

3. What evaluation metrics would you use to assess the performance of a regression model?

4. What is overfitting and how can you prevent it?

5. Describe the bias-variance tradeoff.

6. What is cross-validation, and why is it important in machine learning?

7. What are some feature selection techniques you are familiar with?

8. What are the assumptions of linear regression?

9. How does regularization help in linear models?

10. Explain the difference between classification and regression.

11. What are some common algorithms used for dimensionality reduction?

12. Describe how a decision tree works.

13. What are ensemble methods, and why are they useful?

14. How do you handle missing or corrupted data in a dataset?

15. What are the different kernels used in Support Vector Machines (SVM)?
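
As a starting point for question 3, here is a minimal Python sketch of three commonly used regression metrics: MAE, RMSE, and R². The function name and the sample numbers are made up for illustration, and this is only one possible set of metrics, not a complete answer.

```python
from math import sqrt

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R^2 for two equal-length lists of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n           # mean absolute error
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    rmse = sqrt(ss_res / n)                         # root mean squared error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when every true value is identical (ss_tot == 0).
    r2 = 1 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}

# Hypothetical true values and predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

MAE and RMSE are in the same units as the target, with RMSE penalising large errors more heavily, while R² compares the model against always predicting the mean.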


These questions cover a range of fundamental concepts and techniques in machine learning that are important for a data scientist role.
Good luck with your interview preparation!


Best Data Science & Machine Learning Resources: https://topmate.io/coding/914624

Like if you need similar content 😄👍
Free Session to learn Data Analytics, Data Science & AI
👇👇
https://tracking.acciojob.com/g/PUfdDxgHR

Register fast; open only to the first few users.
⌨️ Python Tips & Tricks
🔗 Become a Machine Learning Expert in 7 Steps
Data Science Roadmap