Data Science & Machine Learning
Join this channel to learn data science, artificial intelligence and machine learning for free, with fun quizzes, interesting projects and useful resources.

For collaborations: @love_data
❀️ Learning Path for ML
Data Analysis with Python from Scratch
👇👇
https://t.iss.one/sqlspecialist/26
Which of the following is not a supervised algorithm?
Anonymous Quiz
11%
Linear Regression
9%
Logistic Regression
64%
Clustering
16%
Decision Tree
👍3
Which of the following tools can be used for data visualization?
Anonymous Quiz
9%
Tableau
11%
Matplotlib
7%
Power BI
74%
All of the above
Which of the following cannot give 10 as an answer?
Anonymous Quiz
8%
5*2
7%
2+5*2-2
69%
2+5*(2-2)
16%
3*2+9//2
👍2
Which of the following cannot give 10 as an answer?
Well done guys!!

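Explanation for those who marked the wrong answer:
Read the question again: it asks which expression cannot give 10.
9//2 is floor division, so it evaluates to 4 and not 4.5. That makes 3*2+9//2 equal to 6+4 = 10. The only option that does not give 10 is 2+5*(2-2), which evaluates to 2.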
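For anyone who wants to check the quiz options directly, here is a small Python sketch that evaluates each one. The dictionary and variable names are only illustrative; the expressions are copied from the quiz.

```python
# Each quiz option, evaluated with Python's normal operator precedence:
# parentheses first, then * and //, then + and -.
options = {
    "5*2": 5 * 2,
    "2+5*2-2": 2 + 5 * 2 - 2,
    "2+5*(2-2)": 2 + 5 * (2 - 2),
    "3*2+9//2": 3 * 2 + 9 // 2,
}

for expression, value in options.items():
    print(f"{expression} = {value}")

# Floor division (//) discards the fractional part; true division (/) keeps it.
floor_result = 9 // 2
true_result = 9 / 2

# Collect the options that do NOT evaluate to 10.
not_ten = [expression for expression, value in options.items() if value != 10]
print("Does not give 10:", not_ten)
```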
Mathematics for Machine Learning

Published by Cambridge University Press (April 2020)

https://mml-book.com

PDF: https://mml-book.github.io/book/mml-book.pdf
👍5
Neural Networks and Learning Machines Third Edition
👇👇
https://cours.etsmtl.ca/sys843/REFS/Books/ebook_Haykin09.pdf
👍3
Which of the following is not an unsupervised algorithm?
Anonymous Quiz
13%
K-means clustering
14%
Hierarchical Clustering
21%
Anomaly detection
52%
Logistic Regression
How can a fresher get a job as a data scientist?

The Indian job market is highly resistant to hiring freshers as data scientists. Almost every employer asks for at least 2 years of experience, which raises the question: where is that experience supposed to come from?

The important thing here is to build a portfolio. As a fresher, you have probably learnt data science through online courses. Those teach only the basics; the analytical skills needed to clean data and apply machine learning algorithms to it come only from practice.

Do some real-world data science projects and participate in Kaggle competitions. Kaggle provides datasets for practice as well. Whatever projects you do, create a GitHub repository for them. Place all your projects there so that a recruiter looking at your profile can see that you have hands-on practice and know the basics. This will take you a long way.

All the major data science jobs for freshers will only be available through off-campus interviews.

Some companies that hire data scientists are:

Siemens

Accenture

IBM

Cerner

Creating a technical portfolio showcases the knowledge you have already gained, and that is essential when you go out there as a fresher and try to find a data scientist job.
👍4
7 Steps of the Machine Learning Process

Data Collection:
The process of gathering raw datasets for the machine learning task. This data can come from a variety of places, ranging from open-source online resources to paid crowdsourcing. This first step is arguably the most important: if the data you collect is of poor quality or irrelevant, the model you train on it will be poor as well.

Data Processing and Preparation:
Once you’ve gathered the relevant data, you need to process it and make sure it is in a usable format for training a machine learning model. This includes handling missing values and dealing with outliers.
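As a rough illustration of this step, here is a minimal sketch using only the Python standard library. It assumes missing values are marked as None, fills them with the median, and clips outliers to Tukey's fences (1.5 times the interquartile range). The helper names and the sample readings are hypothetical, and other imputation or outlier rules may suit your data better.

```python
from statistics import median, quantiles

def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def clip_outliers(values, k=1.5):
    """Clip values to Tukey's fences: [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]

# Hypothetical temperature readings with two gaps and one obvious outlier.
raw = [21.0, 22.5, None, 23.0, 250.0, 22.0, None, 21.5]
cleaned = clip_outliers(impute_missing(raw))
print(cleaned)
```

Clipping keeps the row but limits its influence; dropping the row entirely is another common choice.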

Feature Engineering:
Once you’ve collected and processed your dataset, you will likely need to transform some of the features (and sometimes drop some of them) so that a model can be trained on the data more effectively.
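A small sketch of what such transformations can look like, again in plain Python. It standardizes a numeric column, one-hot encodes a categorical column, and drops an identifier column that carries no signal. The rows, column names, and helper functions are made up for illustration.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def one_hot(values):
    """Turn a categorical column into 0/1 indicator columns."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories

# Hypothetical housing rows; "id" is dropped because it is not predictive.
rows = [
    {"id": 101, "area_sqft": 800.0, "city": "Pune"},
    {"id": 102, "area_sqft": 1200.0, "city": "Delhi"},
    {"id": 103, "area_sqft": 1000.0, "city": "Pune"},
    {"id": 104, "area_sqft": 1400.0, "city": "Mumbai"},
]

scaled_area = standardize([r["area_sqft"] for r in rows])
city_columns, city_names = one_hot([r["city"] for r in rows])

# Final feature matrix: one scaled numeric column plus the indicator columns.
features = [[area] + cities for area, cities in zip(scaled_area, city_columns)]
print(city_names)
print(features)
```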

Model Selection:
Based on the dataset, you will choose which model architecture to use. This is one of the main tasks of industry engineers. Rather than trying to invent a completely novel architecture, you can handle most tasks well with an existing architecture (or a combination of existing architectures).

Model Training and Data Pipeline:
After selecting the model architecture, you will create a data pipeline for training the model. This means creating a continuous stream of batched data observations to efficiently train the model. Since training can take a long time, you want your data pipeline to be as efficient as possible.
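To make the idea of a batched stream concrete, here is a minimal generator-based sketch. The function name, batch size, and seed are assumptions for illustration; real frameworks ship their own data loaders, which also handle things like prefetching and parallel reads.

```python
import random

def batch_stream(examples, batch_size, epochs, seed=0):
    """Yield shuffled mini-batches, reshuffling at the start of every epoch."""
    rng = random.Random(seed)
    indices = list(range(len(examples)))
    for _ in range(epochs):
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            # The last batch of an epoch may be smaller than batch_size.
            yield [examples[i] for i in indices[start:start + batch_size]]

# Hypothetical dataset of 10 observations, streamed for 2 epochs.
examples = list(range(10))
batches = list(batch_stream(examples, batch_size=4, epochs=2))

for batch in batches:
    print(batch)
```

Because it is a generator, batches are produced lazily, so the training loop never needs the whole shuffled dataset materialized at once.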

Model Validation:
After training the model for a sufficient amount of time, you will need to validate the model’s performance on a held-out portion of the overall dataset. This data needs to come from the same underlying distribution as the training dataset, but it must consist of examples the model has not seen before.
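A minimal sketch of hold-out validation, assuming a toy dataset and a deliberately simple threshold "model". All names, the 80/20 split, and the seed are illustrative choices, not a recommendation.

```python
import random
from statistics import mean

def train_validation_split(examples, validation_fraction=0.2, seed=42):
    """Shuffle once, then hold out a fraction the model never trains on."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]

def fit_threshold(train):
    """Toy model: threshold halfway between the two class means."""
    negatives = [x for x, label in train if label == 0]
    positives = [x for x, label in train if label == 1]
    return (mean(negatives) + mean(positives)) / 2

def accuracy(threshold, data):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(1 for x, label in data if int(x >= threshold) == label)
    return correct / len(data)

# Hypothetical labeled data: label is 1 when x is at least 50.
dataset = [(x, int(x >= 50)) for x in range(100)]
train, validation = train_validation_split(dataset)

threshold = fit_threshold(train)             # uses training data only
val_accuracy = accuracy(threshold, validation)  # measured on unseen data
print(f"threshold={threshold:.2f}, validation accuracy={val_accuracy:.2f}")
```

Both splits come from the same shuffled pool, so they share a distribution, while the validation examples stay out of the fitting step.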

Model Persistence:
Finally, after training and validating the model’s performance, you need to be able to save the model weights properly and possibly push the model to production. This means setting up a process through which new users can easily use your pre-trained model to make predictions.
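As a sketch of the save-and-reload cycle, the example below stores the weights of a hypothetical linear model as JSON and loads them back to make a prediction. The file name, weight layout, and helper functions are assumptions; real frameworks provide their own checkpoint formats.

```python
import json
import tempfile
from pathlib import Path

def save_weights(weights, path):
    """Write the model weights to disk as JSON."""
    Path(path).write_text(json.dumps(weights))

def load_weights(path):
    """Read model weights back from a JSON file."""
    return json.loads(Path(path).read_text())

def predict(weights, features):
    """Linear model: bias plus the weighted sum of the features."""
    weighted = sum(w * x for w, x in zip(weights["coefficients"], features))
    return weights["bias"] + weighted

# Hypothetical weights, standing in for the result of a training run.
weights = {"coefficients": [0.5, -1.25], "bias": 2.0}

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.json"
    save_weights(weights, model_path)
    restored = load_weights(model_path)

# A new user only needs the saved file and predict() to use the model.
prediction = predict(restored, [2.0, 1.0])
print(prediction)
```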
5_6339144778529113396.pdf
11.1 MB
Machine learning notes in 15 pages