20 Python Libraries You Aren't Using (But Should).pdf
4.1 MB
20 Python Libraries You
Arenβt Using (But Should)
Caleb Hattingh, 2016
Arenβt Using (But Should)
Caleb Hattingh, 2016
Advice from 25 Amazing Data Scientist.pdf
2.8 MB
Resource Pdf :- Advice from 25 Amazing Data Scientists.
Source :- Jake Klamka
Source :- Jake Klamka
π5β€2
Python for Data Analysts - Quick Summary (1).pdf
64.4 KB
π4β€2π2
Data Scientist Roadmap
|
|-- 1. Basic Foundations
| |-- a. Mathematics
| | |-- i. Linear Algebra
| | |-- ii. Calculus
| | |-- iii. Probability
| |
| | |
| |
| |
|
|
|-- 2. Data Exploration and Preprocessing
| |-- a. Exploratory Data Analysis (EDA)
| |-- b. Feature Engineering
| |-- c. Data Cleaning
| |-- d. Handling Missing Data
|
| | |
| |
| |
| |-- b. Unsupervised Learning
| | |-- i. Clustering
| | | |-- 1. K-means
| | | |-- 2. DBSCAN
| | |
| | |-- 1. Principal Component Analysis (PCA)
| | |-- 2. t-Distributed Stochastic Neighbor Embedding (t-SNE)
| |
| |
|
|
|-- 4. Deep Learning
| |-- a. Neural Networks
| | |-- i. Perceptron
| |
| |
| |-- c. Recurrent Neural Networks (RNNs)
| | |-- i. Sequence-to-Sequence Models
| | |-- ii. Text Classification
| |
| |
|
|
|-- 5. Big Data Technologies
| |-- a. Hadoop
| | |-- i. HDFS
| |
| |
|
|
|-- 6. Data Visualization and Reporting
| |-- a. Dashboarding Tools
| | |-- i. Tableau
| | |-- ii. Power BI
| | |-- iii. Dash (Python)
| |
|
|-- 7. Domain Knowledge and Soft Skills
| |-- a. Industry-specific Knowledge
| |-- b. Problem-solving
| |-- c. Communication Skills
| |-- d. Time Management
|
|-- a. Online Courses
|-- b. Books and Research Papers
|-- c. Blogs and Podcasts
|-- d. Conferences and Workshops
`-- e. Networking and Community Engagement
|
|-- 1. Basic Foundations
| |-- a. Mathematics
| | |-- i. Linear Algebra
| | |-- ii. Calculus
| | |-- iii. Probability
| |
-- iv. Statistics
| |
| |-- b. Programming
| | |-- i. Python
| | | |-- 1. Syntax and Basic Concepts
| | | |-- 2. Data Structures
| | | |-- 3. Control Structures
| | | |-- 4. Functions
| | | -- 5. Object-Oriented Programming| | |
| |
-- ii. R (optional, based on preference)
| |
| |-- c. Data Manipulation
| | |-- i. Numpy (Python)
| | |-- ii. Pandas (Python)
| | -- iii. Dplyr (R)| |
|
-- d. Data Visualization
| |-- i. Matplotlib (Python)
| |-- ii. Seaborn (Python)
| -- iii. ggplot2 (R)|
|-- 2. Data Exploration and Preprocessing
| |-- a. Exploratory Data Analysis (EDA)
| |-- b. Feature Engineering
| |-- c. Data Cleaning
| |-- d. Handling Missing Data
|
-- e. Data Scaling and Normalization
|
|-- 3. Machine Learning
| |-- a. Supervised Learning
| | |-- i. Regression
| | | |-- 1. Linear Regression
| | | -- 2. Polynomial Regression| | |
| |
-- ii. Classification
| | |-- 1. Logistic Regression
| | |-- 2. k-Nearest Neighbors
| | |-- 3. Support Vector Machines
| | |-- 4. Decision Trees
| | -- 5. Random Forest| |
| |-- b. Unsupervised Learning
| | |-- i. Clustering
| | | |-- 1. K-means
| | | |-- 2. DBSCAN
| | |
-- 3. Hierarchical Clustering
| | |
| | -- ii. Dimensionality Reduction| | |-- 1. Principal Component Analysis (PCA)
| | |-- 2. t-Distributed Stochastic Neighbor Embedding (t-SNE)
| |
-- 3. Linear Discriminant Analysis (LDA)
| |
| |-- c. Reinforcement Learning
| |-- d. Model Evaluation and Validation
| | |-- i. Cross-validation
| | |-- ii. Hyperparameter Tuning
| | -- iii. Model Selection| |
|
-- e. ML Libraries and Frameworks
| |-- i. Scikit-learn (Python)
| |-- ii. TensorFlow (Python)
| |-- iii. Keras (Python)
| -- iv. PyTorch (Python)|
|-- 4. Deep Learning
| |-- a. Neural Networks
| | |-- i. Perceptron
| |
-- ii. Multi-Layer Perceptron
| |
| |-- b. Convolutional Neural Networks (CNNs)
| | |-- i. Image Classification
| | |-- ii. Object Detection
| | -- iii. Image Segmentation| |
| |-- c. Recurrent Neural Networks (RNNs)
| | |-- i. Sequence-to-Sequence Models
| | |-- ii. Text Classification
| |
-- iii. Sentiment Analysis
| |
| |-- d. Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU)
| | |-- i. Time Series Forecasting
| | -- ii. Language Modeling| |
|
-- e. Generative Adversarial Networks (GANs)
| |-- i. Image Synthesis
| |-- ii. Style Transfer
| -- iii. Data Augmentation|
|-- 5. Big Data Technologies
| |-- a. Hadoop
| | |-- i. HDFS
| |
-- ii. MapReduce
| |
| |-- b. Spark
| | |-- i. RDDs
| | |-- ii. DataFrames
| | -- iii. MLlib| |
|
-- c. NoSQL Databases
| |-- i. MongoDB
| |-- ii. Cassandra
| |-- iii. HBase
| -- iv. Couchbase|
|-- 6. Data Visualization and Reporting
| |-- a. Dashboarding Tools
| | |-- i. Tableau
| | |-- ii. Power BI
| | |-- iii. Dash (Python)
| |
-- iv. Shiny (R)
| |
| |-- b. Storytelling with Data
| -- c. Effective Communication|
|-- 7. Domain Knowledge and Soft Skills
| |-- a. Industry-specific Knowledge
| |-- b. Problem-solving
| |-- c. Communication Skills
| |-- d. Time Management
|
-- e. Teamwork
|
-- 8. Staying Updated and Continuous Learning|-- a. Online Courses
|-- b. Books and Research Papers
|-- c. Blogs and Podcasts
|-- d. Conferences and Workshops
`-- e. Networking and Community Engagement
π35π₯°2
Python Science Projects.pdf_20231120_013618_0000.pdf
2.1 MB
Python Data Science Projects For Boosting Your Portfolio
π10
Forwarded from Data Science & Machine Learning
Important Topics to become a data scientist
[Advanced Level]
ππ
1. Mathematics
Linear Algebra
Analytic Geometry
Matrix
Vector Calculus
Optimization
Regression
Dimensionality Reduction
Density Estimation
Classification
2. Probability
Introduction to Probability
1D Random Variable
The function of One Random Variable
Joint Probability Distribution
Discrete Distribution
Normal Distribution
3. Statistics
Introduction to Statistics
Data Description
Random Samples
Sampling Distribution
Parameter Estimation
Hypotheses Testing
Regression
4. Programming
Python:
Python Basics
List
Set
Tuples
Dictionary
Function
NumPy
Pandas
Matplotlib/Seaborn
R Programming:
R Basics
Vector
List
Data Frame
Matrix
Array
Function
dplyr
ggplot2
Tidyr
Shiny
DataBase:
SQL
MongoDB
Data Structures
Web scraping
Linux
Git
5. Machine Learning
How Model Works
Basic Data Exploration
First ML Model
Model Validation
Underfitting & Overfitting
Random Forest
Handling Missing Values
Handling Categorical Variables
Pipelines
Cross-Validation(R)
XGBoost(Python|R)
Data Leakage
6. Deep Learning
Artificial Neural Network
Convolutional Neural Network
Recurrent Neural Network
TensorFlow
Keras
PyTorch
A Single Neuron
Deep Neural Network
Stochastic Gradient Descent
Overfitting and Underfitting
Dropout Batch Normalization
Binary Classification
7. Feature Engineering
Baseline Model
Categorical Encodings
Feature Generation
Feature Selection
8. Natural Language Processing
Text Classification
Word Vectors
9. Data Visualization Tools
BI (Business Intelligence):
Tableau
Power BI
Qlik View
Qlik Sense
10. Deployment
Microsoft Azure
Heroku
Google Cloud Platform
Flask
Django
Join @datasciencefun to learning important data science and machine learning concepts
ENJOY LEARNING ππ
[Advanced Level]
ππ
1. Mathematics
Linear Algebra
Analytic Geometry
Matrix
Vector Calculus
Optimization
Regression
Dimensionality Reduction
Density Estimation
Classification
2. Probability
Introduction to Probability
1D Random Variable
The function of One Random Variable
Joint Probability Distribution
Discrete Distribution
Normal Distribution
3. Statistics
Introduction to Statistics
Data Description
Random Samples
Sampling Distribution
Parameter Estimation
Hypotheses Testing
Regression
4. Programming
Python:
Python Basics
List
Set
Tuples
Dictionary
Function
NumPy
Pandas
Matplotlib/Seaborn
R Programming:
R Basics
Vector
List
Data Frame
Matrix
Array
Function
dplyr
ggplot2
Tidyr
Shiny
DataBase:
SQL
MongoDB
Data Structures
Web scraping
Linux
Git
5. Machine Learning
How Model Works
Basic Data Exploration
First ML Model
Model Validation
Underfitting & Overfitting
Random Forest
Handling Missing Values
Handling Categorical Variables
Pipelines
Cross-Validation(R)
XGBoost(Python|R)
Data Leakage
6. Deep Learning
Artificial Neural Network
Convolutional Neural Network
Recurrent Neural Network
TensorFlow
Keras
PyTorch
A Single Neuron
Deep Neural Network
Stochastic Gradient Descent
Overfitting and Underfitting
Dropout Batch Normalization
Binary Classification
7. Feature Engineering
Baseline Model
Categorical Encodings
Feature Generation
Feature Selection
8. Natural Language Processing
Text Classification
Word Vectors
9. Data Visualization Tools
BI (Business Intelligence):
Tableau
Power BI
Qlik View
Qlik Sense
10. Deployment
Microsoft Azure
Heroku
Google Cloud Platform
Flask
Django
Join @datasciencefun to learning important data science and machine learning concepts
ENJOY LEARNING ππ
π27β€4
Modern Time Series Forecasting with Python.pdf
25.5 MB
Modern Time Series Forecasting with Python
Manu Joseph, 2022
Manu Joseph, 2022
Rlecturenotes.pdf
4.3 MB
An Introduction to R
Petra Kuhnert, 2007
Petra Kuhnert, 2007
π5β€2