Data Science Projects
Perfect channel for Data Scientists

Learn Python, AI, R, Machine Learning, Data Science and many more

Admin: @love_data
Tech stack for Machine Learning in 2024:

- ML workflow orchestrator: Kubeflow
- experiment tracking: MLflow
- data ingestion: Airbyte
- job orchestrator: Apache Airflow
- batch pipeline: Apache Spark
- message queue for real-time streaming: Apache Kafka
- feature engineering: Scikit-learn
- model selection and training: PyTorch
- hyperparameter tuning: Ray Tune
- model evaluation: Weights & Biases
- model monitoring: Grafana
- CI/CD: GitHub Actions
- model versioning: Neptune
- model serving: BentoML
- web app framework: Flask
- front-end: React
- feature store: Qwak
- Graph database: Neo4j
- Vector database: ChromaDB
- NoSQL database: MongoDB
- In-memory data store: Redis

...

What is your current ML tech stack?
👍302
Data Science vs Data Engineering vs AI Song
👇👇
https://youtu.be/WQOzBawrTsQ?si=8wVYA3Me_SGM2GDs

This took a lot of effort, so please share your views in the comments 😄
👍62🔥1
Has anyone watched Vinland Saga? It will change the way you look at life.

I would recommend it even if you aren't an anime lover.
7👍3🥰2
Very important concept
The 80/20 Principle
16👍4👎1
How to convert image to pdf in Python

# Python 3 program to convert an image to PDF
# using the img2pdf library

# importing necessary libraries
import img2pdf
from PIL import Image

# path of the input image
img_path = "Input.png"

# path of the output PDF
pdf_path = "file_pdf.pdf"

# opening the image with Pillow to confirm it is a readable image file;
# the "with" block closes it automatically
with Image.open(img_path) as image:
    # converting the image file into PDF bytes using img2pdf
    pdf_bytes = img2pdf.convert(image.filename)

# creating (or overwriting) the PDF file and writing the bytes to it;
# the "with" block closes the file automatically
with open(pdf_path, "wb") as file:
    file.write(pdf_bytes)

# output
print("Successfully made pdf file")

pip3 install pillow && pip3 install img2pdf
👍121
What is the output of the following Python code?
👍7
Which of the following evaluation metrics may be used in classification?
Anonymous Poll
58%
F1 score
11%
Log loss
6%
Jaccard index
26%
All of the above
🔥2
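For reference, each of the three metrics in the poll can be computed for a binary classifier. Below is a minimal sketch in plain Python, assuming made-up labels and predicted probabilities and a 0.5 decision threshold (all hypothetical, chosen only for illustration). In practice you would more likely use a library such as scikit-learn.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (true positives, false positives, false negatives) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, fn

def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    tp, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def jaccard_index(y_true, y_pred):
    """Jaccard = TP / (TP + FP + FN), overlap of predicted and actual positives."""
    tp, fp, fn = confusion_counts(y_true, y_pred)
    denom = tp + fp + fn
    return tp / denom if denom else 0.0

def log_loss(y_true, y_prob, eps=1e-15):
    """Average negative log-likelihood of the true labels under the predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical data: true labels and predicted probabilities of class 1
y_true = [1, 0, 1, 1, 0, 1]
y_prob = [0.9, 0.2, 0.4, 0.8, 0.6, 0.7]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed threshold of 0.5

f1 = f1_score(y_true, y_pred)            # 0.75
jaccard = jaccard_index(y_true, y_pred)  # 0.6
ll = log_loss(y_true, y_prob)

print("F1 score:", f1)
print("Jaccard index:", jaccard)
print("Log loss:", round(ll, 4))
```

F1 and the Jaccard index work on hard class predictions, while log loss works on predicted probabilities, so it also penalizes confident wrong answers.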
6 Data Science Projects for your portfolio
 
1.   Predictive Analytics Project
Build a model to predict future outcomes based on historical data.
Skills Demonstrated: ML, data preprocessing, feature engineering, model evaluation.
2.   Time Series Analysis Project
Analyze time series data to identify trends, seasonal patterns, and anomalies. You could work on projects like stock market analysis.
Skills Demonstrated: Time series decomposition, forecasting models, data preprocessing.
3.   Recommender System
Develop a recommendation engine for products, articles, songs, or any other items. You can use collaborative filtering, content-based filtering, or hybrid methods.
Skills Demonstrated: Recommendation algorithms, data preprocessing, model evaluation.
4.   Customer Segmentation Project
Use clustering algorithms to segment customers based on their behavior and characteristics. This could involve dividing customers into groups for targeted marketing.
Skills Demonstrated: Clustering algorithms (K-means, DBSCAN), data preprocessing, feature selection.
5.   Anomaly Detection Project
Develop a model to detect anomalies in data, such as fraud detection in financial transactions.
Skills Demonstrated: Anomaly detection techniques, data preprocessing, model evaluation.
6.   Churn Prediction for Subscription Services
Predict which customers are likely to cancel their subscriptions based on their usage patterns and other factors.
Skills Demonstrated: Machine learning, data preprocessing, feature engineering, model evaluation.
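
As a starting point for project 4, here is a minimal K-means sketch in plain Python. It assumes made-up customers described by two hypothetical features (monthly visits, annual spend) and uses the first k points as initial centroids so the run is deterministic. A real project would likely scale the features first and use a library implementation with better initialization.

```python
import math

def kmeans(points, k, iterations=10):
    """Very small K-means: returns (labels, centroids).

    Assumption: the first k points are used as initial centroids.
    """
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

# Hypothetical customers: (monthly visits, annual spend)
customers = [(2, 100), (3, 120), (2, 90), (20, 900), (22, 950), (19, 880)]

labels, centroids = kmeans(customers, k=2)
print("Segments:", labels)
print("Centroids:", centroids)
```

With this toy data the low-activity and high-activity customers end up in separate segments, which is the kind of grouping you would then use for targeted marketing.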

Join for more: https://t.iss.one/pythonspecialist
👍16
⌨️ Python Quiz
👍8🔥5
😂😂
🤣39😁6👍3
I am starting a data science interview series to test your knowledge. Let's begin with the first question:

Question 1:
Explain the difference between supervised and unsupervised learning.

Let me know your answer in the comments 👇👇
👍235
Which of the following is a cluster computing framework that specialises in working with big data?
Anonymous Poll
8%
HTML
51%
Apache Spark
4%
CSS
28%
Pandas
9%
Scipy
👎4👍21👏1
Question 2:
What is overfitting in machine learning, and how can you prevent it?
👍192🔥1
⌨️ Python Quiz
7👍3
Question 3:
What is the bias-variance tradeoff in machine learning?
👍91