Data Science Portfolio - Kaggle Datasets & AI Projects | Artificial Intelligence
Free Datasets For Data Science Projects & Portfolio

Buy ads: https://telega.io/c/DataPortfolio

For Promotions/ads: @coderfun @love_data
Mathematics for Machine Learning

📚 Book
No Degree? No Problem. These 4 Certifications Can Land You a Data Analyst Job 😍

Dreaming of a career in data but don't have a degree? You don't need one. What you do need are the right skills 🔗

These 4 free/affordable certifications can get you there. 💻✨

Link 👇:-

https://pdlink.in/4ioaJ2p

Let's get you certified and hired! ✅
👍1
Here are 10 project ideas to work on for Data Analytics

1. Customer Churn Prediction: Predict customer churn for subscription-based services. Skills: EDA, classification models. Tools: Python, Scikit-Learn.
2. Retail Sales Forecasting: Forecast sales using historical data. Skills: Time series analysis. Tools: Python, Statsmodels.
3. Sentiment Analysis: Analyze sentiments in product reviews or tweets. Skills: Text processing, NLP. Tools: Python, NLTK.
4. Loan Approval Prediction: Predict loan approvals based on credit risk. Skills: Classification models. Tools: Python, Scikit-Learn.
5. COVID-19 Data Analysis: Explore and visualize COVID-19 trends. Skills: EDA, visualization. Tools: Python, Tableau.
6. Traffic Accident Analysis: Discover patterns in traffic accidents. Skills: Clustering, heatmaps. Tools: Python, Folium.
7. Movie Recommendation System: Build a recommendation system using user ratings. Skills: Collaborative filtering. Tools: Python, Scikit-Learn.
8. E-commerce Analysis: Analyze top-performing products in e-commerce. Skills: EDA, association rules. Tools: Python, mlxtend (Apriori algorithm).
9. Stock Market Analysis: Analyze stock trends using historical data. Skills: Moving averages, sentiment analysis. Tools: Python, Matplotlib.
10. Employee Attrition Analysis: Predict employee turnover. Skills: Classification models, HR analytics. Tools: Python, Scikit-Learn.
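
To make one of the skills above concrete, here is a minimal sketch of the moving average mentioned in project 9. It uses only the standard library; the function name and the closing prices are made up for illustration, and in a real project you would more likely load prices into Pandas and use a rolling mean.

```python
from collections import deque


def simple_moving_average(prices, window):
    """Return the simple moving average for every full window of `prices`."""
    if window <= 0:
        raise ValueError("window must be positive")
    recent = deque(maxlen=window)  # holds the last `window` prices
    running_total = 0.0
    averages = []
    for price in prices:
        if len(recent) == window:
            running_total -= recent[0]  # this value is evicted by the append below
        recent.append(price)
        running_total += price
        if len(recent) == window:
            averages.append(running_total / window)
    return averages


# Made-up closing prices, purely for illustration.
closing_prices = [10.0, 11.0, 12.0, 13.0, 14.0]
sma_3 = simple_moving_average(closing_prices, window=3)
print(sma_3)  # [11.0, 12.0, 13.0]
```

Plotting the raw prices and the moving average on the same Matplotlib chart is a common first step for spotting trends.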

And this is how you can work on them.

Here's a compact list of free resources for working on data analytics projects:

1. Datasets
• Kaggle Datasets: Wide range of datasets and community discussions.
• UCI Machine Learning Repository: Great for educational datasets.
• Data.gov: U.S. government datasets (e.g., traffic, COVID-19).
2. Learning Platforms
• YouTube: Channels like Data School and freeCodeCamp for tutorials.
• 365DataScience: Data Science & AI related courses.
3. Tools
• Google Colab: Free Jupyter Notebooks for Python coding.
• Tableau Public & Power BI Desktop: Free data visualization tools.
4. Project Resources
• Kaggle Notebooks & GitHub: Code examples and project walk-throughs.
• Data Analytics on Medium: Project guides and tutorials.

ENJOY LEARNING ✅✅

#datascienceprojects
👍2 ❤1
5 Free Resources That'll Make SQL Finally Click. 😍

SQL seems tough, right? 😩

These 5 FREE SQL resources will take you from beginner to advanced without boring theory dumps or confusion. 📊

Link 👇:-

https://pdlink.in/3GtntaC

Master it with ease. 💡
👍2
Python Roadmap: 🗺

📂 Basics
  ∟📂 Data Types & Variables
  ∟📂 Operators & Expressions
  ∟📂 Control Flow (if, loops)
    ∟📂 Functions & Modules
      ∟📂 File Handling
        ∟📂 OOP (Classes & Objects)
          ∟📂 Exception Handling

∟📂 Advanced Topics (Decorators, Generators)
  ∟📂 Libraries (NumPy, Pandas, Matplotlib)
  ∟📂 Web Scraping / API Integration
  ∟📂 Frameworks (Flask/Django)
    ∟📂 Automation & Scripting
      ∟📂 Projects
        ∟ ✅ Apply For Job

Like if you need a detailed explanation step-by-step ❤️
👍7 ❤4
Want to Learn In-Demand Tech Skills for FREE, Directly from Google? 😍

Whether you're a student, job seeker, or just hungry to upskill, these 5 beginner-friendly courses are your golden ticket. 🎟️

Just career-boosting knowledge and certificates that make your resume pop 📄

Link 👇:-

https://pdlink.in/42vL6br

All The Best 🎊
10 Python Libraries Every AI Engineer Should Know

1. Hugging Face Transformers
A powerful library for using and fine-tuning pre-trained transformer models for NLP. Learn more:
Hugging Face NLP Course

2. Ollama
A framework for running and managing open-source LLMs locally with ease. Learn more:
Ollama Course

3. OpenAI Python SDK
The official toolkit for integrating OpenAI models into Python applications. Learn more:
The official developer quickstart guide

4. Anthropic SDK
A client library for seamless interaction with Claude and other Anthropic models. Learn more:
Anthropic Python SDK

5. LangChain
A framework for building LLM applications with modular and extensible components. Learn more:
DeepLearning.AI

6. LlamaIndex
A toolkit for integrating custom data sources with LLMs for better retrieval. Learn more:
Building Agentic RAG with LlamaIndex

7. SQLAlchemy
A Python SQL toolkit and ORM for efficient and maintainable database interactions. Learn more:
SQLAlchemy Unified Tutorial

8. ChromaDB
An open-source vector database optimized for AI-powered search and retrieval. Learn more:
Getting Started - Chroma Docs

9. Weaviate
A cloud-native vector search engine for efficient semantic search at scale. Learn more:
101T Work with: Text data

10. Weights & Biases
A platform for tracking, visualizing, and optimizing ML experiments. Learn more:
Effective MLOps: Model Development

#artificialintelligence
👍4 ❤1
Forwarded from Artificial Intelligence
TCS FREE Data Analytics Certification Courses 😍

Want to kickstart your career in Data Analytics but don't know where to begin? 👨‍💻

TCS has your back with a completely FREE course designed just for beginners ✅

Link 👇:-

https://pdlink.in/4jNMoEg

Just pure, job-ready learning
Important Pandas & Spark Commands for Data Science
🔥2
Flow chart of commonly used statistical tests
🔥3
6 Best YouTube Channels to Master Power BI 😍

Power BI Isn't Just a Tool, It's a Career Game-Changer 🚀

Whether you're a student, a working professional, or switching careers, learning Power BI can set you apart in the competitive world of data analytics 📊

Link 👇:-

https://pdlink.in/3ELirpu

Your Analytics Journey Starts Now ✅
👍1
Exploratory Data Analysis (EDA)
🔥3
Forwarded from Artificial Intelligence
5 FREE IBM Certification Courses to Skyrocket Your Resume 😍

From mastering Cloud Computing to diving into Deep Learning, Docker, Big Data, IoT, and Blockchain:

IBM, one of the biggest tech companies, is offering 5 FREE courses that can seriously upgrade your resume and skills, without costing you anything.

Link:- 👇

https://pdlink.in/44GsWoC

Enroll For FREE & Get Certified ✅
👍2
5 Frequently Asked SQL Interview Questions with Answers for Data Engineering Interviews:
Difficulty - Medium

⚫️ Determine the Top 5 Products with the Highest Revenue in Each Category.
Schema: Products (ProductID, Name, CategoryID), Sales (SaleID, ProductID, Amount)

WITH ProductRevenue AS (
    SELECT p.ProductID,
           p.Name,
           p.CategoryID,
           SUM(s.Amount) AS TotalRevenue,
           -- RANK() gives tied products the same rank, so a category can return more than 5 rows
           RANK() OVER (PARTITION BY p.CategoryID ORDER BY SUM(s.Amount) DESC) AS RevenueRank
    FROM Products p
    JOIN Sales s ON p.ProductID = s.ProductID
    GROUP BY p.ProductID, p.Name, p.CategoryID
)
SELECT ProductID, Name, CategoryID, TotalRevenue
FROM ProductRevenue
WHERE RevenueRank <= 5;
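
If you want to try this pattern without a database server, the sketch below runs it against an in-memory SQLite database from Python. The sample rows are made up, the cut-off is passed as a parameter instead of the fixed 5, and the ranking is done in a second CTE to keep each step simple. It assumes your Python ships SQLite 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, Name TEXT, CategoryID INTEGER);
CREATE TABLE Sales (SaleID INTEGER PRIMARY KEY, ProductID INTEGER, Amount REAL);
-- Made-up sample data: products A and B tie on revenue in category 10.
INSERT INTO Products VALUES (1, 'A', 10), (2, 'B', 10), (3, 'C', 10), (4, 'D', 20);
INSERT INTO Sales VALUES (1, 1, 100), (2, 1, 50), (3, 2, 150), (4, 3, 40), (5, 4, 70);
""")

TOP_N_QUERY = """
WITH ProductRevenue AS (
    SELECT p.ProductID, p.Name, p.CategoryID, SUM(s.Amount) AS TotalRevenue
    FROM Products p
    JOIN Sales s ON p.ProductID = s.ProductID
    GROUP BY p.ProductID, p.Name, p.CategoryID
),
Ranked AS (
    SELECT Name, CategoryID, TotalRevenue,
           RANK() OVER (PARTITION BY CategoryID ORDER BY TotalRevenue DESC) AS RevenueRank
    FROM ProductRevenue
)
SELECT Name, CategoryID, TotalRevenue, RevenueRank
FROM Ranked
WHERE RevenueRank <= ?
ORDER BY CategoryID, RevenueRank, Name;
"""


def top_products(connection, top_n):
    """Return the top `top_n` ranked products per category (ties included)."""
    return connection.execute(TOP_N_QUERY, (top_n,)).fetchall()


top_1 = top_products(conn, 1)
print(top_1)  # [('A', 10, 150.0, 1), ('B', 10, 150.0, 1), ('D', 20, 70.0, 1)]
```

Asking for the top 1 returns both A and B for category 10 because RANK() keeps ties. If the interviewer wants exactly N rows per category, ROW_NUMBER() with a tie-breaker column is the usual alternative.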

โšซ๏ธ Identify Employees with Increasing Sales for Four Consecutive Quarters.
Schema: Sales (EmployeeID, SaleDate, Amount)

WITH QuarterlySales AS (
SELECT EmployeeID,
DATE_TRUNC('quarter', SaleDate) AS Quarter,
SUM(Amount) AS QuarterlyAmount
FROM Sales
GROUP BY EmployeeID, DATE_TRUNC('quarter', SaleDate)
),
SalesTrend AS (
SELECT EmployeeID,
Quarter,
QuarterlyAmount,
LAG(QuarterlyAmount, 1) OVER (PARTITION BY EmployeeID ORDER BY Quarter) AS PrevQuarter1,
LAG(QuarterlyAmount, 2) OVER (PARTITION BY EmployeeID ORDER BY Quarter) AS PrevQuarter2,
LAG(QuarterlyAmount, 3) OVER (PARTITION BY EmployeeID ORDER BY Quarter) AS PrevQuarter3
FROM QuarterlySales
)
SELECT EmployeeID, Quarter, QuarterlyAmount
FROM SalesTrend
WHERE QuarterlyAmount > PrevQuarter1 AND PrevQuarter1 > PrevQuarter2 AND PrevQuarter2 > PrevQuarter3;
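
One thing worth knowing for the follow-up questions: LAG looks back by rows, not by calendar quarters, so an employee with a quarter missing can look like a four-quarter streak. The sketch below shows the difference on made-up data in SQLite. It assumes a hypothetical pre-aggregated table where `QuarterIndex` is a running quarter number (1, 2, 3, ...), which stands in for the DATE_TRUNC step of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE QuarterlySales (EmployeeID INTEGER, QuarterIndex INTEGER, QuarterlyAmount REAL);
-- Employee 1 sold in quarters 1-4; employee 2 has no sales in quarter 4.
INSERT INTO QuarterlySales VALUES
  (1, 1, 100), (1, 2, 120), (1, 3, 130), (1, 4, 150),
  (2, 1, 100), (2, 2, 120), (2, 3, 130), (2, 5, 150);
""")

STREAK_QUERY = """
WITH SalesTrend AS (
    SELECT EmployeeID, QuarterIndex, QuarterlyAmount,
           LAG(QuarterlyAmount, 1) OVER (PARTITION BY EmployeeID ORDER BY QuarterIndex) AS Prev1,
           LAG(QuarterlyAmount, 2) OVER (PARTITION BY EmployeeID ORDER BY QuarterIndex) AS Prev2,
           LAG(QuarterlyAmount, 3) OVER (PARTITION BY EmployeeID ORDER BY QuarterIndex) AS Prev3,
           LAG(QuarterIndex, 3) OVER (PARTITION BY EmployeeID ORDER BY QuarterIndex) AS IndexThreeBack
    FROM QuarterlySales
)
SELECT DISTINCT EmployeeID
FROM SalesTrend
WHERE QuarterlyAmount > Prev1 AND Prev1 > Prev2 AND Prev2 > Prev3
  AND (? = 0 OR QuarterIndex - IndexThreeBack = 3)
ORDER BY EmployeeID;
"""


def rising_employees(connection, require_adjacent):
    """Employees with four rising quarters; optionally require no gaps."""
    flag = 1 if require_adjacent else 0
    rows = connection.execute(STREAK_QUERY, (flag,)).fetchall()
    return [employee_id for (employee_id,) in rows]


print(rising_employees(conn, require_adjacent=False))  # [1, 2]
print(rising_employees(conn, require_adjacent=True))   # [1]
```

Employee 2 only passes when gaps are ignored, which is why the adjacency check matters when the question says "consecutive".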

โšซ๏ธ List Customers Who Made Purchases in Each of the Last Three Years.
Schema: Orders (OrderID, CustomerID, OrderDate)

WITH YearlyOrders AS (
SELECT CustomerID,
EXTRACT(YEAR FROM OrderDate) AS OrderYear
FROM Orders
GROUP BY CustomerID, EXTRACT(YEAR FROM OrderDate)
),
RecentYears AS (
SELECT DISTINCT OrderYear
FROM Orders
WHERE OrderDate >= CURRENT_DATE - INTERVAL '3 years'
),
CustomerYearlyOrders AS (
SELECT CustomerID,
COUNT(DISTINCT OrderYear) AS YearCount
FROM YearlyOrders
WHERE OrderYear IN (SELECT OrderYear FROM RecentYears)
GROUP BY CustomerID
)
SELECT CustomerID
FROM CustomerYearlyOrders
WHERE YearCount = 3;
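
Here is a small runnable sketch of the same idea in SQLite, counting distinct calendar years inside a three-year window. The orders are made up, and the reference year is passed in as a parameter (a hypothetical `current_year` argument) so the result does not depend on today's date. SQLite has no EXTRACT, so `strftime('%Y', ...)` is used instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, OrderDate TEXT);
-- Made-up orders: only customer 1 buys in 2023, 2024 and 2025.
INSERT INTO Orders (CustomerID, OrderDate) VALUES
  (1, '2023-02-10'), (1, '2024-06-01'), (1, '2025-01-15'),
  (2, '2022-12-30'), (2, '2024-03-03'), (2, '2025-04-20'),
  (3, '2023-05-05'), (3, '2024-01-01'), (3, '2024-11-11');
""")

EVERY_YEAR_QUERY = """
SELECT CustomerID
FROM Orders
WHERE CAST(strftime('%Y', OrderDate) AS INTEGER) BETWEEN ? - 2 AND ?
GROUP BY CustomerID
HAVING COUNT(DISTINCT strftime('%Y', OrderDate)) = 3
ORDER BY CustomerID;
"""


def customers_active_every_year(connection, current_year):
    """Customers with at least one order in each of the 3 years ending at `current_year`."""
    rows = connection.execute(EVERY_YEAR_QUERY, (current_year, current_year)).fetchall()
    return [customer_id for (customer_id,) in rows]


loyal = customers_active_every_year(conn, 2025)
print(loyal)  # [1]
```

Customer 2 has three order years in total, but 2022 falls outside the window, so only customer 1 qualifies.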


โšซ๏ธ Find the Third Lowest Price for Each Product Category.
Schema: Products (ProductID, Name, CategoryID, Price)

WITH RankedPrices AS (
SELECT CategoryID,
Price,
DENSE_RANK() OVER (PARTITION BY CategoryID ORDER BY Price ASC) AS PriceRank
FROM Products
)
SELECT CategoryID, Price
FROM RankedPrices
WHERE PriceRank = 3;
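
Interviewers often ask why DENSE_RANK is used here instead of ROW_NUMBER. The sketch below compares the two on made-up prices in SQLite. The helper name `third_lowest` is hypothetical, and the ranking function name is interpolated into the SQL only after checking it against a fixed list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, Name TEXT, CategoryID INTEGER, Price REAL);
-- Category 1 has duplicate prices; category 2 has only two products.
INSERT INTO Products (Name, CategoryID, Price) VALUES
  ('a', 1, 5.0), ('b', 1, 5.0), ('c', 1, 7.0), ('d', 1, 9.0), ('e', 1, 9.0),
  ('f', 2, 3.0), ('g', 2, 4.0);
""")


def third_lowest(connection, rank_function):
    """Third-lowest price per category using the given ranking function."""
    if rank_function not in ("DENSE_RANK", "ROW_NUMBER"):
        raise ValueError("unsupported ranking function")
    query = f"""
        WITH RankedPrices AS (
            SELECT CategoryID, Price,
                   {rank_function}() OVER (PARTITION BY CategoryID ORDER BY Price ASC) AS PriceRank
            FROM Products
        )
        SELECT DISTINCT CategoryID, Price
        FROM RankedPrices
        WHERE PriceRank = 3
        ORDER BY CategoryID;
    """
    return connection.execute(query).fetchall()


print(third_lowest(conn, "DENSE_RANK"))  # [(1, 9.0)]
print(third_lowest(conn, "ROW_NUMBER"))  # [(1, 7.0)]
```

DENSE_RANK treats the two 5.0 prices as one rank, so the third-lowest distinct price is 9.0. ROW_NUMBER counts rows, so it lands on 7.0. Category 2 has fewer than three prices and does not appear at all.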

โšซ๏ธ Identify Products with Total Sales Exceeding a Specified Threshold Over the Last 30 Days.
Schema: Sales (SaleID, ProductID, SaleDate, Amount)

WITH RecentSales AS (
SELECT ProductID,
SUM(Amount) AS TotalSales
FROM Sales
WHERE SaleDate >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY ProductID
)
SELECT ProductID, TotalSales
FROM RecentSales
WHERE TotalSales > 200;
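
Since the question says "a specified threshold", it can help to show the threshold as a parameter instead of a hard-coded number. Below is a minimal sketch in SQLite with made-up sales. The `as_of` date and `threshold` are hypothetical parameters chosen for this example, and HAVING is used as a shorter equivalent of the CTE plus WHERE above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales (SaleID INTEGER PRIMARY KEY, ProductID INTEGER, SaleDate TEXT, Amount REAL);
-- Made-up sales; the November sale is older than 30 days for the date used below.
INSERT INTO Sales (ProductID, SaleDate, Amount) VALUES
  (1, '2025-01-20', 150), (1, '2025-01-25', 100), (1, '2024-11-01', 500),
  (2, '2025-01-28', 120);
""")

THRESHOLD_QUERY = """
SELECT ProductID, SUM(Amount) AS TotalSales
FROM Sales
WHERE SaleDate >= date(:as_of, '-30 days')
GROUP BY ProductID
HAVING SUM(Amount) > :threshold
ORDER BY ProductID;
"""


def products_over_threshold(connection, as_of, threshold):
    """Products whose sales in the 30 days up to `as_of` exceed `threshold`."""
    params = {"as_of": as_of, "threshold": threshold}
    return connection.execute(THRESHOLD_QUERY, params).fetchall()


hot = products_over_threshold(conn, "2025-01-31", 200)
print(hot)  # [(1, 250.0)]
```

Binding the values as parameters also avoids building SQL strings by hand, which is the safer habit when the threshold comes from user input.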

Here you can find essential Interview Resources 👇
https://whatsapp.com/channel/0029VaGgzAk72WTmQFERKh02

Like this post if you need more 👍❤️

Hope it helps :)
👍1
4 FREE Courses by Harvard and Stanford to Learn AI 😍

Dreaming of Mastering AI? 🎯

Harvard and Stanford, two of the most prestigious universities in the world, are offering FREE AI courses 👨‍💻

No hidden fees, no long applications. Just pure, world-class education, accessible to everyone 🔥

Link 👇:-

https://pdlink.in/3GqHkau

Here's your golden ticket to the future! ✅
👍1
Important Topics to Become a Data Scientist [Advanced Level]
👇👇

1. Mathematics

Linear Algebra
Analytic Geometry
Matrix
Vector Calculus
Optimization
Regression
Dimensionality Reduction
Density Estimation
Classification

2. Probability

Introduction to Probability
1D Random Variable
Functions of One Random Variable
Joint Probability Distribution
Discrete Distribution
Normal Distribution

3. Statistics

Introduction to Statistics
Data Description
Random Samples
Sampling Distribution
Parameter Estimation
Hypothesis Testing
Regression

4. Programming

Python:

Python Basics
List
Set
Tuples
Dictionary
Function
NumPy
Pandas
Matplotlib/Seaborn

R Programming:

R Basics
Vector
List
Data Frame
Matrix
Array
Function
dplyr
ggplot2
tidyr
Shiny

Databases:
SQL
MongoDB

Data Structures

Web scraping

Linux

Git

5. Machine Learning

How Model Works
Basic Data Exploration
First ML Model
Model Validation
Underfitting & Overfitting
Random Forest
Handling Missing Values
Handling Categorical Variables
Pipelines
Cross-Validation (R)
XGBoost (Python | R)
Data Leakage

6. Deep Learning

Artificial Neural Network
Convolutional Neural Network
Recurrent Neural Network
TensorFlow
Keras
PyTorch
A Single Neuron
Deep Neural Network
Stochastic Gradient Descent
Overfitting and Underfitting
Dropout & Batch Normalization
Binary Classification

7. Feature Engineering

Baseline Model
Categorical Encodings
Feature Generation
Feature Selection

8. Natural Language Processing

Text Classification
Word Vectors

9. Data Visualization Tools

BI (Business Intelligence):
Tableau
Power BI
QlikView
Qlik Sense

10. Deployment

Microsoft Azure
Heroku
Google Cloud Platform
Flask
Django

I have curated the best interview resources to crack Data Science Interviews
👇👇
https://whatsapp.com/channel/0029Va4QUHa6rsQjhITHK82y

Like if you need similar content 😄👍
👍3
Forwarded from Generative AI
FREE Google Learning Path! Become a Certified Data Analyst in 2025 😍

If you're dreaming of starting a high-paying data career or switching into the booming tech industry, Google just made it a whole lot easier, and it's completely FREE 👨‍💻

Link 👇:-

https://pdlink.in/4cMx2h2

You'll get access to hands-on labs, real datasets, and industry-grade training created directly by Google's own experts 💻
👍2
Please go through these top 5 SQL projects with datasets that you can practice on and add to your resume

🚀 1. Web Analytics:
(https://www.kaggle.com/zynicide/wine-reviews)

🚀 2. Healthcare Data Analysis:
(https://www.kaggle.com/cdc/mortality)

📌 3. E-commerce Analysis:
(https://www.kaggle.com/olistbr/brazilian-ecommerce)

🚀 4. Inventory Management:
(https://www.kaggle.com/code/govindji/inventory-management)

🚀 5. Analysis of Sales Data:
(https://www.kaggle.com/kyanyoga/sample-sales-data)

Small suggestion from my side for non-tech students: pick datasets on subjects you genuinely like. That way you will be more excited to practice, and you will learn SQL more passionately instead of doing it just for the sake of your resume. Since it's a programming language, try to make it exciting for yourself.

Hope this piece of information helps you
👍2
๐—•๐—ฒ๐˜€๐˜ ๐—ฌ๐—ผ๐˜‚๐—ง๐˜‚๐—ฏ๐—ฒ ๐—–๐—ต๐—ฎ๐—ป๐—ป๐—ฒ๐—น๐˜€ ๐˜๐—ผ ๐—Ÿ๐—ฒ๐—ฎ๐—ฟ๐—ป ๐—˜๐˜€๐˜€๐—ฒ๐—ป๐˜๐—ถ๐—ฎ๐—น ๐——๐—ฎ๐˜๐—ฎ ๐—”๐—ป๐—ฎ๐—น๐˜†๐˜๐—ถ๐—ฐ๐˜€ ๐—ฆ๐—ธ๐—ถ๐—น๐—น๐˜€ ๐—ณ๐—ผ๐—ฟ ๐—™๐—ฅ๐—˜๐—˜๐Ÿ˜

Dreaming of becoming a Data Analyst but feel overwhelmed by where to start?๐Ÿ‘จโ€๐Ÿ’ป

Hereโ€™s the truth: YouTube is packed with goldmine content, and the best part โ€” itโ€™s all 100% FREE๐Ÿ”ฅ

๐‹๐ข๐ง๐ค๐Ÿ‘‡:-

https://pdlink.in/4cL3SyM

๐Ÿš€ If Youโ€™re Serious About Data Analytics, You Canโ€™t Sleep on These YouTube Channels!
๐Ÿ‘1