Machine Learning with Python
68.6K subscribers
1.32K photos
100 videos
173 files
984 links
Learn Machine Learning with hands-on Python tutorials, real-world code examples, and clear explanations for researchers and developers.

Admin: @HusseinSheikho || @Hussein_Sheikho
😉 A list of the best YouTube videos
✅ To learn data science


1️⃣ SQL language


⬅️ Learning

💰 4-hour SQL course from zero to one hundred

💰 Window functions tutorial

⬅️ Projects

📎 Starting your first SQL project

💰 Data cleansing project

💰 Restaurant order analysis

⬅️ Interview

💰 How to crack the SQL interview?

➖➖➖

2️⃣ Python


⬅️ Learning

💰 12-hour Python for Data Science course

⬅️ Projects

💰 Python project for beginners

💰 Analyzing Corona Data with Python

⬅️ Interview

💰 Python interview golden tricks

💰 Python Interview Questions

➖➖➖

3️⃣ Statistics and machine learning


⬅️ Learning

💰 7-hour course in applied statistics

💰 Machine Learning Training Playlist

⬅️ Projects

💰 Practical ML Project

⬅️ Interview

💰 ML Interview Questions and Answers

💰 How to pass a statistics interview?

➖➖➖

4️⃣ Product and business case studies


⬅️ Learning

💰 Building strong product understanding

💰 Product Metric Definition

⬅️ Interview

💰 Case Study Analysis Framework

💰 How to shine in a business interview?

#DataScience #SQL #Python #MachineLearning #Statistics #BusinessAnalytics #ProductCaseStudies #DataScienceProjects #InterviewPrep #LearnDataScience #YouTubeLearning #CodingInterview #MLInterview #SQLProjects #PythonForDataScience



✉️ Our Telegram channels: https://t.iss.one/addlist/0f6vfFbEMdAwODBk
Topic: Handling Datasets of All Types – Part 1 of 5: Introduction and Basic Concepts

---

1. What is a Dataset?

• A dataset is a collection of data gathered for analysis or for training machine learning models. It is often organized in rows and columns, but it can also consist of files such as images, text, or audio.

---

2. Types of Datasets

• Structured Data: Tables, spreadsheets with rows and columns (e.g., CSV, Excel).

• Unstructured Data: Images, text, audio, video.

• Semi-structured Data: JSON, XML files containing hierarchical data.
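
To make the distinction concrete, here is a minimal sketch that starts from semi-structured, nested records and flattens them into a structured table with pandas. The records and their field names are invented for illustration only.

```python
import pandas as pd

# Semi-structured: nested, JSON-like records (hypothetical example data)
records = [
    {"id": 1, "name": "Ada", "address": {"city": "London", "zip": "N1"}},
    {"id": 2, "name": "Linus", "address": {"city": "Helsinki"}},  # no zip code
]

# Structured: json_normalize flattens nested keys into dotted column names
df = pd.json_normalize(records)
print(df.columns.tolist())
print(df)
```

The missing `zip` in the second record becomes a missing value in the flattened table, which is one reason semi-structured data usually needs a cleaning step.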

---

3. Common Dataset Formats

• CSV (Comma-Separated Values)

• Excel (.xls, .xlsx)

• JSON (JavaScript Object Notation)

• XML (eXtensible Markup Language)

• Images (JPEG, PNG, TIFF)

• Audio (WAV, MP3)

---

4. Loading Datasets in Python

• Use libraries like pandas for structured data:

import pandas as pd
df = pd.read_csv('data.csv')


• Use libraries like json for JSON files:

import json
with open('data.json') as f:
    data = json.load(f)
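
The other formats listed in section 3 can be loaded in a similar way. As a minimal sketch, XML can be parsed with the standard-library `xml.etree.ElementTree` module and turned into a DataFrame. The XML snippet and its tag names are made up for illustration; with a real file you would likely call `ET.parse(path)` instead.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Hypothetical XML content standing in for a file on disk
xml_text = """
<people>
  <person><name>Ada</name><age>36</age></person>
  <person><name>Linus</name><age>54</age></person>
</people>
""".strip()

root = ET.fromstring(xml_text)

# Build one dict per <person>, with one key per child tag
rows = [{child.tag: child.text for child in person} for person in root.findall("person")]

df = pd.DataFrame(rows)
df["age"] = df["age"].astype(int)  # element text is always read as a string
print(df)
```

For Excel files, pandas provides `pd.read_excel`, which needs an engine such as openpyxl to be installed.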


---

5. Basic Dataset Exploration

• Check shape and size:

print(df.shape)


• Preview data:

print(df.head())


• Check for missing values:

print(df.isnull().sum())
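
A few more checks are often useful at this stage. The sketch below runs them on a small made-up table (the column names and values are assumptions for the example) so it can be executed without any data file.

```python
import numpy as np
import pandas as pd

# Small hypothetical table with a few gaps
df = pd.DataFrame({
    "age": [25, np.nan, 41, 33],
    "city": ["Oslo", "Lima", None, "Pune"],
    "income": [52000.0, 61000.0, np.nan, np.nan],
})

print(df.dtypes)                     # data type of each column
print(df.describe())                 # summary statistics for numeric columns
missing_share = df.isnull().mean()   # fraction of missing values per column
print(missing_share)
print(df.duplicated().sum())         # number of fully duplicated rows
```

Looking at the share of missing values rather than the raw count makes columns of different lengths easier to compare.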


---

6. Summary

• Understanding dataset types is crucial before processing.

• Loading and exploring datasets helps identify cleaning and preprocessing needs.

---

Exercise

• Load a CSV dataset and a JSON dataset in Python, print their shapes, and identify missing values.
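
One possible way to approach the exercise is sketched below. It writes two tiny placeholder files to a temporary directory first, so it runs anywhere; the file contents are invented, and with real data you would point `read_csv` and `open` at your own files.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "data.csv"
    json_path = Path(tmp) / "data.json"

    # Placeholder data: the second row has a missing score in both files
    csv_path.write_text("name,score\nAda,90\nLinus,\n")
    json_path.write_text(json.dumps([
        {"name": "Ada", "score": 90},
        {"name": "Linus", "score": None},
    ]))

    csv_df = pd.read_csv(csv_path)
    with open(json_path) as f:
        json_df = pd.DataFrame(json.load(f))

print(csv_df.shape, json_df.shape)
print(csv_df.isnull().sum())
print(json_df.isnull().sum())
```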

---

#DataScience #Datasets #DataLoading #Python #DataExploration

The rest of the parts 👇
https://t.iss.one/DataScienceM 🌟
🚀 Comprehensive Guide: How to Prepare for a Graph Neural Networks (GNN) Job Interview – 350 Most Common Interview Questions

Read: https://hackmd.io/@husseinsheikho/GNN-interview

#GNN #GraphNeuralNetworks #MachineLearning #DeepLearning #AI #DataScience #PyTorchGeometric #DGL #NodeClassification #LinkPrediction #GraphML

✉️ Our Telegram channels: https://t.iss.one/addlist/0f6vfFbEMdAwODBk

📱 Our WhatsApp channel: https://whatsapp.com/channel/0029VaC7Weq29753hpcggW2A
Prepare for Job Interviews.

In DS or AI/ML interviews, you need to be able to explain models, debug them live, and design AI/ML systems from scratch. If you can’t demonstrate this during an interview, expect to hear, “We’ll get back to you.”

The person in the attachment is Chip Huyen. Hopefully you know her; if not, I can't help you here. She is probably one of the finest authors in the field of AI/ML.

She wrote a book that documents common ML interview questions.

Target audience: ML engineers, platform engineers, research scientists, or anyone who wants to do ML but doesn’t yet know the differences among those titles. Check the comment section for links and repos.

📌 Link:
https://huyenchip.com/ml-interviews-book/

#JobInterview #MachineLearning #AI #DataScience #MLEngineer #AIInterview #TechCareers #DeepLearning #AICommunity #MLSystems #CareerGrowth #AIJobs #ChipHuyen #InterviewPrep #DataScienceCommunit

https://t.iss.one/CodeProgrammer 🌟
👨🏻‍💻 This Python library helps you extract usable data for language models from complex files like tables, images, charts, or multi-page documents.

📝 The idea of Agentic Document Extraction is that unlike common methods like OCR that only read text, it can also understand the structure and relationships between different parts of the document. For example, it understands which title belongs to which table or image.


✅ Works with PDFs, images, and website links.

☑️ Can chunk and process very large documents (up to 1000 pages) by itself.

✔️ Outputs both JSON and Markdown formats.

☑️ Even specifies the exact location of each section on the page.

✔️ Supports parallel and batch processing.

pip install agentic-doc


┌ 🥵 Agentic Document Extraction
├ 🌎 Website
└ 🐱 GitHub Repos

🌐 #DataScience #DataScience
➖➖➖➖➖➖➖➖➖➖➖➖➖

https://t.iss.one/CodeProgrammer
📺 12 comprehensive playlists to master
⬅️ machine learning, deep learning, and GenAI!


👨🏻‍💻 Each playlist is designed to be simple and understandable for beginners, and then gradually dive deeper into the topics.


😉 Machine Learning Basics (39 videos)

😉 Python for ML (9 videos)

😉 Optimization for ML (5 videos)

😉 Machine Learning with Practical Exercises (37 videos)

😉 Building Decision Trees from Scratch (13 videos)

😉 Building Neural Networks from Scratch (35 videos)

😉 Graph Neural Networks (6 videos)

😉 Computer Vision from Scratch (19 videos)

😉 Building LLM from Scratch (43 videos)

😉 Reasoning in LLMs from Scratch (22 videos)

😉 Building DeepSeek from Scratch (29 videos)

😉 Machine Learning in Production Environment (6 videos)



🌐 #Data_Science #DataScience
➖➖➖➖➖➖➖➖➖➖➖➖➖

https://t.iss.one/CodeProgrammer ❤️
💠 The Best Tool for Extracting Data from PDF Files!

👩🏻‍💻 Usually, PDF files like financial reports, scientific articles, or data analyses are full of tables, formulas, and complex texts.

⬅️ Most tools only extract the text and destroy the data structure, causing important information to be lost.

✅ But the tool Docling uses artificial intelligence to preserve all those structures (text, tables, formulas) as they appear in the file. Then it converts that data into a structured format that AI models can work on.

⭕ The interesting point is that with just three lines of Python code, you can convert any PDF into searchable data!
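
As a rough sketch of what those few lines might look like, the function below follows the `DocumentConverter` quickstart from Docling's documentation. The path `report.pdf` is a placeholder, and the import is kept inside the function so the snippet can be defined even when Docling is not installed. Treat it as an illustration and check the current documentation before relying on it.

```python
def pdf_to_markdown(source: str) -> str:
    """Convert a PDF (local path or URL) to Markdown using Docling."""
    # Imported lazily; requires `pip install docling`
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)
    return result.document.export_to_markdown()


# Usage (placeholder path, not a real file):
# print(pdf_to_markdown("report.pdf"))
```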

┌ 🥵 Docling
├ 🔎 Article
├ 📄 Documentation
└ 🐱 GitHub-Repos

🌐 #Data_Science #DataScience
➖➖➖➖➖➖➖➖➖➖➖➖
⚙️ This tool is turning the world of Web Scraping upside down!

👨🏻‍💻 A new tool called Crawl4AI has been introduced that makes Web Scraping and data extraction from websites much easier, faster, and smarter! It is designed especially for feeding data to AI models like ChatGPT and similar tools.

1️⃣ Its special features:

🔹 Completely free and open-source. That means you can use it however you want without any cost.

🔹 Works much faster than paid tools.

🔹 Its outputs are AI-friendly, such as JSON, HTML, or Markdown.

🔹 Can extract data from multiple websites simultaneously.

🔹 Collects images, videos, and audio from pages as well.

🔹 Extracts all internal and external links for you.

➖ ➖ ➖

🔢 More advanced features:

🔹 Takes screenshots of pages and collects metadata (like title, description, tags).

🔹 You can add custom code or special settings such as authentication and headers.

🔹 You can even change its browser User-Agent so that its requests look like those of a regular browser.

🔹 Before starting extraction, it can run your custom JavaScript code.
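
A minimal sketch of a single-page crawl, assuming the `AsyncWebCrawler` interface shown in the Crawl4AI README. The URL is a placeholder, and the import sits inside the function so the snippet can be defined without the package installed. Options and result fields may differ between versions, so check the project's documentation.

```python
import asyncio


async def fetch_markdown(url: str) -> str:
    """Crawl one page and return its content as Markdown."""
    # Imported lazily; requires `pip install crawl4ai` plus its browser setup
    from crawl4ai import AsyncWebCrawler

    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url=url)
        return str(result.markdown)


# Usage (placeholder URL):
# print(asyncio.run(fetch_markdown("https://example.com")))
```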

┌ ♦️ Crawl4AI
└ 🐱 GitHub Repos

🌐 #DataScience #DataScience

https://t.iss.one/CodeProgrammer
🤖🧠 Master Machine Learning: Explore the Ultimate “Machine-Learning-Tutorials” Repository

🗓️ 23 Oct 2025
📚 AI News & Trends

In today’s data-driven world, Machine Learning (ML) has become the cornerstone of modern technology, from intelligent chatbots to predictive analytics and recommendation systems. However, mastering ML isn’t just about coding; it requires a structured understanding of algorithms, statistics, optimization techniques, and real-world problem-solving. That’s where Ujjwal Karn’s Machine-Learning-Tutorials GitHub repository stands out. This open-source, topic-wise ...

#MachineLearning #MLTutorials #ArtificialIntelligence #DataScience #OpenSource #AIEducation
Forwarded from PyData Careers
In Python, NumPy is the cornerstone of scientific computing, offering high-performance multidimensional arrays and tools for working with them, which makes it critical for data science interviews and real-world applications! 📊

import numpy as np

# Array Creation - The foundation of NumPy
arr = np.array([1, 2, 3])
zeros = np.zeros((2, 3)) # 2x3 matrix of zeros
ones = np.ones((2, 2), dtype=int) # Integer matrix
arange = np.arange(0, 10, 2) # [0 2 4 6 8]
linspace = np.linspace(0, 1, 5) # [0. 0.25 0.5 0.75 1. ]
print(linspace)


# Array Attributes - Master your data's structure
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.shape) # Output: (2, 3)
print(matrix.ndim) # Output: 2
print(matrix.dtype) # Output: int64 (platform-dependent; int32 on Windows with NumPy versions before 2.0)
print(matrix.size) # Output: 6


# Indexing & Slicing - Precision data access
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(data[1, 2]) # Output: 6 (row 1, col 2)
print(data[0:2, 1:3]) # Output: [[2 3], [5 6]]
print(data[:, -1]) # Output: [3 6 9] (last column)


# Reshaping Arrays - Transform dimensions effortlessly
flat = np.arange(6)
reshaped = flat.reshape(2, 3)
raveled = reshaped.ravel()
print(reshaped)
# Output: [[0 1 2], [3 4 5]]
print(raveled) # Output: [0 1 2 3 4 5]


# Stacking Arrays - Combine datasets vertically/horizontally
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.vstack((a, b))) # Vertical stack
# Output: [[1 2 3], [4 5 6]]
print(np.hstack((a, b))) # Horizontal stack
# Output: [1 2 3 4 5 6]


# Mathematical Operations - Vectorized calculations
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
print(x + y) # Output: [5 7 9]
print(x * 2) # Output: [2 4 6]
print(np.dot(x, y)) # Output: 32 (1*4 + 2*5 + 3*6)


# Broadcasting Magic - Operate on mismatched shapes
matrix = np.array([[1, 2, 3], [4, 5, 6]])
scalar = 10
print(matrix + scalar)
# Output: [[11 12 13], [14 15 16]]
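
Adding a scalar is the simplest case of broadcasting. The sketch below shows the more general rule, where two arrays with different shapes are stretched to a common shape. The array names are chosen for illustration.

```python
import numpy as np

row = np.array([1, 2, 3])      # shape (3,)
col = np.array([[10], [20]])   # shape (2, 1)

# Shapes (2, 1) and (3,) broadcast to (2, 3)
grid = col + row
print(grid.shape)
print(grid)

# Typical use: subtract each column's mean from a matrix
matrix = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
centered = matrix - matrix.mean(axis=0)   # (2, 3) minus (3,)
print(centered)
```

Shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal or one of them is 1.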


# Aggregation Functions - Statistical power in one line
values = np.array([1, 5, 3, 9, 7])
print(np.sum(values)) # Output: 25
print(np.mean(values)) # Output: 5.0
print(np.max(values)) # Output: 9
print(np.std(values)) # Output: 2.8284271247461903


# Boolean Masking - Filter data like a pro
temperatures = np.array([18, 25, 12, 30, 22])
hot_days = temperatures > 24
print(temperatures[hot_days]) # Output: [25 30]


# Random Number Generation - Simulate real-world data
print(np.random.rand(2, 2)) # Uniform distribution
print(np.random.randn(3)) # Normal distribution
print(np.random.randint(0, 10, (2, 3))) # Random integers
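
The calls above use NumPy's legacy global random state. As a sketch, the same kinds of draws can be made with the newer `Generator` API, seeded so that results are reproducible. The seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

uniform = rng.random((2, 2))                  # uniform on [0, 1)
normal = rng.standard_normal(3)               # standard normal distribution
integers = rng.integers(0, 10, size=(2, 3))   # integers in [0, 10)

print(uniform)
print(normal)
print(integers)
```

Passing the generator object around explicitly avoids hidden shared state between different parts of a program.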


# Linear Algebra Essentials - Solve equations like a physicist
A = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])
x = np.linalg.solve(A, b)
print(x) # Output: [2. 3.] (Solution to 3x+y=9 and x+2y=8)

# Matrix inverse and determinant
print(np.linalg.inv(A)) # Output: [[ 0.4 -0.2], [-0.2 0.6]]
print(np.linalg.det(A)) # Output: approximately 5.0 (floating-point rounding may print 5.000000000000001)


# File Operations - Save/load your computational work
data = np.array([[1, 2], [3, 4]])
np.save('array.npy', data)
loaded = np.load('array.npy')
print(np.array_equal(data, loaded)) # Output: True


# Interview Power Move: Vectorization vs Loops
# Often many times faster than native Python loops on large arrays
def square_sum(n):
    arr = np.arange(n)
    return np.sum(arr ** 2)

print(square_sum(5)) # Output: 30 (0²+1²+2²+3²+4²)
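
How large the speedup is depends on the machine and the array size, so it is worth measuring. The sketch below times a plain Python loop against the vectorized version with `timeit`. The function names and the value of `n` are chosen for this example.

```python
import timeit

import numpy as np


def square_sum_loop(n):
    # Plain Python loop over the integers 0..n-1
    total = 0
    for i in range(n):
        total += i * i
    return total


def square_sum_numpy(n):
    # Vectorized: square the whole array at once, then sum
    arr = np.arange(n, dtype=np.int64)
    return int(np.sum(arr ** 2))


n = 100_000
loop_time = timeit.timeit(lambda: square_sum_loop(n), number=20)
numpy_time = timeit.timeit(lambda: square_sum_numpy(n), number=20)
print(f"loop: {loop_time:.4f}s  numpy: {numpy_time:.4f}s")
```

Both functions return the same value here. For very large `n`, the fixed-width int64 result can overflow, while Python integers cannot.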


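# Pro Tip: Memory-efficient data processing
# Map a large on-disk array (1,000,000 x 100 float32, about 400 MB) without loading it all into RAM
# Note: mode='r' requires that 'large_data.bin' already exists and is large enough
large_array = np.memmap('large_data.bin', dtype='float32', mode='r', shape=(1000000, 100))
print(large_array[0:5, 0:3]) # Reads only the small slice that is accessed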
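
Because read-only mode needs an existing file, here is a self-contained sketch that first creates a small memory-mapped file and then reopens it. The file name `demo_data.bin` and the reduced shape are assumptions that keep the example quick to run.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_data.bin")  # hypothetical file name
    shape = (1000, 100)

    # mode='w+' creates the file on disk with the requested size
    writer = np.memmap(path, dtype="float32", mode="w+", shape=shape)
    writer[:] = np.arange(shape[0] * shape[1], dtype="float32").reshape(shape)
    writer.flush()
    del writer  # release the mapping before reopening

    # Reopen read-only and touch only a small slice
    reader = np.memmap(path, dtype="float32", mode="r", shape=shape)
    first_block = np.array(reader[0:5, 0:3])  # copy the slice into memory
    del reader

print(first_block)
```

The raw file stores no shape or dtype information, so both must be supplied again when reopening it.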


By: @DataScienceQ 🚀

#Python #NumPy #DataScience #CodingInterview #MachineLearning #ScientificComputing #DataAnalysis #Programming #TechJobs #DeveloperTips