ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications

📝 Summary:
Mixture-of-Experts (MoE) models improve the efficiency and performance of large AI models by dynamically routing each input to a subset of specialized sub-models (experts). This survey details MoE design, algorithms, theory, and applications across various machine learning fields.
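
The routing idea at the heart of MoE fits in a few lines: a learned gate scores the experts for each input and only the top-k are actually evaluated. A generic numpy sketch with made-up linear experts, not the design of any specific system from the survey:

```python
import numpy as np

def moe_forward(x, experts, gate_w, k=2):
    """Route one token through the top-k of n experts (illustrative top-k gating)."""
    logits = x @ gate_w                       # (n_experts,) gating scores
    top = np.argsort(logits)[-k:]             # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over the selected experts only
    # Combine the chosen experts' outputs, weighted by the gate.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [lambda x, W=rng.standard_normal((d, d)): x @ W for _ in range(n_experts)]
gate_w = rng.standard_normal((d, n_experts))
y = moe_forward(rng.standard_normal(d), experts, gate_w, k=2)
print(y.shape)  # (8,)
```

The efficiency win is that only k of the n expert forward passes run per token, while the parameter count scales with n.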

🔹 Publication Date: Published on Mar 10, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2503.07137
• PDF: https://arxiv.org/pdf/2503.07137
• Github: https://github.com/deepseek-ai/DeepEP

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#MixtureOfExperts #MoE #AI #MachineLearning #DeepLearning
Self Attention vs Cross Attention by hand ✍️
Resize the matrices yourself 👉 https://byhand.ai/aMisxP

Two attention mechanisms, side by side. Both project X into queries; both compute scores S = Kᵀ × Q and the output F = V × A, where A is S after column-wise softmax. The only difference is the source of K and V.

Self attention uses X for everything. Q, K, and V all come from projecting X. Each X token attends to every other X token. The score matrix S is square — 128 × 128.

Cross attention uses X for queries and a second sequence E for keys and values. Each X token attends to every E token instead. The score matrix S is rectangular — 64 × 128.

Notice what's shared and what's not:

X is the same in both — same 36 × 128 input.

Q and K share the 16 dimension — that's what makes the dot product Kᵀ × Q valid in either case.

V's output dimension is independent of the mechanism: both examples here happen to use 12. The choice doesn't depend on which mechanism you're using; it depends on what output dimension your downstream layer expects.
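
The shape bookkeeping above is easy to verify in code. A minimal numpy sketch using the post's dimensions (d_model = 36, shared Q/K dimension 16, V dimension 12, 128 X tokens, 64 E tokens), with tokens as columns and the 1/√d scaling omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_qk, d_v = 36, 16, 12
X = rng.standard_normal((d_model, 128))   # 128 X tokens as columns
E = rng.standard_normal((d_model, 64))    # 64 E tokens (cross attention only)

Wq = rng.standard_normal((d_qk, d_model))
Wk = rng.standard_normal((d_qk, d_model))
Wv = rng.standard_normal((d_v, d_model))

def attend(X_q, X_kv):
    Q, K, V = Wq @ X_q, Wk @ X_kv, Wv @ X_kv
    S = K.T @ Q                           # scores: (n_kv tokens, n_q tokens)
    A = np.exp(S - S.max(axis=0))
    A /= A.sum(axis=0)                    # softmax over keys, per query column
    return V @ A                          # F: (d_v, n_q tokens)

F_self = attend(X, X)    # self attention:  S was 128 x 128
F_cross = attend(X, E)   # cross attention: S was 64 x 128
print(F_self.shape, F_cross.shape)  # (12, 128) (12, 128)
```

The only thing that changes between the two calls is which sequence supplies K and V; the output shape is the same either way.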

https://t.iss.one/CodeProgrammer
Follow the Machine Learning with Python channel on WhatsApp: https://whatsapp.com/channel/0029VaC7Weq29753hpcggW2A
LLM Safety From Within: Detecting Harmful Content with Internal Representations

📝 Summary:
SIREN is a lightweight guard model that uses LLM internal layer features to detect harmful content, outperforming current models. It is more efficient, generalizes better, and requires significantly fewer parameters than existing guard models.
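
As a rough illustration of the general idea of probing internal representations (not SIREN's actual architecture, features, or training recipe), a guard can be as small as a logistic probe trained on frozen hidden-state features. Everything below, including the synthetic features and labels, is made up:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 500
# Pretend these are hidden-state features from an intermediate LLM layer.
H = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = (H @ w_true > 0).astype(float)        # toy "harmful" labels

# Train a tiny logistic-regression probe by gradient descent; the LLM stays frozen,
# so the only new parameters are the d probe weights.
w = np.zeros(d)
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(H @ w)))    # predicted harm probability
    w -= 0.1 * H.T @ (p - y) / n          # logistic-loss gradient step

acc = ((1.0 / (1.0 + np.exp(-(H @ w))) > 0.5) == (y == 1)).mean()
print(round(acc, 2))
```

The point of the sketch is the parameter count: a probe on internal features adds only a vector of weights, which is why this family of guards can be far lighter than a separate guard LLM.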

🔹 Publication Date: Published on Apr 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.18519
• PDF: https://arxiv.org/pdf/2604.18519
• Github: https://github.com/CSSLab/SIREN

🔹 Models citing this paper:
https://huggingface.co/UofTCSSLab/SIREN-Qwen3-0.6B
https://huggingface.co/UofTCSSLab/SIREN-Qwen3-4B
https://huggingface.co/UofTCSSLab/SIREN-Llama-3.2-1B

==================================

#LLMSafety #AIethics #HarmfulContent #DeepLearning #NLP
dWorldEval: Scalable Robotic Policy Evaluation via Discrete Diffusion World Model

📝 Summary:
dWorldEval proposes a scalable robotics policy evaluation method using a discrete diffusion world model. It unifies diverse modalities into a token space, employing a transformer and progress token for success detection. This approach significantly outperforms prior methods, enabling large-scale evaluation.

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22152
• PDF: https://arxiv.org/pdf/2604.22152

==================================

#Robotics #DiffusionModels #WorldModels #AI #MachineLearning
AgentSearchBench: A Benchmark for AI Agent Search in the Wild

📝 Summary:
AgentSearchBench is a new benchmark for finding suitable AI agents using execution-grounded performance signals from nearly 10,000 real-world agents. It shows that description-based similarity is insufficient, and lightweight behavioral signals significantly improve agent ranking.

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22436
• PDF: https://arxiv.org/pdf/2604.22436

==================================

#AI #AIAgents #Benchmarking #AgentSearch #MachineLearning
Learning Evidence Highlighting for Frozen LLMs

📝 Summary:
HiLight enhances long-context reasoning in large language models by training a lightweight emphasis actor to highlight key evidence without modifying the original input or solver, using reinforcement learning.

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22565
• PDF: https://arxiv.org/pdf/2604.22565

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond

📝 Summary:
World models are categorized into three capability levels and four law regimes to better understand and develop predictive environment models for AI agents across diverse domains.

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22748
• PDF: https://arxiv.org/pdf/2604.22748
• Project Page: https://agentic-world-modeling.xyz/
• Github: https://github.com/matrix-agent/awesome-agentic-world-modeling

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AgriIR: A Scalable Framework for Domain-Specific Knowledge Retrieval

📝 Summary:
AgriIR is a modular retrieval-augmented generation framework for agriculture. It uses configurable stages to provide accurate, trustworthy, and resource-efficient domain-specific information. This adaptable design promotes accessibility and accountability in AI for agriculture.

🔹 Publication Date: Published on Mar 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.16353
• PDF: https://arxiv.org/pdf/2604.16353
• Github: https://github.com/Shuvam-Banerji-Seal/AgriIR

==================================

#AI #Agriculture #RAG #KnowledgeRetrieval #NLP
DiffNR: Diffusion-Enhanced Neural Representation Optimization for Sparse-View 3D Tomographic Reconstruction

📝 Summary:
DiffNR enhances sparse-view CT reconstruction with neural representations by employing SliceFixer, a single-step diffusion model. It corrects artifacts via pseudo-reference volumes, offering 3D supervision for better accuracy and efficient optimization, with a 3.99 dB PSNR gain.
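
The reported improvement is in PSNR, a standard reconstruction metric worth having on hand. A small helper (not from the paper's code, with an invented toy volume) showing why halving the reconstruction error is worth about 6 dB:

```python
import numpy as np

def psnr(ref, recon, max_val=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((ref - recon) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

rng = np.random.default_rng(0)
vol = rng.random((16, 16, 16))                            # toy reference volume in [0, 1]
noisy = vol + 0.05 * rng.standard_normal(vol.shape)       # baseline reconstruction
better = vol + 0.025 * rng.standard_normal(vol.shape)     # error amplitude halved

gain = psnr(vol, better) - psnr(vol, noisy)
print(round(gain, 2))  # halving the error amplitude gives ~10*log10(4) = ~6.02 dB
```

On that scale, the paper's 3.99 dB gain corresponds to cutting the mean squared error by a factor of about 2.5.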

🔹 Publication Date: Published on Apr 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.21518
• PDF: https://arxiv.org/pdf/2604.21518
• Project Page: https://ooonesevennn.github.io/DiffNR/
• Github: https://github.com/ooonesevennn/DiffNR

==================================

#3DReconstruction #DiffusionModels #NeuralNetworks #CTReconstruction #DeepLearning
FlowAnchor: Stabilizing the Editing Signal for Inversion-Free Video Editing

📝 Summary:
FlowAnchor stabilizes inversion-free video editing by addressing signal instability in high-dimensional latent spaces. It uses spatial-aware attention refinement and adaptive magnitude modulation to ensure precise localization and sufficient editing strength, leading to faithful and coherent video edits.

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22586
• PDF: https://arxiv.org/pdf/2604.22586

==================================

#VideoEditing #DeepLearning #ComputerVision #GenerativeAI #AIResearch
Contexts are Never Long Enough: Structured Reasoning for Scalable Question Answering over Long Document Sets

📝 Summary:
SLIDERS tackles long-document QA by extracting information into a relational database and using SQL for structured reasoning. This avoids LLM context window issues and aggregation bottlenecks, significantly outperforming traditional methods on various benchmarks.
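
The extract-then-query pattern is easy to picture with a toy example. The schema and rows below are invented, not from the paper: once facts have been extracted into a relational store, aggregation over arbitrarily many documents becomes a single SQL statement instead of a long-context LLM call:

```python
import sqlite3

# Toy "extracted" facts: (company, year, revenue) pulled from many long reports.
rows = [("Acme", 2022, 10.0), ("Acme", 2023, 12.5),
        ("Globex", 2022, 7.0), ("Globex", 2023, 9.0)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (company TEXT, year INTEGER, revenue REAL)")
conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", rows)

# An aggregation that would strain a context window scales trivially in SQL.
total_2023, = conn.execute(
    "SELECT SUM(revenue) FROM facts WHERE year = 2023").fetchone()
print(total_2023)  # 21.5
```

Adding a thousand more reports changes nothing about the query, which is the scalability argument in a nutshell.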

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22294
• PDF: https://arxiv.org/pdf/2604.22294

==================================

#QuestionAnswering #NLP #AI #SQL #LongDocuments
Sessa: Selective State Space Attention

📝 Summary:
Sessa is a new decoder architecture that puts attention inside a recurrent feedback path. This allows it to model long contexts better than Transformers and state-space models, achieving power-law memory decay and flexible selective retrieval. It outperforms on long-context tasks.

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.18580
• PDF: https://arxiv.org/pdf/2604.18580
• Github: https://github.com/LibratioAI/sessa

==================================

#Sessa #DeepLearning #AttentionMechanisms #StateSpaceModels #LongContextAI
Building a Precise Video Language with Human-AI Oversight

📝 Summary:
Video-language models are enhanced through structured visual specifications and human-AI oversight frameworks that improve captioning accuracy and enable detailed video generation control.

🔹 Publication Date: Published on Apr 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.21718
• PDF: https://arxiv.org/pdf/2604.21718
• Project Page: https://linzhiqiu.github.io/papers/chai/
• Github: https://github.com/chancharikmitra/CHAI

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Video Analysis and Generation via a Semantic Progress Function

📝 Summary:
Researchers developed a Semantic Progress Function to analyze and correct non-linear semantic evolution in generated media. This function identifies uneven pacing, enabling a linearization procedure that re-times sequences for smoother, more coherent transitions at a constant semantic rate.
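
The linearization step can be sketched generically. This illustrates the idea of re-timing by inverting a monotone progress curve, not the paper's exact procedure; the progress values below are invented:

```python
import numpy as np

# Hypothetical cumulative semantic progress per frame (monotone, 0 -> 1),
# rising quickly at the start and stalling later: uneven pacing.
t = np.linspace(0.0, 1.0, 11)         # original frame times
progress = t ** 0.5                   # fast early, slow late

# Re-time: pick frame times whose progress values are evenly spaced,
# i.e. invert progress(t) by piecewise-linear interpolation.
target = np.linspace(0.0, 1.0, 11)    # constant semantic rate
new_t = np.interp(target, progress, t)

print(np.round(new_t, 3))             # early frames sampled sparsely, late ones densely
```

Sampling the original sequence at `new_t` instead of `t` yields frames whose semantic content advances at a roughly constant rate.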

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22554
• PDF: https://arxiv.org/pdf/2604.22554
• Project Page: https://sagipolaczek.github.io/semantic-progress-function/
• Github: https://github.com/SagiPolaczek/semantic-progress-function

==================================

#VideoAI #GenerativeAI #ComputerVision #SemanticAnalysis #AIResearch
Memanto: Typed Semantic Memory with Information-Theoretic Retrieval for Long-Horizon Agents

📝 Summary:
Memanto introduces a universal, typed semantic memory layer for AI agents that bypasses complex semantic graphs. It uses an information-theoretic search engine for fast, overhead-free retrieval. This system achieves state-of-the-art accuracy on benchmarks with a single query and no ingestion cost.

🔹 Publication Date: Published on Apr 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22085
• PDF: https://arxiv.org/pdf/2604.22085
• Project Page: https://memanto.ai/
• Github: https://github.com/moorcheh-ai/memanto-evaluation

Datasets citing this paper:
https://huggingface.co/datasets/moorcheh/memanto-longmem-results
https://huggingface.co/datasets/moorcheh/memanto-locomo-results

==================================

#AI #SemanticMemory #InformationRetrieval #AIAgents #MachineLearning
EmbodiedMidtrain: Bridging the Gap between Vision-Language Models and Vision-Language-Action Models via Mid-training

📝 Summary:
EmbodiedMidtrain addresses the gap between vision-language models and vision-language-action models by using a mid-training approach that selects VLA-aligned data to improve downstream robot manipulation.

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.20012
• PDF: https://arxiv.org/pdf/2604.20012
• Project Page: https://adu2021.github.io/blog/EmbodiedMidtrain/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
DiagramBank: A Large-scale Dataset of Diagram Design Exemplars with Paper Metadata for Retrieval-Augmented Generation

📝 Summary:
A large-scale dataset of schematic diagrams called DiagramBank is introduced for multimodal retrieval and exemplar-driven scientific figure generation, addressing the gap in automated publication-grade figure generation.

🔹 Publication Date: Published on Feb 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.20857
• PDF: https://arxiv.org/pdf/2604.20857
• Github: https://github.com/csml-rpi/DiagramBank

Datasets citing this paper:
https://huggingface.co/datasets/zhangt20/DiagramBank

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Emergent Strategic Reasoning Risks in AI: A Taxonomy-Driven Evaluation Framework

📝 Summary:
Large language models exhibit emergent strategic reasoning risks including deception and reward hacking, which are systematically evaluated through a taxonomy-driven agentic framework called ESRRSim.

🔹 Publication Date: Published on Apr 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22119
• PDF: https://arxiv.org/pdf/2604.22119

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis

📝 Summary:
DataPRM, a new environment-aware process reward model, enhances LLM reasoning in dynamic data analysis. It actively detects silent errors and distinguishes error types, achieving superior benchmark performance.

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24198
• PDF: https://arxiv.org/pdf/2604.24198

==================================

#LLM #RewardModeling #DataAnalysis #AIagents #MachineLearning
SketchVLM: Vision language models can annotate images to explain thoughts and guide users

📝 Summary:
SketchVLM is a training-free framework that enables vision-language models to generate editable SVG overlays for visual explanations, improving reasoning accuracy and annotation quality across multiple benchmarks.

🔹 Publication Date: Published on Apr 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22875
• PDF: https://arxiv.org/pdf/2604.22875
• Project Page: https://sketchvlm.github.io/

==================================

#SketchVLM #VisionLanguageModels #ComputerVision #AI #ImageAnnotation