ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications

📝 Summary:
Mixture-of-Experts (MoE) models improve the efficiency and performance of large AI models by dynamically routing each input to a subset of specialized sub-models (experts). This survey details MoE design, algorithms, theory, and applications across various machine learning fields.
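
The routing idea at the heart of MoE fits in a few lines: a learned gate scores the experts for each input and only the top-k are actually evaluated. A generic numpy sketch with made-up linear experts, not the design of any specific system from the survey:

```python
import numpy as np

def moe_forward(x, experts, gate_w, k=2):
    """Route one token through the top-k of n experts (illustrative top-k gating)."""
    logits = x @ gate_w                       # (n_experts,) gating scores
    top = np.argsort(logits)[-k:]             # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over the selected experts only
    # Combine the chosen experts' outputs, weighted by the gate.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [lambda x, W=rng.standard_normal((d, d)): x @ W for _ in range(n_experts)]
gate_w = rng.standard_normal((d, n_experts))
y = moe_forward(rng.standard_normal(d), experts, gate_w, k=2)
print(y.shape)  # (8,)
```

The efficiency win is that only k of the n expert forward passes run per token, while the parameter count scales with n.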

🔹 Publication Date: Published on Mar 10, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2503.07137
• PDF: https://arxiv.org/pdf/2503.07137
• Github: https://github.com/deepseek-ai/DeepEP

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#MixtureOfExperts #MoE #AI #MachineLearning #DeepLearning
Self Attention vs Cross Attention by hand ✍️
Resize the matrices yourself 👉 https://byhand.ai/aMisxP

Two attention mechanisms, side by side. Both project X into queries; both compute scores S = Kᵀ × Q and the output F = V × A, where A is S after column-wise softmax. The only difference is the source of K and V.

Self attention uses X for everything. Q, K, and V all come from projecting X. Each X token attends to every other X token. The score matrix S is square — 128 × 128.

Cross attention uses X for queries and a second sequence E for keys and values. Each X token attends to every E token instead. The score matrix S is rectangular — 64 × 128.

Notice what's shared and what's not:

X is the same in both — same 36 × 128 input.

Q and K share the 16 dimension — that's what makes the dot product Kᵀ × Q valid in either case.

V's output dimension is independent of the mechanism: both examples here happen to use 12. The choice doesn't depend on which mechanism you're using; it depends on what output dimension your downstream layer expects.
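
The shape bookkeeping above is easy to verify in code. A minimal numpy sketch using the post's dimensions (d_model = 36, shared Q/K dimension 16, V dimension 12, 128 X tokens, 64 E tokens), with tokens as columns and the 1/√d scaling omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_qk, d_v = 36, 16, 12
X = rng.standard_normal((d_model, 128))   # 128 X tokens as columns
E = rng.standard_normal((d_model, 64))    # 64 E tokens (cross attention only)

Wq = rng.standard_normal((d_qk, d_model))
Wk = rng.standard_normal((d_qk, d_model))
Wv = rng.standard_normal((d_v, d_model))

def attend(X_q, X_kv):
    Q, K, V = Wq @ X_q, Wk @ X_kv, Wv @ X_kv
    S = K.T @ Q                           # scores: (n_kv tokens, n_q tokens)
    A = np.exp(S - S.max(axis=0))
    A /= A.sum(axis=0)                    # softmax over keys, per query column
    return V @ A                          # F: (d_v, n_q tokens)

F_self = attend(X, X)    # self attention:  S was 128 x 128
F_cross = attend(X, E)   # cross attention: S was 64 x 128
print(F_self.shape, F_cross.shape)  # (12, 128) (12, 128)
```

The only thing that changes between the two calls is which sequence supplies K and V; the output shape is the same either way.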

https://t.iss.one/CodeProgrammer
Follow the Machine Learning with Python channel on WhatsApp: https://whatsapp.com/channel/0029VaC7Weq29753hpcggW2A
LLM Safety From Within: Detecting Harmful Content with Internal Representations

📝 Summary:
SIREN is a lightweight guard model that uses LLM internal layer features to detect harmful content, outperforming current models. It is more efficient, generalizes better, and requires significantly fewer parameters than existing guard models.
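
As a rough illustration of the general idea of probing internal representations (not SIREN's actual architecture, features, or training recipe), a guard can be as small as a logistic probe trained on frozen hidden-state features. Everything below, including the synthetic features and labels, is made up:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 500
# Pretend these are hidden-state features from an intermediate LLM layer.
H = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = (H @ w_true > 0).astype(float)        # toy "harmful" labels

# Train a tiny logistic-regression probe by gradient descent; the LLM stays frozen,
# so the only new parameters are the d probe weights.
w = np.zeros(d)
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(H @ w)))    # predicted harm probability
    w -= 0.1 * H.T @ (p - y) / n          # logistic-loss gradient step

acc = ((1.0 / (1.0 + np.exp(-(H @ w))) > 0.5) == (y == 1)).mean()
print(round(acc, 2))
```

The point of the sketch is the parameter count: a probe on internal features adds only a vector of weights, which is why this family of guards can be far lighter than a separate guard LLM.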

🔹 Publication Date: Published on Apr 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.18519
• PDF: https://arxiv.org/pdf/2604.18519
• Github: https://github.com/CSSLab/SIREN

🔹 Models citing this paper:
https://huggingface.co/UofTCSSLab/SIREN-Qwen3-0.6B
https://huggingface.co/UofTCSSLab/SIREN-Qwen3-4B
https://huggingface.co/UofTCSSLab/SIREN-Llama-3.2-1B

==================================

#LLMSafety #AIethics #HarmfulContent #DeepLearning #NLP
dWorldEval: Scalable Robotic Policy Evaluation via Discrete Diffusion World Model

📝 Summary:
dWorldEval proposes a scalable robotics policy evaluation method using a discrete diffusion world model. It unifies diverse modalities into a token space, employing a transformer and progress token for success detection. This approach significantly outperforms prior methods, enabling large-scale evaluation.

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22152
• PDF: https://arxiv.org/pdf/2604.22152

==================================

#Robotics #DiffusionModels #WorldModels #AI #MachineLearning
AgentSearchBench: A Benchmark for AI Agent Search in the Wild

📝 Summary:
AgentSearchBench is a new benchmark for finding suitable AI agents using execution-grounded performance signals from nearly 10,000 real-world agents. It shows that description-based similarity is insufficient, and lightweight behavioral signals significantly improve agent ranking.

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22436
• PDF: https://arxiv.org/pdf/2604.22436

==================================

#AI #AIAgents #Benchmarking #AgentSearch #MachineLearning
Learning Evidence Highlighting for Frozen LLMs

📝 Summary:
HiLight enhances long-context reasoning in large language models by training a lightweight emphasis actor to highlight key evidence without modifying the original input or solver, using reinforcement learning.

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22565
• PDF: https://arxiv.org/pdf/2604.22565

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond

📝 Summary:
World models are categorized into three capability levels and four law regimes to better understand and develop predictive environment models for AI agents across diverse domains.

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22748
• PDF: https://arxiv.org/pdf/2604.22748
• Project Page: https://agentic-world-modeling.xyz/
• Github: https://github.com/matrix-agent/awesome-agentic-world-modeling

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AgriIR: A Scalable Framework for Domain-Specific Knowledge Retrieval

📝 Summary:
AgriIR is a modular retrieval-augmented generation framework for agriculture. It uses configurable stages to provide accurate, trustworthy, and resource-efficient domain-specific information. This adaptable design promotes accessibility and accountability in AI for agriculture.

🔹 Publication Date: Published on Mar 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.16353
• PDF: https://arxiv.org/pdf/2604.16353
• Github: https://github.com/Shuvam-Banerji-Seal/AgriIR

==================================

#AI #Agriculture #RAG #KnowledgeRetrieval #NLP
DiffNR: Diffusion-Enhanced Neural Representation Optimization for Sparse-View 3D Tomographic Reconstruction

📝 Summary:
DiffNR enhances sparse-view CT reconstruction with neural representations by employing SliceFixer, a single-step diffusion model. It corrects artifacts via pseudo-reference volumes, offering 3D supervision for better accuracy and efficient optimization, with a 3.99 dB PSNR gain.
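
The reported improvement is in PSNR, a standard reconstruction metric worth having on hand. A small helper (not from the paper's code, with an invented toy volume) showing why halving the reconstruction error is worth about 6 dB:

```python
import numpy as np

def psnr(ref, recon, max_val=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((ref - recon) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

rng = np.random.default_rng(0)
vol = rng.random((16, 16, 16))                            # toy reference volume in [0, 1]
noisy = vol + 0.05 * rng.standard_normal(vol.shape)       # baseline reconstruction
better = vol + 0.025 * rng.standard_normal(vol.shape)     # error amplitude halved

gain = psnr(vol, better) - psnr(vol, noisy)
print(round(gain, 2))  # halving the error amplitude gives ~10*log10(4) = ~6.02 dB
```

On that scale, the paper's 3.99 dB gain corresponds to cutting the mean squared error by a factor of about 2.5.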

🔹 Publication Date: Published on Apr 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.21518
• PDF: https://arxiv.org/pdf/2604.21518
• Project Page: https://ooonesevennn.github.io/DiffNR/
• Github: https://github.com/ooonesevennn/DiffNR

==================================

#3DReconstruction #DiffusionModels #NeuralNetworks #CTReconstruction #DeepLearning
FlowAnchor: Stabilizing the Editing Signal for Inversion-Free Video Editing

📝 Summary:
FlowAnchor stabilizes inversion-free video editing by addressing signal instability in high-dimensional latent spaces. It uses spatial-aware attention refinement and adaptive magnitude modulation to ensure precise localization and sufficient editing strength, leading to faithful and coherent video edits.

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22586
• PDF: https://arxiv.org/pdf/2604.22586

==================================

#VideoEditing #DeepLearning #ComputerVision #GenerativeAI #AIResearch
Contexts are Never Long Enough: Structured Reasoning for Scalable Question Answering over Long Document Sets

📝 Summary:
SLIDERS tackles long-document QA by extracting information into a relational database and using SQL for structured reasoning. This avoids LLM context window issues and aggregation bottlenecks, significantly outperforming traditional methods on various benchmarks.
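
The extract-then-query pattern is easy to picture with a toy example. The schema and rows below are invented, not from the paper: once facts have been extracted into a relational store, aggregation over arbitrarily many documents becomes a single SQL statement instead of a long-context LLM call:

```python
import sqlite3

# Toy "extracted" facts: (company, year, revenue) pulled from many long reports.
rows = [("Acme", 2022, 10.0), ("Acme", 2023, 12.5),
        ("Globex", 2022, 7.0), ("Globex", 2023, 9.0)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (company TEXT, year INTEGER, revenue REAL)")
conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", rows)

# An aggregation that would strain a context window scales trivially in SQL.
total_2023, = conn.execute(
    "SELECT SUM(revenue) FROM facts WHERE year = 2023").fetchone()
print(total_2023)  # 21.5
```

Adding a thousand more reports changes nothing about the query, which is the scalability argument in a nutshell.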

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22294
• PDF: https://arxiv.org/pdf/2604.22294

==================================

#QuestionAnswering #NLP #AI #SQL #LongDocuments
Sessa: Selective State Space Attention

📝 Summary:
Sessa is a new decoder architecture that puts attention inside a recurrent feedback path. This allows it to model long contexts better than Transformers and state-space models, achieving power-law memory decay and flexible selective retrieval. It outperforms on long-context tasks.

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.18580
• PDF: https://arxiv.org/pdf/2604.18580
• Github: https://github.com/LibratioAI/sessa

==================================

#Sessa #DeepLearning #AttentionMechanisms #StateSpaceModels #LongContextAI
Building a Precise Video Language with Human-AI Oversight

📝 Summary:
Video-language models are enhanced through structured visual specifications and human-AI oversight frameworks that improve captioning accuracy and enable detailed video generation control.

🔹 Publication Date: Published on Apr 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.21718
• PDF: https://arxiv.org/pdf/2604.21718
• Project Page: https://linzhiqiu.github.io/papers/chai/
• Github: https://github.com/chancharikmitra/CHAI

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Video Analysis and Generation via a Semantic Progress Function

📝 Summary:
Researchers developed a Semantic Progress Function to analyze and correct non-linear semantic evolution in generated media. This function identifies uneven pacing, enabling a linearization procedure that re-times sequences for smoother, more coherent transitions at a constant semantic rate.
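
The linearization step can be sketched generically. This illustrates the idea of re-timing by inverting a monotone progress curve, not the paper's exact procedure; the progress values below are invented:

```python
import numpy as np

# Hypothetical cumulative semantic progress per frame (monotone, 0 -> 1),
# rising quickly at the start and stalling later: uneven pacing.
t = np.linspace(0.0, 1.0, 11)         # original frame times
progress = t ** 0.5                   # fast early, slow late

# Re-time: pick frame times whose progress values are evenly spaced,
# i.e. invert progress(t) by piecewise-linear interpolation.
target = np.linspace(0.0, 1.0, 11)    # constant semantic rate
new_t = np.interp(target, progress, t)

print(np.round(new_t, 3))             # early frames sampled sparsely, late ones densely
```

Sampling the original sequence at `new_t` instead of `t` yields frames whose semantic content advances at a roughly constant rate.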

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22554
• PDF: https://arxiv.org/pdf/2604.22554
• Project Page: https://sagipolaczek.github.io/semantic-progress-function/
• Github: https://github.com/SagiPolaczek/semantic-progress-function

==================================

#VideoAI #GenerativeAI #ComputerVision #SemanticAnalysis #AIResearch
Memanto: Typed Semantic Memory with Information-Theoretic Retrieval for Long-Horizon Agents

📝 Summary:
Memanto introduces a universal, typed semantic memory layer for AI agents that bypasses complex semantic graphs. It uses an information-theoretic search engine for fast, overhead-free retrieval. This system achieves state-of-the-art accuracy on benchmarks with a single query and no ingestion cost.

🔹 Publication Date: Published on Apr 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22085
• PDF: https://arxiv.org/pdf/2604.22085
• Project Page: https://memanto.ai/
• Github: https://github.com/moorcheh-ai/memanto-evaluation

Datasets citing this paper:
https://huggingface.co/datasets/moorcheh/memanto-longmem-results
https://huggingface.co/datasets/moorcheh/memanto-locomo-results

==================================

#AI #SemanticMemory #InformationRetrieval #AIAgents #MachineLearning
EmbodiedMidtrain: Bridging the Gap between Vision-Language Models and Vision-Language-Action Models via Mid-training

📝 Summary:
EmbodiedMidtrain addresses the gap between vision-language models and vision-language-action models by using a mid-training approach that selects VLA-aligned data to improve downstream robot manipulation.

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.20012
• PDF: https://arxiv.org/pdf/2604.20012
• Project Page: https://adu2021.github.io/blog/EmbodiedMidtrain/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
DiagramBank: A Large-scale Dataset of Diagram Design Exemplars with Paper Metadata for Retrieval-Augmented Generation

📝 Summary:
A large-scale dataset of schematic diagrams called DiagramBank is introduced for multimodal retrieval and exemplar-driven scientific figure generation, addressing the gap in automated publication-grade figure generation.

🔹 Publication Date: Published on Feb 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.20857
• PDF: https://arxiv.org/pdf/2604.20857
• Github: https://github.com/csml-rpi/DiagramBank

Datasets citing this paper:
https://huggingface.co/datasets/zhangt20/DiagramBank

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Emergent Strategic Reasoning Risks in AI: A Taxonomy-Driven Evaluation Framework

📝 Summary:
Large language models exhibit emergent strategic reasoning risks including deception and reward hacking, which are systematically evaluated through a taxonomy-driven agentic framework called ESRRSim.

🔹 Publication Date: Published on Apr 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22119
• PDF: https://arxiv.org/pdf/2604.22119

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis

📝 Summary:
DataPRM, a new environment-aware process reward model, enhances LLM reasoning in dynamic data analysis. It actively detects silent errors and distinguishes error types, achieving superior benchmark performance.

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24198
• PDF: https://arxiv.org/pdf/2604.24198

==================================

#LLM #RewardModeling #DataAnalysis #AIagents #MachineLearning
SketchVLM: Vision language models can annotate images to explain thoughts and guide users

📝 Summary:
SketchVLM is a training-free framework that enables vision-language models to generate editable SVG overlays for visual explanations, improving reasoning accuracy and annotation quality across multiple benchmarks.

🔹 Publication Date: Published on Apr 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22875
• PDF: https://arxiv.org/pdf/2604.22875
• Project Page: https://sketchvlm.github.io/

==================================

#SketchVLM #VisionLanguageModels #ComputerVision #AI #ImageAnnotation