ML Research Hub
32.3K subscribers
6.81K photos
482 videos
24 files
7.43K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond

📝 Summary:
World models are categorized into three capability levels and four law regimes to better understand and develop predictive environment models for AI agents across diverse domains.

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22748
• PDF: https://arxiv.org/pdf/2604.22748
• Project Page: https://agentic-world-modeling.xyz/
• Github: https://github.com/matrix-agent/awesome-agentic-world-modeling

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
AgriIR: A Scalable Framework for Domain-Specific Knowledge Retrieval

📝 Summary:
AgriIR is a modular retrieval-augmented generation framework for agriculture. It uses configurable stages to provide accurate, trustworthy, and resource-efficient domain-specific information. This adaptable design promotes accessibility and accountability in AI for agriculture.

🔹 Publication Date: Published on Mar 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.16353
• PDF: https://arxiv.org/pdf/2604.16353
• Github: https://github.com/Shuvam-Banerji-Seal/AgriIR

#AI #Agriculture #RAG #KnowledgeRetrieval #NLP
DiffNR: Diffusion-Enhanced Neural Representation Optimization for Sparse-View 3D Tomographic Reconstruction

📝 Summary:
DiffNR enhances sparse-view CT reconstruction with neural representations by employing SliceFixer, a single-step diffusion model. It corrects artifacts via pseudo-reference volumes, offering 3D supervision for better accuracy and efficient optimization, with a 3.99 dB PSNR gain.
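
To put the reported 3.99 dB figure in perspective, here is a small sketch using the standard PSNR definition, 10 * log10(peak^2 / MSE). It assumes both measurements share the same peak value; the function names are illustrative, and the summary does not say which baseline the gain is measured against.

```python
import math

def psnr(mse, peak=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(peak ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a PSNR gain of gain_db.

    Assumes the peak value is the same before and after.
    """
    return 10.0 ** (gain_db / 10.0)

ratio = mse_ratio_for_gain(3.99)
print(f"A 3.99 dB gain means the MSE is about {ratio:.2f}x smaller")
```

Under that assumption, a 3.99 dB gain corresponds to a mean squared error roughly 2.5 times lower.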

🔹 Publication Date: Published on Apr 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.21518
• PDF: https://arxiv.org/pdf/2604.21518
• Project Page: https://ooonesevennn.github.io/DiffNR/
• Github: https://github.com/ooonesevennn/DiffNR

#3DReconstruction #DiffusionModels #NeuralNetworks #CTReconstruction #DeepLearning
FlowAnchor: Stabilizing the Editing Signal for Inversion-Free Video Editing

📝 Summary:
FlowAnchor stabilizes inversion-free video editing by addressing signal instability in high-dimensional latent spaces. It uses spatial-aware attention refinement and adaptive magnitude modulation to ensure precise localization and sufficient editing strength, leading to faithful and coherent vide...

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22586
• PDF: https://arxiv.org/pdf/2604.22586

#VideoEditing #DeepLearning #ComputerVision #GenerativeAI #AIResearch
Contexts are Never Long Enough: Structured Reasoning for Scalable Question Answering over Long Document Sets

📝 Summary:
SLIDERS tackles long-document QA by extracting information into a relational database and using SQL for structured reasoning. This avoids LLM context window issues and aggregation bottlenecks, significantly outperforming traditional methods on various benchmarks.
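
The extract-then-query idea can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the SLIDERS pipeline: the rows below are hard-coded stand-ins for what an LLM extractor would produce, and the table schema, file names, and values are all hypothetical.

```python
import sqlite3

# Hypothetical facts standing in for the output of an LLM extraction step.
# Schema and values are illustrative only, not taken from the paper.
EXTRACTED_ROWS = [
    ("report_2021.pdf", "Acme", 2021, 120.0),
    ("report_2022.pdf", "Acme", 2022, 150.0),
    ("report_2022.pdf", "Globex", 2022, 90.0),
    ("report_2023.pdf", "Acme", 2023, 180.0),
]

def build_database(rows):
    """Load extracted facts into an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revenue "
        "(source TEXT, company TEXT, year INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO revenue VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn

def total_revenue(conn, company):
    """Aggregate across all documents with SQL instead of a long prompt."""
    cur = conn.execute(
        "SELECT SUM(amount) FROM revenue WHERE company = ?", (company,)
    )
    return cur.fetchone()[0]

conn = build_database(EXTRACTED_ROWS)
print(total_revenue(conn, "Acme"))  # 450.0
```

Once the facts are in a table, the aggregation runs in the database, so the number of source documents no longer has to fit in a model's context window.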

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22294
• PDF: https://arxiv.org/pdf/2604.22294

#QuestionAnswering #NLP #AI #SQL #LongDocuments
Sessa: Selective State Space Attention

📝 Summary:
Sessa is a new decoder architecture that puts attention inside a recurrent feedback path. This allows it to model long contexts better than Transformers and state-space models, achieving power-law memory decay and flexible selective retrieval. It outperforms these baselines on long-context tasks.

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.18580
• PDF: https://arxiv.org/pdf/2604.18580
• Github: https://github.com/LibratioAI/sessa

#Sessa #DeepLearning #AttentionMechanisms #StateSpaceModels #LongContextAI
Building a Precise Video Language with Human-AI Oversight

📝 Summary:
Video-language models are enhanced through structured visual specifications and human-AI oversight frameworks that improve captioning accuracy and enable detailed video generation control.

🔹 Publication Date: Published on Apr 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.21718
• PDF: https://arxiv.org/pdf/2604.21718
• Project Page: https://linzhiqiu.github.io/papers/chai/
• Github: https://github.com/chancharikmitra/CHAI

#AI #DataScience #MachineLearning #HuggingFace #Research
Video Analysis and Generation via a Semantic Progress Function

📝 Summary:
Researchers developed a Semantic Progress Function to analyze and correct non-linear semantic evolution in generated media. This function identifies uneven pacing, enabling a linearization procedure that re-times sequences for smoother, more coherent transitions at a constant semantic rate.
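
One way to picture the re-timing step is the sketch below. It is an assumption-laden illustration, not the paper's method: progress is approximated here as cumulative Euclidean distance between per-frame embeddings, normalized to [0, 1], and linearization picks the nearest source frame for each evenly spaced progress target. The function names and the toy 1-D embeddings are hypothetical, and at least one frame is assumed.

```python
import math
from bisect import bisect_left

def semantic_progress(embeddings):
    """Cumulative embedding distance per frame, normalized to [0, 1]."""
    steps = [math.dist(a, b) for a, b in zip(embeddings, embeddings[1:])]
    total = sum(steps)
    n = len(embeddings)
    if total == 0:
        # No semantic change at all: fall back to uniform progress.
        return [i / (n - 1) if n > 1 else 0.0 for i in range(n)]
    progress, acc = [0.0], 0.0
    for step in steps:
        acc += step
        progress.append(acc / total)
    return progress

def linearize(progress, num_out):
    """Pick source frame indices so progress advances at a constant rate."""
    indices = []
    for k in range(num_out):
        target = k / (num_out - 1) if num_out > 1 else 0.0
        i = min(bisect_left(progress, target), len(progress) - 1)
        # Step back when the previous frame is at least as close to the target.
        if i > 0 and target - progress[i - 1] <= progress[i] - target:
            i -= 1
        indices.append(i)
    return indices

# Toy 1-D embeddings: slow change at first, then a sudden jump.
frames = [[0], [1], [2], [10], [40]]
progress = semantic_progress(frames)
print(progress)
print(linearize(progress, 5))
```

In this toy sequence most of the semantic change happens in the last two frames, so the re-timed output skips the near-identical early frames and holds the later ones longer. A real system would likely interpolate or regenerate frames rather than repeat them.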

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22554
• PDF: https://arxiv.org/pdf/2604.22554
• Project Page: https://sagipolaczek.github.io/semantic-progress-function/
• Github: https://github.com/SagiPolaczek/semantic-progress-function

#VideoAI #GenerativeAI #ComputerVision #SemanticAnalysis #AIResearch
Memanto: Typed Semantic Memory with Information-Theoretic Retrieval for Long-Horizon Agents

📝 Summary:
Memanto introduces a universal, typed semantic memory layer for AI agents that bypasses complex semantic graphs. It uses an information-theoretic search engine for fast, overhead-free retrieval. This system achieves state-of-the-art accuracy on benchmarks with a single query and no ingestion cost.

🔹 Publication Date: Published on Apr 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22085
• PDF: https://arxiv.org/pdf/2604.22085
• Project Page: https://memanto.ai/
• Github: https://github.com/moorcheh-ai/memanto-evaluation

Datasets citing this paper:
https://huggingface.co/datasets/moorcheh/memanto-longmem-results
https://huggingface.co/datasets/moorcheh/memanto-locomo-results

#AI #SemanticMemory #InformationRetrieval #AIAgents #MachineLearning
EmbodiedMidtrain: Bridging the Gap between Vision-Language Models and Vision-Language-Action Models via Mid-training

📝 Summary:
EmbodiedMidtrain addresses the gap between vision-language models and vision-language-action models by using a mid-training approach that selects VLA-aligned data to improve downstream robot manipulat...

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.20012
• PDF: https://arxiv.org/pdf/2604.20012
• Project Page: https://adu2021.github.io/blog/EmbodiedMidtrain/

#AI #DataScience #MachineLearning #HuggingFace #Research
DiagramBank: A Large-scale Dataset of Diagram Design Exemplars with Paper Metadata for Retrieval-Augmented Generation

📝 Summary:
A large-scale dataset of schematic diagrams called DiagramBank is introduced for multimodal retrieval and exemplar-driven scientific figure generation, addressing the gap in automated publication-grad...

🔹 Publication Date: Published on Feb 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.20857
• PDF: https://arxiv.org/pdf/2604.20857
• Github: https://github.com/csml-rpi/DiagramBank

Datasets citing this paper:
https://huggingface.co/datasets/zhangt20/DiagramBank

#AI #DataScience #MachineLearning #HuggingFace #Research
Emergent Strategic Reasoning Risks in AI: A Taxonomy-Driven Evaluation Framework

📝 Summary:
Large language models exhibit emergent strategic reasoning risks including deception and reward hacking, which are systematically evaluated through a taxonomy-driven agentic framework called ESRRSim t...

🔹 Publication Date: Published on Apr 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22119
• PDF: https://arxiv.org/pdf/2604.22119

#AI #DataScience #MachineLearning #HuggingFace #Research
Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis

📝 Summary:
DataPRM, a new environment-aware process reward model, enhances LLM reasoning in dynamic data analysis. It actively detects silent errors and distinguishes error types, achieving superior benchmark performance.

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24198
• PDF: https://arxiv.org/pdf/2604.24198

#LLM #RewardModeling #DataAnalysis #AIagents #MachineLearning
SketchVLM: Vision language models can annotate images to explain thoughts and guide users

📝 Summary:
SketchVLM is a training-free framework that enables vision-language models to generate editable SVG overlays for visual explanations, improving reasoning accuracy and annotation quality across multipl...

🔹 Publication Date: Published on Apr 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22875
• PDF: https://arxiv.org/pdf/2604.22875
• Project Page: https://sketchvlm.github.io/

#SketchVLM #VisionLanguageModels #ComputerVision #AI #ImageAnnotation
World-R1: Reinforcing 3D Constraints for Text-to-Video Generation

📝 Summary:
The World-R1 framework improves video generation by incorporating 3D constraints through reinforcement learning and specialized text datasets while maintaining visual quality and scalability.

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24764
• PDF: https://arxiv.org/pdf/2604.24764
• Project Page: https://aka.ms/world-r1
• Github: https://github.com/microsoft/World-R1

#AI #DataScience #MachineLearning #HuggingFace #Research
ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents

📝 Summary:
ClawMark is a benchmark for evaluating language-model agents in multi-day collaborative workflows with evolving environmental states across multiple service domains.

🔹 Publication Date: Published on Apr 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.23781
• PDF: https://arxiv.org/pdf/2604.23781
• Github: https://github.com/evolvent-ai/ClawMark

#AI #DataScience #MachineLearning #HuggingFace #Research
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

📝 Summary:
Tuna-2 is a unified multimodal model that performs visual understanding and generation directly from pixel embeddings without pretrained vision encoders, achieving state-of-the-art performance in mult...

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24763
• PDF: https://arxiv.org/pdf/2604.24763
• Project Page: https://tuna-ai.org/tuna-2/

#AI #DataScience #MachineLearning #HuggingFace #Research
Stabilizing Efficient Reasoning with Step-Level Advantage Selection

📝 Summary:
Short-context post-training induces reasoning compression but causes instability; Step-level Advantage Selection addresses this by selectively adjusting reasoning steps based on confidence and verific...

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24003
• PDF: https://arxiv.org/pdf/2604.24003
• Github: https://github.com/HanNight/SAS

#AI #DataScience #MachineLearning #HuggingFace #Research
Zero-to-CAD: Agentic Synthesis of Interpretable CAD Programs at Million-Scale Without Real Data

📝 Summary:
A scalable framework synthesizes executable CAD construction sequences by framing the process as an agentic search problem using large language models within a feedback-driven CAD environment.

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24479
• PDF: https://arxiv.org/pdf/2604.24479

#AI #DataScience #MachineLearning #HuggingFace #Research
ProEval: Proactive Failure Discovery and Efficient Performance Estimation for Generative AI Evaluation

📝 Summary:
ProEval uses transfer learning with pre-trained Gaussian Processes and Bayesian quadrature to efficiently evaluate generative AI models by identifying failure cases with significantly fewer samples th...

🔹 Publication Date: Published on Apr 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.23099
• PDF: https://arxiv.org/pdf/2604.23099
• Github: https://github.com/google-deepmind/proeval

#AI #DataScience #MachineLearning #HuggingFace #Research
Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing

📝 Summary:
Transformer language models can reduce KV cache memory requirements through random cross-layer attention during training, enabling efficient depth-wise cache sharing without performance loss.

🔹 Publication Date: Published on Apr 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22782
• PDF: https://arxiv.org/pdf/2604.22782

#AI #DataScience #MachineLearning #HuggingFace #Research