ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
DriveDreamer-Policy: A Geometry-Grounded World-Action Model for Unified Generation and Planning

📝 Summary:
DriveDreamer-Policy is a unified driving world-action model. It integrates depth, future video, and motion planning using geometry-aware world representation learning. This improves imagined futures and driving actions, achieving strong performance on navigation benchmarks.

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01765
• PDF: https://arxiv.org/pdf/2604.01765
• Project Page: https://drivedreamer-policy.github.io/
• Github: https://github.com/youngzhou1999/DriveDreamer-Policy

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AutonomousDriving #MotionPlanning #WorldModels #DeepLearning #ComputerVision
SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing

📝 Summary:
This paper presents SpatialEdit-Bench, a new benchmark and dataset for fine-grained image spatial editing. It introduces SpatialEdit-16B, a model that substantially outperforms prior methods on spatial manipulation, offering precise control over object layout and camera viewpoints.

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04911
• PDF: https://arxiv.org/pdf/2604.04911
• Project Page: https://github.com/EasonXiao-888/SpatialEdit
• Github: https://github.com/EasonXiao-888/SpatialEdit

==================================

#ImageEditing #ComputerVision #DeepLearning #AI #Benchmark
AURA: Always-On Understanding and Real-Time Assistance via Video Streams

📝 Summary:
AURA is an end-to-end streaming visual interaction framework for continuous video understanding. It enables real-time question answering and proactive responses, improving on current VideoLLMs through integrated context management and optimized deployment.

🔹 Publication Date: Published on Apr 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04184
• PDF: https://arxiv.org/pdf/2604.04184
• Project Page: https://aurateam2026.github.io
• Github: https://github.com/aurateam2026/AURA

🔹 Models citing this paper:
https://huggingface.co/aurateam/AURA

==================================

#VideoUnderstanding #RealTimeAI #VideoLLM #ComputerVision #DeepLearning
ClawArena: Benchmarking AI Agents in Evolving Information Environments

📝 Summary:
ClawArena evaluates AI agents' ability to maintain accurate beliefs in dynamic, multi-source information environments through diverse professional scenarios and evaluation methods.

🔹 Publication Date: Published on Apr 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04202
• PDF: https://arxiv.org/pdf/2604.04202
• Github: https://github.com/aiming-lab/ClawArena

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Less Detail, Better Answers: Degradation-Driven Prompting for VQA

📝 Summary:
Visual question answering performance is enhanced by strategically reducing image fidelity to focus models on essential structural information rather than distracting details.

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04838
• PDF: https://arxiv.org/pdf/2604.04838
• Project Page: https://hhx-jpg.github.io/ddp/
• Github: https://github.com/ziplab/DDP

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
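
As a rough illustration of the idea in the summary above (lowering image fidelity so that coarse structure dominates), the sketch below block-averages a grayscale image stored as a list of rows. The function name `downsample` and the `factor` parameter are hypothetical. The paper's actual degradation operators are not described in this post, so this shows only one simple kind of fidelity reduction.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def downsample(image: Image, factor: int) -> Image:
    """Reduce fidelity by averaging non-overlapping factor x factor blocks.

    Hypothetical helper for illustration; not the paper's method.
    Rows and columns that do not fill a complete block are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    height = len(image) // factor
    width = (len(image[0]) // factor) if image else 0
    result: Image = []
    for by in range(height):
        row = []
        for bx in range(width):
            # Collect every pixel that falls inside the current block.
            block = [
                image[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


if __name__ == "__main__":
    # Top half: fine checkerboard detail. Bottom half: two flat regions.
    sample = [
        [0, 255, 0, 255],
        [255, 0, 255, 0],
        [10, 10, 200, 200],
        [10, 10, 200, 200],
    ]
    print(downsample(sample, 2))
```

In the sample image the checkerboard detail in the top half collapses to a uniform mid-gray, while the two flat regions in the bottom half keep their values, which is the kind of "detail removed, structure kept" effect the summary refers to.
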
Vero: An Open RL Recipe for General Visual Reasoning

📝 Summary:
Vero is an open vision-language model family that achieves state-of-the-art visual reasoning performance through scaled reinforcement learning data across diverse tasks, demonstrating that broad data ...

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04917
• PDF: https://arxiv.org/pdf/2604.04917
• Project Page: https://vero-reasoning.github.io/

==================================

#VisualReasoning #ReinforcementLearning #VisionLanguageModels #AIResearch #DeepLearning
Memory Intelligence Agent

📝 Summary:
Memory Intelligence Agent framework integrates non-parametric and parametric memory systems with reinforcement learning to enable efficient reasoning and autonomous evolution in open-world environment...

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04503
• PDF: https://arxiv.org/pdf/2604.04503
• Github: https://github.com/ECNU-SII/MIA

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression

📝 Summary:
To overcome LLM KV cache bottlenecks, TriAttention leverages stable pre-RoPE Q/K vector concentration and a trigonometric series to accurately estimate key importance. It matches full attention accuracy with 10.7x memory reduction or 2.5x higher throughput.

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04921
• PDF: https://arxiv.org/pdf/2604.04921
• Project Page: https://weianmao.github.io/tri-attention-project-page/
• Github: https://github.com/WeianMao/triattention

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
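
To put the 10.7x memory figure from the summary above in context, the sketch below computes the size of a standard transformer KV cache (keys and values for every layer, KV head, and token) and divides it by the quoted factor. The model shape is hypothetical and not taken from the paper. Only the 10.7 factor comes from the post, and the code does not implement TriAttention's key-importance estimate.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Bytes needed to store keys and values for one sequence.

    The leading factor of 2 accounts for keys plus values;
    bytes_per_elem=2 corresponds to fp16/bf16 storage.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    full = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=32_768)
    compressed = full / 10.7  # reduction factor quoted in the summary
    print(f"full cache:      {full / 2**30:.2f} GiB")
    print(f"after 10.7x cut: {compressed / 2**30:.2f} GiB")
```

For this assumed shape the full cache is 4 GiB per 32K-token sequence, so a 10.7x reduction would bring it to roughly 0.37 GiB.
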
MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

📝 Summary:
Training data engineering and optimized strategies improve document parsing performance without architectural changes, achieving state-of-the-art results on OmniDocBench v1.6.

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04771
• PDF: https://arxiv.org/pdf/2604.04771

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
LightThinker++: From Reasoning Compression to Memory Management

📝 Summary:
LightThinker and LightThinker++ enable efficient large language model reasoning through dynamic compression and adaptive memory management, significantly reducing computational overhead while maintain...

🔹 Publication Date: Published on Apr 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.03679
• PDF: https://arxiv.org/pdf/2604.03679

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
SkillX: Automatically Constructing Skill Knowledge Bases for Agents

📝 Summary:
SkillX is an automated framework that creates reusable skill libraries for LLM agents through hierarchical skill design, iterative refinement, and exploratory expansion to improve generalization and e...

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04804
• PDF: https://arxiv.org/pdf/2604.04804
• Github: https://github.com/zjunlp/SkillX

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
FileGram: Grounding Agent Personalization in File-System Behavioral Traces

📝 Summary:
FileGram is a framework for personalized AI agents that uses file-system behavioral traces to enhance memory systems and agent personalization, featuring a data engine, diagnostic benchmark, and memor...

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04901
• PDF: https://arxiv.org/pdf/2604.04901
• Project Page: https://filegram.choiszt.com/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

📝 Summary:
OpenWorldLib presents a standardized framework for advanced world models. It defines a world model as a perception-centered system with interaction and long-term memory for understanding and predicting complex worlds. This unified framework enables efficient model reuse and collaborative inferenc...

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04707
• PDF: https://arxiv.org/pdf/2604.04707
• Project Page: https://wcny4qa9krto.feishu.cn/wiki/XtPJwf5XQipP7RkeVv0ckyWlnNd

==================================

#WorldModels #AI #MachineLearning #DeepLearning #AIFrameworks
Can LLMs Learn to Reason Robustly under Noisy Supervision?

📝 Summary:
Reinforcement Learning with Verifiable Rewards faces challenges with noisy labels, but a proposed method called Online Label Refinement addresses this by progressively correcting labels based on polic...

🔹 Publication Date: Published on Apr 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.03993
• PDF: https://arxiv.org/pdf/2604.03993
• Github: https://github.com/ShenzhiYang2000/OLR

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
HDP: A Lightweight Cryptographic Protocol for Human Delegation Provenance in Agentic AI Systems

📝 Summary:
Agentic AI systems lack verifiable human authorization for delegated tasks. HDP is a lightweight cryptographic protocol that records and verifies the full human delegation provenance using tokens, allowing offline checks.

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04522
• PDF: https://arxiv.org/pdf/2604.04522

🔹 Spaces citing this paper:
https://huggingface.co/spaces/helixar-ai/hdp-physical-demo

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
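
The summary above describes tokens that record a delegation chain and can be checked offline. The sketch below shows one generic way such a chain could work: each token names an issuer, a delegate, and a scope, links to its parent by hash, and carries an authentication tag. Everything here is an assumption made for illustration: the field names, the `issue` and `verify_chain` helpers, and the use of HMAC with a shared key. The post does not describe HDP's real token format or primitives, and a real protocol would more plausibly use public-key signatures so that verifiers never hold the signing key.

```python
import hashlib
import hmac
import json
from typing import Dict, List, Optional

Token = Dict[str, str]


def _payload(token: Token) -> bytes:
    """Canonical bytes of every field except the tag."""
    body = {k: v for k, v in token.items() if k != "tag"}
    return json.dumps(body, sort_keys=True).encode()


def _digest(token: Optional[Token]) -> str:
    """SHA-256 over a token's payload and tag; empty string for no parent."""
    if token is None:
        return ""
    return hashlib.sha256(_payload(token) + token["tag"].encode()).hexdigest()


def issue(key: bytes, issuer: str, delegate: str, scope: str,
          parent: Optional[Token] = None) -> Token:
    """Create a token linked to its parent (hypothetical format)."""
    token = {"issuer": issuer, "delegate": delegate,
             "scope": scope, "parent": _digest(parent)}
    token["tag"] = hmac.new(key, _payload(token), hashlib.sha256).hexdigest()
    return token


def verify_chain(key: bytes, chain: List[Token]) -> bool:
    """Check tags, parent links, and issuer continuity with no network access."""
    previous: Optional[Token] = None
    for token in chain:
        expected_tag = hmac.new(key, _payload(token), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected_tag, token["tag"]):
            return False  # token was altered or signed with another key
        if token["parent"] != _digest(previous):
            return False  # chain link is broken
        if previous is not None and token["issuer"] != previous["delegate"]:
            return False  # only the previous delegate may delegate further
        previous = token
    return True


if __name__ == "__main__":
    key = b"demo-key"  # placeholder secret for the example
    root = issue(key, "alice", "agent-1", "read:calendar")
    child = issue(key, "agent-1", "agent-2", "read:calendar", parent=root)
    print(verify_chain(key, [root, child]))
    print(verify_chain(key, [root, dict(child, scope="write:all")]))
```

The first call checks an intact two-step chain that starts with a human principal. The second checks the same chain after the delegated scope has been edited, which invalidates the tag.
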
Self-Execution Simulation Improves Coding Models

📝 Summary:
This work trains code LLMs to simulate program execution step-by-step using fine-tuning and reinforcement learning. This enables self-verification and iterative self-fixing, significantly improving competitive programming performance and outperforming standard reasoning methods.

🔹 Publication Date: Published on Mar 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.03253
• PDF: https://arxiv.org/pdf/2604.03253

==================================

#CodeLLMs #AI #ReinforcementLearning #DeepLearning #CompetitiveProgramming
AvatarPointillist: AutoRegressive 4D Gaussian Avatarization

📝 Summary:
AvatarPointillist creates dynamic 4D Gaussian avatars from a single image using an autoregressive Transformer. It builds point clouds with adaptive density and binding info for realistic animation, producing high-quality, controllable results.

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04787
• PDF: https://arxiv.org/pdf/2604.04787
• Project Page: https://kumapowerliu.github.io/AvatarPointillist/
• Github: https://github.com/KumapowerLIU/AvatarPointillist

==================================

#AI #ComputerVision #3DAvatars #GenerativeAI #MachineLearning
Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw

📝 Summary:
A real-world safety analysis of the personal AI agent OpenClaw reveals significant vulnerabilities due to its broad system access. Attacks targeting its Capability, Identity, or Knowledge (CIK) dimensions drastically increase success rates, and current defenses are insufficient, indicating inherent...

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04759
• PDF: https://arxiv.org/pdf/2604.04759
• Project Page: https://ucsc-vlaa.github.io/CIK-Bench/
• Github: https://github.com/UCSC-VLAA/CIK-Bench

==================================

#AISafety #Cybersecurity #AIAgents #Vulnerability #AIsecurity
Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing

📝 Summary:
SRPO unifies GRPO and SDPO in reinforcement learning by routing correct samples to GRPO's reward-aligned reinforcement and failed samples to SDPO's targeted logit-level correction. This novel approach achieves superior stability, rapid improvement, and better performance than either baseline.

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.02288
• PDF: https://arxiv.org/pdf/2604.02288

==================================

#ReinforcementLearning #PolicyOptimization #SampleRouting #MachineLearning #AIResearch
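
The routing step in the summary above can be sketched in a few lines: samples from one group are split by correctness, and the standard GRPO-style advantage (reward standardized within the group) is computed for the reinforcement branch. The function names, the reward threshold, and the binary rewards are assumptions for illustration. The SDPO logit-level correction is not implemented here, since the post does not describe it, and this is not the paper's code.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def route_samples(rewards: List[float],
                  threshold: float = 0.5) -> Tuple[List[int], List[int]]:
    """Split sample indices into (correct, failed) by a reward threshold."""
    correct = [i for i, r in enumerate(rewards) if r >= threshold]
    failed = [i for i, r in enumerate(rewards) if r < threshold]
    return correct, failed


def group_relative_advantages(rewards: List[float],
                              eps: float = 1e-8) -> List[float]:
    """GRPO-style advantage: each reward standardized within its group."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical binary rewards for four responses to the same prompt.
    rewards = [1.0, 0.0, 1.0, 0.0]
    correct, failed = route_samples(rewards)
    advantages = group_relative_advantages(rewards)
    # Correct samples would feed the reward-aligned (GRPO) update.
    print("GRPO branch:", [(i, round(advantages[i], 3)) for i in correct])
    # Failed samples would go to the self-distillation (SDPO) correction.
    print("SDPO branch:", failed)
```
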
LIBERO-Para: A Diagnostic Benchmark and Metrics for Paraphrase Robustness in VLA Models

📝 Summary:
Vision-Language-Action models show significant performance drops when handling paraphrased instructions due to surface-level matching rather than semantic understanding, highlighting the need for bett...

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28301
• PDF: https://arxiv.org/pdf/2603.28301
• Project Page: https://cau-hai-lab.github.io/LIBERO-Para/
• Github: https://github.com/cau-hai-lab/LIBERO-Para

🔹 Datasets citing this paper:
https://huggingface.co/datasets/HAI-Lab/LIBERO-Para

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Learning to Learn-at-Test-Time: Language Agents with Learnable Adaptation Policies

📝 Summary:
Meta-TTL formulates adaptation policy discovery as a bi-level optimization problem to improve language agent performance through learned policies rather than hand-crafted ones.

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00830
• PDF: https://arxiv.org/pdf/2604.00830
• Github: https://github.com/zzzlou/meta-ttl

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research