ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
World-R1: Reinforcing 3D Constraints for Text-to-Video Generation

📝 Summary:
World-R1 framework improves video generation by incorporating 3D constraints through reinforcement learning and specialized text datasets while maintaining visual quality and scalability.

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24764
• PDF: https://arxiv.org/pdf/2604.24764
• Project Page: https://aka.ms/world-r1
• Github: https://github.com/microsoft/World-R1

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents

📝 Summary:
A benchmark for evaluating language-model agents in multi-day collaborative workflows with evolving environmental states across multiple service domains.

🔹 Publication Date: Published on Apr 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.23781
• PDF: https://arxiv.org/pdf/2604.23781
• Github: https://github.com/evolvent-ai/ClawMark

==================================

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

📝 Summary:
Tuna-2 is a unified multimodal model that performs visual understanding and generation directly from pixel embeddings without pretrained vision encoders, achieving state-of-the-art performance in mult...

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24763
• PDF: https://arxiv.org/pdf/2604.24763
• Project Page: https://tuna-ai.org/tuna-2/

==================================

Stabilizing Efficient Reasoning with Step-Level Advantage Selection

📝 Summary:
Short-context post-training induces reasoning compression but causes instability; Step-level Advantage Selection addresses this by selectively adjusting reasoning steps based on confidence and verific...

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24003
• PDF: https://arxiv.org/pdf/2604.24003
• Github: https://github.com/HanNight/SAS

==================================

Zero-to-CAD: Agentic Synthesis of Interpretable CAD Programs at Million-Scale Without Real Data

📝 Summary:
A scalable framework synthesizes executable CAD construction sequences by framing the process as an agentic search problem using large language models within a feedback-driven CAD environment.

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24479
• PDF: https://arxiv.org/pdf/2604.24479

==================================

ProEval: Proactive Failure Discovery and Efficient Performance Estimation for Generative AI Evaluation

📝 Summary:
ProEval uses transfer learning with pre-trained Gaussian Processes and Bayesian quadrature to efficiently evaluate generative AI models by identifying failure cases with significantly fewer samples th...

🔹 Publication Date: Published on Apr 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.23099
• PDF: https://arxiv.org/pdf/2604.23099
• Github: https://github.com/google-deepmind/proeval

==================================

Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing

📝 Summary:
Transformer language models can reduce KV cache memory requirements through random cross-layer attention during training, enabling efficient depth-wise cache sharing without performance loss.

🔹 Publication Date: Published on Apr 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22782
• PDF: https://arxiv.org/pdf/2604.22782

==================================

ReVSI: Rebuilding Visual Spatial Intelligence Evaluation for Accurate Assessment of VLM 3D Reasoning

📝 Summary:
ReVSI addresses flaws in current spatial intelligence evaluation by creating a validated benchmark with improved annotations and controlled frame sampling conditions.

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24300
• PDF: https://arxiv.org/pdf/2604.24300
• Project Page: https://3dlg-hcvc.github.io/revsi/

Datasets citing this paper:
https://huggingface.co/datasets/3dlg-hcvc/ReVSI

==================================

OmniShotCut: Holistic Relational Shot Boundary Detection with Shot-Query Transformer

📝 Summary:
OmniShotCut formulates shot boundary detection as structured relational prediction using a shot query-based dense video Transformer, addressing limitations of existing methods through synthetic transi...

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24762
• PDF: https://arxiv.org/pdf/2604.24762
• Project Page: https://uva-computer-vision-lab.github.io/OmniShotCut_website/
• Github: https://github.com/UVA-Computer-Vision-Lab/OmniShotCut

==================================

Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms

📝 Summary:
Vision-Language-Action models present unique safety challenges due to their embodied nature, requiring unified approaches across multiple domains to address threats from data poisoning to adversarial ...

🔹 Publication Date: Published on Apr 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.23775
• PDF: https://arxiv.org/pdf/2604.23775
• Github: https://github.com/LiQiiiii/Awesome-VLA-Safety

==================================

Efficient Agent Evaluation via Diversity-Guided User Simulation

📝 Summary:
DIVERT is a coverage-guided user simulation framework that efficiently evaluates large language models by reusing conversation prefixes and exploring diverse interaction paths through branching trajec...

🔹 Publication Date: Published on Apr 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.21480
• PDF: https://arxiv.org/pdf/2604.21480

==================================

For-Value: Efficient Forward-Only Data Valuation for finetuning LLMs and VLMs

📝 Summary:
For-Value is an efficient forward-only data valuation framework for LLMs and VLMs. It estimates data value using final hidden representations and prediction errors, eliminating costly gradient computations. This enables scalable batch processing, matching or exceeding gradient-based methods in ef...

🔹 Publication Date: Published on Apr 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10180
• PDF: https://arxiv.org/pdf/2508.10180
• Github: https://github.com/vengdeng/For-Value-Efficient-Forward-Only-Data-Valuation-for-finetuning

==================================

Taming Actor-Observer Asymmetry in Agents via Dialectical Alignment

📝 Summary:
Large language model agents exhibit cognitive bias where self-reflection and mutual auditing lead to inconsistent error attributions, which are addressed through a dialectical reasoning framework that...

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.19548
• PDF: https://arxiv.org/pdf/2604.19548
• Project Page: https://unikcc.github.io/ReTAS/
• Github: https://github.com/unikcc/ReTAS

Datasets citing this paper:
https://huggingface.co/datasets/BradNLP/ReTAS

==================================

From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company

📝 Summary:
OneManCompany (OMC) addresses the rigidity of static multi-agent systems with a framework for dynamic team assembly and governance. It uses portable agent identities and a hierarchical decision loop to form self-organizing AI teams. OMC achieves 84.67% success on PRDBench, improving on the state of the art by 15.48%.

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22446
• PDF: https://arxiv.org/pdf/2604.22446
• Project Page: https://1mancompany.github.io/OneManCompany/
• Github: https://github.com/1mancompany/OneManCompany

==================================

#AI #MultiAgentSystems #SelfOrganizingAI #AIteams #AutonomousAgents
Discovering Agentic Safety Specifications from 1-Bit Danger Signals

📝 Summary:
EPO-Safe allows LLM agents to discover hidden safety objectives using only binary danger warnings and reflection. This framework generates human-readable safety specifications autonomously, demonstrating robustness even with noisy feedback. It highlights that a dedicated safety channel is crucial...

🔹 Publication Date: Published on Apr 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.23210
• PDF: https://arxiv.org/pdf/2604.23210

==================================

ATTN-FIQA: Interpretable Attention-based Face Image Quality Assessment with Vision Transformers

📝 Summary:
ATTN-FIQA uses pre-softmax attention scores from Vision Transformers to assess face image quality without additional training or architectural changes.

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22841
• PDF: https://arxiv.org/pdf/2604.22841
• Github: https://github.com/gurayozgur/ATTN-FIQA

==================================

EX-FIQA: Leveraging Intermediate Early eXit Representations from Vision Transformers for Face Image Quality Assessment

📝 Summary:
ViT-based face quality assessment method utilizes intermediate representations through early exit mechanisms and score fusion strategies, demonstrating that different transformer block depths capture ...

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22842
• PDF: https://arxiv.org/pdf/2604.22842
• Github: https://github.com/gurayozgur/EX-FIQA

==================================

Improving Vision-language Models with Perception-centric Process Reward Models

📝 Summary:
A process reward model called Perceval enables token-level error detection and correction in vision-language models through perception-intensive training and fine-grained supervision during reinforcem...

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24583
• PDF: https://arxiv.org/pdf/2604.24583
• Github: https://github.com/RUCAIBox/Perceval

==================================

TexOCR: Advancing Document OCR Models for Compilable Page-to-LaTeX Reconstruction

📝 Summary:
This research presents TexOCR for reconstructing scientific PDFs into compilable LaTeX, addressing limitations of current OCR. It introduces a new benchmark and trains TexOCR using reinforcement learning with verifiable rewards. This approach significantly improves structural accuracy and compila...

🔹 Publication Date: Published on Apr 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22880
• PDF: https://arxiv.org/pdf/2604.22880
• Github: https://github.com/QDRhhhh/TexOCR

==================================

How Much Is One Recurrence Worth? Iso-Depth Scaling Laws for Looped Language Models

📝 Summary:
Research quantifies the computational value of recurrent connections in language models through a scaling law that establishes a recurrence-equivalence exponent of 0.46, indicating that additional rec...

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.21106
• PDF: https://arxiv.org/pdf/2604.21106
• Project Page: https://kschwethelm.github.io/looped-lm-scaling
• Github: https://github.com/kschwethelm/looped-lm-scaling

==================================

UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models

📝 Summary:
UniGeo unifies geometric guidance across representation, architecture, and loss function levels in camera-controllable image editing. This novel framework addresses geometric drift and structural degradation, achieving superior visual quality and geometric consistency.

🔹 Publication Date: Published on Apr 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.17565
• PDF: https://arxiv.org/pdf/2604.17565
• Project Page: https://mo230761.github.io/UniGeo.github.io/
• Github: https://github.com/mo230761/UniGeo

==================================
