ML Research Hub

✨OmniJigsaw: Enhancing Omni-Modal Reasoning via Modality-Orchestrated Reordering

📝 Summary:
OmniJigsaw presents a self-supervised framework for video-audio understanding and collaborative reasoning through temporal reordering and cross-modal integration strategies.

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08209
• PDF: https://arxiv.org/pdf/2604.08209
• Project Page: https://aim-uofa.github.io/OmniJigsaw
• Github: https://github.com/aim-uofa/OmniJigsaw

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research


✨Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference

📝 Summary:
Flux Attention dynamically optimizes attention computation in LLMs by routing layers to full or sparse attention based on input context, achieving faster inference with minimal training overhead.

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07394
• PDF: https://arxiv.org/pdf/2604.07394
• Github: https://github.com/qqtang-code/FluxAttention

🔹 Models citing this paper:
• https://huggingface.co/QQTang1223/full_xattn_Qwen3-8B
• https://huggingface.co/QQTang1223/full_streaming_Llama-3.1-8B-Instruct
• https://huggingface.co/QQTang1223/full_streaming_Qwen3-4B

==================================


✨LPM 1.0: Video-based Character Performance Model

📝 Summary:
A large-scale multimodal model for real-time conversational character performance generation that maintains identity consistency while enabling interactive, infinite-length video synthesis.

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07823
• PDF: https://arxiv.org/pdf/2604.07823
• Project Page: https://large-performance-model.github.io/
• Github: https://github.com/large-performance-model/large-performance-model.github.io

==================================


✨When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models

📝 Summary:
NUMINA enhances text-to-video diffusion models' numerical accuracy through a training-free framework that identifies layout inconsistencies and guides regeneration via attention modulation.

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08546
• PDF: https://arxiv.org/pdf/2604.08546
• Project Page: https://h-embodvis.github.io/NUMINA/
• Github: https://github.com/H-EmbodVis/NUMINA

==================================


✨HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents

📝 Summary:
HY-Embodied-0.5 is a foundation model family for embodied agents featuring Mixture-of-Transformers architecture and iterative post-training for enhanced visual perception and reasoning capabilities.

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07430
• PDF: https://arxiv.org/pdf/2604.07430
• Github: https://github.com/Tencent-Hunyuan/HY-Embodied

🔹 Models citing this paper:
• https://huggingface.co/tencent/HY-Embodied-0.5

==================================




✨FIT: A Large-Scale Dataset for Fit-Aware Virtual Try-On

📝 Summary:
A large-scale virtual try-on dataset called FIT is introduced that includes precise body and garment measurements to address garment fit accuracy, using synthetic 3D garment generation, physics simula...

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08526
• PDF: https://arxiv.org/pdf/2604.08526
• Project Page: https://johannakarras.github.io/FIT/

==================================


✨MolmoWeb: Open Visual Web Agent and Open Data for the Open Web

📝 Summary:
Open-source web agents leveraging diverse mixed datasets achieve state-of-the-art performance on browser-based tasks while operating without access to HTML or accessibility tree information.

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08516
• PDF: https://arxiv.org/pdf/2604.08516
• Project Page: https://allenai.org/blog/molmoweb

==================================


✨Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models

📝 Summary:
Agents with meta-cognitive deficits struggle with tool usage decisions, leading to inefficiencies; a new framework called HDPO addresses this through decoupled optimization channels for accuracy and e...

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08545
• PDF: https://arxiv.org/pdf/2604.08545
• Project Page: https://Accio-Lab.github.io/Metis
• Github: https://github.com/Accio-Lab/Metis

==================================


✨Beyond Stochastic Exploration: What Makes Training Data Valuable for Agentic Search

📝 Summary:
A novel hierarchical experience framework improves reinforcement learning-based search agents by transforming raw reasoning trajectories into structured knowledge, enhancing both performance and train...

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08124
• PDF: https://arxiv.org/pdf/2604.08124

==================================


✨SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

📝 Summary:
SkillClaw enables collective skill evolution in multi-user LLM agent systems by aggregating user interactions to autonomously update and improve reusable skills across the ecosystem.

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08377
• PDF: https://arxiv.org/pdf/2604.08377

==================================


✨RewardFlow: Generate Images by Optimizing What You Reward

📝 Summary:
RewardFlow enables pretrained diffusion and flow-matching models to be guided during inference through multi-reward Langevin dynamics without requiring inversion, achieving superior performance in ima...

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08536
• PDF: https://arxiv.org/pdf/2604.08536
• Project Page: https://plan-lab.github.io/projects/rewardflow

==================================


✨Phantom: Physics-Infused Video Generation via Joint Modeling of Visual and Latent Physical Dynamics

📝 Summary:
Phantom is a physics-infused video generation model that jointly models visual content and latent physical dynamics to produce videos that are both visually realistic and physically consistent.

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08503
• PDF: https://arxiv.org/pdf/2604.08503
• Project Page: https://plan-lab.github.io/projects/phantom

==================================


✨PokeGym: A Visually-Driven Long-Horizon Benchmark for Vision-Language Models

📝 Summary:
Vision-Language Models face limitations in 3D embodied environments due to insufficient physical reasoning capabilities, as demonstrated by the PokeGym benchmark that reveals deadlock recovery as the ...

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08340
• PDF: https://arxiv.org/pdf/2604.08340

==================================


✨OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks

📝 Summary:
Gaussian GRPO addresses challenges in multimodal model training by using distributional matching to ensure gradient equity and stable reinforcement learning, enabling improved perception-reasoning bal...

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08539
• PDF: https://arxiv.org/pdf/2604.08539
• Project Page: https://gordonhu608.github.io/openvlthinkerv2.github.io/

==================================


✨GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents

📝 Summary:
GameWorld presents a standardized benchmark for evaluating multimodal large language model agents in video games, featuring diverse games and verified metrics for comprehensive assessment.

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07429
• PDF: https://arxiv.org/pdf/2604.07429
• Project Page: https://gameworld-bench.github.io/

==================================


✨MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping

📝 Summary:
MegaStyle presents a scalable data curation pipeline for creating high-quality, style-consistent datasets using large generative models and proposes style-supervised contrastive learning for effective...

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08364
• PDF: https://arxiv.org/pdf/2604.08364
• Project Page: https://jeoyal.github.io/MegaStyle/

==================================


✨OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence

📝 Summary:
OpenSpatial presents an open-source data engine for spatial reasoning tasks using 3D bounding boxes, creating a large-scale dataset and achieving state-of-the-art performance in spatial perception ben...

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07296
• PDF: https://arxiv.org/pdf/2604.07296
• Github: https://github.com/VINHYU/OpenSpatial

==================================


✨Lighting-grounded Video Generation with Renderer-based Agent Reasoning

📝 Summary:
LiVER presents a diffusion-based framework for scene-controllable video generation that disentangles 3D scene properties through explicit conditioning and automated user instruction translation.

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07966
• PDF: https://arxiv.org/pdf/2604.07966

==================================


✨Automating Database-Native Function Code Synthesis with LLMs

📝 Summary:
Database systems incorporate an ever-growing number of functions in their kernels (a.k.a., database native functio...

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06231
• PDF: https://arxiv.org/pdf/2604.06231
• Project Page: https://code4db.github.io/hi-opencook/
• Github: https://github.com/weAIDB/OpenCook

==================================


✨SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds

📝 Summary:
A physics-aligned simulation framework enables effective robotic manipulation of deformable objects by creating metric-consistent synthetic data that matches real-world performance.

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08544
• PDF: https://arxiv.org/pdf/2604.08544
• Project Page: https://internrobotics.github.io/sim1.github.io/
• Github: https://github.com/InternRobotics/SIM1

==================================


✨Structured Distillation of Web Agent Capabilities Enables Generalization

📝 Summary:
Structured synthetic trajectory generation using a frontier LLM as teacher enables open-weight web agents with superior performance and cross-environment capabilities.

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://huggingface.co/collections/McGill-NLP/a3-agent-as-annotators
• PDF: https://arxiv.org/pdf/2604.07776
• Project Page: https://agent-as-annotators.github.io/
• Github: https://github.com/McGill-NLP/agent-as-annotators

🔹 Models citing this paper:
• https://huggingface.co/McGill-NLP/A3-Qwen3.5-9B
• https://huggingface.co/McGill-NLP/A3-Qwen3.5-4B
• https://huggingface.co/McGill-NLP/A3-Qwen3.5-2B

🔹 Datasets citing this paper:
• https://huggingface.co/datasets/McGill-NLP/A3-Synth

==================================
