✨On the Step Length Confounding in LLM Reasoning Data Selection
📝 Summary:
Researchers identified a bias in naturalness-based data selection for reasoning tasks where longer reasoning steps are preferred over higher-quality ones, and proposed two debiasing methods to improve...
🔹 Publication Date: Published on Apr 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06834
• PDF: https://arxiv.org/pdf/2604.06834
• Project Page: https://wangbing1416.github.io/projects/acl2026_lengthbias.html
• Github: https://github.com/wangbing1416/ASLEC
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering
📝 Summary:
LLM agents now increasingly rely on externalized components like memory, skills, and protocols, rather than just modifying model weights. This externalization transforms complex cognitive tasks into more reliably solvable forms. Practical agent progress depends on this external cognitive infrastr...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08224
• PDF: https://arxiv.org/pdf/2604.08224
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Graph of Skills: Dependency-Aware Structural Retrieval for Massive Agent Skills
📝 Summary:
Graph of Skills (GoS) is an inference-time structural retrieval layer for large skill libraries. It constructs an executable skill graph to retrieve dependency-aware skill bundles, significantly improving performance and reducing token usage. GoS boosts average reward by 43.6 percent and cuts input...
🔹 Publication Date: Published on Apr 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.05333
• PDF: https://arxiv.org/pdf/2604.05333
• Github: https://github.com/davidliuk/graph-of-skills
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨DMax: Aggressive Parallel Decoding for dLLMs
📝 Summary:
DMax introduces a novel approach for efficient diffusion language models that reduces error accumulation during parallel decoding through self-refinement and unified training strategies.
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08302
• PDF: https://arxiv.org/pdf/2604.08302
• Github: https://github.com/czg1225/DMax
🔹 Models citing this paper:
• https://huggingface.co/Zigeng/DMax-Math-16B
• https://huggingface.co/Zigeng/DMax-Coder-16B
✨ Datasets citing this paper:
• https://huggingface.co/datasets/Zigeng/DMax-LLaDA-2.0-Mini-Math-Trajectories
• https://huggingface.co/datasets/Zigeng/DMax-LLaDA-2.0-Mini-Code-Trajectories
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation
📝 Summary:
KnowU-Bench presents a comprehensive benchmark for personalized mobile agents that evaluates true preference inference and proactive assistance capabilities in real-world GUI environments.
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08455
• PDF: https://arxiv.org/pdf/2604.08455
• Project Page: https://zju-real.github.io/KnowU-Bench
• Github: https://github.com/ZJU-REAL/KnowU-Bench
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces
📝 Summary:
The OmniBehavior benchmark reveals that current LLMs fail to accurately simulate complex real-world user behaviors due to structural biases and limited behavioral diversity.
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08362
• PDF: https://arxiv.org/pdf/2604.08362
• Project Page: https://omnibehavior.github.io/
• Github: https://github.com/icip-cas/OmniBehavior
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨OmniJigsaw: Enhancing Omni-Modal Reasoning via Modality-Orchestrated Reordering
📝 Summary:
OmniJigsaw presents a self-supervised framework for video-audio understanding and collaborative reasoning through temporal reordering and cross-modal integration strategies.
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08209
• PDF: https://arxiv.org/pdf/2604.08209
• Project Page: https://aim-uofa.github.io/OmniJigsaw
• Github: https://github.com/aim-uofa/OmniJigsaw
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference
📝 Summary:
Flux Attention dynamically optimizes attention computation in LLMs by routing layers to full or sparse attention based on input context, achieving faster inference with minimal training overhead.
🔹 Publication Date: Published on Apr 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07394
• PDF: https://arxiv.org/pdf/2604.07394
• Github: https://github.com/qqtang-code/FluxAttention
🔹 Models citing this paper:
• https://huggingface.co/QQTang1223/full_xattn_Qwen3-8B
• https://huggingface.co/QQTang1223/full_streaming_Llama-3.1-8B-Instruct
• https://huggingface.co/QQTang1223/full_streaming_Qwen3-4B
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨LPM 1.0: Video-based Character Performance Model
📝 Summary:
A large-scale multimodal model for real-time conversational character performance generation that maintains identity consistency while enabling interactive, infinite-length video synthesis.
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07823
• PDF: https://arxiv.org/pdf/2604.07823
• Project Page: https://large-performance-model.github.io/
• Github: https://github.com/large-performance-model/large-performance-model.github.io
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models
📝 Summary:
NUMINA enhances text-to-video diffusion models' numerical accuracy through a training-free framework that identifies layout inconsistencies and guides regeneration via attention modulation.
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08546
• PDF: https://arxiv.org/pdf/2604.08546
• Project Page: https://h-embodvis.github.io/NUMINA/
• Github: https://github.com/H-EmbodVis/NUMINA
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents
📝 Summary:
HY-Embodied-0.5 is a foundation model family for embodied agents featuring a Mixture-of-Transformers architecture and iterative post-training for enhanced visual perception and reasoning capabilities.
🔹 Publication Date: Published on Apr 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07430
• PDF: https://arxiv.org/pdf/2604.07430
• Github: https://github.com/Tencent-Hunyuan/HY-Embodied
🔹 Models citing this paper:
• https://huggingface.co/tencent/HY-Embodied-0.5
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨FIT: A Large-Scale Dataset for Fit-Aware Virtual Try-On
📝 Summary:
A large-scale virtual try-on dataset called FIT is introduced that includes precise body and garment measurements to address garment fit accuracy, using synthetic 3D garment generation, physics simula...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08526
• PDF: https://arxiv.org/pdf/2604.08526
• Project Page: https://johannakarras.github.io/FIT/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MolmoWeb: Open Visual Web Agent and Open Data for the Open Web
📝 Summary:
Open-source web agents leveraging diverse mixed datasets achieve state-of-the-art performance on browser-based tasks while operating without access to HTML or accessibility tree information.
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08516
• PDF: https://arxiv.org/pdf/2604.08516
• Project Page: https://allenai.org/blog/molmoweb
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models
📝 Summary:
Agents with meta-cognitive deficits struggle with tool usage decisions, leading to inefficiencies; a new framework called HDPO addresses this through decoupled optimization channels for accuracy and e...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08545
• PDF: https://arxiv.org/pdf/2604.08545
• Project Page: https://Accio-Lab.github.io/Metis
• Github: https://github.com/Accio-Lab/Metis
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Beyond Stochastic Exploration: What Makes Training Data Valuable for Agentic Search
📝 Summary:
A novel hierarchical experience framework improves reinforcement learning-based search agents by transforming raw reasoning trajectories into structured knowledge, enhancing both performance and train...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08124
• PDF: https://arxiv.org/pdf/2604.08124
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SkillClaw: Let Skills Evolve Collectively with Agentic Evolver
📝 Summary:
SkillClaw enables collective skill evolution in multi-user LLM agent systems by aggregating user interactions to autonomously update and improve reusable skills across the ecosystem.
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08377
• PDF: https://arxiv.org/pdf/2604.08377
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨RewardFlow: Generate Images by Optimizing What You Reward
📝 Summary:
RewardFlow enables pretrained diffusion and flow-matching models to be guided during inference through multi-reward Langevin dynamics without requiring inversion, achieving superior performance in ima...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08536
• PDF: https://arxiv.org/pdf/2604.08536
• Project Page: https://plan-lab.github.io/projects/rewardflow
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Phantom: Physics-Infused Video Generation via Joint Modeling of Visual and Latent Physical Dynamics
📝 Summary:
Phantom is a physics-infused video generation model that jointly models visual content and latent physical dynamics to produce videos that are both visually realistic and physically consistent.
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08503
• PDF: https://arxiv.org/pdf/2604.08503
• Project Page: https://plan-lab.github.io/projects/phantom
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨PokeGym: A Visually-Driven Long-Horizon Benchmark for Vision-Language Models
📝 Summary:
Vision-Language Models face limitations in 3D embodied environments due to insufficient physical reasoning capabilities, as demonstrated by the PokeGym benchmark that reveals deadlock recovery as the ...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08340
• PDF: https://arxiv.org/pdf/2604.08340
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks
📝 Summary:
Gaussian GRPO addresses challenges in multimodal model training by using distributional matching to ensure gradient equity and stable reinforcement learning, enabling improved perception-reasoning bal...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08539
• PDF: https://arxiv.org/pdf/2604.08539
• Project Page: https://gordonhu608.github.io/openvlthinkerv2.github.io/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents
📝 Summary:
GameWorld presents a standardized benchmark for evaluating multimodal large language model agents in video games, featuring diverse games and verified metrics for comprehensive assessment.
🔹 Publication Date: Published on Apr 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07429
• PDF: https://arxiv.org/pdf/2604.07429
• Project Page: https://gameworld-bench.github.io/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research