✨Reasoning over mathematical objects: on-policy reward modeling and test-time aggregation
📝 Summary:
This paper introduces Principia, a new dataset for deriving mathematical objects, together with training recipes that use on-policy LLM judges. These methods significantly improve model performance, enable cross-format generalization in reasoning tasks, and scale with test-time compute.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18886
• PDF: https://arxiv.org/pdf/2603.18886
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents
📝 Summary:
Reinforcement learning infrastructure for multi-turn LLM agents that provides scalable rollout services and standardized sandbox environments for complex interactive tasks.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18815
• PDF: https://arxiv.org/pdf/2603.18815
• Github: https://github.com/NVIDIA-NeMo/ProRL-Agent-Server
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨COT-FM: Cluster-wise Optimal Transport Flow Matching
📝 Summary:
COT-FM enhances Flow Matching by clustering target samples and assigning dedicated source distributions. This creates straighter probability paths, enabling faster and more reliable generation with improved quality across diverse tasks.
🔹 Publication Date: Published on Mar 11
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.13395
• PDF: https://arxiv.org/pdf/2603.13395
• Project Page: https://embodiedai-ntu.github.io/cotfm/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
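The cluster-wise pairing idea behind COT-FM can be illustrated with a toy sketch (not the paper's implementation; the two-mode target, k-means assignment, and Gaussian sources are all illustrative assumptions): cluster the target samples, place a dedicated source distribution near each cluster, and build flow-matching regression pairs along the resulting short, straight paths.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(x, centers, iters=20):
    # Plain k-means from given initial centers, to cluster target samples.
    centers = centers.copy()
    for _ in range(iters):
        d = np.linalg.norm(x[:, None] - centers[None], axis=-1)
        labels = d.argmin(axis=1)
        for j in range(len(centers)):
            if (labels == j).any():
                centers[j] = x[labels == j].mean(axis=0)
    return labels, centers

# Toy two-mode target distribution.
targets = np.concatenate([
    rng.normal([-4, 0], 0.3, size=(256, 2)),
    rng.normal([+4, 0], 0.3, size=(256, 2)),
])
labels, centers = kmeans(targets, np.array([[-1.0, 0.0], [1.0, 0.0]]))

# Dedicated source per cluster: a narrow Gaussian at the cluster center,
# instead of one shared N(0, I) source for every target.
sources = centers[labels] + 0.5 * rng.normal(size=targets.shape)

# Flow-matching regression pairs: a point on the straight path and its velocity.
t = rng.uniform(size=(len(targets), 1))
x_t = (1 - t) * sources + t * targets    # position along the path at time t
v_target = targets - sources             # constant velocity of a straight path

# Cluster-wise pairing keeps paths much shorter than a single shared source.
shared_source = rng.normal(size=targets.shape)
print(np.linalg.norm(v_target, axis=1).mean(),
      np.linalg.norm(targets - shared_source, axis=1).mean())
```

Shorter, straighter paths mean the learned velocity field is easier to regress and can be integrated with fewer solver steps, which is the claimed source of faster and more reliable generation.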
✨Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer
📝 Summary:
A three-stage framework bridges semantic and kinematic conditions using discrete tokens and diffusion synthesis. Its core MoTok tokenizer achieves compact high-fidelity tokens, significantly boosting controllability, fidelity, and reducing token usage under strong kinematic constraints.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19227
• PDF: https://arxiv.org/pdf/2603.19227
• Project Page: https://rheallyc.github.io/projects/motok/
• Github: https://github.com/rheallyc/MoTok
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Cognitive Mismatch in Multimodal Large Language Models for Discrete Symbol Understanding
📝 Summary:
Top-tier MLLMs demonstrate limited capability in processing discrete symbols despite strong performance in complex reasoning, revealing a cognitive mismatch between visual perception and symbolic understanding.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18472
• PDF: https://arxiv.org/pdf/2603.18472
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨VTC-Bench: Evaluating Agentic Multimodal Models via Compositional Visual Tool Chaining
📝 Summary:
A new benchmark called VisualToolChain-Bench is introduced to evaluate the tool-use capabilities of multimodal large language models in complex visual tasks requiring multi-step planning and diverse tool use.
🔹 Publication Date: Published on Mar 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.15030
• PDF: https://arxiv.org/pdf/2603.15030
• Github: https://github.com/zhuzil/VTC-Bench
✨ Datasets citing this paper:
• https://huggingface.co/datasets/zzzhu/VTC-Bench
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Loc3R-VLM: Language-based Localization and 3D Reasoning with Vision-Language Models
📝 Summary:
Loc3R-VLM enhances 2D Vision-Language Models with 3D understanding capabilities through spatial supervision from monocular video input, achieving superior performance in language-based localization and 3D reasoning.
🔹 Publication Date: Published on Mar 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18002
• PDF: https://arxiv.org/pdf/2603.18002
• Project Page: https://kevinqu7.github.io/loc3r-vlm/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨3DreamBooth: High-Fidelity 3D Subject-Driven Video Generation Model
📝 Summary:
This novel framework enables 3D-aware video customization by decoupling spatial geometry from temporal motion using 1-frame optimization to build robust 3D priors. It also incorporates a visual conditioning module for enhanced texture generation and faster convergence.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18524
• PDF: https://arxiv.org/pdf/2603.18524
• Project Page: https://ko-lani.github.io/3DreamBooth
• Github: https://github.com/Ko-Lani/3DreamBooth
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MonoArt: Progressive Structural Reasoning for Monocular Articulated 3D Reconstruction
📝 Summary:
MonoArt presents a unified framework for reconstructing articulated 3D objects from single images through progressive structural reasoning that enables stable articulation inference without external t...
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19231
• PDF: https://arxiv.org/pdf/2603.19231
• Project Page: https://lihaitian.com/MonoArt/
• Github: https://github.com/Quest4Science/MonoArt
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MOSS-TTS Technical Report
📝 Summary:
MOSS-TTS is a speech generation model using discrete audio tokens and autoregressive modeling, with capabilities for voice cloning, pronunciation control, and long-form generation across multiple languages.
🔹 Publication Date: Published on Mar 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18090
• PDF: https://arxiv.org/pdf/2603.18090
• Project Page: https://mosi.cn/models/moss-tts
• Github: https://github.com/OpenMOSS/MOSS-TTS
🔹 Models citing this paper:
• https://huggingface.co/OpenMOSS-Team/MOSS-TTS
• https://huggingface.co/OpenMOSS-Team/MOSS-TTS-Realtime
• https://huggingface.co/OpenMOSS-Team/MOSS-TTS-Local-Transformer
✨ Spaces citing this paper:
• https://huggingface.co/spaces/OpenMOSS-Team/MOSS-TTS
• https://huggingface.co/spaces/Pendrokar/TTS-Spaces-Arena
• https://huggingface.co/spaces/JymNils/MOSS-TTS
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
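The "discrete audio tokens + autoregressive modeling" recipe can be sketched in miniature (this is a generic illustration, not MOSS-TTS itself; the bigram table stands in for a trained transformer, and the tiny codebook is an assumption for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)
CODEBOOK_SIZE = 8  # real neural audio codecs use 1024+ codes per codebook

# Stand-in "model": a fixed table of next-token logits per previous token.
logits_table = rng.normal(size=(CODEBOOK_SIZE, CODEBOOK_SIZE))

def sample_next(prev_token, temperature=1.0):
    # Softmax over logits, then sample one discrete audio code.
    logits = logits_table[prev_token] / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(rng.choice(CODEBOOK_SIZE, p=probs))

def generate(prompt, n_steps):
    # Autoregressive decoding: each audio token is conditioned on the prefix
    # (here only the previous token, for brevity).
    tokens = list(prompt)
    for _ in range(n_steps):
        tokens.append(sample_next(tokens[-1]))
    return tokens

# In this regime, voice cloning amounts to prefixing the decoder with tokens
# from a reference utterance before generating the continuation.
reference = [3, 1, 4]
out = generate(reference, n_steps=10)
print(out)
```

A codec decoder would then map the generated token sequence back to a waveform; long-form generation falls out of the same loop run for more steps.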
✨Prompt-Free Universal Region Proposal Network
📝 Summary:
PF-RPN is a novel network that identifies potential objects without needing external prompts, improving flexibility. It uses Sparse Image-Aware Adapters and Cascade Self-Prompting to localize objects, validated across 19 datasets. This method works across diverse domains with limited data.
🔹 Publication Date: Published on Mar 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.17554
• PDF: https://arxiv.org/pdf/2603.17554
• Github: https://github.com/tangqh03/PF-RPN
🔹 Models citing this paper:
• https://huggingface.co/tangqh/PF-RPN
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#ObjectDetection #ComputerVision #DeepLearning #RPN #PromptFreeAI
✨EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing
📝 Summary:
EffectErase is a new video object removal method that effectively erases dynamic objects and their visual effects. It introduces VOR, a large dataset for training, and uses reciprocal learning with task-aware guidance for high-quality results.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19224
• PDF: https://arxiv.org/pdf/2603.19224
• Project Page: https://henghuiding.com/EffectErase/
• Github: https://github.com/FudanCVL/EffectErase
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#VideoEditing #ComputerVision #ObjectRemoval #DeepLearning #AI
✨Mending the Holes: Mitigating Reward Hacking in Reinforcement Learning for Multilingual Translation
📝 Summary:
LLMs struggle with low-resource language translation due to data scarcity. WALAR, a novel RL method, uses only monolingual text to improve LLM translation by mitigating reward hacking in quality estimation models. This significantly outperforms existing multilingual LLMs.
🔹 Publication Date: Published on Mar 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.13045
• PDF: https://arxiv.org/pdf/2603.13045
• Github: https://github.com/LeiLiLab/WALAR
🔹 Models citing this paper:
• https://huggingface.co/lyf07/LLaMAX3-8B-Alpaca-WALAR
• https://huggingface.co/lyf07/Translategemma-4B-it-WALAR
• https://huggingface.co/lyf07/Qwen3-8B-WALAR
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#ReinforcementLearning #LLM #MultilingualTranslation #NLP #LowResourceLanguages
✨ReactMotion: Generating Reactive Listener Motions from Speaker Utterance
📝 Summary:
This paper introduces ReactMotion, a framework for generating natural listener body motions that react appropriately to speaker utterances. It uses a large dataset and preference-based training to create diverse, realistic responses, outperforming prior methods.
🔹 Publication Date: Published on Mar 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.15083
• PDF: https://arxiv.org/pdf/2603.15083
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #MachineLearning #HumanComputerInteraction #GenerativeAI #ComputerAnimation
✨SimulU: Training-free Policy for Long-form Simultaneous Speech-to-Speech Translation
📝 Summary:
SimulU proposes a training-free policy for long-form simultaneous speech-to-speech translation (SimulS2S). It uses history management and cross-attention from pre-trained models to regulate input and output, achieving a good quality-latency trade-off without task-specific training.
🔹 Publication Date: Published on Mar 11
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.16924
• PDF: https://arxiv.org/pdf/2603.16924
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#SpeechToSpeech #SimultaneousTranslation #NLP #AI #DeepLearning
✨AndroTMem: From Interaction Trajectories to Anchored Memory in Long-Horizon GUI Agents
📝 Summary:
The paper presents AndroTMem, a framework and benchmark for diagnosing interaction-memory failures in long-horizon GUI agents. It proposes Anchored State Memory (ASM), which uses causally linked intermediate-state anchors to overcome this bottleneck, improving task completion rates by up to 30%.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18429
• PDF: https://arxiv.org/pdf/2603.18429
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#GUIAgents #AIMemory #AIAgents #AIResearch #HumanComputerInteraction
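"Causally linked intermediate-state anchors" suggests a simple data structure: each memory entry records which state an action started from and which state it produced, so the agent can trace back the chain of actions behind any state. A minimal sketch (the anchor strings, entry fields, and `trace_back` query are illustrative assumptions, not the paper's actual ASM design):

```python
from dataclasses import dataclass, field

@dataclass
class MemoryEntry:
    action: str
    pre_anchor: str   # hash/summary of the UI state before the action
    post_anchor: str  # hash/summary of the UI state after the action

@dataclass
class AnchoredMemory:
    entries: list = field(default_factory=list)

    def record(self, action, pre, post):
        self.entries.append(MemoryEntry(action, pre, post))

    def trace_back(self, anchor):
        # Follow causal links backwards: which chain of actions produced
        # the state identified by `anchor`?
        chain = []
        while True:
            step = next((e for e in self.entries
                         if e.post_anchor == anchor), None)
            if step is None:
                return list(reversed(chain))
            chain.append(step.action)
            anchor = step.pre_anchor

mem = AnchoredMemory()
mem.record("open settings", "home", "settings")
mem.record("tap wifi", "settings", "wifi_menu")
mem.record("toggle on", "wifi_menu", "wifi_on")
print(mem.trace_back("wifi_on"))  # actions that led to this state
```

Anchoring memory to intermediate states, rather than storing raw trajectories, is what lets a long-horizon agent recover "how did I get here?" without replaying the whole interaction history.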
✨PARSA-Bench: A Comprehensive Persian Audio-Language Model Benchmark
📝 Summary:
PARSA-Bench is the first benchmark for Persian audio-language models, featuring 16 tasks covering speech, paralinguistics, and cultural audio comprehension. It reveals that current models struggle with Persian's unique audio challenges, such as poetry and music, performing poorly on culturally grounded tasks.
🔹 Publication Date: Published on Mar 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14456
• PDF: https://arxiv.org/pdf/2603.14456
✨ Datasets citing this paper:
• https://huggingface.co/datasets/MohammadJRanjbar/PARSA-Bench
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#PersianAI #AudioLanguageModels #NLP #Benchmarking #SpeechProcessing
✨Tinted Frames: Question Framing Blinds Vision-Language Models
📝 Summary:
Vision-language models suffer selective blindness, where linguistic framing degrades visual attention and performance. Constrained framings reduce focus on relevant image regions. A new prompt-tuning method improves visual grounding and performance across different framings.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19203
• PDF: https://arxiv.org/pdf/2603.19203
• Project Page: https://davidhalladay.github.io/tinted_frames_demo/
• Github: https://github.com/davidhalladay/Tinted-Frames
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#VisionLanguageModels #PromptEngineering #AIAttention #DeepLearning #AIResearch
✨VID-AD: A Dataset for Image-Level Logical Anomaly Detection under Vision-Induced Distraction
📝 Summary:
VID-AD is a dataset for logical anomaly detection in industrial inspection, specifically addressing challenges from visual distractions. A new language-based framework is also proposed, which uses text descriptions and contrastive learning to capture logical attributes.
🔹 Publication Date: Published on Mar 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.13964
• PDF: https://arxiv.org/pdf/2603.13964
• Github: https://github.com/nkthiroto/VID-AD
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AnomalyDetection #IndustrialInspection #ComputerVision #MachineLearning #Datasets
✨What Really Controls Temporal Reasoning in Large Language Models: Tokenisation or Representation of Time?
📝 Summary:
MultiTempBench evaluates LLMs' multilingual temporal reasoning across various calendars and languages. It finds that tokenization quality, specifically fragmentation of temporal data, is a major bottleneck that severely reduces accuracy in low-resource languages and less common calendar formats.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19017
• PDF: https://arxiv.org/pdf/2603.19017
• Github: https://github.com/gagan3012/mtb
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLM #TemporalReasoning #Tokenization #MultilingualAI #NLP
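The fragmentation effect can be demonstrated with a toy tokenizer (this is not the paper's measurement pipeline; the stand-in vocabulary and the fragmentation metric are illustrative assumptions). A BPE-style vocabulary covers common English date words as single tokens, while rarer calendar formats shatter into characters:

```python
import re

def toy_tokenize(text):
    # Stand-in tokenizer: common English pieces survive as single tokens,
    # everything else falls apart into characters (mimicking how BPE
    # vocabularies fragment rare scripts and date formats).
    vocab = {"march", "19", "20", "2026", "the", "of"}
    tokens = []
    for piece in re.findall(r"\w+|\S", text.lower()):
        if piece in vocab:
            tokens.append(piece)
        else:
            tokens.extend(piece)  # character-level fallback
    return tokens

def fragmentation(text):
    # Tokens per whitespace-delimited word: higher means more fragmented.
    return len(toy_tokenize(text)) / len(text.split())

print(fragmentation("march 19 2026"))     # well-covered Gregorian format
print(fragmentation("19-03-1447 hijri"))  # rarer calendar format
```

A date that costs several times more tokens also spreads its information across more positions, which is one plausible mechanism for the accuracy drop the benchmark reports on low-resource languages and uncommon calendars.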