ML Research Hub

✨Prompt-Free Universal Region Proposal Network

📝 Summary:
PF-RPN is a novel network that identifies potential objects without needing external prompts, improving flexibility. It uses Sparse Image-Aware Adapters and Cascade Self-Prompting to localize objects, validated across 19 datasets. This method works across diverse domains with limited data.

🔹 Publication Date: Published on Mar 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.17554
• PDF: https://arxiv.org/pdf/2603.17554
• Github: https://github.com/tangqh03/PF-RPN

🔹 Models citing this paper:
• https://huggingface.co/tangqh/PF-RPN

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#ObjectDetection #ComputerVision #DeepLearning #RPN #PromptFreeAI

146 views · 07:38

✨EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing

📝 Summary:
EffectErase is a new video object removal method that effectively erases dynamic objects and their visual effects. It introduces VOR, a large dataset for training, and uses reciprocal learning with task-aware guidance for high-quality results.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19224
• PDF: https://arxiv.org/pdf/2603.19224
• Project Page: https://henghuiding.com/EffectErase/
• Github: https://github.com/FudanCVL/EffectErase

==================================

#VideoEditing #ComputerVision #ObjectRemoval #DeepLearning #AI

152 views · 08:38

✨Mending the Holes: Mitigating Reward Hacking in Reinforcement Learning for Multilingual Translation

📝 Summary:
LLMs struggle with low-resource language translation due to data scarcity. WALAR, a novel RL method, uses only monolingual text to improve LLM translation by mitigating reward hacking in quality estimation models. The resulting models significantly outperform existing multilingual LLMs.

🔹 Publication Date: Published on Mar 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.13045
• PDF: https://arxiv.org/pdf/2603.13045
• Github: https://github.com/LeiLiLab/WALAR

🔹 Models citing this paper:
• https://huggingface.co/lyf07/LLaMAX3-8B-Alpaca-WALAR
• https://huggingface.co/lyf07/Translategemma-4B-it-WALAR
• https://huggingface.co/lyf07/Qwen3-8B-WALAR

==================================

#ReinforcementLearning #LLM #MultilingualTranslation #NLP #LowResourceLanguages

195 views · 08:39

✨ReactMotion: Generating Reactive Listener Motions from Speaker Utterance

📝 Summary:
This paper introduces ReactMotion, a framework for generating natural listener body motions that react appropriately to speaker utterances. It uses a large dataset and preference-based training to create diverse, realistic responses, outperforming prior methods.

🔹 Publication Date: Published on Mar 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.15083
• PDF: https://arxiv.org/pdf/2603.15083

==================================

#AI #MachineLearning #HumanComputerInteraction #GenerativeAI #ComputerAnimation

190 views · 12:39

✨SimulU: Training-free Policy for Long-form Simultaneous Speech-to-Speech Translation

📝 Summary:
SimulU proposes a training-free policy for long-form simultaneous speech-to-speech translation (SimulS2S). It uses history management and cross-attention from pre-trained models to regulate input and output, achieving a good quality-latency trade-off without task-specific training.

🔹 Publication Date: Published on Mar 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.16924
• PDF: https://arxiv.org/pdf/2603.16924

==================================

#SpeechToSpeech #SimultaneousTranslation #NLP #AI #DeepLearning

139 views · 13:39

✨AndroTMem: From Interaction Trajectories to Anchored Memory in Long-Horizon GUI Agents

📝 Summary:
The paper presents AndroTMem, a framework and benchmark for diagnosing interaction memory failures in long-horizon GUI agents. It proposes Anchored State Memory (ASM), which uses causally linked intermediate-state anchors to overcome this bottleneck, improving task completion rates by up to 30%.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18429
• PDF: https://arxiv.org/pdf/2603.18429

==================================

#GUIAgents #AIMemory #AIAgents #AIResearch #HumanComputerInteraction

149 views · 13:40

✨PARSA-Bench: A Comprehensive Persian Audio-Language Model Benchmark

📝 Summary:
PARSA-Bench is the first benchmark for Persian audio-language models, featuring 16 tasks covering speech, paralinguistics, and cultural audio comprehension. It reveals current models struggle with Persian's unique audio challenges like poetry and music, performing poorly on culturally-grounded ta...

🔹 Publication Date: Published on Mar 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14456
• PDF: https://arxiv.org/pdf/2603.14456

🔹 Datasets citing this paper:
• https://huggingface.co/datasets/MohammadJRanjbar/PARSA-Bench

==================================

#PersianAI #AudioLanguageModels #NLP #Benchmarking #SpeechProcessing

196 views · 13:40

✨Tinted Frames: Question Framing Blinds Vision-Language Models

📝 Summary:
Vision-language models suffer from selective blindness, where linguistic framing degrades visual attention and performance. Constrained framings reduce focus on relevant image regions. A new prompt-tuning method improves visual grounding and performance across different framings.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19203
• PDF: https://arxiv.org/pdf/2603.19203
• Project Page: https://davidhalladay.github.io/tinted_frames_demo/
• Github: https://github.com/davidhalladay/Tinted-Frames

==================================

#VisionLanguageModels #PromptEngineering #AIAttention #DeepLearning #AIResearch

222 views · 14:40

✨VID-AD: A Dataset for Image-Level Logical Anomaly Detection under Vision-Induced Distraction

📝 Summary:
VID-AD is a dataset for logical anomaly detection in industrial inspection, specifically addressing challenges from visual distractions. A new language-based framework is also proposed, which uses text descriptions and contrastive learning to capture logical attributes.

🔹 Publication Date: Published on Mar 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.13964
• PDF: https://arxiv.org/pdf/2603.13964
• Github: https://github.com/nkthiroto/VID-AD

==================================

#AnomalyDetection #IndustrialInspection #ComputerVision #MachineLearning #Datasets

292 views · 14:40

✨What Really Controls Temporal Reasoning in Large Language Models: Tokenisation or Representation of Time?

📝 Summary:
MultiTempBench evaluates LLMs' multilingual temporal reasoning across various calendars and languages. It finds that tokenization quality, specifically fragmentation of temporal data, is a major bottleneck that severely reduces accuracy in low-resource languages and less common calendar formats.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19017
• PDF: https://arxiv.org/pdf/2603.19017
• Github: https://github.com/gagan3012/mtb

==================================

#LLM #TemporalReasoning #Tokenization #MultilingualAI #NLP

302 views · 17:41

✨DreamPartGen: Semantically Grounded Part-Level 3D Generation via Collaborative Latent Denoising

📝 Summary:
DreamPartGen generates 3D objects by modeling part geometry and appearance with Duplex Part Latents. It captures inter-part relationships using Relational Semantic Latents for improved text-shape alignment. A co-denoising process ensures consistency and achieves state-of-the-art results.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19216
• PDF: https://arxiv.org/pdf/2603.19216
• Project Page: https://plan-lab.github.io/dreampartgen

==================================

#3DGeneration #GenerativeAI #DeepLearning #ComputerVision #TextTo3D

347 views · 19:41
