ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
FARE: Fast-Slow Agentic Robotic Exploration

📝 Summary:
FARE is a hierarchical exploration framework that combines large language model reasoning with reinforcement learning control to enable efficient autonomous robot navigation in complex environments.

🔹 Publication Date: Published on Jan 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.14681
• PDF: https://arxiv.org/pdf/2601.14681

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
RoboBrain 2.5: Depth in Sight, Time in Mind

📝 Summary:
RoboBrain 2.5 enhances embodied AI through improved 3D spatial reasoning and temporal value estimation for more precise manipulation tasks. We introduce RoboBrain 2.5, a next-gene...

🔹 Publication Date: Published on Jan 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.14352
• PDF: https://arxiv.org/pdf/2601.14352
• Project Page: https://superrobobrain.github.io/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents

📝 Summary:
MMDeepResearch-Bench evaluates multimodal research agents on report generation with visual evidence, revealing trade-offs between prose quality, citation accuracy, and visual grounding.

🔹 Publication Date: Published on Jan 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.12346
• PDF: https://arxiv.org/pdf/2601.12346
• Github: https://github.com/AIoT-MLSys-Lab/MMDeepResearch-Bench

Datasets citing this paper:
https://huggingface.co/datasets/MMDR-2025/MMdeepresearch

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments

📝 Summary:
FinVault presents the first execution-grounded security benchmark for financial agents, revealing significant vulnerabilities in current defense mechanisms when applied to real-world financial workflo...

🔹 Publication Date: Published on Jan 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07853
• PDF: https://arxiv.org/pdf/2601.07853
• Github: https://github.com/aifinlab/FinVault

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
The Responsibility Vacuum: Organizational Failure in Scaled Agent Systems

📝 Summary:
Modern CI/CD pipelines integrating agent-generated code exhibit a structural failure in responsibility attribution. Decisions are executed through formally correct approval processes, yet no entity po...

🔹 Publication Date: Published on Jan 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.15059
• PDF: https://arxiv.org/pdf/2601.15059

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AgentEHR: Advancing Autonomous Clinical Decision-Making via Retrospective Summarization

📝 Summary:
AgentEHR is a benchmark for autonomous EHR navigation involving complex clinical decision-making in raw data. The RetroSum framework addresses information loss and fractured reasoning through retrospective summarization and evolving experience strategies. RetroSum improves performance by up to 29...

🔹 Publication Date: Published on Jan 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.13918
• PDF: https://arxiv.org/pdf/2601.13918
• Github: https://github.com/BlueZeros/AgentEHR

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Numina-Lean-Agent: An Open and General Agentic Reasoning System for Formal Mathematics

📝 Summary:
A general coding agent paradigm enables flexible formal theorem proving by directly interfacing with proof assistants and retrieving relevant theorems without task-specific training.

🔹 Publication Date: Published on Jan 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.14027
• PDF: https://arxiv.org/pdf/2601.14027
• Project Page: https://demo.projectnumina.ai/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Typhoon OCR: Open Vision-Language Model For Thai Document Extraction

📝 Summary:
Typhoon OCR is an open vision-language model for Thai and English document extraction, tackling complex script and unstructured documents. It achieves high accuracy and layout reconstruction comparable to larger proprietary systems, yet is compact and computationally efficient.

🔹 Publication Date: Published on Jan 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.14722
• PDF: https://arxiv.org/pdf/2601.14722

🔹 Models citing this paper:
https://huggingface.co/typhoon-ai/typhoon-ocr-7b
https://huggingface.co/typhoon-ai/typhoon-ocr1.5-2b
https://huggingface.co/typhoon-ai/typhoon-ocr-3b

Spaces citing this paper:
https://huggingface.co/spaces/doeqoth/typhoon-ocr

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Quantifying Speaker Embedding Phonological Rule Interactions in Accented Speech Synthesis

📝 Summary:
Research investigates the relationship between speaker embeddings and phonological rules in accent control for text-to-speech systems, introducing a metric to measure rule preservation versus embeddin...

🔹 Publication Date: Published on Jan 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.14417
• PDF: https://arxiv.org/pdf/2601.14417
• Project Page: https://sav-eng.github.io/icassp_samples.html

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Typhoon ASR Real-time: FastConformer-Transducer for Thai Automatic Speech Recognition

📝 Summary:
A 115M-parameter FastConformer-Transducer model achieves low-latency Thai speech recognition with reduced computational cost through text normalization and curriculum learning, accompanied by a benchm...

🔹 Publication Date: Published on Jan 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.13044
• PDF: https://arxiv.org/pdf/2601.13044

🔹 Models citing this paper:
https://huggingface.co/typhoon-ai/typhoon-asr-realtime
https://huggingface.co/typhoon-ai/typhoon-isan-asr-realtime
https://huggingface.co/typhoon-ai/typhoon-whisper-turbo

Datasets citing this paper:
https://huggingface.co/datasets/typhoon-ai/gigaspeech2-typhoon
https://huggingface.co/datasets/typhoon-ai/TVSpeech

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
sangkuriang: A pseudo-spectral Python library for Korteweg-de Vries soliton simulation

📝 Summary:
The Korteweg-de Vries (KdV) equation serves as a foundational model in nonlinear wave physics, describing the balance between dispersive spreading and nonlinear steepening that gives rise to solitons....
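
The balance between steepening and dispersion mentioned above can be checked numerically. The sketch below is not the sangkuriang API; it assumes the common normalization u_t + 6 u u_x + u_xxx = 0 (the post does not say which form the library uses), and the function names, domain length, and grid size are illustrative choices. It evaluates the right-hand side with FFT-based derivatives, which is the core idea of a pseudo-spectral method, and confirms that a sech^2 pulse of speed c travels without changing shape.

```python
import numpy as np

def soliton(x, t, c=1.0, x0=0.0):
    """One-soliton profile for u_t + 6 u u_x + u_xxx = 0 (assumed normalization)."""
    return 0.5 * c / np.cosh(0.5 * np.sqrt(c) * (x - c * t - x0)) ** 2

def spectral_derivative(u, k, order=1):
    """Derivative of a periodic signal, computed in Fourier space."""
    return np.real(np.fft.ifft((1j * k) ** order * np.fft.fft(u)))

def kdv_rhs(u, k):
    """u_t = -6 u u_x - u_xxx: nonlinear steepening plus dispersion."""
    return -6.0 * u * spectral_derivative(u, k, 1) - spectral_derivative(u, k, 3)

# Periodic grid, wide enough that the pulse is negligible at the edges.
length, n = 80.0, 512
x = np.linspace(-length / 2, length / 2, n, endpoint=False)
k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)

c = 1.0
u0 = soliton(x, 0.0, c)

# A wave of the form u(x - c t) satisfies u_t = -c u_x, so for an exact
# soliton the two terms in kdv_rhs must combine to pure translation.
residual = np.max(np.abs(kdv_rhs(u0, k) + c * spectral_derivative(u0, k, 1)))
print(f"max residual: {residual:.2e}")
```

A full solver would advance u in time with an integrator on top of kdv_rhs; that step is left out here to keep the sketch short.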

🔹 Publication Date: Published on Jan 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.12029
• PDF: https://arxiv.org/pdf/2601.12029
• Project Page: https://pypi.org/project/sangkuriang-ideal-solver/
• Github: https://github.com/sandyherho/sangkuriang-ideal-solver

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
UltraRAG: A Modular and Automated Toolkit for Adaptive Retrieval-Augmented Generation

📝 Summary:
UltraRAG is a RAG toolkit automating knowledge adaptation across the entire workflow from data to evaluation. It provides a user-friendly WebUI, enabling non-coders to build and optimize RAG systems for diverse scenarios.

🔹 Publication Date: Published on Mar 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.08761
• PDF: https://arxiv.org/pdf/2504.08761
• Github: https://github.com/OpenBMB/UltraRAG

==================================

#RAG #AI #LLMs #Automation #DataScience
Behavior Knowledge Merge in Reinforced Agentic Models

📝 Summary:
Reinforced Agent Merging (RAM) improves the merging of RL-trained agents by distinguishing shared parameters from task-specific ones. This preserves critical behaviors, outperforms baselines, and unlocks synergistic performance beyond that of the specialized agents.

🔹 Publication Date: Published on Jan 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.13572
• PDF: https://arxiv.org/pdf/2601.13572
• Project Page: https://xiangchi-yuan.github.io/ram-project/
• Github: https://github.com/xiangchi-yuan/mrl

==================================

#ReinforcementLearning #MultiAgentSystems #ArtificialIntelligence #DeepLearning #AgenticModels
Privacy Collapse: Benign Fine-Tuning Can Break Contextual Privacy in Language Models

📝 Summary:
Benign fine-tuning can cause privacy collapse in language models. Models lose contextual privacy reasoning despite maintaining high performance, leading to severe vulnerabilities. This silent failure reveals a critical gap in current safety evaluations for specialized agents.

🔹 Publication Date: Published on Jan 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.15220
• PDF: https://arxiv.org/pdf/2601.15220
• Github: https://github.com/parameterlab/privacy-collapse

==================================

#LLM #Privacy #AIsafety #FineTuning #AIsecurity
FlashLabs Chroma 1.0: A Real-Time End-to-End Spoken Dialogue Model with Personalized Voice Cloning

📝 Summary:
Chroma 1.0 is the first open-source real-time end-to-end spoken dialogue model with personalized voice cloning. It achieves low-latency interaction and high-fidelity voice synthesis, improving speaker similarity by 10.96% over a human baseline.

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11141
• PDF: https://arxiv.org/pdf/2601.11141
• Project Page: https://www.flashlabs.ai/flashai-voice-agents
• Github: https://github.com/FlashLabs-AI-Corp/FlashLabs-Chroma

🔹 Models citing this paper:
https://huggingface.co/FlashLabs/Chroma-4B

==================================

#ConversationalAI #VoiceCloning #RealTimeAI #OpenSourceAI #TTS
Show me the evidence: Evaluating the role of evidence and natural language explanations in AI-supported fact-checking

📝 Summary:
This study found that non-expert users consistently relied on evidence to validate AI claims in fact-checking. While natural language explanations reduced evidence use, participants still turned to evidence if explanations seemed flawed or insufficient. Evidence is a key ingredient for evaluating...

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11387
• PDF: https://arxiv.org/pdf/2601.11387

==================================

#AI #FactChecking #ExplainableAI #Evidence #InformationCredibility
GutenOCR: A Grounded Vision-Language Front-End for Documents

📝 Summary:
GutenOCR enhances vision-language models for document understanding, unifying reading, detection, and grounding via a prompt-based interface. It significantly improves grounded OCR, region and line-level OCR, and text detection on diverse documents.

🔹 Publication Date: Published on Jan 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.14490
• PDF: https://arxiv.org/pdf/2601.14490

🔹 Models citing this paper:
https://huggingface.co/rootsautomation/GutenOCR-3B

==================================

#OCR #VisionLanguageModels #DocumentAI #ComputerVision #DeepLearning
CURE-Med: Curriculum-Informed Reinforcement Learning for Multilingual Medical Reasoning

📝 Summary:
The paper addresses unreliable multilingual medical reasoning in LLMs, especially for underrepresented languages. It introduces CURE-MED, a curriculum-informed reinforcement learning framework, and the CUREMED-BENCH dataset. CURE-MED significantly improves language consistency and logical correctness...

🔹 Publication Date: Published on Jan 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.13262
• PDF: https://arxiv.org/pdf/2601.13262
• Project Page: https://cure-med.github.io/

🔹 Models citing this paper:
https://huggingface.co/Aikyam-Lab/CURE-MED-1.5B
https://huggingface.co/Aikyam-Lab/CURE-MED-3B
https://huggingface.co/Aikyam-Lab/CURE-MED-7B

==================================

#LLMs #MedicalAI #ReinforcementLearning #MultilingualNLP #AIResearch
Implicit Neural Representation Facilitates Unified Universal Vision Encoding

📝 Summary:
This paper unifies image representation learning for both recognition and generation. It uses a hyper-network for implicit neural representation with knowledge distillation to create compressed embeddings. The model achieves state-of-the-art results and enables generative capabilities.

🔹 Publication Date: Published on Jan 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.14256
• PDF: https://arxiv.org/pdf/2601.14256
• Github: https://github.com/tiktok/huvr

==================================

#ComputerVision #DeepLearning #GenerativeAI #RepresentationLearning #VisionEncoding
Motion 3-to-4: 3D Motion Reconstruction for 4D Synthesis

📝 Summary:
Motion 3-to-4 synthesizes 4D dynamic objects from monocular video by separating static 3D shape generation from motion reconstruction. It uses a canonical mesh and a transformer to predict temporally coherent vertex trajectories, achieving superior fidelity.

🔹 Publication Date: Published on Jan 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.14253
• PDF: https://arxiv.org/pdf/2601.14253

🔹 Models citing this paper:
https://huggingface.co/River-Chen/Motion324

==================================

#3DReconstruction #4DSynthesis #ComputerVision #DeepLearning #MotionCapture
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

📝 Summary:
Arbitrary order generation in diffusion LLMs, surprisingly, limits reasoning by causing premature solution space collapse. This occurs because dLLMs exploit flexibility to bypass crucial, high-uncertainty tokens. Standard Group Relative Policy Optimization without arbitrary order is more effectiv...
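
For readers unfamiliar with Group Relative Policy Optimization, the sketch below shows only its central normalization step as it is commonly described: several responses are sampled for the same prompt, and each one is scored relative to the mean and spread of that group's rewards. This is not the paper's training code; the function name, the eps constant, and the example rewards are assumptions made for illustration.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of samples for the same prompt.

    Hypothetical helper: responses above the group mean get a positive
    advantage and those below get a negative one.
    """
    rewards = np.asarray(rewards, dtype=float)
    # eps guards against division by zero when all rewards are equal.
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Four sampled answers to one prompt, rewarded 1 if correct and 0 otherwise.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Because the baseline comes from the group itself, no separate value network is needed to estimate it.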

🔹 Publication Date: Published on Jan 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.15165
• PDF: https://arxiv.org/pdf/2601.15165
• Project Page: https://nzl-thu.github.io/the-flexibility-trap
• Github: https://github.com/LeapLabTHU/JustGRPO

==================================

#LLM #DiffusionModels #NLP #AIResearch #MachineLearning