ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides

📝 Summary:
PPTAgent, a two-stage approach, improves presentation generation by analyzing reference presentations and ensuring structural and content consistency, outperforming traditional methods across content,...

🔹 Publication Date: Published on Jan 7, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2501.03936
• PDF: https://arxiv.org/pdf/2501.03936
• Hugging Face Collection: https://huggingface.co/collections/ICIP/pptagent
• Github: https://github.com/icip-cas/PPTAgent

Datasets citing this paper:
https://huggingface.co/datasets/Forceless/Zenodo10K

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Expert Threshold Routing for Autoregressive Language Modeling with Dynamic Computation Allocation and Load Balancing

📝 Summary:
Expert Threshold (ET) routing dynamically allocates computation in MoE models. Tokens are routed to experts whose individual scores exceed EMA-tracked thresholds, achieving load balance without auxiliary losses. ET lowers cross-entropy loss by 0.067 compared to Token-choice MoE.
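
The thresholding idea can be sketched in a few lines. This is a toy illustration, not the paper's implementation: the quantile-based EMA update and the fixed target load are assumptions made for the sketch.

```python
import numpy as np

def route_and_update(scores, thresholds, target_load=0.25, momentum=0.9):
    """Toy sketch of threshold-based MoE routing. A token is sent to every
    expert whose affinity score exceeds that expert's running threshold;
    each threshold is an EMA tracking the score cutoff that would admit
    `target_load` of the tokens, so balance emerges without an auxiliary
    loss term."""
    mask = scores > thresholds                      # (tokens, experts) routing mask
    # per-expert score cutoff that would admit exactly `target_load` of tokens
    cutoff = np.quantile(scores, 1.0 - target_load, axis=0)
    thresholds = momentum * thresholds + (1.0 - momentum) * cutoff
    return mask, thresholds

rng = np.random.default_rng(0)
scores = rng.normal(size=(2048, 8))                 # router affinities for a batch
thresholds = np.zeros(8)
for _ in range(100):                                # thresholds converge to the cutoff
    mask, thresholds = route_and_update(scores, thresholds)
print(mask.mean(axis=0))                            # per-expert load, near 0.25
```

Note that, unlike top-k token-choice routing, each token here may activate a variable number of experts, which is what allows computation to be allocated dynamically.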

🔹 Publication Date: Published on Mar 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.11535
• PDF: https://arxiv.org/pdf/2603.11535
• Github: https://github.com/MasterGodzilla/Expert-Threshold-Routing

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning

📝 Summary:
V-JEPA 2.1 is a self-supervised model learning dense visual representations for images and videos. It combines dense predictive loss, deep self-supervision, multi-modal tokenizers, and scaling to achieve state-of-the-art performance across various benchmarks, significantly advancing visual understanding.

🔹 Publication Date: Published on Mar 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14482
• PDF: https://arxiv.org/pdf/2603.14482
• Project Page: https://ai.meta.com/blog/v-jepa-2-world-model-benchmarks/
• Github: https://github.com/facebookresearch/vjepa2

==================================

#SelfSupervisedLearning #ComputerVision #DeepLearning #AI #VideoUnderstanding
From Prior to Pro: Efficient Skill Mastery via Distribution Contractive RL Finetuning

📝 Summary:
DICE-RL refines pretrained generative robot policies via reinforcement learning distribution contraction. It boosts high-success behaviors, leading to stable, sample-efficient mastery of complex manipulation from pixels on real robots.

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.10263
• PDF: https://arxiv.org/pdf/2603.10263
• Project Page: https://zhanyisun.github.io/dice.rl.2026/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AdapterTune: Zero-Initialized Low-Rank Adapters for Frozen Vision Transformers

📝 Summary:
AdapterTune introduces zero-initialized low-rank adapters for Vision Transformers, addressing optimization instability and capacity issues. This method prevents representation drift and significantly improves accuracy, often outperforming full fine-tuning with fewer parameters.
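
The zero-initialization trick is easy to see in a minimal sketch. This is illustrative code, not the paper's implementation; the shapes, rank, and init scale are assumptions.

```python
import numpy as np

class ZeroInitLoRA:
    """Illustrative low-rank adapter on a frozen weight. The down-projection
    A is randomly initialized while the up-projection B starts at zero, so
    the adapted layer exactly reproduces the frozen layer at step 0 -- the
    "zero-initialized" property that avoids representation drift."""
    def __init__(self, w_frozen, rank=4, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = w_frozen.shape
        self.w = w_frozen                                    # frozen pretrained weight
        self.A = rng.normal(scale=0.02, size=(rank, d_in))   # trainable down-projection
        self.B = np.zeros((d_out, rank))                     # trainable, zero at init

    def __call__(self, x):
        # frozen path + low-rank residual; the residual is exactly 0 at init
        return x @ self.w.T + x @ self.A.T @ self.B.T

rng = np.random.default_rng(1)
w = rng.normal(size=(16, 32))
layer = ZeroInitLoRA(w)
x = rng.normal(size=(4, 32))
# at initialization the adapted output equals the frozen layer output
assert np.allclose(layer(x), x @ w.T)
```

Only A and B are trained (rank × (d_in + d_out) parameters), which is why such adapters can beat full fine-tuning on parameter efficiency.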

🔹 Publication Date: Published on Mar 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14706
• PDF: https://arxiv.org/pdf/2603.14706
• Github: https://github.com/salimkhazem/adaptertune

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
OSM-based Domain Adaptation for Remote Sensing VLMs

📝 Summary:
A self-contained domain adaptation framework for vision-language models in remote sensing uses OpenStreetMap data and optical character recognition to generate captions without requiring external teachers.

🔹 Publication Date: Published on Mar 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.11804
• PDF: https://arxiv.org/pdf/2603.11804

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding

📝 Summary:
A video diffusion model is repurposed as a latent world simulator to enhance multimodal large language models with implicit 3D structural priors and physical laws through spatiotemporal feature extraction.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19235
• PDF: https://arxiv.org/pdf/2603.19235
• Project Page: https://github.com/H-EmbodVis/VEGA-3D
• Github: https://github.com/H-EmbodVis/VEGA-3D

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
SAMA: Factorized Semantic Anchoring and Motion Alignment for Instruction-Guided Video Editing

📝 Summary:
SAMA presents a factorized approach to video editing that separates semantic anchoring from motion modeling, enabling instruction-guided edits with preserved motion through pre-trained motion restoration.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19228
• PDF: https://arxiv.org/pdf/2603.19228
• Project Page: https://cynthiazxy123.github.io/SAMA/
• Github: https://github.com/Cynthiazxy123/SAMA

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens

📝 Summary:
CubiD is a discrete generation model for high-dimensional representations that enables fine-grained masking and learns rich correlations across spatial positions while maintaining fixed generation steps.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19232
• PDF: https://arxiv.org/pdf/2603.19232
• Github: https://github.com/YuqingWang1029/CubiD

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Memento-Skills: Let Agents Design Agents

📝 Summary:
A generalist language model agent system autonomously designs and improves task-specific agents through memory-based reinforcement learning with stateful prompts and skill libraries.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18743
• PDF: https://arxiv.org/pdf/2603.18743
• Project Page: https://memento.run/
• Github: https://github.com/Memento-Teams/Memento-Skills

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World

📝 Summary:
F2LLM-v2 is a multilingual embedding model family trained on 60 million samples across 200+ languages, achieving superior performance through LLM-based training, matryoshka learning, pruning, and distillation.
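
Matryoshka learning trains nested prefixes of one embedding so that truncated vectors remain usable on their own. A minimal sketch of the idea, where the dimensions and the similarity objective are illustrative assumptions rather than the paper's recipe:

```python
import numpy as np

def matryoshka_losses(emb, target_sim, dims=(64, 128, 256)):
    """Apply the same similarity objective to nested prefixes of the
    embedding; each prefix is re-normalized so a truncated vector is
    itself a valid embedding at inference time."""
    losses = []
    for d in dims:
        sub = emb[:, :d]
        sub = sub / np.linalg.norm(sub, axis=1, keepdims=True)  # renormalize prefix
        sim = sub @ sub.T                                       # cosine similarities
        losses.append(float(np.mean((sim - target_sim) ** 2)))
    return losses

rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 256))
target_sim = np.eye(8)              # toy target: items similar only to themselves
print(matryoshka_losses(emb, target_sim))   # one loss term per nested dimension
```

Summing the per-dimension losses during training is what lets users later truncate the embedding to trade accuracy for storage without retraining.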

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19223
• PDF: https://arxiv.org/pdf/2603.19223
• Project Page: https://huggingface.co/collections/codefuse-ai/f2llm

🔹 Models citing this paper:
https://huggingface.co/codefuse-ai/F2LLM-v2-8B-Preview
https://huggingface.co/codefuse-ai/F2LLM-v2-0.6B-Preview
https://huggingface.co/codefuse-ai/F2LLM-v2-1.7B

Datasets citing this paper:
https://huggingface.co/datasets/codefuse-ai/F2LLM-v2

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
LVOmniBench: Pioneering Long Audio-Video Understanding Evaluation for Omnimodal LLMs

📝 Summary:
This long-form audio-visual comprehension benchmark reveals significant challenges for current omnimodal large language models in handling extended multi-modal inputs.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19217
• PDF: https://arxiv.org/pdf/2603.19217
• Project Page: https://kd-tao.github.io/LVOmniBench/
• Github: https://github.com/KD-TAO/LVOmniBench

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
FASTER: Rethinking Real-Time Flow VLAs

📝 Summary:
Fast Action Sampling for ImmediaTE Reaction (FASTER) reduces real-time reaction latency in Vision-Language-Action models by adapting sampling schedules to prioritize immediate actions while maintaining...

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19199
• PDF: https://arxiv.org/pdf/2603.19199
• Project Page: https://innovator-zero.github.io/FASTER
• Github: https://github.com/innovator-zero/FASTER

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation

📝 Summary:
Nemotron-Cascade 2 is a 30B parameter Mixture-of-Experts model with 3B activated parameters that achieves exceptional reasoning and agentic capabilities, matching frontier open models despite its comp...

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19220
• PDF: https://arxiv.org/pdf/2603.19220

🔹 Models citing this paper:
https://huggingface.co/nvidia/Nemotron-Cascade-2-30B-A3B

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
MHPO: Modulated Hazard-aware Policy Optimization for Stable Reinforcement Learning

📝 Summary:
Modulated Hazard-aware Policy Optimization introduces a Log-Fidelity Modulator and Decoupled Hazard Penalty to stabilize reinforcement learning by controlling importance ratios and regulating asymmetric...

🔹 Publication Date: Published on Mar 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.16929
• PDF: https://arxiv.org/pdf/2603.16929

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Matryoshka Gaussian Splatting

📝 Summary:
Matryoshka Gaussian Splatting enables continuous level-of-detail rendering by training a single ordered set of Gaussians that maintains full-capacity quality while allowing smooth quality-scaling trade-offs.
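
With an ordered Gaussian set, continuous level of detail reduces to taking a prefix. A toy sketch of that selection step (the attribute layout is an assumption; the actual rasterization is omitted):

```python
import numpy as np

def lod_prefix(positions, opacities, fraction):
    """Toy LOD selection for an *ordered* Gaussian set: because training
    orders Gaussians by importance, any prefix is a coherent, coarser
    version of the scene, so detail scales continuously with `fraction`."""
    n = max(1, int(len(positions) * fraction))
    return positions[:n], opacities[:n]

rng = np.random.default_rng(0)
pos = rng.normal(size=(10_000, 3))        # Gaussian centers, importance-ordered
opa = rng.uniform(size=10_000)
coarse_pos, coarse_opa = lod_prefix(pos, opa, 0.1)   # render with 10% of Gaussians
```

The point of the ordered training is that no per-level pruning or retraining is needed at render time; the slice above is the entire LOD mechanism.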

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19234
• PDF: https://arxiv.org/pdf/2603.19234
• Github: https://github.com/ZhilinGuo/matryoshka-gaussian-splatting

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Reasoning over mathematical objects: on-policy reward modeling and test time aggregation

📝 Summary:
This paper introduces Principia, a new dataset for deriving mathematical objects, and training recipes using on-policy LLM judges. These methods significantly improve model performance and enable cross-format generalization in reasoning tasks, while also scaling test-time compute.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18886
• PDF: https://arxiv.org/pdf/2603.18886

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents

📝 Summary:
Reinforcement learning infrastructure for multi-turn LLM agents that provides scalable rollout services and standardized sandbox environments for complex interactive tasks. AI-generated summary Multi-...

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18815
• PDF: https://arxiv.org/pdf/2603.18815
• Github: https://github.com/NVIDIA-NeMo/ProRL-Agent-Server

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
COT-FM: Cluster-wise Optimal Transport Flow Matching

📝 Summary:
COT-FM enhances Flow Matching by clustering target samples and assigning dedicated source distributions. This creates straighter probability paths, enabling faster and more reliable generation with improved quality across diverse tasks.
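
The cluster-wise pairing can be sketched as follows. The per-cluster Gaussian source below is an illustrative assumption; the point is that drawing each target's source sample from a distribution tied to its own cluster keeps x0 and x1 close, giving straighter interpolation paths than one shared source for all targets.

```python
import numpy as np

def clusterwise_fm_pairs(x1, labels, t, rng):
    """Sketch of cluster-wise source assignment for flow matching:
    each target sample is paired with a source sample drawn near its
    own cluster, then the standard linear interpolant and velocity
    regression target are formed."""
    x0 = np.empty_like(x1)
    for c in np.unique(labels):
        idx = labels == c
        mu = x1[idx].mean(axis=0)                     # cluster-specific source mean
        x0[idx] = rng.normal(loc=mu, scale=0.1, size=x1[idx].shape)
    xt = (1 - t)[:, None] * x0 + t[:, None] * x1      # linear interpolant at time t
    velocity = x1 - x0                                # flow-matching regression target
    return xt, velocity

rng = np.random.default_rng(0)
x1 = np.concatenate([rng.normal(-5, 1, (64, 2)), rng.normal(5, 1, (64, 2))])
labels = np.repeat([0, 1], 64)
xt, v = clusterwise_fm_pairs(x1, labels, rng.uniform(size=128), rng)
```

With a single shared source, pairs spanning distant clusters would produce long, crossing paths; the cluster-wise assignment shortens the velocity targets and hence the number of integration steps needed at sampling time.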

🔹 Publication Date: Published on Mar 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.13395
• PDF: https://arxiv.org/pdf/2603.13395
• Project Page: https://embodiedai-ntu.github.io/cotfm/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer

📝 Summary:
A three-stage framework bridges semantic and kinematic conditions using discrete tokens and diffusion synthesis. Its core MoTok tokenizer achieves compact high-fidelity tokens, significantly boosting controllability, fidelity, and reducing token usage under strong kinematic constraints.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19227
• PDF: https://arxiv.org/pdf/2603.19227
• Project Page: https://rheallyc.github.io/projects/motok/
• Github: https://github.com/rheallyc/MoTok

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Cognitive Mismatch in Multimodal Large Language Models for Discrete Symbol Understanding

📝 Summary:
Top-tier MLLMs demonstrate limited capability in processing discrete symbols despite strong performance in complex reasoning, revealing a cognitive mismatch between visual perception and symbolic understanding.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18472
• PDF: https://arxiv.org/pdf/2603.18472

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research