ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Breaking Training Bottlenecks: Effective and Stable Reinforcement Learning for Coding Models

📝 Summary:
MicroCoder-GRPO enhances code generation through improved policy optimization with innovations in truncation masking, temperature selection, and loss function adjustments, achieving superior performan...

🔹 Publication Date: Published on Mar 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.07777
• PDF: https://arxiv.org/pdf/2603.07777
• Project Page: https://github.com/ZongqianLi/MicroCoder/blob/main/README.md
• Github: https://github.com/ZongqianLi/MicroCoder

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Retrieval-Augmented Generation for Predicting Cellular Responses to Gene Perturbation

📝 Summary:
PT-RAG framework improves prediction of cellular responses to genetic perturbations by using differentiable, cell-type-aware retrieval combined with generative modeling, outperforming existing methods...

🔹 Publication Date: Published on Mar 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.07233
• PDF: https://arxiv.org/pdf/2603.07233
• Github: https://github.com/difra100/PT-RAG_ICLR

==================================

Training-free Latent Inter-Frame Pruning with Attention Recovery

📝 Summary:
LIPAR reduces video generation latency by skipping redundant latent patches. It uses Attention Recovery to maintain quality, boosting throughput by 1.45x without extra training.

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05811
• PDF: https://arxiv.org/pdf/2603.05811
• Project Page: https://dennismenn.github.io/lipar/
• Github: https://github.com/DennisMenn/lipar
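
To put the 1.45x throughput figure from the summary in perspective, here is a small arithmetic sketch. It assumes throughput means frames generated per second on a fixed workload, which the post does not state, and the helper name `relative_time_per_frame` is hypothetical.

```python
def relative_time_per_frame(throughput_gain: float) -> float:
    """Time per frame relative to the baseline, assuming throughput is
    frames per second on the same workload (an assumption, not from the post)."""
    if throughput_gain <= 0:
        raise ValueError("throughput_gain must be positive")
    return 1.0 / throughput_gain


gain = 1.45  # figure quoted in the summary
ratio = relative_time_per_frame(gain)
print(f"time per frame: {ratio:.3f} of baseline ({(1 - ratio) * 100:.1f}% less)")
```

Under that assumption, a 1.45x throughput gain corresponds to roughly 31% less time per generated frame.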

==================================

LiveWorld: Simulating Out-of-Sight Dynamics in Generative Video World Models

📝 Summary:
LiveWorld addresses the out-of-sight dynamics problem in video world models by introducing a persistent global state representation that maintains continuous evolution of dynamic entities beyond the o...

🔹 Publication Date: Published on Mar 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.07145
• PDF: https://arxiv.org/pdf/2603.07145
• Project Page: https://zichengduan.github.io/LiveWorld/index.html

==================================

Variational Flow Maps: Make Some Noise for One-Step Conditional Generation

📝 Summary:
Variational Flow Maps (VFMs) introduce a novel framework for fast, high-fidelity conditional image generation. VFMs learn an optimal initial noise distribution that respects both observations and data priors, making sampling faster than with iterative models. This allows well-calibrated conditional samples in si...

🔹 Publication Date: Published on Mar 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.07276
• PDF: https://arxiv.org/pdf/2603.07276
• Github: https://github.com/abbasmammadov/VFM

==================================

ByteFlow: Language Modeling through Adaptive Byte Compression without a Tokenizer

📝 Summary:
ByteFlow Net presents a tokenizer-free hierarchical architecture that enables language models to learn adaptive segmentation of raw byte streams through compression-driven methods while maintaining a ...

🔹 Publication Date: Published on Mar 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03583
• PDF: https://arxiv.org/pdf/2603.03583

==================================

CAST: Modeling Visual State Transitions for Consistent Video Retrieval

📝 Summary:
Current video retrieval often lacks context, leading to inconsistent narratives. CAST is a new plug-and-play adapter that predicts state-conditioned visual history to improve video consistency. It enhances retrieval performance and temporal coherence in video generation.

🔹 Publication Date: Published on Mar 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.08648
• PDF: https://arxiv.org/pdf/2603.08648
• Project Page: https://ucsc-vlaa.github.io/CAST/

==================================

Free Lunch for Pass@k? Low Cost Diverse Sampling for Diffusion Language Models

📝 Summary:
Diffusion language models suffer from redundant sampling, but a novel technique that repels samples from one another in feature space improves diversity and performance on code generation and math probl...

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04893
• PDF: https://arxiv.org/pdf/2603.04893
• Project Page: https://sean-lamont.github.io/odd/
• Github: https://github.com/sean-lamont/odd

Spaces citing this paper:
https://huggingface.co/spaces/sean-lamont/ODD-Demo
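
For readers unfamiliar with the Pass@k metric named in the title, the following is a minimal sketch of the widely used unbiased estimator, pass@k = 1 - C(n-c, k) / C(n, k), where n samples are drawn and c of them are correct. This is the standard estimator from the code-generation literature, not necessarily the exact evaluation code used in this paper; the function name `pass_at_k` is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples is correct,
    given that c out of n drawn samples passed.

    Standard estimator: 1 - C(n - c, k) / C(n, k).
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the estimate becomes 1.0
    # whenever every size-k subset must contain a correct sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Example: 10 samples drawn, 3 of them correct.
for k in (1, 5, 10):
    print(k, round(pass_at_k(10, 3, k), 4))
```

The estimator only counts how many samples are correct, which is why near-duplicate samples add little: more diverse samples raise the chance that at least one of the k drawn is a correct solution.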

==================================

Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs

📝 Summary:
Multimodal LLMs handle text rendered in images worse than the same text supplied as tokens, a modality gap influenced by rendering quality. This gap mainly stems from amplified reading errors. A self-distillation method, using pure text reasoning traces with image inputs, effectively improves visual text un...

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09095
• PDF: https://arxiv.org/pdf/2603.09095

==================================

Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion

📝 Summary:
Omni-Diffusion introduces the first any-to-any multimodal language model based on mask-based discrete diffusion models, unifying text, speech, and image processing in a single framework.

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.06577
• PDF: https://arxiv.org/pdf/2603.06577
• Project Page: https://omni-diffusion.github.io/
• Github: https://github.com/VITA-MLLM/Omni-Diffusion

🔹 Models citing this paper:
https://huggingface.co/lijiang/Omni-Diffusion

==================================

The Reasoning Trap -- Logical Reasoning as a Mechanistic Pathway to Situational Awareness

📝 Summary:
The RAISE framework demonstrates how advances in logical reasoning capabilities within large language models can lead to increasingly sophisticated forms of situational awareness, potentially resultin...

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09200
• PDF: https://arxiv.org/pdf/2603.09200

==================================

SAHOO: Safeguarded Alignment for High-Order Optimization Objectives in Recursive Self-Improvement

📝 Summary:
SAHOO provides a framework for monitoring and controlling alignment drift in self-improving AI systems through goal drift detection, constraint preservation, and regression risk quantification across ...

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.06333
• PDF: https://arxiv.org/pdf/2603.06333

==================================

MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data

📝 Summary:
MM-Zero introduces a zero-data self-evolving framework for Vision Language Models using a multi-role system (Proposer, Coder, Solver). It generates visual content and performs reasoning, and is trained with Group Relative Policy Optimization (GRPO). This improves VLM reasoning performance and offers a scalable sel...

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09206
• PDF: https://arxiv.org/pdf/2603.09206
• Github: https://github.com/zli12321/MM-Zero
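
As background on Group Relative Policy Optimization, the sketch below shows the group-relative advantage that gives the method its name: each sampled response is scored against the other responses to the same prompt instead of against a learned value function. It is a generic illustration, not MM-Zero's training code; the rewards are made up, the function name is hypothetical, and using the population standard deviation with a small `eps` is an assumption (implementations differ on this detail).

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize each reward against its own sampling group.

    advantage_i = (r_i - mean(group)) / (std(group) + eps)
    The eps term (an assumption) avoids division by zero when all rewards match.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four responses sampled for one prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print([round(a, 3) for a in advantages])
```

When every response in a group receives the same reward, all advantages are zero and that group contributes no learning signal.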

==================================

MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assistants

📝 Summary:
MiniAppBench introduces the first comprehensive benchmark for evaluating principle-driven, interactive application generation, addressing the gap in existing benchmarks that focus on static correctnes...

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09652
• PDF: https://arxiv.org/pdf/2603.09652
• Project Page: https://miniappbench.github.io/
• Github: https://github.com/MiniAppBench/miniappbench

==================================

Fish Audio S2 Technical Report

📝 Summary:
Fish Audio S2 is an open-source text-to-speech system with multi-speaker capabilities, multi-turn generation, and instruction-following control through natural-language descriptions, utilizing a multi...

🔹 Publication Date: Published on Mar 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.08823
• PDF: https://arxiv.org/pdf/2603.08823
• Project Page: https://fish.audio/
• Github: https://github.com/fishaudio/fish-speech

==================================

VLM-SubtleBench: How Far Are VLMs from Human-Level Subtle Comparative Reasoning?

📝 Summary:
VLM-SubtleBench is introduced as a benchmark for evaluating vision-language models on subtle comparative reasoning across diverse domains, revealing significant gaps between model and human performanc...

🔹 Publication Date: Published on Mar 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.07888
• PDF: https://arxiv.org/pdf/2603.07888
• Github: https://github.com/krafton-ai/VLM-SubtleBench

Datasets citing this paper:
https://huggingface.co/datasets/KRAFTON/VLM-SubtleBench

==================================

Towards a Neural Debugger for Python

📝 Summary:
Neural debuggers are language models that emulate traditional debuggers by supporting interactive control operations like stepping and breakpoint setting, enabling both forward and inverse execution p...

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09951
• PDF: https://arxiv.org/pdf/2603.09951

==================================

A Text-Native Interface for Generative Video Authoring

📝 Summary:
Everyone can write their stories in freeform text format -- it's something we all learn in school. Yet storytelling via video requires one to learn specialized and complicated tools. In this paper, we...

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09072
• PDF: https://arxiv.org/pdf/2603.09072

==================================

Stepping VLMs onto the Court: Benchmarking Spatial Intelligence in Sports

📝 Summary:
CourtSI is a large-scale spatial intelligence dataset for sports scenarios that enables evaluation and improvement of vision-language models' understanding of human motion and object interactions.

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09896
• PDF: https://arxiv.org/pdf/2603.09896
• Project Page: https://visionary-laboratory.github.io/CourtSI/
• Github: https://github.com/Visionary-Laboratory/CourtSI

Datasets citing this paper:
https://huggingface.co/datasets/Charlie019/CourtSI-1M
https://huggingface.co/datasets/Charlie019/CourtSI-Bench
https://huggingface.co/datasets/Charlie019/CourtSI-Ext

==================================

InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing

📝 Summary:
InternVL-U is a 4-billion parameter unified multimodal model that combines advanced visual generation with robust semantic understanding through specialized modular design and reasoning-centric data s...

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09877
• PDF: https://arxiv.org/pdf/2603.09877
• Github: https://github.com/OpenGVLab/InternVL-U

🔹 Models citing this paper:
https://huggingface.co/InternVL-U/InternVL-U

==================================

Streaming Autoregressive Video Generation via Diagonal Distillation

📝 Summary:
Diagonal Distillation improves video generation speed and quality by leveraging temporal context and asymmetric denoising steps while addressing error accumulation and motion coherence issues in diffu...

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09488
• PDF: https://arxiv.org/pdf/2603.09488
• Project Page: https://spherelab.ai/diagdistill
• Github: https://github.com/Sphere-AI-Lab/diagdistill

==================================
