ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
PresentBench: A Fine-Grained Rubric-Based Benchmark for Slide Generation

📝 Summary:
Slides serve as a critical medium for conveying information in presentation-oriented scenarios such as academia, education, and business. Despite their importance, creating high-quality slide decks re...

🔹 Publication Date: Published on Mar 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.07244
• PDF: https://arxiv.org/pdf/2603.07244

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Agentic Planning with Reasoning for Image Styling via Offline RL

📝 Summary:
This paper presents an agentic offline reinforcement learning framework for complex image styling. It uses structured planning with chain-of-thought reasoning and a tool library to decompose editing tasks. This approach significantly improves performance over direct prompting, validated by human ...

🔹 Publication Date: Published on Mar 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.07148
• PDF: https://arxiv.org/pdf/2603.07148

Datasets citing this paper:
https://huggingface.co/datasets/subhojyoti1990/image-agent-styling

==================================

Sparse-BitNet: 1.58-bit LLMs are Naturally Friendly to Semi-Structured Sparsity

📝 Summary:
Sparse-BitNet demonstrates that 1.58-bit quantized models work better with N:M sparsity than full-precision models do, achieving stable training and improved efficiency across different scales and regimes. ...
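
A minimal sketch of the two ingredients named in the summary, not Sparse-BitNet's actual training recipe. N:M sparsity keeps the N largest-magnitude weights in every group of M consecutive weights, and 1.58-bit (ternary) quantization maps weights to {-1, 0, 1}. The function names, the 2:4 pattern, the absmean scaling, and the prune-then-quantize order are all assumptions chosen for illustration.

```python
def nm_sparsify(weights, n=2, m=4):
    """Keep the n largest-magnitude values in each group of m; zero the rest."""
    if len(weights) % m != 0:
        raise ValueError("number of weights must be a multiple of m")
    out = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Indices of the n largest magnitudes inside this group.
        order = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(order[:n])
        out.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return out


def ternary_quantize(weights, eps=1e-8):
    """Scale by the mean absolute weight, round, and clip to {-1, 0, 1}."""
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


# Hypothetical weights, used only to exercise the two functions.
weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
sparse = nm_sparsify(weights, n=2, m=4)
ternary, scale = ternary_quantize(sparse)
print(sparse)
print(ternary)
```

Ternary weights already contain zeros, so a 2:4 mask only has to guarantee that at least two of every four entries are zero. This sketch does not measure how much accuracy that costs.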

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05168
• PDF: https://arxiv.org/pdf/2603.05168
• Github: https://github.com/AAzdi/Sparse-BitNet

==================================

MedSteer: Counterfactual Endoscopic Synthesis via Training-Free Activation Steering

📝 Summary:
MedSteer is a training-free framework for generating counterfactual medical images. It steers diffusion model activations along pathology vectors to modify concepts while preserving underlying image structure. This method outperforms existing techniques in concept modification and significantly i...
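
A generic sketch of activation steering in its simplest form, not MedSteer's actual pipeline. A concept direction is estimated here as the normalized difference between mean activations with and without the concept, then added to an activation with a signed strength. The difference-of-means estimate, the plain-list vectors, and all function names are assumptions for illustration. The summary does not say how the pathology vectors are computed or where in the diffusion model they are applied.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    count = len(rows)
    return [sum(column) / count for column in zip(*rows)]


def steering_direction(with_concept, without_concept):
    """Unit vector pointing from the 'without' mean toward the 'with' mean."""
    mu_pos = mean_vector(with_concept)
    mu_neg = mean_vector(without_concept)
    diff = [p - q for p, q in zip(mu_pos, mu_neg)]
    norm = sum(d * d for d in diff) ** 0.5
    if norm == 0:
        raise ValueError("the two activation sets have identical means")
    return [d / norm for d in diff]


def steer(activation, direction, alpha):
    """Shift an activation along the direction; negative alpha suppresses it."""
    return [a + alpha * d for a, d in zip(activation, direction)]


# Hypothetical 2-D activations, used only to exercise the functions.
direction = steering_direction([[2.0, 0.0], [4.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]])
print(steer([1.0, 1.0], direction, alpha=-2.0))
```

Only the component along the direction changes, which is one way such an edit could leave the rest of the representation untouched.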

🔹 Publication Date: Published on Mar 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.07066
• PDF: https://arxiv.org/pdf/2603.07066
• Github: https://github.com/phamtrongthang123/medsteer

==================================

Scaling Data Difficulty: Improving Coding Models via Reinforcement Learning on Fresh and Challenging Problems

📝 Summary:
A four-stage data processing framework with LLM-based difficulty filtering creates a high-quality code generation dataset that significantly improves model performance on challenging problems. AI-gene...

🔹 Publication Date: Published on Mar 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.07779
• PDF: https://arxiv.org/pdf/2603.07779
• Project Page: https://github.com/ZongqianLi/MicroCoder/blob/main/README.md

==================================

Breaking Training Bottlenecks: Effective and Stable Reinforcement Learning for Coding Models

📝 Summary:
MicroCoder-GRPO enhances code generation through improved policy optimization with innovations in truncation masking, temperature selection, and loss function adjustments, achieving superior performan...

🔹 Publication Date: Published on Mar 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.07777
• PDF: https://arxiv.org/pdf/2603.07777
• Project Page: https://github.com/ZongqianLi/MicroCoder/blob/main/README.md
• Github: https://github.com/ZongqianLi/MicroCoder

==================================

Retrieval-Augmented Generation for Predicting Cellular Responses to Gene Perturbation

📝 Summary:
PT-RAG framework improves prediction of cellular responses to genetic perturbations by using differentiable, cell-type-aware retrieval combined with generative modeling, outperforming existing methods...

🔹 Publication Date: Published on Mar 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.07233
• PDF: https://arxiv.org/pdf/2603.07233
• Github: https://github.com/difra100/PT-RAG_ICLR

==================================

Training-free Latent Inter-Frame Pruning with Attention Recovery

📝 Summary:
LIPAR reduces video generation latency by skipping redundant latent patches. It uses Attention Recovery to maintain quality, boosting throughput by 1.45x without extra training.

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05811
• PDF: https://arxiv.org/pdf/2603.05811
• Project Page: https://dennismenn.github.io/lipar/
• Github: https://github.com/DennisMenn/lipar

==================================

LiveWorld: Simulating Out-of-Sight Dynamics in Generative Video World Models

📝 Summary:
LiveWorld addresses the out-of-sight dynamics problem in video world models by introducing a persistent global state representation that maintains continuous evolution of dynamic entities beyond the o...

🔹 Publication Date: Published on Mar 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.07145
• PDF: https://arxiv.org/pdf/2603.07145
• Project Page: https://zichengduan.github.io/LiveWorld/index.html

==================================

Variational Flow Maps: Make Some Noise for One-Step Conditional Generation

📝 Summary:
Variational Flow Maps (VFMs) introduce a novel framework for fast, high-fidelity conditional image generation. VFMs learn an optimal initial noise distribution that respects observations and data priors, which speeds up sampling relative to iterative models. This allows well-calibrated conditional samples in si...

🔹 Publication Date: Published on Mar 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.07276
• PDF: https://arxiv.org/pdf/2603.07276
• Github: https://github.com/abbasmammadov/VFM

==================================

ByteFlow: Language Modeling through Adaptive Byte Compression without a Tokenizer

📝 Summary:
ByteFlow Net presents a tokenizer-free hierarchical architecture that enables language models to learn adaptive segmentation of raw byte streams through compression-driven methods while maintaining a ...

🔹 Publication Date: Published on Mar 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03583
• PDF: https://arxiv.org/pdf/2603.03583

==================================

CAST: Modeling Visual State Transitions for Consistent Video Retrieval

📝 Summary:
Current video retrieval often lacks context, leading to inconsistent narratives. CAST is a new plug-and-play adapter that predicts state-conditioned visual history to improve video consistency. It enhances retrieval performance and temporal coherence in video generation.

🔹 Publication Date: Published on Mar 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.08648
• PDF: https://arxiv.org/pdf/2603.08648
• Project Page: https://ucsc-vlaa.github.io/CAST/

==================================

Free Lunch for Pass@k? Low Cost Diverse Sampling for Diffusion Language Models

📝 Summary:
Diffusion language models suffer from redundant sampling, but a novel technique that repels samples from each other's feature space improves diversity and performance on code generation and math probl...
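
For context on the metric in the title, here is a short sketch of the standard unbiased pass@k estimator used in code-generation evaluation: with n samples of which c are correct, pass@k = 1 - C(n-c, k) / C(n, k). Whether the paper computes pass@k in exactly this way is an assumption, and the function name is illustrative.

```python
from math import comb


def pass_at_k(n, c, k):
    """Probability that at least one of k samples drawn from n is correct.

    n: total samples generated, c: samples that passed, k: draw size.
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # comb returns 0 when n - c < k, so the result is then exactly 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical counts: 10 samples per problem, compared at k = 5.
print(pass_at_k(n=10, c=1, k=5))
print(pass_at_k(n=10, c=3, k=5))
```

The estimator depends only on how many of the n samples are correct. Diverse sampling can therefore raise pass@k only by making at least one correct sample appear on more problems, which near-duplicate samples do not help with.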

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04893
• PDF: https://arxiv.org/pdf/2603.04893
• Project Page: https://sean-lamont.github.io/odd/
• Github: https://github.com/sean-lamont/odd

Spaces citing this paper:
https://huggingface.co/spaces/sean-lamont/ODD-Demo

==================================

Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs

📝 Summary:
Multimodal LLMs process text rendered in images less reliably than the same text given as tokens, a modality gap influenced by rendering quality. This gap mainly stems from amplified reading errors. A self-distillation method, using pure text reasoning traces with image inputs, effectively improves visual text un...

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09095
• PDF: https://arxiv.org/pdf/2603.09095

==================================

Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion

📝 Summary:
Omni-Diffusion introduces the first any-to-any multimodal language model based on mask-based discrete diffusion models, unifying text, speech, and image processing in a single framework. AI-generated ...

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.06577
• PDF: https://arxiv.org/pdf/2603.06577
• Project Page: https://omni-diffusion.github.io/
• Github: https://github.com/VITA-MLLM/Omni-Diffusion

🔹 Models citing this paper:
https://huggingface.co/lijiang/Omni-Diffusion

==================================

The Reasoning Trap -- Logical Reasoning as a Mechanistic Pathway to Situational Awareness

📝 Summary:
The RAISE framework demonstrates how advances in logical reasoning capabilities within large language models can lead to increasingly sophisticated forms of situational awareness, potentially resultin...

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09200
• PDF: https://arxiv.org/pdf/2603.09200

==================================

SAHOO: Safeguarded Alignment for High-Order Optimization Objectives in Recursive Self-Improvement

📝 Summary:
SAHOO provides a framework for monitoring and controlling alignment drift in self-improving AI systems through goal drift detection, constraint preservation, and regression risk quantification across ...

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.06333
• PDF: https://arxiv.org/pdf/2603.06333

==================================

MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data

📝 Summary:
MM-Zero introduces a zero-data self-evolving framework for Vision Language Models using a multi-role system (Proposer, Coder, Solver). It generates visual content and performs reasoning, and is trained with Group Relative Policy Optimization. This improves VLM reasoning performance and offers a scalable sel...
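
A minimal sketch of the core idea behind Group Relative Policy Optimization as it is usually described: several responses are sampled for the same prompt, and each response's advantage is its reward normalized by the group's mean and standard deviation, so no separate value model is needed. The function name, the population standard deviation, and the epsilon are assumptions for illustration. MM-Zero's reward design and training loop are not shown in the summary.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the other samples for the same prompt."""
    if not rewards:
        raise ValueError("rewards must not be empty")
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled responses to one prompt.
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

When every response in a group receives the same reward, all advantages are zero and that prompt contributes no learning signal.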

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09206
• PDF: https://arxiv.org/pdf/2603.09206
• Github: https://github.com/zli12321/MM-Zero

==================================

MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assistants

📝 Summary:
MiniAppBench introduces the first comprehensive benchmark for evaluating principle-driven, interactive application generation, addressing the gap in existing benchmarks that focus on static correctnes...

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09652
• PDF: https://arxiv.org/pdf/2603.09652
• Project Page: https://miniappbench.github.io/
• Github: https://github.com/MiniAppBench/miniappbench

==================================

Fish Audio S2 Technical Report

📝 Summary:
Fish Audio S2 is an open-source text-to-speech system with multi-speaker capabilities, multi-turn generation, and instruction-following control through natural-language descriptions, utilizing a multi...

🔹 Publication Date: Published on Mar 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.08823
• PDF: https://arxiv.org/pdf/2603.08823
• Project Page: https://fish.audio/
• Github: https://github.com/fishaudio/fish-speech

==================================

VLM-SubtleBench: How Far Are VLMs from Human-Level Subtle Comparative Reasoning?

📝 Summary:
VLM-SubtleBench is introduced as a benchmark for evaluating vision-language models on subtle comparative reasoning across diverse domains, revealing significant gaps between model and human performanc...

🔹 Publication Date: Published on Mar 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.07888
• PDF: https://arxiv.org/pdf/2603.07888
• Github: https://github.com/krafton-ai/VLM-SubtleBench

Datasets citing this paper:
https://huggingface.co/datasets/KRAFTON/VLM-SubtleBench

==================================
