ML Research Hub
32.3K subscribers
6.74K photos
473 videos
24 files
7.36K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning

📝 Summary:
This paper introduces process-driven image generation, an iterative method with interleaved textual and visual reasoning. It decomposes synthesis into planning, drafting, reflecting, and refining steps. Dense step-wise supervision ensures consistency and interpretability of intermediate states.

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04746
• PDF: https://arxiv.org/pdf/2604.04746

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#ImageGeneration #GenerativeAI #ArtificialIntelligence #DeepLearning #ComputerVision
TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders

📝 Summary:
TC-AE is a Vision Transformer-based architecture that improves deep compression autoencoders by addressing token space limitations and enhancing semantic structures through joint self-supervised train...

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07340
• PDF: https://arxiv.org/pdf/2604.07340
• Github: https://github.com/inclusionAI/TC-AE

🔹 Models citing this paper:
https://huggingface.co/inclusionAI/TC-AE

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
DeonticBench: A Benchmark for Reasoning over Rules

📝 Summary:
DeonticBench presents a benchmark for evaluating large language models on complex, context-specific deontic reasoning tasks drawn from real-world legal and policy domains, supporting both symbolic and...

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04443
• PDF: https://arxiv.org/pdf/2604.04443
• Project Page: https://huggingface.co/datasets/gydou/DeonticBench
• Github: https://github.com/guangyaodou/DeonticBench

🔹 Datasets citing this paper:
https://huggingface.co/datasets/gydou/DeonticBench

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
The Depth Ceiling: On the Limits of Large Language Models in Discovering Latent Planning

📝 Summary:
Research reveals that large language models can perform latent reasoning with varying depths, but there's a gap between discovering and executing multi-step planning strategies, suggesting limitations...

🔹 Publication Date: Published on Apr 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06427
• PDF: https://arxiv.org/pdf/2604.06427

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Q-Zoom: Query-Aware Adaptive Perception for Efficient Multimodal Large Language Models

📝 Summary:
Q-Zoom enhances MLLM performance by adaptively focusing computational resources on relevant visual regions through dynamic gating and self-distilled region proposal networks, achieving faster inferenc...

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06912
• PDF: https://arxiv.org/pdf/2604.06912
• Project Page: https://yuhengsss.github.io/Q-Zoom/

🔹 Models citing this paper:
https://huggingface.co/YuhengSSS/Q-Zoom-Qwen2.5VL-3B
https://huggingface.co/YuhengSSS/Q-Zoom-Qwen2.5VL-7B
https://huggingface.co/YuhengSSS/Q-Zoom-Qwen3VL-4B

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Fast Spatial Memory with Elastic Test-Time Training

📝 Summary:
Elastic Test-Time Training with fast spatial memory enables efficient 4D reconstruction through multi-chunk adaptation while maintaining stability against catastrophic forgetting.

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07350
• PDF: https://arxiv.org/pdf/2604.07350
• Project Page: https://fast-spatial-memory.github.io/
• Github: https://github.com/Mars-tin/fast-spatial-mem

🔹 Models citing this paper:
https://huggingface.co/marstin/fast-spatial-mem

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
FlowInOne: Unifying Multimodal Generation as Image-in, Image-out Flow Matching

📝 Summary:
FlowInOne presents a vision-centric multimodal generation framework that unifies diverse input modalities into a single visual representation, enabling coherent image generation and editing through a ...

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06757
• PDF: https://arxiv.org/pdf/2604.06757
• Project Page: https://csu-jpg.github.io/FlowInOne.github.io/

🔹 Models citing this paper:
https://huggingface.co/CSU-JPG/FlowInOne

🔹 Datasets citing this paper:
https://huggingface.co/datasets/CSU-JPG/VisPrompt5M
https://huggingface.co/datasets/CSU-JPG/VPBench

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
SEVerA: Verified Synthesis of Self-Evolving Agents

📝 Summary:
Formally Guarded Generative Models enable safe and correct agentic code generation by combining formal specifications with soft objectives, ensuring reliability in autonomous agent systems.

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25111
• PDF: https://arxiv.org/pdf/2603.25111
• Github: https://github.com/uiuc-focal-lab/severa

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Beyond Hard Negatives: The Importance of Score Distribution in Knowledge Distillation for Dense Retrieval

📝 Summary:
Stratified sampling improves knowledge distillation by preserving the full range of teacher scores, outperforming traditional sampling methods in retrieval tasks.
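
To make the idea of "preserving the full range of teacher scores" concrete, here is a minimal sketch of stratified sampling over teacher-scored candidates. It is not the paper's implementation: the function name `stratified_sample`, the equal-size rank bins, and the round-robin draw are all assumptions chosen for illustration.

```python
import random


def stratified_sample(scored, k, num_bins=4, seed=0):
    """Pick up to k (passage_id, teacher_score) pairs spread across the score range.

    Hypothetical helper for illustration only; the binning scheme is an assumption.
    """
    rng = random.Random(seed)
    # Rank candidates from highest to lowest teacher score.
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    n = len(ranked)
    # Cut the ranked list into contiguous strata of (nearly) equal size.
    bins = [ranked[i * n // num_bins:(i + 1) * n // num_bins] for i in range(num_bins)]
    bins = [b for b in bins if b]
    for b in bins:
        rng.shuffle(b)
    picked = []
    i = 0
    # Draw round-robin so every stratum contributes before any contributes twice.
    while len(picked) < min(k, n):
        current = bins[i % len(bins)]
        if current:
            picked.append(current.pop())
        i += 1
    return picked


if __name__ == "__main__":
    candidates = [(f"p{i}", i / 20) for i in range(20)]
    for passage_id, score in stratified_sample(candidates, k=8):
        print(passage_id, score)
```

By contrast, taking only the top-k candidates (pure hard negatives) would keep just the highest-scoring stratum and discard the rest of the teacher's score distribution.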

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04734
• PDF: https://arxiv.org/pdf/2604.04734

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Learning to Hint for Reinforcement Learning

📝 Summary:
HiLL is a reinforcement learning framework that adaptively generates hints conditioned on reasoner errors to improve learning signals and transfer performance in group relative policy optimization.
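
For context on why the learning signal matters here, the sketch below shows the group-normalized advantage commonly used in group relative policy optimization (GRPO). It is not HiLL's code and does not show the hint mechanism; the function name `group_relative_advantages` and the `eps` stabilizer are assumptions for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each rollout's reward against its own group.

    Illustrative helper (hypothetical name): A_i = (r_i - mean(r)) / (std(r) + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Every rollout in the group failed: all advantages are zero, so no gradient signal.
    print(group_relative_advantages([0.0, 0.0, 0.0, 0.0]))
    # One rollout succeeded: it gets a positive advantage, the failures a negative one.
    print(group_relative_advantages([1.0, 0.0, 0.0, 0.0]))
```

When every rollout in a group receives the same reward, the advantages collapse to zero and the update carries no signal. Under that reading, a hint that lets at least one rollout succeed would restore a non-zero signal for the group.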

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00698
• PDF: https://arxiv.org/pdf/2604.00698
• Github: https://github.com/Andree-9/HiLL

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Tunable Soft Equivariance with Guarantees

📝 Summary:
A general framework for constructing soft equivariant models through weight projection into designed subspaces is proposed, demonstrating improved performance and reduced equivariance error across mul...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26657
• PDF: https://arxiv.org/pdf/2603.26657
• Github: https://github.com/ashiq24/soft-equivariance

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens

📝 Summary:
DeltaTok encodes visual feature differences as delta tokens, and DeltaWorld generates diverse video futures with reduced parameters and computational cost through multi-hypothesis training.

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04913
• PDF: https://arxiv.org/pdf/2604.04913
• Project Page: https://deltatok.github.io
• Github: https://huggingface.co/collections/Amazon-FAR/deltatok

🔹 Models citing this paper:
https://huggingface.co/Amazon-FAR/deltatok-kinetics
https://huggingface.co/Amazon-FAR/deltaworld-kinetics
https://huggingface.co/Amazon-FAR/seg-head-vspw

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
R3PM-Net: Real-time, Robust, Real-world Point Matching Network

📝 Summary:
R3PM-Net is a lightweight, global-aware point matching network that achieves high-speed and accurate point cloud registration with competitive performance on real-world datasets.

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.05060
• PDF: https://arxiv.org/pdf/2604.05060
• Project Page: https://yasiikb.github.io/R3PM-Net/
• Github: https://github.com/YasiiKB/R3PM-Net

🔹 Datasets citing this paper:
https://huggingface.co/datasets/YasiiKB/R3PM-Net

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Qualixar OS: A Universal Operating System for AI Agent Orchestration

📝 Summary:
Qualixar OS enables universal AI agent orchestration through a comprehensive runtime environment supporting diverse LLM providers, agent frameworks, and communication protocols, featuring advanced mul...

🔹 Publication Date: Published on Apr 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06392
• PDF: https://arxiv.org/pdf/2604.06392

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AgentGL: Towards Agentic Graph Learning with LLMs via Reinforcement Learning

📝 Summary:
AgentGL is a reinforcement learning-driven framework that enables large language models to navigate and reason over complex relational data by integrating graph-native tools and curriculum learning st...

🔹 Publication Date: Published on Apr 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.05846
• PDF: https://arxiv.org/pdf/2604.05846
• Github: https://github.com/sunyuanfu/AgentGL

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
A Systematic Study of Cross-Modal Typographic Attacks on Audio-Visual Reasoning

📝 Summary:
Multi-modal typography attacks demonstrate significantly higher success rates than unimodal attacks by exploiting cross-modal vulnerabilities in audio-visual multi-modal large language models.

🔹 Publication Date: Published on Apr 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.03995
• PDF: https://arxiv.org/pdf/2604.03995
• Project Page: https://cskyl.github.io/MLLM-Typography/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Graph-Based Chain-of-Thought Pruning for Reducing Redundant Reflections in Reasoning LLMs

📝 Summary:
This paper optimizes LLM chain-of-thought reasoning by addressing redundant reflections and overthinking. It uses a graph-based framework to convert CoT into a DAG and applies dual pruning strategies to remove inefficient reflection patterns. This approach reduces reasoning tokens by 42% while ma...

🔹 Publication Date: Published on Apr 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.05643
• PDF: https://arxiv.org/pdf/2604.05643

==================================

#LLMs #ChainOfThought #AI #GraphAlgorithms #Reasoning
GenLCA: 3D Diffusion for Full-Body Avatars from In-the-Wild Videos

📝 Summary:
GenLCA generates photorealistic 3D avatars from text and images using a novel 3D diffusion model. It trains on millions of partially observable 2D videos by using a 3D tokenizer and a visibility-aware strategy to handle incomplete data. This enables superior photorealism and animatability.

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07273
• PDF: https://arxiv.org/pdf/2604.07273
• Project Page: https://onethousandwu.com/GenLCA-Page

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Combee: Scaling Prompt Learning for Self-Improving Language Model Agents

📝 Summary:
Combee scales prompt learning for self-improving language model agents, overcoming previous limitations with high parallelism. It uses parallel scans, augmented shuffling, and dynamic batch size control to achieve up to 17x speedup with better or comparable accuracy.

🔹 Publication Date: Published on Apr 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04247
• PDF: https://arxiv.org/pdf/2604.04247

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
On the Step Length Confounding in LLM Reasoning Data Selection

📝 Summary:
Researchers identified a bias in naturalness-based data selection for reasoning tasks where longer reasoning steps are preferred over higher-quality ones, and proposed two debiasing methods to improve...

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06834
• PDF: https://arxiv.org/pdf/2604.06834
• Project Page: https://wangbing1416.github.io/projects/acl2026_lengthbias.html
• Github: https://github.com/wangbing1416/ASLEC

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering

📝 Summary:
LLM agents now increasingly rely on externalized components like memory, skills, and protocols, rather than just modifying model weights. This externalization transforms complex cognitive tasks into more reliably solvable forms. Practical agent progress depends on this external cognitive infrastr...

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08224
• PDF: https://arxiv.org/pdf/2604.08224

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research