ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents

📝 Summary:
TermiGen introduces a pipeline for generating verifiable terminal environments and resilient trajectories to improve open-weight LLMs' ability to execute complex tasks and recover from runtime errors....

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07274
• PDF: https://arxiv.org/pdf/2602.07274
• Github: https://github.com/ucsb-mlsec/terminal-bench-env

🔹 Models citing this paper:
https://huggingface.co/UCSB-SURFI/TermiGen-32B

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
UI-Venus-1.5 Technical Report

📝 Summary:
UI-Venus-1.5 is a unified GUI agent with improved performance through mid-training stages, online reinforcement learning, and model merging techniques.

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09082
• PDF: https://arxiv.org/pdf/2602.09082
• Github: https://github.com/inclusionAI/UI-Venus

🔹 Models citing this paper:
https://huggingface.co/inclusionAI/UI-Venus-1.5-8B
https://huggingface.co/inclusionAI/UI-Venus-1.5-30B-A3B
https://huggingface.co/inclusionAI/UI-Venus-1.5-2B

==================================

SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

📝 Summary:
SkillRL enables LLM agents to improve through hierarchical skill discovery and recursive policy evolution, achieving superior performance on complex tasks while reducing computational overhead.

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08234
• PDF: https://arxiv.org/pdf/2602.08234
• Github: https://github.com/aiming-lab/SkillRL

==================================

Agent Banana: High-Fidelity Image Editing with Agentic Thinking and Tooling

📝 Summary:
Agent Banana addresses challenges in instruction-based image editing through a hierarchical framework with context folding and image layer decomposition for high-fidelity, multi-turn editing at ultra-...

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09084
• PDF: https://arxiv.org/pdf/2602.09084
• Project Page: https://agent-banana.github.io/
• Github: https://github.com/taco-group/agent-banana

==================================

SCALE: Self-uncertainty Conditioned Adaptive Looking and Execution for Vision-Language-Action Models

📝 Summary:
SCALE is a novel inference strategy for Vision-Language-Action models that jointly modulates visual perception and action based on self-uncertainty, improving robustness without additional training or...

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04208
• PDF: https://arxiv.org/pdf/2602.04208
• Project Page: https://dcahn12.github.io/projects/scale/

==================================

ANCHOR: Branch-Point Data Generation for GUI Agents

📝 Summary:
A trajectory expansion framework called Anchor bootstraps scalable desktop supervision from seed demonstrations by identifying branch points and generating new trajectories through state-grounded task...

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07153
• PDF: https://arxiv.org/pdf/2602.07153

==================================

Prism: Spectral-Aware Block-Sparse Attention

📝 Summary:
Prism addresses inefficiencies in block-sparse attention for long-context LLM pre-filling by using a spectral-aware approach that improves block selection accuracy through energy-based temperature cal...

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08426
• PDF: https://arxiv.org/pdf/2602.08426
• Github: https://github.com/xinghaow99/prism

==================================

Temporal Pair Consistency for Variance-Reduced Flow Matching

📝 Summary:
Temporal Pair Consistency reduces variance in continuous-time generative models by coupling velocity predictions at paired timesteps, improving sample quality and efficiency without altering model arc...

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04908
• PDF: https://arxiv.org/pdf/2602.04908

==================================

Code2World: A GUI World Model via Renderable Code Generation

📝 Summary:
Code2World is a GUI world model that predicts next visual states by generating renderable code. It solves visual fidelity and structural control issues of prior methods, significantly boosting autonomous agent navigation performance.

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09856
• PDF: https://arxiv.org/pdf/2602.09856
• Project Page: https://amap-ml.github.io/Code2World/
• Github: https://github.com/AMAP-ML/Code2World

==================================

Chain of Mindset: Reasoning with Adaptive Cognitive Modes

📝 Summary:
A novel training-free framework called Chain of Mindset enables step-level adaptive mindset orchestration for large language models by integrating spatial, convergent, divergent, and algorithmic reaso...

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10063
• PDF: https://arxiv.org/pdf/2602.10063
• Github: https://github.com/QuantaAlpha/chain-of-mindset

==================================

VideoWorld 2: Learning Transferable Knowledge from Real-world Videos

📝 Summary:
VideoWorld 2 enables transferable knowledge learning from raw videos through a dynamic-enhanced Latent Dynamics Model that decouples action dynamics from visual appearance, achieving improved task per...

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10102
• PDF: https://arxiv.org/pdf/2602.10102
• Project Page: https://maverickren.github.io/VideoWorld2.github.io/
• Github: https://github.com/ByteDance-Seed/VideoWorld/tree/main/VideoWorld2

==================================

BagelVLA: Enhancing Long-Horizon Manipulation via Interleaved Vision-Language-Action Generation

📝 Summary:
BagelVLA is a unified Vision-Language-Action model that integrates linguistic planning, visual forecasting, and action generation through residual flow guidance for improved manipulation tasks.

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09849
• PDF: https://arxiv.org/pdf/2602.09849
• Project Page: https://cladernyjorn.github.io/BagelVLA.github.io/

==================================

DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents

📝 Summary:
Diffusion Large Language Models are optimized for search agents through enhanced reasoning capabilities and reduced latency via a parallel reasoning paradigm.

🔹 Publication Date: Published on Feb 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07035
• PDF: https://arxiv.org/pdf/2602.07035
• Project Page: https://bubble65.github.io/dllm-searcher-pub/
• Github: https://github.com/bubble65/DLLM-Searcher

==================================

Covo-Audio Technical Report

📝 Summary:
Covo-Audio is a 7B-parameter end-to-end large audio language model that processes continuous audio inputs and generates audio outputs, achieving state-of-the-art performance across speech-text modelin...

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09823
• PDF: https://arxiv.org/pdf/2602.09823

==================================

OPE: Overcoming Information Saturation in Parallel Thinking via Outline-Guided Path Exploration

📝 Summary:
Reinforcement learning with verifiable rewards is used to enhance parallel thinking in large reasoning models through outline-guided path exploration that reduces information redundancy and improves s...

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08344
• PDF: https://arxiv.org/pdf/2602.08344

==================================

SAGE: Scalable Agentic 3D Scene Generation for Embodied AI

📝 Summary:
SAGE is an agentic framework that automatically generates simulation-ready 3D environments for embodied AI by combining layout and object composition generators with evaluative critics for semantic pl...

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10116
• PDF: https://arxiv.org/pdf/2602.10116
• Project Page: https://nvlabs.github.io/sage

🔹 Datasets citing this paper:
https://huggingface.co/datasets/nvidia/SAGE-10k

==================================

Autoregressive Image Generation with Masked Bit Modeling

📝 Summary:
Discrete tokenizers can match or exceed continuous methods when properly scaled, and a new masked Bit AutoRegressive modeling approach achieves state-of-the-art results with reduced computational cost...

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09024
• PDF: https://arxiv.org/pdf/2602.09024
• Project Page: https://bar-gen.github.io/
• Github: https://bar-gen.github.io/

==================================

SHARP: Social Harm Analysis via Risk Profiles for Measuring Inequities in Large Language Models

📝 Summary:
Large language models exhibit varying levels of social risk across multiple dimensions, with significant differences in worst-case behavior that cannot be captured by traditional scalar evaluation met...

🔹 Publication Date: Published on Jan 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.21235
• PDF: https://arxiv.org/pdf/2601.21235

==================================

VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model

📝 Summary:
VLA-JEPA is a JEPA-style pretraining framework that improves vision-language-action policy learning by using leakage-free state prediction in latent space, enhancing generalization and robustness in m...

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10098
• PDF: https://arxiv.org/pdf/2602.10098
• Project Page: https://ginwind.github.io/VLA-JEPA/
• Github: https://github.com/ginwind/VLA-JEPA

==================================

OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration

📝 Summary:
OPUS is a dynamic data selection framework that improves pre-training efficiency by scoring data candidates based on optimizer-induced update projections in a stable proxy-derived target space, achiev...

🔹 Publication Date: Published on Feb 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.05400
• PDF: https://arxiv.org/pdf/2602.05400

==================================
