ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Beyond Pixels: Visual Metaphor Transfer via Schema-Driven Agentic Reasoning

📝 Summary:
Visual metaphor transfer enables creative AI systems to decompose abstract conceptual relationships from reference images and reapply them to new subjects through a multi-agent framework grounded in c...

🔹 Publication Date: Published on Feb 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01335
• PDF: https://arxiv.org/pdf/2602.01335

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models

📝 Summary:
Vision-DeepResearch benchmark addresses limitations in evaluating visual-textual search capabilities of multimodal models by introducing realistic evaluation conditions and improving visual retrieval ...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02185
• PDF: https://arxiv.org/pdf/2602.02185
• Project Page: https://osilly.github.io/Vision-DeepResearch/

==================================

Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

📝 Summary:
Vision-DeepResearch introduces a multimodal deep-research paradigm enabling multi-turn, multi-entity, and multi-scale visual and textual search with deep-research capabilities integrated through cold-...

🔹 Publication Date: Published on Jan 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.22060
• PDF: https://arxiv.org/pdf/2601.22060
• Project Page: https://osilly.github.io/Vision-DeepResearch/
• Github: https://github.com/Osilly/Vision-DeepResearch

==================================

Kimi K2.5: Visual Agentic Intelligence

📝 Summary:
Kimi K2.5 is an open-source multimodal agentic model that enhances text and vision processing through joint optimization techniques and introduces Agent Swarm for parallel task execution.

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02276
• PDF: https://arxiv.org/pdf/2602.02276
• Project Page: https://huggingface.co/moonshotai/Kimi-K2.5

==================================

Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation

📝 Summary:
A novel Causal Forcing method addresses the architectural gap in distilling bidirectional video diffusion models into autoregressive models by using AR teachers for ODE initialization, significantly i...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02214
• PDF: https://arxiv.org/pdf/2602.02214
• Project Page: https://thu-ml.github.io/CausalForcing.github.io/
• Github: https://thu-ml.github.io/CausalForcing.github.io/

🔹 Models citing this paper:
https://huggingface.co/zhuhz22/Causal-Forcing

==================================

Mind-Brush: Integrating Agentic Cognitive Search and Reasoning into Image Generation

📝 Summary:
Mind-Brush presents a unified agentic framework for text-to-image generation that dynamically retrieves multimodal evidence and employs reasoning tools to improve understanding of implicit user intent...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01756
• PDF: https://arxiv.org/pdf/2602.01756
• Github: https://github.com/PicoTrex/Mind-Brush

==================================

Interacted Planes Reveal 3D Line Mapping

📝 Summary:
LiP-Map presents a line-plane joint optimization framework that explicitly models learnable line and planar primitives for accurate 3D line mapping in man-made environments.

🔹 Publication Date: Published on Feb 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01296
• PDF: https://arxiv.org/pdf/2602.01296
• Github: https://github.com/calmke/LiPMAP

==================================

UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing

📝 Summary:
UniReason integrates text-to-image generation and image editing through a dual reasoning paradigm that enhances planning with world knowledge and uses editing for visual refinement, achieving superior...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02437
• PDF: https://arxiv.org/pdf/2602.02437

🔹 Models citing this paper:
https://huggingface.co/Alex11556666/UniReason

==================================

Prism: Efficient Test-Time Scaling via Hierarchical Search and Self-Verification for Discrete Diffusion Language Models

📝 Summary:
A new test-time scaling framework called Prism is introduced for discrete diffusion language models that improves reasoning performance through hierarchical trajectory search, local branching with par...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01842
• PDF: https://arxiv.org/pdf/2602.01842
• Github: https://github.com/viiika/Prism

==================================

CoDiQ: Test-Time Scaling for Controllable Difficult Question Generation

📝 Summary:
A novel framework called CoDiQ enables controllable difficulty generation for competition-level questions through test-time scaling, resulting in a corpus that significantly improves large reasoning m...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01660
• PDF: https://arxiv.org/pdf/2602.01660

🔹 Models citing this paper:
https://huggingface.co/AleXGroup/CoDiQ-Gen-8B

🔹 Datasets citing this paper:
https://huggingface.co/datasets/AleXGroup/CoDiQ-Corpus

==================================

On the Relationship Between Representation Geometry and Generalization in Deep Neural Networks

📝 Summary:
Effective dimension, an unsupervised geometric metric, strongly predicts neural network performance across different architectures and domains, showing bidirectional causality between representation g...

🔹 Publication Date: Published on Jan 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.00130
• PDF: https://arxiv.org/pdf/2602.00130

==================================

RE-TRAC: REcursive TRAjectory Compression for Deep Search Agents

📝 Summary:
Re-TRAC is an agentic framework that enhances LLM-based research agents by enabling cross-trajectory exploration and iterative reflection through structured state representations, leading to more effi...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02486
• PDF: https://arxiv.org/pdf/2602.02486

==================================

SEA-Guard: Culturally Grounded Multilingual Safeguard for Southeast Asia

📝 Summary:
Researchers developed a novel agentic data-generation framework to create culturally grounded safety datasets for Southeast Asia, resulting in multilingual safeguard models that outperform existing ap...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01618
• PDF: https://arxiv.org/pdf/2602.01618

==================================

Structured 3D Latents for Scalable and Versatile 3D Generation

📝 Summary:
A 3D generation method using a unified SLAT representation and rectified flow transformers achieves high-quality results across different formats and conditions.

🔹 Publication Date: Published on Dec 2, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2412.01506
• PDF: https://arxiv.org/pdf/2412.01506
• Github: https://github.com/Microsoft/TRELLIS

🔹 Models citing this paper:
https://huggingface.co/microsoft/TRELLIS-image-large
https://huggingface.co/microsoft/TRELLIS-text-xlarge
https://huggingface.co/microsoft/TRELLIS-text-base

🔹 Datasets citing this paper:
https://huggingface.co/datasets/JeffreyXiang/TRELLIS-500K
https://huggingface.co/datasets/argojuni0506/TRELLIS-3D
https://huggingface.co/datasets/gqk/TRELLIS-500K-fork

🔹 Spaces citing this paper:
https://huggingface.co/spaces/trellis-community/TRELLIS
https://huggingface.co/spaces/dkatz2391/Cavargas-TRELLIS-Multiple3D
https://huggingface.co/spaces/microsoft/TRELLIS

==================================

PISCES: Annotation-free Text-to-Video Post-Training via Optimal Transport-Aligned Rewards

📝 Summary:
PISCES is an annotation-free text-to-video generation method that uses dual optimal transport-aligned rewards to improve visual quality and semantic alignment without human preference annotations.

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01624
• PDF: https://arxiv.org/pdf/2602.01624

==================================

Show, Don't Tell: Morphing Latent Reasoning into Image Generation

📝 Summary:
LatentMorph integrates implicit latent reasoning into text-to-image generation through four lightweight components that enable adaptive self-refinement and improve both efficiency and cognitive alignm...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02227
• PDF: https://arxiv.org/pdf/2602.02227
• Github: https://github.com/EnVision-Research/LatentMorph

==================================

PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss

📝 Summary:
PixelGen is a pixel-space diffusion framework that uses perceptual supervision through LPIPS and DINO-based losses to generate high-quality images without requiring VAEs or latent representations.

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02493
• PDF: https://arxiv.org/pdf/2602.02493
• Project Page: https://zehong-ma.github.io/PixelGen/
• Github: https://github.com/Zehong-Ma/PixelGen

==================================

SWE-Universe: Scale Real-World Verifiable Environments to Millions

📝 Summary:
A scalable framework for constructing real-world software engineering environments from GitHub pull requests using an efficient building agent with self-verification and hacking detection capabilities...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02361
• PDF: https://arxiv.org/pdf/2602.02361

==================================

Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning

📝 Summary:
Post-training of reasoning large language models can be improved by correcting distribution mismatches between supervised fine-tuning and reinforcement learning stages through importance sampling rewe...

🔹 Publication Date: Published on Feb 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.01058
• PDF: https://arxiv.org/pdf/2602.01058
