ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixes

📝 Summary:
On-policy distillation for LLMs suffers from fragile token-level signals and unreliable teacher guidance. This paper introduces teacher top-K local support matching with truncated reverse-KL, top-p sampling, and special-token masking to achieve stable optimization and improved performance.
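As a rough illustration of the summary above, here is a minimal sketch of a reverse-KL loss restricted to the teacher's top-K tokens at one position. This is one plausible reading of "top-K local support matching with truncated reverse-KL", not the paper's implementation; the function names and the choice to renormalize both distributions over the support are assumptions.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a list of floats.
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def topk_reverse_kl(student_logits, teacher_logits, k):
    # Hypothetical helper: pick the teacher's k highest-logit tokens.
    support = sorted(
        range(len(teacher_logits)),
        key=lambda i: teacher_logits[i],
        reverse=True,
    )[:k]
    # Assumption: renormalize both distributions over that support only.
    log_s = log_softmax([student_logits[i] for i in support])
    log_t = log_softmax([teacher_logits[i] for i in support])
    # Reverse KL: expectation under the student of log(student / teacher).
    return sum(math.exp(ls) * (ls - lt) for ls, lt in zip(log_s, log_t))
```

With identical student and teacher logits the loss is zero, and with k=1 it is always zero because the support holds a single token.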

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25562
• PDF: https://arxiv.org/pdf/2603.25562
• Project Page: https://www.notion.so/yuqianfu/Revisiting-On-Policy-Distillation-Empirical-Failure-Modes-and-Simple-Fixes-31dd5cc40dd181f89eead3de7181df1d
• Github: https://github.com/hhh675597/revisiting_opd

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#OnPolicyDistillation #LLMs #MachineLearning #DeepLearning #NLP
MemMA: Coordinating the Memory Cycle through Multi-Agent Reasoning and In-Situ Self-Evolution

📝 Summary:
MemMA is a multi-agent framework that coordinates the memory cycle in LLM agents. It uses a Meta-Thinker for strategic guidance and in-situ self-evolving repair for memory construction and retrieval. MemMA consistently outperforms existing baselines.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18718
• PDF: https://arxiv.org/pdf/2603.18718
• Github: https://github.com/ventr1c/memma

==================================

#LLM #MultiAgentSystems #AIMemory #AIResearch #ArtificialIntelligence
Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration

📝 Summary:
Calibri enhances Diffusion Transformers by adding a single learned scaling parameter to improve generative quality. This parameter-efficient method, optimizing only ~100 parameters, reduces inference steps across various text-to-image models while maintaining high-quality outputs.

🔹 Publication Date: Published on Mar 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.24800
• PDF: https://arxiv.org/pdf/2603.24800
• Project Page: https://v-gen-ai.github.io/Calibri-page/
• Github: https://github.com/v-gen-ai/Calibri

🔹 Models citing this paper:
https://huggingface.co/v-gen-ai/flux-calibri-gates
https://huggingface.co/v-gen-ai/qwen-calibri

==================================

#DiffusionModels #GenerativeAI #AIResearch #MachineLearning #DeepLearning
AVControl: Efficient Framework for Training Audio-Visual Controls

📝 Summary:
AVControl efficiently enables modular audio-visual generation by training diverse controls as separate LoRA adapters on a parallel canvas in LTX-2. It achieves superior performance on various tasks including depth and pose guidance, requiring minimal computational resources.

🔹 Publication Date: Published on Mar 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.24793
• PDF: https://arxiv.org/pdf/2603.24793
• Project Page: https://matanby.github.io/AVControl/

==================================

#AudioVisualAI #GenerativeAI #LoRA #EfficientAI #DeepLearning
PMT: Plain Mask Transformer for Image and Video Segmentation with Frozen Vision Encoders

📝 Summary:
PMT introduces a Plain Mask Decoder for fast image and video segmentation using frozen Vision Foundation Model encoders. This preserves VFM multi-task sharing, achieving competitive accuracy and significant speed improvements over prior methods.

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25398
• PDF: https://arxiv.org/pdf/2603.25398
• Github: https://github.com/tue-mps/pmt

==================================

#ImageSegmentation #VideoSegmentation #Transformers #ComputerVision #DeepLearning
IQuest-Coder-V1 Technical Report

📝 Summary:
The IQuest-Coder-V1 series presents new code LLMs using a multi-stage training paradigm to capture dynamic software logic. This approach achieves state-of-the-art performance in agentic software engineering and competitive programming tasks. The Loop variant also optimizes deployment efficiency.

🔹 Publication Date: Published on Mar 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.16733
• PDF: https://arxiv.org/pdf/2603.16733
• Project Page: https://iquestlab.github.io/release-1.0-2603/index.html
• Github: https://github.com/IQuestLab/IQuest-Coder-V1

==================================

#CodeLLM #SoftwareEngineering #LargeLanguageModels #AIResearch #MachineLearning
Nudging Hidden States: Training-Free Model Steering for Chain-of-Thought Reasoning in Large Audio-Language Models

📝 Summary:
This paper introduces training-free inference-time model steering to enhance Chain-of-Thought reasoning in Large Audio-Language Models. It achieves accuracy gains up to 4.4% and shows cross-modal transfer, where text-derived steering vectors efficiently guide speech reasoning. This positions mode...
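For readers unfamiliar with inference-time steering, a minimal sketch of the general idea follows: a steering vector is taken as the difference between mean hidden states for two sets of prompts, then added to a hidden state with a scale factor. This is the generic activation-addition recipe, not necessarily the paper's procedure; the function names and the `alpha` parameter are hypothetical.

```python
def mean_vector(vectors):
    # Element-wise mean of equally sized hidden-state vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def steering_vector(with_cot, without_cot):
    # Assumption: direction = mean(states with reasoning) - mean(states without).
    mean_with = mean_vector(with_cot)
    mean_without = mean_vector(without_cot)
    return [a - b for a, b in zip(mean_with, mean_without)]

def steer(hidden, vec, alpha=1.0):
    # Nudge one hidden state along the steering direction; no weights change.
    return [h + alpha * v for h, v in zip(hidden, vec)]
```

In this sketch the vectors could be derived from text prompts and applied to speech inputs, which is the kind of cross-modal transfer the summary describes.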

🔹 Publication Date: Published on Mar 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14636
• PDF: https://arxiv.org/pdf/2603.14636

==================================

#AI #MachineLearning #LALMs #ChainOfThought #ModelSteering
Pixel-level Scene Understanding in One Token: Visual States Need What-is-Where Composition

📝 Summary:
CroBo is a visual state representation framework that learns what-is-where composition for robotics. It uses global-to-local reconstruction to encode scene element identities and spatial locations in a compact token. This enables tracking scene dynamics for sequential decision making.

🔹 Publication Date: Published on Mar 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.13904
• PDF: https://arxiv.org/pdf/2603.13904
• Project Page: https://seokminlee-chris.github.io/CroBo-ProjectPage/
• Github: https://github.com/SeokminLee-Chris/CroBo

==================================

#Robotics #ComputerVision #SceneUnderstanding #AI #StateRepresentation
VFIG: Vectorizing Complex Figures in SVG with Vision-Language Models

📝 Summary:
VFIG is a vision-language model that converts raster images into Scalable Vector Graphics (SVG). It employs a 66K dataset and hierarchical training for high-fidelity conversion, outperforming open-source models and matching proprietary ones.

🔹 Publication Date: Published on Mar 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.24575
• PDF: https://arxiv.org/pdf/2603.24575
• Project Page: https://vfig-proj.github.io/
• Github: https://github.com/RAIVNLab/VFig

🔹 Models citing this paper:
https://huggingface.co/XunmeiLiu/VFIG-4B

Spaces citing this paper:
https://huggingface.co/spaces/allenai/VFig-Image2SVG-Demo

==================================

#VisionLanguageModels #SVG #VectorGraphics #AI #ComputerVision
Can MLLMs Read Students' Minds? Unpacking Multimodal Error Analysis in Handwritten Math

📝 Summary:
ScratchMath introduces a benchmark for analyzing errors in students' handwritten math. It reveals that MLLMs lag significantly behind human experts in visual and logical reasoning, but proprietary models show potential for error explanation.

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.24961
• PDF: https://arxiv.org/pdf/2603.24961
• Project Page: https://bbsngg.github.io/ScratchMath/
• Github: https://github.com/ai-for-edu/ScratchMath

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Reaching Beyond the Mode: RL for Distributional Reasoning in Language Models

📝 Summary:
Language models typically give one answer, but many tasks have multiple solutions. This paper presents multi-answer RL, allowing LMs to generate multiple plausible answers with confidence in a single pass, improving diversity, accuracy, and computational efficiency.

🔹 Publication Date: Published on Mar 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.24844
• PDF: https://arxiv.org/pdf/2603.24844
• Project Page: https://multi-answer-rl.github.io/
• Github: https://github.com/ishapuri/multi_answer_rl

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AVO: Agentic Variation Operators for Autonomous Evolutionary Search

📝 Summary:
Agentic variation operators enable autonomous discovery of performance-critical micro-architectural optimizations for attention kernels, outperforming state-of-the-art implementations on advanced GPU ...

🔹 Publication Date: Published on Mar 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.24517
• PDF: https://arxiv.org/pdf/2603.24517

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
WAFT-Stereo: Warping-Alone Field Transforms for Stereo Matching

📝 Summary:
WAFT-Stereo achieves state-of-the-art stereo matching performance by replacing cost volumes with warping techniques, demonstrating superior efficiency and accuracy on major benchmarks.

🔹 Publication Date: Published on Mar 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.24836
• PDF: https://arxiv.org/pdf/2603.24836
• Github: https://github.com/princeton-vl/WAFT-Stereo

🔹 Models citing this paper:
https://huggingface.co/MemorySlices/WAFT-Stereo

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
QuantAgent: Price-Driven Multi-Agent LLMs for High-Frequency Trading

📝 Summary:
QuantAgent is a multi-agent LLM framework for high-frequency trading. It uses specialized agents for indicators, patterns, trends, and risk to make rapid decisions. It outperforms existing neural and rule-based systems in accuracy and returns.

🔹 Publication Date: Published on Sep 12, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.09995
• PDF: https://arxiv.org/pdf/2509.09995
• Project Page: https://Y-Research-SBU.github.io/QuantAgent/
• Github: https://github.com/Y-Research-SBU/QuantAgent

==================================

#LLM #MultiAgent #HighFrequencyTrading #FinTech #AlgorithmicTrading
🚀 Master Data Science & Programming!

Unlock your potential with this curated list of Telegram channels. Whether you need books, datasets, interview prep, or project ideas, we have the perfect resource for you. Join the community today!


🔰 Machine Learning with Python
Learn Machine Learning with hands-on Python tutorials, real-world code examples, and clear explanations for researchers and developers.
https://t.iss.one/CodeProgrammer

🔖 Machine Learning
Machine learning insights, practical tutorials, and clear explanations for beginners and aspiring data scientists. Follow the channel for models, algorithms, coding guides, and real-world ML applications.
https://t.iss.one/DataScienceM

🧠 Code With Python
This channel delivers clear, practical content for developers, covering Python, Django, data structures, and algorithms – perfect for learning, coding, and mastering key programming skills.
https://t.iss.one/DataScience4

🎯 PyData Careers | Quiz
Python Data Science jobs, interview tips, and career insights for aspiring professionals.
https://t.iss.one/DataScienceQ

💾 Kaggle Data Hub
Your go-to hub for Kaggle datasets – explore, analyze, and leverage data for Machine Learning and Data Science projects.
https://t.iss.one/datasets1

🧑‍🎓 Udemy Coupons | Courses
The first channel on Telegram to offer free Udemy coupons.
https://t.iss.one/DataScienceC

😀 ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.
https://t.iss.one/DataScienceT

💬 Data Science Chat
An active community group for discussing data challenges and networking with peers.
https://t.iss.one/DataScience9

🐍 Python Arab | بايثون عربي
The largest Arabic-speaking group for Python developers to share knowledge and help one another.
https://t.iss.one/PythonArab

🖊 Data Science Jupyter Notebooks
Explore the world of Data Science through Jupyter Notebooks—insights, tutorials, and tools to boost your data journey. Code, analyze, and visualize smarter with every post.
https://t.iss.one/DataScienceN

📺 Free Online Courses | Videos
Free online courses covering data science, machine learning, analytics, programming, and essential skills for learners.
https://t.iss.one/DataScienceV

📈 Data Analytics
Dive into the world of Data Analytics – uncover insights, explore trends, and master data-driven decision making.
https://t.iss.one/DataAnalyticsX

🎧 Learn Python Hub
Master Python with step-by-step courses – from basics to advanced projects and practical applications.
https://t.iss.one/Python53

⭐️ Research Papers
Professional Academic Writing & Simulation Services
https://t.iss.one/DataScienceY

━━━━━━━━━━━━━━━━━━
Admin: @HusseinSheikho
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling

📝 Summary:
ShotStream enables real-time interactive multi-shot video generation via a novel causal architecture. It uses dual-cache memory for visual consistency and two-stage distillation to reduce latency and error. This achieves high-quality, coherent videos at 16 FPS, paving the way for dynamic storytel...

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25746
• PDF: https://arxiv.org/pdf/2603.25746
• Project Page: https://luo0207.github.io/ShotStream/
• Github: https://github.com/KlingAIResearch/ShotStream

🔹 Models citing this paper:
https://huggingface.co/KlingTeam/ShotStream

==================================

#VideoGeneration #GenerativeAI #RealTimeAI #DeepLearning #AIStorytelling
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models

📝 Summary:
Hybrid Memory improves video world models by consistently tracking dynamic subjects during occlusion. It combines static background archiving with active dynamic subject tracking. This ensures motion continuity and outperforms existing methods in generation quality.

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25716
• PDF: https://arxiv.org/pdf/2603.25716
• Project Page: https://kj-chen666.github.io/Hybrid-Memory-in-Video-World-Models/
• Github: https://github.com/H-EmbodVis/HyDRA

🔹 Models citing this paper:
https://huggingface.co/H-EmbodVis/HyDRA

==================================

#VideoWorldModels #ComputerVision #AI #MachineLearning #GenerativeAI
Know3D: Prompting 3D Generation with Knowledge from Vision-Language Models

📝 Summary:
Know3D integrates vision-language models into 3D generation via latent hidden-state injection. This enables language-controlled synthesis of unseen back-views, transforming stochastic hallucination into a semantically guided process for 3D assets.

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22782
• PDF: https://arxiv.org/pdf/2603.22782
• Project Page: https://xishuxishu.github.io/Know3D.github.io/
• Github: https://github.com/xishuxishu/Know3D

Spaces citing this paper:
https://huggingface.co/spaces/xishushu/Know3D

==================================

#3DGeneration #VisionLanguageModels #GenerativeAI #DeepLearning #AIResearch
Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models

📝 Summary:
Full-duplex speech models need high-quality multi-speaker conversational data, which is scarce and difficult to process due to natural dialogue dynamics. This paper introduces Sommelier, a robust, scalable, open-source data processing pipeline to address this data bottleneck.

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25750
• PDF: https://arxiv.org/pdf/2603.25750
• Project Page: https://kyudan1.github.io/sommelier.github.io//

==================================

#SpeechAI #AudioProcessing #DataProcessing #OpenSource #NLP
Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills

📝 Summary:
Trace2Skill generates transferable LLM agent skills by analyzing diverse execution traces in parallel and consolidating them via inductive reasoning. This framework significantly improves performance, transfers across LLM scales, and generalizes to new settings without model updates.

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25158
• PDF: https://arxiv.org/pdf/2603.25158

==================================

#LLM #AgentAI #TransferLearning #MachineLearning #AIResearch
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference

📝 Summary:
PackForcing enables efficient, long-video generation via hierarchical KV-cache management and spatiotemporal compression, overcoming memory and consistency issues. It generates 2-minute coherent videos on a single GPU, demonstrating that short-video training suffices for high-quality long-video s...

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25730
• PDF: https://arxiv.org/pdf/2603.25730
• Github: https://github.com/ShandaAI/PackForcing

==================================

#VideoGeneration #GenerativeAI #DeepLearning #ModelEfficiency #LongContext