ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
M^4olGen: Multi-Agent, Multi-Stage Molecular Generation under Precise Multi-Property Constraints

📝 Summary:
M^4olGen is a multi-agent, multi-stage framework for precise molecular generation under multiple physicochemical constraints. It combines fragment-level, retrieval-augmented reasoning with RL-based optimization, and is reported to outperform LLM-based and graph-based methods.

🔹 Publication Date: Published on Jan 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10131
• PDF: https://arxiv.org/pdf/2601.10131

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
dots.ocr: Multilingual Document Layout Parsing in a Single Vision-Language Model

📝 Summary:
dots.ocr is a unified Vision-Language Model that jointly learns document layout parsing tasks, overcoming limitations of multi-stage pipelines. It achieves state-of-the-art performance on OmniDocBench and sets a new baseline on the challenging multilingual XDocParse benchmark.

🔹 Publication Date: Published on Dec 2, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.02498
• PDF: https://arxiv.org/pdf/2512.02498
• Github: https://github.com/rednote-hilab/dots.ocr

#VisionLanguageModel #DocumentParsing #MultilingualAI #AIResearch #DeepLearning
Your Group-Relative Advantage Is Biased

📝 Summary:
Group-based Reinforcement Learning from Verifier Rewards (RLVR) relies on a biased advantage estimator: it underestimates advantages for hard prompts and overestimates them for easy ones. This paper proposes History-Aware Adaptive Difficulty Weighting (HA-DW) to correct this bias, improving performance on reasoning tasks.
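
For readers unfamiliar with the estimator under discussion, here is a minimal sketch of the commonly used group-relative advantage (the GRPO-style form), in which each sampled response's reward is standardized against the other responses to the same prompt. That this is the exact estimator the paper analyzes is an assumption; the reward lists and the `eps` constant are hypothetical, and the HA-DW correction itself is not reproduced because the summary does not describe it.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Standardize each reward against its own sampling group.

    rewards: verifier rewards for several responses to ONE prompt.
    eps: small constant (an assumption) to avoid division by zero.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical verifier rewards (1 = correct, 0 = incorrect), 8 samples each.
hard_prompt = [0, 0, 0, 0, 0, 0, 0, 1]  # 1 of 8 responses correct
easy_prompt = [1, 1, 1, 1, 1, 1, 1, 0]  # 7 of 8 responses correct

hard_adv = group_relative_advantages(hard_prompt)
easy_adv = group_relative_advantages(easy_prompt)

# A group in which every response gets the same reward yields all-zero
# advantages, so that prompt contributes no learning signal.
uniform_adv = group_relative_advantages([1, 1, 1, 1])
```

Because the group mean and standard deviation are computed from the same handful of samples being scored, the estimate depends on group size and on how often the policy succeeds on that prompt, which is the setting the summary's bias claim refers to.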

🔹 Publication Date: Published on Jan 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08521
• PDF: https://arxiv.org/pdf/2601.08521

#ReinforcementLearning #MachineLearning #AIResearch #BiasCorrection #ReasoningTasks
RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation

📝 Summary:
This work presents an automated rubric generation framework and RubricHub dataset for open-ended AI generation. RubricHub enables significant performance gains, achieving state-of-the-art results on HealthBench and surpassing GPT-5.

🔹 Publication Date: Published on Jan 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08430
• PDF: https://arxiv.org/pdf/2601.08430
• Project Page: https://huggingface.co/datasets/sojuL/RubricHub_v1
• Github: https://github.com/teqkilla/RubricHub

🔹 Datasets citing this paper:
https://huggingface.co/datasets/sojuL/RubricHub_v1

#AI #GenerativeAI #MachineLearning #NLP #Dataset
Unlocking Implicit Experience: Synthesizing Tool-Use Trajectories from Text

📝 Summary:
This paper introduces GEM, a text-based pipeline to synthesize multi-turn tool-use trajectories for LLMs from text corpora. It addresses data scarcity and reduces costs with a specialized Trajectory Synthesizer. GEM-32B significantly improves performance on multi-turn benchmarks, showing strong g...

🔹 Publication Date: Published on Jan 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10355
• PDF: https://arxiv.org/pdf/2601.10355

#LLM #AI #NLP #ToolUse #DataSynthesis
BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search

📝 Summary:
BAPO is a reinforcement learning framework for agentic search that improves reliability by teaching agents to recognize the limits of their reasoning and respond appropriately when evidence is insufficient.

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11037
• PDF: https://arxiv.org/pdf/2601.11037
• Github: https://github.com/Liushiyu-0709/BAPO-Reliable-Search

#AI #DataScience #MachineLearning #HuggingFace #Research
ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection

📝 Summary:
Supervised fine-tuning with multiple references addresses overfitting to non-core expressions by masking low-probability tokens based on their semantic importance.
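
As a rough illustration of what masking low-probability tokens in an SFT loss can look like, the sketch below averages the negative log-likelihood only over reference tokens whose model probability clears a cutoff. The hard cutoff rule, the `threshold` value, and the example probabilities are all assumptions for illustration; the summary does not give ProFit's actual selection rule.

```python
import math


def masked_sft_loss(token_probs, threshold=0.1):
    """Mean negative log-likelihood over tokens with probability >= threshold.

    token_probs: probability the model assigns to each reference token.
    threshold: hypothetical cutoff; tokens below it are excluded from the loss.
    """
    kept = [p for p in token_probs if p >= threshold]
    if not kept:
        return 0.0  # nothing left to train on for this sequence
    return -sum(math.log(p) for p in kept) / len(kept)


# Hypothetical per-token probabilities for one reference answer.
probs = [0.9, 0.8, 0.02, 0.6, 0.05]

full_loss = masked_sft_loss(probs, threshold=0.0)    # no masking
masked_loss = masked_sft_loss(probs, threshold=0.1)  # drops 0.02 and 0.05
```

In this toy case the two masked tokens dominate the unmasked loss, so removing them lowers the average and shifts the training signal toward the tokens the model already rates as plausible.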

🔹 Publication Date: Published on Jan 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09195
• PDF: https://arxiv.org/pdf/2601.09195
• Github: https://github.com/Utaotao/ProFit

#AI #DataScience #MachineLearning #HuggingFace #Research
Reasoning Models Generate Societies of Thought

📝 Summary:
Reasoning models demonstrate enhanced performance through multi-agent-like interactions that create diverse cognitive perspectives and improve problem-solving through structured social organization.

🔹 Publication Date: Published on Jan 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10825
• PDF: https://arxiv.org/pdf/2601.10825

#AI #DataScience #MachineLearning #HuggingFace #Research
PhysRVG: Physics-Aware Unified Reinforcement Learning for Video Generative Models

📝 Summary:
Existing video generation models lack physical realism, especially for rigid body collisions, treating physics rules as soft conditions. This paper introduces PhysRVG, a physics-aware reinforcement learning paradigm that strictly enforces physical collision rules directly in high-dimensional spac...

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11087
• PDF: https://arxiv.org/pdf/2601.11087

#VideoGeneration #PhysicsAI #ReinforcementLearning #GenerativeAI #ComputerVision
Building Production-Ready Probes For Gemini

📝 Summary:
Language model misuse probes struggle with long-context generalization in production. New architectures and diverse training improve robustness, demonstrating broad generalization and successful deployment in Gemini. Pairing probes with prompted classifiers also improves accuracy.
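
One plausible way to pair a cheap probe with a prompted classifier is a cascade: trust the probe when it is confident and defer to the slower classifier otherwise. The sketch below assumes that design; the summary does not say how the paper combines the two. The logistic-regression probe, the `low`/`high` thresholds, and the stand-in `slow_classifier` callable are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def probe_score(activation, weights, bias):
    """Linear probe on a model activation vector: sigmoid(w . x + b)."""
    z = sum(w * x for w, x in zip(weights, activation)) + bias
    return sigmoid(z)


def cascade(activation, weights, bias, slow_classifier, low=0.2, high=0.8):
    """Return (flagged, decided_by).

    The probe decides when its score is outside (low, high); otherwise the
    call is deferred to `slow_classifier`, a stand-in for a prompted LLM.
    """
    p = probe_score(activation, weights, bias)
    if p >= high:
        return True, "probe"
    if p <= low:
        return False, "probe"
    return slow_classifier(), "classifier"


# Toy one-dimensional activations with a hypothetical probe (w = 1, b = 0).
confident_hit = cascade([5.0], [1.0], 0.0, lambda: False)
confident_miss = cascade([-5.0], [1.0], 0.0, lambda: True)
uncertain = cascade([0.0], [1.0], 0.0, lambda: True)
```

The appeal of this arrangement is cost: the expensive classifier runs only on the inputs where the probe's score is ambiguous.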

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11516
• PDF: https://arxiv.org/pdf/2601.11516

#LLM #AISafety #GeminiAI #AIResearch #MLOps
AstroReason-Bench: Evaluating Unified Agentic Planning across Heterogeneous Space Planning Problems

📝 Summary:
Recent advances in agentic Large Language Models (LLMs) have positioned them as generalist planners capable of reasoning and acting across diverse tasks. However, existing agent benchmarks largely foc...

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11354
• PDF: https://arxiv.org/pdf/2601.11354
• Github: https://github.com/Mtrya/astro-reason

#AI #DataScience #MachineLearning #HuggingFace #Research
Monolith: Real Time Recommendation System With Collisionless Embedding Table

📝 Summary:
Monolith is a real-time recommendation system designed for online training. It features a collisionless embedding table with memory optimizations and a fault-tolerant architecture, enabling real-time learning by overcoming limitations of general DL frameworks.
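
To make "collisionless" concrete, the sketch below contrasts a fixed-size table that folds feature IDs into a limited number of rows (so distinct IDs can share an embedding) with a keyed table that gives every ID its own row. It is an illustration only: a Python dict stands in for whatever hash structure Monolith uses, the class names and initialization range are hypothetical, and the memory optimizations mentioned in the summary are not modeled.

```python
import random


class HashedEmbeddingTable:
    """Fixed-size table: IDs are folded into num_rows slots, so they can collide."""

    def __init__(self, num_rows, dim):
        self.rows = [[0.0] * dim for _ in range(num_rows)]

    def lookup(self, feature_id):
        return self.rows[feature_id % len(self.rows)]


class CollisionlessEmbeddingTable:
    """Keyed table: each feature ID gets its own row, created on first use."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self.rows = {}
        self.rng = random.Random(seed)

    def lookup(self, feature_id):
        if feature_id not in self.rows:
            # Hypothetical small random initialization for a new ID.
            self.rows[feature_id] = [
                self.rng.uniform(-0.01, 0.01) for _ in range(self.dim)
            ]
        return self.rows[feature_id]


hashed = HashedEmbeddingTable(num_rows=4, dim=2)
shared = hashed.lookup(1) is hashed.lookup(5)  # 1 % 4 == 5 % 4, same row

keyed = CollisionlessEmbeddingTable(dim=2)
separate = keyed.lookup(1) is not keyed.lookup(5)  # distinct rows per ID
```

In the hashed table a gradient update for ID 1 also changes the embedding served for ID 5; the keyed table avoids that at the cost of a table that grows with the number of distinct IDs.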

🔹 Publication Date: Published on Sep 16, 2022

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2209.07663
• PDF: https://arxiv.org/pdf/2209.07663
• Github: https://github.com/bytedance/monolith

#RecommendationSystems #DeepLearning #MachineLearning #RealTimeAI #DataScience
Agent Lightning: Train ANY AI Agents with Reinforcement Learning

📝 Summary:
Agent Lightning is a flexible RL framework for training LLMs in any AI agent, uniquely decoupling execution from training. It uses a hierarchical RL algorithm to handle complex interactions, enabling seamless integration with existing agents and showing stable improvements.

🔹 Publication Date: Published on Aug 5, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.03680
• PDF: https://arxiv.org/pdf/2508.03680
• Project Page: https://www.microsoft.com/en-us/research/project/agent-lightning/
• Github: https://github.com/microsoft/agent-lightning

#AI #ReinforcementLearning #LLMs #AIAgents #MachineLearning
SkyReels-V2: Infinite-length Film Generative Model

📝 Summary:
SkyReels-V2 is an infinite-length film generative model that overcomes video generation limits in duration and motion. It synergizes MLLMs, multi-stage pretraining, reinforcement learning, and a diffusion forcing framework. This enables high-quality, long-form video synthesis with realistic motion.

🔹 Publication Date: Published on Apr 17, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.13074
• PDF: https://arxiv.org/pdf/2504.13074
• Github: https://github.com/skyworkai/skyreels-v2

🔹 Models citing this paper:
https://huggingface.co/Skywork/SkyReels-V2-I2V-14B-540P
https://huggingface.co/Skywork/SkyCaptioner-V1
https://huggingface.co/Skywork/SkyReels-V2-I2V-1.3B-540P

🔹 Spaces citing this paper:
https://huggingface.co/spaces/fffiloni/SkyReels-V2
https://huggingface.co/spaces/svjack/SkyReels-V2
https://huggingface.co/spaces/Dudu0043/SkyReels-V2

#VideoGeneration #GenerativeAI #DiffusionModels #MachineLearning #AIResearch
FrankenMotion: Part-level Human Motion Generation and Composition

📝 Summary:
FrankenMotion introduces a new dataset with atomic, part-level motion annotations using LLMs. This enables a diffusion framework to generate human motion with unprecedented fine-grained spatial and temporal control over body parts, outperforming prior methods.

🔹 Publication Date: Published on Jan 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10909
• PDF: https://arxiv.org/pdf/2601.10909
• Project Page: https://coral79.github.io/frankenmotion/
• Github: https://github.com/Coral79/FrankenMotion-Code

#MotionGeneration #AI #LLMs #DiffusionModels #ComputerVision
When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs

📝 Summary:
Personalized LLMs can generate false information aligned with user history instead of facts. A new method called FPPS mitigates these personalization-induced factual distortions. It substantially improves factual accuracy while maintaining personalized responses.

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11000
• PDF: https://arxiv.org/pdf/2601.11000

#LLM #AI #Personalization #Hallucinations #NLP
ACoT-VLA: Action Chain-of-Thought for Vision-Language-Action Models

📝 Summary:
This paper introduces ACoT-VLA, a new paradigm for Vision-Language-Action models that enhances reasoning by formulating it as a structured sequence of coarse action intents. It uses explicit and implicit action reasoners to guide the final policy, significantly improving robot manipulation perfor...

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11404
• PDF: https://arxiv.org/pdf/2601.11404

#AI #DataScience #MachineLearning #HuggingFace #Research
PhyRPR: Training-Free Physics-Constrained Video Generation

📝 Summary:
PhyRPR introduces a three-stage pipeline (Reason-Plan-Refine) for video generation. It decouples physical understanding from visual synthesis to address physically implausible outputs. This improves motion controllability and allows explicit physical control during generation.

🔹 Publication Date: Published on Jan 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09255
• PDF: https://arxiv.org/pdf/2601.09255

#VideoGeneration #PhysicsAI #AIResearch #ComputerVision #DeepLearning
The Poisoned Apple Effect: Strategic Manipulation of Mediated Markets via Technology Expansion of AI Agents

📝 Summary:
AI agent integration alters market dynamics. The Poisoned Apple effect describes how an agent can strategically release a technology it does not use, solely to manipulate market regulation in its favor and harm opponents. This calls for dynamic market design.

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11496
• PDF: https://arxiv.org/pdf/2601.11496

#AIAgents #MarketDynamics #GameTheory #TechnologyPolicy #DigitalEconomy