ML Research Hub

✨RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation

📝 Summary:
This work presents an automated rubric generation framework and RubricHub dataset for open-ended AI generation. RubricHub enables significant performance gains, achieving state-of-the-art results on HealthBench and surpassing GPT-5.

🔹 Publication Date: Published on Jan 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08430
• PDF: https://arxiv.org/pdf/2601.08430
• Project Page: https://huggingface.co/datasets/sojuL/RubricHub_v1
• Github: https://github.com/teqkilla/RubricHub

✨ Datasets citing this paper:
• https://huggingface.co/datasets/sojuL/RubricHub_v1

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #GenerativeAI #MachineLearning #NLP #Dataset

✨Unlocking Implicit Experience: Synthesizing Tool-Use Trajectories from Text

📝 Summary:
This paper introduces GEM, a text-based pipeline to synthesize multi-turn tool-use trajectories for LLMs from text corpora. It addresses data scarcity and reduces costs with a specialized Trajectory Synthesizer. GEM-32B significantly improves performance on multi-turn benchmarks, showing strong g...

🔹 Publication Date: Published on Jan 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10355
• PDF: https://arxiv.org/pdf/2601.10355

#LLM #AI #NLP #ToolUse #DataSynthesis

✨BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search

📝 Summary:
BAPO is a reinforcement learning framework for agentic search. It improves reliability by teaching agents to recognize the limits of their reasoning and to respond appropriately when the evidence is insufficient.

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11037
• PDF: https://arxiv.org/pdf/2601.11037
• Github: https://github.com/Liushiyu-0709/BAPO-Reliable-Search

#AI #DataScience #MachineLearning #HuggingFace #Research

✨ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection

📝 Summary:
Supervised fine-tuning with multiple references addresses overfitting to non-core expressions by masking low-probability tokens based on their semantic importance.
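
A rough illustration of the masking idea in the summary, not ProFit's actual implementation: the sketch below drops tokens whose model probability falls under a cutoff before averaging the negative log-likelihood. The function name masked_sft_loss, the threshold value, and the sample probabilities are all hypothetical, and the sketch uses probability alone as the selection signal.

```python
import math


def masked_sft_loss(token_probs, threshold=0.1):
    """Mean negative log-likelihood over tokens with probability >= threshold.

    Tokens below the (hypothetical) threshold are masked out and contribute
    nothing to the loss. Returns 0.0 when every token is masked.
    """
    kept = [p for p in token_probs if p >= threshold]
    if not kept:
        return 0.0
    return -sum(math.log(p) for p in kept) / len(kept)


# Made-up per-token probabilities for one reference answer.
probs = [0.9, 0.05, 0.6, 0.02]
loss = masked_sft_loss(probs, threshold=0.1)  # only 0.9 and 0.6 are kept
```

In a real training loop the same mask would be applied to the per-token cross-entropy tensor rather than to a Python list.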

🔹 Publication Date: Published on Jan 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09195
• PDF: https://arxiv.org/pdf/2601.09195
• Github: https://github.com/Utaotao/ProFit

#AI #DataScience #MachineLearning #HuggingFace #Research

✨Reasoning Models Generate Societies of Thought

📝 Summary:
Reasoning models demonstrate enhanced performance through multi-agent-like interactions that create diverse cognitive perspectives and improve problem-solving through structured social organization.

🔹 Publication Date: Published on Jan 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10825
• PDF: https://arxiv.org/pdf/2601.10825

#AI #DataScience #MachineLearning #HuggingFace #Research

✨PhysRVG: Physics-Aware Unified Reinforcement Learning for Video Generative Models

📝 Summary:
Existing video generation models lack physical realism, especially for rigid body collisions, treating physics rules as soft conditions. This paper introduces PhysRVG, a physics-aware reinforcement learning paradigm that strictly enforces physical collision rules directly in high-dimensional spac...

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11087
• PDF: https://arxiv.org/pdf/2601.11087

#VideoGeneration #PhysicsAI #ReinforcementLearning #GenerativeAI #ComputerVision

✨Building Production-Ready Probes For Gemini

📝 Summary:
Language model misuse probes struggle with long-context generalization in production. New architectures and diverse training improve robustness, demonstrating broad generalization and successful deployment in Gemini. Pairing probes with prompted classifiers also improves accuracy.

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11516
• PDF: https://arxiv.org/pdf/2601.11516

#LLM #AISafety #GeminiAI #AIResearch #MLOps

✨AstroReason-Bench: Evaluating Unified Agentic Planning across Heterogeneous Space Planning Problems

📝 Summary:
Recent advances in agentic Large Language Models (LLMs) have positioned them as generalist planners capable of reasoning and acting across diverse tasks. However, existing agent benchmarks largely foc...

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11354
• PDF: https://arxiv.org/pdf/2601.11354
• Github: https://github.com/Mtrya/astro-reason

#AI #DataScience #MachineLearning #HuggingFace #Research

✨Monolith: Real Time Recommendation System With Collisionless Embedding Table

📝 Summary:
Monolith is a real-time recommendation system designed for online training. It features a collisionless embedding table with memory optimizations and a fault-tolerant architecture, enabling real-time learning by overcoming limitations of general DL frameworks.
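
To make "collisionless" concrete, here is a minimal sketch contrasting a fixed-size table indexed by ID modulo the slot count (where distinct IDs can share a row) with a table keyed by the raw ID (where they cannot). This is an illustration under stated assumptions, not Monolith's implementation: both class names are hypothetical, and a plain Python dict stands in for whatever hash structure the real system uses.

```python
import random


class ModuloEmbeddingTable:
    """Fixed-size table: distinct IDs collide when they map to the same slot."""

    def __init__(self, num_slots, dim, seed=0):
        rng = random.Random(seed)
        self.num_slots = num_slots
        self.rows = [
            [rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_slots)
        ]

    def lookup(self, feature_id):
        return self.rows[feature_id % self.num_slots]


class CollisionlessEmbeddingTable:
    """Table keyed by the raw ID: every ID gets its own row, created on first use."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self.rng = random.Random(seed)
        self.rows = {}

    def lookup(self, feature_id):
        if feature_id not in self.rows:
            self.rows[feature_id] = [
                self.rng.uniform(-0.1, 0.1) for _ in range(self.dim)
            ]
        return self.rows[feature_id]


hashed = ModuloEmbeddingTable(num_slots=4, dim=2)
exact = CollisionlessEmbeddingTable(dim=2)

# IDs 1 and 5 share slot 1 in the modulo table but get separate rows in the dict.
shares_row_hashed = hashed.lookup(1) is hashed.lookup(5)
shares_row_exact = exact.lookup(1) is exact.lookup(5)
```

The dict-backed table grows with the number of distinct IDs, which is presumably why the summary pairs the collisionless design with memory optimizations.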

🔹 Publication Date: Published on Sep 16, 2022

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2209.07663
• PDF: https://arxiv.org/pdf/2209.07663
• Github: https://github.com/bytedance/monolith

#RecommendationSystems #DeepLearning #MachineLearning #RealTimeAI #DataScience

✨Agent Lightning: Train ANY AI Agents with Reinforcement Learning

📝 Summary:
Agent Lightning is a flexible RL framework for training LLMs in any AI agent, uniquely decoupling execution from training. It uses a hierarchical RL algorithm to handle complex interactions, enabling seamless integration with existing agents and showing stable improvements.

🔹 Publication Date: Published on Aug 5, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.03680
• PDF: https://arxiv.org/pdf/2508.03680
• Project Page: https://www.microsoft.com/en-us/research/project/agent-lightning/
• Github: https://github.com/microsoft/agent-lightning

#AI #ReinforcementLearning #LLMs #AIAgents #MachineLearning

✨SkyReels-V2: Infinite-length Film Generative Model

📝 Summary:
SkyReels-V2 is an infinite-length film generative model that overcomes video generation limits in duration and motion. It synergizes MLLMs, multi-stage pretraining, reinforcement learning, and a diffusion forcing framework. This enables high-quality, long-form video synthesis with realistic motion.

🔹 Publication Date: Published on Apr 17, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.13074
• PDF: https://arxiv.org/pdf/2504.13074
• Github: https://github.com/skyworkai/skyreels-v2

🔹 Models citing this paper:
• https://huggingface.co/Skywork/SkyReels-V2-I2V-14B-540P
• https://huggingface.co/Skywork/SkyCaptioner-V1
• https://huggingface.co/Skywork/SkyReels-V2-I2V-1.3B-540P

✨ Spaces citing this paper:
• https://huggingface.co/spaces/fffiloni/SkyReels-V2
• https://huggingface.co/spaces/svjack/SkyReels-V2
• https://huggingface.co/spaces/Dudu0043/SkyReels-V2

#VideoGeneration #GenerativeAI #DiffusionModels #MachineLearning #AIResearch

✨FrankenMotion: Part-level Human Motion Generation and Composition

📝 Summary:
FrankenMotion introduces a new dataset with atomic, part-level motion annotations using LLMs. This enables a diffusion framework to generate human motion with unprecedented fine-grained spatial and temporal control over body parts, outperforming prior methods.

🔹 Publication Date: Published on Jan 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10909
• PDF: https://arxiv.org/pdf/2601.10909
• Project Page: https://coral79.github.io/frankenmotion/
• Github: https://github.com/Coral79/FrankenMotion-Code

#MotionGeneration #AI #LLMs #DiffusionModels #ComputerVision

✨When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs

📝 Summary:
Personalized LLMs can generate false information aligned with user history instead of facts. A new method called FPPS mitigates these personalization-induced factual distortions. It substantially improves factual accuracy while maintaining personalized responses.

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11000
• PDF: https://arxiv.org/pdf/2601.11000

#LLM #AI #Personalization #Hallucinations #NLP

✨ACoT-VLA: Action Chain-of-Thought for Vision-Language-Action Models

📝 Summary:
This paper introduces ACoT-VLA, a new paradigm for Vision-Language-Action models that enhances reasoning by formulating it as a structured sequence of coarse action intents. It uses explicit and implicit action reasoners to guide the final policy, significantly improving robot manipulation perfor...

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11404
• PDF: https://arxiv.org/pdf/2601.11404

#AI #DataScience #MachineLearning #HuggingFace #Research

✨PhyRPR: Training-Free Physics-Constrained Video Generation

📝 Summary:
PhyRPR introduces a three-stage Reason-Plan-Refine pipeline for video generation. It decouples physical understanding from visual synthesis to address problems with physical plausibility. This improves motion controllability and allows explicit physical control during generation.

🔹 Publication Date: Published on Jan 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09255
• PDF: https://arxiv.org/pdf/2601.09255

#VideoGeneration #PhysicsAI #AIResearch #ComputerVision #DeepLearning

✨The Poisoned Apple Effect: Strategic Manipulation of Mediated Markets via Technology Expansion of AI Agents

📝 Summary:
AI agent integration alters market dynamics. The Poisoned Apple effect describes how an agent can strategically release a technology it does not use, solely to manipulate market regulation in its favor and harm opponents. This calls for dynamic market design.

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11496
• PDF: https://arxiv.org/pdf/2601.11496

#AIAgents #MarketDynamics #GameTheory #TechnologyPolicy #DigitalEconomy

✨Language of Thought Shapes Output Diversity in Large Language Models

📝 Summary:
Controlling the language of thought in large language models increases output diversity. Switching the internal thinking language from English to non-English languages consistently boosts diversity, with mixed-language sampling yielding superior results. This approach expands LLMs diversity ceili...

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11227
• PDF: https://arxiv.org/pdf/2601.11227
• Github: https://github.com/iNLP-Lab/Multilingual-LoT-Diversity

#LLM #AI #NLP #MultilingualAI #OutputDiversity

✨More Images, More Problems? A Controlled Analysis of VLM Failure Modes

📝 Summary:
Large Vision Language Models struggle with multi-image understanding. A new benchmark, MIMIC, reveals these failures. The authors propose procedural data generation and attention masking, which significantly improve performance and cross-image aggregation.

🔹 Publication Date: Published on Jan 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07812
• PDF: https://arxiv.org/pdf/2601.07812
• Github: https://github.com/anurag-198/MIMIC

#AI #DataScience #MachineLearning #HuggingFace #Research

✨ShapeR: Robust Conditional 3D Shape Generation from Casual Captures

📝 Summary:
ShapeR generates high-fidelity 3D shapes from casual image sequences by combining SLAM, 3D detection, and vision-language models to condition a rectified flow transformer. It is robust to real-world conditions and significantly outperforms existing methods.

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11514
• PDF: https://arxiv.org/pdf/2601.11514
• Project Page: https://facebookresearch.github.io/ShapeR
• Github: https://github.com/facebookresearch/ShapeR

🔹 Models citing this paper:
• https://huggingface.co/facebook/ShapeR

✨ Datasets citing this paper:
• https://huggingface.co/datasets/facebook/ShapeR-Evaluation

#3DGeneration #ComputerVision #DeepLearning #GenerativeAI #3DReconstruction

✨PersonalAlign: Hierarchical Implicit Intent Alignment for Personalized GUI Agent with Long-Term User-Centric Records

📝 Summary:
PersonalAlign is a new framework that lets GUI agents align with implicit user intents using hierarchical memory and long-term user records. Its HIM-Agent improves execution and proactive performance by 15.7% and 7.3%, respectively.

🔹 Publication Date: Published on Jan 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09636
• PDF: https://arxiv.org/pdf/2601.09636
• Project Page: https://jiutian-vl.github.io/PersonalAlign-page/

#PersonalAlign #GUIAgents #AI #Personalization #IntelligentAgents
