ML Research Hub – Telegram

ML Research Hub

32.6K subscribers

3.39K photos

133 videos

23 files

3.62K links

Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho

ML Research Hub

✨ReCode: Unify Plan and Action for Universal Granularity Control

📝 Summary:
ReCode unifies planning and action in LLM agents via recursive code generation. It treats plans as abstract functions recursively decomposed into primitive actions, enabling dynamic decision granularity. This significantly improves performance and data efficiency.
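A minimal sketch of the recursive decomposition described in the summary, not the authors' implementation. The `DECOMPOSITIONS` table is a hypothetical stand-in for the LLM that would generate sub-steps, and the step and action names are invented for illustration.

```python
from typing import Dict, List

# Hypothetical primitive actions the environment can execute directly.
PRIMITIVES = {"goto", "pick", "place"}

# Hypothetical stand-in for the LLM: maps an abstract step to its sub-steps.
DECOMPOSITIONS: Dict[str, List[str]] = {
    "tidy_room": ["move_cup"],
    "move_cup": ["goto cup", "pick cup", "goto shelf", "place cup"],
}


def expand(step: str, depth: int = 0, max_depth: int = 5) -> List[str]:
    """Recursively expand a step until only primitive actions remain."""
    name = step.split()[0]
    if name in PRIMITIVES:
        # Already executable: stop decomposing at this granularity.
        return [step]
    if depth >= max_depth or name not in DECOMPOSITIONS:
        raise ValueError(f"cannot decompose {step!r}")
    actions: List[str] = []
    for sub_step in DECOMPOSITIONS[name]:
        actions.extend(expand(sub_step, depth + 1, max_depth))
    return actions
```

Here `expand("tidy_room")` returns the four primitive steps in order. The depth limit is an assumption added to keep the recursion bounded.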

🔹 Publication Date: Published on Oct 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.23564
• PDF: https://arxiv.org/pdf/2510.23564
• Github: https://github.com/FoundationAgents/ReCode

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLMAgents #AI #CodeGeneration #Planning #GranularityControl

68 views · 06:01

ML Research Hub

✨LongCat-Video Technical Report

📝 Summary:
LongCat-Video is a 13.6B-parameter Diffusion Transformer model for efficient, high-quality long video generation. It uses a single unified architecture across tasks such as Text-to-Video, and relies on coarse-to-fine generation for efficiency. The authors present it as a step toward world models.

🔹 Publication Date: Published on Oct 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.22200
• PDF: https://arxiv.org/pdf/2510.22200
• Github: https://github.com/meituan-longcat/LongCat-Video

🔹 Models citing this paper:
• https://huggingface.co/meituan-longcat/LongCat-Video

✨ Spaces citing this paper:
• https://huggingface.co/spaces/multimodalart/LongCat-Video
• https://huggingface.co/spaces/rahul7star/LongCat-Video
• https://huggingface.co/spaces/armaishere/meituan-longcat-LongCat-Video

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#VideoGeneration #DiffusionModels #Transformers #AI #TextToVideo

56 views · 06:01

ML Research Hub

✨RAG-Anything: All-in-One RAG Framework

📝 Summary:
RAG-Anything is a unified framework extending RAG to all modalities, not just text. It integrates cross-modal relationships and semantic matching via dual-graph construction and hybrid retrieval. This significantly improves performance on complex multimodal benchmarks.

🔹 Publication Date: Published on Oct 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.12323
• PDF: https://arxiv.org/pdf/2510.12323
• Github: https://github.com/HKUDS/RAG-Anything

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#RAG #MultimodalAI #MachineLearning #InformationRetrieval #GraphAI

69 views · 06:01

ML Research Hub

✨PokeeResearch: Effective Deep Research via Reinforcement Learning from AI Feedback and Robust Reasoning Scaffold

📝 Summary:
PokeeResearch-7B is a 7B-parameter deep research agent that achieves state-of-the-art results using Reinforcement Learning from AI Feedback (RLAIF). Its chain-of-thought reasoning scaffold improves robustness and alignment, yielding an efficient, resilient, research-grade agent.

🔹 Publication Date: Published on Oct 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.15862
• PDF: https://arxiv.org/pdf/2510.15862
• Github: https://github.com/Pokee-AI/PokeeResearchOSS

🔹 Models citing this paper:
• https://huggingface.co/PokeeAI/pokee_research_7b
• https://huggingface.co/Mungert/pokee_research_7b-GGUF

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #ReinforcementLearning #LLM #MachineLearning #AIResearch

75 views · 06:02

ML Research Hub

✨FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning

📝 Summary:
FAPO improves LLM reasoning by penalizing flawed-positive rollouts, which are unreliable reasoning patterns. This secures early gains while shifting optimization toward reliable reasoning later, enhancing correctness and stability.
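A minimal sketch of the penalty idea in the summary, not the paper's formula. It assumes a "flawed-positive" rollout is one whose final answer is correct but whose reasoning has been flagged as flawed. The ±1 base reward and the `penalty` parameter are hypothetical choices for illustration.

```python
def shaped_reward(correct: bool, flawed: bool, penalty: float = 1.0) -> float:
    """Return a rollout reward that discounts flawed-positive rollouts."""
    # Assumed base reward: +1 for a correct final answer, -1 otherwise.
    base = 1.0 if correct else -1.0
    if correct and flawed:
        # Flawed-positive: still better than a wrong answer, but penalized.
        return base - penalty
    return base
```

With the default `penalty`, a clean correct rollout scores 1.0, a flawed-positive one 0.0, and an incorrect one -1.0.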

🔹 Publication Date: Published on Oct 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.22543
• PDF: https://arxiv.org/pdf/2510.22543
• Project Page: https://fapo-rl.github.io/

🔹 Models citing this paper:
• https://huggingface.co/dyyyyyyyy/FAPO-32B
• https://huggingface.co/dyyyyyyyy/FAPO-GenRM-4B

✨ Datasets citing this paper:
• https://huggingface.co/datasets/dyyyyyyyy/FAPO-Critic
• https://huggingface.co/datasets/dyyyyyyyy/FAPO-Reasoning-Dataset

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLM #AI #ReinforcementLearning #DeepLearning #Reasoning

72 views · 06:02

ML Research Hub

✨The Unreasonable Effectiveness of Scaling Agents for Computer Use

📝 Summary:
Behavior Best-of-N (bBoN) improves the reliability of computer-use agents (CUAs) by generating multiple rollouts and selecting among them using behavior narratives. The method achieves state-of-the-art performance on OSWorld and generalizes across operating systems, demonstrating effective CUA scaling.
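A minimal sketch of best-of-N selection over narrated rollouts, under stated assumptions rather than as the paper's implementation. The `narrate` and `judge` callables are hypothetical placeholders for whatever produces a behavior narrative and whatever compares narratives.

```python
from typing import Callable, List, Sequence, TypeVar

R = TypeVar("R")


def behavior_best_of_n(
    rollouts: Sequence[R],
    narrate: Callable[[R], str],
    judge: Callable[[List[str]], int],
) -> R:
    """Pick one rollout by comparing short narratives of what each one did."""
    if not rollouts:
        raise ValueError("need at least one rollout")
    # Summarize every rollout as a narrative, then let the judge pick an index.
    narratives = [narrate(rollout) for rollout in rollouts]
    best_index = judge(narratives)
    return rollouts[best_index]
```

The judge sees only the narratives, not the raw rollouts, which is the design point the summary highlights.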

🔹 Publication Date: Published on Oct 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.02250
• PDF: https://arxiv.org/pdf/2510.02250
• Project Page: https://www.simular.ai/articles/agent-s3
• Github: https://github.com/simular-ai/Agent-S

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AIAgents #AIScaling #OperatingSystems #BehavioralAI #AIResearch

75 views · 06:02

ML Research Hub

✨Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents

📝 Summary:
Agent S2 is a compositional framework for computer use agents that delegates tasks across generalist and specialist models. Using Mixture-of-Grounding and Proactive Hierarchical Planning, it achieves state-of-the-art performance on diverse benchmarks and operating systems.

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.00906
• PDF: https://arxiv.org/pdf/2504.00906
• Project Page: https://www.simular.ai/articles/agent-s2-technical-review
• Github: https://github.com/simular-ai/Agent-S

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AIAgents #MachineLearning #AI #GeneralistSpecialist #AutonomousSystems

❤1

82 views · 06:03

ML Research Hub

✨Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing

📝 Summary:
Pico-Banana-400K is a new 400K-image dataset for text-guided image editing, built from real photos. It offers diverse edit types, high quality, and specialized subsets for multi-turn, preference-based, and long-short instruction editing, enabling comprehensive model development.

🔹 Publication Date: Published on Oct 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.19808
• PDF: https://arxiv.org/pdf/2510.19808
• Github: https://github.com/apple/pico-banana-400k

🔹 Models citing this paper:
• https://huggingface.co/eigen-ai-labs/eigen-banana-qwen-image-edit

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#ImageEditing #TextGuidedEditing #Dataset #ComputerVision #AI

63 views · 06:03

ML Research Hub

✨MIRIX: Multi-Agent Memory System for LLM-Based Agents

📝 Summary:
MIRIX is a modular multi-agent memory system for LLM-based agents that integrates diverse memory types and a dynamic framework. It significantly enhances memory capabilities for multimodal and long-form conversations. MIRIX achieves superior performance on challenging benchmarks, outperforming ex...

🔹 Publication Date: Published on Jul 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.07957
• PDF: https://arxiv.org/pdf/2507.07957
• Project Page: https://mirix.io/
• Github: https://github.com/Mirix-AI/MIRIX

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLM #MultiAgentSystems #AISystems #MemorySystems #AI

63 views · 06:03

ML Research Hub

✨Cache-to-Cache: Direct Semantic Communication Between Large Language Models

📝 Summary:
C2C enables direct semantic communication between LLMs by projecting and fusing their KV-caches, overcoming text-based communication limits. This method preserves rich semantics, improving accuracy by 3-5% and achieving a 2x speedup over traditional text communication.
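A toy sketch of "project, then fuse" on single cache vectors, offered only to make the idea concrete. The projection matrix `weight` and the scalar `gate` are hypothetical fixed values standing in for whatever the method learns, and real KV-caches are per-layer tensors rather than flat lists.

```python
from typing import List, Sequence


def project(vec: Sequence[float], weight: Sequence[Sequence[float]]) -> List[float]:
    """Map a sharer-model cache vector into the receiver's dimension."""
    # Plain matrix-vector product: one output value per row of `weight`.
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def fuse(receiver: Sequence[float], projected: Sequence[float], gate: float) -> List[float]:
    """Blend the receiver's own cache vector with the projected one."""
    # gate = 0 keeps the receiver's cache; gate = 1 replaces it entirely.
    return [(1.0 - gate) * r + gate * p for r, p in zip(receiver, projected)]
```

The point of the sketch is that no text is generated or re-read between the two models; only cache values are transformed and mixed.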

🔹 Publication Date: Published on Oct 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.03215
• PDF: https://arxiv.org/pdf/2510.03215
• Project Page: https://fuvty.github.io/C2C_Project_Page/
• Github: https://github.com/thu-nics/C2C

🔹 Models citing this paper:
• https://huggingface.co/nics-efc/C2C_Fuser

✨ Spaces citing this paper:
• https://huggingface.co/spaces/fuvty/C2C_demo
• https://huggingface.co/spaces/nics-efc/C2C_demo

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLM #SemanticCommunication #AI #DeepLearning #NLP

82 views · 06:03

ML Research Hub

✨Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery

📝 Summary:
Skyfall-GS synthesizes large-scale, explorable 3D urban scenes by combining satellite imagery for geometry and diffusion models for realistic textures. This framework offers improved cross-view consistent geometry and photorealistic appearances without needing costly 3D annotations.

🔹 Publication Date: Published on Oct 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.15869
• PDF: https://arxiv.org/pdf/2510.15869
• Project Page: https://skyfall-gs.jayinnn.dev/
• Github: https://github.com/jayin92/skyfall-gs

🔹 Models citing this paper:
• https://huggingface.co/jayinnn/Skyfall-GS-ply

✨ Datasets citing this paper:
• https://huggingface.co/datasets/jayinnn/Skyfall-GS-eval
• https://huggingface.co/datasets/jayinnn/Skyfall-GS-datasets

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#3DReconstruction #ComputerVision #SatelliteImagery #DiffusionModels #UrbanModeling

73 views · 06:04

ML Research Hub

✨Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents

📝 Summary:
Easy Dataset is a framework that synthesizes LLM fine-tuning data from unstructured documents using a GUI and LLMs. It generates domain-specific question-answer pairs with human oversight. This improves LLM performance in specific domains while retaining general knowledge.

🔹 Publication Date: Published on Jul 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.04009
• PDF: https://arxiv.org/pdf/2507.04009
• Github: https://github.com/ConardLi/easy-dataset

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLM #DataSynthesis #FineTuning #AI #NLP

111 views · 06:04

ML Research Hub

✨InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

📝 Summary:
InternVL3 introduces a native multimodal pre-training paradigm, jointly learning from multimodal and text data to overcome conventional alignment challenges. This unified approach, combined with advanced techniques, achieves state-of-the-art performance on multimodal tasks, rivaling proprietary m...

🔹 Publication Date: Published on Apr 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.10479
• PDF: https://arxiv.org/pdf/2504.10479
• Project Page: https://internvl.github.io/blog/2025-04-11-InternVL-3.0/

🔹 Models citing this paper:
• https://huggingface.co/OpenGVLab/InternVL3-78B
• https://huggingface.co/OpenGVLab/InternVL3_5-241B-A28B
• https://huggingface.co/OpenGVLab/InternVL3-8B

✨ Datasets citing this paper:
• https://huggingface.co/datasets/OpenGVLab/MMPR-v1.2-prompts

✨ Spaces citing this paper:
• https://huggingface.co/spaces/AntResearchNLP/ViLaBench
• https://huggingface.co/spaces/TIGER-Lab/MEGA-Bench
• https://huggingface.co/spaces/prithivMLmods/Tiny-VLMs-Lab

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#MultimodalAI #DeepLearning #AIResearch #OpenSourceAI #GenerativeAI

141 views · 06:04

ML Research Hub

✨A decoder-only foundation model for time-series forecasting

📝 Summary:
This paper introduces a decoder-only foundation model for time-series forecasting, motivated by advances in large language models. Its out-of-the-box zero-shot performance on a variety of datasets comes close to that of state-of-the-art supervised forecasting models, across different time scales and granularities.

🔹 Publication Date: Published on Oct 14, 2023

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2310.10688
• PDF: https://arxiv.org/pdf/2310.10688
• Github: https://github.com/google-research/timesfm

🔹 Models citing this paper:
• https://huggingface.co/google/timesfm-1.0-200m
• https://huggingface.co/google/timesfm-2.0-500m-pytorch
• https://huggingface.co/google/timesfm-2.5-200m-pytorch

✨ Spaces citing this paper:
• https://huggingface.co/spaces/autogluon/fev-leaderboard
• https://huggingface.co/spaces/JayLacoma/Trader_Technical_Indicators
• https://huggingface.co/spaces/pavel321/huggingface-cli-completion

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#TimeSeriesForecasting #FoundationModels #MachineLearning #DeepLearning #AI

112 views · 06:04

ML Research Hub

✨RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training

📝 Summary:
RLinf-VLA is a unified framework for scalable reinforcement learning training of vision-language-action models, overcoming supervised fine-tuning limitations. It offers a 1.6x-1.8x speedup, supports diverse architectures and algorithms, and shows strong generalization in simulation and on a real ...

🔹 Publication Date: Published on Oct 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.06710
• PDF: https://arxiv.org/pdf/2510.06710
• Project Page: https://rlinf.readthedocs.io/en/latest/
• Github: https://github.com/RLinf/RLinf

🔹 Models citing this paper:
• https://huggingface.co/RLinf/RLinf-math-7B

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#ReinforcementLearning #VLA #Robotics #AIResearch #MachineLearning

160 views · 06:05

ML Research Hub

✨ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation

📝 Summary:
ChronoEdit ensures physical consistency in image editing by reframing it as a video generation problem. It uses pretrained video models and temporal reasoning tokens to imagine plausible physical transformations between edited images. This approach significantly improves realism and visual fideli...

🔹 Publication Date: Published on Oct 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.04290
• PDF: https://arxiv.org/pdf/2510.04290
• Project Page: https://research.nvidia.com/labs/toronto-ai/chronoedit
• Github: https://github.com/nv-tlabs/ChronoEdit

🔹 Models citing this paper:
• https://huggingface.co/nvidia/ChronoEdit-14B-Diffusers
• https://huggingface.co/vantagewithai/ChronoEdit-GGUF
• https://huggingface.co/vantagewithai/ChronoEdit-fp8-scaled

✨ Spaces citing this paper:
• https://huggingface.co/spaces/nvidia/ChronoEdit
• https://huggingface.co/spaces/JarlJarle/nvidia-ChronoEdit-14B-Diffusers

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#ImageEditing #VideoGeneration #TemporalReasoning #ComputerVision #AIResearch

205 views · 06:05
