ML Research Hub
32.8K subscribers
4.18K photos
253 videos
23 files
4.52K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
GR-Dexter Technical Report

📝 Summary:
GR-Dexter introduces a hardware-model-data framework for bimanual dexterous-hand robot manipulation using VLA models. It combines a new 21-DoF hand, teleoperation for data, and diverse datasets. This framework achieves strong performance and robust generalization in real-world manipulation tasks.

🔹 Publication Date: Published on Dec 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24210
• PDF: https://arxiv.org/pdf/2512.24210
• Project Page: https://byte-dexter.github.io/gr-dexter/

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#Robotics #DexterousManipulation #VLA #RobotHardware #MachineLearning
SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time

📝 Summary:
SpaceTimePilot is a video diffusion model for dynamic scene rendering, offering independent control over spatial viewpoint and temporal motion. It achieves precise space-time disentanglement via a time-embedding, temporal-warping training, and a synthetic dataset.

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.25075
• PDF: https://arxiv.org/pdf/2512.25075
• Project Page: https://zheninghuang.github.io/Space-Time-Pilot/

#VideoDiffusion #GenerativeAI #DynamicScenes #ComputerGraphics #DeepLearning
PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation

📝 Summary:
Recent advances in text-to-video (T2V) generation have achieved good visual quality, yet synthesizing videos that faithfully follow physical laws remains an open challenge. Existing methods mainly bas...

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24551
• PDF: https://arxiv.org/pdf/2512.24551
• Project Page: https://caiyuanhao1998.github.io/project/PhyGDPO/
• Github: https://github.com/caiyuanhao1998/Open-PhyGDPO

#AI #DataScience #MachineLearning #HuggingFace #Research
Scaling Open-Ended Reasoning to Predict the Future

📝 Summary:
This work trains language models for open-ended future prediction using a new dataset synthesized from news. The resulting OpenForecaster 8B model matches larger proprietary models in accuracy, calibration, and consistency. All resources are open-sourced.

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.25070
• PDF: https://arxiv.org/pdf/2512.25070
• Project Page: https://www.openforecaster.github.io
• Github: https://github.com/OpenForecaster/scaling-forecasting-training

#LLMs #FuturePrediction #AI #OpenSourceAI #MachineLearning
Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process

📝 Summary:
This paper introduces RISE, an unsupervised framework using sparse auto-encoders to discover and control LLM reasoning behaviors. It identifies interpretable reasoning vectors like reflection and backtracking, enabling targeted interventions and discovery of novel behaviors without retraining.

🔹 Publication Date: Published on Dec 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23988
• PDF: https://arxiv.org/pdf/2512.23988

#LLM #AI #MachineLearning #AIReasoning #Interpretability
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

📝 Summary:
The Agentic Learning Ecosystem (ALE) is a new infrastructure to streamline LLM agent development for real-world tasks. ALE comprises ROLL for optimization, ROCK for sandboxing, and iFlow CLI for context. The ROME agent, built with ALE, shows strong benchmark performance.

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24873
• PDF: https://arxiv.org/pdf/2512.24873

#AIAgents #LLMDevelopment #AgenticLearning #AIArchitecture #MachineLearning
Figure It Out: Improving the Frontier of Reasoning with Active Visual Thinking

📝 Summary:
Complex reasoning problems often involve implicit spatial, geometric, and structural relationships that are not explicitly encoded in text. While recent reasoning models have achieved strong performan...

🔹 Publication Date: Published on Dec 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24297
• PDF: https://arxiv.org/pdf/2512.24297
• Github: https://github.com/chenmeiqii/FIGR

#AI #DataScience #MachineLearning #HuggingFace #Research
Pretraining Frame Preservation in Autoregressive Video Memory Compression

📝 Summary:
We present PFP, a neural network structure to compress long videos into short contexts, with an explicit pretraining objective to preserve the high-frequency details of single frames at arbitrary temp...

🔹 Publication Date: Published on Dec 29, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23851
• PDF: https://arxiv.org/pdf/2512.23851
• Github: https://github.com/lllyasviel/PFP

#AI #DataScience #MachineLearning #HuggingFace #Research
Factorized Learning for Temporally Grounded Video-Language Models

📝 Summary:
Video-language models struggle with temporal grounding when grounding and textual response are learned as coupled tasks. The D^2VLM framework decouples the two using evidence tokens. Factorized preference optimization then explicitly optimizes temporal grounding for both tasks.

🔹 Publication Date: Published on Dec 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24097
• PDF: https://arxiv.org/pdf/2512.24097
• Project Page: https://github.com/nusnlp/d2vlm
• Github: https://github.com/nusnlp/d2vlm

#AI #DataScience #MachineLearning #HuggingFace #Research
JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

📝 Summary:
This paper presents JavisGPT, the first unified multimodal large language model (MLLM) for Joint Audio-Video (JAV) comprehension and generation. JavisGPT adopts a concise encoder-LLM-decoder architect...

🔹 Publication Date: Published on Dec 28, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22905
• PDF: https://arxiv.org/pdf/2512.22905
• Project Page: https://javisverse.github.io/JavisGPT-page/
• Github: https://github.com/JavisVerse/JavisGPT

🔹 Models citing this paper:
https://huggingface.co/JavisVerse/JavisGPT-v0.1-7B-Instruct

Datasets citing this paper:
https://huggingface.co/datasets/JavisVerse/MM-PreTrain
https://huggingface.co/datasets/JavisVerse/JavisUnd-Eval
https://huggingface.co/datasets/JavisVerse/AV-FineTune

#AI #DataScience #MachineLearning #HuggingFace #Research
Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems

📝 Summary:
The rapid advancement of autonomous systems, including self-driving vehicles and drones, has intensified the need to forge true Spatial Intelligence from multi-modal onboard sensor data. While foundat...

🔹 Publication Date: Published on Dec 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24385
• PDF: https://arxiv.org/pdf/2512.24385
• Github: https://github.com/worldbench/awesome-spatial-intelligence

#AI #DataScience #MachineLearning #HuggingFace #Research
Valori: A Deterministic Memory Substrate for AI Systems

📝 Summary:
Valori introduces a deterministic AI memory substrate using fixed-point arithmetic, ensuring bit-identical results across platforms. This eliminates non-determinism from floating-point operations in vector embeddings and search, making AI systems trustworthy and verifiable.
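
A minimal sketch of the fixed-point idea described above, not Valori's actual implementation: floats are quantized once to integers, and the dot product is then accumulated purely in integer arithmetic, so the accumulation involves no floating-point rounding and does not depend on summation order. The Q16.16 format and the names `to_fixed`, `to_float`, and `fixed_dot` are assumptions made for illustration.

```python
# Hypothetical Q16.16 fixed-point format: 16 fractional bits (an assumption,
# not taken from the Valori paper).
SCALE_BITS = 16
SCALE = 1 << SCALE_BITS


def to_fixed(x: float) -> int:
    """Quantize a float to a Q16.16 integer."""
    return round(x * SCALE)


def to_float(x: int) -> float:
    """Convert a Q16.16 integer back to a float for display."""
    return x / SCALE


def fixed_dot(a: list[int], b: list[int]) -> int:
    """Dot product of two Q16.16 vectors, returned in Q16.16.

    Integer addition is associative, so any summation order gives the
    same bits. Python ints never overflow; a real kernel would need a
    wide fixed-width accumulator instead.
    """
    acc = 0
    for x, y in zip(a, b):
        acc += x * y          # each product carries 32 fractional bits
    return acc >> SCALE_BITS  # rescale back to 16 fractional bits


a = [to_fixed(v) for v in (0.1, 0.2, 0.3)]
b = [to_fixed(v) for v in (0.4, 0.5, 0.6)]

forward = fixed_dot(a, b)
backward = fixed_dot(a[::-1], b[::-1])  # same pairs, reversed order
print(forward == backward, to_float(forward))
```

The trade-off in this sketch is precision: values are rounded to multiples of 1/65536 at quantization time, in exchange for arithmetic that is exactly reproducible.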

🔹 Publication Date: Published on Dec 25, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22280
• PDF: https://arxiv.org/pdf/2512.22280
• Project Page: https://valori.systems/
• Github: https://github.com/varshith-Git/Valori-Kernel

#AI #DataScience #MachineLearning #HuggingFace #Research
BEDA: Belief Estimation as Probabilistic Constraints for Performing Strategic Dialogue Acts

📝 Summary:
A framework called BEDA uses probabilistic constraints on belief estimation to improve strategic dialogue through formalized adversarial and alignment acts, outperforming baselines across multiple tas...

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24885
• PDF: https://arxiv.org/pdf/2512.24885

#AI #DataScience #MachineLearning #HuggingFace #Research
GaMO: Geometry-aware Multi-view Diffusion Outpainting for Sparse-View 3D Reconstruction

📝 Summary:
GaMO improves sparse-view 3D reconstruction by using geometry-aware multi-view outpainting. It expands existing views to enhance scene coverage and consistency. This achieves state-of-the-art quality 25x faster than prior methods, with reduced computational cost.

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.25073
• PDF: https://arxiv.org/pdf/2512.25073
• Project Page: https://yichuanh.github.io/GaMO/
• Github: https://yichuanh.github.io/GaMO/

#3DReconstruction #ComputerVision #DiffusionModels #GaMO #AI
Geometry-Aware Optimization for Respiratory Sound Classification: Enhancing Sensitivity with SAM-Optimized Audio Spectrogram Transformers

📝 Summary:
This paper improves respiratory sound classification by training an Audio Spectrogram Transformer (AST) with Sharpness-Aware Minimization (SAM). SAM optimizes the geometry of the loss surface toward flatter minima, yielding a state-of-the-art score of 68.10% and a sensitivity of 68.31% on ICBHI 2017.
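
For readers unfamiliar with SAM, here is a minimal sketch of one update step, assuming SAM here means Sharpness-Aware Minimization in its commonly described two-gradient form. It is shown on a toy quadratic loss rather than an Audio Spectrogram Transformer; the names `sam_step`, `quad_loss`, and `quad_grad` and the values of `lr` and `rho` are illustrative assumptions, not the paper's settings.

```python
import math


def quad_loss(w):
    """Toy loss L(w) = 0.5 * (w0^2 + 10 * w1^2), standing in for a network loss."""
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)


def quad_grad(w):
    """Analytic gradient of quad_loss."""
    return [w[0], 10.0 * w[1]]


def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One SAM-style step (illustrative hyperparameters).

    1. Take the gradient at w.
    2. Move to the nearby "worst-case" point w + rho * g / ||g||.
    3. Take the gradient there and apply it to the original w.
    """
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g)) + 1e-12  # avoid division by zero
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


w0 = [1.0, 1.0]
w1 = sam_step(w0, quad_grad)
print(quad_loss(w0), quad_loss(w1))
```

Using the gradient from the perturbed point is what biases the optimizer toward regions where the loss stays low in a neighborhood, i.e. flatter minima, at the cost of two gradient evaluations per step.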

🔹 Publication Date: Published on Dec 27, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22564
• PDF: https://arxiv.org/pdf/2512.22564

#RespiratoryHealth #MedicalAI #DeepLearning #SoundClassification #AIHealthcare
AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents

📝 Summary:
This paper bridges the gap between human memory systems and AI agent memory design. It synthesizes interdisciplinary knowledge, comparing biological and artificial memory mechanisms, reviewing benchmarks, and exploring security and future directions.

🔹 Publication Date: Published on Dec 29, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23343
• PDF: https://arxiv.org/pdf/2512.23343
• Github: https://github.com/AgentMemory/Huaman-Agent-Memory

#AI #CognitiveNeuroscience #MemorySystems #AutonomousAgents #BrainInspiredAI
mHC: Manifold-Constrained Hyper-Connections

📝 Summary:
Manifold-Constrained Hyper-Connections (mHC) resolve the training instability and scalability issues of Hyper-Connections (HC). mHC restores identity mapping via manifold projection and infrastructure optimization, enabling effective large-scale training with improved performance.

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24880
• PDF: https://arxiv.org/pdf/2512.24880

#MachineLearning #DeepLearning #NeuralNetworks #ManifoldLearning #AI
Kronos: A Foundation Model for the Language of Financial Markets

📝 Summary:
Kronos is a novel foundation model for financial K-line data. It uses a specialized tokenizer and autoregressive pre-training on a vast dataset to significantly outperform existing models in price and volatility forecasting, and synthetic data generation, establishing it as a versatile tool for f...

🔹 Publication Date: Published on Aug 2, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.02739
• PDF: https://arxiv.org/pdf/2508.02739
• Github: https://github.com/shiyu-coder/Kronos

🔹 Models citing this paper:
https://huggingface.co/NeoQuasar/Kronos-base
https://huggingface.co/NeoQuasar/Kronos-Tokenizer-base
https://huggingface.co/NeoQuasar/Kronos-mini

Spaces citing this paper:
https://huggingface.co/spaces/ByronWang2005/Kronos-CS2-Skins-Forecast-Demo
https://huggingface.co/spaces/yangyang158/kronos
https://huggingface.co/spaces/heyunfei/crypt

#FoundationModel #FinancialAI #DeepLearning #QuantitativeFinance #Forecasting
Guiding a Diffusion Transformer with the Internal Dynamics of Itself

📝 Summary:
This paper introduces Internal Guidance (IG) for diffusion models, which adds auxiliary supervision to intermediate layers during training and extrapolates outputs during sampling. This simple strategy significantly improves training efficiency and generation quality. IG achieves state-of-the-art F...
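
The sampling-time extrapolation can be pictured with a rough sketch. This assumes a guidance rule of the same form as classifier-free guidance, with the intermediate-layer prediction playing the role of the weaker prediction; the summary does not give the paper's exact rule, so the formula, the function name `extrapolate`, and the weight `w` are all hypothetical.

```python
def extrapolate(final_out, intermediate_out, w=1.5):
    """Push the prediction away from the intermediate output, past the final one.

    guided = intermediate + w * (final - intermediate)
    With w = 1 this returns the final output unchanged; w > 1 extrapolates.
    The rule and the default weight are assumptions, not taken from the paper.
    """
    return [i + w * (f - i) for f, i in zip(final_out, intermediate_out)]


final_pred = [1.0, 2.0]         # hypothetical final-layer prediction
intermediate_pred = [0.5, 1.0]  # hypothetical intermediate-layer prediction
print(extrapolate(final_pred, intermediate_pred, w=2.0))
```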

🔹 Publication Date: Published on Dec 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24176
• PDF: https://arxiv.org/pdf/2512.24176
• Project Page: https://zhouxingyu13.github.io/Internal-Guidance/
• Github: https://github.com/CVL-UESTC/Internal-Guidance

#DiffusionModels #AI #DeepLearning #GenerativeAI #ComputerVision