ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
ReMiT: RL-Guided Mid-Training for Iterative LLM Evolution

📝 Summary:
ReMiT introduces a bidirectional training approach for LLMs. It leverages RL-guided mid-training to dynamically reweight tokens, improving pre-training performance and sustaining gains throughout post-training. This creates a self-reinforcing, iterative evolution cycle for LLMs.

🔹 Publication Date: Published on Feb 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03075
• PDF: https://arxiv.org/pdf/2602.03075

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLM #ReinforcementLearning #MachineLearning #AITraining #DeepLearning
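The token-reweighting idea can be pictured as a weighted cross-entropy in which an RL-derived per-token signal scales each token's loss. This is a minimal NumPy sketch, not ReMiT's implementation; the softmax-over-advantages weighting and the `temperature` parameter are assumptions.

```python
import numpy as np

def reweighted_nll(token_logps, advantages, temperature=1.0):
    """Token-level NLL where an RL-derived advantage reweights each token.

    token_logps: log-probabilities of the target tokens, shape (T,)
    advantages:  RL-guided per-token scores, shape (T,)
    """
    w = np.exp(np.asarray(advantages, dtype=float) / temperature)
    w = w / w.mean()  # normalize so the mean weight is 1
    return -(w * np.asarray(token_logps, dtype=float)).mean()
```

With uniform advantages this reduces to the ordinary mean NLL; skewed advantages push gradient mass toward tokens the RL signal marks as informative.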
Self-Improving World Modelling with Latent Actions

📝 Summary:
SWIRL learns world models from state-only data by treating actions as latent variables. It alternates forward and inverse dynamics modeling, using information maximization and ELBO, to achieve improved performance across diverse reasoning and planning tasks.

🔹 Publication Date: Published on Feb 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06130
• PDF: https://arxiv.org/pdf/2602.06130

==================================

#WorldModels #ReinforcementLearning #LatentVariables #MachineLearning #AI
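For a discrete latent-action space, the ELBO that couples the inverse model's posterior with the forward model fits in a few lines. A minimal NumPy sketch, assuming discrete latent actions and precomputed log-likelihoods; the names are illustrative, not from SWIRL's code.

```python
import numpy as np

def latent_action_elbo(q_a, log_p_next, log_prior):
    """ELBO for one (s, s') pair with discrete latent actions.

    q_a:        inverse-model posterior over latent actions, shape (A,)
    log_p_next: forward-model log p(s' | s, a) for each action, shape (A,)
    log_prior:  log prior over latent actions, shape (A,)
    """
    q = np.asarray(q_a, dtype=float)
    recon = (q * np.asarray(log_p_next)).sum()                     # reconstruction
    kl = (q * (np.log(q + 1e-12) - np.asarray(log_prior))).sum()   # KL(q || prior)
    return recon - kl
```

Alternating updates of the inverse model (raising `q_a`'s quality) and the forward model (raising `log_p_next`) both increase this bound.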
Pisets: A Robust Speech Recognition System for Lectures and Interviews

📝 Summary:
Pisets is a robust Russian speech-to-text system combining Wav2Vec2, AST, and Whisper models. It uses curriculum learning and uncertainty modeling to improve accuracy and reduce hallucinations for long audio, outperforming other Whisper variants.

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18415
• PDF: https://arxiv.org/pdf/2601.18415

🔹 Models citing this paper:
https://huggingface.co/bond005/wav2vec2-large-ru-golos
https://huggingface.co/bond005/whisper-large-v3-ru-podlodka

🔹 Spaces citing this paper:
https://huggingface.co/spaces/ehristoforu/server0001
https://huggingface.co/spaces/dimafatality/bond005-wav2vec2-large-ru-golos
https://huggingface.co/spaces/PatrickRedStar/video_image

==================================

#SpeechRecognition #AI #MachineLearning #NLP #WhisperAI
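One way to picture the uncertainty-modeling idea: gate each decoded segment on a confidence score so that low-confidence, hallucination-prone stretches of long audio are flagged rather than trusted. A simplified sketch; the threshold and segment format are assumptions, not the actual Pisets pipeline.

```python
def gate_segments(segments, min_conf=0.6):
    """Split ASR segments into trusted and flagged by decoder confidence.

    segments: list of (text, confidence) pairs from an ASR decoder.
    """
    kept, flagged = [], []
    for text, conf in segments:
        (kept if conf >= min_conf else flagged).append(text)
    return kept, flagged
```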
compar:IA: The French Government's LLM arena to collect French-language human prompts and preference data

📝 Summary:
Compar:IA is an open-source platform from the French government that collects large-scale French human preference data for LLM training. It addresses the scarcity of non-English data via a blind pairwise comparison interface and releases three datasets, aiming to serve as an international public good.

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06669
• PDF: https://arxiv.org/pdf/2602.06669
• Project Page: https://comparia.beta.gouv.fr/
• Github: https://github.com/betagouv/ComparIA

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
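Blind pairwise votes of this kind are typically turned into a leaderboard with a Bradley-Terry or Elo-style rating; a minimal Elo sketch is below. This is a generic arena-style rating update, not necessarily the method compar:IA uses.

```python
def elo_update(r_a, r_b, a_wins, k=32.0):
    """One Elo update from a single blind pairwise vote."""
    expected_a = 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))
    score_a = 1.0 if a_wins else 0.0
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta
```

Running this over a stream of votes converges toward a ranking consistent with the collected preferences.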
AtlasPatch: An Efficient and Scalable Tool for Whole Slide Image Preprocessing in Computational Pathology

📝 Summary:
AtlasPatch is an efficient and scalable tool for whole-slide image preprocessing. It uses a fine-tuned Segment-Anything model for accurate tissue detection and high-throughput patch extraction, significantly reducing computational overhead and matching state-of-the-art performance.

🔹 Publication Date: Published on Feb 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03998
• PDF: https://arxiv.org/pdf/2602.03998

🔹 Models citing this paper:
https://huggingface.co/AtlasAnalyticsLab/AtlasPatch

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
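The high-throughput patch-extraction step reduces to: segment tissue, then keep only grid patches with enough tissue coverage. A NumPy sketch assuming a binary tissue mask (e.g. from the fine-tuned segmentation model); patch size and threshold values are illustrative.

```python
import numpy as np

def tissue_patch_coords(mask, patch=4, min_tissue=0.5):
    """Top-left (x, y) of grid patches covering at least min_tissue tissue.

    mask: binary tissue mask (1 = tissue) at the extraction resolution.
    """
    h, w = mask.shape
    coords = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            if mask[y:y + patch, x:x + patch].mean() >= min_tissue:
                coords.append((x, y))
    return coords
```

Filtering on the mask before reading pixels is what saves compute: background patches are never decoded from the slide.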
Uncovering Cross-Objective Interference in Multi-Objective Alignment

📝 Summary:
Multi-objective alignment in LLMs suffers from cross-objective interference where improving performance on some objectives degrades others, with a covariance-based analysis and a proposed method to ma...

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06869
• PDF: https://arxiv.org/pdf/2602.06869
• Github: https://github.com/yining610/ctwa

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
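The covariance-style analysis can be illustrated by checking pairwise inner products of per-objective gradients: a negative entry means a step helping one objective hurts another. An illustrative sketch, not the paper's exact statistic.

```python
import numpy as np

def interference_matrix(grads):
    """Pairwise inner products of per-objective gradients.

    grads: array-like of shape (n_objectives, dim). A negative [i, j]
    entry means objectives i and j pull parameters in conflicting directions.
    """
    g = np.asarray(grads, dtype=float)
    return g @ g.T
```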
SE-Bench: Benchmarking Self-Evolution with Knowledge Internalization

📝 Summary:
SE-Bench presents a diagnostic environment that obscures NumPy's API to evaluate agents' ability to internally store and utilize novel knowledge without external documentation, revealing challenges in...

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04811
• PDF: https://arxiv.org/pdf/2602.04811

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
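The "obscured API" setup can be approximated by re-exposing a library's functions under opaque names, so an agent must rely on internalized knowledge rather than recognizable identifiers. A toy sketch with an assumed rename map; SE-Bench's actual obfuscation is presumably more involved.

```python
import numpy as np

def obscure_api(module, rename):
    """Expose module attributes only under opaque names."""
    return {opaque: getattr(module, real) for opaque, real in rename.items()}

# The agent sees only fn_0 / fn_1, never the NumPy names.
hidden = obscure_api(np, {"fn_0": "mean", "fn_1": "argsort"})
```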
Large Language Model Reasoning Failures

📝 Summary:
This paper surveys reasoning failures in large language models, proposing a novel categorization. It classifies failures into embodied and non-embodied types, and further into fundamental, application-specific, and robustness issues. The work unifies research to guide future efforts for stronger ...

🔹 Publication Date: Published on Feb 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06176
• PDF: https://arxiv.org/pdf/2602.06176

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
SPARC: Separating Perception And Reasoning Circuits for Test-time Scaling of VLMs

📝 Summary:
SPARC decouples visual perception and reasoning in VLMs using a two-stage pipeline. This enables efficient test-time scaling with targeted compute allocation, significantly improving visual reasoning performance and reducing token budget compared to monolithic baselines.

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06566
• PDF: https://arxiv.org/pdf/2602.06566

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models

📝 Summary:
Generative Reward Models suffer from deceptive alignment when trained to prioritize outcome accuracy alone. The authors introduce Rationale Consistency, a metric that aligns a reward model's reasoning with human judgment; combined with a hybrid training signal, it improves performance, avoids deceptive alignment, and strengthens RLHF.

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04649
• PDF: https://arxiv.org/pdf/2602.04649
• Github: https://github.com/QwenLM/RationaleRM

🔹 Datasets citing this paper:
https://huggingface.co/datasets/Qwen/RationaleRM

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
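The hybrid training signal amounts to blending outcome accuracy with a rationale-consistency score. A minimal sketch; the mixing weight `alpha` is an assumed free parameter, not a value from the paper.

```python
def hybrid_reward(outcome_correct, rationale_consistency, alpha=0.5):
    """Blend outcome accuracy with a rationale-consistency score in [0, 1]."""
    return alpha * float(outcome_correct) + (1.0 - alpha) * rationale_consistency
```

A reward model that is right for the wrong reasons (outcome 1, consistency 0) no longer scores perfectly, which is how the blend discourages deceptive alignment.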
Uncertainty Drives Social Bias Changes in Quantized Large Language Models

📝 Summary:
Post-training quantization of large language models causes significant changes in social biases that aggregate metrics fail to detect, with quantization-induced masked bias flipping occurring more fre...

🔹 Publication Date: Published on Feb 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06181
• PDF: https://arxiv.org/pdf/2602.06181

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
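Why aggregate metrics miss the effect: per-item bias scores can flip sign under quantization while the overall mean stays flat. A sketch of a flip-rate measurement; the score format is an assumption.

```python
import numpy as np

def bias_flip_rate(pre, post):
    """Fraction of items whose bias score changes sign after quantization."""
    pre, post = np.asarray(pre, dtype=float), np.asarray(post, dtype=float)
    return float(np.mean(np.sign(pre) != np.sign(post)))
```

In the test below every item flips, yet the aggregate sum is unchanged, so a mean-based metric would report no bias shift.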
Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO

📝 Summary:
TP-GRPO enhances GRPO for flow matching by using step-level incremental rewards instead of outcome-based ones. It also identifies turning points in denoising trajectories to capture and aggregate long-term effects. This improves reward signal effectiveness and consistently enhances generation qua...

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06422
• PDF: https://arxiv.org/pdf/2602.06422
• Github: https://github.com/YunzeTong/TurningPoint-GRPO

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
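The two ingredients, step-level incremental rewards and turning points in the denoising trajectory, can both be sketched from a per-step quality score. Illustrative only; TP-GRPO's actual reward and turning-point criteria are more elaborate.

```python
import numpy as np

def step_rewards(scores):
    """Incremental reward per denoising step: quality gained over the previous step."""
    return np.diff(np.asarray(scores, dtype=float))

def turning_points(scores):
    """Step indices where the incremental reward changes sign."""
    d = step_rewards(scores)
    return [i + 1 for i in range(1, len(d)) if d[i - 1] * d[i] < 0]
```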
NanoQuant: Efficient Sub-1-Bit Quantization of Large Language Models

📝 Summary:
NanoQuant enables efficient post-training quantization of large language models to binary and sub-1-bit levels using low-rank binary factorization and ADMM optimization, achieving state-of-the-art acc...

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06694
• PDF: https://arxiv.org/pdf/2602.06694

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
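Low-rank binary factorization stores W ≈ s · U V with U, V in {-1, +1}, so a rank-r factorization of an m×n matrix costs r(m+n) bits plus one scale, i.e. under 1 bit per weight for small r. In this sketch a crude alternating-sign heuristic stands in for the paper's ADMM optimization.

```python
import numpy as np

def binary_lowrank(W, rank=4, iters=25, seed=0):
    """Approximate W with scale * U @ V, U in {-1,+1}^(m,r), V in {-1,+1}^(r,n)."""
    rng = np.random.default_rng(seed)
    m, n = W.shape
    V = np.where(rng.standard_normal((rank, n)) >= 0, 1.0, -1.0)
    U = np.ones((m, rank))
    for _ in range(iters):
        U = np.where(W @ V.T >= 0, 1.0, -1.0)  # alternating sign updates
        V = np.where(U.T @ W >= 0, 1.0, -1.0)
    approx = U @ V
    scale = (W * approx).sum() / ((approx * approx).sum() + 1e-12)
    return scale * approx
```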
RelayGen: Intra-Generation Model Switching for Efficient Reasoning

📝 Summary:
RelayGen is a training-free framework that dynamically switches between large and small models during reasoning by identifying difficulty transitions at the segment level, achieving faster inference w...

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06454
• PDF: https://arxiv.org/pdf/2602.06454
• Github: https://github.com/jiwonsong-dev/RelayGen

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
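The segment-level switching policy reduces to routing each segment to the large or small model based on an estimated difficulty. A toy sketch; how RelayGen detects difficulty transitions is abstracted away behind the `difficulties` scores.

```python
def route_segments(difficulties, threshold=0.5):
    """Assign each reasoning segment to 'large' or 'small' by difficulty."""
    return ["large" if d >= threshold else "small" for d in difficulties]
```

The speedup comes from the easy segments: every segment routed to "small" is decoded at the small model's cost.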
Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models

📝 Summary:
Researchers address the modality gap in multimodal learning by proposing a fixed-frame theory and a training-free alignment method that enables efficient scaling of multimodal models using unpaired da...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07026
• PDF: https://arxiv.org/pdf/2602.07026

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
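Training-free subspace alignment between modalities can be done in closed form, for example with orthogonal Procrustes on embedding matrices. This is a generic sketch of closed-form alignment, not the paper's fixed-frame construction.

```python
import numpy as np

def procrustes_align(X, Y):
    """Orthogonal R minimizing ||X @ R - Y||_F (closed form, no training)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt
```

Because the map is a single SVD rather than a trained projector, it scales to new modality pairs at negligible cost.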
ECO: Energy-Constrained Optimization with Reinforcement Learning for Humanoid Walking

📝 Summary:
An energy-constrained optimization framework separates energy metrics from rewards using a Lagrangian method, achieving stable, energy-efficient humanoid robot locomotion with reduced hyperparameter tuning...

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06445
• PDF: https://arxiv.org/pdf/2602.06445

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
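The Lagrangian separation can be sketched as a task reward plus a learned multiplier that rises whenever energy use exceeds its budget. A minimal sketch; the learning rate and the exact constraint form are assumptions.

```python
def lagrangian_objective(reward, energy, budget, lmbda):
    """Constrained RL objective: task reward minus penalized energy overshoot."""
    return reward - lmbda * (energy - budget)

def dual_step(lmbda, energy, budget, lr=0.05):
    """Gradient ascent on the multiplier, clipped at zero (inequality constraint)."""
    return max(0.0, lmbda + lr * (energy - budget))
```

Because the multiplier adapts automatically, there is no hand-tuned energy-penalty weight, which is the hyperparameter-tuning reduction the summary refers to.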
Cybersecurity AI: Humanoid Robots as Attack Vectors

📝 Summary:
The Unitree G1 humanoid robot is vulnerable to BLE provisioning protocol exploits, exfiltrates sensor data, and can be repurposed for active cyber operations, highlighting the need for improved securi...

🔹 Publication Date: Published on Sep 17, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.14139
• PDF: https://arxiv.org/pdf/2509.14139
• Project Page: https://aliasrobotics.com
• Github: https://github.com/aliasrobotics/cai

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Towards Bridging the Gap between Large-Scale Pretraining and Efficient Finetuning for Humanoid Control

📝 Summary:
Off-policy Soft Actor-Critic with large-batch updates enables efficient humanoid locomotion policy pretraining, while model-based methods facilitate safe adaptation through deterministic data collecti...

🔹 Publication Date: Published on Jan 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.21363
• PDF: https://arxiv.org/pdf/2601.21363
• Github: https://github.com/bigai-ai/LIFT-humanoid

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
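The off-policy Soft Actor-Critic backbone rests on the soft Bellman target, which differs from the vanilla TD target by an entropy bonus. A one-function sketch assuming scalar inputs; batching, twin critics, and target networks are omitted.

```python
def soft_td_target(reward, q_next, logp_next, alpha=0.2, gamma=0.99, done=False):
    """Soft Bellman backup: TD target with an entropy bonus (-alpha * log pi)."""
    v_next = q_next - alpha * logp_next  # soft state value
    return reward + (0.0 if done else gamma) * v_next
```

Large-batch updates amortize this backup over many replayed transitions per gradient step, which is what makes the off-policy pretraining sample-efficient.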