ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Canzona: A Unified, Asynchronous, and Load-Balanced Framework for Distributed Matrix-based Optimizers

📝 Summary:
Canzona presents a unified asynchronous framework that addresses the conflict between matrix-based optimizers and distributed tensor fragmentation in LLM training, improving efficiency and reducing latency.

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06079
• PDF: https://arxiv.org/pdf/2602.06079

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
QuantLRM: Quantization of Large Reasoning Models via Fine-Tuning Signals

📝 Summary:
QuantLRM improves Large Reasoning Model quantization by using weight update magnitudes from fine-tuning to estimate channel importance. It protects both smallest and largest updates, consistently outperforming traditional methods and applying even to non-fine-tuned models.
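
A minimal sketch of the channel-importance idea (an assumed setup, not the paper's implementation): rank output channels by the magnitude of their fine-tuning updates and protect both extremes from aggressive quantization.

```python
import numpy as np

def protected_channels(w_base, w_ft, k=2):
    """Rank output channels by mean |fine-tuning update| and mark the
    k smallest- and k largest-update channels as protected (e.g. kept
    in higher precision during quantization)."""
    delta = np.abs(w_ft - w_base).mean(axis=1)  # per-channel update magnitude
    order = np.argsort(delta)
    return set(order[:k]) | set(order[-k:])

rng = np.random.default_rng(0)
w_base = rng.normal(size=(8, 16))
w_ft = w_base + rng.normal(scale=0.01, size=(8, 16))
protect = protected_channels(w_base, w_ft, k=2)
print(sorted(protect))  # four protected channel indices
```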

🔹 Publication Date: Published on Jan 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02581
• PDF: https://arxiv.org/pdf/2602.02581
• Github: https://github.com/psunlpgroup/QuantLRM

🔹 Models citing this paper:
https://huggingface.co/nanzhang/QuantLRM-R1-Qwen-32B-3-bit
https://huggingface.co/nanzhang/QuantLRM-R1-Llama-70B-3-bit
https://huggingface.co/nanzhang/QuantLRM-R1-Qwen3-8B-3-bit

==================================


#Quantization #LargeLanguageModels #DeepLearning #AI #ModelCompression
Table-as-Search: Formulate Long-Horizon Agentic Information Seeking as Table Completion

📝 Summary:
Table-as-Search (TaS) reformulates information seeking as table completion to robustly manage long-horizon search states. By mapping queries to structured tables, TaS explicitly tracks progress and plans, significantly outperforming baselines in complex search tasks.
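
The table-completion framing can be sketched as follows (a toy analogue, with a mock search function standing in for real retrieval): empty cells are pending sub-goals, filled cells are resolved facts, so search progress is explicit.

```python
def next_subgoal(table):
    """Return the first (row, column) still missing, or None if done."""
    for row, attrs in table.items():
        for col, val in attrs.items():
            if val is None:
                return row, col
    return None

def complete(table, search_fn):
    """Fill the table one cell at a time; each empty cell triggers one
    search step."""
    while (goal := next_subgoal(table)) is not None:
        row, col = goal
        table[row][col] = search_fn(row, col)
    return table

mock_kb = {("Paper A", "venue"): "ICML", ("Paper A", "year"): "2025"}
state = {"Paper A": {"venue": None, "year": None}}
complete(state, lambda r, c: mock_kb[(r, c)])
print(state)
```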

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06724
• PDF: https://arxiv.org/pdf/2602.06724
• Github: https://github.com/AIDC-AI/Marco-DeepResearch/

==================================


#AI #InformationRetrieval #AgenticAI #TableCompletion #SearchAlgorithms
OmniVideo-R1: Reinforcing Audio-visual Reasoning with Query Intention and Modality Attention

📝 Summary:
OmniVideo-R1 is a reinforced framework that enhances audio-visual understanding. It uses self-supervised query-intention grounding and contrastive modality-attentive fusion. Experiments show OmniVideo-R1 consistently outperforms baselines, demonstrating its effectiveness.
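
Generic modality-attentive fusion can be sketched like this (illustrative only; OmniVideo-R1's fusion is more involved): score each modality embedding against a query vector, softmax the scores, and mix.

```python
import numpy as np

def modality_attentive_fusion(video_feat, audio_feat, query):
    """Attention over modalities: weight each modality by its scaled
    dot-product score against the query, then return the weighted sum."""
    feats = np.stack([video_feat, audio_feat])    # (2, d)
    scores = feats @ query / np.sqrt(len(query))  # (2,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights, weights @ feats

v, a, q = np.ones(4), np.zeros(4), np.ones(4)
w, fused = modality_attentive_fusion(v, a, q)
print(w)  # the video modality gets the larger weight for this query
```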

🔹 Publication Date: Published on Feb 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.05847
• PDF: https://arxiv.org/pdf/2602.05847

==================================


#AudioVisualAI #SelfSupervisedLearning #DeepLearning #MultimodalAI #AIResearch
SEAD: Self-Evolving Agent for Multi-Turn Service Dialogue

📝 Summary:
SEAD enables service dialogue agents to learn effective strategies through self-evolving, decoupled user modeling. This trains agents without large human annotations, significantly improving task completion and dialogue efficiency compared to existing models.

🔹 Publication Date: Published on Feb 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03548
• PDF: https://arxiv.org/pdf/2602.03548
• Github: https://github.com/Da1yuqin/SEAD

🔹 Models citing this paper:
https://huggingface.co/dayll/SEAD-14B

==================================


#AI #ConversationalAI #ReinforcementLearning #NLP #AIagents
ReMiT: RL-Guided Mid-Training for Iterative LLM Evolution

📝 Summary:
ReMiT introduces a bidirectional training approach for LLMs. It leverages RL-guided mid-training to dynamically reweight tokens, improving pre-training performance and sustaining gains throughout post-training. This creates a self-reinforcing, iterative evolution cycle for LLMs.
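
The token-reweighting idea can be sketched as a weighted cross-entropy (the weights here are given directly; ReMiT derives them from an RL-guided policy, which we do not model):

```python
def weighted_token_loss(token_nll, weights):
    """Average per-token negative log-likelihoods under a per-token
    weight vector, normalized by the total weight."""
    assert len(token_nll) == len(weights)
    total_w = sum(weights)
    return sum(w * l for w, l in zip(weights, token_nll)) / total_w

nll = [2.0, 0.5, 1.0]            # per-token losses
uniform = weighted_token_loss(nll, [1.0, 1.0, 1.0])
upweight_hard = weighted_token_loss(nll, [2.0, 0.5, 1.0])
print(uniform, upweight_hard)    # upweighting hard tokens raises the loss focus
```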

🔹 Publication Date: Published on Feb 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03075
• PDF: https://arxiv.org/pdf/2602.03075

==================================


#LLM #ReinforcementLearning #MachineLearning #AITraining #DeepLearning
Self-Improving World Modelling with Latent Actions

📝 Summary:
SWIRL learns world models from state-only data by treating actions as latent variables. It alternates forward and inverse dynamics modeling, using information maximization and ELBO, to achieve improved performance across diverse reasoning and planning tasks.
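
A minimal latent-action sketch (an assumed discrete setup, not SWIRL itself): the inverse model labels each state transition with a latent code from a small codebook, and the forward model predicts the next state from (state, code).

```python
import numpy as np

def inverse_model(s, s_next, codebook):
    """Assign the transition to the nearest codebook delta."""
    delta = s_next - s
    return int(np.argmin(np.linalg.norm(codebook - delta, axis=1)))

def forward_model(s, code, codebook):
    """Predict the next state by applying the code's delta."""
    return s + codebook[code]

codebook = np.array([[1.0, 0.0], [0.0, 1.0]])  # two latent "actions"
s, s_next = np.array([0.0, 0.0]), np.array([0.9, 0.1])
z = inverse_model(s, s_next, codebook)
pred = forward_model(s, z, codebook)
print(z, pred)
```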

🔹 Publication Date: Published on Feb 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06130
• PDF: https://arxiv.org/pdf/2602.06130

==================================


#WorldModels #ReinforcementLearning #LatentVariables #MachineLearning #AI
Pisets: A Robust Speech Recognition System for Lectures and Interviews

📝 Summary:
Pisets is a robust Russian speech-to-text system combining Wav2Vec2, AST, and Whisper models. It uses curriculum learning and uncertainty modeling to improve accuracy and reduce hallucinations for long audio, outperforming other Whisper variants.

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18415
• PDF: https://arxiv.org/pdf/2601.18415

🔹 Models citing this paper:
https://huggingface.co/bond005/wav2vec2-large-ru-golos
https://huggingface.co/bond005/whisper-large-v3-ru-podlodka

🔹 Spaces citing this paper:
https://huggingface.co/spaces/ehristoforu/server0001
https://huggingface.co/spaces/dimafatality/bond005-wav2vec2-large-ru-golos
https://huggingface.co/spaces/PatrickRedStar/video_image

==================================


#SpeechRecognition #AI #MachineLearning #NLP #WhisperAI
compar:IA: The French Government's LLM arena to collect French-language human prompts and preference data

📝 Summary:
Compar:IA is an open-source platform from the French government that collects large-scale French human preference data for LLM training. It addresses the scarcity of non-English data via a blind pairwise comparison interface and releases three datasets, aiming to be an international public good.
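
Arena-style leaderboards built from blind pairwise votes are often scored with Elo-style updates; a standard sketch (the paper's own aggregation may differ):

```python
def elo_update(r_a, r_b, a_wins, k=32):
    """One Elo update from a single pairwise preference vote."""
    expected_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))
    score_a = 1.0 if a_wins else 0.0
    r_a += k * (score_a - expected_a)
    r_b += k * ((1 - score_a) - (1 - expected_a))
    return r_a, r_b

ra, rb = 1000.0, 1000.0
for vote in [True, True, False]:  # model A preferred in two of three votes
    ra, rb = elo_update(ra, rb, vote)
print(round(ra), round(rb))
```

Elo conserves the rating sum, so the pair always totals 2000 here.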

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06669
• PDF: https://arxiv.org/pdf/2602.06669
• Project Page: https://comparia.beta.gouv.fr/
• Github: https://github.com/betagouv/ComparIA

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
AtlasPatch: An Efficient and Scalable Tool for Whole Slide Image Preprocessing in Computational Pathology

📝 Summary:
AtlasPatch is an efficient and scalable tool for whole-slide image preprocessing. It uses a fine-tuned Segment-Anything model for accurate tissue detection and high-throughput patch extraction, significantly reducing computational overhead and matching state-of-the-art performance.
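
The patch-extraction step can be sketched as a grid scan over a binary tissue mask (a toy array here; AtlasPatch obtains the mask from a fine-tuned Segment-Anything model):

```python
import numpy as np

def tissue_patches(mask, patch=4, min_frac=0.5):
    """Slide a non-overlapping grid over a binary tissue mask and keep
    the coordinates of patches whose tissue fraction exceeds min_frac."""
    coords = []
    h, w = mask.shape
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            if mask[y:y + patch, x:x + patch].mean() >= min_frac:
                coords.append((y, x))
    return coords

mask = np.zeros((8, 8), dtype=float)
mask[0:4, 0:4] = 1.0  # one tissue region in the top-left corner
print(tissue_patches(mask))  # → [(0, 0)]
```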

🔹 Publication Date: Published on Feb 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03998
• PDF: https://arxiv.org/pdf/2602.03998

🔹 Models citing this paper:
https://huggingface.co/AtlasAnalyticsLab/AtlasPatch

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
Uncovering Cross-Objective Interference in Multi-Objective Alignment

📝 Summary:
Multi-objective alignment in LLMs suffers from cross-objective interference, where improving performance on some objectives degrades others. The paper gives a covariance-based analysis and proposes a method to mitigate this interference.
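
A gradient-level diagnostic for such interference can be sketched as follows (a generic check, not necessarily the paper's exact statistic): a negative inner product between two objectives' gradients means a step improving one degrades the other.

```python
import numpy as np

def interference(grad_a, grad_b):
    """Inner product between two objectives' gradients; negative values
    signal cross-objective conflict."""
    return float(grad_a @ grad_b)

g_help = np.array([1.0, 0.5])    # hypothetical helpfulness gradient
g_safe = np.array([-1.0, 0.25])  # hypothetical safety gradient
print(interference(g_help, g_safe))  # negative → the objectives conflict
```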

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06869
• PDF: https://arxiv.org/pdf/2602.06869
• Github: https://github.com/yining610/ctwa

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
SE-Bench: Benchmarking Self-Evolution with Knowledge Internalization

📝 Summary:
SE-Bench presents a diagnostic environment that obscures NumPy's API to evaluate agents' ability to internally store and utilize novel knowledge without external documentation, revealing challenges in how current agents internalize knowledge.
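
The API-obscuring idea can be sketched by aliasing well-known functions with opaque names, so an agent cannot lean on memorized documentation (a toy analogue of SE-Bench's NumPy obfuscation):

```python
import random
import numpy as np

def obscure_api(module, names, seed=0):
    """Map known function names to opaque aliases and return both the
    name mapping and a callable lookup table keyed by alias."""
    rng = random.Random(seed)
    alias = {n: f"fn_{rng.randrange(10**6):06d}" for n in names}
    table = {alias[n]: getattr(module, n) for n in names}
    return alias, table

alias, table = obscure_api(np, ["mean", "median"])
opaque_mean = alias["mean"]
print(opaque_mean, table[opaque_mean]([1, 2, 3]))
```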

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04811
• PDF: https://arxiv.org/pdf/2602.04811

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
Large Language Model Reasoning Failures

📝 Summary:
This paper surveys reasoning failures in large language models, proposing a novel categorization. It classifies failures into embodied and non-embodied types, and further into fundamental, application-specific, and robustness issues. The work unifies research to guide future efforts toward stronger reasoning.

🔹 Publication Date: Published on Feb 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06176
• PDF: https://arxiv.org/pdf/2602.06176

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
SPARC: Separating Perception And Reasoning Circuits for Test-time Scaling of VLMs

📝 Summary:
SPARC decouples visual perception and reasoning in VLMs using a two-stage pipeline. This enables efficient test-time scaling with targeted compute allocation, significantly improving visual reasoning performance and reducing token budget compared to monolithic baselines.
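
The two-stage split can be sketched schematically (illustrative, not SPARC's implementation; the stub functions and budgets are placeholders): spend a small token budget once on perception, then allocate the remaining test-time compute to reasoning over the extracted text.

```python
def two_stage_pipeline(perceive_fn, reason_fn, question,
                       perception_budget=32, reasoning_budget=256):
    """Run perception once under a small budget, then reason over the
    resulting text description under a larger budget."""
    description = perceive_fn(perception_budget)
    answer = reason_fn(description, question, reasoning_budget)
    return description, answer

desc, ans = two_stage_pipeline(
    lambda b: f"a chart with 3 bars (<= {b} tokens)",
    lambda d, q, b: f"answer using '{d}' within {b} tokens",
    "which bar is tallest?",
)
print(ans)
```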

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06566
• PDF: https://arxiv.org/pdf/2602.06566

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models

📝 Summary:
Generative Reward Models suffer from deceptive alignment when prioritizing outcome accuracy. Introducing Rationale Consistency, a metric aligning reasoning with human judgment, and a hybrid training signal improves performance, avoids deceptive alignment, and boosts RLHF.
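
The hybrid training signal can be sketched as a blend of outcome accuracy and a rationale-consistency score (the weighting and the consistency metric itself are placeholders here, not the paper's definitions):

```python
def hybrid_reward(outcome_correct, rationale_consistency, alpha=0.5):
    """Blend a binary outcome signal with a [0, 1] rationale-consistency
    score; low consistency penalizes right answers reached for the
    wrong reasons."""
    return alpha * float(outcome_correct) + (1 - alpha) * rationale_consistency

# Right answer, but reasoning disagrees with human judgment:
deceptive = hybrid_reward(True, 0.1)
# Right answer with aligned reasoning:
faithful = hybrid_reward(True, 0.9)
print(deceptive, faithful)
```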

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04649
• PDF: https://arxiv.org/pdf/2602.04649
• Github: https://github.com/QwenLM/RationaleRM

Datasets citing this paper:
https://huggingface.co/datasets/Qwen/RationaleRM

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
Uncertainty Drives Social Bias Changes in Quantized Large Language Models

📝 Summary:
Post-training quantization of large language models causes significant changes in social biases that aggregate metrics fail to detect, with quantization-induced masked bias flipping occurring more frequently in high-uncertainty cases.
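
A toy version of the masking effect: per-example bias directions can flip under quantization while the aggregate mean barely moves, so aggregate metrics hide the change.

```python
import numpy as np

def masked_flips(bias_fp, bias_q):
    """Count per-example bias-direction flips between full-precision and
    quantized scores, alongside the aggregate mean shift that would be
    reported by a summary metric."""
    flips = int(np.sum(np.sign(bias_fp) != np.sign(bias_q)))
    agg_shift = abs(bias_fp.mean() - bias_q.mean())
    return flips, agg_shift

fp = np.array([0.2, -0.2, 0.1, -0.1])
q = np.array([-0.2, 0.2, -0.1, 0.1])  # every example's direction flipped
flips, shift = masked_flips(fp, q)
print(flips, shift)  # 4 flips, yet the aggregate shift is 0.0
```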

🔹 Publication Date: Published on Feb 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06181
• PDF: https://arxiv.org/pdf/2602.06181

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO

📝 Summary:
TP-GRPO enhances GRPO for flow matching by using step-level incremental rewards instead of outcome-based ones. It also identifies turning points in denoising trajectories to capture and aggregate long-term effects. This improves reward signal effectiveness and consistently enhances generation qua...
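
A schematic reading of the reward construction (not the paper's actual estimator): step-level rewards are score increments along the denoising trajectory, and "turning points" are steps where the increment changes sign.

```python
def incremental_rewards(scores):
    """Return per-step score increments and the indices where the
    increment flips sign (candidate turning points)."""
    incs = [b - a for a, b in zip(scores, scores[1:])]
    turns = [i for i in range(1, len(incs))
             if (incs[i] > 0) != (incs[i - 1] > 0)]
    return incs, turns

scores = [0.1, 0.4, 0.3, 0.6, 0.9]  # hypothetical per-step quality estimates
incs, turns = incremental_rewards(scores)
print(incs, turns)
```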

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06422
• PDF: https://arxiv.org/pdf/2602.06422
• Github: https://github.com/YunzeTong/TurningPoint-GRPO

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
NanoQuant: Efficient Sub-1-Bit Quantization of Large Language Models

📝 Summary:
NanoQuant enables efficient post-training quantization of large language models to binary and sub-1-bit levels using low-rank binary factorization and ADMM optimization, achieving state-of-the-art accuracy.
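
A crude rank-1 stand-in for low-rank binary factorization (NanoQuant uses ADMM and higher ranks): W ≈ s · sign(u) sign(v)ᵀ, which stores only one bit per row and column entry plus a scale, i.e. sub-1-bit per weight for large matrices.

```python
import numpy as np

def rank1_binary_factor(W):
    """Rank-1 binary factorization via the signs of the dominant
    singular vectors, with a least-squares scalar scale."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    bu = np.sign(U[:, 0])
    bv = np.sign(Vt[0])
    outer = np.outer(bu, bv)
    s = (W * outer).sum() / outer.size  # least-squares fit of the scale
    return s, bu, bv

W = np.outer([1.0, -1.0, 1.0], [1.0, 1.0, -1.0])  # exactly rank-1, ±1 entries
s, bu, bv = rank1_binary_factor(W)
approx = s * np.outer(bu, bv)
print(np.allclose(approx, W))  # exact on this rank-1 sign matrix
```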

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06694
• PDF: https://arxiv.org/pdf/2602.06694

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
RelayGen: Intra-Generation Model Switching for Efficient Reasoning

📝 Summary:
RelayGen is a training-free framework that dynamically switches between large and small models during reasoning by identifying difficulty transitions at the segment level, achieving faster inference with minimal quality loss.
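
The routing idea can be sketched with a per-segment difficulty score deciding which model handles each segment (the scoring function and threshold here are placeholders; RelayGen detects difficulty transitions from the generation itself):

```python
def route_segments(segments, difficulty_fn, threshold=0.5):
    """Assign each reasoning segment to the 'large' or 'small' model
    based on an estimated difficulty score."""
    return [("large" if difficulty_fn(seg) > threshold else "small", seg)
            for seg in segments]

# Toy difficulty estimate: longer segments are treated as harder.
plan = route_segments(
    ["so x=2", "now prove the bound holds for all n by induction"],
    lambda s: min(1.0, len(s) / 30),
)
print(plan)
```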

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06454
• PDF: https://arxiv.org/pdf/2602.06454
• Github: https://github.com/jiwonsong-dev/RelayGen

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research