ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Weight Decay Improves Language Model Plasticity

📝 Summary:
Pretraining with larger weight decay values improves model plasticity and downstream fine-tuning performance by encouraging linearly separable representations and reducing overfitting.
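A minimal sketch of the knob in question: decoupled weight decay in a PyTorch AdamW optimizer. The model and the weight_decay value below are illustrative placeholders, not the paper's setup.

```python
import torch
from torch import nn

# Tiny stand-in for a language model; the weight_decay value is illustrative only.
model = nn.Linear(512, 512)

# AdamW applies decoupled weight decay: each step, weights are shrunk toward zero
# in proportion to weight_decay, independently of the gradient-based update.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)

x = torch.randn(8, 512)
loss = model(x).pow(2).mean()
loss.backward()
optimizer.step()
```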

🔹 Publication Date: Published on Feb 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.11137
• PDF: https://arxiv.org/pdf/2602.11137

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
ROCKET: Rapid Optimization via Calibration-guided Knapsack Enhanced Truncation for Efficient Model Compression

📝 Summary:
ROCKET is a training-free model compression method that formulates layer-wise compression as a knapsack problem and uses single-step sparse matrix factorization. It achieves state-of-the-art performance, retaining over 90 percent of original performance at 30 percent compression without fine-tuning.
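To make the knapsack framing concrete, here is a toy sketch (not the authors' implementation): pick which layers to compress so that a global savings target is met at minimal calibration-estimated quality cost. All layer numbers are made up.

```python
def choose_layers_to_compress(savings, costs, target_saving):
    """Toy 0/1 knapsack over layers: minimize total quality cost subject to
    total parameter savings >= target_saving. Returns indices of layers to compress."""
    n = len(savings)
    best_cost, best_set = float("inf"), []
    for mask in range(1 << n):  # brute force is fine for a handful of layers
        chosen = [i for i in range(n) if mask & (1 << i)]
        if sum(savings[i] for i in chosen) >= target_saving:
            cost = sum(costs[i] for i in chosen)
            if cost < best_cost:
                best_cost, best_set = cost, chosen
    return best_set

# Per-layer parameter savings (millions) and calibration-estimated quality costs.
savings = [12, 8, 20, 5]
costs = [0.9, 0.2, 1.5, 0.1]
print(choose_layers_to_compress(savings, costs, target_saving=25))  # -> [0, 1, 3]
```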

🔹 Publication Date: Published on Feb 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.11008
• PDF: https://arxiv.org/pdf/2602.11008
• Github: https://github.com/mts-ai/ROCKET

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
EcoGym: Evaluating LLMs for Long-Horizon Plan-and-Execute in Interactive Economies

📝 Summary:
EcoGym introduces a new benchmark for evaluating LLM agents' long-horizon planning in interactive economic environments. It features three diverse scenarios with persistent dynamics and business-relevant metrics. Experiments reveal LLMs struggle with either high-level strategy or efficient action execution.

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09514
• PDF: https://arxiv.org/pdf/2602.09514
• Github: https://github.com/OPPO-PersonalAI/EcoGym

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLM #AIPlanning #EconomicSimulation #AI #Benchmark
Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning

📝 Summary:
Training reasoning language models benefits from data repetition. For a fixed update budget, more epochs on smaller datasets beat single-pass training on larger datasets. Token accuracy signals optimal training duration.
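The fixed-update-budget trade-off reduces to simple arithmetic: for the same number of optimizer steps, a smaller dataset is simply seen for more epochs. The numbers below are illustrative, not from the paper.

```python
# updates = (examples / batch_size) * epochs  =>  epochs = updates * batch_size / examples
update_budget = 10_000
batch_size = 32

for num_examples in (320_000, 80_000, 20_000):
    epochs = update_budget * batch_size / num_examples
    print(f"{num_examples:>7} examples -> {epochs:.0f} epochs at the same update budget")
```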

🔹 Publication Date: Published on Feb 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.11149
• PDF: https://arxiv.org/pdf/2602.11149
• Github: https://github.com/dkopi/data-repetition

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLM #FineTuning #DataStrategy #MachineLearning #AIResearch
Benchmarking Large Language Models for Knowledge Graph Validation

📝 Summary:
This paper introduces FactCheck, a benchmark to evaluate LLMs for knowledge graph fact validation. Experiments show LLMs are not yet stable or reliable, and RAG or multi-model consensus offer inconsistent improvements, highlighting the need for such a benchmark.

🔹 Publication Date: Published on Feb 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10748
• PDF: https://arxiv.org/pdf/2602.10748
• Github: https://github.com/FactCheck-AI

Datasets citing this paper:
https://huggingface.co/datasets/FactCheck-AI/FactCheck

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLMs #KnowledgeGraphs #FactChecking #AIResearch #Benchmarking
Bielik Guard: Efficient Polish Language Safety Classifiers for LLM Content Moderation

📝 Summary:
Bielik Guard is a compact Polish-language safety classifier family with two variants that effectively categorize content across five safety domains while maintaining high efficiency and accuracy.
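A minimal sketch of loading one of the checkpoints listed under Models below, assuming it exposes a standard Hugging Face text-classification head; check the model card for the actual usage and label set.

```python
from transformers import pipeline

# Assumption: the checkpoint works with the standard text-classification pipeline.
clf = pipeline("text-classification", model="speakleash/Bielik-Guard-0.1B-v1.0")
print(clf("Przykładowa wiadomość do sprawdzenia."))
```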

🔹 Publication Date: Published on Feb 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07954
• PDF: https://arxiv.org/pdf/2602.07954
• Project Page: https://guard.bielik.ai/

🔹 Models citing this paper:
https://huggingface.co/speakleash/Bielik-Guard-0.1B-v1.0
https://huggingface.co/speakleash/Bielik-Guard-0.1B-v1.1
https://huggingface.co/speakleash/Bielik-Guard-0.5B-v1.1

Spaces citing this paper:
https://huggingface.co/spaces/jglowa/bielik-czat

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
FedPS: Federated data Preprocessing via aggregated Statistics

📝 Summary:
FedPS is a federated data preprocessing framework for collaborative machine learning. It uses aggregated statistics and data-sketching for efficient privacy-preserving data preparation in FL, covering tasks like scaling and imputation.
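The core idea (sharing aggregated statistics instead of raw data) can be sketched in a few lines; this is a generic illustration of the pattern, not the FedPS API.

```python
import numpy as np

# Each client shares only (count, sum, sum of squares); the server combines them
# into global mean/std, and clients standardize their own data locally.
def client_stats(x):
    return len(x), x.sum(axis=0), (x ** 2).sum(axis=0)

def global_mean_std(stats):
    n = sum(s[0] for s in stats)
    total = sum(s[1] for s in stats)
    total_sq = sum(s[2] for s in stats)
    mean = total / n
    std = np.sqrt(total_sq / n - mean ** 2)
    return mean, std

clients = [np.random.randn(100, 3) * 2 + 5, np.random.randn(50, 3) * 0.5]
mean, std = global_mean_std([client_stats(x) for x in clients])
scaled = (clients[0] - mean) / std  # scaling happens on-device; only stats left the client
```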

🔹 Publication Date: Published on Feb 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10870
• PDF: https://arxiv.org/pdf/2602.10870
• Project Page: https://xuefeng-xu.github.io/fedps.html
• Github: https://github.com/xuefeng-xu/fedps

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#FederatedLearning #DataPreprocessing #MachineLearning #PrivacyPreservingAI #DataScience
Blockwise Advantage Estimation for Multi-Objective RL with Verifiable Rewards

📝 Summary:
Blockwise Advantage Estimation (BAE) solves reward interference in multi-objective RL for structured generations. It assigns distinct advantages to text blocks, using an Outcome-Conditioned Baseline to estimate them without nested rollouts. This mitigates interference and scales to new objectives.
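A toy sketch of the blockwise idea: each text block gets its own advantage, its reward minus a baseline conditioned on the rollout's outcome, rather than one sequence-level advantage shared by all tokens. The numbers and baseline table are placeholders, not the paper's estimator.

```python
def blockwise_advantages(block_rewards, outcome, baselines):
    """block_rewards: per-block rewards for one rollout.
    baselines: outcome -> per-block baseline values estimated from earlier rollouts
    that ended with the same outcome (the 'outcome-conditioned' part)."""
    return [r - b for r, b in zip(block_rewards, baselines[outcome])]

baselines = {"correct": [0.6, 0.4], "incorrect": [0.2, 0.1]}
print(blockwise_advantages([0.9, 0.3], "correct", baselines))  # roughly [0.3, -0.1]
```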

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10231
• PDF: https://arxiv.org/pdf/2602.10231

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#ReinforcementLearning #MultiObjectiveRL #NLP #MachineLearning #AIResearch
GoodVibe: Security-by-Vibe for LLM-Based Code Generation

📝 Summary:
GoodVibe secures LLM-generated code by precisely fine-tuning only a small subset of security-relevant neurons. This neuron-level framework greatly enhances code security and preserves utility with significantly fewer parameters and training costs than traditional methods.
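The general mechanism (fine-tune only a small set of units and freeze everything else) can be sketched as a gradient mask; the random selection below is a placeholder for the paper's security-relevance criterion.

```python
import torch
from torch import nn

layer = nn.Linear(256, 256)            # stand-in for one layer of a code LLM

for p in layer.parameters():           # freeze everything by default
    p.requires_grad_(False)

selected = torch.randperm(256)[:8]     # placeholder for security-relevant neurons
mask = torch.zeros_like(layer.weight)
mask[selected] = 1.0

layer.weight.requires_grad_(True)
layer.weight.register_hook(lambda g: g * mask)  # zero gradients outside the selection
```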

🔹 Publication Date: Published on Feb 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10778
• PDF: https://arxiv.org/pdf/2602.10778

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLM #CodeGeneration #Cybersecurity #AIsecurity #MachineLearning
Large Language Lobotomy: Jailbreaking Mixture-of-Experts via Expert Silencing

📝 Summary:
This paper introduces Large Language Lobotomy (L3), an attack on Mixture-of-Experts (MoE) LLMs. L3 exploits routing dynamics to identify and silence safety-critical experts, achieving high jailbreaking success while retaining language utility. This highlights a fundamental tension in MoE design.

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08741
• PDF: https://arxiv.org/pdf/2602.08741
• Github: https://github.com/jonatelintelo/LargeLanguageLobotomy

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLM #MixtureOfExperts #Jailbreaking #AISafety #AIResearch
Rethinking the Value of Agent-Generated Tests for LLM-Based Software Engineering Agents

📝 Summary:
This study finds that agent-generated tests for LLM software engineering agents may have limited value. Test-writing frequency doesn't correlate with issue resolution, and agents prefer informal print statements. Varying test volume showed little impact, suggesting marginal utility in current practice.

🔹 Publication Date: Published on Feb 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07900
• PDF: https://arxiv.org/pdf/2602.07900

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLMAgents #SoftwareEngineering #AutomatedTesting #AIResearch #GenerativeAI
How Do Decoder-Only LLMs Perceive Users? Rethinking Attention Masking for User Representation Learning

📝 Summary:
This paper studies attention masking in decoder-only LLMs for user representation learning. It proposes Gradient-Guided Soft Masking (GGSM) to stabilize training when transitioning to bidirectional attention. GGSM yields higher-quality user representations on industrial benchmarks.
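As background for the masking question, here is a generic "soft" causal mask that interpolates toward bidirectional attention with a scalar gate; the gradient-guided schedule for that gate is the paper's contribution and is not reproduced here.

```python
import torch

def soft_attention_mask(seq_len, alpha):
    """alpha=0 gives a standard causal mask, alpha=1 full bidirectional attention.
    Returned as an additive bias: 0 where allowed, very negative where blocked."""
    causal = torch.tril(torch.ones(seq_len, seq_len))
    soft = causal + alpha * (1.0 - causal)   # future positions get weight alpha
    return torch.log(soft.clamp_min(1e-9))   # add to attention logits before softmax

print(soft_attention_mask(4, alpha=0.3))
```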

🔹 Publication Date: Published on Feb 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10622
• PDF: https://arxiv.org/pdf/2602.10622
• Github: https://github.com/JhCircle/Deepfind-GGSM

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLMs #UserRepresentation #AttentionMasking #NLP #AI
VidVec: Unlocking Video MLLM Embeddings for Video-Text Retrieval

📝 Summary:
VidVec uses intermediate MLLM layers for zero-shot video-text retrieval. A novel text-based alignment, mapping video captions to summaries, learns embeddings without visual supervision. It achieves state-of-the-art results on video retrieval benchmarks.
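A rough sketch of the retrieval side: take an intermediate layer's hidden states as the embedding and rank candidates by cosine similarity. The layer index, pooling, and fake tensors are assumptions for illustration only.

```python
import torch

def embed_from_layer(hidden_states, layer_idx=-8):
    """Mean-pool one intermediate layer into a single vector.
    hidden_states: tuple of [batch, seq, dim] tensors, e.g. output_hidden_states
    from a Hugging Face model; the layer choice here is arbitrary."""
    return hidden_states[layer_idx].mean(dim=1)

def rank_videos(text_emb, video_embs):
    sims = torch.nn.functional.cosine_similarity(text_emb, video_embs)
    return sims.argsort(descending=True)

fake_states = tuple(torch.randn(1, 16, 64) for _ in range(13))  # stand-in for MLLM outputs
query = embed_from_layer(fake_states)
videos = torch.randn(5, 64)                                     # precomputed video embeddings
print(rank_videos(query, videos))
```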

🔹 Publication Date: Published on Feb 8

🔹 Paper Links:
• arXiv Page: https://www.arxiv.org/abs/2602.08099
• PDF: https://arxiv.org/pdf/2602.08099
• Project Page: https://iyttor.github.io/VidVec

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#VideoTextRetrieval #MLLM #Embeddings #ZeroShotLearning #AI
Graph-Enhanced Deep Reinforcement Learning for Multi-Objective Unrelated Parallel Machine Scheduling

📝 Summary:
A PPO-GNN Deep Reinforcement Learning framework solves multi-objective parallel machine scheduling. It balances total weighted tardiness and total setup time, outperforming traditional methods with a superior trade-off.
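The multi-objective part typically comes down to scalarizing the objectives into a single reward for PPO; a toy version is below, with placeholder weights rather than the paper's formulation.

```python
def scheduling_reward(total_weighted_tardiness, total_setup_time, w_tard=0.7, w_setup=0.3):
    # Negative because PPO maximizes reward while both objectives should be minimized.
    return -(w_tard * total_weighted_tardiness + w_setup * total_setup_time)

print(scheduling_reward(total_weighted_tardiness=120.0, total_setup_time=45.0))
```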

🔹 Publication Date: Published on Feb 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08052
• PDF: https://arxiv.org/pdf/2602.08052
• Project Page: https://bulentsoykan.github.io/GNN-DRL4UPMSP/
• Github: https://github.com/bulentsoykan/GNN-DRL4UPMSP

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
ECHO-2: A Large-Scale Distributed Rollout Framework for Cost-Efficient Reinforcement Learning

📝 Summary:
ECHO-2 is a distributed reinforcement learning framework that enables efficient post-training of large language models by overlapping rollout generation, dissemination, and training while managing pol...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02192
• PDF: https://arxiv.org/pdf/2602.02192

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL

📝 Summary:
Reasoning Cache (RC), an iterative decoding algorithm, enables large language models to extrapolate and continuously improve beyond training budgets by constructing reasoning chains that enhance across iterations, achi...
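The iterative-decoding loop can be pictured as re-feeding the previous reasoning so the next pass can improve on it; `generate` below is a placeholder for any LLM call, and this is only a generic refinement loop, not the paper's Reasoning Cache procedure.

```python
def iterative_decode(question, generate, iterations=3):
    """Generic refinement loop: each pass sees the previous reasoning and revises it.
    `generate` is any text-in/text-out LLM call (placeholder)."""
    reasoning = ""
    for _ in range(iterations):
        prompt = f"{question}\n\nPrevious reasoning:\n{reasoning}\n\nRevise and improve it:"
        reasoning = generate(prompt)
    return reasoning

# Example with a dummy generator standing in for a real model call.
print(iterative_decode("What is 17 * 24?", lambda p: f"(revised draft, {len(p)} chars of context seen)"))
```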

🔹 Publication Date: Published on Feb 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03773
• PDF: https://arxiv.org/pdf/2602.03773

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
StealthRL: Reinforcement Learning Paraphrase Attacks for Multi-Detector Evasion of AI-Text Detectors

📝 Summary:
StealthRL is a reinforcement learning framework that creates adversarial paraphrases to evade multiple AI-text detectors while preserving meaning. It achieves near-zero detection, exposing significant robustness gaps and shared architectural vulnerabilities in current AI-text detection systems.

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08934
• PDF: https://arxiv.org/pdf/2602.08934
• Github: https://github.com/suraj-ranganath/StealthRL

🔹 Models citing this paper:
https://huggingface.co/suraj-ranganath/StealthRL

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
GameDevBench: Evaluating Agentic Capabilities Through Game Development

📝 Summary:
GameDevBench is introduced as the first benchmark for evaluating agents on game development tasks that combine software development complexity with deep multimodal understanding requirements.

🔹 Publication Date: Published on Feb 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.11103
• PDF: https://arxiv.org/pdf/2602.11103

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Latent Thoughts Tuning: Bridging Context and Reasoning with Fused Information in Latent Tokens

📝 Summary:
Latent Thoughts Tuning introduces a novel framework for robust reasoning in continuous latent space. It addresses feature collapse by fusing contextual hidden states with predictive semantic guidance. This method outperforms baselines and achieves improved reasoning accuracy.
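The "fusing contextual hidden states with guidance" step can be pictured as a generic gated fusion; this is only the common pattern, not the paper's architecture.

```python
import torch
from torch import nn

class GatedFusion(nn.Module):
    """Generic gated fusion of a contextual hidden state with a guidance vector."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, context_h, guidance_h):
        g = torch.sigmoid(self.gate(torch.cat([context_h, guidance_h], dim=-1)))
        return g * context_h + (1 - g) * guidance_h

fused = GatedFusion(64)(torch.randn(2, 64), torch.randn(2, 64))
```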

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10229
• PDF: https://arxiv.org/pdf/2602.10229
• Github: https://github.com/NeosKnight233/Latent-Thoughts-Tuning

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
LiveMedBench: A Contamination-Free Medical Benchmark for LLMs with Automated Rubric Evaluation

📝 Summary:
LiveMedBench addresses limitations in medical LLM evaluation by providing a continuously updated, contamination-free benchmark with rubric-based evaluation that better aligns with expert clinical reasoning.

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10367
• PDF: https://arxiv.org/pdf/2602.10367
• Project Page: https://zhilingyan.github.io/LiveMedBench/
• Github: https://github.com/ZhilingYan/LiveMedBench

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
TIC-VLA: A Think-in-Control Vision-Language-Action Model for Robot Navigation in Dynamic Environments

📝 Summary:
TIC-VLA is a latency-aware framework enhancing robot navigation by explicitly modeling delayed semantic reasoning. It uses a delayed semantic-control interface and latency-consistent training. This allows robots to maintain real-time control despite significant reasoning delays.
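The latency-aware pattern in general terms: a fast control loop always acts on the most recent plan available, while a slow reasoner updates that plan asynchronously. This is a generic sketch, not the TIC-VLA interface.

```python
import threading, time

latest_plan = {"waypoint": (0.0, 0.0)}  # shared state, written by the slow reasoner

def slow_reasoner():
    # Stands in for VLM reasoning that is much slower than one control step.
    for step in range(3):
        time.sleep(0.5)                       # simulated reasoning latency
        latest_plan["waypoint"] = (float(step + 1), float(step + 1))

def control_loop(hz=20, duration=1.6):
    # Real-time controller: acts every tick on the latest (possibly stale) plan.
    t_end = time.time() + duration
    while time.time() < t_end:
        wp = latest_plan["waypoint"]
        # ... issue a low-level action toward wp here ...
        time.sleep(1.0 / hz)

threading.Thread(target=slow_reasoner, daemon=True).start()
control_loop()
```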

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02459
• PDF: https://arxiv.org/pdf/2602.02459
• Project Page: https://ucla-mobility.github.io/TIC-VLA/
• Github: https://github.com/ucla-mobility/TIC-VLA

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research