ML Research Hub
32.8K subscribers
4.14K photos
249 videos
23 files
4.47K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Download Telegram
K-EXAONE Technical Report

📝 Summary:
K-EXAONE is a multilingual language model with a Mixture-of-Experts architecture that achieves competitive performance on various benchmarks while supporting multiple languages and long-context window...

🔹 Publication Date: Published on Jan 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.01739
• PDF: https://arxiv.org/pdf/2601.01739
• Github: https://github.com/LG-AI-EXAONE/K-EXAONE

🔹 Models citing this paper:
https://huggingface.co/LGAI-EXAONE/K-EXAONE-236B-A23B

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Falcon-H1R: Pushing the Reasoning Frontiers with a Hybrid Model for Efficient Test-Time Scaling

📝 Summary:
Falcon-H1R is a 7B-parameter language model that achieves competitive reasoning performance through efficient training strategies and architectural design, enabling scalable reasoning capabilities in ...

🔹 Publication Date: Published on Jan 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02346
• PDF: https://arxiv.org/pdf/2601.02346

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment

📝 Summary:
OpenNovelty is an LLM-powered agentic system for verifiable scholarly novelty assessment in peer review. It retrieves and analyzes prior work via semantic search and taxonomy construction, generating evidence-backed reports grounded in real papers. This tool aims to promote fair, consistent, and ...

🔹 Publication Date: Published on Jan 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.01576
• PDF: https://arxiv.org/pdf/2601.01576
• Project Page: https://www.opennovelty.org/
• Github: https://github.com/january-blue/OpenNovelty

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs

📝 Summary:
COMPASS evaluates large language models' compliance with organizational policies, revealing significant gaps in enforcing prohibitions despite strong performance on legitimate requests. AI-generated s...

🔹 Publication Date: Published on Jan 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.01836
• PDF: https://arxiv.org/pdf/2601.01836
• Github: https://github.com/AIM-Intelligence/COMPASS

🔹 Models citing this paper:
https://huggingface.co/AIM-Intelligence/COMPASS_Qwen2.5-7B-Instruct_LoRA
https://huggingface.co/AIM-Intelligence/COMPASS_gemma-3-4b-it_LoRA

Datasets citing this paper:
https://huggingface.co/datasets/AIM-Intelligence/COMPASS-Policy-Alignment-Testbed-Dataset
https://huggingface.co/datasets/AIM-Intelligence/COMPASS-Policy-aware-SFT-Dataset

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Project Ariadne: A Structural Causal Framework for Auditing Faithfulness in LLM Agents

📝 Summary:
Project Ariadne uses structural causal models and counterfactual logic to evaluate the causal integrity of LLM reasoning, revealing a faithfulness gap where reasoning traces are not reliable drivers o...

🔹 Publication Date: Published on Jan 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02314
• PDF: https://arxiv.org/pdf/2601.02314
• Github: https://github.com/skhanzad/AridadneXAI

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
GARDO: Reinforcing Diffusion Models without Reward Hacking

📝 Summary:
Online reinforcement learning for diffusion model fine-tuning suffers from reward hacking due to proxy reward mismatches, which GARDO addresses through selective regularization, adaptive reference upd...

🔹 Publication Date: Published on Dec 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24138
• PDF: https://arxiv.org/pdf/2512.24138
• Project Page: https://tinnerhrhe.github.io/gardo_project/
• Github: https://github.com/tinnerhrhe/gardo

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
1
IMA++: ISIC Archive Multi-Annotator Dermoscopic Skin Lesion Segmentation Dataset

📝 Summary:
A large-scale public multi-annotator skin lesion segmentation dataset is introduced with extensive metadata for annotator analysis and consensus modeling. AI-generated summary Multi-annotator medical ...

🔹 Publication Date: Published on Dec 25, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21472
• PDF: https://arxiv.org/pdf/2512.21472
• Github: https://github.com/sfu-mial/IMAplusplus

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
1
Toward Stable Semi-Supervised Remote Sensing Segmentation via Co-Guidance and Co-Fusion

📝 Summary:
A semi-supervised remote sensing image segmentation framework combines vision-language and self-supervised models to reduce pseudo-label drift through dual-student architecture and semantic co-guidanc...

🔹 Publication Date: Published on Dec 28, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23035
• PDF: https://arxiv.org/pdf/2512.23035
• Project Page: https://xavierjiezou.github.io/Co2S/
• Github: https://github.com/XavierJiezou/Co2S

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
1
SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving

📝 Summary:
SWE-Lego achieves state-of-the-art software issue resolution through a lightweight supervised fine-tuning approach. It uses a high-quality dataset and refined training procedures like error masking and a difficulty-based curriculum, outperforming complex methods. Performance is further boosted by...

🔹 Publication Date: Published on Jan 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.01426
• PDF: https://arxiv.org/pdf/2601.01426
• Project Page: https://github.com/SWE-Lego/SWE-Lego
• Github: https://github.com/SWE-Lego/SWE-Lego

🔹 Models citing this paper:
https://huggingface.co/SWE-Lego/SWE-Lego-Qwen3-8B
https://huggingface.co/SWE-Lego/SWE-Lego-Qwen3-32B

Datasets citing this paper:
https://huggingface.co/datasets/SWE-Lego/SWE-Lego-Real-Data
https://huggingface.co/datasets/SWE-Lego/SWE-Lego-Synthetic-Data

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#SoftwareEngineering #MachineLearning #LLM #FineTuning #AIforCode
M-ErasureBench: A Comprehensive Multimodal Evaluation Benchmark for Concept Erasure in Diffusion Models

📝 Summary:
Existing concept erasure methods in diffusion models are vulnerable to non-text inputs. M-ErasureBench is a new multimodal evaluation framework, and IRECE is a module to restore robustness against these attacks, reducing concept reproduction.

🔹 Publication Date: Published on Dec 28, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22877
• PDF: https://arxiv.org/pdf/2512.22877

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#DiffusionModels #ConceptErasure #MultimodalAI #AISafety #MachineLearning