ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho
Do LLMs Feel? Teaching Emotion Recognition with Prompts, Retrieval, and Curriculum Learning

📝 Summary:
PRC-Emo is a new framework that significantly improves LLMs' emotion recognition in conversations. It combines prompt engineering, demonstration retrieval, and curriculum learning, achieving state-of-the-art results on benchmark datasets.

🔹 Publication Date: Published on Nov 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07061
• PDF: https://arxiv.org/pdf/2511.07061
• Github: https://github.com/LiXinran6/PRC-Emo

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLM #EmotionRecognition #NLP #AIResearch #MachineLearning
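
💻 Illustrative sketch:
A toy Python rendering of the recipe summarized above: retrieve the most similar demonstrations, then order them easy-to-hard before prompting. The bag-of-words similarity, the hand-set difficulty scores, and the `build_prompt` helper are illustrative assumptions, not code from the PRC-Emo repo.

```python
# Toy sketch: demonstration retrieval + curriculum ordering for
# emotion-recognition prompting. Similarity and difficulty are stand-ins
# for the paper's retriever and curriculum signal.
from collections import Counter
import math

DEMOS = [  # (utterance, label, assumed difficulty in [0, 1])
    ("I can't believe we won!", "joy", 0.1),
    ("Leave me alone.", "anger", 0.4),
    ("I guess it's fine, whatever.", "frustration", 0.8),
]

def cosine(a, b):
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_prompt(query, k=2):
    # Retrieve the k most similar demonstrations...
    ranked = sorted(DEMOS, key=lambda d: cosine(query, d[0]), reverse=True)[:k]
    # ...then present them easy-to-hard (the curriculum step).
    ranked.sort(key=lambda d: d[2])
    lines = [f'Utterance: "{u}"\nEmotion: {lab}' for u, lab, _ in ranked]
    lines.append(f'Utterance: "{query}"\nEmotion:')
    return "\n\n".join(lines)

print(build_prompt("We actually won the grant!"))
```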
10 Open Challenges Steering the Future of Vision-Language-Action Models

📝 Summary:
This paper identifies 10 principal challenges for vision-language-action (VLA) models, including multimodality, reasoning, and safety. It also surveys emerging trends such as spatial understanding and data synthesis, with the goal of accelerating VLA model development and broader adoption.

🔹 Publication Date: Published on Nov 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.05936
• PDF: https://arxiv.org/pdf/2511.05936

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#VLA #AI #MachineLearning #ComputerVision #NLP
🤖🧠 The Transformer Architecture: How Attention Revolutionized Deep Learning

🗓️ 11 Nov 2025
📚 AI News & Trends

The field of artificial intelligence has witnessed a remarkable evolution, and at the heart of this transformation lies the Transformer architecture. Introduced by Vaswani et al. in the 2017 paper “Attention Is All You Need”, it redefined the foundations of natural language processing (NLP) and sequence modeling. Unlike its predecessors, recurrent and convolutional neural networks, ...

#TransformerArchitecture #AttentionMechanism #DeepLearning #NaturalLanguageProcessing #NLP #AIResearch
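
💻 Illustrative sketch:
The operation at the heart of the architecture fits in a few lines. A minimal NumPy version of single-head scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, with toy dimensions:

```python
# Single-head scaled dot-product attention from "Attention Is All You Need".
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # pairwise query-key similarities
    weights = softmax(scores)        # each row is a distribution over keys
    return weights @ V               # weighted mixture of value vectors

rng = np.random.default_rng(0)
seq_len, d_k = 4, 8
Q, K, V = (rng.normal(size=(seq_len, d_k)) for _ in range(3))
print(attention(Q, K, V).shape)  # (4, 8)
```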
Adaptive Multi-Agent Response Refinement in Conversational Systems

📝 Summary:
This paper presents a multi-agent framework for refining conversational responses along three axes: factuality, personalization, and coherence. It employs dynamic agent coordination, outperforming single-LLM approaches on challenging conversational datasets.

🔹 Publication Date: Published on Nov 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.08319
• PDF: https://arxiv.org/pdf/2511.08319

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#MultiAgentSystems #ConversationalAI #LLMs #NLP #AIResearch
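
💻 Illustrative sketch:
One way to picture the dynamic coordination described above: a coordinator inspects the draft and context, picks which specialist refiners to invoke, and applies them in sequence. The three "agents" below are stub functions standing in for specialized LLM calls; none of the names come from the paper.

```python
# Toy coordinator for multi-agent response refinement. Each agent is a
# stub; in the paper they are LLM calls specialized per quality axis.
def factuality_agent(resp, ctx):        # stub: would verify claims
    return resp.replace("always", "often")

def personalization_agent(resp, ctx):   # stub: would use the user profile
    return f"{ctx['user']}, {resp[0].lower()}{resp[1:]}"

def coherence_agent(resp, ctx):         # stub: would smooth discourse
    return resp.strip()

def select_agents(resp, ctx):
    """Dynamic coordination: choose refiners based on the draft and context."""
    agents = []
    if "always" in resp:                # overclaiming -> check factuality
        agents.append(factuality_agent)
    if ctx.get("user"):                 # known user -> personalize
        agents.append(personalization_agent)
    agents.append(coherence_agent)      # always smooth at the end
    return agents

def refine(resp, ctx):
    for agent in select_agents(resp, ctx):
        resp = agent(resp, ctx)
    return resp

print(refine("Cats always land on their feet. ", {"user": "Sam"}))
```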
BiCA: Effective Biomedical Dense Retrieval with Citation-Aware Hard Negatives

📝 Summary:
BiCA improves biomedical dense retrieval by using citation links as hard negatives. This leverages document structure to enhance performance with minimal fine-tuning, enabling data-efficient domain adaptation.

🔹 Publication Date: Published on Nov 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.08029
• PDF: https://arxiv.org/pdf/2511.08029
• Github: https://github.com/NiravBhattLab/BiCA

🔹 Models citing this paper:
https://huggingface.co/bisectgroup/BiCA-small
https://huggingface.co/bisectgroup/BiCA-base

Datasets citing this paper:
https://huggingface.co/datasets/bisectgroup/2hop-citation-graphs
https://huggingface.co/datasets/bisectgroup/hard-negatives-traversal

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#BiomedicalAI #DenseRetrieval #NLP #MachineLearning #InformationRetrieval
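
💻 Illustrative sketch:
The citation-as-hard-negative idea in one loss computation: documents linked to the positive by citations are topically close yet non-relevant, so they make harder negatives than random documents. The vectors and single-positive InfoNCE setup below are toy assumptions; BiCA's actual training recipe differs.

```python
# InfoNCE with a citation-linked hard negative vs. a random negative.
import numpy as np

def info_nce(q, pos, negs, tau=0.05):
    """Loss for one query; the positive document sits at index 0."""
    q = q / np.linalg.norm(q)
    docs = np.vstack([pos] + list(negs))
    docs = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    logits = docs @ q / tau            # temperature-scaled cosine scores
    logits -= logits.max()             # numerical stability
    p = np.exp(logits) / np.exp(logits).sum()
    return -np.log(p[0])

rng = np.random.default_rng(1)
dim = 16
q = rng.normal(size=dim)
pos = q + 0.1 * rng.normal(size=dim)      # relevant document
cited = pos + 0.3 * rng.normal(size=dim)  # citation neighbor: hard negative
rand = rng.normal(size=dim)               # random negative: much easier
print(float(info_nce(q, pos, [cited, rand])))
```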
Beyond English: Toward Inclusive and Scalable Multilingual Machine Translation with LLMs

📝 Summary:
LMT introduces new multilingual translation models covering 60 languages, centered on Chinese and English. It uses Strategic Downsampling and Parallel Multilingual Prompting to improve translation quality and cross-lingual transfer, achieving state-of-the-art performance.

🔹 Publication Date: Published on Nov 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07003
• PDF: https://arxiv.org/pdf/2511.07003
• Project Page: https://github.com/NiuTrans/LMT
• Github: https://github.com/NiuTrans/LMT

🔹 Models citing this paper:
https://huggingface.co/NiuTrans/LMT-60-1.7B
https://huggingface.co/NiuTrans/LMT-60-0.6B-Base
https://huggingface.co/NiuTrans/LMT-60-0.6B

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#MultilingualTranslation #LLMs #MachineTranslation #NLP #AI
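
💻 Illustrative sketch:
A hedged guess at what Parallel Multilingual Prompting could look like in practice: the translation prompt carries the source sentence alongside parallel renderings in auxiliary languages to encourage cross-lingual transfer. The template below is illustrative only, not the paper's exact format.

```python
# Build a translation prompt augmented with parallel auxiliary renderings.
def pmp_prompt(src, src_lang, tgt_lang, parallels):
    lines = [f"Translate from {src_lang} to {tgt_lang}."]
    for lang, text in parallels.items():   # auxiliary parallel sentences
        lines.append(f"{lang}: {text}")
    lines.append(f"{src_lang}: {src}")
    lines.append(f"{tgt_lang}:")
    return "\n".join(lines)

print(pmp_prompt(
    "The weather is nice today.", "English", "Chinese",
    {"German": "Das Wetter ist heute schön.",
     "French": "Il fait beau aujourd'hui."},
))
```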
Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation

📝 Summary:
Ming-UniAudio introduces a unified speech LLM and tokenizer for joint understanding, generation, and instruction-based free-form editing. It overcomes token representation issues, achieves state-of-the-art results, and establishes a new benchmark for editing.

🔹 Publication Date: Published on Oct 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.05516
• PDF: https://arxiv.org/pdf/2511.05516
• Project Page: https://xqacmer.github.io/Ming-Unitok-Audio.github.io/
• Github: https://github.com/inclusionAI/Ming-UniAudio

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#SpeechLLM #AI #NLP #GenerativeAI #MachineLearning
Optimizing Diversity and Quality through Base-Aligned Model Collaboration

📝 Summary:
BACo is a token-level collaboration framework for LLMs. It dynamically combines a base model with its aligned counterpart to improve both output diversity and quality during inference. BACo consistently outperforms baselines, achieving significant joint improvement.

🔹 Publication Date: Published on Nov 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.05650
• PDF: https://arxiv.org/pdf/2511.05650

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLMs #AI #MachineLearning #NLP #ModelCollaboration
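
💻 Illustrative sketch:
Token-level collaboration is easy to picture as mixing two next-token distributions at every decoding step: the base model contributes diversity, the aligned model quality. The fixed mixing weight below is an assumption; BACo routes dynamically per token.

```python
# Mix base and aligned next-token distributions, then sample.
import numpy as np

def mix_step(p_base, p_aligned, alpha=0.3):
    """Convex combination of two distributions, renormalized."""
    p = alpha * p_base + (1 - alpha) * p_aligned
    return p / p.sum()

rng = np.random.default_rng(0)
vocab = 10
p_base = rng.dirichlet(np.ones(vocab) * 5.0)     # flatter: more diverse
p_aligned = rng.dirichlet(np.ones(vocab) * 0.3)  # peakier: confident/aligned

p = mix_step(p_base, p_aligned)
next_token = rng.choice(vocab, p=p)  # sample the next token from the mixture
print(next_token, p.round(3))
```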
Stemming Hallucination in Language Models Using a Licensing Oracle

📝 Summary:
This study presents the Licensing Oracle, an architectural solution to eliminate language model hallucinations. It enforces truth constraints via formal validation against structured knowledge graphs, achieving perfect abstention precision and zero false answers where statistical methods fail.

🔹 Publication Date: Published on Nov 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.06073
• PDF: https://arxiv.org/pdf/2511.06073

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLM #AIHallucination #KnowledgeGraphs #NLP #AIResearch
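
💻 Illustrative sketch:
The abstention pattern can be shown with a toy triple store: a claim is only "licensed" (and thus answered) if it validates against structured knowledge; otherwise the system abstains rather than guess. The `KB` set and matcher are stand-ins for the paper's knowledge-graph validation.

```python
# Answer only claims that validate against the knowledge base; else abstain.
KB = {
    ("Paris", "capital_of", "France"),
    ("Canberra", "capital_of", "Australia"),
}

def licensed(subject, relation, obj):
    return (subject, relation, obj) in KB

def answer(subject, relation, candidate):
    if licensed(subject, relation, candidate):
        return candidate
    return "I don't know."  # abstain instead of hallucinating

print(answer("Paris", "capital_of", "France"))      # licensed -> answered
print(answer("Sydney", "capital_of", "Australia"))  # unlicensed -> abstains
```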
Efficient Guided Generation for Large Language Models

📝 Summary:
This paper introduces an efficient method to guide large language model text generation. It uses regular expressions and context-free grammars with minimal added overhead, making guided generation practical.

🔹 Publication Date: Published on Jul 19, 2023

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2307.09702
• PDF: https://arxiv.org/pdf/2307.09702
• Github: https://github.com/normal-computing/outlines

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLMs #TextGeneration #NLP #AI #DeepLearning
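
💻 Illustrative sketch:
The core trick is to constrain decoding so every emitted token keeps the text inside the regex's language. Real implementations (this paper's outlines library among them) precompute a token-level finite-state machine for near-zero overhead; the character-level toy below only illustrates the masking step.

```python
# Mask vocabulary items that cannot extend a valid match of the pattern.
import re

VOCAB = ["0", "1", "2", "9", ".", "a"]  # toy "tokenizer" vocabulary
PATTERN = re.compile(r"\d+\.\d+")       # target language: decimals like 3.14

def allowed(prefix):
    """Tokens that keep `prefix` extendable to a full match.
    The tail trick below is a hack that happens to cover this pattern;
    the real method tracks FSM states instead."""
    ok = []
    for tok in VOCAB:
        cand = prefix + tok
        if any(PATTERN.fullmatch(cand + tail) for tail in ("", "0", ".0")):
            ok.append(tok)
    return ok

print(allowed(""))    # ['0', '1', '2', '9'] -> digits only
print(allowed("3"))   # digits or '.'
print(allowed("3."))  # digits only
```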
Beyond Outlining: Heterogeneous Recursive Planning for Adaptive Long-form Writing with Language Models

📝 Summary:
This paper proposes an AI agent framework for adaptive long-form writing. It uses recursive task decomposition and dynamically integrates retrieval, reasoning, and composition, overcoming rigid outline-based methods. The framework consistently outperforms state-of-the-art approaches.

🔹 Publication Date: Published on Mar 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2503.08275
• PDF: https://arxiv.org/pdf/2503.08275
• Github: https://github.com/principia-ai/WriteHERE

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #LanguageModels #LongformWriting #NLP #GenerativeAI
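
💻 Illustrative sketch:
The recursive decomposition can be pictured as a planner that either executes a task directly or splits it into typed subtasks (retrieval, reasoning, composition) and recurses. Task types and the splitting rule below are toy stand-ins for the paper's planner.

```python
# Recursively decompose a writing task into typed subtasks.
def decompose(task):
    """Toy rule: composition tasks split until a depth budget is hit."""
    if task["type"] == "compose" and task["depth"] < 2:
        d = task["depth"] + 1
        return [
            {"type": "retrieve", "goal": f"facts for: {task['goal']}", "depth": d},
            {"type": "reason", "goal": f"outline: {task['goal']}", "depth": d},
            {"type": "compose", "goal": f"draft: {task['goal']}", "depth": d},
        ]
    return []  # primitive task: execute directly

def run(task, indent=0):
    print("  " * indent + f"[{task['type']}] {task['goal']}")
    for sub in decompose(task):
        run(sub, indent + 1)  # recurse into subtasks

run({"type": "compose", "goal": "survey of VLA models", "depth": 0})
```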
DiscoX: Benchmarking Discourse-Level Translation task in Expert Domains

📝 Summary:
A new benchmark, DiscoX, and evaluation system, Metric-S, are introduced for discourse-level, expert Chinese-English translation. Findings show advanced LLMs still fall short of human performance, underscoring challenges in professional machine translation.

🔹 Publication Date: Published on Nov 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.10984
• PDF: https://arxiv.org/pdf/2511.10984

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#MachineTranslation #NLP #LLM #Benchmarking #AI
Qwen3 Technical Report

📝 Summary:
Qwen3 is a new series of large language models integrating thinking and non-thinking modes for unified performance and efficiency. It achieves state-of-the-art results across diverse tasks and expands multilingual support to 119 languages.

🔹 Publication Date: Published on May 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2505.09388
• PDF: https://arxiv.org/pdf/2505.09388
• Project Page: https://qwenlm.github.io/blog/qwen3/
• Github: https://github.com/QwenLM/Qwen3

🔹 Models citing this paper:
https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct
https://huggingface.co/Qwen/Qwen3-235B-A22B
https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Instruct

Spaces citing this paper:
https://huggingface.co/spaces/modelscope/DocResearch
https://huggingface.co/spaces/enzostvs/deepsite
https://huggingface.co/spaces/multimodalart/Eigen-Banana

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLM #AI #MultilingualAI #NLP #Qwen3
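
💻 Illustrative sketch:
Per the Qwen3 model card, the thinking/non-thinking switch is exposed through the chat template's `enable_thinking` flag. A minimal sketch (the interface may evolve, so verify against the current docs):

```python
# Render the same conversation in thinking and non-thinking modes.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
messages = [{"role": "user", "content": "Why is the sky blue?"}]

# Thinking mode: the template makes room for a <think>...</think> block.
prompt_think = tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)
# Non-thinking mode: direct answers at lower latency.
prompt_direct = tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
print(prompt_think != prompt_direct)  # the rendered prompts differ
```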
Instella: Fully Open Language Models with Stellar Performance

📝 Summary:
Instella is a family of fully open language models trained on open data. It achieves state-of-the-art performance among fully open models and competes with leading open-weight LLMs. Specialized variants for long-context and math reasoning are also offered.

🔹 Publication Date: Published on Nov 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.10628
• PDF: https://arxiv.org/pdf/2511.10628
• Github: https://github.com/AMD-AGI/Instella

🔹 Models citing this paper:
https://huggingface.co/amd/AMD-OLMo
https://huggingface.co/amd/Instella-3B-Instruct
https://huggingface.co/amd/Instella-3B

Datasets citing this paper:
https://huggingface.co/datasets/amd/Instella-Long
https://huggingface.co/datasets/amd/Instella-GSM8K-synthetic

Spaces citing this paper:
https://huggingface.co/spaces/DexterSptizu/AMD-OLMo-1B
https://huggingface.co/spaces/universeofml/DeepFocusTrain

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLMs #OpenSource #AI #MachineLearning #NLP
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

📝 Summary:
This paper formalizes RL for LLM agents by extending the MDP framework. It introduces Agent-R1, a modular and flexible training framework, and demonstrates its effectiveness on multi-hop QA tasks.

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14460
• PDF: https://arxiv.org/pdf/2511.14460
• Github: https://github.com/0russwest0/Agent-R1

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLMAgents #ReinforcementLearning #AI #DeepLearning #NLP
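
💻 Illustrative sketch:
The agent-as-MDP view in miniature: the state is the interaction history, actions are tool calls or a final answer, and the environment returns observations plus a terminal reward. The tool and hand-written policy are toy stand-ins; Agent-R1 learns the policy with end-to-end RL.

```python
# One rollout of a tool-using agent in a toy MDP.
def tool_search(query):  # stub environment tool
    return {"who wrote dune": "Frank Herbert"}.get(query.lower(), "unknown")

def policy(state):
    """Hand-written stand-in for the learned policy: search, then answer."""
    if not state["observations"]:
        return ("search", state["question"])
    return ("answer", state["observations"][-1])

def rollout(question, gold, max_steps=4):
    state = {"question": question, "observations": []}
    for _ in range(max_steps):
        kind, arg = policy(state)
        if kind == "answer":
            reward = 1.0 if arg == gold else 0.0  # terminal reward signal
            return arg, reward
        state["observations"].append(tool_search(arg))
    return None, 0.0  # step budget exhausted

print(rollout("Who wrote Dune", "Frank Herbert"))  # ('Frank Herbert', 1.0)
```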
Mitigating Label Length Bias in Large Language Models

📝 Summary:
Large Language Models exhibit a label length bias with multi-token class labels. This paper introduces Normalized Contextual Calibration (NCC) to mitigate this issue by normalizing and calibrating predictions at the full-label level. NCC significantly improves performance and reliability across diverse tasks.

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14385
• PDF: https://arxiv.org/pdf/2511.14385

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLM #AI #NLP #BiasInAI #MachineLearning
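
💻 Illustrative sketch:
One simple way to score multi-token labels without length bias, in the spirit of the summary (the paper's exact NCC formulation may differ): length-normalize each label's log-probability, then calibrate it against the same label's probability under a content-free prompt.

```python
# Length-normalized, contextually calibrated label scoring (toy numbers).
import numpy as np

# Per-token log-probs of each label given (a) the real input and
# (b) a content-free input such as "N/A"; values are made up.
logp_input = {"good": [-0.4], "not so good": [-0.9, -0.5, -0.8]}
logp_null = {"good": [-1.1], "not so good": [-1.4, -0.9, -1.2]}

def score(label):
    norm = np.mean(logp_input[label])  # length-normalized log-prob
    prior = np.mean(logp_null[label])  # label prior from the null prompt
    return norm - prior                # calibrated score

best = max(logp_input, key=score)
print({lab: round(float(score(lab)), 3) for lab in logp_input}, "->", best)
```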
Thinking-while-Generating: Interleaving Textual Reasoning throughout Visual Generation

📝 Summary:
Thinking-while-Generating (TwiG) interleaves textual reasoning throughout the visual generation process. This on-the-fly multimodal interaction guides and reflects on visual content as it is created, yielding more context-aware and semantically rich outputs.

🔹 Publication Date: Published on Nov 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.16671
• PDF: https://arxiv.org/pdf/2511.16671
• Project Page: https://think-while-gen.github.io/
• Github: https://github.com/ZiyuGuo99/Thinking-while-Generating

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#GenerativeAI #MultimodalAI #ComputerVision #NLP #AIResearch
Boosting Medical Visual Understanding From Multi-Granular Language Learning

📝 Summary:
MGLL enhances visual understanding by improving multi-label and cross-granularity alignment in image-text pretraining, outperforming existing methods in complex domains like medical imaging.

🔹 Publication Date: Published on Nov 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.15943
• PDF: https://arxiv.org/pdf/2511.15943
• Project Page: https://github.com/HUANGLIZI/MGLL
• Github: https://github.com/HUANGLIZI/MGLL

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#MedicalAI #ComputerVision #DeepLearning #NLP #ImageTextPretraining
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation

📝 Summary:
GraphGen is a framework that enhances synthetic data generation for LLMs by constructing fine-grained knowledge graphs. It targets high-value knowledge gaps and uses multi-hop sampling and style-controlled generation to create diverse and accurate QA pairs. This approach outperforms conventional methods.

🔹 Publication Date: Published on May 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2505.20416
• PDF: https://arxiv.org/pdf/2505.20416
• Project Page: https://huggingface.co/spaces/chenzihong/GraphGen
• Github: https://github.com/open-sciencelab/GraphGen

Datasets citing this paper:
https://huggingface.co/datasets/chenzihong/GraphGen-Data

Spaces citing this paper:
https://huggingface.co/spaces/chenzihong/GraphGen

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLMs #KnowledgeGraphs #SyntheticData #FineTuning #NLP
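
💻 Illustrative sketch:
Multi-hop sampling in miniature: walk the knowledge graph for k hops and turn the path into a question seed for style-controlled generation. The graph, walk policy, and question template are toy stand-ins for GraphGen's pipeline.

```python
# Sample a multi-hop path from a toy knowledge graph.
import random

EDGES = {
    "aspirin": [("inhibits", "COX-1")],
    "COX-1": [("produces", "thromboxane")],
    "thromboxane": [("promotes", "platelet aggregation")],
}

def sample_path(start, hops, rng):
    path, node = [start], start
    for _ in range(hops):
        out_edges = EDGES.get(node)
        if not out_edges:
            break  # dead end: stop the walk early
        rel, node = rng.choice(out_edges)
        path += [rel, node]
    return path

rng = random.Random(0)
path = sample_path("aspirin", hops=2, rng=rng)
print(path)  # ['aspirin', 'inhibits', 'COX-1', 'produces', 'thromboxane']
print(f"Seed question: via {path[2]}, what does {path[0]} ultimately affect?")
```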
Unveiling Intrinsic Dimension of Texts: from Academic Abstract to Creative Story

📝 Summary:
This study explores intrinsic dimension (ID) in large language models, revealing its independence from entropy and genre-specific stratification. Scientific texts show low ID, while creative and opinion writing exhibits higher ID.

🔹 Publication Date: Published on Nov 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.15210
• PDF: https://arxiv.org/pdf/2511.15210

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#IntrinsicDimension #LargeLanguageModels #NLP #TextAnalytics #DataScience
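
💻 Illustrative sketch:
Intrinsic dimension of text embeddings can be estimated with, for example, the TwoNN estimator of Facco et al. (2017); whether the paper uses this exact estimator is an assumption here. TwoNN fits ID from the ratio of each point's two nearest-neighbor distances.

```python
# TwoNN intrinsic-dimension estimate: F(mu) = 1 - mu^(-d) for mu = r2/r1,
# so a line fit of -log(1 - F) against log(mu) through the origin gives d.
import numpy as np

def two_nn_id(X):
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)        # ignore self-distances
    d.sort(axis=1)
    mu = np.sort(d[:, 1] / d[:, 0])    # ratio of 2nd to 1st NN distance
    n = len(mu)
    F = np.arange(1, n + 1) / n        # empirical CDF of mu
    x, y = np.log(mu[:-1]), -np.log(1 - F[:-1])  # drop the F=1 endpoint
    return float((x @ y) / (x @ x))    # least-squares slope = ID

rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 3))          # data on a 3-D manifold...
X = Z @ rng.normal(size=(3, 20))       # ...linearly embedded in 20-D
print(two_nn_id(X))                    # approximately 3
```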