ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Qwen Technical Report

📝 Summary:
Qwen is a series of large language models encompassing base, chat, coding, and mathematics variants. These models consistently achieve superior performance across diverse tasks, significantly outperforming open-source counterparts. Qwen-Chat models also feature advanced tool-use and planning capa...

🔹 Publication Date: Published on Sep 28, 2023

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2309.16609
• PDF: https://arxiv.org/pdf/2309.16609
• Github: https://github.com/QwenLM/Qwen-7B

🔹 Models citing this paper:
https://huggingface.co/Qwen/Qwen-7B-Chat
https://huggingface.co/Qwen/Qwen-7B
https://huggingface.co/Qwen/Qwen-14B-Chat

Datasets citing this paper:
https://huggingface.co/datasets/huyxdang/qwen-medqa-tagged
https://huggingface.co/datasets/huyxdang/qwen-math-predictions

Spaces citing this paper:
https://huggingface.co/spaces/pliny-the-prompter/obliteratus
https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard
https://huggingface.co/spaces/lhoestq/fake-data-generator-jsonl

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#Qwen #LLM #AI #NLP #DeepLearning
MIBURI: Towards Expressive Interactive Gesture Synthesis

📝 Summary:
MIBURI is an online, real-time framework generating expressive full-body gestures and facial expressions for spoken dialogue. It uses body-part aware codecs and LLM embeddings to create natural, diverse, and contextually aligned motions causally, overcoming limitations of prior methods.

🔹 Publication Date: Published on Mar 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03282
• PDF: https://arxiv.org/pdf/2603.03282
• Project Page: https://vcai.mpi-inf.mpg.de/projects/MIBURI/
• Github: https://github.com/m-hamza-mughal/miburi

==================================

#GestureSynthesis #AI #HumanComputerInteraction #NLP #RealtimeTech
Specificity-aware reinforcement learning for fine-grained open-world classification

📝 Summary:
SpeciaRL is a novel RL framework that improves large multimodal models for open-world fine-grained classification. It enhances prediction specificity while maintaining correctness using a dynamic verifier-based reward. Experiments show SpeciaRL achieves the best trade-off between specificity and correctness.

🔹 Publication Date: Published on Mar 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03197
• PDF: https://arxiv.org/pdf/2603.03197

==================================

#ReinforcementLearning #MachineLearning #ComputerVision #AI #MultimodalAI
HDINO: A Concise and Efficient Open-Vocabulary Detector

📝 Summary:
HDINO is an efficient open-vocabulary detector using a two-stage training strategy. It employs One-to-Many Semantic Alignment and lightweight feature fusion, avoiding manual data curation and complex feature extraction. HDINO achieves superior performance on COCO with less training data.

🔹 Publication Date: Published on Mar 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.02924
• PDF: https://arxiv.org/pdf/2603.02924
• Github: https://github.com/HaoZ416/HDINO

==================================

#ObjectDetection #ComputerVision #OpenVocabulary #DeepLearning #AIResearch
Qwen2.5 Technical Report

📝 Summary:
Qwen2.5, an enhanced series of large language models, demonstrates superior performance across various benchmarks and use cases through extensive pre-training and advanced post-training techniques.

🔹 Publication Date: Published on Dec 19, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2412.15115
• PDF: https://arxiv.org/pdf/2412.15115
• Github: https://github.com/QwenLM/Qwen2.5

🔹 Models citing this paper:
https://huggingface.co/Qwen/QwQ-32B
https://huggingface.co/Qwen/QwQ-32B-GGUF
https://huggingface.co/Qwen/QwQ-32B-AWQ

Datasets citing this paper:
https://huggingface.co/datasets/HuggingFaceTB/smoltalk2

Spaces citing this paper:
https://huggingface.co/spaces/modelscope/DocResearch
https://huggingface.co/spaces/ITHwangg/candle-qwen25-wasm-demo
https://huggingface.co/spaces/GuminiResearch/Gumini_sLLM_Report

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AgilePruner: An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in Large Vision-Language Models

📝 Summary:
This study empirically analyzes visual token pruning in LVLMs. It finds attention-based pruning is better for simple images, while diversity-based methods suit complex ones. These insights lead to improved adaptive pruning strategies that reduce hallucination.
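
To make the two pruning families concrete, here is a minimal sketch, not the paper's implementation. It assumes each visual token has a scalar attention score and a feature vector. Attention-based pruning is shown as a plain top-k over scores, and greedy farthest-point selection stands in for a diversity criterion. All function names and the toy data are hypothetical.

```python
import math

def prune_by_attention(scores, k):
    """Keep the indices of the k tokens with the highest attention score."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

def _dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def prune_by_diversity(features, k):
    """Greedy farthest-point selection: add the token farthest from those already kept."""
    if k <= 0 or not features:
        return []
    kept = [0]  # assumption: seed the selection with the first token
    while len(kept) < min(k, len(features)):
        best = max(
            (i for i in range(len(features)) if i not in kept),
            key=lambda i: min(_dist(features[i], features[j]) for j in kept),
        )
        kept.append(best)
    return sorted(kept)

# Toy data: tokens 0 and 1 have high attention but nearly identical features.
scores = [0.9, 0.8, 0.1, 0.05]
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (0.0, 5.0)]
attention_kept = prune_by_attention(scores, 2)   # keeps the two redundant tokens
diversity_kept = prune_by_diversity(features, 2)  # keeps two far-apart tokens
```

On this toy input the attention criterion keeps two near-duplicate tokens, while the diversity criterion spreads its budget across distant ones, which is the kind of difference an adaptive strategy would choose between.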

🔹 Publication Date: Published on Mar 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.01236
• PDF: https://arxiv.org/pdf/2603.01236
• Project Page: https://paper.pnu-cvsp.com/AgilePruner/
• Github: https://github.com/cvsp-lab/AgilePruner

==================================

#LVLMs #VisualTokenPruning #AdaptiveAI #HallucinationReduction #DeepLearning
V_1: Unifying Generation and Self-Verification for Parallel Reasoners

📝 Summary:
V1 unifies generation and verification for complex reasoning tasks. It leverages models' superior ability in pairwise self-verification over independent scoring, improving performance and efficiency in code generation and math.
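
The idea of preferring pairwise comparison over independent scoring can be sketched as follows. This is an illustrative round-robin tournament only; the paper's actual selection procedure may differ, and a full round-robin is used here purely for simplicity. `toy_prefer` is a hypothetical stand-in for a model judging two of its own candidate solutions.

```python
from itertools import combinations

def select_by_pairwise_verification(candidates, prefer):
    """Return the candidate with the most pairwise wins, plus the win counts."""
    wins = [0] * len(candidates)
    for i, j in combinations(range(len(candidates)), 2):
        # prefer(a, b) is True when a is judged at least as good as b
        if prefer(candidates[i], candidates[j]):
            wins[i] += 1
        else:
            wins[j] += 1
    best_index = max(range(len(candidates)), key=lambda i: wins[i])
    return candidates[best_index], wins

def toy_prefer(a, b):
    # Hypothetical judge: the answer closer to 42 wins (ties go to a)
    return abs(a - 42) <= abs(b - 42)

candidates = [40, 45, 42, 10]
best, wins = select_by_pairwise_verification(candidates, toy_prefer)
```

In practice the judge would be a model call, and the number of comparisons would likely be budgeted rather than exhaustive.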

🔹 Publication Date: Published on Mar 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04304
• PDF: https://arxiv.org/pdf/2603.04304
• Project Page: https://harmandotpy.github.io/v1-verification/
• Github: https://github.com/HarmanDotpy/pairwise-self-verification

==================================

#AI #LLMs #MachineLearning #CodeGeneration #AIReasoning
Underwater Camouflaged Object Tracking Meets Vision-Language SAM2

📝 Summary:
A new large-scale multi-modal underwater camouflaged object tracking dataset, UW-COT220, was introduced. Evaluations showed SAM2 improved tracking performance over SAM. A novel vision-language framework, VL-SAM2, achieved state-of-the-art results on both underwater and open-air object tracking da...

🔹 Publication Date: Published on Sep 25, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2409.16902
• PDF: https://arxiv.org/pdf/2409.16902
• Github: https://github.com/983632847/awesome-multimodal-object-tracking

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier

📝 Summary:
MOOSE-Star enables tractable training for generative scientific reasoning by tackling its intractable combinatorial complexity. It combines decomposed subtasks, a hierarchical search that reduces complexity to logarithmic, and bounded composition, allowing scalable training and inference.

🔹 Publication Date: Published on Mar 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03756
• PDF: https://arxiv.org/pdf/2603.03756
• Github: https://github.com/ZonglinY/MOOSE-Star

🔹 Models citing this paper:
https://huggingface.co/ZonglinY/MOOSE-Star-HC-R1D-7B
https://huggingface.co/ZonglinY/MOOSE-Star-IR-R1D-7B

Datasets citing this paper:
https://huggingface.co/datasets/ZonglinY/TOMATO-Star
https://huggingface.co/datasets/ZonglinY/TOMATO-Star-SFT-Data-R1D-32B

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images

📝 Summary:
HiFi-Inpaint generates high-fidelity human-product images using shared enhancement attention and a detail-aware loss, together with a new 40K-image dataset.

🔹 Publication Date: Published on Mar 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.02210
• PDF: https://arxiv.org/pdf/2603.02210
• Project Page: https://correr-zhou.github.io/HiFi-Inpaint/
• Github: https://github.com/Correr-Zhou/HiFi-Inpaint

Datasets citing this paper:
https://huggingface.co/datasets/donghao-zhou/HP-Image-40K

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
DARE: Aligning LLM Agents with the R Statistical Ecosystem via Distribution-Aware Retrieval

📝 Summary:
DARE is a retrieval model that improves R package retrieval by embedding data distribution information into function representations. It significantly outperforms existing models, enabling more reliable R code generation and statistical analysis.

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04743
• PDF: https://arxiv.org/pdf/2603.04743
• Project Page: https://ama-cmfai.github.io/DARE_webpage/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Locality-Attending Vision Transformer

📝 Summary:
Vision transformers are enhanced for segmentation tasks through a Gaussian kernel modulation that improves local attention while maintaining classification performance.
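
As a rough illustration of what Gaussian modulation of attention can look like, the sketch below adds a Gaussian log-kernel bias, based on the distance between patches on a grid, to the attention logits before the softmax. The summary does not specify the exact form, so the additive bias, the fixed `sigma`, and the function name are all assumptions rather than the paper's method.

```python
import math

def gaussian_locality_attention(logits, grid_w, sigma):
    """Softmax over logits plus a distance-based Gaussian log-kernel bias."""
    n = len(logits)
    out = []
    for q in range(n):
        qy, qx = divmod(q, grid_w)  # query patch position on the grid
        biased = []
        for k in range(n):
            ky, kx = divmod(k, grid_w)  # key patch position on the grid
            d2 = (qy - ky) ** 2 + (qx - kx) ** 2
            # log of exp(-d2 / (2 sigma^2)): nearby patches are penalized less
            biased.append(logits[q][k] - d2 / (2 * sigma ** 2))
        m = max(biased)  # subtract the max for numerical stability
        exps = [math.exp(b - m) for b in biased]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

# 2x2 patch grid with uniform content logits, so only locality matters.
uniform = [[0.0] * 4 for _ in range(4)]
weights = gaussian_locality_attention(uniform, grid_w=2, sigma=1.0)
```

With uniform logits, each patch attends most to itself, less to its horizontal and vertical neighbors, and least to the diagonal patch. A larger `sigma` flattens this preference toward ordinary global attention.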

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04892
• PDF: https://arxiv.org/pdf/2603.04892
• Github: https://github.com/sinahmr/LocAtViT

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
RealWonder: Real-Time Physical Action-Conditioned Video Generation

📝 Summary:
RealWonder enables real-time action-conditioned video generation by integrating 3D reconstruction, physics simulation, and a distilled video generator to simulate physical consequences of 3D actions.

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05449
• PDF: https://arxiv.org/pdf/2603.05449
• Project Page: https://liuwei283.github.io/RealWonder/
• Github: https://github.com/liuwei283/RealWonder

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
KARL: Knowledge Agents via Reinforcement Learning

📝 Summary:
A reinforcement learning system for enterprise search agents achieves superior performance through diverse training data generation and multi-task learning approaches.

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05218
• PDF: https://arxiv.org/pdf/2603.05218

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling

📝 Summary:
Timer-S1 is a scalable Mixture-of-Experts time series model with 8.3B parameters that uses serial scaling and novel TimeMoE blocks to improve long-term forecasting accuracy.

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04791
• PDF: https://arxiv.org/pdf/2603.04791

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
DreamWorld: Unified World Modeling in Video Generation

📝 Summary:
DreamWorld introduces a unified framework for video generation that integrates multiple types of world knowledge through joint modeling of temporal dynamics, spatial geometry, and semantic consistency...

🔹 Publication Date: Published on Feb 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.00466
• PDF: https://arxiv.org/pdf/2603.00466
• Github: https://github.com/ABU121111/DreamWorld

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline

📝 Summary:
MM-Lifelong dataset captures natural video sequences across multiple temporal scales to evaluate multimodal lifelong understanding, revealing limitations in current approaches and introducing a recurs...

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05484
• PDF: https://arxiv.org/pdf/2603.05484
• Project Page: https://huggingface.co/datasets/CG-Bench/MM-Lifelong
• Github: https://github.com/cg1177/Recursive-Multimodal-Agent

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
On-Policy Self-Distillation for Reasoning Compression

📝 Summary:
OPSDC enables efficient reasoning model compression by having models distill concise behavior from their own outputs, achieving significant token reduction while maintaining accuracy.

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05433
• PDF: https://arxiv.org/pdf/2603.05433

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research