ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Factorized Learning for Temporally Grounded Video-Language Models

📝 Summary:
Video-language models struggle with temporal grounding when grounding and textual response are learned as coupled tasks. The D^2VLM framework decouples the two using evidence tokens, and factorized preference optimization explicitly optimizes temporal grounding for both tasks.

🔹 Publication Date: Published on Dec 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24097
• PDF: https://arxiv.org/pdf/2512.24097
• Project Page: https://github.com/nusnlp/d2vlm
• Github: https://github.com/nusnlp/d2vlm

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

📝 Summary:
This paper presents JavisGPT, the first unified multimodal large language model (MLLM) for Joint Audio-Video (JAV) comprehension and generation. JavisGPT adopts a concise encoder-LLM-decoder architect...

🔹 Publication Date: Published on Dec 28, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22905
• PDF: https://arxiv.org/pdf/2512.22905
• Project Page: https://javisverse.github.io/JavisGPT-page/
• Github: https://github.com/JavisVerse/JavisGPT

🔹 Models citing this paper:
https://huggingface.co/JavisVerse/JavisGPT-v0.1-7B-Instruct

🔹 Datasets citing this paper:
https://huggingface.co/datasets/JavisVerse/MM-PreTrain
https://huggingface.co/datasets/JavisVerse/JavisUnd-Eval
https://huggingface.co/datasets/JavisVerse/AV-FineTune

==================================

Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems

📝 Summary:
The rapid advancement of autonomous systems, including self-driving vehicles and drones, has intensified the need to forge true Spatial Intelligence from multi-modal onboard sensor data. While foundat...

🔹 Publication Date: Published on Dec 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24385
• PDF: https://arxiv.org/pdf/2512.24385
• Github: https://github.com/worldbench/awesome-spatial-intelligence

==================================

Valori: A Deterministic Memory Substrate for AI Systems

📝 Summary:
Valori introduces a deterministic AI memory substrate using fixed-point arithmetic, ensuring bit-identical results across platforms. This eliminates non-determinism from floating-point operations in vector embeddings and search, making AI systems trustworthy and verifiable.
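
A minimal sketch of why fixed-point scoring is reproducible, assuming a Q16.16 format and the hypothetical helpers `to_fixed`, `fixed_dot`, and `rank` (none of these come from the paper or the Valori codebase). Once embeddings are quantized to integers, the dot product uses only integer multiply and add, which are exact and order-independent, so the ranking does not depend on how a platform schedules floating-point sums.

```python
# Illustrative only: not Valori's implementation. Q16.16 is an assumed format.
FRAC_BITS = 16
SCALE = 1 << FRAC_BITS  # 65536


def to_fixed(x: float) -> int:
    """Quantize a float to a Q16.16 integer (round to nearest)."""
    return round(x * SCALE)


def fixed_dot(a: list[int], b: list[int]) -> int:
    """Dot product in pure integer arithmetic; the result is scaled by SCALE**2."""
    return sum(x * y for x, y in zip(a, b))


def rank(query: list[float], vectors: list[list[float]]) -> list[int]:
    """Return vector indices sorted by descending score, ties broken by index."""
    q = [to_fixed(v) for v in query]
    scores = [
        (fixed_dot(q, [to_fixed(v) for v in vec]), i)
        for i, vec in enumerate(vectors)
    ]
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]
```

For example, `rank([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]])` orders the candidates by integer score alone, and reversing the summation order inside `fixed_dot` cannot change any score.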

🔹 Publication Date: Published on Dec 25, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22280
• PDF: https://arxiv.org/pdf/2512.22280
• Project Page: https://valori.systems/
• Github: https://github.com/varshith-Git/Valori-Kernel

==================================

BEDA: Belief Estimation as Probabilistic Constraints for Performing Strategic Dialogue Acts

📝 Summary:
A framework called BEDA uses probabilistic constraints on belief estimation to improve strategic dialogue through formalized adversarial and alignment acts, outperforming baselines across multiple tas...

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24885
• PDF: https://arxiv.org/pdf/2512.24885

==================================

On the Role of Discreteness in Diffusion LLMs

📝 Summary:
This paper examines diffusion language models, highlighting five properties separating diffusion mechanics from language requirements. Existing approaches face structural trade-offs. Key issues identified are uniform corruption and token-wise marginal training, urging development of diffusion pro...

🔹 Publication Date: Published on Dec 27, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22630
• PDF: https://arxiv.org/pdf/2512.22630

==================================

DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models

📝 Summary:
DiffThinker introduces a generative multimodal reasoning framework using diffusion models. It reframes vision-centric tasks as image-to-image generation for superior logical consistency and spatial precision. DiffThinker significantly outperforms existing MLLMs across various domains, showcasing ...

🔹 Publication Date: Published on Dec 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24165
• PDF: https://arxiv.org/pdf/2512.24165
• Project Page: https://diffthinker-project.github.io/
• Github: https://github.com/lcqysl/DiffThinker

==================================

Deep Delta Learning

📝 Summary:
The efficacy of deep residual networks is fundamentally predicated on the identity shortcut connection. While this mechanism effectively mitigates the vanishing gradient problem, it imposes a strictly...

🔹 Publication Date: Published on Jan 1, 2026

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.00417
• PDF: https://arxiv.org/pdf/2601.00417
• Github: https://github.com/yifanzhang-pro/deep-delta-learning

==================================

Fast-weight Product Key Memory

📝 Summary:
FwPKM introduces a dynamic, fast-weight episodic memory mechanism for sequence modeling that balances storage capacity and efficiency, achieving strong performance on long-context tasks like Needle in...

🔹 Publication Date: Published on Jan 2, 2026

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.00671
• PDF: https://arxiv.org/pdf/2601.00671

==================================

NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation

📝 Summary:
NextFlow is a unified decoder-only transformer enabling fast multimodal understanding and generation. It uses next-token prediction for text and next-scale for images, generating 1024x1024 images in 5 seconds. It achieves state-of-the-art performance among unified models.

🔹 Publication Date: Published on Jan 5, 2026

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02204
• PDF: https://arxiv.org/pdf/2601.02204
• Github: https://github.com/ByteVisionLab/NextFlow

==================================
