ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models

📝 Summary:
DataFlex is a unified framework for dynamic data-centric training of large language models that supports sample selection, domain mixture adjustment, and sample reweighting while maintaining compatibi...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26164
• PDF: https://arxiv.org/pdf/2603.26164
• Github: https://github.com/OpenDCAI/DataFlex

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Generative World Renderer

📝 Summary:
A large-scale dynamic dataset derived from AAA games is introduced to improve generative inverse and forward rendering, featuring high-resolution synchronized RGB and G-buffer data alongside a novel V...

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.02329
• PDF: https://arxiv.org/pdf/2604.02329
• Project Page: https://alaya-studio.github.io/renderer
• Github: https://github.com/ShandaAI/AlayaRenderer

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Omni-SimpleMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory

📝 Summary:
An autonomous research pipeline discovers Omni-SimpleMem, a unified multimodal memory framework that significantly improves lifelong AI agent performance through automated architectural modifications,...

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01007
• PDF: https://arxiv.org/pdf/2604.01007

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
EgoSim: Egocentric World Simulator for Embodied Interaction Generation

📝 Summary:
We introduce EgoSim, a closed-loop egocentric world simulator that generates spatially consistent interaction vid...

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01001
• PDF: https://arxiv.org/pdf/2604.01001
• Project Page: https://egosimulator.github.io/
• Github: https://egosimulator.github.io/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
UniDriveVLA: Unifying Understanding, Perception, and Action Planning for Autonomous Driving

📝 Summary:
UniDriveVLA is a unified vision-language-action model for autonomous driving that decouples spatial perception and semantic reasoning through a mixture-of-transformers architecture with expert coordin...

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.02190
• PDF: https://arxiv.org/pdf/2604.02190
• Project Page: https://xiaomi-research.github.io/unidrivevla/
• Github: https://github.com/xiaomi-research/unidrivevla

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
VideoZeroBench: Probing the Limits of Video MLLMs with Spatio-Temporal Evidence Verification

📝 Summary:
VideoZeroBench presents a comprehensive benchmark for long-video question answering with rigorous spatio-temporal evidence verification, revealing significant gaps in current models' grounded video un...

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01569
• PDF: https://arxiv.org/pdf/2604.01569
• Project Page: https://marinero4972.github.io/projects/VideoZeroBench
• Github: https://github.com/marinero4972/VideoZeroBench

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Gated Condition Injection without Multimodal Attention: Towards Controllable Linear-Attention Transformers

📝 Summary:
Controllable diffusion models using linear attention architectures enable secure on-device visual generation with improved multi-condition input handling and faster convergence.

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27666
• PDF: https://arxiv.org/pdf/2603.27666

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
LatentUM: Unleashing the Potential of Interleaved Cross-Modal Reasoning via a Latent-Space Unified Model

📝 Summary:
LatentUM is a unified model that represents all modalities in a shared semantic latent space, enabling efficient cross-modal reasoning and generation without pixel-space mediation.

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.02097
• PDF: https://arxiv.org/pdf/2604.02097
• Github: https://github.com/SJTU-DENG-Lab/LatentUM

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Apriel-Reasoner: RL Post-Training for General-Purpose and Efficient Reasoning

📝 Summary:
Apriel-Reasoner, a 15B LLM, uses reproducible multi-domain RL post-training with novel sampling and length penalty to boost reasoning accuracy and efficiency. It achieves 30-50% shorter traces, outperforming its base model and matching peers at lower inference cost.

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.02007
• PDF: https://arxiv.org/pdf/2604.02007

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
LinguDistill: Recovering Linguistic Ability in Vision-Language Models via Selective Cross-Modal Distillation

📝 Summary:
LinguDistill enables recovery of linguistic capabilities in vision-language models through adapter-free distillation using frozen language models as teachers, achieving performance close to pre-adapta...

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00829
• PDF: https://arxiv.org/pdf/2604.00829

==================================

#VisionLanguageModels #NLP #ModelDistillation #ArtificialIntelligence #MachineLearning
Woosh: A Sound Effects Foundation Model

📝 Summary:
Woosh is a sound effect foundation model featuring audio encoding/decoding, text-audio alignment, and text-to-audio/video-to-audio generation capabilities with distilled versions for efficient deploym...

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01929
• PDF: https://arxiv.org/pdf/2604.01929
• Project Page: https://sonyresearch.github.io/Woosh/
• Github: https://github.com/SonyResearch/Woosh

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Friends and Grandmothers in Silico: Localizing Entity Cells in Language Models

📝 Summary:
Entity-centric factual question answering involves localized MLP neurons that can be causally intervened to recover entity-consistent predictions, showing robustness to various linguistic variations b...

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01404
• PDF: https://arxiv.org/pdf/2604.01404
• Github: https://github.com/1tux/in-silico

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Automatic Image-Level Morphological Trait Annotation for Organismal Images

📝 Summary:
This paper presents a scalable method for automatically annotating morphological traits from biological images. It uses sparse autoencoders on foundation model features to identify meaningful parts, then applies vision-language prompting to generate trait descriptions. This approach creates large...

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01619
• PDF: https://arxiv.org/pdf/2604.01619
• Github: https://github.com/OSU-NLP-Group/sae-trait-annotation

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Executing as You Generate: Hiding Execution Latency in LLM Code Generation

📝 Summary:
A parallel execution paradigm for LLM-based coding agents reduces latency by executing code during generation rather than in sequential stages. Current LLM-based coding agents follo...

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00491
• PDF: https://arxiv.org/pdf/2604.00491

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
UniRecGen: Unifying Multi-View 3D Reconstruction and Generation

📝 Summary:
UniRecGen combines feed-forward reconstruction and diffusion-based generation in a shared canonical space to produce complete and consistent 3D models from sparse inputs through disentangled cooperati...

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01479
• PDF: https://arxiv.org/pdf/2604.01479
• Github: https://github.com/zsh523/UniRecGen

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
T5Gemma-TTS Technical Report

📝 Summary:
T5Gemma-TTS is an encoder-decoder codec language model that improves voice cloning and duration control for multilingual speech synthesis. It uses cross-attention for persistent text conditioning and Progress-Monitoring Rotary Position Embedding (PM-RoPE) for better target speech length tracking. I...

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01760
• PDF: https://arxiv.org/pdf/2604.01760
• Github: https://github.com/Aratako/T5Gemma-TTS

🔹 Models citing this paper:
https://huggingface.co/Aratako/T5Gemma-TTS-2b-2b

Spaces citing this paper:
https://huggingface.co/spaces/Aratako/T5Gemma-TTS-Demo
https://huggingface.co/spaces/litagin/T5Gemma-TTS-Demo

==================================

#SpeechSynthesis #TTS #VoiceCloning #Multilingual #LanguageModels
DynaVid: Learning to Generate Highly Dynamic Videos using Synthetic Motion Data

📝 Summary:
DynaVid improves dynamic video synthesis by training with synthetic optical flow, which provides diverse motion patterns without artificial appearances. A two-stage framework learns dynamic motion while preserving visual realism, enhancing motion control.

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01666
• PDF: https://arxiv.org/pdf/2604.01666
• Project Page: https://jinwonjoon.github.io/DynaVid/

==================================

#VideoGeneration #AIVideo #DeepLearning #ComputerVision #SyntheticData
VOID: Video Object and Interaction Deletion

📝 Summary:
VOID is a video object removal framework designed for complex scenarios involving significant object interactions. It uses vision-language and video diffusion models, leveraging causal reasoning to generate physically plausible counterfactual scenes. VOID better preserves consistent scene dynamic...

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.02296
• PDF: https://arxiv.org/pdf/2604.02296
• Project Page: https://void-model.github.io/
• Github: https://github.com/Netflix/void-model

🔹 Models citing this paper:
https://huggingface.co/netflix/void-model

Spaces citing this paper:
https://huggingface.co/spaces/sam-motamed/VOID

==================================

#VideoEditing #DiffusionModels #ComputerVision #GenerativeAI #DeepLearning
Forwarded from Code With Python
This channel is for programmers, coders, and software engineers.

0️⃣ Python
1️⃣ Data Science
2️⃣ Machine Learning
3️⃣ Data Visualization
4️⃣ Artificial Intelligence
5️⃣ Data Analysis
6️⃣ Statistics
7️⃣ Deep Learning
8️⃣ Programming Languages

https://t.iss.one/addlist/8_rRW2scgfRhOTc0

https://t.iss.one/Codeprogrammer
Omni123: Exploring 3D Native Foundation Models with Limited 3D Data by Unifying Text to 2D and 3D Generation

📝 Summary:
Omni123 is a 3D-native foundation model unifying text-to-2D and text-to-3D generation. It addresses limited 3D data by leveraging cross-modal consistency from abundant 2D images as a geometric prior. This model significantly improves text-guided 3D generation and editing.

🔹 Publication Date: Published on Apr 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.02289
• PDF: https://arxiv.org/pdf/2604.02289

==================================

#3DGeneration #FoundationModels #AIResearch #ComputerVision #DeepLearning
Investigating Autonomous Agent Contributions in the Wild: Activity Patterns and Code Change over Time

📝 Summary:
Researchers analyzed AI coding agent contributions to open source projects. They found increasing agent activity but higher code churn over time compared to human-authored code.

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00917
• PDF: https://arxiv.org/pdf/2604.00917
• Project Page: https://arxiv.org/html/2604.00917v1

==================================

#AIAgents #SoftwareEngineering #OpenSource #CodeQuality #AIResearch