ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
UniFunc3D: Unified Active Spatial-Temporal Grounding for 3D Functionality Segmentation

📝 Summary:
UniFunc3D enables 3D scene functionality segmentation by treating multimodal large language models as active observers that perform joint semantic, temporal, and spatial reasoning through adaptive fra...

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23478
• PDF: https://arxiv.org/pdf/2603.23478
• Project Page: https://jiaying.link/unifunc3d/

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
EVA: Efficient Reinforcement Learning for End-to-End Video Agent

📝 Summary:
EVA is an RL framework enabling efficient, adaptive video understanding by autonomously deciding what and how to watch. It uses iterative planning to handle long video sequences. EVA significantly outperforms existing MLLM and adaptive agent methods on multiple video benchmarks.

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22918
• PDF: https://arxiv.org/pdf/2603.22918
• Project Page: https://huggingface.co/WRHC/EfficientVideoAgent/
• Github: https://github.com/wangruohui/EfficientVideoAgent

🔹 Models citing this paper:
https://huggingface.co/WRHC/EfficientVideoAgent

#AI #DataScience #MachineLearning #HuggingFace #Research
PLDR-LLMs Reason At Self-Organized Criticality

📝 Summary:
PLDR-LLMs exhibit reasoning capabilities at self-organized criticality through metastable steady states that mirror second-order phase transitions, enabling generalization without benchmark evaluation...

🔹 Publication Date: Published on Mar 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23539
• PDF: https://arxiv.org/pdf/2603.23539
• Project Page: https://huggingface.co/fromthesky
• Github: https://github.com/burcgokden/PLDR-LLM-Self-Organized-Criticality

🔹 Models citing this paper:
https://huggingface.co/fromthesky/PLDR-LLM-v51-SOC-110M-1
https://huggingface.co/fromthesky/PLDR-LLM-v51-SOC-110M-2
https://huggingface.co/fromthesky/PLDR-LLM-v51-SOC-110M-3

#AI #DataScience #MachineLearning #HuggingFace #Research
StreamingClaw Technical Report

📝 Summary:
StreamingClaw is a unified framework for real-time streaming video understanding and embodied intelligence. It integrates real-time reasoning, multimodal long-term memory, and proactive interaction, enabling direct control of the physical world.

🔹 Publication Date: Published on Mar 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22120
• PDF: https://arxiv.org/pdf/2603.22120

#EmbodiedAI #VideoUnderstanding #RealTimeAI #Robotics #MultimodalAI
Unleashing Spatial Reasoning in Multimodal Large Language Models via Textual Representation Guided Reasoning

📝 Summary:
TRACE is a prompting method that enables MLLMs to perform 3D spatial reasoning by generating text-based representations of video environments. This improves spatial question answering and consistently outperforms prior strategies.

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23404
• PDF: https://arxiv.org/pdf/2603.23404

#SpatialReasoning #MLLMs #AI #PromptEngineering #ComputerVision
LagerNVS: Latent Geometry for Fully Neural Real-time Novel View Synthesis

📝 Summary:
LagerNVS is a neural network for novel view synthesis that uses strong 3D inductive biases. It achieves this by initializing its encoder from a pre-trained 3D reconstruction network, enabling state-of-the-art, real-time NVS even with unknown cameras and in-the-wild data.

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.20176
• PDF: https://arxiv.org/pdf/2603.20176
• Project Page: https://szymanowiczs.github.io/lagernvs
• Github: https://github.com/facebookresearch/lagernvs

🔹 Models citing this paper:
https://huggingface.co/facebook/lagernvs_general_512
https://huggingface.co/facebook/lagernvs_re10k_2v_256
https://huggingface.co/facebook/lagernvs_dl3dv_2-6_v_256

#NovelViewSynthesis #NeuralNetworks #3DReconstruction #ComputerVision #DeepLearning
6Bit-Diffusion: Inference-Time Mixed-Precision Quantization for Video Diffusion Models

📝 Summary:
This paper introduces a mixed-precision quantization framework for video diffusion transformers. It dynamically allocates NVFP4/INT8 based on layer volatility and uses Temporal Delta Cache to skip computations, significantly reducing memory and cost while preserving quality.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18742
• PDF: https://arxiv.org/pdf/2603.18742

#Quantization #DiffusionModels #VideoAI #DeepLearning #ModelOptimization
The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search

📝 Summary:
The AI Scientist-v2 is an agentic system capable of autonomous scientific discovery, from hypothesis to manuscript. It produced the first fully AI-generated paper accepted by a peer-reviewed workshop, highlighting AI's growing research capabilities.

🔹 Publication Date: Published on Apr 10, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.08066
• PDF: https://arxiv.org/pdf/2504.08066
• Github: https://github.com/SakanaAI/AI-Scientist-v2

#AI #ScientificDiscovery #AgenticAI #AutonomousResearch #FutureOfScience
Qworld: Question-Specific Evaluation Criteria for LLMs

📝 Summary:
Qworld is a new method that generates question-specific evaluation criteria for LLMs using recursive expansion trees. It decomposes questions into fine-grained criteria, enabling more insightful and granular assessment of LLM capabilities by adapting to each question's context. This approach reve...

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23522
• PDF: https://arxiv.org/pdf/2603.23522
• Project Page: https://qworld.openscientist.ai/
• Github: https://github.com/mims-harvard/qworld

#LLMEvaluation #LargeLanguageModels #AIResearch #NLP #MachineLearning
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

📝 Summary:
The AI Scientist is an LLM-based system for automated scientific discovery. It generates ideas, runs experiments, writes papers, and performs a simulated review. Each paper costs under $15 to produce, and some exceed the acceptance threshold of a top machine learning conference as judged by the system's automated reviewer.

🔹 Publication Date: Published on Aug 12, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2408.06292
• PDF: https://arxiv.org/pdf/2408.06292
• Github: https://github.com/SakanaAI/AI-Scientist

🔹 Models citing this paper:
https://huggingface.co/pradachan/AI-Scientist
https://huggingface.co/priyanshmahant12/AI-Scientist-main

🔹 Spaces citing this paper:
https://huggingface.co/spaces/AUXteam/Critical_Code_Agent

#AIScientist #AutomatedDiscovery #LLM #ScientificResearch #AIforScience
SpectralSplats: Robust Differentiable Tracking via Spectral Moment Supervision

📝 Summary:
SpectralSplats resolves vanishing gradients in 3D Gaussian Splatting tracking by optimizing in the frequency domain using spectral moments. This creates a global gradient basin of attraction, ensuring robust tracking even with severe misalignment. A frequency annealing schedule guides precise ali...

🔹 Publication Date: Published on Mar 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.24036
• PDF: https://arxiv.org/pdf/2603.24036
• Project Page: https://avigailco.github.io/SpectralSplats/

#3DTracking #GaussianSplatting #ComputerVision #Optimization #DifferentiableRendering
Understanding the Challenges in Iterative Generative Optimization with LLMs

📝 Summary:
Generative optimization with LLMs is often brittle due to implicit design choices about artifact modification and learning evidence. These hidden decisions, such as starting artifact or batching, critically determine success across applications. Making these choices explicit is crucial for wider ...

🔹 Publication Date: Published on Mar 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23994
• PDF: https://arxiv.org/pdf/2603.23994

#LLMs #GenerativeAI #Optimization #AIResearch #MachineLearning
The Pulse of Motion: Measuring Physical Frame Rate from Visual Dynamics

📝 Summary:
Generative video models suffer from inconsistent physical motion speeds due to varied training data. This work introduces Visual Chronometer, a tool that estimates a video's true physical frame rate from its visual dynamics. Correcting this significantly improves the naturalness of AI-generated v...

🔹 Publication Date: Published on Mar 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14375
• PDF: https://arxiv.org/pdf/2603.14375
• Project Page: https://xiangbogaobarry.github.io/Pulse-of-Motion/
• Github: https://github.com/taco-group/Pulse-of-Motion

#AIGeneratedVideo #ComputerVision #FrameRate #DeepLearning #AIResearch
Internal Safety Collapse in Frontier Large Language Models

📝 Summary:
Frontier LLMs suffer Internal Safety Collapse, continuously generating harmful content under specific task conditions, even for benign tasks. A new framework triggers this vulnerability, yielding 95% safety failure rates and revealing inherent unsafe capabilities despite alignment efforts.

🔹 Publication Date: Published on Mar 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23509
• PDF: https://arxiv.org/pdf/2603.23509
• Project Page: https://wuyoscar.github.io/ISC-Bench
• Github: https://github.com/wuyoscar/ISC-Bench

#AISafety #LLM #AIAlignment #MachineLearning #AIResearch
MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data

📝 Summary:
To advance multi-reference image generation, this paper introduces MacroData, a large-scale dataset providing structured long-context supervision. It also proposes MacroBench, a standardized benchmark for evaluation. Fine-tuning on MacroData significantly improves generation performance.

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25319
• PDF: https://arxiv.org/pdf/2603.25319
• Project Page: https://macro400k.github.io/
• Github: https://github.com/HKU-MMLab/Macro

#AI #DataScience #MachineLearning #HuggingFace #Research
SlopCodeBench: Benchmarking How Coding Agents Degrade Over Long-Horizon Iterative Tasks

📝 Summary:
Software development is iterative, yet agentic coding benchmarks overwhelmingly evaluate single-shot solutions against complete specifications. Code can pass the test suite but become progressively ha...

🔹 Publication Date: Published on Mar 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.24755
• PDF: https://arxiv.org/pdf/2603.24755
• Project Page: https://www.scbench.ai
• Github: https://github.com/SprocketLab/slop-code-bench

#AICoding #Benchmarking #LLMAgents #SoftwareEngineering #CodeGeneration
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens

📝 Summary:
Memory Sparse Attention (MSA) enables large language models to process extremely long contexts with linear complexity and high efficiency through innovations like sparse attention and document-wise Ro...

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23516
• PDF: https://arxiv.org/pdf/2603.23516
• Project Page: https://evermind.ai/blogs/breaking-the-100m-token-limit-msa-architecture-achieves-efficient-end-to-end-long-term-memory-for-llms
• Github: https://github.com/EverMind-AI/MSA

#AI #DataScience #MachineLearning #HuggingFace #Research
Voxtral TTS

📝 Summary:
Voxtral TTS is a multilingual text-to-speech model that generates natural speech from short reference audio using a hybrid architecture combining semantic token generation and flow-matching for acoust...

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25551
• PDF: https://arxiv.org/pdf/2603.25551
• Project Page: https://mistral.ai/news/voxtral-tts

#AI #DataScience #MachineLearning #HuggingFace #Research
Less Gaussians, Texture More: 4K Feed-Forward Textured Splatting

📝 Summary:
LGTM is a feed-forward framework that enables high-fidelity 4K novel view synthesis by predicting compact Gaussian primitives with per-primitive textures, decoupling geometric complexity from renderin...

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25745
• PDF: https://arxiv.org/pdf/2603.25745
• Project Page: https://yxlao.github.io/lgtm/

#AI #DataScience #MachineLearning #HuggingFace #Research