ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Fine-grained Motion Retrieval via Joint-Angle Motion Images and Token-Patch Late Interaction

📝 Summary:
This paper presents a novel text-motion retrieval method. It maps joint-angle motion features into Vision Transformer-compatible pseudo-images and uses an enhanced late interaction mechanism. This achieves superior performance and offers interpretable fine-grained text-motion alignments.
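
As a rough illustration of what token-patch late interaction means, the sketch below scores a text query against a set of patch embeddings with the generic ColBERT-style MaxSim rule: each token is matched to its most similar patch, and the per-token maxima are summed. This is not the paper's enhanced mechanism; the function name, array shapes, and random embeddings are assumptions made for the example.

```python
import numpy as np

def maxsim_score(text_tokens: np.ndarray, motion_patches: np.ndarray):
    """Generic MaxSim late interaction (illustrative, not the paper's variant).

    text_tokens:    (T, D) token embeddings from a hypothetical text encoder
    motion_patches: (P, D) patch embeddings from a hypothetical ViT
    """
    # L2-normalise rows so dot products are cosine similarities.
    t = text_tokens / np.linalg.norm(text_tokens, axis=1, keepdims=True)
    p = motion_patches / np.linalg.norm(motion_patches, axis=1, keepdims=True)
    sim = t @ p.T                         # (T, P) token-patch similarity matrix
    best_patch = sim.argmax(axis=1)       # patch index each token aligns to
    score = float(sim.max(axis=1).sum())  # sum of per-token maxima
    return score, best_patch

rng = np.random.default_rng(0)
patches = rng.normal(size=(16, 8))
# Toy query whose three "tokens" are copies of patches 2, 5 and 11.
query = patches[[2, 5, 11]]
score, alignment = maxsim_score(query, patches)
print(score, alignment)
```

The `alignment` array is what makes this family of methods interpretable: it records which patch each token matched, so a token can be traced back to a region of the pseudo-image.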

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09930
• PDF: https://arxiv.org/pdf/2603.09930

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#MotionRetrieval #DeepLearning #ComputerVision #AIResearch #NLP
Video Streaming Thinking: VideoLLMs Can Watch and Think Simultaneously

📝 Summary:
Video Streaming Thinking (VST) is a novel paradigm for real-time video understanding, enabling AI to think while watching during streaming playback. It optimizes VideoLLMs for responsive, low-latency interaction, showing significantly faster responses and strong performance on various benchmarks.

🔹 Publication Date: Published on Mar 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.12262
• PDF: https://arxiv.org/pdf/2603.12262
• Project Page: https://1ranguan.github.io/VST/
• Github: https://github.com/1ranGuan/VST

==================================

#VideoLLMs #RealTimeAI #VideoUnderstanding #AIResearch #MachineLearning
CreativeBench: Benchmarking and Enhancing Machine Creativity via Self-Evolving Challenges

📝 Summary:
Researchers introduced CreativeBench, a benchmark for evaluating machine creativity in code generation using a quality-novelty metric. They found scaling improves combinatorial creativity but yields diminishing returns for exploration. They also proposed EvoRePE, an inference-time strategy to enh...

🔹 Publication Date: Published on Mar 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.11863
• PDF: https://arxiv.org/pdf/2603.11863
• Project Page: https://zethwang.github.io/creativebench.github.io/
• Github: https://github.com/ZethWang/CreativeBench

==================================

#MachineCreativity #CodeGeneration #AIBenchmark #GenerativeAI #AIResearch
Think While Watching: Online Streaming Segment-Level Memory for Multi-Turn Video Reasoning in Multimodal Large Language Models

📝 Summary:
Think While Watching is a memory-anchored framework enabling multimodal large language models to perform continuous multi-turn video reasoning. It maintains long-range dependencies and boosts efficiency for streaming, significantly outperforming existing methods on benchmarks.

🔹 Publication Date: Published on Mar 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.11896
• PDF: https://arxiv.org/pdf/2603.11896
• Github: https://github.com/wl666hhh/Think_While_Watching

==================================

#MLLM #VideoReasoning #StreamingAI #AIMemory #AIResearch
Surprised by Attention: Predictable Query Dynamics for Time Series Anomaly Detection

📝 Summary:
AxonAD is an unsupervised anomaly detector for multivariate time series. It detects structural dependency shifts by analyzing predictable multi-head attention query evolution, combining reconstruction with a query mismatch score. It outperforms existing methods.
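
One way to picture "combining reconstruction with a query mismatch score" is the toy below. It is not AxonAD's scoring rule: the cosine-based mismatch, the median/MAD scaling, the weight `alpha`, and all array names are assumptions chosen only to show how two per-timestep signals could be blended into one anomaly score.

```python
import numpy as np

def robust_z(values: np.ndarray) -> np.ndarray:
    """Centre by the median and scale by the median absolute deviation."""
    med = np.median(values)
    mad = np.median(np.abs(values - med)) + 1e-8
    return (values - med) / mad

def combined_score(x, x_hat, q, q_pred, alpha=0.5):
    """Blend reconstruction error with query mismatch (hypothetical rule)."""
    recon = np.mean((x - x_hat) ** 2, axis=1)  # per-step reconstruction error
    cos = np.sum(q * q_pred, axis=1) / (
        np.linalg.norm(q, axis=1) * np.linalg.norm(q_pred, axis=1) + 1e-8
    )
    mismatch = 1.0 - cos                       # 0 when predicted query matches
    return alpha * robust_z(recon) + (1.0 - alpha) * robust_z(mismatch)

rng = np.random.default_rng(0)
T = 100
x = rng.normal(size=(T, 4))
x_hat = x + 0.05 * rng.normal(size=(T, 4))    # reconstruction is good everywhere
q_pred = rng.normal(size=(T, 8))
q = q_pred + 0.05 * rng.normal(size=(T, 8))   # queries are mostly predictable
q[60] = -q_pred[60]                           # inject a shift at step 60
scores = combined_score(x, x_hat, q, q_pred)
flagged = int(np.argmax(scores))
print(flagged)
```

In this toy the reconstruction error alone would not single out step 60, because the signal itself is reconstructed well there. Only the query term reacts, which is the kind of structural shift the summary describes.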

🔹 Publication Date: Published on Mar 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.12916
• PDF: https://arxiv.org/pdf/2603.12916
• Github: https://github.com/iis-esslingen/AxonAD

==================================

#AnomalyDetection #TimeSeries #MachineLearning #DeepLearning #UnsupervisedLearning
ECoLAD: Deployment-Oriented Evaluation for Automotive Time-Series Anomaly Detection

📝 Summary:
ECoLAD is a new framework evaluating time-series anomaly detection under compute constraints, critical for in-vehicle systems. It uses efficiency reductions to assess feasibility. Findings show classical methods sustain performance, but deep learning often becomes infeasible before losing accuracy.

🔹 Publication Date: Published on Mar 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.10926
• PDF: https://arxiv.org/pdf/2603.10926

==================================

#AnomalyDetection #TimeSeries #AutomotiveAI #EdgeAI #DeepLearning
Can Fairness Be Prompted? Prompt-Based Debiasing Strategies in High-Stakes Recommendations

📝 Summary:
This study investigates prompt-based debiasing strategies for LLM recommenders to improve group fairness. It finds that instructing LLMs to be fair can boost fairness by up to 74% while maintaining recommendation effectiveness.

🔹 Publication Date: Published on Mar 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.12935
• PDF: https://arxiv.org/pdf/2603.12935

==================================

#LLM #AI #Fairness #Debiasing #RecommenderSystems
EvoScientist: Towards Multi-Agent Evolving AI Scientists for End-to-End Scientific Discovery

📝 Summary:
EvoScientist is an evolving multi-agent AI framework that enhances scientific discovery. It uses persistent memory to continuously learn from past interactions, improving scientific idea generation and experimental execution success rates. Experiments show it outperforms state-of-the-art systems ...

🔹 Publication Date: Published on Mar 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.08127
• PDF: https://arxiv.org/pdf/2603.08127
• Github: https://github.com/EvoScientist/EvoScientist

==================================

#AI #MultiAgentSystems #ScientificDiscovery #EvolutionaryAI #AIResearch
SDF-Net: Structure-Aware Disentangled Feature Learning for Optical-SAR Ship Re-identification

📝 Summary:
SDF-Net improves optical-SAR ship re-identification by leveraging stable ship geometry despite radiometric differences. It extracts scale-invariant structural features and disentangles modality-invariant and modality-specific cues to enhance discrimination.

🔹 Publication Date: Published on Mar 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.12588
• PDF: https://arxiv.org/pdf/2603.12588
• Github: https://github.com/cfrfree/SDF-Net

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Do You See What I Am Pointing At? Gesture-Based Egocentric Video Question Answering

📝 Summary:
EgoPointVQA presents a dataset and benchmark for gesture-grounded egocentric question answering, along with Hand Intent Tokens (HINT) that encode 3D hand keypoints to improve pointing intent interpret...

🔹 Publication Date: Published on Mar 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.12533
• PDF: https://arxiv.org/pdf/2603.12533
• Project Page: https://yuuraa.github.io/papers/choi2026egovqa/
• Github: https://github.com/Yuuraa/EgoPointVQA

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Compression Favors Consistency, Not Truth: When and Why Language Models Prefer Correct Information

📝 Summary:
Language models' preference for correct information follows from a 'Compression-Consistency Principle': next-token prediction favors shorter, more internally consistent data. Truth bias is a side effect of compression, not inherent truth-seeking, and it emerges when false alternatives are hard to compress.
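
The intuition that internally consistent data is cheaper to encode can be seen in a toy experiment. The sketch below uses `zlib` as a crude stand-in for a learned compressor, which is an assumption made for this example and not the paper's setup: a corpus where every addition statement follows the same rule compresses to fewer bytes than one where the answers are arbitrary.

```python
import random
import zlib

def compressed_size(lines):
    """Bytes needed by zlib (level 9) to encode the corpus."""
    return len(zlib.compress("\n".join(lines).encode("utf-8"), 9))

random.seed(0)
pairs = [(a, b) for a in range(10) for b in range(10)]

# Consistent corpus: every statement obeys the single rule of addition.
consistent = [f"{a} + {b} = {a + b}" for _ in range(5) for a, b in pairs]

# Inconsistent corpus: same questions, but each answer is an arbitrary number.
inconsistent = [
    f"{a} + {b} = {random.randint(0, 18)}" for _ in range(5) for a, b in pairs
]

size_consistent = compressed_size(consistent)
size_inconsistent = compressed_size(inconsistent)
print(size_consistent, size_inconsistent)
```

Both corpora contain the same number of statements of nearly the same length. The size gap comes only from whether the answers can be predicted from earlier text.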

🔹 Publication Date: Published on Mar 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.11749
• PDF: https://arxiv.org/pdf/2603.11749
• Github: https://github.com/Rai220/compression-drives-truth/blob/master/paper_v2.md

Datasets citing this paper:
https://huggingface.co/datasets/krestnikov/compression-drives-truth

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Taking Shortcuts for Categorical VQA Using Super Neurons

📝 Summary:
This paper introduces Super Neurons (SNs), scalar activations that replace Sparse Attention Vectors (SAVs) for Vision Language Model classification. SNs enable extreme early exiting from shallow layers, improving classification performance and achieving up to 5.10x speedup.
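
To make the idea of classifying from a single scalar activation concrete, here is a toy sketch. It does not reproduce the paper's selection procedure: the midpoint-of-means threshold, the synthetic activations, and the names `pick_neuron` and `predict` are all assumptions. It picks the one hidden unit that best separates two classes on a small labelled probe set, then classifies new inputs by thresholding only that unit, so deeper layers would never need to run.

```python
import numpy as np

def pick_neuron(acts: np.ndarray, labels: np.ndarray):
    """Return (index, threshold, sign) of the unit that best splits two classes."""
    mu0 = acts[labels == 0].mean(axis=0)
    mu1 = acts[labels == 1].mean(axis=0)
    thresholds = (mu0 + mu1) / 2.0             # midpoint between class means
    signs = np.where(mu1 >= mu0, 1.0, -1.0)    # which side means class 1
    preds = (signs * (acts - thresholds) > 0).astype(int)
    acc = (preds == labels[:, None]).mean(axis=0)
    best = int(acc.argmax())
    return best, float(thresholds[best]), float(signs[best])

def predict(acts: np.ndarray, idx: int, thr: float, sign: float) -> np.ndarray:
    """Classify from one scalar per example; nothing else is read."""
    return (sign * (acts[:, idx] - thr) > 0).astype(int)

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)
acts = rng.normal(size=(200, 64))   # hypothetical shallow-layer activations
acts[:, 17] += 6.0 * labels         # plant one highly informative unit
idx, thr, sign = pick_neuron(acts, labels)

test_labels = rng.integers(0, 2, size=200)
test_acts = rng.normal(size=(200, 64))
test_acts[:, 17] += 6.0 * test_labels
accuracy = float((predict(test_acts, idx, thr, sign) == test_labels).mean())
print(idx, accuracy)
```

Because prediction reads one scalar from a shallow layer, the cost of everything after that layer disappears. That is where an early-exit speedup would come from in a real model.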

🔹 Publication Date: Published on Mar 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.10781
• PDF: https://arxiv.org/pdf/2603.10781

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AI Can Learn Scientific Taste

📝 Summary:
RLCF trains AI to judge and propose high-impact research. Scientific Judge models preferences; Scientific Thinker proposes ideas. AI learns this capability, outperforming SOTA LLMs.

🔹 Publication Date: Published on Mar 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14473
• PDF: https://arxiv.org/pdf/2603.14473
• Project Page: https://tongjingqi.github.io/AI-Can-Learn-Scientific-Taste/
• Github: https://github.com/tongjingqi/AI-Can-Learn-Scientific-Taste

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Safe and Scalable Web Agent Learning via Recreated Websites

📝 Summary:
VeriEnv enables safe and scalable training of web agents by creating synthetic, verifiable environments from real websites through language model-based cloning. Training autonomou...

🔹 Publication Date: Published on Mar 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.10505
• PDF: https://arxiv.org/pdf/2603.10505
• Project Page: https://huggingface.co/spaces/hyungjoochae/verienv-project-page
• Github: https://github.com/kyle8581/VeriEnv

Spaces citing this paper:
https://huggingface.co/spaces/hyungjoochae/verienv-project-page

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
FlashMotion: Few-Step Controllable Video Generation with Trajectory Guidance

📝 Summary:
Recent advances in trajectory-controllable video generation have achieved remarkable progress. Previous methods mainly use adapter-based architectures for precise motion control along predefined traje...

🔹 Publication Date: Published on Mar 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.12146
• PDF: https://arxiv.org/pdf/2603.12146
• Project Page: https://quanhaol.github.io/flashmotion-site/
• Github: https://github.com/quanhaol/FlashMotion

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
FineRMoE: Dimension Expansion for Finer-Grained Expert with Its Upcycling Approach

📝 Summary:
As revealed by the scaling law of fine-grained MoE, model performance ceases to be improved once the granularity of the intermediate dimension exceeds the optimal threshold, limiting further gains fro...

🔹 Publication Date: Published on Mar 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.13364
• PDF: https://arxiv.org/pdf/2603.13364
• Project Page: https://github.com/liaoning97/FineRMoE
• Github: https://github.com/liaoning97/FineRMoE

🔹 Models citing this paper:
https://huggingface.co/NingLiao/FineRMoE-26.65B-A7.94B
https://huggingface.co/NingLiao/FineRMoE-1.68B-A0.65B
https://huggingface.co/NingLiao/FineRMoE-5.64B-A1.85B

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
ViFeEdit: A Video-Free Tuner of Your Video Diffusion Transformer

📝 Summary:
Diffusion Transformers (DiTs) have demonstrated remarkable scalability and quality in image and video generation, prompting growing interest in extending them to controllable generation and editing ta...

🔹 Publication Date: Published on Mar 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.15478
• PDF: https://arxiv.org/pdf/2603.15478
• Github: https://github.com/Lexie-YU/ViFeEdit

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Grounding World Simulation Models in a Real-World Metropolis

📝 Summary:
Seoul World Model (SWM) renders video simulations of actual cities, not imagined environments. It grounds autoregressive video generation using real street-view images, overcoming data challenges. SWM generates spatially faithful, long-horizon urban videos for diverse camera paths and scenarios.

🔹 Publication Date: Published on Mar 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.15583
• PDF: https://seoul-world-model.github.io/SWM_paper.pdf
• Project Page: https://seoul-world-model.github.io/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Panoramic Affordance Prediction

📝 Summary:
Affordance prediction serves as a critical bridge between perception and action in embodied AI. However, existing research is confined to pinhole camera models, which suffer from narrow Fields of View...

🔹 Publication Date: Published on Mar 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.15558
• PDF: https://arxiv.org/pdf/2603.15558
• Project Page: https://zixinzhang02.github.io/Panoramic-Affordance-Prediction/
• Github: https://zixinzhang02.github.io/Panoramic-Affordance-Prediction/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
MMOU: A Massive Multi-Task Omni Understanding and Reasoning Benchmark for Long and Complex Real-World Videos

📝 Summary:
Multimodal Large Language Models (MLLMs) have shown strong performance in visual and audio understanding when evaluated in isolation. However, their ability to jointly reason over omni-modal (visual, ...

🔹 Publication Date: Published on Mar 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14145
• PDF: https://arxiv.org/pdf/2603.14145
• Project Page: https://huggingface.co/datasets/nvidia/MMOU

Datasets citing this paper:
https://huggingface.co/datasets/nvidia/MMOU

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Mind the Shift: Decoding Monetary Policy Stance from FOMC Statements with Large Language Models

📝 Summary:
Federal Open Market Committee (FOMC) statements are a major source of monetary-policy information, and even subtle changes in their wording can move global financial markets. A central task is therefo...

🔹 Publication Date: Published on Mar 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14313
• PDF: https://arxiv.org/pdf/2603.14313
• Project Page: https://yixuantt.github.io/DeltaConsistent/
• Github: https://github.com/yixuantt/DeltaConsistentScoring

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research