ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
3DreamBooth: High-Fidelity 3D Subject-Driven Video Generation Model

📝 Summary:
This novel framework enables 3D-aware video customization by decoupling spatial geometry from temporal motion using 1-frame optimization to build robust 3D priors. It also incorporates a visual conditioning module for enhanced texture generation and faster convergence.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18524
• PDF: https://arxiv.org/pdf/2603.18524
• Project Page: https://ko-lani.github.io/3DreamBooth
• GitHub: https://github.com/Ko-Lani/3DreamBooth

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
MonoArt: Progressive Structural Reasoning for Monocular Articulated 3D Reconstruction

📝 Summary:
MonoArt presents a unified framework for reconstructing articulated 3D objects from single images through progressive structural reasoning that enables stable articulation inference without external t...

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19231
• PDF: https://arxiv.org/pdf/2603.19231
• Project Page: https://lihaitian.com/MonoArt/
• GitHub: https://github.com/Quest4Science/MonoArt

#AI #DataScience #MachineLearning #HuggingFace #Research
ReactMotion: Generating Reactive Listener Motions from Speaker Utterance

📝 Summary:
This paper introduces ReactMotion, a framework for generating natural listener body motions that react appropriately to speaker utterances. It uses a large dataset and preference-based training to create diverse, realistic responses, outperforming prior methods.

🔹 Publication Date: Published on Mar 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.15083
• PDF: https://arxiv.org/pdf/2603.15083

#AI #MachineLearning #HumanComputerInteraction #GenerativeAI #ComputerAnimation
VID-AD: A Dataset for Image-Level Logical Anomaly Detection under Vision-Induced Distraction

📝 Summary:
VID-AD is a dataset for logical anomaly detection in industrial inspection, specifically addressing challenges from visual distractions. A new language-based framework is also proposed, which uses text descriptions and contrastive learning to capture logical attributes.

🔹 Publication Date: Published on Mar 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.13964
• PDF: https://arxiv.org/pdf/2603.13964
• GitHub: https://github.com/nkthiroto/VID-AD

#AnomalyDetection #IndustrialInspection #ComputerVision #MachineLearning #Datasets
Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD

📝 Summary:
Discrete Moment Matching Distillation (D-MMD) enables effective distillation of discrete diffusion models by adapting continuous-domain techniques, achieving superior performance compared to previous ...

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.20155
• PDF: https://arxiv.org/pdf/2603.20155

#AI #DataScience #MachineLearning #HuggingFace #Research
How Well Does Generative Recommendation Generalize?

📝 Summary:
Generative recommendation models excel at generalization tasks while item ID-based models perform better at memorization, with a complementary approach showing improved recommendation performance thro...

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19809
• PDF: https://arxiv.org/pdf/2603.19809
• GitHub: https://github.com/Jamesding000/MemGen-GR

#AI #DataScience #MachineLearning #HuggingFace #Research
Teaching an Agent to Sketch One Part at a Time

📝 Summary:
Researchers developed an agent that generates vector sketches incrementally, one part at a time. It uses a multi-modal language model and process-reward reinforcement learning with a new part-annotated dataset. This enables controllable and editable text-to-vector sketch generation.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19500
• PDF: https://arxiv.org/pdf/2603.19500

#AI #GenerativeAI #MachineLearning #ComputerVision #ReinforcementLearning
AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science

📝 Summary:
AgentDS benchmark evaluates AI agents and human-AI collaboration in domain-specific data science tasks, revealing continued necessity of human expertise despite advances in large language models and A...

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19005
• PDF: https://arxiv.org/pdf/2603.19005
• Project Page: https://agentds.org/

Datasets citing this paper:
https://huggingface.co/datasets/lainmn/AgentDS-Insurance
https://huggingface.co/datasets/lainmn/AgentDS-RetailBanking
https://huggingface.co/datasets/lainmn/AgentDS-Manufacturing

#AI #DataScience #MachineLearning #HuggingFace #Research
EgoForge: Goal-Directed Egocentric World Simulator

📝 Summary:
EgoForge is an egocentric goal-directed world simulator that generates coherent first-person video rollouts from minimal static inputs using trajectory-level reward-guided refinement during diffusion ...

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.20169
• PDF: https://arxiv.org/pdf/2603.20169
• Project Page: https://plan-lab.github.io/projects/egoforge

#AI #DataScience #MachineLearning #HuggingFace #Research
HiMu: Hierarchical Multimodal Frame Selection for Long Video Question Answering

📝 Summary:
HiMu is a training-free framework for long video QA. It efficiently selects relevant frames using hierarchical query decomposition with lightweight multimodal experts, preserving temporal and cross-modal structure. HiMu advances the efficiency-accuracy Pareto front.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18558
• PDF: https://arxiv.org/pdf/2603.18558
• Project Page: https://danbenami.github.io/HiMu.io/
• GitHub: https://github.com/DanBenAmi/HiMu

#VideoQA #MultimodalAI #ComputerVision #MachineLearning #AI
Deep Tabular Research via Continual Experience-Driven Execution

📝 Summary:
This paper introduces Deep Tabular Research (DTR), an agentic framework for complex tabular reasoning. It constructs a hierarchical meta-graph, uses expectation-aware path selection, and refines iteratively via siamese structured memory, highlighting the importance of separating planning from execu...

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09151
• PDF: https://arxiv.org/pdf/2603.09151

#DeepLearning #TabularData #AI #MachineLearning #AIagents
s2n-bignum-bench: A practical benchmark for evaluating low-level code reasoning of LLMs

📝 Summary:
s2n-bignum-bench is a new benchmark evaluating LLMs on formal proof synthesis for industrial cryptographic assembly routines. It bridges the gap between competition math and real-world verification by requiring LLMs to generate HOL Light proofs for AWS s2n-bignum library code.

🔹 Publication Date: Published on Mar 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14628
• PDF: https://arxiv.org/pdf/2603.14628
• Project Page: https://kings-crown.github.io/s2n-bignum-leaderboard/
• GitHub: https://github.com/kings-crown/s2n-bignum-bench

#AI #DataScience #MachineLearning #HuggingFace #Research
Do VLMs Need Vision Transformers? Evaluating State Space Models as Vision Encoders

📝 Summary:
State space models demonstrate competitive performance as vision backbones for vision-language models, matching or exceeding transformer-based architectures while operating at smaller scales and requi...

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19209
• PDF: https://arxiv.org/pdf/2603.19209
• Project Page: https://lab-spell.github.io/vlm-ssm-vision-encoders/
• GitHub: https://github.com/raykuo18/vlm-ssm-vision-encoders

#AI #DataScience #MachineLearning #HuggingFace #Research
TAPESTRY: From Geometry to Appearance via Consistent Turntable Videos

📝 Summary:
TAPESTRY generates high-fidelity 360-degree turntable videos conditioned on 3D geometry, enabling consistent texture synthesis and neural rendering for complete 3D asset creation.

🔹 Publication Date: Published on Mar 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.17735
• PDF: https://arxiv.org/pdf/2603.17735
• Project Page: https://zerone182.github.io/TAPESTRY/

#AI #DataScience #MachineLearning #HuggingFace #Research