ML Research Hub
32.9K subscribers
4.45K photos
273 videos
23 files
4.81K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

📝 Summary:
NeoVerse is a 4D world model for reconstruction and video generation. It scales to in-the-wild monocular videos using pose-free feed-forward reconstruction and online degradation simulation, achieving state-of-the-art performance.

🔹 Publication Date: Published on Jan 1, 2026

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.00393
• PDF: https://arxiv.org/pdf/2601.00393
• Project Page: https://neoverse-4d.github.io/

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#4DWorldModel #VideoGeneration #ComputerVision #DeepLearning #AI
AdaGaR: Adaptive Gabor Representation for Dynamic Scene Reconstruction

📝 Summary:
AdaGaR reconstructs dynamic 3D scenes from monocular video. It introduces an Adaptive Gabor Representation for detail and stability, and Cubic Hermite Splines for temporal continuity. This method achieves state-of-the-art performance.
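The summary credits Cubic Hermite Splines for temporal continuity. As a rough illustration of that general technique (not AdaGaR's implementation), the sketch below interpolates a scalar quantity between keyframes with the standard cubic Hermite basis. The function names, the scalar keyframes, and the per-keyframe tangents are all assumptions made for this example.

```python
def hermite(p0, p1, m0, m1, s):
    """Cubic Hermite interpolation on the unit interval.

    p0, p1: values at s=0 and s=1; m0, m1: tangents at s=0 and s=1.
    """
    s2 = s * s
    s3 = s2 * s
    h00 = 2 * s3 - 3 * s2 + 1   # weight of p0
    h10 = s3 - 2 * s2 + s       # weight of m0
    h01 = -2 * s3 + 3 * s2      # weight of p1
    h11 = s3 - s2               # weight of m1
    return h00 * p0 + h10 * m0 + h01 * p1 + h11 * m1


def eval_spline(times, values, tangents, t):
    """Evaluate a piecewise cubic Hermite spline at time t.

    times must be strictly increasing; tangents are derivatives with
    respect to time, so they are rescaled by each segment's length.
    """
    for i in range(len(times) - 1):
        if times[i] <= t <= times[i + 1]:
            dt = times[i + 1] - times[i]
            s = (t - times[i]) / dt
            return hermite(values[i], values[i + 1],
                           dt * tangents[i], dt * tangents[i + 1], s)
    raise ValueError("t lies outside the keyframe range")
```

Because neighbouring segments share both the value and the tangent at each keyframe, the resulting curve is continuous in position and first derivative, which is the kind of smoothness over time that the summary refers to.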

🔹 Publication Date: Published on Jan 2, 2026

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.00796
• PDF: https://arxiv.org/pdf/2601.00796
• Project Page: https://jiewenchan.github.io/AdaGaR/

#3DReconstruction #ComputerVision #DynamicScenes #MonocularVideo #GaborRepresentation
OmniVCus: Feedforward Subject-driven Video Customization with Multimodal Control Conditions

📝 Summary:
OmniVCus introduces a system for feedforward multi-subject video customization with multimodal controls. It proposes a data pipeline, VideoCus-Factory, and a diffusion Transformer framework with novel embedding mechanisms. This enables more subjects and precise editing, significantly outperformin...

🔹 Publication Date: Published on Jun 29, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2506.23361
• PDF: https://arxiv.org/pdf/2506.23361
• Project Page: https://caiyuanhao1998.github.io/project/OmniVCus/
• Github: https://github.com/caiyuanhao1998/Open-OmniVCus

🔹 Models citing this paper:
https://huggingface.co/CaiYuanhao/OmniVCus

Datasets citing this paper:
https://huggingface.co/datasets/CaiYuanhao/OmniVCus
https://huggingface.co/datasets/CaiYuanhao/OmniVCus-Test
https://huggingface.co/datasets/CaiYuanhao/OmniVCus-Train

#VideoGeneration #DiffusionModels #MultimodalAI #DeepLearning #ComputerVision
DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer

📝 Summary:
DreamID-V is a novel video face swapping framework that uses diffusion transformers and curriculum learning. It achieves superior identity preservation and visual realism by bridging the image-to-video gap, outperforming existing methods and enhancing temporal consistency.

🔹 Publication Date: Published on Jan 4, 2026

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.01425
• PDF: https://arxiv.org/pdf/2601.01425
• Project Page: https://guoxu1233.github.io/DreamID-V/

#FaceSwapping #DiffusionModels #ComputerVision #GenerativeAI #VideoAI
DiffProxy: Multi-View Human Mesh Recovery via Diffusion-Generated Dense Proxies

📝 Summary:
DiffProxy generates multi-view consistent human proxies using diffusion models to improve human mesh recovery. This bridges synthetic training and real-world generalization, achieving state-of-the-art performance on real benchmarks.

🔹 Publication Date: Published on Jan 5, 2026

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02267
• PDF: https://arxiv.org/pdf/2601.02267
• Project Page: https://wrk226.github.io/DiffProxy.html
• Github: https://github.com/wrk226/DiffProxy

#HumanMeshRecovery #DiffusionModels #ComputerVision #DeepLearning #AI
Prithvi-Complimentary Adaptive Fusion Encoder (CAFE): unlocking full-potential for flood inundation mapping

📝 Summary:
Prithvi-CAFE improves flood mapping by integrating a pretrained Geo-Foundation Model encoder with a parallel CNN branch featuring attention modules. This hybrid approach effectively captures both global context and critical local details, achieving state-of-the-art results on Sen1Flood11 and Floo...

🔹 Publication Date: Published on Jan 5, 2026

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02315
• PDF: https://arxiv.org/pdf/2601.02315
• Github: https://github.com/Sk-2103/Prithvi-CAFE

#FloodMapping #DeepLearning #GeoAI #RemoteSensing #ComputerVision
ExposeAnyone: Personalized Audio-to-Expression Diffusion Models Are Robust Zero-Shot Face Forgery Detectors

📝 Summary:
ExposeAnyone is a self-supervised diffusion model for deepfake detection that personalizes to subjects and uses reconstruction errors to measure identity distance. It significantly outperforms prior methods on unseen manipulations, including Sora2 videos, and is robust to real-world corruptions.

🔹 Publication Date: Published on Jan 5, 2026

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02359
• PDF: https://arxiv.org/pdf/2601.02359
• Project Page: https://mapooon.github.io/ExposeAnyonePage/

Datasets citing this paper:
https://huggingface.co/datasets/mapooon/S2CFP

#DeepfakeDetection #DiffusionModels #ComputerVision #AITechnology #ForgeryDetection
RGS-SLAM: Robust Gaussian Splatting SLAM with One-Shot Dense Initialization

📝 Summary:
RGS-SLAM is a robust Gaussian-splatting SLAM framework that uses a one-shot, correspondence-to-Gaussian initialization with DINOv3 descriptors. This method improves stability, accelerates convergence, and yields higher rendering fidelity and accuracy compared to existing systems.

🔹 Publication Date: Published on Dec 28, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.00705
• PDF: https://arxiv.org/pdf/2601.00705

#SLAM #GaussianSplatting #ComputerVision #Robotics #DeepLearning
Gen3R: 3D Scene Generation Meets Feed-Forward Reconstruction

📝 Summary:
Gen3R combines reconstruction and video diffusion models to generate 3D scenes. It produces RGB videos and 3D geometry by aligning geometric and appearance latents. This achieves state-of-the-art results and improves reconstruction robustness.

🔹 Publication Date: Published on Jan 7, 2026

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04090
• PDF: https://arxiv.org/pdf/2601.04090
• Project Page: https://xdimlab.github.io/Gen3R/

#3DGeneration #DiffusionModels #ComputerVision #3DReconstruction #DeepLearning
RL-AWB: Deep Reinforcement Learning for Auto White Balance Correction in Low-Light Night-time Scenes

📝 Summary:
RL-AWB is a novel framework combining statistical methods with deep reinforcement learning for improved nighttime auto white balance. It is the first RL approach for color constancy, mimicking expert tuning. This method shows superior generalization across various lighting conditions, and a new m...
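The summary refers to statistical methods for white balance without naming them. For readers unfamiliar with that family, the sketch below shows the classical gray-world estimator, which assumes the average scene color is neutral gray. It is offered only as a familiar example of a statistical color-constancy method, not as the estimator RL-AWB uses; the function names and the list-of-tuples image format are assumptions for this example.

```python
def gray_world_gains(pixels):
    """Estimate per-channel gains under the gray-world assumption.

    pixels: non-empty list of (r, g, b) tuples with linear intensities.
    Returns gains that make the three channel means equal.
    """
    n = len(pixels)
    mean_r = sum(p[0] for p in pixels) / n
    mean_g = sum(p[1] for p in pixels) / n
    mean_b = sum(p[2] for p in pixels) / n
    if min(mean_r, mean_g, mean_b) <= 0:
        raise ValueError("every channel needs a positive mean")
    gray = (mean_r + mean_g + mean_b) / 3.0  # target neutral level
    return (gray / mean_r, gray / mean_g, gray / mean_b)


def apply_gains(pixels, gains):
    """Scale each channel of every pixel by its gain."""
    gr, gg, gb = gains
    return [(r * gr, g * gg, b * gb) for r, g, b in pixels]
```

Applying the returned gains equalizes the channel means of the image. Estimators of this kind tend to be fragile when a scene is dominated by one color or lit by several colored sources, which is common at night.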

🔹 Publication Date: Published on Jan 8, 2026

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05249
• PDF: https://arxiv.org/pdf/2601.05249
• Project Page: https://ntuneillee.github.io/research/rl-awb/

#ReinforcementLearning #ComputerVision #ImageProcessing #AutoWhiteBalance #LowLightImaging