Hi everybody,
I took a few weeks to catch my breath from a lot of stuff. I dedicated all my mental energy to keeping up with work and all my spare time to taking care of myself. Although I'm still not OK (BTW, my health was and is good), I feel it's time to come back and support this wonderful community on this journey. I feel that responsibility: time to get back in the ring.
I'm very sorry for being away so long, but sometimes life hits really hard. I got incredible support from strangers all around the world. It's amazing.
Thanks again, you rock!
Alessandro.

DINOv3 is out
#Meta unveils DINOv3! A novel foundation model outperforming the previous SOTAs in computer vision. Code & weights released under DINOv3 License
Review https://t.ly/-S3ZL
Paper https://t.ly/ervOT
Project https://lnkd.in/dHFf3esd
Repo https://lnkd.in/dPxhDxAq
HF https://lnkd.in/dWGudY2i

Impact of SuperHuman AI
The nonprofit AI Futures Project unveils a (dystopian) scenario about what super-AI might look like. A forecast from today to bio-engineered human-like creatures. A fascinating speculation about the future with the "slow-down" and "race" scenarios. Enjoy
Review https://t.ly/EgmfJ
Project https://ai-2027.com/

TOTNet: Occlusion-aware Tracking
TOTNet: novel Temporal Occlusion Tracking Network that leverages 3D-convs, visibility-weighted loss, & occlusion augmentation to improve performance under occlusions. Code & Data under MIT
Review https://t.ly/Q0jAf
Paper https://lnkd.in/dUYsa-GC
Repo https://lnkd.in/d3QGUHYb

Feed-Forward 4D video
4DNeX is the first feed-forward framework for generating 4D scene representations from a single image by fine-tuning a diffusion model. HQ dynamic pt-clouds enable downstream tasks such as novel-view video synthesis, with strong generalizability. Code/Data announced
Review https://t.ly/SpkD-
Paper arxiv.org/pdf/2508.13154
Project https://4dnex.github.io/
Repo github.com/3DTopia/4DNeX
Data https://lnkd.in/dh4_3Ghf
Demo https://lnkd.in/dztyzwgg

DAViD: Synthetic Depth-Normal-Segmentation
#Microsoft's DAViD: 100% synthetic dataset/models for human Depth, Normals & Segmentation. Dataset available, models & runtime under MIT
Review https://t.ly/-SlO_
Paper https://lnkd.in/eCmMXpTg
Project https://lnkd.in/eurCSWkm
Repo https://lnkd.in/e7PWFgP2

OmniTry: Virtual Try-On Anything
OmniTry: unified framework that extends VTON beyond garments to encompass any wearable object (jewelry, accessories, etc.) in a mask-free setting. Weights, HF demo & benchmark released
Review https://t.ly/wMBGQ
Paper https://lnkd.in/dQe9MchS
Project https://omnitry.github.io/
Repo https://lnkd.in/d3QwAXY2
Demo https://lnkd.in/duUcZpVA

ROVR Open Dataset is out
A novel large-scale open 3D dataset for autonomous driving, robotics, and 4D perception tasks. To be released for academic (free) & commercial use
Review https://t.ly/iDcvg
Paper https://arxiv.org/pdf/2508.13977
Project https://xiandaguo.net/ROVR-Open-Dataset

YOPO: SOTA 9-DoF Pose
Pit In Co. unveils YOPO, a novel single-stage, query-based framework that treats category-level 9-DoF estimation as a natural extension of 2D detection. A practical solution for mono-RGB, category-level, multi-obj pose estimation. Code & models announced (coming)
Review https://t.ly/cf_Cl
Paper https://arxiv.org/pdf/2508.14965
Project mikigom.github.io/YOPO-project-page/
Repo TBA

Intern-S1: SOTA MM-MoE
Intern-S1: an MM-MoE with 28B activated / 241B total parameters, continually pre-trained on 5T tokens, including 2.5T+ tokens from scientific domains. New SOTA for professional tasks, such as molecular synthesis planning, reaction condition prediction, etc. Models available under Apache 2.0
Review https://t.ly/3l5UW
Paper arxiv.org/pdf/2508.15763
Repo github.com/InternLM/Intern-S1
HF huggingface.co/internlm/Intern-S1

ATLAS: SOTA Human Model
#META presents ATLAS, a novel high-fidelity body model learned from 600k high-res. scans captured using 240 synchronized cams. Code announced, to be released
Review https://t.ly/0hHud
Paper arxiv.org/pdf/2508.15767
Project jindapark.github.io/projects/atlas/
Repo TBA

Diffusive Hand from Signs
LIGM + #NVIDIA unveil a novel generative model of 3D hand motions from Sign Language Data. Motion characteristics such as handshapes, locations, finger, hand & arm movements. Code, Models & Data to be released
Review https://t.ly/HonX_
Paper https://arxiv.org/pdf/2508.15902
Project https://imagine.enpc.fr/~leore.bensabath/HandMDM/
Data drive.google.com/drive/u/1/folders/1BLsu2hAqhAJ_gnGb9TNXW7MLiSuSEzEj
Repo TBA

VROOM: F1 Reconstruction
Berkeley unveils VROOM, the first attempt at reconstructing 3D models of #Formula1 circuits using only onboard camera footage from race cars. Extreme challenges due to noise & speed. Repo released
Review https://t.ly/uuHdT
Paper arxiv.org/pdf/2508.17172
Repo github.com/yajatyadav/vroom
Project varun-bharadwaj.github.io/vroom/

OmniHuman-1.5
#ByteDance proposes a novel framework designed to generate character animations that are not only physically plausible but also semantically coherent and expressive. Coherent with the speech's rhythm, prosody and semantic content. Impressive results but no code
Review https://t.ly/CnRmX
Paper arxiv.org/pdf/2508.19209
Project omnihuman-lab.github.io/v1_5/
Repo not released

SoccerNet 2025 results!
SoccerNet 2025 Challenges is the open benchmarking effort dedicated to advancing computer vision research in football video understanding. Repo available
Review https://t.ly/MfHKg
Paper https://arxiv.org/pdf/2508.19182
Project https://www.soccer-net.org/
Repo https://github.com/SoccerNet

ROSE: Remove Objects & Effects
Fix the object's effects on the environment: shadows, reflections, light, translucency and mirror. Model, Demo & Dataset available via Hugging Face
Review https://t.ly/_KFM0
Paper https://lnkd.in/dNcTXQAE
Project https://lnkd.in/dFGmYT5h
Model https://lnkd.in/dhTT-VkN
Demo https://lnkd.in/dimgXZT6
Data https://lnkd.in/da7Jv667

Dress-up & Dance
Novel diffusion framework that generates HQ 5-second-long 24 FPS VTON videos at 1152×720 of a user wearing desired garments while moving in accordance with a given reference video. Impressive results but no repo
Review https://t.ly/7NeTL
Paper arxiv.org/pdf/2508.21070
Project immortalco.github.io/DressAndDance/
Repo not released

Multi-View 3D Tracking
MVTracker is the first data-driven multi-view 3D point tracker for tracking arbitrary 3D points across multiple cameras. Repo available
Review https://t.ly/rISMR
Paper arxiv.org/pdf/2508.21060
Project https://lnkd.in/drHtAmRC
Repo https://lnkd.in/d4k8mg3B

PHD: Personalized 3D Humans
ETH & #Meta unveil PHD, a novel approach for personalized 3D human mesh recovery (HMR) and body fitting that leverages user-specific shape information. Code & models to be released
Review https://t.ly/IeRhH
Paper https://arxiv.org/pdf/2508.21257
Project https://phd-pose.github.io/
Repo TBA