Dear friends,
I’m truly sorry for being away from the group for so long. I know: no updates so far, while AI is running faster than the speed of light.
I’m going through a very difficult time in my life and I need some space to heal. This project, a spare-time one but important to a lot of people here, needs energy and commitment I don’t have right now. I’m sorry; please be patient. I’ll be back.
Love u all,
Alessandro.
❤399👍28😢27
Hi everybody,
I took a few weeks to catch my breath from a lot of stuff. I dedicated all my mental energy to keep working and all my spare time to taking care of myself. Although I'm still not OK (BTW, my health was and is good), I feel it's time to come back and support this wonderful community on this journey. I feel that responsibility: time to get back in the ring.
I'm very sorry for being out so long, but sometimes life hits really hard. I got incredible support from strangers all around the world. It's amazing.
Thanks again, you rock!
Alessandro.
1❤198👍16🔥15👏5🍾3😢2💩1
🦖 DINOv3 is out 🦖
👉#Meta unveils DINOv3! A novel foundation model outperforming the previous SOTAs in computer vision. Code & weights released under DINOv3 License💙
👉Review https://t.ly/-S3ZL
👉Paper https://t.ly/ervOT
👉Project https://lnkd.in/dHFf3esd
👉Repo https://lnkd.in/dPxhDxAq
🤗HF https://lnkd.in/dWGudY2i
❤42🔥13👍2😍1🍾1
🤖 Impact of SuperHuman AI 🤖
👉The nonprofit AI Futures Project unveils a (dystopian) scenario about what super-AI might look like. A forecast running from today to bio-engineered human-like creatures. A fascinating speculation about the future with "slow-down" and "race" scenarios. Enjoy 💙
👉Review https://t.ly/EgmfJ
👉Project https://ai-2027.com/
❤7🔥2🤯2🤣1
🏓TOTNet: Occlusion-aware Tracking🏓
👉TOTNet: novel Temporal Occlusion Tracking Network that leverages 3D-convs, visibility-weighted loss, & occlusion augmentation to improve performance under occlusions. Code & Data under MIT💙
👉Review https://t.ly/Q0jAf
👉Paper https://lnkd.in/dUYsa-GC
👉Repo https://lnkd.in/d3QGUHYb
🔥10❤6👍1😍1
🔀Feed-Forward 4D video🔀
👉4DNeX is the first feed-forward framework for generating 4D scene representations from a single image by fine-tuning a diffusion model. It yields HQ dynamic point clouds and supports downstream tasks such as novel-view video synthesis with strong generalizability. Code/Data announced 💙
👉Review https://t.ly/SpkD-
👉Paper arxiv.org/pdf/2508.13154
👉Project https://4dnex.github.io/
👉Repo github.com/3DTopia/4DNeX
👉Data https://lnkd.in/dh4_3Ghf
👉Demo https://lnkd.in/dztyzwgg
❤10🔥7👍1
🌈DAViD: Synthetic Depth-Normal-Segmentation🌈
👉#Microsoft's DAViD: 100% synthetic dataset/models for human Depth, Normals & Segmentation. Dataset available, models & runtime under MIT💙
👉Review https://t.ly/-SlO_
👉Paper https://lnkd.in/eCmMXpTg
👉Project https://lnkd.in/eurCSWkm
👉Repo https://lnkd.in/e7PWFgP2
👍7❤6🔥3🤩1
👠 OmniTry: Virtual Try-On Anything 👠
👉OmniTry: a unified framework that extends VTON beyond garments to encompass any wearable object (jewelry, accessories, etc.) in a mask-free setting. Weights, HF demo & benchmark released💙
👉Review https://t.ly/wMBGQ
👉Paper https://lnkd.in/dQe9MchS
👉Project https://omnitry.github.io/
👉Repo https://lnkd.in/d3QwAXY2
🤗Demo https://lnkd.in/duUcZpVA
🔥15❤5😢1🤩1
📡 ROVR Open Dataset is out 📡
👉A novel large-scale open 3D dataset for autonomous driving, robotics, and 4D perception tasks. To be released for academic (free of charge) & commercial use💙
👉Review https://t.ly/iDcvg
👉Paper https://arxiv.org/pdf/2508.13977
👉Project https://xiandaguo.net/ROVR-Open-Dataset
❤12🔥4👏1
🧉 YOPO: SOTA 9-DoF Pose🧉
👉Pit In Co. unveils YOPO, a novel single-stage, query-based framework that treats category-level 9-DoF estimation as a natural extension of 2D detection. A practical solution for mono-RGB, category-level, multi-obj pose estimation. Code & models announced (coming)💙
👉Review https://t.ly/cf_Cl
👉Paper https://arxiv.org/pdf/2508.14965
👉Project mikigom.github.io/YOPO-project-page/
👉Repo TBA
❤8🔥1🤩1
🔬Intern-S1: SOTA MM-MoE 🔬
👉Intern-S1: a MM-MoE with 28B activated / 241B total parameters, continually pre-trained on 5T tokens, including 2.5T+ tokens from scientific domains. New SOTA for professional tasks such as molecular synthesis planning, reaction condition prediction, etc. Models available under Apache 2.0💙
👉Review https://t.ly/3l5UW
👉Paper arxiv.org/pdf/2508.15763
👉Repo github.com/InternLM/Intern-S1
🤗HF huggingface.co/internlm/Intern-S1
❤6🔥1
🫔ATLAS: SOTA Human Model🫔
👉#META presents ATLAS, a novel high-fidelity body model learned from 600k high-res. scans captured using 240 synchronized cams. Code announced, to be released💙
👉Review https://t.ly/0hHud
👉Paper arxiv.org/pdf/2508.15767
👉Project jindapark.github.io/projects/atlas/
👉Repo TBA
🔥7❤6👏1😍1
🧤Diffusive Hand from Signs🧤
👉LIGM + #NVIDIA unveil a novel generative model of 3D hand motions from Sign Language Data. It covers motion characteristics such as handshapes, locations, and finger, hand & arm movements. Code, Models & Data to be released 💙
👉Review https://t.ly/HonX_
👉Paper https://arxiv.org/pdf/2508.15902
👉Project https://imagine.enpc.fr/~leore.bensabath/HandMDM/
👉Data drive.google.com/drive/u/1/folders/1BLsu2hAqhAJ_gnGb9TNXW7MLiSuSEzEj
👉Repo TBA
❤4🔥3👍2🤯1
🏎️ VROOM: F1 Reconstruction 🏎️
👉Berkeley unveils VROOM, the first attempt at reconstructing 3D models of #Formula1 circuits using only onboard camera footage from racecars. Extreme challenges due to noise & speed. Repo released💙
👉Review https://t.ly/uuHdT
👉Paper arxiv.org/pdf/2508.17172
👉Repo github.com/yajatyadav/vroom
👉Project varun-bharadwaj.github.io/vroom/
1❤18🔥5👏1
🥶 OmniHuman-1.5 🥶
👉#ByteDance proposes a novel framework designed to generate character animations that are not only physically plausible but also semantically coherent and expressive. The animations stay coherent with the speech's rhythm, prosody and semantic content. Impressive results but no code 🥺
👉Review https://t.ly/CnRmX
👉Paper arxiv.org/pdf/2508.19209
👉Project omnihuman-lab.github.io/v1_5/
👉Repo 🥺
❤4🤯2👍1🔥1
⚽SoccerNet 2025 results!⚽
👉The SoccerNet 2025 Challenges are an open benchmarking effort dedicated to advancing computer vision research in football video understanding. Repo available 💙
👉Review https://t.ly/MfHKg
👉Paper https://arxiv.org/pdf/2508.19182
👉Project https://www.soccer-net.org/
👉Repo https://github.com/SoccerNet
❤15🔥6👏1
🌹ROSE: Remove Objects & Effects🌹
👉Removes an object together with its effects on the environment: shadows, reflections, light, translucency and mirror effects. Model, Demo & Dataset available via Hugging Face💙
👉Review https://t.ly/_KFM0
👉Paper https://lnkd.in/dNcTXQAE
👉Project https://lnkd.in/dFGmYT5h
👉Model https://lnkd.in/dhTT-VkN
👉Demo https://lnkd.in/dimgXZT6
👉Data https://lnkd.in/da7Jv667
❤15👍3😍2🔥1
🉐 Dress-up & Dance 🉐
👉Novel diffusion framework that generates HQ 5-second-long 24 FPS VTON videos at 1152×720 of a user wearing desired garments while moving in accordance with a given reference video. Impressive results but no repo🥺
👉Review https://t.ly/7NeTL
👉Paper arxiv.org/pdf/2508.21070
👉Project immortalco.github.io/DressAndDance/
👉Repo 🥺
❤8🔥2👏1
🌈 Multi-View 3D Tracking 🌈
👉MVTracker is the first data-driven multi-view 3D point tracker for tracking arbitrary 3D points across multiple cameras. Repo available💙
👉Review https://t.ly/rISMR
👉Paper arxiv.org/pdf/2508.21060
👉Project https://lnkd.in/drHtAmRC
👉Repo https://lnkd.in/d4k8mg3B
❤10🔥5👍1