DINO-based Video Tracking
The Weizmann Institute announced a new SOTA in point tracking via pre-trained DINO features. Source code announced (not yet released).
Review https://t.ly/_GIMT
Paper https://lnkd.in/dsGVDcar
Project dino-tracker.github.io/
Code https://github.com/AssafSinger94/dino-tracker
T-Rex 2: a new SOTA is out!
A novel (VERY STRONG) open-set object detection model. Strong zero-shot capabilities, suitable for various scenarios with a single set of weights. Demo and Source Code released.
Review https://t.ly/fYw8D
Paper https://lnkd.in/dpmRh2zh
Project https://lnkd.in/dnR_jPcR
Code https://lnkd.in/dnZnGRUn
Demo https://lnkd.in/drDUEDYh
TinyBeauty: 460 FPS Make-up
TinyBeauty needs only 80K parameters to achieve SOTA in virtual makeup without intricate face prompts. Up to 460 FPS on mobile!
Review https://t.ly/LG5ok
Paper https://arxiv.org/pdf/2403.15033.pdf
Project https://tinybeauty.github.io/TinyBeauty/
AiOS: All-in-One-Stage Humans
An all-in-one-stage framework for SOTA expressive pose and shape recovery of multiple humans, without an additional human detection step.
Review https://t.ly/ekNd4
Paper https://arxiv.org/pdf/2403.17934.pdf
Project https://ttxskk.github.io/AiOS/
Code/Demo (announced)
MAVOS Object Segmentation
MAVOS is a transformer-based video object segmentation (VOS) method with a novel, optimized and dynamic long-term modulated cross-attention memory. Code & Models announced (BSD 3-Clause).
Review https://t.ly/SKaRG
Paper https://lnkd.in/dQyifKa3
Project github.com/Amshaker/MAVOS
ObjectDrop: automagical object removal
#Google unveils ObjectDrop, the new SOTA in photorealistic object removal and insertion. It focuses on shadows and reflections. Impressive!
Review https://t.ly/ZJ6NN
Paper https://arxiv.org/pdf/2403.18818.pdf
Project https://objectdrop.github.io/
Universal Mono Metric Depth
ETH unveils UniDepth: metric 3D scenes from single images alone, across domains. A novel, universal and flexible monocular metric depth estimation (MMDE) solution. Source code released.
Review https://t.ly/5C8eq
Paper arxiv.org/pdf/2403.18913.pdf
Code github.com/lpiccinelli-eth/unidepth
RELI11D: Multimodal Humans
RELI11D is a high-quality multimodal human motion dataset captured with LiDAR, an IMU system, an RGB camera, and an event camera. Dataset & Source Code to be released soon.
Review https://t.ly/5EG6X
Paper https://lnkd.in/ep6Utcik
Project https://lnkd.in/eDhNHYBb
ECoDepth: SOTA Diffusive Mono-Depth
New single-image depth estimation (SIDE) model using a diffusion backbone conditioned on ViT embeddings. It's the new SOTA in SIDE. Source Code released.
Review https://t.ly/s2pbB
Paper https://lnkd.in/eYt5yr_q
Code https://lnkd.in/eEcyPQcd
AI with Papers - Artificial Intelligence & Deep Learning
DINO-based Video Tracking. The Weizmann Institute announced the new SOTA in point-tracking via pre-trained DINO features. Source code announced (not yet released). Review https://t.ly/_GIMT Paper https://lnkd.in/dsGVDcar Project dino-tracker.github.io/…
GitHub
GitHub - AssafSinger94/dino-tracker: Official Pytorch Implementation for "DINO-Tracker: Taming DINO for Self-Supervised Point Tracking…
Official Pytorch Implementation for "DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video" (ECCV 2024) - AssafSinger94/dino-tracker
Gen-NeRF2NeRF Translation
GenN2N: a unified NeRF-to-NeRF translation framework for editing tasks such as text-driven NeRF editing, colorization, super-resolution, and inpainting.
Review https://t.ly/VMWAH
Paper arxiv.org/pdf/2404.02788.pdf
Project xiangyueliu.github.io/GenN2N/
Code github.com/Lxiangyue/GenN2N
iSeg: Interactive 3D Segmentation
iSeg: an interactive segmentation technique for 3D shapes that operates entirely in 3D. It accepts both positive and negative clicks directly on the shape's surface, indicating inclusion and exclusion of regions.
Review https://t.ly/tyFnD
Paper https://lnkd.in/dydAz8zp
Project https://lnkd.in/de-h6SRi
Code (coming)
Neural Bodies with Clothes
Neural-ABC is a novel parametric model based on neural implicit functions that can represent clothed human bodies with disentangled latent spaces for ID, clothing, shape, and pose.
Review https://t.ly/Un1wc
Project https://lnkd.in/dhDG6FF5
Paper https://lnkd.in/dhcfK7jZ
Code https://lnkd.in/dQvXWysP
BodyMAP: human body & pressure
#Nvidia (+CMU) unveils BodyMAP, the new SOTA in predicting body mesh (3D pose & shape) and 3D applied pressure on the human body. Source Code released, Dataset coming.
Review https://t.ly/8926S
Project bodymap3d.github.io/
Paper https://lnkd.in/gCxH4ev3
Code https://lnkd.in/gaifdy3q
XComposer2: 4K Vision-Language
InternLM-XComposer2-4KHD brings LVLM resolution capabilities up to 4K HD (3840×1600) and beyond. Authors: Shanghai AI Lab, CUHK, SenseTime & Tsinghua. Source Code & Models released.
Review https://t.ly/GCHsz
Paper arxiv.org/pdf/2404.06512.pdf
Code github.com/InternLM/InternLM-XComposer
Flying w/ Photons: Neural Render
A novel neural rendering technique that synthesizes videos of light propagating through a scene from novel, moving camera viewpoints. Picosecond time resolution!
Review https://t.ly/ZqL3a
Paper arxiv.org/pdf/2404.06493.pdf
Project anaghmalik.com/FlyingWithPhotons/
Code github.com/anaghmalik/FlyingWithPhotons
Tracking Any 2D Pixels in 3D
SpatialTracker lifts 2D pixels to 3D using monocular depth, represents the 3D content of each frame efficiently using a triplane representation, and performs iterative updates using a transformer to estimate 3D trajectories.
Review https://t.ly/B28Cj
Paper https://lnkd.in/d8ers_nm
Project https://lnkd.in/deHjtZuE
Code https://lnkd.in/dMe3TvFT
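A minimal sketch of the first step only, lifting 2D pixels to 3D from a depth value, assuming a standard pinhole camera with known intrinsics. This is not SpatialTracker's code: the function names, the intrinsics (fx, fy, cx, cy) and the sample depths are hypothetical, and the triplane encoding and transformer updates are not shown.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def unproject_pixel(u: float, v: float, depth: float,
                    fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Lift pixel (u, v) with depth `depth` to a camera-frame 3D point (pinhole model)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def lift_track(pixels: Sequence[Tuple[float, float]], depths: Sequence[float],
               fx: float, fy: float, cx: float, cy: float) -> List[Point3D]:
    """Lift a 2D track (one pixel per frame) to 3D using the depth sampled at each pixel."""
    if len(pixels) != len(depths):
        raise ValueError("need exactly one depth value per tracked pixel")
    return [unproject_pixel(u, v, d, fx, fy, cx, cy)
            for (u, v), d in zip(pixels, depths)]


if __name__ == "__main__":
    # Hypothetical intrinsics for a 640x480 camera (assumption, not from the paper).
    fx = fy = 500.0
    cx, cy = 320.0, 240.0
    track_2d = [(320.0, 240.0), (420.0, 240.0)]  # pixel positions in two frames
    depths = [2.0, 2.5]                          # e.g. read from a monocular depth map
    for point in lift_track(track_2d, depths, fx, fy, cx, cy):
        print(point)
```

In practice the depth would come from a monocular depth estimator, so the lifted points are only as accurate as that estimate.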
YOLO-CIANNA: Neural Astro
CIANNA is a general-purpose deep learning framework aimed primarily, but not only, at astronomical data analysis. Source Code released.
Review https://t.ly/441XS
Paper arxiv.org/pdf/2402.05925.pdf
Code github.com/Deyht/CIANNA
Wiki github.com/Deyht/CIANNA/wiki
Neuro MusculoSkeletal-MANO
SJTU unveils MusculoSkeletal-MANO, a novel musculoskeletal system with a learnable parametric hand model. Source Code announced.
Review https://t.ly/HOQrn
Paper arxiv.org/pdf/2404.10227.pdf
Project https://ms-mano.robotflow.ai/
Code announced (no repo yet)
SoccerNET: Athlete Tracking
SoccerNet Challenge is a novel high-level computer vision task specific to sports analytics. It aims at recognizing the state of a sports game, i.e., identifying and localizing all individuals (players, referees, etc.) on the field.
Review https://t.ly/Mdu9s
Paper arxiv.org/pdf/2404.11335.pdf
Code github.com/SoccerNet/sn-gamestate