TinyBeauty: 460 FPS Make-up
TinyBeauty: only 80K parameters to reach SOTA in virtual makeup, without intricate face prompts. Up to 460 FPS on mobile!
Review https://t.ly/LG5ok
Paper https://arxiv.org/pdf/2403.15033.pdf
Project https://tinybeauty.github.io/TinyBeauty/
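As a rough back-of-the-envelope check on those two numbers, the Python sketch below converts 460 FPS into a per-frame time budget and 80K parameters into a weight footprint. It assumes exactly 80,000 parameters stored as 32-bit floats; the post states neither the exact count nor the storage format, and both function names are made up for illustration.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def weights_size_kib(num_params: int, bytes_per_param: int) -> float:
    """Size of the raw weights in KiB (1 KiB = 1024 bytes)."""
    return num_params * bytes_per_param / 1024


# 460 FPS is the figure quoted in the post.
budget = frame_budget_ms(460)

# ASSUMPTION: "80K" taken as exactly 80,000 parameters, 4 bytes each (float32).
size = weights_size_kib(80_000, 4)

print(f"per-frame budget: {budget:.2f} ms")
print(f"weights: {size:.1f} KiB")
```

Under these assumptions the model has a bit over 2 ms per frame, and its weights fit in roughly 312.5 KiB.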

AiOS: All-in-One-Stage Humans
All-in-one-stage framework for SOTA expressive pose and shape recovery of multiple humans, without an additional human detection step.
Review https://t.ly/ekNd4
Paper https://arxiv.org/pdf/2403.17934.pdf
Project https://ttxskk.github.io/AiOS/
Code/Demo (announced)

MAVOS Object Segmentation
MAVOS is a transformer-based VOS w/ a novel, optimized and dynamic long-term modulated cross-attention memory. Code & Models announced (BSD 3-Clause)
Review https://t.ly/SKaRG
Paper https://lnkd.in/dQyifKa3
Project github.com/Amshaker/MAVOS

ObjectDrop: automagical objects removal
#Google unveils ObjectDrop, the new SOTA in photorealistic object removal and insertion. Focus on shadows and reflections, impressive!
Review https://t.ly/ZJ6NN
Paper https://arxiv.org/pdf/2403.18818.pdf
Project https://objectdrop.github.io/

Universal Mono Metric Depth
ETH unveils UniDepth: metric 3D scenes from single images alone, across domains. A novel, universal and flexible MMDE (monocular metric depth estimation) solution. Source code released
Review https://t.ly/5C8eq
Paper arxiv.org/pdf/2403.18913.pdf
Code github.com/lpiccinelli-eth/unidepth

RELI11D: Multimodal Humans
RELI11D is a high-quality multimodal human motion dataset involving LiDAR, an IMU system, an RGB camera, and an event camera. Dataset & Source Code to be released soon
Review https://t.ly/5EG6X
Paper https://lnkd.in/ep6Utcik
Project https://lnkd.in/eDhNHYBb

ECoDepth: SOTA Diffusive Mono-Depth
New SIDE (single image depth estimation) model using a diffusion backbone conditioned on ViT embeddings. It's the new SOTA in SIDE. Source Code released
Review https://t.ly/s2pbB
Paper https://lnkd.in/eYt5yr_q
Code https://lnkd.in/eEcyPQcd

AI with Papers - Artificial Intelligence & Deep Learning
DINO-based Video Tracking
The Weizmann Institute announced the new SOTA in point-tracking via pre-trained DINO features. Source code announced (not yet released)
Review https://t.ly/_GIMT
Paper https://lnkd.in/dsGVDcar
Project dino-tracker.github.io/…
GitHub - AssafSinger94/dino-tracker: Official PyTorch implementation for “DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video” (ECCV 2024)

Gen-NeRF2NeRF Translation
GenN2N: unified NeRF-to-NeRF translation for editing tasks such as text-driven NeRF editing, colorization, super-resolution, inpainting, etc.
Review https://t.ly/VMWAH
Paper arxiv.org/pdf/2404.02788.pdf
Project xiangyueliu.github.io/GenN2N/
Code github.com/Lxiangyue/GenN2N

iSeg: Interactive 3D Segmentation
iSeg: interactive segmentation technique for 3D shapes operating entirely in 3D. It accepts both positive and negative clicks directly on the shape's surface, indicating inclusion and exclusion of regions.
Review https://t.ly/tyFnD
Paper https://lnkd.in/dydAz8zp
Project https://lnkd.in/de-h6SRi
Code (coming)

Neural Bodies with Clothes
Neural-ABC is a novel parametric model based on neural implicit functions that can represent clothed human bodies with disentangled latent spaces for ID, clothing, shape, and pose.
Review https://t.ly/Un1wc
Project https://lnkd.in/dhDG6FF5
Paper https://lnkd.in/dhcfK7jZ
Code https://lnkd.in/dQvXWysP

BodyMAP: human body & pressure
#Nvidia (+CMU) unveils BodyMAP, the new SOTA in predicting body mesh (3D pose & shape) and 3D applied pressure on the human body. Source Code released, Dataset coming
Review https://t.ly/8926S
Project bodymap3d.github.io/
Paper https://lnkd.in/gCxH4ev3
Code https://lnkd.in/gaifdy3q

XComposer2: 4K Vision-Language
InternLM-XComposer2-4KHD brings LVLM resolution capabilities up to 4K HD (3840×1600) and beyond. Authors: Shanghai AI Lab, CUHK, SenseTime & Tsinghua. Source Code & Models released
Review https://t.ly/GCHsz
Paper arxiv.org/pdf/2404.06512.pdf
Code github.com/InternLM/InternLM-XComposer

Flying w/ Photons: Neural Render
Novel neural rendering technique that seeks to synthesize videos of light propagating through a scene from novel, moving camera viewpoints. Picosecond time resolution!
Review https://t.ly/ZqL3a
Paper arxiv.org/pdf/2404.06493.pdf
Project anaghmalik.com/FlyingWithPhotons/
Code github.com/anaghmalik/FlyingWithPhotons
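To give a feel for what picosecond resolution means, the short Python sketch below computes how far light travels in a given number of picoseconds. It only uses the defined speed of light in vacuum; the time steps are illustrative values, not figures from the paper, and the function name is made up here.

```python
# Speed of light in vacuum, metres per second (exact by SI definition).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def light_travel_mm(picoseconds: float) -> float:
    """Distance light covers in vacuum during the given time, in millimetres."""
    seconds = picoseconds * 1e-12
    return SPEED_OF_LIGHT_M_PER_S * seconds * 1e3


# ASSUMPTION: illustrative time steps only, not taken from the paper.
for ps in (1, 10, 100):
    print(f"{ps:>3} ps -> {light_travel_mm(ps):.2f} mm")
```

In one picosecond light covers about 0.3 mm, which is why a time axis this fine makes propagation through a room-sized scene visible frame by frame.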

Tracking Any 2D Pixels in 3D
SpatialTracker lifts 2D pixels to 3D using monocular depth, represents the 3D content of each frame efficiently using a triplane representation, and performs iterative updates using a transformer to estimate 3D trajectories.
Review https://t.ly/B28Cj
Paper https://lnkd.in/d8ers_nm
Project https://lnkd.in/deHjtZuE
Code https://lnkd.in/dMe3TvFT
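The first step, lifting 2D pixels to 3D with a depth value, can be sketched with the standard pinhole camera model. This is a minimal illustration and not SpatialTracker's code: the intrinsics (fx, fy, cx, cy), the sample track and the function name are all hypothetical, and the triplane and transformer stages are not shown.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def unproject_pixel(u: float, v: float, depth: float,
                    fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Back-project pixel (u, v) with depth along the optical axis
    into camera-space coordinates (x, y, z) using a pinhole model."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# ASSUMPTION: hypothetical intrinsics for a 640x480 camera.
fx = fy = 500.0
cx, cy = 320.0, 240.0

# ASSUMPTION: a made-up 2D track, one (u, v, depth in metres) per frame.
track_2d = [(320.0, 240.0, 2.0), (370.0, 240.0, 2.0), (420.0, 290.0, 4.0)]

track_3d: List[Point3D] = [
    unproject_pixel(u, v, d, fx, fy, cx, cy) for (u, v, d) in track_2d
]

for point in track_3d:
    print(point)
```

The pixel at the principal point maps onto the optical axis, and the same pixel offset gives a larger metric offset as depth grows, which is the information a purely 2D tracker does not have.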

YOLO-CIANNA: Neural Astro
CIANNA is a general-purpose deep learning framework for (but not limited to) astronomical data analysis. Source Code released
Review https://t.ly/441XS
Paper arxiv.org/pdf/2402.05925.pdf
Code github.com/Deyht/CIANNA
Wiki github.com/Deyht/CIANNA/wiki

Neuro MusculoSkeletal-MANO
SJTU unveils MusculoSkeletal-MANO, a novel musculoskeletal system with a learnable parametric hand model. Source Code announced
Review https://t.ly/HOQrn
Paper arxiv.org/pdf/2404.10227.pdf
Project https://ms-mano.robotflow.ai/
Code announced (no repo yet)

SoccerNet: Athlete Tracking
SoccerNet Challenge is a novel high-level computer vision task specific to sports analytics. It aims at recognizing the state of a sports game, i.e., identifying and localizing all individuals (players, referees, ...) on the field.
Review https://t.ly/Mdu9s
Paper arxiv.org/pdf/2404.11335.pdf
Code github.com/SoccerNet/sn-gamestate

Articulated Objs from MonoClips
REACTO is the new SOTA in reconstructing general articulated 3D objects from a single monocular video
Review https://t.ly/REuM8
Paper https://lnkd.in/d6PWagij
Project https://lnkd.in/dpg3x4tm
Repo https://lnkd.in/dRZWj6_N

All You Need is SAM (+Flow)
Oxford unveils the new SOTA for moving object segmentation via SAM + Optical Flow. Two novel models & Source Code announced
Review https://t.ly/ZRYtp
Paper https://lnkd.in/d4XqkEGF
Project https://lnkd.in/dHpmx3FF
Repo coming: https://github.com/Jyxarthur/

6Img-to-3D driving scenarios
EPFL (+ Continental) unveils 6Img-to-3D, a novel transformer-based encoder-renderer method to create 3D unbounded outdoor driving scenarios from only six pics
Review https://shorturl.at/dZ018
Paper arxiv.org/pdf/2404.12378.pdf
Project 6img-to-3d.github.io/
Code github.com/continental/6Img-to-3D