AI with Papers - Artificial Intelligence & Deep Learning
15.2K subscribers
135 photos
247 videos
14 files
1.31K links
All the AI with papers. Fresh updates every day on #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI

🦕 DINO-based Video Tracking 🦕

👉The Weizmann Institute announces a new SOTA in point tracking built on pre-trained DINO features. Source code announced (not yet released) 💙

👉Review https://t.ly/_GIMT
👉Paper https://lnkd.in/dsGVDcar
👉Project dino-tracker.github.io/
👉Code https://github.com/AssafSinger94/dino-tracker

🦖 T-Rex 2: a new SOTA is out! 🦖

👉A novel (VERY STRONG) open-set object detection model. Strong zero-shot capabilities, suitable for a wide range of scenarios with a single set of weights. Demo and Source Code released 💙

👉Review https://t.ly/fYw8D
👉Paper https://lnkd.in/dpmRh2zh
👉Project https://lnkd.in/dnR_jPcR
👉Code https://lnkd.in/dnZnGRUn
👉Demo https://lnkd.in/drDUEDYh

💄 TinyBeauty: 460 FPS Make-up 💄

👉TinyBeauty needs only 80K parameters to reach the SOTA in virtual makeup, without intricate face prompts. Up to 460 FPS on mobile!

👉Review https://t.ly/LG5ok
👉Paper https://arxiv.org/pdf/2403.15033.pdf
👉Project https://tinybeauty.github.io/TinyBeauty/

☔ AiOS: All-in-One-Stage Humans ☔

👉All-in-one-stage framework for SOTA expressive pose and shape recovery of multiple humans, with no separate human detection step.

👉Review https://t.ly/ekNd4
👉Paper https://arxiv.org/pdf/2403.17934.pdf
👉Project https://ttxskk.github.io/AiOS/
👉Code/Demo (announced)

MAVOS Object Segmentation

👉MAVOS is a transformer-based video object segmentation (VOS) model with a novel, optimized and dynamic long-term modulated cross-attention memory. Code & Models announced (BSD 3-Clause) 💙

👉Review https://t.ly/SKaRG
👉Paper https://lnkd.in/dQyifKa3
👉Project github.com/Amshaker/MAVOS

💦 ObjectDrop: automagical object removal 💦

👉#Google unveils ObjectDrop, the new SOTA in photorealistic object removal and insertion. It handles shadows and reflections, impressive!

👉Review https://t.ly/ZJ6NN
👉Paper https://arxiv.org/pdf/2403.18818.pdf
👉Project https://objectdrop.github.io/

🪼 Universal Mono Metric Depth 🪼

👉ETH unveils UniDepth: metric 3D scenes from single images alone, across domains. A novel, universal and flexible monocular metric depth estimation (MMDE) solution. Source code released 💙

👉Review https://t.ly/5C8eq
👉Paper arxiv.org/pdf/2403.18913.pdf
👉Code github.com/lpiccinelli-eth/unidepth

🔘 RELI11D: Multimodal Humans 🔘

👉RELI11D is a high-quality multimodal human motion dataset combining LiDAR, an IMU system, an RGB camera, and an event camera. Dataset & Source Code to be released soon 💙

👉Review https://t.ly/5EG6X
👉Paper https://lnkd.in/ep6Utcik
👉Project https://lnkd.in/eDhNHYBb

🔥 ECoDepth: SOTA Diffusive Mono-Depth 🔥

👉New single-image depth estimation (SIDE) model using a diffusion backbone conditioned on ViT embeddings. It's the new SOTA in SIDE. Source Code released 💙

👉Review https://t.ly/s2pbB
👉Paper https://lnkd.in/eYt5yr_q
👉Code https://lnkd.in/eEcyPQcd

🕷️ Gen-NeRF2NeRF Translation 🕷️

👉GenN2N: unified NeRF-to-NeRF translation for editing tasks such as text-driven NeRF editing, colorization, super-resolution, inpainting, etc.

👉Review https://t.ly/VMWAH
👉Paper arxiv.org/pdf/2404.02788.pdf
👉Project xiangyueliu.github.io/GenN2N/
👉Code github.com/Lxiangyue/GenN2N

👆 iSeg: Interactive 3D Segmentation 👆

👉iSeg: an interactive segmentation technique for 3D shapes that operates entirely in 3D. It accepts positive and negative clicks placed directly on the shape's surface, marking regions to include or exclude.

👉Review https://t.ly/tyFnD
👉Paper https://lnkd.in/dydAz8zp
👉Project https://lnkd.in/de-h6SRi
👉Code (coming)

👗 Neural Bodies with Clothes 👗

👉Neural-ABC is a novel parametric model based on neural implicit functions that represents clothed human bodies with disentangled latent spaces for identity, clothing, shape, and pose.

👉Review https://t.ly/Un1wc
👉Project https://lnkd.in/dhDG6FF5
👉Paper https://lnkd.in/dhcfK7jZ
👉Code https://lnkd.in/dQvXWysP

🔌 BodyMAP: human body & pressure 🔌

👉#Nvidia (+CMU) unveils BodyMAP, the new SOTA in jointly predicting the body mesh (3D pose & shape) and the 3D pressure applied on the human body. Source Code released, Dataset coming 💙

👉Review https://t.ly/8926S
👉Project bodymap3d.github.io/
👉Paper https://lnkd.in/gCxH4ev3
👉Code https://lnkd.in/gaifdy3q

🧞 XComposer2: 4K Vision-Language 🧞

👉InternLM-XComposer2-4KHD extends LVLM input resolution up to 4K HD (3840×1600) and beyond. Authors: Shanghai AI Lab, CUHK, SenseTime & Tsinghua. Source Code & Models released 💙

👉Review https://t.ly/GCHsz
👉Paper arxiv.org/pdf/2404.06512.pdf
👉Code github.com/InternLM/InternLM-XComposer

⚛️ Flying w/ Photons: Neural Render ⚛️

👉Novel neural rendering technique that synthesizes videos of light propagating through a scene from novel, moving camera viewpoints. Picosecond time resolution!

👉Review https://t.ly/ZqL3a
👉Paper arxiv.org/pdf/2404.06493.pdf
👉Project anaghmalik.com/FlyingWithPhotons/
👉Code github.com/anaghmalik/FlyingWithPhotons

☄️ Tracking Any 2D Pixels in 3D ☄️

👉SpatialTracker lifts 2D pixels to 3D using monocular depth, represents the 3D content of each frame efficiently using a triplane representation, and performs iterative updates using a transformer to estimate 3D trajectories.

👉Review https://t.ly/B28Cj
👉Paper https://lnkd.in/d8ers_nm
👉Project https://lnkd.in/deHjtZuE
👉Code https://lnkd.in/dMe3TvFT
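
The first step mentioned in the post, lifting 2D pixels to 3D from a depth estimate, can be illustrated with standard pinhole back-projection. This is only a minimal sketch under assumptions, not SpatialTracker's code: the function names, the toy depth map and the intrinsics (fx, fy, cx, cy) are all hypothetical, and the triplane encoding and transformer updates are not covered.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def unproject_pixel(u: float, v: float, depth: float,
                    fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Back-project pixel (u, v) with a depth value into camera coordinates.

    Assumes an ideal pinhole camera (no lens distortion); fx, fy are focal
    lengths in pixels and (cx, cy) is the principal point.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def lift_pixels(pixels: Sequence[Tuple[int, int]],
                depth_map: Sequence[Sequence[float]],
                fx: float, fy: float, cx: float, cy: float) -> List[Point3D]:
    """Lift integer pixel coordinates (u, v) using a depth map indexed [v][u]."""
    return [unproject_pixel(u, v, depth_map[v][u], fx, fy, cx, cy)
            for u, v in pixels]


if __name__ == "__main__":
    # Hypothetical 3x3 depth map, as a monocular depth estimator might return.
    depth_map = [[2.0, 2.0, 2.0],
                 [2.0, 4.0, 2.0],
                 [2.0, 2.0, 2.0]]
    print(lift_pixels([(1, 1), (2, 0)], depth_map,
                      fx=2.0, fy=2.0, cx=1.0, cy=1.0))
```

With these toy intrinsics, the center pixel lands on the optical axis at its depth, while off-center pixels are displaced in proportion to their depth.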

🪐 YOLO-CIANNA: Neural Astro 🪐

👉CIANNA is a general-purpose deep learning framework aimed at, but not limited to, astronomical data analysis. Source Code released 💙

👉Review https://t.ly/441XS
👉Paper arxiv.org/pdf/2402.05925.pdf
👉Code github.com/Deyht/CIANNA
👉Wiki github.com/Deyht/CIANNA/wiki

🧤 Neuro MusculoSkeletal-MANO 🧤

👉SJTU unveils MusculoSkeletal-MANO, a novel musculoskeletal system with a learnable parametric hand model. Source Code announced 💙

👉Review https://t.ly/HOQrn
👉Paper arxiv.org/pdf/2404.10227.pdf
👉Project https://ms-mano.robotflow.ai/
👉Code announced (no repo yet)

⚽ SoccerNet: Athlete Tracking ⚽

👉The SoccerNet Challenge introduces a novel high-level computer vision task specific to sports analytics. It aims at recognizing the state of a sports game, i.e., identifying and localizing all individuals (players, referees, etc.) on the field.

👉Review https://t.ly/Mdu9s
👉Paper arxiv.org/pdf/2404.11335.pdf
👉Code github.com/SoccerNet/sn-gamestate