AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🪼 Universal Mono Metric Depth 🪼

👉ETH unveils UniDepth: metric 3D scenes from solely single images across domains. A novel, universal and flexible MMDE solution. Source code released💙

👉Review https://t.ly/5C8eq
👉Paper arxiv.org/pdf/2403.18913.pdf
👉Code github.com/lpiccinelli-eth/unidepth
🔘 RELI11D: Multimodal Humans 🔘

👉RELI11D is the ultimate and high-quality multimodal human motion dataset involving LiDAR, IMU system, RGB camera, and Event camera. Dataset & Source Code to be released soon💙

👉Review https://t.ly/5EG6X
👉Paper https://lnkd.in/ep6Utcik
👉Project https://lnkd.in/eDhNHYBb
🔥 ECoDepth: SOTA Diffusive Mono-Depth 🔥

👉New SIDE model using a diffusion backbone conditioned on ViT embeddings. It's the new SOTA in SIDE. Source Code released 💙

👉Review https://t.ly/s2pbB
👉Paper https://lnkd.in/eYt5yr_q
👉Code https://lnkd.in/eEcyPQcd
πŸ•·οΈ Gen-NeRF2NeRF Translation πŸ•·οΈ

πŸ‘‰GenN2N: unified NeRF-to-NeRF translation for editing tasks such as text-driven NeRF editing, colorization, super-resolution, inpainting, etc.

πŸ‘‰Review https://t.ly/VMWAH
πŸ‘‰Paper arxiv.org/pdf/2404.02788.pdf
πŸ‘‰Project xiangyueliu.github.io/GenN2N/
πŸ‘‰Code github.com/Lxiangyue/GenN2N
👆iSeg: Interactive 3D Segmentation👆

👉 iSeg: interactive segmentation technique for 3D shapes operating entirely in 3D. It accepts both positive/negative clicks directly on the shape's surface, indicating inclusion & exclusion of regions.

👉Review https://t.ly/tyFnD
👉Paper https://lnkd.in/dydAz8zp
👉Project https://lnkd.in/de-h6SRi
👉Code (coming)
👗 Neural Bodies with Clothes 👗

👉Neural-ABC is a novel parametric model based on neural implicit functions that can represent clothed human bodies with disentangled latent spaces for ID, clothing, shape, and pose.

👉Review https://t.ly/Un1wc
👉Project https://lnkd.in/dhDG6FF5
👉Paper https://lnkd.in/dhcfK7jZ
👉Code https://lnkd.in/dQvXWysP
🔌 BodyMAP: human body & pressure 🔌

👉#Nvidia (+CMU) unveils BodyMAP, the new SOTA in predicting body mesh (3D pose & shape) and 3D applied pressure on the human body. Source Code released, Dataset coming 💙

👉Review https://t.ly/8926S
👉Project bodymap3d.github.io/
👉Paper https://lnkd.in/gCxH4ev3
👉Code https://lnkd.in/gaifdy3q
🧞 XComposer2: 4K Vision-Language 🧞

👉InternLM-XComposer2-4KHD brings LVLM resolution capabilities up to 4K HD (3840×1600) and beyond. Authors: Shanghai AI Lab, CUHK, SenseTime & Tsinghua. Source Code & Models released 💙

👉Review https://t.ly/GCHsz
👉Paper arxiv.org/pdf/2404.06512.pdf
👉Code github.com/InternLM/InternLM-XComposer
βš›οΈ Flying w/ Photons: Neural Render βš›οΈ

πŸ‘‰Novel neural rendering technique that seeks to synthesize videos of light propagating through a scene from novel, moving camera viewpoints. Pico-Seconds time resolution!

πŸ‘‰Review https://t.ly/ZqL3a
πŸ‘‰Paper arxiv.org/pdf/2404.06493.pdf
πŸ‘‰Project anaghmalik.com/FlyingWithPhotons/
πŸ‘‰Code github.com/anaghmalik/FlyingWithPhotons
β˜„οΈ Tracking Any 2D Pixels in 3D β˜„οΈ

πŸ‘‰ SpatialTracker lifts 2D pixels to 3D using monocular depth, represents the 3D content of each frame efficiently using a triplane representation, and performs iterative updates using a transformer to estimate 3D trajectories.

πŸ‘‰Review https://t.ly/B28Cj
πŸ‘‰Paper https://lnkd.in/d8ers_nm
πŸ‘‰Project https://lnkd.in/deHjtZuE
πŸ‘‰Code https://lnkd.in/dMe3TvFT
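
A minimal sketch of the "lift 2D pixels to 3D" step, assuming a plain pinhole camera: a pixel with a known depth value is back-projected into camera coordinates. This is not SpatialTracker's code. The function names and the intrinsics (fx, fy, cx, cy) are illustrative assumptions, and the triplane encoding and transformer updates are not shown.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def unproject_pixel(u: float, v: float, depth: float,
                    fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Back-project pixel (u, v) with depth `depth` into camera coordinates.

    Assumes a pinhole camera with focal lengths (fx, fy) in pixels and
    principal point (cx, cy). These parameters are hypothetical inputs,
    e.g. from calibration; depth would come from a monocular depth model.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def lift_track(pixels: Sequence[Tuple[float, float]],
               depths: Sequence[float],
               fx: float, fy: float, cx: float, cy: float) -> List[Point3D]:
    """Lift a 2D track (one pixel per frame) to 3D, one point per frame."""
    return [unproject_pixel(u, v, d, fx, fy, cx, cy)
            for (u, v), d in zip(pixels, depths)]
```

With fx = fy = 500 and the principal point at (320, 240), the pixel at the image center with depth 2.0 maps to (0.0, 0.0, 2.0).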
πŸͺYOLO-CIANNA: Neural AstroπŸͺ

πŸ‘‰ CIANNA is a general-purpose deep learning framework for (but not only for) astronomical data analysis. Source Code released πŸ’™

πŸ‘‰Review https://t.ly/441XS
πŸ‘‰Paper arxiv.org/pdf/2402.05925.pdf
πŸ‘‰Code github.com/Deyht/CIANNA
πŸ‘‰Wiki github.com/Deyht/CIANNA/wiki
πŸ‘7⚑5❀4πŸ”₯2πŸ₯°2
This media is not supported in your browser
VIEW IN TELEGRAM
🧤Neuro MusculoSkeletal-MANO🧤

👉SJTU unveils MusculoSkeletal-MANO, a novel musculoskeletal system with a learnable parametric hand model. Source Code announced 💙

👉Review https://t.ly/HOQrn
👉Paper arxiv.org/pdf/2404.10227.pdf
👉Project https://ms-mano.robotflow.ai/
👉Code announced (no repo yet)
⚽SoccerNET: Athlete Tracking⚽

👉SoccerNet Challenge is a novel high-level computer vision task that is specific to sports analytics. It aims at recognizing the state of a sport game, i.e., identifying and localizing all sports individuals (players, referees, etc.) on the field.

👉Review https://t.ly/Mdu9s
👉Paper arxiv.org/pdf/2404.11335.pdf
👉Code github.com/SoccerNet/sn-gamestate
🎲 Articulated Objs from MonoClips 🎲

👉REACTO is the new SOTA to address the challenge of reconstructing general articulated 3D objects from a single monocular video

👉Review https://t.ly/REuM8
👉Paper https://lnkd.in/d6PWagij
👉Project https://lnkd.in/dpg3x4tm
👉Repo https://lnkd.in/dRZWj6_N
🪼 All You Need is SAM (+Flow) 🪼

👉Oxford unveils the new SOTA for moving object segmentation via SAM + Optical Flow. Two novel models & Source Code announced 💙

👉Review https://t.ly/ZRYtp
👉Paper https://lnkd.in/d4XqkEGF
👉Project https://lnkd.in/dHpmx3FF
👉Repo coming: https://github.com/Jyxarthur/
🛞 6Img-to-3D driving scenarios 🛞

👉EPFL (+ Continental) unveils 6Img-to-3D, a novel transformer-based encoder-renderer method to create 3D unbounded outdoor driving scenarios with only six pics

👉Review https://shorturl.at/dZ018
👉Paper arxiv.org/pdf/2404.12378.pdf
👉Project 6img-to-3d.github.io/
👉Code github.com/continental/6Img-to-3D
🌹 Physics-Based 3D Video-Gen 🌹

👉PhysDreamer, a physics-based approach that leverages the object dynamics priors learned by video generation models. It enables realistic 3D interaction with objects

👉Review https://t.ly/zxXf9
👉Paper arxiv.org/pdf/2404.13026.pdf
👉Project physdreamer.github.io/
👉Code github.com/a1600012888/PhysDreamer
πŸ‘14❀9🀯4πŸ‘1
This media is not supported in your browser
VIEW IN TELEGRAM
🎡 NER-Net: Seeing at Night-Time 🎡

👉Huazhong (+Beijing) unveils a novel event-based nighttime imaging solution under non-uniform illumination, plus a paired multi-illumination level real-world dataset. Repo online, code coming 💙

👉Review https://t.ly/Z9JMJ
👉Paper arxiv.org/pdf/2404.11884.pdf
👉Repo github.com/Liu-haoyue/NER-Net
👉Clip https://www.youtube.com/watch?v=zpfTLCF1Kw4
🌊 FlowMap: dense depth video 🌊

👉MIT (+CSAIL) unveils FlowMap, a novel E2E differentiable method that solves for precise camera poses, camera intrinsics, and per-frame dense depth of a video sequence. Source Code released 💙

👉Review https://t.ly/CBH48
👉Paper arxiv.org/pdf/2404.15259.pdf
👉Project cameronosmith.github.io/flowmap
👉Code github.com/dcharatan/flowmap
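
A rough illustration of how the three quantities named above (depth, intrinsics, camera pose) fit together, assuming a pinhole camera and a static scene: together they determine where a pixel lands in the next frame. The post does not describe FlowMap's internals, so this is only the textbook geometry, not the authors' method. The function name, the intrinsics tuple layout, and the (R, t) pose convention are assumptions made for this sketch.

```python
from typing import Sequence, Tuple


def induced_flow(u: float, v: float, depth: float,
                 intrinsics: Tuple[float, float, float, float],
                 rotation: Sequence[Sequence[float]],
                 translation: Sequence[float]) -> Tuple[float, float]:
    """Pixel displacement caused by camera motion alone (static scene).

    intrinsics  : (fx, fy, cx, cy) of a pinhole camera (assumed layout).
    rotation    : 3x3 matrix R mapping frame-1 camera coords to frame 2.
    translation : 3-vector t, so that X2 = R @ X1 + t (assumed convention).
    """
    fx, fy, cx, cy = intrinsics

    # 1) Back-project the pixel into 3D using its depth.
    point = ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)

    # 2) Express the 3D point in the second camera's frame.
    moved = [sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
             for i in range(3)]
    if moved[2] <= 0:
        raise ValueError("point is behind the second camera")

    # 3) Project back to pixels and take the difference.
    u2 = fx * moved[0] / moved[2] + cx
    v2 = fy * moved[1] / moved[2] + cy
    return (u2 - u, v2 - v)
```

With an identity rotation and zero translation the displacement is zero. Moving the point by 0.1 along x at depth 2.0 with fx = 500 shifts the center pixel by about 25 pixels horizontally.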