AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Fresh updates every day on #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
👽 One Model <-> All Segmentations 👽

👉 10+ different segmentation tasks in one framework, including image-level, video-level, interactive segmentation, & open-vocabulary segmentation. All in one!

👉Review https://t.ly/fywVz
👉Paper https://lnkd.in/dw3S4B74
👉Project https://lnkd.in/dzHT9v45
👉Repo https://lnkd.in/d6fDCnSp
😻 GARField: Group Anything 😻

👉 GARField is a novel approach for decomposing #3D scenes into a hierarchy of semantically meaningful groups from posed image inputs.

👉Review https://t.ly/6Hkeq
👉Paper https://lnkd.in/d28mfRcZ
👉Project https://lnkd.in/dzYdRNKy
👉Repo (coming) https://lnkd.in/d2VeRJCS
🔥 Depth Anything: new SOTA 🔥

👉Depth Anything: the new SOTA in monocular depth estimation (MDE), jointly trained on 1.5M labeled and 62M+ unlabeled images.

👉Review https://t.ly/tCBwO
👉Paper https://lnkd.in/djx-9k2J
👉Project https://lnkd.in/dYetqZFa
👉Repo https://lnkd.in/d87CrUGv
👉Demo🤗 https://lnkd.in/dJhvKBep
🎭 ULTRA-Realistic Avatar 🎭

πŸ‘‰Novel 3D avatar with enhanced fidelity of geometry, and superior quality of physically based rendering (PBR) textures without unwanted lighting.

πŸ‘‰Review https://t.ly/B3BEu
πŸ‘‰Project https://lnkd.in/dkUQHFEV
πŸ‘‰Paper https://lnkd.in/dtEQxrBu
πŸ‘‰Code coming 🩷
🔥Lumiere: SOTA video-gen🔥

👉#Google unveils Lumiere, a space-time diffusion model for realistic video generation. It's the new SOTA; supported tasks: Text-to-Video, Video Stylization, Cinemagraphs & Video Inpainting.

👉Review https://t.ly/nalJR
👉Paper https://lnkd.in/d-PvrGjT
👉Project https://t.ly/gK8hz
🧪 SUPIR: SOTA restoration 🧪

👉SUPIR is the new SOTA in image restoration; suitable for restoration of blurry objects, defining the material texture of objects, and adjusting restoration based on high-level semantics.

👉Review https://t.ly/wgObH
👉Project https://supir.xpixel.group/
👉Paper https://lnkd.in/dZPYcUuq
👉Demo coming 🩷 but no code announced :(
❀8πŸ”₯4πŸ₯°1🍾1
This media is not supported in your browser
VIEW IN TELEGRAM
🫧 SAM + Open Models 🫧

👉Grounded SAM uses Grounding DINO as an open-set detector combined with SAM. It can seamlessly integrate with other open-world models to accomplish more intricate visual tasks.

👉Review https://t.ly/FwasQ
👉Paper arxiv.org/pdf/2401.14159.pdf
👉Code github.com/IDEA-Research/Grounded-Segment-Anything
👒"Virtual Try-All" by #Amazon 👒

👉#Amazon announces "Diffuse to Choose": diffusion-based, image-conditioned inpainting for virtual try-on (VTON). Virtually place any e-commerce item in any setting.

👉Review https://t.ly/at07Y
👉Paper https://lnkd.in/dxR7nGtd
👉Project diffuse2choose.github.io/
❀15πŸ‘7🀯4πŸ”₯1πŸ₯°1
This media is not supported in your browser
VIEW IN TELEGRAM
🦩 WildRGB-D: Objects in the Wild 🦩

👉#NVIDIA unveils a novel RGB-D object dataset captured in the wild: ~8,500 recorded objects, ~20,000 RGB-D videos, 46 categories with corresponding masks and 3D point clouds.

👉Review https://t.ly/WCqVz
👉Data github.com/wildrgbd/wildrgbd
👉Paper arxiv.org/pdf/2401.12592.pdf
👉Project wildrgbd.github.io/
🌋EasyVolcap: Accelerating Neural Volumetric🌋

👉Novel #PyTorch library for accelerating neural volumetric video: capturing, reconstruction & rendering.

👉Review https://t.ly/8BISl
👉Paper arxiv.org/pdf/2312.06575.pdf
👉Code github.com/zju3dv/EasyVolcap
Rock-Track announced!

👉Rock-Track: the evolution of Poly-MOT, the previous SOTA tracking-by-detection framework for 3D MOT.

👉Review https://t.ly/hC0ak
👉Repo (coming) https://lnkd.in/dtDkPwCC
👉Paper coming
🧠350+ Free #AI Courses by #Google🧠

πŸ‘‰350+ free courses from #Google to become professional in #AI & #Cloud. The full catalog (900+) includes a variety of activity: videos, documents, labs, coding, and quizzes. 15+ supported languages. No excuse.

βœ…π†πžπ§πžπ«πšπ­π’π―πž π€πˆ
βœ…πˆπ§π­π«π¨ 𝐭𝐨 π‹π‹πŒπ¬
βœ…π‚π• 𝐰𝐒𝐭𝐑 𝐓𝐅
βœ…πƒπšπ­πš, πŒπ‹, π€πˆ
βœ…π‘πžπ¬π©π¨π§π¬π’π›π₯𝐞 π€πˆ

πŸ‘‰Review: https://t.ly/517Dr
πŸ‘‰Full list: https://www.cloudskillsboost.google/catalog?page=1
❀13πŸ‘3πŸ‘2🍾2πŸ”₯1
This media is not supported in your browser
VIEW IN TELEGRAM
Diffutoon: new SOTA video

👉Diffutoon is a cartoon shading approach that aims to transform photorealistic videos into anime styles. It can handle exceptionally high resolutions and rapid motions. Source code released!

👉Review https://t.ly/sim2O
👉Paper https://lnkd.in/dPcSnAUu
👉Code https://lnkd.in/d9B_dGrf
👉Project https://lnkd.in/dpcsJcX2
🥓 RANSAC -> PARSAC (neural) 🥓

👉Neural PARSAC: estimating multiple vanishing points (V), fundamental matrices (F) or homographies (H) at the speed of light! Source Code released 💙

👉Review https://t.ly/r9ngg
👉Paper https://lnkd.in/dadQ4Qec
👉Code https://lnkd.in/dYp6gADd
❀14πŸ‘3⚑1πŸ₯°1πŸ‘1
This media is not supported in your browser
VIEW IN TELEGRAM
↘️ SEELE: "moving" the subjects ➡️

👉Subject repositioning: manipulating an input image to reposition one of its subjects to a desired location while preserving the image's fidelity. SEELE is a single diffusion model that addresses this novel generative sub-task.

👉Review https://t.ly/4FS4H
👉Paper arxiv.org/pdf/2401.16861.pdf
👉Project yikai-wang.github.io/seele/
🎉 ADΔER: Event-Camera Suite 🎉

👉ADΔER: a novel, unified framework for event-based video. Encoder / transcoder / decoder for ADΔER (Address, Decimation, Δt Event Representation) video streams. Source code (Rust) released 💙

👉Review https://t.ly/w5_KC
👉Paper arxiv.org/pdf/2401.17151.pdf
👉Repo github.com/ac-freeman/adder-codec-rs
❀7πŸ‘3πŸ₯°1
This media is not supported in your browser
VIEW IN TELEGRAM
🚦(add) Anything in Any Video🚦

👉 XPeng Motors announced Anything in Any Scene: novel #AI for realistic video simulation that seamlessly inserts any object into an existing dynamic video. Strong emphasis on realism: the objects in the bounding boxes (BBs) don't exist. Source Code released 💙

👉Review https://t.ly/UYhl0
👉Code https://lnkd.in/gyi7Dhkn
👉Paper https://lnkd.in/gXyAJ6GZ
👉Project https://lnkd.in/gVA5vduD
🍬 ABS: SOTA collision-free 🍬

πŸ‘‰ABS (Agile But Safe): learning-based control framework for agile and collision-free locomotion for quadrupedal robot. Source Code announced (coming) πŸ’™

πŸ‘‰Review https://t.ly/AYu-Z
πŸ‘‰Paper arxiv.org/pdf/2401.17583.pdf
πŸ‘‰Project agile-but-safe.github.io/
πŸ‘‰Repo github.com/LeCAR-Lab/ABS
😍11πŸ‘3πŸ‘1πŸ₯°1
This media is not supported in your browser
VIEW IN TELEGRAM
Bootstrapping TAP

👉#Deepmind shows how large-scale, unlabeled, uncurated real-world data can improve TAP with minimal architectural changes, via a self-supervised student-teacher setup. Source Code released 💙

👉Review https://t.ly/-S_ZL
👉Paper arxiv.org/pdf/2402.00847.pdf
👉Code https://github.com/google-deepmind/tapnet
💥Py4AI 2x Speakers, 2x Tickets💥

✅Doubling the speakers (6 -> 12!)
✅A new track (2 tracks in parallel)
✅A new batch of 100 tickets!

👉 More: https://t.ly/WmVrM
❀7πŸ‘2🀯1