AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Fresh daily updates on #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🔥 "Deep Gen-AI" Full Course 🔥

👉A fresh course from Stanford on the probabilistic foundations and learning algorithms of deep generative models. It also offers an overview of how genAI has evolved in #computervision, language and more.

👉Review https://t.ly/ylBxq
👉Course https://lnkd.in/dMKH9gNe
👉Lectures https://lnkd.in/d_uwDvT6
❀21πŸ”₯7πŸ‘2πŸ‘1πŸ₯°1🀩1
This media is not supported in your browser
VIEW IN TELEGRAM
🐏 EFM3D: 3D Ego-Foundation 🐏

πŸ‘‰#META presents EFM3D, the first benchmark for 3D object detection and surface regression on HQ annotated egocentric data of Project Aria. Datasets & Code releasedπŸ’™

πŸ‘‰Review https://t.ly/cDJv6
πŸ‘‰Paper arxiv.org/pdf/2406.10224
πŸ‘‰Project www.projectaria.com/datasets/aeo/
πŸ‘‰Repo github.com/facebookresearch/efm3d
🥦Gaussian Splatting VTON🥦

👉GS-VTON is a novel image-prompted 3D virtual try-on (VTON) method. By using 3D Gaussian Splatting (3DGS) as the 3D representation, it transfers pre-trained knowledge from 2D VTON models to 3D while improving cross-view consistency. Code announced💙

👉Review https://t.ly/sTPbW
👉Paper arxiv.org/pdf/2410.05259
👉Project yukangcao.github.io/GS-VTON/
👉Repo github.com/yukangcao/GS-VTON
💡Diffusion Models Relighting💡

👉#Netflix unveils DifFRelight, a novel diffusion-based method for free-viewpoint facial relighting. It offers precise lighting control and produces high-fidelity relit facial images from flat-lit inputs.

👉Review https://t.ly/fliXU
👉Paper arxiv.org/pdf/2410.08188
👉Project www.eyelinestudios.com/research/diffrelight.html
🥎POKEFLEX: Soft Object Dataset🥎

👉PokeFlex from ETH is a dataset of deformable objects that includes 3D textured meshes, point clouds, RGB images & depth maps. Pretrained models & dataset announced💙

👉Review https://t.ly/GXggP
👉Paper arxiv.org/pdf/2410.07688
👉Project https://lnkd.in/duv-jS7a
👉Repo
🔥 DEPTH ANY VIDEO is out! 🔥

👉DAV is a novel foundation model for image/video depth estimation. The new SOTA for accuracy & consistency, at up to 150 FPS!

👉Review https://t.ly/CjSz2
👉Paper arxiv.org/pdf/2410.10815
👉Project depthanyvideo.github.io/
👉Code github.com/Nightmare-n/DepthAnyVideo
🪞Robo-Emulation via Video Imitation🪞

👉OKAMI (UT & #Nvidia) is a novel foundation method that generates a manipulation plan from a single RGB-D video and derives a policy for execution.

👉Review https://t.ly/_N29-
👉Paper arxiv.org/pdf/2410.11792
👉Project https://lnkd.in/d6bHF_-s
🔥 CoTracker3 by #META is out! 🔥

👉#Meta (+VGG Oxford) unveils CoTracker3, a new tracker that outperforms the previous SoTA by a large margin using only 0.1% of the training data 🤯🤯🤯

👉Review https://t.ly/TcRIv
👉Paper arxiv.org/pdf/2410.11831
👉Project cotracker3.github.io/
👉Code github.com/facebookresearch/co-tracker
❀14πŸ”₯3🀯3🍾2πŸ‘1😱1😍1
This media is not supported in your browser
VIEW IN TELEGRAM
🦠 Neural Metamorphosis 🦠

πŸ‘‰NU Singapore unveils NeuMeta to transform neural nets by allowing a single model to adapt on the fly to different sizes, generating the right weights when needed.

πŸ‘‰Review https://t.ly/DJab3
πŸ‘‰Paper arxiv.org/pdf/2410.11878
πŸ‘‰Project adamdad.github.io/neumeta
πŸ‘‰Code github.com/Adamdad/neumeta
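A toy sketch of the "generate the right weights when needed" idea, assuming weights are read off a continuous function of normalized coordinates so that any layer size can be sampled from the same generator. The function `weight_at` is a hand-written, hypothetical stand-in for a learned generator, and nothing here is taken from the NeuMeta code.

```python
import math


def weight_at(u, v):
    """Hypothetical stand-in for a learned weight generator on [0, 1]^2."""
    return math.sin(math.pi * u) * math.cos(math.pi * v)


def generate_layer(n_out, n_in):
    """Sample an n_out x n_in weight matrix from the continuous generator."""
    return [
        [weight_at((i + 0.5) / n_out, (j + 0.5) / n_in) for j in range(n_in)]
        for i in range(n_out)
    ]


# The same generator yields layers of different sizes on demand.
small = generate_layer(2, 3)
large = generate_layer(8, 12)
```

Because the generator is queried by relative position rather than by index, a narrow and a wide layer are two samplings of the same underlying function.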
❀7πŸ”₯3🀯3😱2⚑1πŸ‘1
This media is not supported in your browser
VIEW IN TELEGRAM
☀️ GS + Depth = SOTA ☀️

👉DepthSplat is the new SOTA in depth estimation & novel view synthesis. Its key feature is the cross-task interaction between Gaussian Splatting & depth estimation. Source Code to be released soon💙

👉Review https://t.ly/87HuH
👉Paper arxiv.org/abs/2410.13862
👉Project haofeixu.github.io/depthsplat/
👉Code github.com/cvg/depthsplat
🔥BitNet: code of 1-bit LLM released🔥

👉BitNet by #Microsoft, announced in late 2023, is a 1-bit Transformer architecture designed for LLMs. It introduces BitLinear as a drop-in replacement for the nn.Linear layer to train 1-bit weights from scratch. Source Code just released 💙

👉Review https://t.ly/3G2LA
👉Paper arxiv.org/pdf/2310.11453
👉Code https://lnkd.in/duPADJVb
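A minimal pure-Python sketch of the weight side of a BitLinear-style layer, assuming the scheme as I read it in the paper: weights are centered on their mean, reduced to +1/-1 by sign, and the output is rescaled by the mean absolute weight. Activation quantization, normalization and the training procedure are left out, and the names `binarize_weights` and `bit_linear` are hypothetical, not identifiers from Microsoft's released code.

```python
def binarize_weights(weights):
    """Return a (+1/-1 matrix, scale) pair for a list-of-lists weight matrix."""
    flat = [w for row in weights for w in row]
    mean = sum(flat) / len(flat)                   # centering term
    scale = sum(abs(w) for w in flat) / len(flat)  # mean absolute weight
    signs = [[1 if w - mean > 0 else -1 for w in row] for row in weights]
    return signs, scale


def bit_linear(x, weights):
    """Compute scale * (sign_matrix @ x) for a single input vector x."""
    signs, scale = binarize_weights(weights)
    return [scale * sum(s * xi for s, xi in zip(row, x)) for row in signs]


weights = [[0.5, -1.5], [2.0, 1.0]]
signs, scale = binarize_weights(weights)
y = bit_linear([1.0, 2.0], weights)
```

With these example weights the mean is 0.5 and the scale is 1.25, so the layer stores only the sign matrix plus one floating-point scale.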
🧿 Look Ma, no markers 🧿

πŸ‘‰#Microsoft unveils the first technique for marker-free, HQ reconstruction of COMPLETE human body, including eyes and tongue, without requiring any calibration, manual intervention or custom hardware. Impressive results! Repo for training & Dataset releasedπŸ’™

πŸ‘‰Review https://t.ly/5fN0g
πŸ‘‰Paper arxiv.org/pdf/2410.11520
πŸ‘‰Project microsoft.github.io/SynthMoCap/
πŸ‘‰Repo github.com/microsoft/SynthMoCap
🪐 PL2Map: efficient neural 2D-3D 🪐

👉PL2Map is a novel neural network tailored for efficient representation of complex point & line maps. A natural representation of 2D-3D correspondences.

👉Review https://t.ly/D-bVD
👉Paper arxiv.org/pdf/2402.18011
👉Project https://thpjp.github.io/pl2map
👉Code https://github.com/ais-lab/pl2map
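For readers new to the term, a 2D-3D correspondence pairs an image pixel with the 3D map point it depicts, and camera localization looks for the pose under which the projected 3D points land on their matched pixels. The sketch below only illustrates that reprojection check with a pinhole camera. It assumes points already expressed in the camera frame and made-up intrinsics (`focal`, `cx`, `cy`), and it is not taken from the PL2Map code.

```python
import math


def project(point, focal, cx, cy):
    """Pinhole projection of a camera-frame 3D point (x, y, z) to pixels."""
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)


def reprojection_error(point, pixel, focal, cx, cy):
    """Pixel distance between a projected 3D point and its matched 2D pixel."""
    u, v = project(point, focal, cx, cy)
    return math.hypot(u - pixel[0], v - pixel[1])


# Hypothetical correspondences: (3D point, matched pixel).
correspondences = [
    ((0.0, 0.0, 2.0), (320.0, 240.0)),  # consistent match
    ((1.0, 0.5, 2.0), (570.0, 365.0)),  # consistent match
    ((0.0, 0.0, 1.0), (323.0, 244.0)),  # match that is off by a few pixels
]
errors = [
    reprojection_error(p, px, focal=500.0, cx=320.0, cy=240.0)
    for p, px in correspondences
]
```

A low error on most correspondences suggests the assumed camera pose is consistent with the map.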
🌻 Plant Camouflage Detection🌻

πŸ‘‰PlantCamo Dataset is the first dataset for plant camouflage detection: 1,250 images with camouflage characteristics. Source Code released πŸ’™

πŸ‘‰Review https://t.ly/pYFX4
πŸ‘‰Paper arxiv.org/pdf/2410.17598
πŸ‘‰Code github.com/yjybuaa/PlantCamo
❀11πŸ‘6🀯4πŸ‘1🀩1
This media is not supported in your browser
VIEW IN TELEGRAM
⛈️ SMITE: SEGMENT IN TIME ⛈️

👉SFU unveils SMITE, a novel method that, given only one or a few fine-grained segmentation references, can segment different unseen videos consistently with those references. Dataset & Code (under Apache 2.0) announced 💙

👉Review https://t.ly/w6aWJ
👉Paper arxiv.org/pdf/2410.18538
👉Project segment-me-in-time.github.io/
👉Repo github.com/alimohammadiamirhossein/smite
🀯11❀4🀩1
This media is not supported in your browser
VIEW IN TELEGRAM
🫐 Blendify: #Python + Blender 🫐

πŸ‘‰Lightweight Python framework that provides a high-level API for creating & rendering scenes with #Blender. It simplifies data augmentation & synthesis. Source Code releasedπŸ’™

πŸ‘‰Review https://t.ly/l0crA
πŸ‘‰Paper https://arxiv.org/pdf/2410.17858
πŸ‘‰Code https://virtualhumans.mpi-inf.mpg.de/blendify/
🔥 D-FINE: new SOTA Detector 🔥

👉D-FINE is a powerful real-time object detector that achieves outstanding localization precision by redefining the bounding box regression task in DETR models. New SOTA on MS COCO with additional data. Code & models available 💙

👉Review https://t.ly/aw9fN
👉Paper https://arxiv.org/pdf/2410.13842
👉Code https://github.com/Peterande/D-FINE
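One common way to recast box regression, shown here as a generic illustration and not as D-FINE's exact formulation, is to predict a probability distribution over candidate offsets for each box edge and move the edge by the expected offset. The bin values, logits and function names below are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_offset(logits, bin_values):
    """Expected edge offset under the predicted distribution over bins."""
    probs = softmax(logits)
    return sum(p * b for p, b in zip(probs, bin_values))


def refine_box(box, edge_logits, bin_values):
    """Shift each edge of box = (x1, y1, x2, y2) by its expected offset."""
    return tuple(
        coord + expected_offset(logits, bin_values)
        for coord, logits in zip(box, edge_logits)
    )


bins = [-2.0, -1.0, 0.0, 1.0, 2.0]            # candidate offsets in pixels
uniform = [0.0] * 5                           # no preference: offset near 0
confident_right = [0.0, 0.0, 0.0, 0.0, 50.0]  # strongly prefers +2
refined = refine_box(
    (10.0, 20.0, 30.0, 40.0),
    [uniform, uniform, confident_right, uniform],
    bins,
)
```

Here only the right edge moves (by roughly +2 pixels), since the other three distributions are symmetric around zero.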
❀16πŸ‘3πŸ‘1🀯1
This media is not supported in your browser
VIEW IN TELEGRAM
🍜 REM: Segment What You Describe 🍜

πŸ‘‰REM is a framework for segmenting concepts in video that can be described via LLM. Suitable for rare & non-object dynamic concepts, such as waves, smoke, etc. Code & Data announced πŸ’™

πŸ‘‰Review https://t.ly/OyVtV
πŸ‘‰Paper arxiv.org/pdf/2410.23287
πŸ‘‰Project https://miccooper9.github.io/projects/ReferEverything/