AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI

🌾 New SOTA Edge Detection 🌾

👉CUP (+ ESPOCH) unveils NBED, the new SOTA for edge detection: consistently superior performance across multiple benchmarks, even compared with models that have a huge computational cost and complex training. Source code released💙

👉Review https://t.ly/zUMcS
👉Paper arxiv.org/pdf/2409.14976
👉Code github.com/Li-yachuan/NBED

👩‍🦰 SOTA Gaussian Haircut 👩‍🦰

👉ETH et al. unveil Gaussian Haircut, the new SOTA in hair reconstruction via a dual representation (classic + 3D Gaussian). Code and model announced💙

👉Review https://t.ly/aiOjq
👉Paper arxiv.org/pdf/2409.14778
👉Project https://lnkd.in/dFRm2ycb
👉Repo https://lnkd.in/d5NWNkb5

🍇SPARK: Real-time Face Capture🍇

👉Technicolor Group unveils SPARK, a novel high-precision 3D face capture method that uses a collection of unconstrained videos of a subject as prior information. New SOTA, able to handle unseen poses, expressions and lighting. Impressive results. Code & model announced💙

👉Review https://t.ly/rZOgp
👉Paper arxiv.org/pdf/2409.07984
👉Project kelianb.github.io/SPARK/
👉Repo github.com/KelianB/SPARK/

🦴 One-Image Object Detection 🦴

👉Delft University (+Hensoldt Optronics) introduces OSSA, a novel unsupervised domain adaptation method for object detection that utilizes a single, unlabeled target image to approximate the target domain style. Code released💙

👉Review https://t.ly/-li2G
👉Paper arxiv.org/pdf/2410.00900
👉Code github.com/RobinGerster7/OSSA

🛳️ EVER Ellipsoid Rendering 🛳️

👉UCSD & Google present EVER, a novel method for real-time differentiable emission-only volume rendering. Unlike 3DGS, it does not suffer from popping artifacts or view-dependent density, achieving ~30 FPS at 720p on an #NVIDIA RTX4090.

👉Review https://t.ly/zAfGU
👉Paper arxiv.org/pdf/2410.01804
👉Project half-potato.gitlab.io/posts/ever/

🔥 "Deep Gen-AI" Full Course 🔥

👉A fresh course from Stanford on the probabilistic foundations and algorithms of deep generative models. A fresh overview of the evolution of genAI in #computervision, language and more...

👉Review https://t.ly/ylBxq
👉Course https://lnkd.in/dMKH9gNe
👉Lectures https://lnkd.in/d_uwDvT6

🐏 EFM3D: 3D Ego-Foundation 🐏

👉#META presents EFM3D, the first benchmark for 3D object detection and surface regression on high-quality annotated egocentric data from Project Aria. Datasets & code released💙

👉Review https://t.ly/cDJv6
👉Paper arxiv.org/pdf/2406.10224
👉Project www.projectaria.com/datasets/aeo/
👉Repo github.com/facebookresearch/efm3d

🥦 Gaussian Splatting VTON 🥦

👉GS-VTON is a novel image-prompted 3D-VTON which, by leveraging 3DGS as the 3D representation, enables the transfer of pre-trained knowledge from 2D VTON models to 3D while improving cross-view consistency. Code announced💙

👉Review https://t.ly/sTPbW
👉Paper arxiv.org/pdf/2410.05259
👉Project yukangcao.github.io/GS-VTON/
👉Repo github.com/yukangcao/GS-VTON

💡 Diffusion Models Relighting 💡

👉#Netflix unveils DifFRelight, a novel method for free-viewpoint facial relighting via a diffusion model. It offers precise lighting control and produces high-fidelity relit facial images from flat-lit inputs.

👉Review https://t.ly/fliXU
👉Paper arxiv.org/pdf/2410.08188
👉Project www.eyelinestudios.com/research/diffrelight.html

🥬 POKEFLEX: Soft Object Dataset 🥬

👉PokeFlex from ETH is a dataset that includes 3D textured meshes, point clouds, RGB & depth maps of deformable objects. Pretrained models & dataset announced💙

👉Review https://t.ly/GXggP
👉Paper arxiv.org/pdf/2410.07688
👉Project https://lnkd.in/duv-jS7a
👉Repo

🔥 DEPTH ANY VIDEO is out! 🔥

👉DAV is a novel foundation model for image/video depth estimation. The new SOTA for accuracy & consistency, at up to 150 FPS!

👉Review https://t.ly/CjSz2
👉Paper arxiv.org/pdf/2410.10815
👉Project depthanyvideo.github.io/
👉Code github.com/Nightmare-n/DepthAnyVideo

Robo-Emulation via Video Imitation

👉OKAMI (UT & #Nvidia) is a novel foundation method that generates a manipulation plan from a single RGB-D video and derives a policy for execution.

👉Review https://t.ly/_N29-
👉Paper arxiv.org/pdf/2410.11792
👉Project https://lnkd.in/d6bHF_-s

🔥 CoTracker3 by #META is out! 🔥

👉#Meta (+VGG Oxford) unveils CoTracker3, a new tracker that outperforms the previous SoTA by a large margin using only 0.1% of the training data 🤯🤯🤯

👉Review https://t.ly/TcRIv
👉Paper arxiv.org/pdf/2410.11831
👉Project cotracker3.github.io/
👉Code github.com/facebookresearch/co-tracker

🦠 Neural Metamorphosis 🦠

👉NU Singapore unveils NeuMeta to transform neural nets by allowing a single model to adapt on the fly to different sizes, generating the right weights when needed.

👉Review https://t.ly/DJab3
👉Paper arxiv.org/pdf/2410.11878
👉Project adamdad.github.io/neumeta
👉Code github.com/Adamdad/neumeta

☀️ GS + Depth = SOTA ☀️

👉DepthSplat is the new SOTA in depth estimation & novel view synthesis. The key feature is the cross-task interaction between Gaussian Splatting & depth estimation. Source code to be released soon💙

👉Review https://t.ly/87HuH
👉Paper arxiv.org/abs/2410.13862
👉Project haofeixu.github.io/depthsplat/
👉Code github.com/cvg/depthsplat

🔥 BitNet: code of 1-bit LLM released 🔥

👉BitNet by #Microsoft, announced in late 2023, is a 1-bit Transformer architecture designed for LLMs. It introduces BitLinear as a drop-in replacement for the nn.Linear layer to train 1-bit weights from scratch. Source code just released 💙

👉Review https://t.ly/3G2LA
👉Paper arxiv.org/pdf/2310.11453
👉Code https://lnkd.in/duPADJVb
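
👉For a feel of what a 1-bit linear layer does, here is a minimal, framework-free Python sketch. It assumes the weight binarization described in the BitNet paper: weights are centered on their mean, reduced to their sign, and the output is rescaled by the mean absolute weight. It is a simplified forward pass only, leaving out activation quantization, normalization and the gradient trick needed for training. The function names are illustrative and are not taken from Microsoft's released code.

```python
def binarize_weights(weights):
    """Return (signs, scale) for a weight matrix given as a list of rows."""
    flat = [w for row in weights for w in row]
    mean = sum(flat) / len(flat)
    # Scale factor: mean absolute value of the original weights.
    scale = sum(abs(w) for w in flat) / len(flat)
    # 1-bit weights: +1 where the centered weight is positive, else -1.
    signs = [[1 if w - mean > 0 else -1 for w in row] for row in weights]
    return signs, scale


def bit_linear_forward(x, weights):
    """Simplified forward pass: y = scale * (signs @ x) for one input vector."""
    signs, scale = binarize_weights(weights)
    return [scale * sum(s * xi for s, xi in zip(row, x)) for row in signs]


# Hypothetical full-precision weights (2 outputs, 2 inputs) and one input.
weights = [[0.5, -1.5], [2.0, 1.0]]
signs, scale = binarize_weights(weights)
y = bit_linear_forward([1.0, 2.0], weights)
print(signs, scale, y)
```

Each weight is stored as a single sign, plus one shared scale for the whole matrix, which is where the memory saving comes from.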

🧿 Look Ma, no markers 🧿

👉#Microsoft unveils the first technique for marker-free, HQ reconstruction of the COMPLETE human body, including eyes and tongue, without requiring any calibration, manual intervention or custom hardware. Impressive results! Repo for training & dataset released💙

👉Review https://t.ly/5fN0g
👉Paper arxiv.org/pdf/2410.11520
👉Project microsoft.github.io/SynthMoCap/
👉Repo github.com/microsoft/SynthMoCap

PL2Map: efficient neural 2D-3D

👉PL2Map is a novel neural network tailored for efficient representation of complex point & line maps, a natural representation of 2D-3D correspondences.

👉Review https://t.ly/D-bVD
👉Paper arxiv.org/pdf/2402.18011
👉Project https://thpjp.github.io/pl2map
👉Code https://github.com/ais-lab/pl2map

🌻 Plant Camouflage Detection 🌻

👉PlantCamo Dataset is the first dataset for plant camouflage detection: 1,250 images with camouflage characteristics. Source Code released 💙

👉Review https://t.ly/pYFX4
👉Paper arxiv.org/pdf/2410.17598
👉Code github.com/yjybuaa/PlantCamo

⛈️ SMITE: SEGMENT IN TIME ⛈️

👉SFU unveils SMITE, a novel AI that, given only one or a few fine-grained segmentation references, can segment different unseen videos while respecting those references. Dataset & code (under Apache 2.0) announced 💙

👉Review https://t.ly/w6aWJ
👉Paper arxiv.org/pdf/2410.18538
👉Project segment-me-in-time.github.io/
👉Repo github.com/alimohammadiamirhossein/smite