AI with Papers - Artificial Intelligence & Deep Learning
15.2K subscribers
135 photos
247 videos
14 files
1.31K links
All the AI with papers. Fresh updates every day about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🔥 CoTracker3 by #META is out! 🔥

👉#Meta (+VGG Oxford) unveils CoTracker3, a new tracker that outperforms the previous SoTA by a large margin using only 0.1% of the training data 🤯🤯🤯

👉Review https://t.ly/TcRIv
👉Paper arxiv.org/pdf/2410.11831
👉Project cotracker3.github.io/
👉Code github.com/facebookresearch/co-tracker
🦠 Neural Metamorphosis 🦠

👉NUS (National University of Singapore) unveils NeuMeta, which transforms neural nets by letting a single model adapt on the fly to different sizes, generating the right weights when needed.

👉Review https://t.ly/DJab3
👉Paper arxiv.org/pdf/2410.11878
👉Project adamdad.github.io/neumeta
👉Code github.com/Adamdad/neumeta
β˜€οΈ GS + Depth = SOTA β˜€οΈ

πŸ‘‰DepthSplat, the new SOTA in depth estimation & novel view synthesis. The key feature is the cross-task interaction between Gaussian Splatting & depth estimation. Source Code to be released soonπŸ’™

πŸ‘‰Review https://t.ly/87HuH
πŸ‘‰Paper arxiv.org/abs/2410.13862
πŸ‘‰Project haofeixu.github.io/depthsplat/
πŸ‘‰Code github.com/cvg/depthsplat
🔥BitNet: code of 1-bit LLM released🔥

👉BitNet by #Microsoft, announced in late 2023, is a 1-bit Transformer architecture designed for LLMs. It introduces BitLinear as a drop-in replacement for the nn.Linear layer, so that 1-bit weights can be trained from scratch. Source Code just released 💙

👉Review https://t.ly/3G2LA
👉Paper arxiv.org/pdf/2310.11453
👉Code https://lnkd.in/duPADJVb
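
👉A rough sketch of the weight side of a BitLinear-style layer, in plain Python. This is not the released Microsoft code: it assumes weights are zero-centered, binarized to +1/-1 with a sign function, and rescaled by their mean absolute value. Activation quantization, normalization and the training-time gradient trick are left out, and the names binarize_weights and bit_linear are made up for illustration.

```python
def binarize_weights(weights):
    """Return (matrix of +1/-1 signs, scale beta) for a 2-D list of floats."""
    flat = [w for row in weights for w in row]
    alpha = sum(flat) / len(flat)                  # mean, used to zero-center
    beta = sum(abs(w) for w in flat) / len(flat)   # mean absolute value (scale)
    signs = [[1 if w - alpha > 0 else -1 for w in row] for row in weights]
    return signs, beta


def bit_linear(x, weights):
    """Matrix-vector product using 1-bit weights plus one shared scale."""
    signs, beta = binarize_weights(weights)
    return [beta * sum(s * xi for s, xi in zip(row, x)) for row in signs]


weights = [[0.5, -1.5], [2.0, -1.0]]  # toy full-precision weights
x = [1.0, 2.0]                        # toy input vector
print(binarize_weights(weights))
print(bit_linear(x, weights))
```

With binarized weights the products inside the sum reduce to additions and subtractions, and only one floating-point scale per layer is kept.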
🧿 Look Ma, no markers 🧿

👉#Microsoft unveils the first technique for marker-free, HQ reconstruction of the COMPLETE human body, including eyes and tongue, without requiring any calibration, manual intervention or custom hardware. Impressive results! Repo for training & Dataset released 💙

👉Review https://t.ly/5fN0g
👉Paper arxiv.org/pdf/2410.11520
👉Project microsoft.github.io/SynthMoCap/
👉Repo github.com/microsoft/SynthMoCap
PL2Map: efficient neural 2D-3D

👉PL2Map is a novel neural network tailored for efficient representation of complex point & line maps, providing a natural representation of 2D-3D correspondences.

👉Review https://t.ly/D-bVD
👉Paper arxiv.org/pdf/2402.18011
👉Project https://thpjp.github.io/pl2map
👉Code https://github.com/ais-lab/pl2map
🌻 Plant Camouflage Detection🌻

👉PlantCamo Dataset is the first dataset for plant camouflage detection: 1,250 images with camouflage characteristics. Source Code released 💙

👉Review https://t.ly/pYFX4
👉Paper arxiv.org/pdf/2410.17598
👉Code github.com/yjybuaa/PlantCamo
β›ˆοΈ SMITE: SEGMENT IN TIME β›ˆοΈ

πŸ‘‰SFU unveils SMITE: a novel AI that -with only one or few segmentation references with fine granularity- is able to segment different unseen videos respecting the segmentation references. Dataset & Code (under Apache 2.0) announced πŸ’™

πŸ‘‰Review https://t.ly/w6aWJ
πŸ‘‰Paper arxiv.org/pdf/2410.18538
πŸ‘‰Project segment-me-in-time.github.io/
πŸ‘‰Repo github.com/alimohammadiamirhossein/smite
🫐 Blendify: #Python + Blender 🫐

👉Lightweight Python framework that provides a high-level API for creating & rendering scenes with #Blender. It simplifies data augmentation & synthesis. Source Code released 💙

👉Review https://t.ly/l0crA
👉Paper https://arxiv.org/pdf/2410.17858
👉Code https://virtualhumans.mpi-inf.mpg.de/blendify/
🔥 D-FINE: new SOTA Detector 🔥

👉D-FINE is a powerful real-time object detector that achieves outstanding localization precision by redefining the bounding box regression task in DETR models. New SOTA on MS COCO with additional data. Code & models available 💙

👉Review https://t.ly/aw9fN
👉Paper https://arxiv.org/pdf/2410.13842
👉Code https://github.com/Peterande/D-FINE
🍜 REM: Segment What You Describe 🍜

👉REM is a framework for segmenting concepts in video that can be described via LLM. Suitable for rare & non-object dynamic concepts, such as waves, smoke, etc. Code & Data announced 💙

👉Review https://t.ly/OyVtV
👉Paper arxiv.org/pdf/2410.23287
👉Project https://miccooper9.github.io/projects/ReferEverything/
β˜€οΈ Universal Relightable Avatars β˜€οΈ

πŸ‘‰#Meta unveils URAvatar, photorealistic & relightable avatars from phone scan with unknown illumination. Stunning results!

πŸ‘‰Review https://t.ly/U-ESX
πŸ‘‰Paper arxiv.org/pdf/2410.24223
πŸ‘‰Project junxuan-li.github.io/urgca-website
🏣 CityGaussianV2: Large-Scale City 🏣

👉A novel approach for large-scale scene reconstruction that addresses critical challenges related to geometric accuracy and efficiency: 10x compression, 25% faster & -50% memory! Source code released 💙

👉Review https://t.ly/Xgn59
👉Paper arxiv.org/pdf/2411.00771
👉Project dekuliutesla.github.io/CityGaussianV2/
👉Code github.com/DekuLiuTesla/CityGaussian
💪 Muscles in Time Dataset 💪

👉Muscles in Time (MinT) is a large-scale synthetic muscle activation dataset. MinT contains 9+ hours of simulation data covering 227 subjects and 402 simulated muscle strands. Code & Dataset available soon 💙

👉Review https://t.ly/108g6
👉Paper arxiv.org/pdf/2411.00128
👉Project davidschneider.ai/mint
👉Code github.com/simplexsigil/MusclesInTime
🧠 Single Neuron Reconstruction 🧠

👉SIAT unveils NeuroFly, a framework for large-scale single neuron reconstruction. It formulates the neuron reconstruction task as a streamlined 3-stage workflow: automatic segmentation, connection, and manual proofreading. Bridging computer vision and neuroscience 💙

👉Review https://t.ly/Y5Xu0
👉Paper https://arxiv.org/pdf/2411.04715
👉Repo github.com/beanli161514/neurofly
🫠 X-Portrait 2: SOTA(?) Portrait Animation 🫠

👉ByteDance unveils a preview of X-Portrait2, the new SOTA expression encoder model that implicitly encodes every minuscule expression from the input, thanks to training on large-scale datasets. Impressive results but no paper & code announced.

👉Review https://t.ly/8Owh9 [UPDATE]
👉Paper ?
👉Project byteaigc.github.io/X-Portrait2/
👉Repo ?
❄️Don’t Look Twice: ViT by RLT❄️

πŸ‘‰CMU unveils RLT: speeding up the video transformers inspired by run-length encoding for data compression. Speed the training up and reducing the token count by up to 80%! Source Code announced πŸ’™

πŸ‘‰Review https://t.ly/ccSwN
πŸ‘‰Paper https://lnkd.in/d6VXur_q
πŸ‘‰Project https://lnkd.in/d4tXwM5T
πŸ‘‰Repo TBA
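
👉A toy illustration of the run-length idea applied to video tokens, in plain Python. It is only a sketch and not the CMU implementation: each patch is reduced to a single float, a patch counts as repeated when it differs from the same position in the previous frame by at most a threshold, and the function name, token layout and threshold are all assumptions made for this example.

```python
def run_length_tokenize(frames, threshold=0.0):
    """frames: list of frames, each a list of patch values (floats here).

    Returns tokens shaped [frame_idx, patch_idx, value, run_length].
    A patch that repeats the previous frame is dropped and only
    extends the run length of the token already covering its position.
    """
    tokens = []
    active = {}  # patch_idx -> index in `tokens` of the current run
    for t, frame in enumerate(frames):
        for p, value in enumerate(frame):
            repeated = t > 0 and abs(value - frames[t - 1][p]) <= threshold
            if repeated:
                tokens[active[p]][3] += 1   # extend the run, emit nothing new
            else:
                active[p] = len(tokens)     # start a new run at this position
                tokens.append([t, p, value, 1])
    return tokens


# Three frames with two patches each: patch 0 is static, patch 1 changes once.
frames = [[1.0, 5.0], [1.0, 6.0], [1.0, 6.0]]
for token in run_length_tokenize(frames):
    print(token)
```

In this toy clip 6 patches collapse into 3 tokens, and the run length keeps track of how long each token stays valid.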
πŸ”SeedEdit: foundational T2IπŸ”

πŸ‘‰ByteDance unveils a novel T2I foundational model capable of delivering stable, high-aesthetic image edits which maintain image quality through unlimited rounds of editing instructions. No code announced but a Demo is onlineπŸ’™

πŸ‘‰Review https://t.ly/hPlnN
πŸ‘‰Paper https://arxiv.org/pdf/2411.06686
πŸ‘‰Project team.doubao.com/en/special/seededit
πŸ€—Demo https://huggingface.co/spaces/ByteDance/SeedEdit-APP
🔥 4 NanoSeconds inference 🔥

👉LogicTreeNet: a convolutional differentiable logic gate network with logic gate tree kernels, bringing Computer Vision into differentiable LGNs. Up to 6100% smaller than SOTA, inference in 4 NANOsecs!

👉Review https://t.ly/GflOW
👉Paper https://lnkd.in/dAZQr3dW
👉Full clip https://lnkd.in/dvDJ3j-u
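
👉For intuition, a minimal sketch of a single differentiable logic gate in plain Python. It shows the generic relaxation commonly used for differentiable logic gate networks (a softmax-weighted mix of real-valued versions of the 16 two-input Boolean functions), not LogicTreeNet's convolutional tree kernels. The names GATES, soft_gate and hard_gate are illustrative assumptions.

```python
import math

# Real-valued relaxations of the 16 two-input Boolean functions.
# For inputs in {0, 1} each one reproduces its exact truth table.
GATES = [
    lambda a, b: 0.0,                      # FALSE
    lambda a, b: a * b,                    # A AND B
    lambda a, b: a - a * b,                # A AND NOT B
    lambda a, b: a,                        # A
    lambda a, b: b - a * b,                # NOT A AND B
    lambda a, b: b,                        # B
    lambda a, b: a + b - 2 * a * b,        # XOR
    lambda a, b: a + b - a * b,            # OR
    lambda a, b: 1 - (a + b - a * b),      # NOR
    lambda a, b: 1 - (a + b - 2 * a * b),  # XNOR
    lambda a, b: 1 - b,                    # NOT B
    lambda a, b: 1 - b + a * b,            # A OR NOT B
    lambda a, b: 1 - a,                    # NOT A
    lambda a, b: 1 - a + a * b,            # NOT A OR B
    lambda a, b: 1 - a * b,                # NAND
    lambda a, b: 1.0,                      # TRUE
]


def soft_gate(a, b, logits):
    """Training-time view: softmax-weighted mixture of all 16 gates."""
    m = max(logits)                        # subtract max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return sum(e / total * g(a, b) for e, g in zip(exps, GATES))


def hard_gate(a, b, logits):
    """Inference-time view: keep only the most likely gate."""
    return GATES[logits.index(max(logits))](a, b)


logits = [0.0] * 16
logits[6] = 50.0                           # this neuron has "learned" XOR
print(soft_gate(1.0, 0.0, logits), hard_gate(1.0, 0.0, logits))
```

After training, each neuron collapses to one fixed gate, so inference needs only plain Boolean operations.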