AI with Papers - Artificial Intelligence & Deep Learning
15.4K subscribers
140 photos
253 videos
14 files
1.33K links
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
☀️ 4D Neural Relightable Humans ☀️

👉Relighting4D: free-viewpoint relighting of humans under unknown illumination

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Dynamic relighting from free viewpoints
Disentangled reflectance/geometry
SOTA on synthetic/real datasets
Code/models under MIT License

More: https://bit.ly/3RF3yH9
🔥9👍2
🍰 Long-Term Object Segmentation 🍰

👉XMem: object segmentation for long clips with unified feature memory stores

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Inspired by Atkinson–Shiffrin model
Stores with different temporal scales
Memory consolidation algorithm
Compact/powerful long-term memory
Source code and models available

More: https://bit.ly/3PP0EOn
🤯16👍5👏3
🔥Grand Unification of Object Tracking🔥

👉UNICORN: unified method for SOT, MOT, VOS, & MOTS with a single neural net. 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Great unification for 4 tracking tasks
Bridging methods / pixel-wise corresp.
SOTA on 8 challenging benchmarks
Source code under MIT License

More: https://bit.ly/3o74h6g
👍13🔥3🤯1😱1
🔥OmniBenchmark: CV beyond ImageNet🔥

👉 21 realms, 7,000+ concepts and 1M+ images. Far beyond ImageNet!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
vs. ImageNet: 2.5x realms, 9x concepts
Conciseness: no concept overlapping
ReCo: Relational Contrastive Learning
New supervised contrastive learning SOTA

More: https://bit.ly/3RJRKU0
🔥11🤩3
💣 HD Neural Avatar @130FPS 💣

👉Samsung unveils MegaPortraits: novel one-shot creation of HD neural human avatar

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
One-shot neural avatars, SOTA up to 512p
"Upgrading" to megapixel via more pics
First Neural Head Avatars in HD
Up to 130 FPS via #GPU

More: https://bit.ly/3oboWWT
🔥22👍1👏1
🦚 TimeLens++: Event-based Interpolation 🦚

👉Novel event-based interpolation with non-linear flow & multi-scale fusion

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel motion spline estimator
Non-linear continuous event/frames flow
Multi-feature fusion, gated compression
Novel hybrid dataset with 100+ videos

More: https://bit.ly/3yJyY6g
🔥16👍4
🪰NUWA-Infinity is out!🪰

👉∞ generation by #Microsoft: arbitrarily-sized HD images and long videos 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Unconditional Image Gen.
Text-to-Image/Text-to-Clip
Animation / Out-painting
Hi-res, arbitrarily long clips
NCP for patches caching

More: https://bit.ly/3zmBf9f
🔥7👍21👏1🤯1
🔥 #AIwithPapers: we are 3,500+! 🔥

💙💛 Ready for YOLO 10, 11, π, ∞, Ψ, and more? The more we are, the faster we catch 'em all 💙💛

😈 Invite your friends -> https://t.iss.one/AI_DeepLearning
👍1210😁5🔥3
🎷🎷OMNI3D: #3D Objects in the Wild🎷🎷

👉#3D detection: 234k images, 3M+ instances & 97 categories

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
OMNI3D built from publicly released datasets
234k pics, 3M+ annotations with 3D boxes
97 categories such as sofa, table, cars
Fast (450x) and exact algorithm for IoU
Cube R-CNN: novel 3D object detector

More: https://bit.ly/3cznjzG
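
For readers new to the metric behind the "exact algorithm for IoU" highlight, here is a minimal sketch of 3D intersection-over-union. It covers only the simpler axis-aligned case and is not the algorithm from the paper, which this post does not describe. The names `Box3D` and `iou_3d` are hypothetical and chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Box3D:
    """Axis-aligned 3D box given by its min and max corners (hypothetical helper)."""

    min_corner: tuple[float, float, float]
    max_corner: tuple[float, float, float]

    def volume(self) -> float:
        # Product of the three side lengths; a degenerate box has volume 0.
        v = 1.0
        for lo, hi in zip(self.min_corner, self.max_corner):
            v *= max(0.0, hi - lo)
        return v


def iou_3d(a: Box3D, b: Box3D) -> float:
    """Intersection-over-union of two axis-aligned boxes (assumption: no rotation)."""
    inter = 1.0
    for a_lo, a_hi, b_lo, b_hi in zip(
        a.min_corner, a.max_corner, b.min_corner, b.max_corner
    ):
        # Overlap length along this axis; no overlap on any axis means IoU is 0.
        overlap = min(a_hi, b_hi) - max(a_lo, b_lo)
        if overlap <= 0:
            return 0.0
        inter *= overlap
    union = a.volume() + b.volume() - inter
    return inter / union if union > 0 else 0.0
```

Boxes with arbitrary rotation need a polyhedron intersection rather than a per-axis overlap, so this sketch only illustrates the definition of the metric.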
👍11
👹Multiface Neural Rendering 👹

👉A new multi-view, hi-res dataset collected at #META Reality Labs for neural face rendering

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Mugsy, large scale multi-cam apparatus
High-Res sync facial performance
Closing the gap in accessing HQ data
Suitable for #VR & #mixedreality

More: https://bit.ly/3b6XfeL
🤯8👍3
💄DEVIANT: SOTA in mono-3D detection💄

👉A novel Depth EquiVarIAnt NeTwork for 3D monocular detection in the wild

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Michigan + #Meta + Ford 🤯
Depth-equi. + scale equiv. steerable
New SOTA on KITTI & Waymo
Good cross-dataset generalization

More: https://bit.ly/3OEFtgK
🔥16👍21
🧱 Assembling #LEGO with #AI 🧱

👉Translating step-by-step assembly manuals created by humans into machine-interpretable instructions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Stanford + MIT + #Google 🤯
MEPNet: Manual-to-Executable-Plan Net
Manual to machine-executable plan
2D manual - 3D geometric shape
Reasoning on 3D alignments of legos

More: https://bit.ly/3PCwn5C
🔥93
🎃New SOTA in UDA Semantic Seg.🎃

👉HRDA: multi-res Unsupervised Domain Adaptive Semantic Seg. -> SOTA

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
ETH + MPG + KU Leuven 🤯
HRDA: multi-res approach for UDA
Manageable GPU memory footprint
Small objects & fine segmentation detail
New SOTA on GTA and Synthia datasets

More: https://bit.ly/3cKtDEp
🤯8👍1
⚗️ SemAbs: 3D Scene Understanding ⚗️

👉Framework that equips 2D Vision-Language Models (VLMs) with new 3D spatial capabilities

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
2D VLMs with 3D reasoning skills
Efficient multi-scale relevancy extraction for ViTs
Novel Open-World understanding tasks
Completing partially observed objects
Finding hidden objects from language

More: https://bit.ly/3PYYk7d
🔥71👍1
🦚 TinyCD: Neural Change Detection 🦚

👉TinyCD: new SOTA in change detection with up to 150x fewer parameters.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
SOTA with up to 150X fewer params
Mixing blocks for spatio-temporal cross-correlation
PW-MLP for pixel wise classification
MAMB: novel block for skip connection

More: https://bit.ly/3zFEngk
16👍2👏1
🦊 3D-Aware "StyleGANv2" version 🦊

👉Upgrading StyleGANv2 into a novel 3D-aware GAN with just a minimal set of changes🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
MPI-like 3D-aware GAN w/ single-view
GMPI: generative multiplane image
Making a 2D GAN 3D-aware with minimal changes
Encoding 3D-aware inductive biases

More: https://bit.ly/3OJ5gnS
🤯6👍41
📺 NeRF-ing "The Big Bang Theory" 📺

👉Berkeley unveils an approach for accurate estimation of actor’s 3D pose & location

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Input: images across the whole season
3D context (i.e. cams, structure, body)
Integrating context in 3D estimation
Re-ID, gaze, cinematography, pic editing
Knock, Knock, Penny!

More: https://bit.ly/3OLuaUb
🔥7🤯5🥰21