AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
⏏️Ensembling models for GAN training⏏️

👉Pretrained vision models improve GAN training: FID improves by 1.5 to 2×!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
CV models as ensemble of discriminators
Improves GANs in both limited-data and large-scale settings
With only 10k samples, matches StyleGAN2 trained on 1.6M
Source code / models under MIT license

More: https://bit.ly/3wgUVsr
🤯6🔥2
🤯Cooperative Driving + AUTOCASTSIM🤯

👉COOPERNAUT: cross-vehicle perception for vision-based cooperative driving

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
UTexas + #Stanford + #Sony #AI
LiDAR encoded into compact point-based representations
Network-augmented simulator
Source code and models available

More: https://bit.ly/3sr5HLk
🔥6🤯3🥰1
💄NeuralHDHair: 3D Neural Hair💄

👉NeuralHDHair: fully automatic system for modeling HD hair from a single image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
IRHairNet for hair geometric features
GrowingNet: 3D hair strands in parallel
VIFu: novel voxel-aligned implicit function
SOTA in 3D hair modeling from single pic

More: https://bit.ly/38iR0mQ
👍5🥰31
🐡DyNeRF: Neural 3D Video Synthesis🐡

👉#Meta unveils DyNeRF, a novel approach to rendering HQ 3D video

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel NeRF based on temporal latent codes
Novel training based on hierarchical step
Datasets of time-synch/calibrated clips
License: Attribution-NonCommercial 4.0 International

More: https://bit.ly/3MlBRA9
🤯8👍2🔥1🤩1
🍋GATO: agent for multiple tasks🍋

👉The same network with the same weights can play Atari, caption pics, chat, and more🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
General-purpose agent, multiple tasks
Multi-modal-task, multi-embodiment
Inspired by large-scale language model

More: https://bit.ly/3LbBOWb
🤯103👍2🔥2
🪐NeRF powered by keypoints🪐

👉ETHZ + META unveil how to encode relative spatial #3D info via sparse 3D keypoints

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Sparse 3D keypoints for SOTA avatars
Unseen subjects from 2/3 views
Never-before-seen iPhone captures

More: https://bit.ly/39NQqhe
🤯5🔥21👍1
🐌Self-Supervised human co-evolution🐌

👉Self-supervised 3D by co-evolution of pose estimator, imitator, and hallucinator

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel self-supervised 3D pose
Co-evo of pose, imitator, hallucinator
Realistic 3D poses and 2D-3D supervision
Source code / model under MIT license

More: https://bit.ly/37J5ImL
🔥4👍31🤯1
🐲 Diff-SDF #3D Rendering 🐲

👉Reconstruction with no complex reg. or priors, using only a per-pixel RGB loss

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Diff-render to optimize geometry/albedo
No ad-hoc object mask or supervision
Extended sphere tracing algorithm

More: https://bit.ly/3yKWPnI
🤯10👍4🔥21🤩1
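For readers new to the "sphere tracing" mentioned in the highlights, here is a minimal sketch of the classic algorithm, not the extended, differentiable version from the paper. The scene (a unit sphere at z = 3) and the names `sphere_sdf` and `sphere_trace` are assumptions made for illustration.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 3.0), radius=1.0):
    """Signed distance from point p to a sphere (hypothetical test scene)."""
    return math.dist(p, center) - radius

def sphere_trace(origin, direction, sdf, max_steps=128, eps=1e-4, t_max=100.0):
    """March along a unit-length ray, stepping by the SDF value each time.

    Returns the ray parameter t at the first hit, or None on a miss.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:   # close enough to the surface: report a hit
            return t
        t += dist        # safe step: no surface is closer than dist
        if t > t_max:    # ray left the scene
            break
    return None

hit = sphere_trace((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), sphere_sdf)
miss = sphere_trace((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), sphere_sdf)
```

The step size is always the current distance value, which is why the ray can never jump through a surface as long as the function is a true signed distance.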
👄LVD: new SOTA for #3D human👄

👉Corona et al. unveil a novel method for 3D human model fitting

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Solution via neural field
Not sensitive to initialization
SOTA in shape from single pic
SOTA in fitting 3D scans

More: https://bit.ly/3Ng4lLr
👍4🔥2🤯1
🏳️‍🌈Deep Clustering on ImageNet & Co.🏳️‍🌈

👉World's first deep nonparametric clustering on large datasets such as ImageNet

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Deep clustering that infers nr. of clusters
Loss: amortized inference in mixt-models
Deep nonparametric clustering on ImageNet
Code and model available under MIT license

More: https://bit.ly/38p62rn
🔥9🤯3👍2🤩2
💥HQ-E²FGVI just released💥💥

👉Flow-Guided Video Inpainting through three trainable modules

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Flow, pixel-prop, content hallucination
Three stage-modules, jointly optimized
The new SOTA, promising efficiency
Code and Models under MIT license

More: https://bit.ly/3Ln0ICj
🤯10👍1😱1
🪔 AvatarCLIP: Text-Driven Avatar 🪔

👉Zero-shot text-driven generation of #3D avatars for the #metaverse

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
First text-driven synthesis
Shape, texture, and motion
Animation-ready, HQ texture/geometry
Zero-shot text-guided ref-based motion
Code and model under MIT license

More: https://bit.ly/3LjTWgB
🔥4👍2🤯21
🔥#AIwithPapers: we are 2,500!🔥

💙💛Only 2 Billion papers remaining on arXiv. The more we are, the faster we read💙💛

😈 Invite your friends -> https://t.iss.one/AI_DeepLearning
🔥94👍2🤔2👏1
💥Podcasting AI & CV💥

👉🏼For people fluent in Italian: a 1-hour podcast in which I talk about AI, CV, startups and more (including this wonderful project).

More: https://bit.ly/38DtBwB
👏63👍1
🔥Inpainting: new SOTA! INSANE🔥

👉Novel two-stream approach: inpainting at the next level!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
High-freq locally, low-freq globally
Local to global -> error correction
44% / 26% improvements FID/scores
Source code, more clips available

More: https://bit.ly/3ltIX9R
👍8🤯3🔥1🥰1
🔥Super-Human Crossword Solver🔥

👉A crossword solver that outperforms the best human solvers

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Crossword solving based on NNs
Q&A, structured decoding, local search
Wide domains with perfect accuracy
Large question-answer dataset

More: https://bit.ly/3a3zzqQ
🔥4🤯3👏2👍1
🥸Imagen: far beyond DALL·E 2🥸

👉#Google: unprecedented photorealism and deep level of language understanding

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Dynamic thresholding for diffusion sampling
Efficient U-Net, efficient++ variant
DrawBench, a new text-to-image benchmark
The new SOTA, COCO FID of 7.27

More: https://bit.ly/3lVtkbz
🔥9🤯6👍1
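A rough sketch of the dynamic thresholding idea, as it is usually described: at each sampling step, take a high percentile `s` of the absolute values in the predicted clean image, clip to `[-s, s]`, and rescale by `s` so values stay in `[-1, 1]`. The function name, the default percentile, and the NumPy formulation are assumptions for illustration, not Google's code.

```python
import numpy as np

def dynamic_threshold(x0_pred, percentile=99.5):
    """Clip a predicted image to [-s, s] and rescale into [-1, 1].

    s is a percentile of |x0_pred|, floored at 1.0 so that predictions
    already inside [-1, 1] are returned unchanged.
    """
    s = max(float(np.percentile(np.abs(x0_pred), percentile)), 1.0)
    return np.clip(x0_pred, -s, s) / s

# Saturated prediction: values are pulled back into [-1, 1].
saturated = dynamic_threshold(np.array([-3.0, -0.5, 0.0, 0.5, 3.0]), percentile=100)
# In-range prediction: left as-is because s is floored at 1.0.
in_range = dynamic_threshold(np.array([-0.5, 0.2]))
```

Compared with static clipping to `[-1, 1]`, this keeps the relative contrast of a saturated prediction instead of flattening every out-of-range pixel to the same value.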
🪤Tracking over SOTA detectors🪤

👉Lightweight Python lib for real-time 2D object tracking 💥

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Layer of tracking over SOTA detectors
Suitable for complex video processing
Source code under BSD 3-Clause
Maintained by Tryolabs team

More: https://bit.ly/3wKtGqg
👍7🔥3🤩3
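To make "a layer of tracking over detectors" concrete, here is a minimal sketch of one common way to do it: greedily matching each frame's boxes to existing tracks by IoU. This is not the library's API or its matching method; `GreedyIoUTracker`, `iou`, and the 0.3 threshold are hypothetical names and values chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

class GreedyIoUTracker:
    """Assigns persistent ids to per-frame detections from any detector."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}   # track id -> last seen box
        self._next_id = 0

    def update(self, detections):
        """Return one track id per detection, in input order."""
        pairs = sorted(
            ((iou(box, det), tid, j)
             for tid, box in self.tracks.items()
             for j, det in enumerate(detections)),
            reverse=True,
        )
        assigned, used = {}, set()
        for score, tid, j in pairs:        # best overlaps first
            if score < self.iou_threshold:
                break
            if tid in used or j in assigned:
                continue
            assigned[j] = tid
            used.add(tid)
        ids, new_tracks = [], {}
        for j, det in enumerate(detections):
            tid = assigned.get(j)
            if tid is None:                # unmatched detection starts a new track
                tid = self._next_id
                self._next_id += 1
            new_tracks[tid] = det
            ids.append(tid)
        self.tracks = new_tracks           # unmatched tracks are dropped at once
        return ids

tracker = GreedyIoUTracker()
frame1 = tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])
frame2 = tracker.update([(51, 51, 61, 61), (1, 1, 11, 11)])
frame3 = tracker.update([(200, 200, 210, 210)])
```

A production tracker would normally add motion prediction and keep lost tracks alive for a few frames; this sketch drops them immediately to stay short.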
🥷🏿 FCA: #3D Neural Camouflage 🥷🏿

👉#3D full-camouflage adversarial patch to fool neural detectors

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Attack by diff-neural render
E2E physical adversarial attack
Envs, vehicles & detectors
Source code available!

More: https://bit.ly/38kKyfa
👍5🔥3🤯2👏1