AI with Papers - Artificial Intelligence & Deep Learning
All the AI news, with papers. Fresh daily updates on #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI

🐲A novel AI-controllable synthesis🐲

👉Modeling local semantic parts separately and synthesizing images in a compositional way

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Structure & texture locally controlled
Disentanglement between areas
Fine-grained editing of images
Extendible via transfer learning
Just accepted to #CVPR2022

More: https://bit.ly/3IBgkBy

🥣 #AI-Generation with Dream Fields 🥣

👉Neural rendering with multi-modal image and text representations

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Aligned image & text models
3D from natural language
No additional data
Dream Fields neural scene representation

More: https://bit.ly/3Mhwm5D

🟪 Mip-NeRF 360 for unbounded scenes 🟪

👉An extension of NeRF to overcome the challenges presented by unbounded scenes

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Realistic synthesized views
Intricate/unbounded scenes
Detailed depth maps
Mean-squared error reduced by 54%
No code provided 😥

More: https://bit.ly/36ZxsD4

🐓 PINA: personal Neural Avatar 🐓

👉A novel method to acquire neural avatars from RGB-D videos

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Users get a virtual copy of themselves
Realistic clothing deformations
Shape & non-rigid deformation
Avatars from RGB-D sequences
Creative Commons Zero v1.0

More: https://bit.ly/3HAtRIh

🐦 EfficientVIS: new SOTA for VIS 🐦

👉Simultaneous classification, segmentation, and tracking of multiple object instances in videos

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Efficient and fully end-to-end
Iterative query-video interaction
First RoI-wise clip-level RT-VIS
Requires 15× fewer epochs

More: https://bit.ly/3KfqurN

🐠#AI-clips from a single frame🐠

👉Moving objects in #3D while generating a video by a sequence of desired actions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Playable environments
A single starting image🤯
Controllable camera
Unsupervised learning

More: https://bit.ly/35VDrYO

🧊Kubric: AI dataset generator🧊

👉Open-source #Python framework for photo-realistic scenes: full control, rich annotations, TBs of fresh data 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Synthetic datasets with GT
From NeRF to optical flow
Full control over data
Avoids privacy & licensing issues
Apache License 2.0

More: https://bit.ly/3hQCaFs

🪂µTransfer for enormous NNs 🪂

👉Microsoft unveils how to tune enormous neural networks

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
New HP tuning: µTransfer
Zero-shot transfer to the full model
Outperforms published BERT-large results
Outperforms published 6.7B GPT-3 results
Code under MIT license

More: https://bit.ly/3qc37Ij

🐧Semantic segmentation via text-only supervision🐧

👉GroupViT with a text encoder on a large-scale image-text dataset: semantic segmentation without any pixel-level annotations in training!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Hierarchical Grouping Vision Transformer
Additional text encoder
NO pixel-level annotations
Zero-shot transfer to semantic segmentation
Source code available soon

More: https://bit.ly/3hPGeWr

4D-Net: LiDAR + RGB synchronization

👉Google unveils 4D-Net to combine 3D LiDAR and onboard RGB camera

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Point clouds/images in time
Fusing multiple modalities in 4D
Novel sampling for 3D point clouds in time
New SOTA for 3D detection

More: https://bit.ly/3hZCFwN

🐌 New SOTA in video synthesis! 🐌

👉Snap unveils a novel multimodal video generation framework via text/images

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Multimodal video generation
Bidirectional transformer
Video tokens trained with self-learning
Text augmentation for robustness
Longer sequence synthesis

More: https://bit.ly/3hZLXsG

🎁 StyleNeRF source code is out 🎁

👉3D consistent photo-realistic image synthesis

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
NeRF + style generator
3D consistency for HD image
Novel regularization loss
Camera control on styles

More: https://bit.ly/3t5xC49

🦎CLD-based generative #AI by #Nvidia🦎

👉Nvidia unveils a novel critically-damped Langevin diffusion (CLD) for synthetic data

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
A novel diffusion process for SGMs
Novel score matching objective for CLD
Hybrid denoising score matching
Efficient sampling from CLD model
Source code under a specific license

More: https://bit.ly/35MToBe

🛸UFO: segmentation @140+ FPS🛸

👉Unified Transformer Framework for Co-Segmentation, Co-Saliency & Salient Object Detection. All in one!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Unified framework for co-segmentation
Co-segmentation, co-saliency, saliency
Block for long-range dependencies
Reaches 140 FPS at inference
The new SOTA on multiple datasets
Source code under MIT License

More: https://bit.ly/3KLd9b9

👗 Multi-GANs fashion 👗

👉Global GAN blended with other GANs for faces, shoes, etc.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Multi-GAN framework
Several generators
Free of artifacts
Full-body generation
Humans, 1024x1024

More: https://bit.ly/37mfOte

🚧 FLAG: #3D Avatar Generation 🚧

👉A flow-based generative model of the 3D human body from sparse observations.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
FLow-based Avatar Generative model (FLAG)
Conditional distribution of body pose
Exact pose likelihood process
Invertibility -> oracle latent code

More: https://bit.ly/3CQpk3p

💃 Dancing in the wild with StyleGAN 💃

👉StyleGAN-based animations for AR/VR apps

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Video-based motion retargeting
StyleGAN-based architecture
Novel explicit motion representation
SOTA qualitatively & quantitatively

More: https://bit.ly/3CZbL1W

🪀TensoRF: the 4D evolution of NeRF 🪀

👉TensoRF models a radiance field as a 4D tensor: a 3D voxel grid with per-voxel multi-channel features.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
VM decomposition technique
Low-rank tensor factorization
Lower memory footprint & faster reconstruction
TensoRF is the new SOTA in radiance fields
Code under the MIT License

More: https://bit.ly/3qffZgI

🔼 GAN-meshes without key-points 🔼

👉ETH unveils a GAN framework for generating textured triangle meshes without annotations

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Generation of textured meshes
3D generator for all categories
3D pose estimation framework
Code licensed under MIT License

More: https://bit.ly/3qfH9nJ

🐯 S.S. Latent Image Animator 🐯

👉Self-supervised autoencoder that animates unseen images by linear navigation in latent space

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Latent Image Animator
Linear displacement in latent space
SOTA: VoxCeleb, Taichi, TED-talk
Source code available soon

More: https://bit.ly/36pgLAC