AI with Papers - Artificial Intelligence & Deep Learning
15.2K subscribers
135 photos
247 videos
14 files
1.31K links
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🎁 StyleNeRF source code is out 🎁

👉3D-consistent photo-realistic image synthesis

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
NeRF + style generator
3D consistency for HD images
Novel regularization loss
Camera control on styles

More: https://bit.ly/3t5xC49
🔥4🥰1🤯1
🦎CLD-based generative #AI by #Nvidia🦎

👉Nvidia unveils a novel critically-damped Langevin diffusion (CLD) for synthetic data

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
A novel diffusion process for SGMs
Novel score matching objective for CLD
Hybrid denoising score matching
Efficient sampling from CLD model
Source code under a specific license

More: https://bit.ly/35MToBe
🔥2🤩2👍1🤯1
🛸UFO: segmentation @140+ FPS🛸

👉Unified Transformer Framework for Co-Segmentation, Co-Saliency & Salient Object Detection. All in one!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Unified framework for co-segmentation
Co-segmentation, co-saliency, saliency
Block for long-range dependencies
Reaches 140+ FPS at inference
The new SOTA on multiple datasets
Source code under MIT License

More: https://bit.ly/3KLd9b9
🔥6👍1🤯1
👗 Multi-GANs fashion 👗

👉Global GAN blended with other GANs for faces, shoes, etc.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Multi-GAN framework
Several generators
Free of artifacts
Full-body generation
Humans, 1024x1024

More: https://bit.ly/37mfOte
🔥2👏21🤯1
🚧 FLAG: #3D Avatar Generation 🚧

👉A flow-based generative model of the 3D human body from sparse observations.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
FLow-based Avatar Generation
Conditional distribution of body pose
Exact pose likelihood process
Invertibility -> oracle latent code

More: https://bit.ly/3CQpk3p
👏2🔥1🤯1
💃 Dancing in the wild with StyleGAN 💃

👉StyleGAN-based animations for AR/VR apps

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Video based motion retargeting
Based on a StyleGAN architecture
Novel explicit motion representation
SOTA qualitatively & quantitatively

More: https://bit.ly/3CZbL1W
👍6🤯3🥰2
🪀TensoRF: the 4D evolution of NeRF 🪀

👉TensoRF: a novel radiance field representation as a 4D tensor, i.e. a 3D voxel grid with per-voxel multi-channel features.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
VM decomposition technique
Low-rank tensor factorization
Lower memory footprint and higher speed
TensoRF is the new SOTA in radiance fields
Code under the MIT License

More: https://bit.ly/3qffZgI
👍2🔥1
🔼 GAN-meshes without key-points 🔼

👉ETH unveils a GAN framework for generating textured triangle meshes without annotations

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Generation of textured meshes
3D generator for all categories
3D pose estimation framework
Code licensed under MIT License

More: https://bit.ly/3qfH9nJ
🤩3🤯2👍1🔥1
🐯 S.S. Latent Image Animator 🐯

👉Self-supervised autoencoder that animates unseen images by linear navigation in latent space

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Latent Image Animator
Linear displacement in latent space
SOTA: VoxCeleb, Taichi, TED-talk
Source code available soon

More: https://bit.ly/36pgLAC
👍5🔥3🤯2💩1
🪨 Google URF for neural-synthesis 🪨

👉Sequences of RGB + lidar -> reconstructed 3D surfaces and synthesized novel RGB images

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Extending Neural Radiance Fields
Leveraging asynch. lidar data
Addressing exposure variation
Leveraging segmentations for sky
SOTA #3D reconstruction/synthesis

More: https://bit.ly/3L2vTDb
🔥11👍4👏1🤯1
🚛 AV2: next-gen. self driving 🚛

👉One of the biggest datasets ever for #autonomousdriving

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
1k seq. of multimodal data
3D annotations, 26 categories
20k lidar & map-aligned pose
250k challenging interactions
HD Map: 3D lane & crosswalk
CC BY-NC-SA 4.0 license

More: https://bit.ly/3trx3lw
🔥3👍1🤯1
🤖CaTGrasp in Clutter from Simulation🤖

👉Task-relevant grasping: trained solely in simulation with synthetic data + self-supervised hand-object interaction

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel category-level, task-relevant grasping
S.S. hand-object-contact
Tiny objects from dense clutter
Trained in simulation -> transfers to real
Source code under Apache 2.0

More: https://bit.ly/3L2YVCo
👍1🔥1
🛼 Drive & Segment without Supervision 🛼

👉Learning pixel-wise semantic segmentation from non-curated data collected by cars (cameras + LiDAR) driving around a city

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Cross-modal unsupervised
Synchronized LiDAR & RGB
Object proposal on LiDAR points
SOTA, significant improvements

More: https://bit.ly/3L0wWTW
👍3🔥1🤯1
🌍 NeRF-free Neural Rendering 🌍

👉A simple 2D-only method with a single pass of a neural network

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Synthesis with NO 3D reasoning
Autoregressive & masked transformer
Pose -> object, object -> pose
Attention: branching attention
Source code under MIT License

More: https://bit.ly/3JC7unt
🔥3😱2👍1🤩1
🤓👌Hey, TAKE OFF my eyeglasses! 😙👌

👉A novel framework to remove eyeglasses as well as their cast shadows from faces

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel mask-guided multi-step network
Leveraging 3D synthetic data only
Synthetic portraits with supervision
Eyeglasses & shadows simultaneously

More: https://bit.ly/3IvQzlf
👍7🔥1
🏥 #AI models/dataset for open surgery 🏥

👉Multi-task #AI model/dataset of real-time surgical behaviors, hands, and tools.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Annotated Videos of Open Surgery
Largest dataset of open surgery videos
2k clips and 23 procedures
12k annotations, 11k+ keypoints
Models/Dataset soon available!

More: https://bit.ly/3tvDdkK
👍8🤯1😱1
🥽 #metaverse in 1991 🥽

👉Q: is #VR the technology that has developed the least over the last 30 years? 🤔

Discussion: https://bit.ly/3txWF07
👍3🤬3🥰1🤯1
🫕NeRFusion: Large-Scale Reconstruction🫕

👉Efficient large-scale reconstruction & photo-realistic rendering

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Frame-by-frame R.F.
Neural reconstruction
Real-time at 20+ fps
SOTA on indoor / objects

More: https://bit.ly/3iyfoCo
🤯7🔥4👍3👏2
ORViT for understanding tasks

👉ORViT: an object-centric approach that extends ViT layers by incorporating object representations

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Spatio-temporal through the net
"Object-Region Attention"
"Object-Dynamics" module
Code just released! Apache 2.0

More: https://bit.ly/3wAUavW
🔥5👍3😱2👏1
🪅Insane Neural Sketching from #MIT🪅

👉Line drawing generation as unsupervised image translation with various losses

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Unpaired method for line drawing
Geometry loss to predict depth
Semantic loss to match CLIP feats
SOTA on unpaired translation/generation
Code and Models under MIT License

More: https://bit.ly/36JRr8A
🤯7🔥41👍1🥰1👏1😁1