AI with Papers - Artificial Intelligence & Deep Learning
15.2K subscribers
135 photos
247 videos
14 files
1.31K links
All the AI with papers. Fresh updates every day on #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🔥🔥FANs: Fully Attentional Networks🔥🔥

👉#Nvidia unveils the fully attentional networks (FANs)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Efficient fully attentional design
✅Semantic seg. & object detection
✅Model/source code soon available!

More: https://bit.ly/3vtpITs
🔥7 🤯3 👍2 ❤1
👨🏼‍🎨 Open-Source DALL·E 2 is out 👨🏼‍🎨

👉#Pytorch implementation of DALL-E 2, #OpenAI's latest text-to-image neural net.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅SOTA for text-to-image generation
✅Source code/model under MIT License
✅"Medieval painting of wifi not working"

More: https://bit.ly/3vzsff6
🤯14 👍6 😁1
⛺ViTPose: Transformer for Pose⛺

👉ViTPose from ViTAE, ViT for human pose

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Plain/nonhierarchical ViT for pose
✅Deconv-layers after ViT for keypoints
✅Just the baseline is the new SOTA
✅Source code & models available soon!

More: https://bit.ly/3MJ0kz1
👍5 🤯4 🔥1 🥰1
🧳 Unsupervised HD Motion Transfer 🧳

πŸ‘‰Novel e2e unsupervised motion transfer for image animation

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…TPS motion estimation + Dropout
βœ…Novel E2E unsupervised motion transfer
βœ…Optical flow + multi-res. occlusion mask
βœ…Code and models under MIT license

More: https://bit.ly/3MGNPns
πŸ”₯8πŸ‘6🀯4❀2😱2
🚀 Neural Self-Calibration in the wild 🚀

👉 Learning algorithm to regress calibration params from in-the-wild clips

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Params purely from self-supervision
✅S.S. depth/pose learning as objective
✅POV, fisheye, catadioptric: no changes
✅SOTA results on EuRoC MAV dataset

More: https://bit.ly/3w1n6LB
👍8 🤩2 🔥1 🥰1 🤯1
🦅 ConDor: S.S. Canonicalization 🦅

👉Self-Supervised Canonicalization for full/partial 3D point clouds

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅RRC + Stanford + KAIST + Brown
✅On top of Tensor Field Networks (TFNs)
✅Unseen 3D -> equivariant canonical
✅Co-segmentation, NO supervision
✅Code and model under MIT license

More: https://bit.ly/3MNDyGa
🔥4 👍1 🤩1
🦀 Event-aided Direct Sparse Odometry 🦀

👉EDS: direct monocular visual odometry using events/frames

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Mono 6-DOF visual odometry + events
✅Direct photometric bundle adjustment
✅Camera motion tracking by sparse pixels
✅A new dataset with HQ events and frames

More: https://bit.ly/3s9FiBN
🔥5 👍3 🤯1 😱1
🫀BlobGAN: Blob-Disentangled Scene🫀

👉Unsupervised, mid-level (blobs) generation of scenes

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Spatial, depth-ordered Gaussian blobs
✅Reaching for supervised level, and more
✅Source under BSD-2 "Simplified" License

More: https://bit.ly/3kRyGnj
🔥8 👍1 🥰1 🤯1 😱1
🦕E2EVE editor via pre-trained artist🦕

👉E2EVE generates a new version of the source image that resembles the "driver" one

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Blending regions by driver image
✅E2E cond-probability of the edits
✅S.S. augmenting in target domain
✅Implemented as SOTA transformer
✅Code/models available (soon)

More: https://bit.ly/3P9TDYW
🤯5 👍2 🤩2 ❤1 🔥1
🐢 Bringing pets in #metaverse 🐢

πŸ‘‰ARTEMIS: pipeline for generating articulated neural pets for virtual worlds

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…ARTiculated, appEarance, Mo-synthesIS
βœ…Motion control, animation & rendering
βœ…Neural-generated (NGI) animal engine
βœ…SOTA animal mocap + neural control

More: https://bit.ly/3LZSLDU
❀4πŸ‘2πŸ₯°2🀩1
😍Animated hand in 1972, damn romantic😍

πŸ‘‰Q: is #VR the technology that developed least in the last 30 years? πŸ€”

More: https://bit.ly/3snxNaq
πŸ‘7❀3🀯1
⏏️Ensembling models for GAN training⏏️

πŸ‘‰Pretrained vision models to improve the GAN training. FID by 1.5 to 2Γ—!

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…CV models as ensemble of discriminators
βœ…Improving GAN in limited / large-scale set
βœ…10k samples matches StyleGAN2 w/ 1.6M
βœ…Source code / models under MIT license

More: https://bit.ly/3wgUVsr
🀯6πŸ”₯2
🀯Cooperative Driving + AUTOCASTSIM🀯

πŸ‘‰COOPERNAUT: cross-vehicle perception for vision-based cooperative driving

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…UTexas + #Stanford + #Sony #AI
βœ…LiDAR into compact point-based
βœ…Network-augmented simulator
βœ…Source code and models available

More: https://bit.ly/3sr5HLk
πŸ”₯6🀯3πŸ₯°1
💄NeuralHDHair: 3D Neural Hair💄

👉NeuralHDHair: fully automatic system for modeling HD hair from a single image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅IRHairNet for hair geometric features
✅GrowingNet: 3D hair strands in parallel
✅VIFu: novel voxel-aligned implicit function
✅SOTA in 3D hair modeling from single pic

More: https://bit.ly/38iR0mQ
👍5 🥰3 ❤1
🐑DyNeRF: Neural 3D Video Synthesis🐑

πŸ‘‰#Meta unveils DyNeRF, novel rendering HQ 3D video

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…Novel NeRF-based on temp-latent codes
βœ…Novel training based on hierarchical step
βœ…Datasets of time-synch/calibrated clips
βœ…Attribution-NonCommercial 4.0 Int.

More: https://bit.ly/3MlBRA9
🀯8πŸ‘2πŸ”₯1🀩1
🐋GATO: agent for multiple tasks🐋

👉The same network with the same weights can play Atari, caption pics, chat, and more🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅General-purpose agent, multiple tasks
✅Multi-modal-task, multi-embodiment
✅Inspired by large-scale language model

More: https://bit.ly/3LbBOWb
🤯10 ❤3 👍2 🔥2
🐪NeRF powered by keypoints🐪

👉ETHZ + META unveil how to encode relative spatial #3D info via sparse 3D keypoints

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Sparse 3D keypoints for SOTA avatars
✅Unseen subjects from 2/3 views
✅Never-before-seen iPhone captures

More: https://bit.ly/39NQqhe
🤯5 🔥2 ❤1 👍1
🐌Self-Supervised human co-evolution🐌

πŸ‘‰Self-supervised 3D by co-evolution of pose estimator, imitator, and hallucinator

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…Novel self-supervised 3D pose
βœ…Co-evo of pose, imitator, hallucinator
βœ…Realist 3D pose and 2D-3D supervision
βœ…Source code / model under MIT license

More: https://bit.ly/37J5ImL
πŸ”₯4πŸ‘3❀1🀯1
🐲 Diff-SDF #3D Rendering 🐲

πŸ‘‰Reconstruction with no complex reg. or priors, using only a per-pixel RGB loss

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…Diff-render to optimize geometry/albedo
βœ…No ad-hoc object mask or supervision
βœ…Extended sphere tracing algorithm

More: https://bit.ly/3yKWPnI
🀯10πŸ‘4πŸ”₯2❀1🀩1
👄LVD: new SOTA for #3D human👄

👉Corona et al. unveil a novel approach to 3D human model fitting

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Solution via neural field
✅Not sensitive to initialization
✅SOTA in shape from single pic
✅SOTA in fitting 3D scans

More: https://bit.ly/3Ng4lLr
👍4 🔥2 🤯1