AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI

🧋#4D Neural Fields🧋

👉4D N.F. visual representations from monocular RGB-D 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅4D scene completion (occlusions)
✅Scene completion in cluttered scenes
✅Novel #AI for contextual point clouds
✅Data, code, models under MIT license

More: https://cutt.ly/6GveKiJ

👔Largest dataset of human-object interactions👔

👉BEHAVE by Google: largest dataset of human-object interactions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅8 subjects, 20 objects, 5 envs.
✅321 clips with 4 Kinect RGB-D
✅Masks and segmented point clouds
✅3D SMPL & mesh registration
✅Textured scan reconstructions

More: https://bit.ly/3Lx6NNo

🦴ENARF-GAN Neural Articulations🦴

👉Unsupervised method for 3D geometry-aware representation of articulated objects

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel efficient neural representation
✅Tri-planes deformation fields for training
✅Novel GAN for articulated representations
✅Controllable 3D from real unlabeled pic

More: https://bit.ly/3xYqedN

🖲️ HuMMan: 4D human dataset 🖲️

👉HuMMan: 4D dataset with 1000 humans, 400k sequences & 60M frames 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅RGB, pt-clouds, keypts, SMPL, texture
✅Mobile device in the sensor suite
✅500+ actions to cover movements

More: https://bit.ly/3vTRW8Z

🔥Neighborhood Attention Transformer 🔥

👉A novel transformer for both image classification and downstream vision tasks

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Neighborhood Attention (NA)
✅Neighborhood Attention Transformer, NAT
✅Faster training/inference, good throughput
✅Checkpoints, training code, #CUDA kernel available

More: https://bit.ly/3F5aVSo

🔥🔥FANs: Fully Attentional Networks🔥🔥

👉#Nvidia unveils the fully attentional networks (FANs)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Efficient fully attentional design
✅Semantic seg. & object detection
✅Model/source code available soon!

More: https://bit.ly/3vtpITs

👨🏼‍🎨 Open-Source DALL·E 2 is out 👨🏼‍🎨

👉#Pytorch implementation of DALL-E 2, #OpenAI's latest text-to-image neural net.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅SOTA for text-to-image generation
✅Source code/model under MIT License
✅"Medieval painting of wifi not working"

More: https://bit.ly/3vzsff6

⛺ViTPose: Transformer for Pose⛺

👉ViTPose from ViTAE, ViT for human pose

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Plain/nonhierarchical ViT for pose
✅Deconv-layers after ViT for keypoints
✅Just the baseline is the new SOTA
✅Source code & models available soon!

More: https://bit.ly/3MJ0kz1

🧳 Unsupervised HD Motion Transfer 🧳

👉Novel e2e unsupervised motion transfer for image animation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅TPS motion estimation + Dropout
✅Novel E2E unsupervised motion transfer
✅Optical flow + multi-res. occlusion mask
✅Code and models under MIT license

More: https://bit.ly/3MGNPns

🚀 Neural Self-Calibration in the wild 🚀

👉 Learning algorithm to regress calibration params from in-the-wild clips

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Params purely from self-supervision
✅S.S. depth/pose learning as objective
✅POV, fisheye, catadioptric: no changes
✅SOTA results on EuRoC MAV dataset

More: https://bit.ly/3w1n6LB

🦅 ConDor: S.S. Canonicalization 🦅

👉Self-Supervised Canonicalization for full/partial 3D point clouds

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅RRC + Stanford + KAIST + Brown
✅On top of Tensor Field Networks (TFNs)
✅Unseen 3D -> equivariant canonical
✅Co-segmentation, NO supervision
✅Code and model under MIT license

More: https://bit.ly/3MNDyGa

🦀 Event-aided Direct Sparse Odometry 🦀

👉EDS: direct monocular visual odometry using events/frames

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Mono 6-DOF visual odometry + events
✅Direct photometric bundle adjustment
✅Camera motion tracking by sparse pixels
✅A new dataset with HQ events and frames

More: https://bit.ly/3s9FiBN

🫀BlobGAN: Blob-Disentangled Scene🫀

👉Unsupervised, mid-level (blobs) generation of scenes

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Spatial, depth-ordered Gaussian blobs
✅Reaching for supervised level, and more
✅Source under BSD-2 "Simplified" License

More: https://bit.ly/3kRyGnj

🦕E2EVE editor via pre-trained artist🦕

👉E2EVE generates a new version of the source image that resembles the "driver" one

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Blending regions by driver image
✅E2E cond-probability of the edits
✅S.S. augmenting in target domain
✅Implemented as SOTA transformer
✅Code/models available (soon)

More: https://bit.ly/3P9TDYW

🐶 Bringing pets into the #metaverse 🐶

👉ARTEMIS: pipeline for generating articulated neural pets for virtual worlds

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ARTiculated, appEarance, Mo-synthesIS
✅Motion control, animation & rendering
✅Neural-generated (NGI) animal engine
✅SOTA animal mocap + neural control

More: https://bit.ly/3LZSLDU
❀4πŸ‘2πŸ₯°2🀩1
This media is not supported in your browser
VIEW IN TELEGRAM
😍Animated hand in 1972, damn romantic😍

👉Q: is #VR the technology that developed least in the last 30 years? 🤔

More: https://bit.ly/3snxNaq

Ensembling models for GAN training

👉Pretrained vision models improve GAN training: FID gets better by 1.5 to 2×!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅CV models as ensemble of discriminators
✅Improving GAN in limited / large-scale set
✅10k samples match StyleGAN2 w/ 1.6M
✅Source code / models under MIT license

More: https://bit.ly/3wgUVsr

🤯Cooperative Driving + AUTOCASTSIM🤯

👉COOPERNAUT: cross-vehicle perception for vision-based cooperative driving

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅UTexas + #Stanford + #Sony #AI
✅LiDAR into compact point-based representations
✅Network-augmented simulator
✅Source code and models available

More: https://bit.ly/3sr5HLk

💄NeuralHDHair: 3D Neural Hair💄

👉NeuralHDHair: fully automatic system for modeling HD hair from a single image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅IRHairNet for hair geometric features
✅GrowingNet: 3D hair strands in parallel
✅VIFu: novel voxel-aligned implicit function
✅SOTA in 3D hair modeling from single pic

More: https://bit.ly/38iR0mQ

DyNeRF: Neural 3D Video Synthesis

👉#Meta unveils DyNeRF, novel rendering of HQ 3D video

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel NeRF based on temporal latent codes
✅Novel training based on hierarchical step
✅Datasets of time-synch/calibrated clips
✅Attribution-NonCommercial 4.0 Int.

More: https://bit.ly/3MlBRA9

GATO: agent for multiple tasks

👉The same network with the same weights can play Atari, caption pics, chat, and more🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅General-purpose agent, multiple tasks
✅Multi-modal-task, multi-embodiment
✅Inspired by large-scale language model

More: https://bit.ly/3LbBOWb