AI with Papers - Artificial Intelligence & Deep Learning
15.2K subscribers
135 photos
247 videos
14 files
1.31K links
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
πŸƒDeep Equilibrium for Optical FlowπŸƒ

πŸ‘‰DEQ: converge faster, less memory, often more accurate

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…Novel formulation of optical flow method
βœ…Compatible with prior modeling/data-related
βœ…Sparse fixed-point correction for stability
βœ…Code/models under GNU Affero GPL v3.0

More: https://bit.ly/3v4fZmi
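
A minimal sketch of the fixed-point idea behind a deep equilibrium model: instead of stacking a fixed number of update steps, the same update is applied until its output stops changing. Everything here is an illustrative assumption (the names `solve_fixed_point` and `toy_update`, the tolerance, and the toy contraction); it uses plain forward iteration and is not the solver or the flow update from the paper.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def solve_fixed_point(
    f: Callable[[Vector], Vector],
    z0: Vector,
    tol: float = 1e-8,
    max_iter: int = 200,
) -> Tuple[Vector, int]:
    """Iterate z <- f(z) until the largest coordinate change drops below tol."""
    z = list(z0)
    for step in range(1, max_iter + 1):
        z_next = f(z)
        residual = max(abs(a - b) for a, b in zip(z_next, z))
        z = z_next
        if residual < tol:
            return z, step
    return z, max_iter


def toy_update(z: Vector, x: Vector) -> Vector:
    """Hypothetical contraction standing in for a learned update; fixed point is 2 * x."""
    return [0.5 * zi + xi for zi, xi in zip(z, x)]


x = [1.0, -3.0]
z_star, steps = solve_fixed_point(lambda z: toy_update(z, x), [0.0, 0.0])
```

Memory stays constant in the number of iterations because only the current state is kept, which is the property the post refers to as "less memory".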
🌳Ultra High-Resolution Neural Saliency🌳

👉A novel ultra high-resolution saliency detector with dataset!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Ultra Hi-Res Saliency Detection
✅5,920 pics at 4K-8K resolution
✅Pyramid Grafting Network
✅Cross-Model Grafting Module
✅AGL: Attention Guided Loss
✅Code/models under MIT

More: https://bit.ly/3MnU1Rf
❀6πŸ‘3🀯3πŸ”₯2🀩1
This media is not supported in your browser
VIEW IN TELEGRAM
🪆StyleGAN-Human for fashion 🪆

👉A novel unconditional human generation based on StyleGAN is out!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅200,000+ labeled samples (pose/texture)
✅1024x512 StyleGAN-Human StyleGAN3
✅512x256 StyleGAN-Human StyleGAN1
✅Face model for downstream: InsetGAN
✅Source code and model available!

More: https://bit.ly/3xMg5B2
❀5πŸ‘4πŸ”₯3🀯1πŸ’©1
This media is not supported in your browser
VIEW IN TELEGRAM
💀 OSSO: Skeletal Shape from Outside 💀

👉Anatomic skeleton of a person from the 3D surface of the body 🦴

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Max Planck + IMATI-CNR + INRIA
✅DXA images to obtain #3D shape
✅External body to internal skeleton

More: https://bit.ly/3v7Z5TQ
🎷 Pix2Seq: object detection by #Google 🎷

👉A novel framework to perform object detection as a language modeling task

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Obj. detection as a lang-modeling task
✅BBs/labels -> seq. of discrete tokens
✅Encoder-decoder (one token at a time)
✅Code under Apache License 2.0

More: https://bit.ly/3F49PX3
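
A rough sketch of the "BBs/labels -> seq. of discrete tokens" step: each box corner is quantized into an integer bin and the class label becomes one more token. The bin count, the `[ymin, xmin, ymax, xmax, class]` ordering, the class-token offset, and the names `quantize` and `boxes_to_tokens` are assumptions made for illustration, not the released Pix2Seq code.

```python
from typing import List, Sequence, Tuple

NUM_BINS = 1000  # assumed number of coordinate bins (illustrative choice)

# A box is (xmin, ymin, xmax, ymax, class_id) in pixel units.
Box = Tuple[float, float, float, float, int]


def quantize(value: float, size: float, num_bins: int = NUM_BINS) -> int:
    """Map a pixel coordinate in [0, size] to an integer bin in [0, num_bins - 1]."""
    frac = min(max(value / size, 0.0), 1.0)
    return min(int(frac * num_bins), num_bins - 1)


def boxes_to_tokens(
    boxes: Sequence[Box], width: float, height: float, num_bins: int = NUM_BINS
) -> List[int]:
    """Flatten boxes into one token sequence, five tokens per object."""
    tokens: List[int] = []
    for xmin, ymin, xmax, ymax, class_id in boxes:
        tokens += [
            quantize(ymin, height, num_bins),
            quantize(xmin, width, num_bins),
            quantize(ymax, height, num_bins),
            quantize(xmax, width, num_bins),
            num_bins + class_id,  # class tokens live after the coordinate bins
        ]
    return tokens


tokens = boxes_to_tokens([(0, 0, 320, 240, 3)], width=640, height=480)
# tokens == [0, 0, 500, 500, 1003]
```

A decoder can then be trained to emit such a sequence one token at a time, conditioned on the image features.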
🌹 Generalizable Neural Performer 🌹

👉General neural framework to synthesize free-viewpoint images of arbitrary human performers

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Free-viewpoint synthesis of humans
✅Implicit Geometric Body Embedding
✅Screen-Space Occlusion-Aware Blending
✅GeneBody: 4M frames, multi-view cams

More: https://cutt.ly/SGcnQzn
🚌 Tire-defect inspection 🚌

👉Unsupervised detection of tire defects using neural networks

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Impurity, same material as tire
✅Impurity, with different material
✅Damage by temp/pressure
✅Crack or etched material

More: https://bit.ly/37GX1JT
❀5πŸ‘3🀩1
This media is not supported in your browser
VIEW IN TELEGRAM
🧋#4D Neural Fields🧋

👉4D N.F. visual representations from monocular RGB-D 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅4D scene completion (occlusions)
✅Scene completion in cluttered scenes
✅Novel #AI for contextual point clouds
✅Data, code, models under MIT license

More: https://cutt.ly/6GveKiJ
👔Largest dataset of human-object interactions👔

👉BEHAVE by Google: largest dataset of human-object interactions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅8 subjects, 20 objects, 5 envs.
✅321 clips with 4 Kinect RGB-D
✅Masks and segmented point clouds
✅3D SMPL & mesh registration
✅Textured scan reconstructions

More: https://bit.ly/3Lx6NNo
🦴ENARF-GAN Neural Articulations🦴

👉Unsupervised method for 3D geometry-aware representation of articulated objects

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel efficient neural representation
✅Tri-planes deformation fields for training
✅Novel GAN for articulated representations
✅Controllable 3D from real unlabeled pics

More: https://bit.ly/3xYqedN
🖲️ HuMMan: 4D human dataset 🖲️

👉HuMMan: 4D dataset with 1000 humans, 400k sequences & 60M frames 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅RGB, pt-clouds, keypts, SMPL, texture
✅Mobile device in the sensor suite
✅500+ actions to cover movements

More: https://bit.ly/3vTRW8Z
🔥Neighborhood Attention Transformer 🔥

👉A novel transformer for both image classification and downstream vision tasks

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Neighborhood Attention (NA)
✅Neighborhood Attention Transformer, NAT
✅Faster training/inference, good throughput
✅Checkpoints, train, #CUDA kernel available

More: https://bit.ly/3F5aVSo
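
To make "Neighborhood Attention" concrete, here is a toy 1D, single-head sketch in which each query attends only to a fixed-size window of nearby positions rather than to the whole sequence. The function name, the 1D setting, and the border handling (the window is shifted inward so every query sees exactly `window` neighbors) are simplifying assumptions; the real NAT works on 2D feature maps with a dedicated CUDA kernel.

```python
import math
from typing import List

Vector = List[float]


def neighborhood_attention_1d(
    q: List[Vector], k: List[Vector], v: List[Vector], window: int
) -> List[Vector]:
    """Softmax attention where query i only sees `window` nearby keys/values."""
    n, dim = len(q), len(q[0])
    if window % 2 == 0 or window > n:
        raise ValueError("window must be odd and no larger than the sequence")
    half = window // 2
    scale = 1.0 / math.sqrt(dim)
    out: List[Vector] = []
    for i in range(n):
        # Clamp the window start so border queries still get `window` neighbors.
        start = min(max(i - half, 0), n - window)
        neighbors = range(start, start + window)
        scores = [scale * sum(a * b for a, b in zip(q[i], k[j])) for j in neighbors]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(
            [
                sum(w * v[j][c] for w, j in zip(weights, neighbors)) / total
                for c in range(len(v[0]))
            ]
        )
    return out
```

With `window` equal to the sequence length this reduces to ordinary full attention; smaller windows cut the cost per query from the sequence length to the window size.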
🔥🔥FANs: Fully Attentional Networks🔥🔥

👉#Nvidia unveils the fully attentional networks (FANs)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Efficient fully attentional design
✅Semantic seg. & object detection
✅Model/source code soon available!

More: https://bit.ly/3vtpITs
👨🏼‍🎨 Open-Source DALL·E 2 is out 👨🏼‍🎨

👉#Pytorch implementation of DALL·E 2, #OpenAI's latest text-to-image neural net.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅SOTA for text-to-image generation
✅Source code/model under MIT License
✅"Medieval painting of wifi not working"

More: https://bit.ly/3vzsff6
⛺ViTPose: Transformer for Pose⛺

👉ViTPose from ViTAE, ViT for human pose

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Plain/nonhierarchical ViT for pose
✅Deconv-layers after ViT for keypoints
✅Just the baseline is the new SOTA
✅Source code & models available soon!

More: https://bit.ly/3MJ0kz1
🧳 Unsupervised HD Motion Transfer 🧳

👉Novel e2e unsupervised motion transfer for image animation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅TPS motion estimation + Dropout
✅Novel E2E unsupervised motion transfer
✅Optical flow + multi-res. occlusion mask
✅Code and models under MIT license

More: https://bit.ly/3MGNPns
🚤 Neural Self-Calibration in the wild 🚤

👉 Learning algorithm to regress calibration params from in-the-wild clips

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Params purely from self-supervision
✅S.S. depth/pose learning as objective
✅POV, fisheye, catadioptric: no changes
✅SOTA results on EuRoC MAV dataset

More: https://bit.ly/3w1n6LB
🦅 ConDor: S.S. Canonicalization 🦅

👉Self-Supervised Canonicalization for full/partial 3D point clouds

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅RRC + Stanford + KAIST + Brown
✅On top of Tensor Field Networks (TFNs)
✅Unseen 3D -> equivariant canonical
✅Co-segmentation, NO supervision
✅Code and model under MIT license

More: https://bit.ly/3MNDyGa
🦀 Event-aided Direct Sparse Odometry 🦀

👉EDS: direct monocular visual odometry using events/frames

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Mono 6-DOF visual odometry + events
✅Direct photometric bundle adjustment
✅Camera motion tracking by sparse pixels
✅A new dataset with HQ events and frames

More: https://bit.ly/3s9FiBN
🫀BlobGAN: Blob-Disentangled Scene🫀

👉Unsupervised, mid-level (blobs) generation of scenes

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Spatial, depth-ordered Gaussian blobs
✅Reaching for supervised level, and more
✅Source under BSD-2 "Simplified" License

More: https://bit.ly/3kRyGnj