AI with Papers - Artificial Intelligence & Deep Learning
15.2K subscribers
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
GATO: agent for multiple tasks

👉The same network with the same weights can play Atari, caption pics, chat, and more🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅General-purpose agent, multiple tasks
✅Multi-modal, multi-task, multi-embodiment
✅Inspired by large-scale language models

More: https://bit.ly/3LbBOWb
NeRF powered by keypoints

👉ETHZ + META unveil how to encode relative spatial #3D info via sparse 3D keypoints

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Sparse 3D keypoints for SOTA avatars
✅Unseen subjects from 2/3 views
✅Never-before-seen iPhone captures

More: https://bit.ly/39NQqhe
🐌Self-Supervised human co-evolution🐌

πŸ‘‰Self-supervised 3D by co-evolution of pose estimator, imitator, and hallucinator

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…Novel self-supervised 3D pose
βœ…Co-evo of pose, imitator, hallucinator
βœ…Realist 3D pose and 2D-3D supervision
βœ…Source code / model under MIT license

More: https://bit.ly/37J5ImL
🐲 Diff-SDF #3D Rendering 🐲

πŸ‘‰Reconstruction with no complex reg. or priors, using only a per-pixel RGB loss

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…Diff-render to optimize geometry/albedo
βœ…No ad-hoc object mask or supervision
βœ…Extended sphere tracing algorithm

More: https://bit.ly/3yKWPnI
👄LVD: new SOTA for #3D human👄

👉Corona et al. unveil a novel approach to 3D human model fitting

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Solution via neural field
✅Not sensitive to initialization
✅SOTA in shape from single pic
✅SOTA in fitting 3D scans

More: https://bit.ly/3Ng4lLr
🏳️‍🌈Deep Clustering on ImageNet & Co.🏳️‍🌈

👉World's first deep nonparametric clustering on large datasets such as ImageNet

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Deep clustering that infers nr. of clusters
✅Loss: amortized inference in mixt-models
✅Deep nonparametric clustering on ImageNet
✅Code and model available under MIT license

More: https://bit.ly/38p62rn
💥HQ-E²FGVI just released💥💥

👉Flow-Guided Video Inpainting through three trainable modules

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Flow, pixel-prop, content hallucination
✅Three stage-modules, jointly optimized
✅The new SOTA, promising efficiency
✅Code and Models under MIT license

More: https://bit.ly/3Ln0ICj
🪔 AvatarCLIP: Text-Driven Avatar 🪔

👉Zero-shot text-driven generation of #3D avatars for the #metaverse

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅First text-driven synthesis
✅Shape, texture, and motion
✅Animation-ready, HQ texture/geometry
✅Zero-shot text-guided ref-based motion
✅Code and model under MIT license

More: https://bit.ly/3LjTWgB
🔥#AIwithPapers: we are 2,500!🔥

💙💛Only 2 Billion papers remaining on arXiv. The more we are, the faster we read💙💛

😈 Invite your friends -> https://t.iss.one/AI_DeepLearning
💥Podcasting AI & CV💥

👉🏼For people fluent in Italian: a 1-hour podcast in which I talk about AI, CV, startups, and more (including this wonderful project).

More: https://bit.ly/38DtBwB
🔥Inpainting: new SOTA! INSANE🔥

👉Novel two-stream approach: inpainting at the next level!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅High-freq locally, low-freq globally
✅Local to global -> error correction
✅44% / 26% improvements FID/scores
✅Source code, more clips available

More: https://bit.ly/3ltIX9R
🔥Super-Human Crossword Solver🔥

👉Solving crosswords, outperforming the best human solvers

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Crossword solving based on NNs
✅Q&A, structured decoding, local search
✅Wide domains with perfect accuracy
✅Large question-answer dataset

More: https://bit.ly/3a3zzqQ
🥸Imagen: far beyond DALL·E 2🥸

👉#Google: unprecedented photorealism and deep level of language understanding

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Dynamic thresh diffusion sampling
✅Efficient U-Net, efficient++ variant
✅DrawBench, a new text-to-image benchmark
✅The new SOTA, COCO FID of 7.27

More: https://bit.ly/3lVtkbz
🪀Tracking over SOTA detectors🪀

👉Lightweight Python lib for real-time 2D object tracking 💥

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Layer of tracking over SOTA detectors
✅Suitable for complex video processing
✅Source code under BSD 3-Clause
✅Maintained by Tryolabs team

More: https://bit.ly/3wKtGqg
🥷🏿 FCA: #3D Neural Camouflage 🥷🏿

👉#3D full-camouflage adversarial patch to fool neural detectors

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Attack by diff-neural render
✅E2E physical adversarial attack
✅Envs, vehicles & detectors
✅Source code available!

More: https://bit.ly/38kKyfa
One-Shot Object Pose

👉A novel one-shot object pose estimator

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Visual localization pipeline for object pose
✅Handling novel objects without CAD model
✅Novel graph attention for 2D-3D matching
✅Large dataset for one-shot object pose

More: https://bit.ly/3MTogjJ
☄️STEVE: Slot-TransformEr for VidEos☄️

👉STEVE: unsupervised model for object-centric learning in videos

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Adoption of a slot decoder (SLATE)
✅SLATE with slot-level recurrence model
✅Complex and naturalistic videos
✅Significantly outperforms previous SOTA

More: https://bit.ly/3PNxxM3
🦔 CogVideo: insane text-to-clip 🦔

👉CogVideo: 9B parameters, the world's first large-scale open-source text-to-video model 😵

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Largest open-source T2C transformer
✅Finetuning of text-to-image model
✅Multi-frame-rate hierarchical training
✅From pretrained model CogView2

More: https://bit.ly/3Gzfl4n
🦄Time-Aware Neural Voxels🦄

👉TiNeuVox: "NeRF" with time-aware voxel features 😵

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Dynamic scene w/ optimizable structure
✅Temporal information in radiance net
✅Small/large motion w/ single-res of feats
✅192× faster than previous Hyper-NeRF

More: https://bit.ly/3wR4O08
🫐Neural Anomaly Detection by AWS🫐

πŸ‘‰Ultra-competitive inference and SOTA for both detection and localization

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…Locally aggregated, mid-level feats patch
βœ…Maximizing nominal information at test time
βœ…Reducing biases towards ImageNet classes
βœ…Image-level anomaly AUROC of up to 99.6%

More: https://bit.ly/3t7Ndjg