AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI

🔥Super-Human Crossword Solver🔥

👉Solving crosswords outperforming best humans

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Crossword solving based on NNs
✅Q&A, structured decoding, local search
✅Wide domains with perfect accuracy
✅Large question-answer dataset

More: https://bit.ly/3a3zzqQ

🥸Imagen: far beyond DALL·E 2🥸

👉#Google: unprecedented photorealism and deep level of language understanding

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Dynamic thresholding for diffusion sampling
✅Efficient U-Net, efficient++ variant
✅DrawBench, new text-to-image benchmark
✅The new SOTA, COCO FID of 7.27
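
A minimal sketch of the dynamic thresholding idea: at each sampling step the predicted clean image is clipped to [-s, s], where s is a high percentile of its absolute pixel values, and then divided by s. The function name, the flat-list input and the percentile values are assumptions for illustration only, not Imagen's actual code or settings.

```python
import math


def dynamic_threshold(pixels, percentile=99.5):
    """Clip predicted pixel values to [-s, s], then divide by s.

    `pixels` is a flat list of floats for one predicted image.
    The default percentile is an illustrative assumption.
    """
    magnitudes = sorted(abs(p) for p in pixels)
    # Nearest-rank percentile of the absolute pixel values.
    rank = max(1, math.ceil(percentile / 100 * len(magnitudes)))
    # Keep s at 1 or above so in-range predictions pass through unchanged.
    s = max(magnitudes[rank - 1], 1.0)
    return [max(-s, min(s, p)) / s for p in pixels]


in_range = [0.5, -0.2, 0.9]
saturated = [0.5, -3.0, 2.0, 0.1]
print(dynamic_threshold(in_range))                  # values already in [-1, 1] are unchanged
print(dynamic_threshold(saturated, percentile=75))  # outliers are clipped, the rest rescaled
```

Compared with a fixed clip to [-1, 1], the rescaling step pulls saturated pixels back into range instead of flattening them at the boundary.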

More: https://bit.ly/3lVtkbz

🪀Tracking over SOTA detectors🪀

👉Lightweight Python lib for real-time 2D object tracking 💥

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Layer of tracking over SOTA detectors
✅Suitable for complex video processing
✅Source code under BSD 3-Clause
✅Maintained by Tryolabs team

More: https://bit.ly/3wKtGqg

🥷🏿 FCA: #3D Neural Camouflage 🥷🏿

👉#3D full-camouflage adversarial patch to fool neural detectors

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Attack by diff-neural render
✅E2E physical adversarial attack
✅Envs, vehicles & detectors
✅Source code available!

More: https://bit.ly/38kKyfa

One-Shot Object Pose

👉A novel one-shot object pose estimator

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Visual localization pipeline for object pose
✅Handling novel objects without CAD model
✅Novel graph attention for 2D-3D matching
✅Large dataset for one-shot object pose

More: https://bit.ly/3MTogjJ

☄️STEVE: Slot-TransformEr for VidEos☄️

👉STEVE: unsupervised model for object-centric learning in videos

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Adoption of a slot decoder (SLATE)
✅SLATE with slot-level recurrence model
✅Complex and naturalistic videos
✅Significantly outperforms previous SOTA

More: https://bit.ly/3PNxxM3

🦔 CogVideo: insane text-to-clip 🦔

👉CogVideo: 9B-parameters world's first large scale open-source text-to-video 😵

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Largest open-source T2C transformer
✅Finetuning of text-to-image model
✅Multi-frame-rate hierarchical training
✅From pretrained model CogView2

More: https://bit.ly/3Gzfl4n

🦄Time-Aware Neural Voxels🦄

👉TiNeuVox: "NeRF" with time-aware voxel features 😵

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Dynamic scene w/ optimizable structure
✅Temporal information in radiance net
✅Small/large motion w/ single-res of feats
✅192× faster than previous Hyper-NeRF

More: https://bit.ly/3wR4O08

🫐Neural Anomaly Detection by AWS🫐

👉Ultra-competitive inference and SOTA for both detection and localization

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Locally aggregated, mid-level feats patch
✅Maximizing nominal information at test time
✅Reducing biases towards ImageNet classes
✅Image-level anomaly AUROC of up to 99.6%
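
For readers unfamiliar with the metric, here is a small sketch of what image-level AUROC measures: the probability that a randomly chosen anomalous image receives a higher anomaly score than a randomly chosen normal one. The function name, scores and labels below are made-up toy values, not data from the paper.

```python
def auroc(scores, labels):
    """Pairwise AUROC: fraction of (anomalous, normal) pairs ranked correctly.

    `labels` uses 1 for anomalous and 0 for normal; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy anomaly scores for five images (hypothetical values).
scores = [0.9, 0.8, 0.35, 0.3, 0.1]
labels = [1, 1, 0, 1, 0]
print(auroc(scores, labels))
```

A score of 1.0 means every anomalous image outranks every normal one, while 0.5 is no better than chance.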

More: https://bit.ly/3t7Ndjg

🛹 Project Skate from Google #AI 🛹

👉#AI tool to analyze the skateboarder's tricks in real-time

More: https://bit.ly/3zbQS3M

🧬Neural Text2Human Generation🧬

👉Text-driven neural human generation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Full-body from a given human pose
✅Hierarchical texture-aware codebook
✅DeepFashion -> 44k Hi-Res images
✅Code and models available!

More: https://bit.ly/3Mdnpt0

🧨EfficientFormers: 1.6ms inference 🧨

👉Transformers fast as MobileNet? Snap shows that on #iphone!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Low latency on mobile, high performance!
✅Revisiting the design of ViT through latency
✅New dimension-consistent design paradigm
✅EfficientFormers: a new ViT for mobile!

More: https://bit.ly/3MdgW15

🐒 Transformer-Based Sens-Fusion 🐒

👉Updating TransFuser (CVPR21): image + LiDAR representations with self-attention

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Existing approach can't handle traffic 😒
✅Novel multi-modal fusion transformer
✅The new SOTA in driving performance
✅Reducing avg collisions per KM by 48%
✅Insights on current limitations of E2E

More: https://bit.ly/391dmd6

🧘🏻‍♂️YogNet: neural yoga assistant🧘🏻‍♂️

👉Multi-person yoga neural expert for 20 asanas

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅CNNs & reg.LSTMs + 3D-CNNs
✅Multi-person asanas in real-time
✅YAR: dataset for yoga & posture
✅1206 videos, 2D RGB camera

More: https://bit.ly/3NncVbE

🔴 Geogram: geometric algos in C++ 🔴

👉Novel open-source programming library with (research) geometric algorithms in C++

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Geometry Processing from #INRIA
✅30+ papers from SIGGRAPH, etc.
✅Grants: GOODSHAPE & VORPALINE
✅Code (mostly C++) under BSD 3

More: https://bit.ly/3mhS4L7

🍏 Open Source Vision from #Apple 🍏

👉CVNets: open-source (not a joke) lib for neural vision.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅PyTorch-based neural lib. for vision
✅Train 2−4× longer w/ augmentations
✅Plug-and-play components for CV
✅Source code under a custom license

More: https://bit.ly/39d1dSj

🏇🏻Neural Clips by #Nvidia: INSANE 🏇🏻

👉Neural generation with changes in camera viewpoint & content that arises over time 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel hierarchical generator architecture
✅Temp. receptive field + temporal embed.
✅Multi-res. with super-resolution network
✅SOTA in long clip with motion & changes
✅Code, data & models in August 2022 🏖️

More: https://bit.ly/3zroWsC

⚽ Zero to #Messi with #deeplearning ⚽

👉EA unveils a neural system to learn multiple soccer juggling skills 😍

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Learning difficult soccer juggling skills
✅Layer-wise mixture-of-experts architecture
✅Specialization arises naturally
✅Adaptive random walk training strategy

More: https://bit.ly/3mwRaL2

HumanNeRF: source code is out!

👉Pausing the video at any frame and rendering the subject from arbitrary views!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Synthesizing photorealistic humans
✅Synthesizing details, i.e. cloth & face
✅Volumetric canonical T-pose
✅Skeletal rigid/non-rigid decomposition

More: https://bit.ly/3NEkTNY

EG3D: source code is out!

👉#Nvidia just opened EG3D: real time multi-view faces w/ HQ #3D geometry!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Tri-plane-based 3D GAN framework
✅Pose-correlated attribute (expression)
✅SOTA in uncond. 3D-aware synthesis
✅Source code & models NOW available!

More: https://bit.ly/3aOfHs0