AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🥶SOTA in crowd analysis is INSANE🥶

👉Tencent unveils P2PNet to predict heads in images

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Pure point counting/detecting
Normalized Average Precision
VGG16-like architecture
Simultaneous point/confidence
License: academic use only

More: https://bit.ly/33UjoK0
❄️OLSO: Transformers Optimization❄️

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Automagical with Hugging Face
GPU-based optimizations
Easy installation with pip
Apache License 2.0

More: https://bit.ly/3r8wY58
🦾SOTA in robotic manipulation🦾

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
VCD: Visible Connectivity Dynamics
VCG: Visible Connectivity Graph
Dynamics model over this VCG
Handling material, geometry, color
SOTA vs. model-based/model-free RL
Source code and models available

More: https://bit.ly/3HhusiH
📟VRT: new SOTA in super resolution📟


𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Image restoration via Swin
Residual Swin Transformer Blocks
SOTA in Artifact Reduction
SOTA in Super-resolution
SOTA in Denoising
Parameters -67%!
Non-commercial 🥲

More: https://bit.ly/3rfAta1
🦖The new #MediaPipe is INSANE 🦖

👉Google just launched two new highly optimized body segmentation models

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Full body 3D pose
Designed for yoga, fitness & dance
Measurements for virtual tailor
Selfie Segmentation on call

More: https://bit.ly/3s6sjjx
🥸 Clothed avatars for #metaverse 🥸

👉Telepresence, AR/VR, anthropometry, and virtual try-on.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Differential loss of explicit mesh
Details via neural rendering
Explicit mesh updating
Consistency loss for quality++
Hi-Fi surfaces by S.S. optimization

More: https://bit.ly/3ohAN6d
🦕JoJoGAN: One Shot Face Stylization🦕

👉UIUC researchers unveil a novel method for one-shot image stylization.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Stylization from single input
Finetuning StyleGAN for stylization
No supervision, good generalization
MIT License (commercial allowed)

More: https://bit.ly/3ASVzyb
🧦SOTA in OOD detection for safer #AI🧦

👉Out-of-distribution (OOD) inputs lead models to wrong/overconfident predictions; OOD detection aims to flag them.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel framework for OOD
Synthesizing virtual outliers
Novel unknown-aware training
Code and model available

More: https://bit.ly/3JnFIL9
🌅StyleGAN-XL neural synthesis🌅

👉From Tübingen, StyleGAN-XL: new SOTA for large, diverse datasets.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
First 1024p-gen for large data
Growing strategy on StyleGAN3
Beyond the narrow domains
Pivotal Tuning Inversion (PTI)
SOTA vs. GAN & diffusion models

More: https://bit.ly/3HK9MQk
📌This keypoint is pure GLUE📌

👉Keypoints play a central role in computer vision.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel Object-centric keypoint
Novel sim2real training method
Intra-salience / inter-distinctness
Enforcing semantic consistency
Close to fully-supervised method!

More: https://bit.ly/3rth1qh
💡 LEDNet: seeing in the dark 💡

👉Researchers from NTU unveil LEDNet to see in the dark

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel data synthesis for low-light
Low-light/deblurring dataset
12k low-blur/normal-sharp pairs
LEDNet: low-light + deblurring

More: https://bit.ly/3HIyYqM
👩‍🦰Back in the 50's with GAN👩‍🦰

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
A few thousand vintage faces
Models available for download
Stylegan2-ffhqu-1024x1024
No commercial use allowed

More: https://bit.ly/3LlOyKX
🦠VNCA: bio-inspired generative model 🦠

👉A novel generative model loosely inspired by the biological processes of cellular growth and differentiation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Variational Neural Cellular Automata
Probabilistic generative model
Learns from a common vector format
Learns a purely self-organising generative process
Far away from SOTA, but interesting

More: https://bit.ly/3oGb2wG
🍊Block-NeRF: Neural View Synthesis🍊

👉Large-scale scene reconstruction by multiple compact NeRFs that each fit into memory.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Berkeley + Google + Waymo = 🤯
Scaling NeRF to city-scale scenes
Trick: multiple simple NeRFs
Time decoupled, arbitrarily large scene
Data over months & different conditions

More: https://bit.ly/3GGVHBV
🥬HW-Accelerated Neuro-Evolution🥬

👉Scalable, general purpose, hardware accelerated neuro-evolution toolkit by Google

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Parallel on multiple TPU/GPUs
Neuro-evo algorithms with NNs
WaterWorld, Abstract paint, more
From Google, not an official product
Code under Apache License 2.0

More: https://bit.ly/3szEi9w
🚛 DeepETA: #Uber ETA via #AI🚛

👉Uber unveils the low-latency deep architecture for global ETA prediction

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Latency / Accuracy / Generality
7 NN architectures tested
Encoder-decoder + Self-Attention
Linear transformer (kernel trick)
Feature sparsity for speed

More: https://bit.ly/3gFWmJh
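
A rough sketch of what "linear transformer (kernel trick)" means in general, not Uber's actual code. Instead of building the full N x N attention matrix, keys and values are summarized once and reused for every query. The feature map `phi` (elu + 1) and all function names here are assumptions chosen for illustration.

```python
import math

def phi(x):
    # Positive feature map elu(x) + 1 (an assumption; other kernels are possible).
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def linear_attention(Q, K, V):
    """Attention in O(N) time: summarize keys/values once, then reuse."""
    d, dv = len(K[0]), len(V[0])
    S = [[0.0] * dv for _ in range(d)]  # sum over j of phi(k_j) v_j^T
    z = [0.0] * d                       # sum over j of phi(k_j)
    for k, v in zip(K, V):
        fk = phi(k)
        for a in range(d):
            z[a] += fk[a]
            for b in range(dv):
                S[a][b] += fk[a] * v[b]
    out = []
    for q in Q:
        fq = phi(q)
        norm = sum(fq[a] * z[a] for a in range(d))
        out.append([sum(fq[a] * S[a][b] for a in range(d)) / norm
                    for b in range(dv)])
    return out

def quadratic_attention(Q, K, V):
    """Same kernel, computed the O(N^2) way, for comparison only."""
    out = []
    for q in Q:
        fq = phi(q)
        w = [sum(x * y for x, y in zip(fq, phi(k))) for k in K]
        total = sum(w)
        out.append([sum(wi * v[b] for wi, v in zip(w, V)) / total
                    for b in range(len(V[0]))])
    return out
```

Both functions return the same values up to floating-point error; only the cost differs, which is the point when latency matters.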
✏️CLIPasso: Semantic Sketching via CLIP✏️

👉Sketching method guided by geometric and semantic simplifications (CLIP)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
EPFL, TAU and IDC Herzliya
CLIP image encoder for sketching
Sketching as a set of Bezier curves
Param-optimization on CLIP-loss
Source code and models available

More: https://bit.ly/3oLEDF4
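
To make "sketching as a set of Bezier curves" concrete, here is a minimal sketch that evaluates one cubic Bezier stroke from four control points. It is only the curve math, assuming cubic curves; the CLIP-based optimization of the control points is not shown, and the function names are hypothetical.

```python
def bezier_point(p0, p1, p2, p3, t):
    """Point on a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    # Bernstein weights for the four control points.
    b0, b1, b2, b3 = u**3, 3 * u**2 * t, 3 * u * t**2, t**3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)

def sample_stroke(control_points, n=16):
    """Sample n + 1 points along one stroke, e.g. to rasterize or plot it."""
    p0, p1, p2, p3 = control_points
    return [bezier_point(p0, p1, p2, p3, i / n) for i in range(n + 1)]
```

A sketch is then just a list of such strokes, and the control points are the parameters being optimized.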
🪂SAHI: slicing detection/segmentation🪂

👉An open-source lightweight library for large scale object detection & instance segmentation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Slicing Aided Hyper Inference
Large-scale detection/segmentation
Sliced inference and merging
Utils for conversion, slicing, etc.
Code licensed under MIT License

More: https://bit.ly/3uMJoBZ
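
The idea behind sliced inference can be sketched in a few lines. This is not the SAHI API: the function names, the 512 px tile size and the 20% overlap are assumptions for illustration. The image is cut into overlapping tiles, a detector runs on each tile, and every box is shifted back to full-image coordinates before duplicates are merged (the merge step, e.g. NMS, is left out here).

```python
def slice_boxes(img_w, img_h, tile=512, overlap=0.2):
    """Return (x0, y0, x1, y1) tiles that cover the image with some overlap."""
    step = max(1, int(tile * (1 - overlap)))

    def starts(size):
        if size <= tile:
            return [0]  # image is smaller than one tile along this axis
        s = list(range(0, size - tile, step))
        s.append(size - tile)  # last tile sits flush with the border
        return s

    return [(x, y, min(x + tile, img_w), min(y + tile, img_h))
            for y in starts(img_h) for x in starts(img_w)]

def to_full_image(box, tile_origin):
    """Shift a box predicted inside a tile back to full-image coordinates."""
    x0, y0, x1, y1 = box
    ox, oy = tile_origin
    return (x0 + ox, y0 + oy, x1 + ox, y1 + oy)
```

Small objects stay large relative to each tile, which is why slicing helps on very high-resolution images.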
🎁100,000,000 image-text pairs!🎁

👉Large-scale Chinese cross-modal dataset for benchmarking different multi-modal pre-training methods.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
100 Million <image, text> pairs
>200px size, aspect ratio (1/3~3)
Models of ResNet, ViT & SwinT
Methods: CLIP, FILIP and LiT
Privacy/Sensitive words 🤔

More: https://bit.ly/34BqlzX
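
A size/aspect-ratio filter like the one in the highlights could look roughly like this. It is a sketch, not the dataset's actual pipeline: the function name is hypothetical, and applying the 200 px threshold to the shorter side is an assumption.

```python
def keep_image(width, height, min_side=200, max_ratio=3.0):
    """Keep images larger than min_side px with aspect ratio within 1/3 to 3."""
    if min(width, height) <= min_side:  # assumption: threshold on the shorter side
        return False
    ratio = width / height
    return 1.0 / max_ratio <= ratio <= max_ratio
```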
🧁33 Million synthetic pedestrians🧁

👉A novel large, fully synthetic dataset

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Exploiting the #gta5 engine
764 full-HD videos @20 fps
33M+ person instances
BBs & segmentation masks
2D/3D keypoints & depth

More: https://bit.ly/36njlY1