AI with Papers - Artificial Intelligence & Deep Learning
15.2K subscribers
135 photos
247 videos
14 files
1.31K links
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🍊Block-NeRF: Neural View Synthesis🍊

👉Large-scale scene reconstruction by multiple compact NeRFs that each fit into memory.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Berkeley + Google + Waymo = 🤯
Scaling NeRF to city-scale scenes
Trick: multiple simple NeRFs
Time decoupled, arbitrarily large scene
Data over months & different conditions

More: https://bit.ly/3GGVHBV
🥬HW-Accelerated Neuro-Evolution🥬

👉Scalable, general purpose, hardware accelerated neuro-evolution toolkit by Google

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Parallel on multiple TPU/GPUs
Neuro-evo algorithms with NNs
WaterWorld, Abstract paint, more
From Google, not an official product
Code under Apache License 2.0

More: https://bit.ly/3szEi9w
🚛 DeepETA: #Uber ETA via #AI🚛

👉Uber unveils a low-latency deep architecture for global ETA prediction

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Latency / Accuracy / Generality
7 NN architectures tested
Encoder-decoder + Self-Attention
Linear transformer (kernel trick)
Feature sparsity for speed

More: https://bit.ly/3gFWmJh
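
The "linear transformer (kernel trick)" highlight can be illustrated with a small sketch. It assumes the elu(x)+1 feature map from the linear-attention literature; the post does not say which feature map DeepETA uses, and the function names below are hypothetical. The point is that reordering the matrix products gives the same output without ever building the n x n score matrix.

```python
import math

def phi(x):
    # Assumed feature map: elu(x) + 1, which is always positive.
    return x + 1.0 if x > 0 else math.exp(x)

def featurize(m):
    return [[phi(x) for x in row] for row in m]

def quadratic_attention(q, k, v):
    """Kernelized attention computed the slow way, via an n x n score matrix."""
    fq, fk = featurize(q), featurize(k)
    out = []
    for qi in fq:
        scores = [sum(a * b for a, b in zip(qi, kj)) for kj in fk]
        z = sum(scores)
        out.append([sum(s * vj[c] for s, vj in zip(scores, v)) / z
                    for c in range(len(v[0]))])
    return out

def linear_attention(q, k, v):
    """Same result, but keys and values are summarized once in a d x d_v matrix."""
    fq, fk = featurize(q), featurize(k)
    d, dv = len(fk[0]), len(v[0])
    # kv[a][c] = sum over positions j of phi(k_j)[a] * v_j[c]
    kv = [[sum(fk[j][a] * v[j][c] for j in range(len(fk))) for c in range(dv)]
          for a in range(d)]
    k_sum = [sum(fk[j][a] for j in range(len(fk))) for a in range(d)]
    out = []
    for qi in fq:
        z = sum(a * b for a, b in zip(qi, k_sum))
        out.append([sum(qi[a] * kv[a][c] for a in range(d)) / z
                    for c in range(dv)])
    return out
```

The second function costs time linear in the sequence length, which is the property the latency highlight points at.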
✏️CLIPasso: Semantic Sketching via CLIP✏️

👉Sketching method guided by geometric and semantic simplifications (CLIP)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
EPFL, TAU and IDC Herzliya
CLIP image encoder for sketching
Sketching as a set of Bezier curves
Param-optimization on CLIP-loss
Source code and models available

More: https://bit.ly/3oLEDF4
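
To make "sketching as a set of Bezier curves" concrete, here is a minimal sketch of evaluating one stroke. It assumes cubic curves with four 2D control points, which the post does not specify, and the function names are hypothetical. The CLIP-based loss and the differentiable rasterizer needed to optimize the control points are not shown.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Point on a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    # Bernstein basis weights; they always sum to 1.
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)

def sample_stroke(control_points, n=16):
    """Sample n points (n >= 2) along one stroke given its four control points."""
    p0, p1, p2, p3 = control_points
    return [cubic_bezier(p0, p1, p2, p3, i / (n - 1)) for i in range(n)]
```

In an optimization setting the control points would be the free parameters, and a sketch would be a list of such strokes.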
🪂SAHI: slicing detection/segmentation🪂

👉An open-source lightweight library for large scale object detection & instance segmentation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Slicing Aided Hyper Inference
Large-scale detection/segment.
Sliced inference and merging
Utils for conversion, slicing, etc.
Code licensed under MIT License

More: https://bit.ly/3uMJoBZ
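
The "sliced inference and merging" idea can be sketched without the library itself. The helpers below are hypothetical and are not SAHI's own code: they compute overlapping windows over a large image and shift a per-slice detection box back into full-image coordinates. The 512-pixel slice size and 0.2 overlap are assumed defaults, and the final de-duplication of overlapping detections is left out.

```python
def slice_boxes(img_w, img_h, slice_w=512, slice_h=512, overlap=0.2):
    """Return (x0, y0, x1, y1) windows that cover the image with overlap."""
    step_x = max(1, int(slice_w * (1 - overlap)))
    step_y = max(1, int(slice_h * (1 - overlap)))
    boxes = []
    y = 0
    while True:
        # Clamp the last row to the image border instead of padding.
        y1 = min(y + slice_h, img_h)
        y0 = max(0, y1 - slice_h)
        x = 0
        while True:
            x1 = min(x + slice_w, img_w)
            x0 = max(0, x1 - slice_w)
            boxes.append((x0, y0, x1, y1))
            if x1 >= img_w:
                break
            x += step_x
        if y1 >= img_h:
            break
        y += step_y
    return boxes

def to_full_image(box, slice_origin):
    """Shift a box predicted inside a slice back to full-image coordinates."""
    ox, oy = slice_origin
    x0, y0, x1, y1 = box
    return (x0 + ox, y0 + oy, x1 + ox, y1 + oy)
```

A detector would run on each window, and the shifted boxes would then be merged, for example with non-maximum suppression.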
🎁100,000,000 image-text pairs!🎁

👉Large-scale Chinese cross-modal dataset for benchmarking different multi-modal pre-training methods.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
100 Million <image, text> pairs
>200px size, aspect ratio (1/3~3)
Models of ResNet, ViT & SwinT
Methods: CLIP, FILIP and LiT
Privacy/Sensitive words 🤔

More: https://bit.ly/34BqlzX
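
The image criteria in the highlights can be written as a small filter. This is a sketch under two assumptions the post does not confirm: that ">200px" applies to both width and height, and that the aspect-ratio bounds are inclusive. The function name is hypothetical.

```python
def keep_image(width, height, min_side=200, max_ratio=3):
    """True if the image passes the size and aspect-ratio criteria."""
    # Assumption: both dimensions must exceed min_side pixels.
    if width <= min_side or height <= min_side:
        return False
    # Aspect ratio within [1/max_ratio, max_ratio], checked without division.
    return width <= max_ratio * height and height <= max_ratio * width
```

For instance, a 1000x250 image would be dropped for its 4:1 aspect ratio even though both sides exceed 200 pixels.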
🧁33 Million synthetic pedestrians🧁

👉A novel large, fully synthetic dataset

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Exploiting the #gta5 engine
764 full-HD videos @20 fps
33M+ person instances
BBs & segmentation masks
2D/3D keypoints & depth

More: https://bit.ly/36njlY1
🥝Marker-free 6D-point tracking🥝

👉Full position and rotation of skeletal joints from only an RGB frame

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Full 3-axis joint rotations
V-markers, emulating mocap
#3D from monocular with NN
Generalization, no retraining
SOTA rotation/position est.

More: https://bit.ly/34GdoF5
🧼 Synthetic dataset for #Retail 🧼

👉A large-scale photorealistic synthetic dataset with annotations for semantic segmentation, instance segmentation, depth estimation, and object detection.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Dataset from Standard.AI
2,134 unique scenes
25k+ annotated samples
Introduces the "change detection" task
Multi-view representation learning
NonCommercial-ShareAlike 4.0

More: https://bit.ly/3uXqubB
🌈 Graph Neural Nets Forecasting🌈

👉Data-driven approach for forecasting global weather using graph neural networks

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Data-driven forecasting via GNNs
Model: 6.7M parameters, float32
6-hour forecast in 0.04 secs.
5-day forecast in 0.8 secs.

More: https://bit.ly/3LH4CXR
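
The two timings are consistent if the 5-day forecast is produced by chaining 6-hour steps, which is an assumption here since the post does not say how the longer forecast is built. A quick check of the arithmetic, with hypothetical names:

```python
STEP_HOURS = 6            # one model step covers 6 hours (from the post)
SECONDS_PER_STEP = 0.04   # reported time for one 6-hour forecast

def rollout_cost(horizon_hours):
    """Number of chained steps and total seconds for a given horizon."""
    steps = horizon_hours // STEP_HOURS
    return steps, steps * SECONDS_PER_STEP

steps, seconds = rollout_cost(5 * 24)  # 20 steps, about 0.8 seconds
```

Twenty steps at 0.04 seconds each gives the 0.8 seconds quoted for the 5-day forecast.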
🥫Watch Those Words!🥫

👉Berkeley unveils a novel approach to discover cheap-fake and visually persuasive deep-fakes

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Regardless of falsification
Semantic person-specific
Word-conditioned analysis
Generalization across fakes

More: https://bit.ly/3oXWmcd
🔋V2X-sim for #selfdriving is out!🔋

👉V2X: collaboration between a vehicle and any surrounding entity

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Suitable for #selfdrivingcars
Rec. from road & vehicles
Multi-streams/perception
Detection, tracking, & segmentation
RGB, depth, semantic, BEV & LiDAR

More: https://bit.ly/3H6veOI
🍏Infinite Synthetic dataset for Fitness🍏

👉Open-source synthetic images for fitness, single/multi-person, with realistic variation in lighting, camera angles, and occlusions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
60k images, 1-5 avatars
15 categories, 21 variations
Blender and ray-tracing
SMPL-X + facial expression
Cloth/skin tone sampled
147 4K HDRI panoramas
Creative Commons 4.0

More: https://bit.ly/33B1R9q
DITTO: Digital Twins from Interaction

👉Digitizing objects for #metaverse through interactive perception

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
DIgital Twin of arTiculated Objects
Geometry & kinematic articulation
Articulation & 3D via perception
Source code under MIT License

More: https://bit.ly/3LMazCV
🤖 Robotic Telekinesis from YouTube 🤖

👉CMU unveils a Robot that observes humans and imitates their actions in real-time

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Enabling robo-hand teleoperation
Suitable for untrained operator
Single uncalibrated RGB camera
Leveraging unlabeled #youtube
No active fine-tuning or setup
No collision via Adv-Training

More: https://bit.ly/3H7zUnh
💄DIGAN: #AI for video generation💄

👉A novel INR-based generative adversarial network for video generation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Dynamics-aware generator
INR-based clip generator
Manipulating space/time
Identifying unnatural motion

More: https://bit.ly/3H6sHE4
🦄FILM Neural Frame Interpolation🦄

👉Frame interpolation that synthesizes multiple intermediate frames from two input images with large in-between motion

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Single unified network
High quality output
SOTA on the Xiph benchmark
Apache License 2.0

More: https://bit.ly/3pl4ZxH
🔈Neural Maintenance via listening🔈

👉Novel neural method to detect whether a machine is "healthy" or requires maintenance

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Defects at an early stage
FDWT, fast discrete wavelet
Learnable wavelet/denoising
Unsupervised learnable FDWT
The new SOTA in PM

More: https://bit.ly/3hiKWeX
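
For readers unfamiliar with the fast discrete wavelet transform, here is a minimal sketch that uses fixed Haar filters. This is a stand-in, not the method from the post: there the wavelet filters and the denoising are learned, whereas the coefficients below are hard-coded. The function names are hypothetical.

```python
import math

def haar_step(signal):
    """One DWT level: split a signal into approximation and detail halves."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_dwt(signal, levels):
    """Cascade haar_step on the approximation branch (the fast DWT pattern)."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details
```

Each level halves the signal length, so the full cascade costs time linear in the number of samples. A steady signal yields zero detail coefficients, while abrupt changes show up as large ones.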
🟦🟨 StyleGAN on Internet pics 🟦🟨

👉StyleGAN on raw uncurated images collected from Internet

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Outliers & multi-modal
Self-distillation approach
Self-filtering of outliers
Perceptual clustering

More: https://bit.ly/33Z1d5H
🦜The new SOTA for Unsupervised Object Discovery🦜

👉Self-supervised transformer to discover objects in images

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Visual tokens as nodes in graph
Edges as connectivity score
Second-smallest eigenvector = foreground
Suitable for unsupervised saliency
Weakly supervised obj. detection
Code under MIT License

More: https://bit.ly/3sqbFg3
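
The graph idea in the highlights can be sketched on a toy similarity matrix. This is a simplification under stated assumptions: it uses the unnormalized Laplacian L = D - W and plain power iteration, not whatever normalized formulation and solver the authors use, and the function names are hypothetical. Nodes stand in for visual tokens and `w[i][j]` for the connectivity score between tokens i and j.

```python
import math

def fiedler_vector(w, iters=300):
    """Approximate the eigenvector of the second-smallest eigenvalue of D - W."""
    n = len(w)
    deg = [sum(row) for row in w]
    shift = 2.0 * max(deg)  # upper bound on the largest Laplacian eigenvalue
    # Deterministic, non-constant start; a sketch-level choice, not a robust one.
    v = [float(i) for i in range(n)]
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # project out the constant eigenvector
        lv = [deg[i] * v[i] - sum(w[i][j] * v[j] for j in range(n))
              for i in range(n)]
        # Power iteration on (shift * I - L) flips the spectrum, so the
        # smallest non-trivial eigenvalue of L becomes the dominant one.
        v = [shift * v[i] - lv[i] for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v

def bipartition(w):
    """Split the nodes into two groups by the sign of the Fiedler vector."""
    return [x > 0 for x in fiedler_vector(w)]
```

Which of the two groups counts as foreground is a separate decision that this sketch leaves open.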