AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI

📌This keypoint is pure GLUE📌

👉Keypoints play a central role in computer vision.

Highlights:
✅Novel Object-centric keypoint
✅Novel sim2real training method
✅Intra-salience / inter-distinctness
✅Enforcing semantic consistency
✅Close to fully-supervised method!

More: https://bit.ly/3rth1qh

💡 LEDNet: seeing in the dark 💡

👉Researchers from NTU unveil LEDNet to see in the dark

Highlights:
✅Novel data synthesis for low-light
✅Low-light/deblurring dataset
✅12k low-blur/normal-sharp pairs
✅LEDNet: low-light + deblurring


More: https://bit.ly/3HIyYqM

👩‍🦰Back in the 50's with GAN👩‍🦰

Highlights:
✅A few thousand vintage faces
✅Models available for download
✅Stylegan2-ffhqu-1024x1024
✅No commercial use allowed

More: https://bit.ly/3LlOyKX

🦠 VNCA: bio-inspired generative model 🦠

👉A novel generative model loosely inspired by the biological processes of cellular growth and differentiation

Highlights:
✅Variational Neural Cellular Automata
✅Probabilistic generative model
✅Learn from common vector format
✅Learns a purely self-organizing generative process
✅Far away from SOTA, but interesting

More: https://bit.ly/3oGb2wG

Block-NeRF: Neural View Synthesis

👉Large-scale scene reconstruction by multiple compact NeRFs that each fit into memory.

Highlights:
✅Berkeley + Google + Waymo = 🤯
✅Scaling NeRF to city-scale scenes
✅Trick: multiple simple NeRFs
✅Render time decoupled from scene size, arbitrarily large scenes
✅Data over months & different conditions

More: https://bit.ly/3GGVHBV

🥬HW-Accelerated Neuro-Evolution🥬

👉Scalable, general-purpose, hardware-accelerated neuro-evolution toolkit by Google

Highlights:
✅Parallel on multiple TPU/GPUs
✅Neuro-evo algorithms with NNs
✅WaterWorld, Abstract paint, more
✅From Google, not an official product
✅Code under Apache License 2.0

More: https://bit.ly/3szEi9w

🚛 DeepETA: #Uber ETA via #AI🚛

👉Uber unveils its low-latency deep architecture for global ETA prediction

Highlights:
✅Latency / Accuracy / Generality
✅7 NN architectures tested
✅Encoder-decoder + Self-Attention
✅Linear transformer (kernel trick)
✅Feature sparsity for speed
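
To make the "linear transformer (kernel trick)" highlight concrete, here is a minimal sketch of kernelized attention in plain Python. It is not Uber's code: the feature map `phi` (elu(x) + 1) and the function names are assumptions borrowed from the general linear-attention literature. The point is that the keys and values are summarized once, so the cost grows linearly with the number of inputs instead of building an n x n attention matrix.

```python
import math


def phi(x):
    # Positive feature map elu(x) + 1 (an assumption, not taken from the post).
    return x + 1.0 if x > 0 else math.exp(x)


def linear_attention(Q, K, V):
    """Kernelized attention. Q, K are n x d lists, V is n x m. Returns n x m."""
    d, m = len(Q[0]), len(V[0])
    # One pass over keys/values: S = sum_j phi(k_j) v_j^T, z = sum_j phi(k_j).
    S = [[0.0] * m for _ in range(d)]
    z = [0.0] * d
    for k_row, v_row in zip(K, V):
        fk = [phi(x) for x in k_row]
        for a in range(d):
            z[a] += fk[a]
            for b in range(m):
                S[a][b] += fk[a] * v_row[b]
    # Each query reuses S and z, so no n x n matrix is ever formed.
    out = []
    for q_row in Q:
        fq = [phi(x) for x in q_row]
        norm = sum(fq[a] * z[a] for a in range(d))
        out.append([sum(fq[a] * S[a][b] for a in range(d)) / norm
                    for b in range(m)])
    return out


def quadratic_attention(Q, K, V):
    """Same weights computed explicitly (n x n), kept only for comparison."""
    m = len(V[0])
    out = []
    for q_row in Q:
        fq = [phi(x) for x in q_row]
        w = [sum(a * phi(b) for a, b in zip(fq, k_row)) for k_row in K]
        total = sum(w)
        out.append([sum(wj * v[b] for wj, v in zip(w, V)) / total
                    for b in range(m)])
    return out
```

Both functions return the same values up to floating-point rounding; only the order of the sums differs.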

More: https://bit.ly/3gFWmJh

CLIPasso: Semantic Sketching via CLIP

👉Sketching method guided by geometric and semantic simplifications (CLIP)

Highlights:
✅EPFL, TAU and IDC Herzliya
✅CLIP image encoder for sketching
✅Sketching as a set of Bezier curves
✅Param-optimization on CLIP-loss
✅Source code and models available
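
A small sketch of what "a set of Bezier curves" means in practice. It assumes cubic curves with four 2D control points per stroke; the function names are hypothetical and the CLIP-loss optimization of the control points is not shown.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Point on a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    # Bernstein basis weights for the four control points.
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


def sample_stroke(control_points, n=8):
    """Sample n points along one stroke (a tuple of four control points)."""
    return [cubic_bezier(*control_points, i / (n - 1)) for i in range(n)]


# A "sketch" is then just a list of strokes; these coordinates are made up.
sketch = [
    ((0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)),
    ((0.2, 0.2), (0.4, 0.6), (0.6, 0.6), (0.8, 0.2)),
]
polylines = [sample_stroke(stroke) for stroke in sketch]
```

In an optimization setting the control points would be the free parameters, and a differentiable rasterizer would turn the curves into an image to score.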

More: https://bit.ly/3oLEDF4

🪂SAHI: slicing detection/segmentation🪂

👉An open-source lightweight library for large-scale object detection & instance segmentation

Highlights:
✅Slicing Aided Hyper Inference
✅Large-scale detection/segment.
✅Sliced inference and merging
✅Utils for conversion, slicing, etc.
✅Code licensed under MIT License
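
The "sliced inference and merging" idea can be sketched without the library itself. The helpers below are hypothetical, not SAHI's API: one computes overlapping slice windows over a large image, the other shifts a detection from slice coordinates back to full-image coordinates. Overlap is given in pixels, and duplicate suppression across slices is left out.

```python
def slice_boxes(img_w, img_h, slice_w, slice_h, overlap_px=0):
    """Return (x0, y0, x1, y1) windows that cover the whole image."""
    if overlap_px >= min(slice_w, slice_h):
        raise ValueError("overlap must be smaller than the slice size")
    step_x, step_y = slice_w - overlap_px, slice_h - overlap_px
    boxes = []
    y = 0
    while True:
        # Clamp the last row/column to the image border, keeping slice size.
        y1 = min(y + slice_h, img_h)
        y0 = max(0, y1 - slice_h)
        x = 0
        while True:
            x1 = min(x + slice_w, img_w)
            x0 = max(0, x1 - slice_w)
            boxes.append((x0, y0, x1, y1))
            if x1 >= img_w:
                break
            x += step_x
        if y1 >= img_h:
            break
        y += step_y
    return boxes


def to_full_image(box, slice_origin):
    """Shift a box predicted inside a slice back to image coordinates."""
    x0, y0, x1, y1 = box
    ox, oy = slice_origin
    return (x0 + ox, y0 + oy, x1 + ox, y1 + oy)
```

A detector would run once per window, and the shifted boxes from all windows would then be merged, typically with some form of non-maximum suppression.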

More: https://bit.ly/3uMJoBZ

100,000,000 image-text pairs!

👉Large-scale Chinese cross-modal dataset for benchmarking different multi-modal pre-training methods.

Highlights:
✅100 Million <image, text> pairs
✅>200px size, aspect ratio (1/3~3)
✅Models of ResNet, ViT & SwinT
✅Methods: CLIP, FILIP and LiT
✅Privacy/Sensitive words 🤔
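
The size and aspect-ratio rule in the highlights reads as a simple filter. The sketch below assumes that both sides must exceed 200 px and that the ratio is width/height; the function name and defaults are hypothetical, not the dataset's actual pipeline.

```python
def keep_image(width, height, min_side=200, max_ratio=3):
    """True if both sides exceed min_side and the aspect ratio is in [1/3, 3]."""
    if width <= min_side or height <= min_side:
        return False
    # Integer comparison avoids floating-point edge cases at exactly 1/3 or 3.
    return width <= max_ratio * height and height <= max_ratio * width


candidates = [(640, 480), (200, 600), (1200, 250), (900, 300)]
kept = [size for size in candidates if keep_image(*size)]
```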

More: https://bit.ly/34BqlzX

33 Million synthetic pedestrians

👉A novel large, fully synthetic dataset

Highlights:
✅Exploiting the #gta5 engine
✅764 full-HD videos @20 fps
✅33M+ person instances
✅Bounding boxes & segmentation masks
✅2D/3D keypoints & depth

More: https://bit.ly/36njlY1

Marker-free 6D-point tracking

👉Full position and rotation of skeletal joints, using only an RGB frame

Highlights:
✅Full 3-axis joint rotations
✅V-markers, emulating mocap
✅#3D from monocular with NN
✅Generalization, no retraining
✅SOTA rotation/position est.

More: https://bit.ly/34GdoF5

🧼 Synthetic dataset for #Retail 🧼

👉A large-scale photorealistic synthetic dataset with annotations for semantic segmentation, instance segmentation, depth estimation, and object detection.

Highlights:
✅Dataset from Standard.AI
✅2,134 unique scenes
✅25k+ annotated samples
✅Introducing the "change detection" task
✅Multi-view representation learning
✅NonCommercial-ShareAlike 4.0

More: https://bit.ly/3uXqubB

🌈 Graph Neural Nets Forecasting🌈

👉Data-driven approach for forecasting global weather using graph neural networks

Highlights:
✅Data-driven forecasting via GNNs
✅Model: 6.7M parameters, float32
✅6-hour forecast in 0.04 secs.
✅A 5-day forecast in 0.8 secs.
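
The two timing figures are consistent with each other if a 5-day forecast is produced by chaining 6-hour steps, which is an assumption here rather than something the post states. A quick check:

```python
STEP_HOURS = 6        # forecast horizon of one model step (from the post)
STEP_SECONDS = 0.04   # reported time per 6-hour step (from the post)


def rollout_cost(days):
    """Number of chained 6-hour steps and total time for a forecast."""
    steps = days * 24 // STEP_HOURS
    return steps, steps * STEP_SECONDS


steps, seconds = rollout_cost(5)
print(steps, round(seconds, 2))  # 20 steps, about 0.8 s
```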

More: https://bit.ly/3LH4CXR

🥫Watch Those Words!🥫

👉Berkeley unveils a novel approach to detect cheap-fakes and visually persuasive deep-fakes

Highlights:
✅Regardless of falsification
✅Semantic person-specific
✅Word-conditioned analysis
✅Generalization across fakes

More: https://bit.ly/3oXWmcd

🔋V2X-sim for #selfdriving is out!🔋

👉V2X: collaboration between a vehicle and any surrounding entity

Highlights:
✅Suitable for #selfdrivingcars
✅Recordings from road & vehicles
✅Multi-streams/perception
✅Detection, tracking, & segmentation
✅RGB, depth, semantic, BEV & LiDAR

More: https://bit.ly/3H6veOI

Infinite Synthetic dataset for Fitness

👉Open-source synthetic images for fitness, single/multi-person, with realistic variation in lighting, camera angles, and occlusions

Highlights:
✅60k images, 1-5 avatars
✅15 categories, 21 variations
✅Blender and ray-tracing
✅SMPL-X + facial expression
✅Cloth/skin tone sampled
✅147 4K HDRI panoramas
✅Creative Commons 4.0

More: https://bit.ly/33B1R9q

♊ DITTO: Digital Twins from Interaction ♊

👉Digitizing objects for #metaverse through interactive perception

Highlights:
✅DIgital Twin of arTiculated Objects
✅Geometry & kinematic articulation
✅Articulation & 3D via perception
✅Source code under MIT License

More: https://bit.ly/3LMazCV

🤖 Robotic Telekinesis from YouTube 🤖

👉CMU unveils a robot that observes humans and imitates their actions in real time

Highlights:
✅Enabling robo-hand teleoperation
✅Suitable for untrained operators
✅Single uncalibrated RGB camera
✅Leveraging unlabeled #youtube
✅No active fine-tuning or setup
✅No collision via Adv-Training

More: https://bit.ly/3H7zUnh

💄DIGAN: #AI for video generation💄

👉A novel INR-based generative adversarial network for video generation

Highlights:
✅Dynamics-aware generator
✅INR-based clip generator
✅Manipulating space/time
✅Identifying unnatural motion

More: https://bit.ly/3H6sHE4