AI with Papers - Artificial Intelligence & Deep Learning
15.2K subscribers
136 photos
249 videos
14 files
1.31K links
All the AI with papers. Fresh updates every day on #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🥽 #metaverse in 1991 🥽

👉Q: Is #VR the technology that has developed least over the last 30 years? 🤔

Discussion: https://bit.ly/3txWF07
👍3🤬3🥰1🤯1
🫕NeRFusion: Large-Scale Reconstruction🫕

👉Efficient large-scale reconstruction & photo-realistic rendering

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Frame-by-frame R.F.
Neural reconstruction
Real-time at 20+ fps
SOTA on indoor / objects

More: https://bit.ly/3iyfoCo
🤯7🔥4👍3👏2
ORViT for video understanding tasks

👉ORViT: object-centric approach that extends ViT layers incorporating object representations

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Spatio-temporal through the net
"Object-Region Attention"
"Object-Dynamics" module
Code just released! Apache 2.0

More: https://bit.ly/3wAUavW
🔥5👍3😱2👏1
🪅Insane Neural Sketching from #MIT🪅

👉Line drawing generation as unsupervised image translation with various losses

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Unpaired method for line drawing
Geometry loss to predict depth
Semantic loss to match CLIP feats
SOTA on unpaired translation/generation
Code and Models under MIT License

More: https://bit.ly/36JRr8A
🤯7🔥41👍1🥰1👏1😁1
🏔️MPS-Net: new SOTA for #3D human🏔️

👉MPS-Net: accurate & temporally coherent 3D human pose/shape from video

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
MoCA: visual cues from motion
HAFI to mix past/future feats
Stronger temporal correlation
SOTA on multiple datasets

More: https://bit.ly/3uAI5EB
🤯9🔥1🥰1😱1
🤿Transfiner: hyper-detailed segmentation🤿

👉Mask Transfiner: #AI for HQ & efficient instance segmentation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Transfiner: HQ segmentation
HQ seg. via quadtree structure
SOTA & extreme details
Code under MIT License

More: https://bit.ly/3KVzseM
👍5🔥3🤯1
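A rough sketch of the quadtree idea mentioned in the highlights: subdivide only where the mask is uncertain, so fine-grained work is spent on a few small cells. This is not the Transfiner implementation (the paper refines quadtree nodes with a transformer over multi-scale features). The `quadtree_cells` helper, the boolean `uncertain` grid and the toy 4x4 input are all assumptions made for illustration.

```python
from typing import List, Tuple

Cell = Tuple[int, int, int]  # (row, col, size) of a square cell


def quadtree_cells(uncertain: List[List[bool]], min_size: int = 1) -> List[Cell]:
    """Return quadtree leaves, splitting only cells that contain uncertain pixels.

    Assumes a square grid whose side length is a power of two (hypothetical input).
    """
    n = len(uncertain)
    leaves: List[Cell] = []

    def has_uncertain(r: int, c: int, size: int) -> bool:
        return any(
            uncertain[i][j]
            for i in range(r, r + size)
            for j in range(c, c + size)
        )

    def split(r: int, c: int, size: int) -> None:
        # Stop at clean cells or at the finest allowed resolution.
        if size <= min_size or not has_uncertain(r, c, size):
            leaves.append((r, c, size))
            return
        half = size // 2
        for dr in (0, half):
            for dc in (0, half):
                split(r + dr, c + dc, half)

    split(0, 0, n)
    return leaves


# Toy example: a 4x4 grid with a single uncertain pixel in the top-left corner.
grid = [[False] * 4 for _ in range(4)]
grid[0][0] = True
leaves = quadtree_cells(grid)
# Only the finest cells that actually contain uncertain pixels need refinement.
to_refine = [cell for cell in leaves if cell[2] == 1 and grid[cell[0]][cell[1]]]
```

In this toy case the clean three quarters of the grid stay as coarse 2x2 leaves, and only one 1x1 cell is flagged for refinement.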
🥙 DualStyleGAN: SOTA in style transfer🥙

👉Flexible control of dual styles of face domain and extended artistic portrait domain

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
High-resolution (1024×1024)
Intrinsic/extrinsic style path
Hierarchical style manipulation
Novel progressive fine-tuning
Source code under MIT License

More: https://bit.ly/3uS26Xp
👍11🤩4🔥1
🍚 GTR: Global Tracking Transformers 🍚

👉UTexas + Apple: transformer for global multi-object tracking

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
GTR operates on any object
Few frames → global trajectories
SOTA on detectors for any object
Code under Apache License 2.0

More: https://bit.ly/3DiqkxF
🔥7👍2🤯2😱1
🧠E2E Perception for #selfdrivingcars🧠

👉HybridNets: multi-task net with several key optimizations

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
End-to-end perception network
Traffic, lane, object detection
Drivable segmentation area
Real-time on embedded systems
Source code under MIT License

More: https://bit.ly/3JMk8Az
👍84👏2🤯1😱1
🛩️Smart Parking with UAVs🛩️

👉A novel methodology to monitor car parking areas in real-time via Drones/UAVs

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
YoloV3 + DeepSort tracker
Vehicle detection/tracking
Occupancy estimation via RT
Four blocks, unique pipeline

More: https://bit.ly/3iJD8nm
8👍5🥰1🤯1
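A minimal sketch of what an occupancy-estimation step could look like once a detector/tracker (YoloV3 + DeepSort in the post) has produced one box per vehicle. The slot layout, the `overlap_ratio` and `estimate_occupancy` helpers, and the 0.5 threshold are illustrative assumptions, not details from the paper.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def overlap_ratio(slot: Box, vehicle: Box) -> float:
    """Fraction of the slot area covered by the vehicle box."""
    inter_w = max(0.0, min(slot[2], vehicle[2]) - max(slot[0], vehicle[0]))
    inter_h = max(0.0, min(slot[3], vehicle[3]) - max(slot[1], vehicle[1]))
    slot_area = (slot[2] - slot[0]) * (slot[3] - slot[1])
    return (inter_w * inter_h) / slot_area if slot_area > 0 else 0.0


def estimate_occupancy(
    slots: Dict[str, Box],
    vehicles: List[Box],
    threshold: float = 0.5,  # assumed cut-off, would need tuning on real footage
) -> Dict[str, bool]:
    """Mark a slot as occupied if any vehicle covers at least `threshold` of it."""
    return {
        name: any(overlap_ratio(slot, v) >= threshold for v in vehicles)
        for name, slot in slots.items()
    }


# Hypothetical layout: three adjacent slots and two tracked vehicles.
slots = {"A1": (0, 0, 10, 20), "A2": (10, 0, 20, 20), "A3": (20, 0, 30, 20)}
vehicles = [(1, 2, 9, 18), (19, 1, 27, 19)]
occupancy = estimate_occupancy(slots, vehicles)
```

Here the second vehicle only clips slot A2, so A2 stays free while A1 and A3 are marked occupied. A real aerial pipeline would likely need slot polygons rather than axis-aligned boxes.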
👕 Detecting Events via #AI 👕

👉Localizing object states & corresponding state-modifying actions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Self-supervised learning of state-modifying actions
Noise adaptive weighting
ChangeIt: 2.6k+ hrs, 34k+ changes
Dataset, code, and model!

More: https://bit.ly/3uBwxkj
👍7🤯1
🌈🌈 Interactive Neural Labelling 🌈🌈

👉Dense labelling of geometry, color & semantics via #3D neural field

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
No training data
Dense labelling
Classes on the fly
Labelling at scale

More: https://bit.ly/36Y0faQ
🔥4👍1🤯1😱1
♟️Neural RGB-D Reconstruction♟️

👉Novel approach for #3D mixing implicit surface representations with NeRFs

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
RGB-D based reconstruction
Leveraging color & depth
Depth into the NeRF
Pose & camera refinement

More: https://bit.ly/3iN6e54
🔥5👍2🤯2🤩1
🦓 Hyper-Fast Refinement 🦓

👉SharpContour: novel contour-based refinement for semantic segmentation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Instance-aware Point Classifier
Deforming by discrete updating
Estimating offsets independently
Source code soon available!

More: https://bit.ly/3qL04GY
👍5🔥4🤯1😱1
🥗 Neural Mesh via Text only 🥗

👉Zero-shot generation of 3D model using only a target text prompt

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
ZS 3D model with text only
ZS text-guided generation
Meshes with texture/normal
Differentiable LLS implementation

More: https://bit.ly/3u0qnvb
🤯8👍1🔥1
🪆#3D, Materials, and Lighting from 2D🪆

👉Nvidia: topology, materials & map lighting jointly from 2D. INSANE 😮

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Topology, materials and lighting
Meshes with materials/lighting
Compact volumetric texturing
Differentiable all-frequency lighting
Code under #NVIDIA License

More: https://bit.ly/3IUoF2t
👏5👍1🤯1😱1
🍜Ref-NeRF for extreme realism🍜

👉Ref-NeRF: reflected radiance & structures via collection of spatially-varying scene properties

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Realism and accuracy
Replacing NeRF’s params
Regularization of volume density
Integrated Directional Encoding

More: https://bit.ly/3tTlS5l
👍4🤯2🔥1
🦧OFA for all: Cross, Vision, Language🦧

👉Unified multimodal model for image generation, visual grounding, etc.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Sequence-to-sequence learning
Image Captioning / Generation
Visual Grounding / Classification
Text-to-Image Generation
Visual Question Answering

More: https://bit.ly/3wSTGlc
👍7🤯6👏1
🍿Old Films Back to Life with #AI🍿

👉Recurrent transformer network (RTN) to restore heavily degraded old films

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Transformer blocks for spatial restoration
Knowledge from adjacent frames
Color from keyframes to whole clip
Source code available in days!

More: https://bit.ly/3wZbV8y
12👍2🤯1
🍊Neural Head #Avatars from RGB🍊

👉Novel neural representation for animatable head avatar

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel articulated human head
Full-geometry reconstruction
Differentiable optimization pipeline
Disentanglement of shape/color

More: https://bit.ly/3DxUGMI
🔥3🤯2😱1