AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
☀️LocoProp: Neural Layers Composition☀️

👉Google AI unveils LocoProp: novel neural paradigm for modular composition of layers.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Backprop++ via Local Loss Optimization
Layer-based w-reg, target output, loss
Multiple local update via first-order opt.
Superior performance and efficiency

More: https://bit.ly/3Q40YJn
🔥13
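The LocoProp highlights above (a per-layer weight regularizer, target output and loss, followed by several local first-order updates) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the two-layer scalar linear network, the squared local loss, the proximity regularizer and all hyperparameter values are not taken from the paper.

```python
# Toy sketch of layer-wise local loss optimization (assumed setup, not the paper's code).
# Network: h = w1 * x, out = w2 * h, final loss = 0.5 * (out - y)^2.

def forward(w1, w2, x):
    h = w1 * x
    out = w2 * h
    return h, out

def loss(w1, w2, x, y):
    _, out = forward(w1, w2, x)
    return 0.5 * (out - y) ** 2

def local_update(w, layer_in, target, w_ref, reg, lr, steps):
    # Several first-order steps on the layer's own loss:
    # 0.5 * (w * layer_in - target)^2 + 0.5 * reg * (w - w_ref)^2
    for _ in range(steps):
        grad = (w * layer_in - target) * layer_in + reg * (w - w_ref)
        w -= lr * grad
    return w

def locoprop_step(w1, w2, x, y, gamma=0.5, reg=0.1, lr=0.1, steps=5):
    h, out = forward(w1, w2, x)
    err = out - y
    # Gradients of the final loss with respect to each layer's output.
    grad_out = err
    grad_h = w2 * err
    # Per-layer targets: one gradient step on the layer outputs.
    t2 = out - gamma * grad_out
    t1 = h - gamma * grad_h
    # Each layer is updated independently against its own target.
    new_w1 = local_update(w1, x, t1, w1, reg, lr, steps)
    new_w2 = local_update(w2, h, t2, w2, reg, lr, steps)
    return new_w1, new_w2

def train(w1=0.5, w2=0.5, x=1.0, y=2.0, iters=50):
    for _ in range(iters):
        w1, w2 = locoprop_step(w1, w2, x, y)
    return w1, w2
```

Because each layer only needs its own input, target and previous weights, the two `local_update` calls do not depend on each other and could run in parallel.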
🔥PCVOS: clip-wise mask VOS🔥

👉PCVOS: new semi-supervised video object segmentation method

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Reformulating semi-supervised VOS
Novel per-clip inference perspective
Clip-wise operation on intra-clip
PCVOS: model for per-clip inference
New SOTA on multiple benchmarks

More: https://bit.ly/3vJtmbz
👍10😁21🤩1
🍑 Open-World Object Detection via ViT 🍑

👉Google unveils OWL-ViT: open-vocabulary detector based on ViTs 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
ViTs for Open-World Localization
Img-level to open-vocabulary detection
SOTA one-shot (img.cond.) detection

More: https://bit.ly/3Sy3jOj
🤯12👍3
🎹🎹 Learning Piano in #AR 🎹🎹

👉PianoVision (on #META #Quest2) accelerates piano learning via Passthrough #AR & hand tracking

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Sheet Insight to learn sight-reading
MIDI keyboard connectivity
Air piano when no physical piano is available
Multiplayer Music Instruction
PianoVision Music Hall in #VR

More: https://bit.ly/3zYvwGX
15🤯6👍1
🧊EPro-PnP: Persp-n-Points Detection🧊

👉EPro-PnP: probabilistic PnP layer for general e2e pose estimation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Probabilistic PnP for general e2e pose
Top-tier in 6DoF by inserting into CDPN
Deformable accurate detection
2D-3D corresp. learned from scratch

More: https://bit.ly/3BNPXYr
👍11
🥇#NVIDIA wins SIGGRAPH's Best Paper🥇

👉Instant #NeRF wins a Best Paper award at SIGGRAPH 2022!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Speed-up of several orders of magnitude
HQ neural primitives in a matter of secs
Render in tens of milliseconds at 1080p
Source code and resources available!

More: https://bit.ly/3Qt8c9D
👏16🔥63👍1
🪰 EasyMocap: Open Neural Mocap 🪰

👉EasyMocap: open-source marker-less mocap with novel view synthesis from RGB

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬 (of the latest paper added):
Editable free-viewpoint video
Layered neural representation of humans
Multi-pax -> instances, weakly-supervised
HQ neural representation of the humans
Addressing camera error by human poses

More: https://bit.ly/3p6lUDO
🤯6👍3👏32
🎰 Texturify: Neural Textures Generator 🎰

👉A step towards automated content creation: HQ textures generated directly on the surface of a 3D object

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
TUM + Max Planck + Apple 🍏
Realistic, HQ textures from 2D pics
3D shape geometry, no 3D supervision
3D-aware surface-based generation net

More: https://bit.ly/3BW7UUU
👍8
🍨 Scaling Neural Indoor Scene Rendering 🍨

👉Neural scene rendering for indoor scenes: scalable in both training and rendering

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Neural scene rendering for indoor scenes
#3D into tiles with MLPs to scale up
Parallel training of tile-based MLPs
View-indep. components (via surf-MLP)

More: https://bit.ly/3bH94IX
🔥2👍1
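The tiling idea in the highlights above (split the 3D scene into tiles, give each tile its own MLP, train the tiles in parallel) starts with assigning samples to tiles. A minimal sketch of that bookkeeping step follows, assuming a uniform axis-aligned grid. The function names, the grid layout and the `tile_size` value are hypothetical and not taken from the paper.

```python
import math
from collections import defaultdict

def tile_index(point, tile_size):
    # Integer grid cell containing a 3D point (assumed uniform axis-aligned grid).
    x, y, z = point
    return (
        math.floor(x / tile_size),
        math.floor(y / tile_size),
        math.floor(z / tile_size),
    )

def group_by_tile(points, tile_size=1.0):
    # Bucket the samples per tile so that each tile's model
    # can be trained on its own bucket, independently of the others.
    tiles = defaultdict(list)
    for p in points:
        tiles[tile_index(p, tile_size)].append(p)
    return dict(tiles)
```

For example, `group_by_tile([(0.2, 0.3, 0.1), (1.5, 0.2, 0.0)])` puts the two points into tiles `(0, 0, 0)` and `(1, 0, 0)`, so each would be handled by a different per-tile model.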
🔥Stable Diffusion on clips. INSANE🔥

👉The most advanced latent text-to-image DM. #RunwayML just announced it is going to apply it to video clips

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Latent DM on 512p from LAION-5B
Frozen CLIP ViT-L/14 text encoder
Lightweight, runs on a 10GB-GPU
Checkpoints only for research

More: https://bit.ly/3QfkRx3
🤯13😱12👍2🔥1
🐍 Implicitron: "democratizing" NeRF🐍

👉#META open-sources a novel framework for the NeRF world in #PyTorch3D #pytorch

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Implicit representations (NeRF) / Render
RaySampler/PointSampler & more
NeRF’s MLP, IDR’s FF, SRN, etc.
Renderers: MEAR, LSTMRenderer, etc.

More: https://bit.ly/3bPyJPJ
🔥4🤯2
🧰 FGT: flow-guided inpainting 🧰

👉#Microsoft (+USTC) unveils FGT: flow-guided ViT for video inpainting 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
OF into transformer for attention++
Flow completion net w/ local feats.
Dual perspective spatial MHSA
Local attention with global content

More: https://bit.ly/3pk5J5S
11👍5
🍏NeuMan: Human NeRF in the wild🍏

👉#Apple open-sources novel human pose/view synthesis from just a single in-the-wild video

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
No extra devices/annotations
Both Human (novel poses) + Scene
E2E SMPL optimization + error-corr.
Applications such as "telegathering"

More: https://bit.ly/3K4iTO6
👍15
🥑 CLIP-based Neural Style Transfer 🥑

👉From #Nvidia, a novel method for transferring style to a #3D object

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Texture style for 3D by CLIP-ResNet50
Nearest-neighbor feature matching loss
CLIP-based loss extraction of textures
NNFM for multiple style pics / control
No source code or models available 😒

More: https://bit.ly/3c32dK5
🤯12🔥54👍2😱2😁1
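The nearest-neighbor feature matching (NNFM) loss named in the highlights above can be sketched in a few lines: each rendered feature is matched to its closest style feature under cosine distance, and the distances are averaged. This is a minimal sketch under assumptions. Plain Python lists stand in for the CLIP-ResNet50 features mentioned in the post, zero-length vectors are not handled, and the function names are hypothetical.

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity between two non-zero feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nnfm_loss(rendered_feats, style_feats):
    # For every rendered feature, take the distance to its nearest
    # style feature, then average over all rendered features.
    total = 0.0
    for f in rendered_feats:
        total += min(cosine_distance(f, s) for s in style_feats)
    return total / len(rendered_feats)
```

Passing the features of several style images as one combined `style_feats` list is one simple way to match against multiple style pictures at once.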
🔥 KeypointNeRF: code is out! 🔥

👉KeypointNeRF by #Meta: "NeRF"-avatars

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Generalizable NeRF for virtual avatar
Sparse 3D keypoints for SOTA avatar
Novel unseen subjects from 2/3 views
"iPhone" captures for #metaverse

More: https://bit.ly/3pyl17e
🔥8👍3👎1
🥭Massive GTA-V human dataset🥭

👉GTA-Human: outperforming SOTA with purely synthetic training.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
600+ subjects varying in gender, age, ethnicity & clothing
20,000+ clips, variety of human activities
6 categories of location, different BGs
Occlusions, lighting, and weather system

More: https://bit.ly/3wpZyRD
🔥142👍1
🍈DeepBillboards: old-school trick for #VR🍈

👉DeepBillboards models a 3D object implicitly using a neural net driven by the user’s viewing direction

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
#Google Brain +Tsukuba + Tokyo
Rendering at higher res., improving #VR
NeRF into interactive VR with accuracy++
NeRF (or any others) directly in #Unity

More: https://bit.ly/3CsTQ5y
👍6👏1
🌐RelPose: Probabilistic Relative Pose🌐

👉A novel method for a core component of #SLAM / NeRF-powered apps.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Core component of SfM/SLAM
Pre-processing for neural rendering (NeRF)
Energy-based over rotations
SOTA on both seen/unseen objects

More: https://bit.ly/3T60TXw
🔥12👍2👏21
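The "energy-based over rotations" highlight above amounts to scoring a set of candidate relative rotations and normalizing the scores into a probability distribution. A minimal sketch of that normalization follows, with strong simplifying assumptions: rotations are reduced to a single angle in degrees, and `toy_score` is a hypothetical stand-in for the learned network that would score an image pair against a candidate rotation.

```python
import math

def scores_to_probs(scores):
    # Softmax over candidate scores (unnormalized log-probabilities),
    # shifted by the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def toy_score(candidate_deg, true_deg, sharpness=5.0):
    # Hypothetical scorer: highest when the candidate matches the true angle.
    return sharpness * math.cos(math.radians(candidate_deg - true_deg))

def most_likely_rotation(candidates_deg, true_deg):
    scores = [toy_score(c, true_deg) for c in candidates_deg]
    probs = scores_to_probs(scores)
    best = max(range(len(candidates_deg)), key=lambda i: probs[i])
    return candidates_deg[best], probs
```

Keeping the full distribution instead of only the best candidate is what makes the estimate probabilistic: ambiguous views can leave probability mass on several rotations.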