AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Fresh updates every day about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🚙 AutoRF: #3D objects in-the-wild 🚙

👉From #Meta: a #3D object from just a single in-the-wild image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel view synthesis from in-the-wild images
✅Normalized, object-centric representation
✅Disentangling shape, appearance & pose
✅Exploiting bounding boxes & panoptic segmentation
✅Shape/appearance properties for objects


More: https://bit.ly/3O4ONeQ
🌠 GAN-based Darkest Dataset 🌠

👉Berkeley + #Intel announce the first photorealistic dataset under starlight (no moon, <0.001 lx)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅"Darkest" dataset ever seen
✅Moonless, no external illumination
✅GAN-tuned physics-based model
✅Clips with dancing, volleyball, flags...

More: https://bit.ly/3LXxMkN
🤖Populating with digital humans🤖

👉ETHZ unveils GAMMA to populate the #3D scene with digital humans

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅GenerAtive Motion primitive MArkers
✅Realistic, controllable, infinite motions
✅Tree-based search to preserve quality
✅SOTA in realistic/controllable motion

More: https://bit.ly/3OgY4AG
🔥#AIwithPapers: we are ~2,000!🔥

💙💛 Simply amazing. Thank you all 💙💛

😈 Invite your friends -> https://t.iss.one/AI_DeepLearning
😼GARF: Gaussian Activated NeRF😼

👉GARF: Gaussian Activated Radiance Fields for high-fidelity reconstruction and pose estimation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅NeRF from imperfect camera poses
✅NO hyper-parameter tuning/initialization
✅Theoretical insight on Gaussian activation
✅Unlocking NeRF for real-world applications?

More: https://bit.ly/36bvdfU
🎭Novel pre-training strategy for #AI🎭

👉EPFL unveils Multi-modal Multi-task Masked Autoencoders (MultiMAE)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Multimodal: additional modalities beyond RGB
✅Multi-task: multiple outputs beyond RGB
✅General: MultiMAE by pseudo-labeling
✅Classification, segmentation, depth
✅Code under a NonCommercial 4.0 International license

More: https://bit.ly/3jRhNsN
🧪 A new SOTA in Dataset Distillation 🧪

👉A new approach by Matching Training Trajectories is out!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Distilling data "to match" a bigger dataset
✅Distilled data to guide a network
✅Trajectories of experts from real data
✅SOTA + distilling higher-res visual data

More: https://bit.ly/3JwYOxW
🧤 Two-Hand tracking via GCN 🧤

👉The first-ever GCN for two interacting hands in a single RGB image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Reconstruction by GCN mesh regression
✅PIFA: pyramid attention for local occlusion
✅CHA: cross hand attention for interaction
✅SOTA + generalization to in-the-wild scenarios
✅Source code available under GNU 🤯

More: https://bit.ly/3KH5FWO
🕹️Video K-Net, SOTA in Segmentation🕹️

👉Simple, strong, and unified framework for fully end-to-end video panoptic segmentation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Learnable kernels from K-Net
✅K-Net learns to segment & track
✅Appearance / cross-temporal kernel interaction
✅New SOTA without bells and whistles 🤷‍♂️

More: https://bit.ly/3uEEZQR
🐭DeepLabCut: tracking animals in the wild🐭

👉A toolbox for markerless pose estimation of animals performing various tasks

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Multi-animal pose estimation
✅Datasets for multi-animal pose
✅Key-points, limbs, animal identity
✅Optimal key-points without user input

More: https://bit.ly/37L1mLE
Neural Articulated Human Body

👉Novel neural implicit representation for the articulated human body

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅COmpositional Articulated People
✅Large variety of shapes & poses
✅Novel encoder-decoder architecture

More: https://bit.ly/3xvn7dl
🦚 2K Resolution Generative #AI 🦚

👉Novel continuous-scale training with variable output resolutions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Mixed-resolution data
✅Arbitrary scales during training
✅Generations beyond 1024×1024
✅Variant of FID metric for scales
✅Source code under MIT license

More: https://bit.ly/3uNfVY6
DS Unsupervised Video Decomposition

👉Novel method to extract persistent elements of a scene

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Scene element as Deformable Sprite (DS)
✅Deformable Sprites by video auto-encoder
✅Canonical texture image for appearance
✅Non-rigid geometric transformation

More: https://bit.ly/37WV9w1
🥓 L-SVPE for Deep Deblurring 🥓

👉L-SVPE to deblur scenes while recovering high-frequency details

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Learned Spatially Varying Pixel Exposures
✅Next-gen focal-plane sensor + DL
✅Deep conv decoder for motion deblurring
✅Superior results over non-optimized exposures

More: https://bit.ly/3uRYQMT
🧧Hyper-Fast Instance Segmentation🧧

👉Novel Temporally Efficient Vision Transformer (TeViT) for VIS

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Video instance segmentation transformer
✅Contextual-info at frame/instance level
✅Nearly convolution-free framework 🤷‍♂️
✅The new SOTA for VIS, ~70 FPS!
✅Code & models under MIT license

More: https://bit.ly/3rCMXIn
📗Unified Scene Text/Layout Detection📗

👉World's first hierarchical scene text dataset + novel detection method

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Unified detection & geometric layout
✅Hierarchical annotations in natural scenes
✅Word, line, & paragraph level annotations
✅Source under CC Attribution Share Alike 4.0

More: https://bit.ly/3jRpezV
🙌 #Oculus' new Hand Tracking 🙌

👉Hands are able to move as naturally and intuitively in the #metaverse as they do in real life

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Hands 2.0 powered by CV & ML
✅Tracking hand-over-hand interactions
✅Crossing hands, clapping, high-fives
✅Accurate thumbs-up gesture

More: https://bit.ly/3JXPvY2
🎗️New SOTA in #3D human avatar🎗️

👉PHORHUM: photorealistic 3D human from mono-RGB

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Pixel-aligned method for 3D geometry
✅Unshaded surface color + illumination
✅Patch-based rendering losses for visible parts
✅Plausible color estimation for non-visible parts

More: https://bit.ly/3MkvBrA
📟 What's in your hands (#3D)? 📟

👉Reconstructing hand-held objects (from a single RGB image) without knowing their 3D templates 🤷‍♂️

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Hand is highly predictive of object shape
✅Reconstruction conditioned on hand articulation
✅Visual features / articulation-aware coordinates
✅Code and models available!

More: https://bit.ly/3vuYn2a
🔋YODO: You Only Demonstrate Once🔋

👉Novel category-level manipulation learned in simulation from a single demonstration video 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅One-shot imitation learning (IL), model-free 6D pose tracking
✅Demonstration by a single third-person-view video
✅Manipulation including high-precision tasks
✅Category-level Behavior Cloning
✅Attention for dynamic coordinate selection
✅Generalizability to novel unseen objects/environments

More: https://bit.ly/3v0V4R4