AI with Papers - Artificial Intelligence & Deep Learning
15.2K subscribers
136 photos
248 videos
14 files
1.31K links
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI

🪰NUWA-Infinity is out!🪰

👉∞ generation by #Microsoft: arbitrarily-sized HD images and long videos 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Unconditional Image Gen.
✅Text-to-Image/Text-to-Clip
✅Animation / Out-painting
✅Hi-res, arbitrarily long clips
✅NCP for patch caching

More: https://bit.ly/3zmBf9f

🔥 #AIwithPapers: we are 3,500+! 🔥

💙💛 Ready for YOLO 10, 11, π, ∞, Ψ, and more? The more we are, the faster we catch 'em all 💙💛

😈 Invite your friends -> https://t.iss.one/AI_DeepLearning
🎷🎷OMNI3D: #3D Objects in the Wild🎷🎷

πŸ‘‰#3D detection: 234k images, 3M+ instances & 97 categories

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…OMNI3D from publicly released dataset
βœ…234k pics, 3M+ annotation with 3D box
βœ…97 categories such as sofa, table, cars
βœ…Fast (450x) and exact algorithm for IoU
βœ…Cube R-CNN: novel 3D object detector

More: https://bit.ly/3cznjzG
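
For readers new to the metric, here is a minimal sketch of 3D IoU in the simplest case, axis-aligned boxes. This is not the fast exact algorithm from the paper, which handles arbitrarily oriented boxes; the function name and the (xmin, ymin, zmin, xmax, ymax, zmax) box layout are assumptions made for illustration.

```python
def iou_3d_axis_aligned(a, b):
    """IoU of two axis-aligned 3D boxes.

    Each box is (xmin, ymin, zmin, xmax, ymax, zmax); this layout is an
    assumption for the sketch, not a format taken from OMNI3D.
    """
    inter = 1.0
    for i in range(3):
        lo = max(a[i], b[i])          # start of the overlap on this axis
        hi = min(a[i + 3], b[i + 3])  # end of the overlap on this axis
        if hi <= lo:                  # no overlap on one axis means no overlap at all
            return 0.0
        inter *= hi - lo

    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return inter / (vol_a + vol_b - inter)


# Two 2x2x2 cubes overlapping in a 1x1x1 corner: 1 / (8 + 8 - 1)
print(iou_3d_axis_aligned((0, 0, 0, 2, 2, 2), (1, 1, 1, 3, 3, 3)))
```

Oriented boxes need a polyhedron intersection instead of the per-axis min/max, which is where a fast exact method matters.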

👹Multiface Neural Rendering 👹

👉A new multi-view, hi-res dataset collected at #META Reality Labs for neural face rendering

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Mugsy, large-scale multi-cam apparatus
✅High-res synchronized facial performances
✅Closing the gap in accessing HQ data
✅Suitable for #VR & #mixedreality

More: https://bit.ly/3b6XfeL

💄DEVIANT: SOTA in mono-3D detection💄

👉A novel Depth EquiVarIAnt NeTwork for 3D monocular detection in the wild

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Michigan + #Meta + Ford 🤯
✅Depth-equiv. + scale-equiv. steerable
✅New SOTA on KITTI & Waymo
✅Ok cross-dataset generalization

More: https://bit.ly/3OEFtgK
🧱 Assembling #LEGO with #AI 🧱

πŸ‘‰Step-by-step assembly manual created by human into machine-interpretable instructions

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…Stanford + MIT + #Google 🀯
βœ…MEPNet: Manual-to-Executable-Plan Net
βœ…Manual to machine-executable plan
βœ…2D manual - 3D geometric shape
βœ…Reasoning on 3D alignments of legos

More: https://bit.ly/3PCwn5C

🎃New SOTA in UDA Semantic Seg.🎃

👉HRDA: multi-res Unsupervised Domain Adaptive Semantic Seg. -> SOTA

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ETH + MPG + KU Leuven 🤯
✅HRDA: multi-res approach for UDA
✅Manageable GPU memory footprint
✅Small objects & fine segmentation detail
✅New SOTA on GTA and Synthia datasets

More: https://bit.ly/3cKtDEp

⚗️ SemAbs: 3D Scene Understanding ⚗️

👉Framework that equips 2D Vision-Language Models (VLMs) with new 3D spatial capabilities

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅2D VLMs with 3D reasoning skills
✅ViTs Efficient MS Relevancy Extraction
✅Novel Open-World understanding tasks
✅Completing partially observed objects
✅Finding hidden objects from language

More: https://bit.ly/3PYYk7d
🦚 TinyCD: Neural Change Detection 🦚

πŸ‘‰TinyCD: new SOTA in change detection with up to 150x fewer parameters.

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…SOTA with up to 150X fewer params
βœ…Mixing blocks for s.t. cross-correlation
βœ…PW-MLP for pixel wise classification
βœ…MAMB: novel block for skip connection

More: https://bit.ly/3zFEngk
❀16πŸ‘2πŸ‘1
This media is not supported in your browser
VIEW IN TELEGRAM
🦊 3D-Aware "StyleGANv2" version 🦊

πŸ‘‰Upgrading StyleGANv2 into a novel 3D-aware GAN with just a minimal set of changes🀯

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…MPI-like 3D-aware GAN w/ single-view
βœ…GMPI: generative multiplane image
βœ…2D GAN 3D-aware with a minimal changes
βœ…Encoding 3D-aware inductive biases

More: https://bit.ly/3OJ5gnS
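
A multiplane image (MPI) is a stack of fronto-parallel RGBA planes that gets flattened into a picture by alpha compositing. Below is a minimal single-pixel sketch of the standard front-to-back "over" rule; it is not GMPI's code, and the function name and list-based layout are assumptions for illustration.

```python
def composite_mpi_pixel(colors, alphas):
    """Front-to-back over-compositing of one pixel across MPI planes.

    colors: list of (r, g, b) tuples, nearest plane first (assumed ordering).
    alphas: list of opacities in [0, 1], same order.
    Returns (rgb, accumulated_alpha).
    """
    out = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still passing through nearer planes
    for color, alpha in zip(colors, alphas):
        weight = alpha * transmittance
        out = [o + weight * c for o, c in zip(out, color)]
        transmittance *= 1.0 - alpha
    return out, 1.0 - transmittance


# A half-transparent red plane in front of an opaque green plane
rgb, acc = composite_mpi_pixel([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], [0.5, 1.0])
print(rgb, acc)
```

Because every step is a multiply-add, the same rule is differentiable, which is what lets a generator be trained through the rendering.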

📺 NeRF-ing "The Big Bang Theory" 📺

👉Berkeley unveils an approach for accurate estimation of actor's 3D pose & location

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Input: images across the whole season
✅3D context (i.e. cams, structure, body)
✅Integrating context in 3D estimation
✅Re-ID, gaze, cinematography, pic editing
✅Knock, Knock, Penny!

More: https://bit.ly/3OLuaUb
🎩ShAPO: SOTA in object understanding🎩

πŸ‘‰Joint multi-object detection, #3D texture, 6D object pose & size estimation.

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…Disentangled shape & appearance
βœ…Efficient octree-based differentiable
βœ…Object-centric understanding pipeline
βœ…Detection, reconstruction , 6D & size
βœ…SOTA in reconstruction & pose est.

More: https://bit.ly/3oHN5EQ

🏙️ CityNeRF: Neural Rendering of City Scenes 🏙️

👉Progressive NeRF model and training set on city-scenes

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅BungeeNeRF: novel progressive NeRF
✅Details on drastically varied scales
✅Growing with residual block structure
✅Inclusive multi-level data supervision

More: https://bit.ly/3cS9vk7
🍦🍦 Rewriting Geometry of GAN 🍦🍦

πŸ‘‰Drive GAN synthesizing many unseen objects with the desired shape

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…User-friendly "warping" with geometry
βœ…Low-rank update to layer for editing
βœ…Latent augmentation based on style-mix
βœ…Endless objects with defined changes
βœ…Latent space interpolation, image editing

More: https://bit.ly/3zIfOj8
🍏🍏 GAUDI: the Neural Architect 🍏🍏

πŸ‘‰Novel generative model for immersive 3D scenes from a moving camera

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…Hundreds of thousands pics/scenes
βœ…Novel denoising optimization objective
βœ…New SOTA across multiple datasets
βœ…Un/conditional on images/text

More: https://bit.ly/3Bt65ye
🚜NeDDF: the NeRF evolution!🚜

πŸ‘‰Novel 3D representation that reciprocally constrains distance & density fields

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…NeRF provides no distance
βœ…Extending for arbitrary density
βœ…Density via dist-field & gradient
βœ…Alleviating the instability

More: https://bit.ly/3Bte8LC

🔥AND/OR: Composable Diffusion Models🔥

👉Novel neural compositional generation via Composable Diffusion Models

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅DM as energy-based models
✅Connecting diffusion models
✅Conjunction & negation, on top of DM
✅Zero-shot combinatorial generalization

More: https://bit.ly/3PYv1Cs
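
To give a feel for what "conjunction on top of DM" can look like at sampling time, here is a minimal sketch. It assumes the composed noise estimate takes a classifier-free-guidance-like form: the unconditional prediction plus a weighted sum of (conditional minus unconditional) differences, one per concept. The function name, the flat-list representation and the weights are hypothetical; check the paper for the exact operators, in particular for negation.

```python
def compose_noise(eps_uncond, eps_conds, weights):
    """Combine per-concept noise predictions into one estimate.

    eps_uncond: unconditional noise prediction (flat list of floats).
    eps_conds:  one conditional prediction per concept, same length each.
    weights:    one scalar per concept (values here are illustrative only).
    Assumed rule: eps = eps_uncond + sum_i w_i * (eps_cond_i - eps_uncond).
    """
    composed = list(eps_uncond)
    for eps_c, w in zip(eps_conds, weights):
        for k, (e_c, e_u) in enumerate(zip(eps_c, eps_uncond)):
            composed[k] += w * (e_c - e_u)
    return composed


# Two toy "concepts" on a 2-element noise vector
print(compose_noise([0.0, 1.0], [[1.0, 1.0], [0.0, 3.0]], [2.0, 0.5]))
```

In a real sampler the lists would be tensors produced by the same denoiser, queried once per prompt at every step.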

🔥 MobileNeRF is out -> Pure Fire! 🔥

👉MobileNeRF is out: the mobile evolution of NeRF via textured polygons.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Same quality, 10x faster than SNeRG
✅Memory-- by storing surface textures
✅Integrated GPUs: less memory/power
✅Suitable for browsers: the viewer is HTML

More: https://bit.ly/3PUKPWy