AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🔵 Deep Saliency: driving the attention 🔵

👉Google unveils a family of operators to "drive" human saliency

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Editing image to drive Saliency
✅Transforms to hide distractors
✅Warping operator for distractor
✅GAN-op for less-saliency altern.

More: https://bit.ly/3KoQQc2
🎍#3D scene manipulation from 2D🎍

👉Reconstruct, decompose, manipulate & render 3D scenes in a single pipeline

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Unique 3D, non-occupied space from 2D
✅Inverse query algorithm for shapes
✅First synthetic dataset for 3D editing

More: https://bit.ly/3RlYhTY
🍊StableFace: Talking Face Generation🍊

👉Analysis on motion jittering in 3D face generation (audio-in -> video-out)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Motion jittering analysis for stability
✅Gaussian-based adaptive smoothing
✅Augmented erosions of neural renderer
✅Audio-fused generator for dependency

More: https://bit.ly/3Kt95gI
🧡 Avatarization in the '90s. So Romantic 🧡

👉Making of the first #MortalKombat in the early '90s

More: https://bit.ly/3wTSpJB
🚗 Massive Dataset in Virtual Cities 🚗

👉Synthehicle: 17 hours of labeled material from 340 cams across 64 day, rain, dawn & night scenes.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Multi-target multi-cam tracking
✅2D, 3D, segm. & depth annotations
✅Instance, semantic & panoptic segm.
✅340 clips, 64 scenes, 17 hrs, 4M BBs

More: https://bit.ly/3TArHiV
🪨Controllable #3D Adversarial Face🪨

👉#Meta (+CMU) on decoupling identity/expression + granular control over expressions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Supervised auto-enc. + GAN
✅UV texture maps + 3D faces
✅Control expression, saving ID
✅Code under X11 License

More: https://bit.ly/3AVE80q
🥑 DALL·E: Outpainting via #NLP 🥑

👉Extending any original image, creating large-scale images in any aspect ratio

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Extending an image beyond its borders
✅Visual elements in same style of the input
✅Driving the image "story" in new directions
✅Shadows, reflections & textures w/ context

More: https://bit.ly/3eoH8uD
🌪️ TimeLapse++: Video Temporal Pyramid 🌪️

👉Multi-scale lens to view the passage of time: far beyond a "classic" timelapse

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Inspired by "old-school" spatial pyramids
✅Video Spectrogram to go through pyramid
✅Months/years of data in a few seconds!
✅Multi-temporal freq., no aliasing

More: https://bit.ly/3TKnYPS
🫐 Stable Diffusion Video is out! 🫐

👉A free notebook to generate videos by interpolating the latent space of SD.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Blueberry to strawberry spaghetti
✅Dream items from same prompt
✅Morph different prompts (seeds)
✅Built on a script by A. Karpathy

More: https://bit.ly/3ey8632
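
For readers wondering what "interpolating the latent space" can look like in practice, here is a minimal sketch. It assumes spherical linear interpolation (slerp) between two random starting latents, a common choice for Gaussian noise; whether the notebook does exactly this is an assumption. The names `slerp`, `z_start`, `z_end`, the 16-value latent size and the frame count are all hypothetical, and no Stable Diffusion call is shown.

```python
import math
import random


def slerp(t, v0, v1, dot_threshold=0.9995):
    """Spherical linear interpolation between two equally sized vectors."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    cos_theta = sum(x * y for x, y in zip(v0, v1)) / (norm0 * norm1)
    if abs(cos_theta) > dot_threshold:
        # Nearly parallel vectors: plain linear interpolation is stable here.
        return [(1.0 - t) * x + t * y for x, y in zip(v0, v1)]
    theta = math.acos(cos_theta)  # angle between the two latents
    sin_theta = math.sin(theta)
    w0 = math.sin((1.0 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * x + w1 * y for x, y in zip(v0, v1)]


# Hypothetical setup: two Gaussian "latents" (one per seed/prompt), flattened.
random.seed(0)
dim = 16  # illustrative size only; real latents are far larger
z_start = [random.gauss(0.0, 1.0) for _ in range(dim)]
z_end = [random.gauss(0.0, 1.0) for _ in range(dim)]

# One interpolated latent per video frame; each one would then be decoded
# by the image generator (not shown) to produce that frame.
num_frames = 8
frames = [slerp(i / (num_frames - 1), z_start, z_end) for i in range(num_frames)]
```

The first and last frames reproduce the two endpoint latents, and the frames in between sweep along the arc joining them, which is what makes the generated images appear to morph.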
🦎 VMT: Video Mask Transfiner 🦎

👉Novel highly efficient ViT structure for video instance segmentation.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅HD & more temporally stable mask
✅Higher resolution features for VIS
✅Detecting error-prone s-t. regions
✅Auto-refinement on training data!

More: https://bit.ly/3RKXtb4
🤯 #StableDiffusion + #Dallemini = BOOM! 🤯

👉A #colab notebook that combines Stable Diffusion + DALL-E Mini (Craiyon)

More: https://bit.ly/3TTOshR
🐠VIS - Deformable Transformers 🐠

👉DeVIS: VIS method with efficiency and performance of deformable ViT

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Temp. multi-scale D-Attention
✅Instance-aware object queries
✅Mask: DA + multi-scale feats map
✅Improved multi-cue clip tracking
✅SOTA on YouTube-VIS 2021/OVIS

More: https://bit.ly/3TQv1Xc
🌈 X-NeRF: Cross-Spectral NeRF 🌈

👉Cross-Spectral NeRF from cams with different light spectrums

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅First ever cross-spectral NeRF
✅Avoiding non-trivial calib/match
✅Normalized Cross-Device Coords
✅Novel dataset w/ RGB, MS, & IR

More: https://bit.ly/3RqHnUo
👹TT-GNeRF: generative NeRF for Faces👹

👉TT-GNeRF: a novel 3D-aware GAN based on generative NeRF for faces

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ETH + Uni_Trento + #Snap 🤯
✅DAEM for disentanglement of 3D model
✅"Training-as-Init, Optimizing-for-Tuning"
✅Consistency++, preserving non-target ROI
✅Unsupervised optimization of geometry

More: https://bit.ly/3ARZmMw
🎪 SOTA in Arbitrary Shape Text Detection 🎪

👉Novel unified coarse-to-fine Transformer for arbitrary shape text detection

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Coarse-to-fine arbitrary text detection
✅Accurate text detection, NO post-process
✅Boundary proposal generation mechanism
✅Innovative boundary transformer (iterative)
✅Boundary energy loss (BEL) for refinement

More: https://bit.ly/3D6Ryt4
🐲 Open-Source Self-Driving projects 🐲

👉A free repo with many autonomous vehicle-related projects

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Basic/Advanced Lane/Line Detection
✅Driving behavior by training & validating
✅Autopilot: predicting steering angle

More: https://bit.ly/3qqJ7RB
🥤K-VIL: Keypoint-based visual imitation🥤

👉K-VIL: auto-incremental extraction of object-centric task representation.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Efficient task-relevant keypoints
✅Embodiment-independent tasks
✅Adaptation of tasks to new scenes
✅Input: only a small set of demo clips
✅Novel keypoint-based controller

More: https://bit.ly/3eIrxpP
💜 #Selfdriving in the '80s. Damn Romantic 💜

👉The first self-driving car with people on board, 1986. So slow and lovely.

More: https://bit.ly/3BtRDon
🏵️ TORAS: SOTA #AI for annotation 🏵️

👉TORAS: web-based AI-powered, cooperative, annotation platform.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅SOTA AI tools -> significant speedup
✅"Recipes" to define how to annotate
✅Repo with folder structure for storage
✅Also on-prem for (commercial) firms

More: https://bit.ly/3L78YI2