AI with Papers - Artificial Intelligence & Deep Learning
15.4K subscribers
140 photos
253 videos
14 files
1.33K links
All the AI with papers. Fresh daily updates on #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🛼 Drive & Segment without Supervision 🛼

👉Learning pixel-wise semantic segmentation from non-curated data collected by cars (cameras + LiDAR) driving around a city

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Cross-modal unsupervised
✅Synchronized LiDAR & RGB
✅Object proposal on LiDAR points
✅SOTA, significant improvements

More: https://bit.ly/3L0wWTW
🌍 NeRF-free Neural Rendering 🌍

👉A simple 2D-only method with a single pass of a neural network

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Synthesis with NO 3D reasoning
✅Autoregressive & masked transformer
✅Pose -> object, object -> pose
✅Attention: branching attention
✅Source code under MIT License

More: https://bit.ly/3JC7unt
🤓👌Hey, TAKE OFF my eyeglasses! 😙👌

👉A novel framework to remove eyeglasses as well as their cast shadows from faces

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel mask-guided multi-step network
✅Leveraging 3D synthetic data only
✅Synthetic portraits with supervision
✅Eyeglasses & shadows simultaneously

More: https://bit.ly/3IvQzlf
🏥 #AI models/dataset for open surgery 🏥

👉Multi-task #AI model/dataset of real-time surgical behaviors, hands, and tools.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Annotated videos of open surgery
✅Largest open-surgery video dataset
✅2k clips and 23 procedures
✅12k annotations, 11k+ keypoints
✅Models/Dataset soon available!

More: https://bit.ly/3tvDdkK
🥽 #metaverse in 1991 🥽

👉Q: Is #VR the technology that has developed the least in the last 30 years? 🤔

Discussion: https://bit.ly/3txWF07
🫕NeRFusion: Large-Scale Reconstruction🫕

👉Efficient large-scale reconstruction & photo-realistic rendering

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Frame-by-frame radiance fields
✅Neural reconstruction
✅Real-time at 20+ fps
✅SOTA on indoor / objects

More: https://bit.ly/3iyfoCo
☕ORViT for understanding tasks☕

👉ORViT: object-centric approach that extends ViT layers by incorporating object representations

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Spatio-temporal modeling throughout the network
✅"Object-Region Attention"
✅"Object-Dynamics" module
✅Code just released! Apache 2.0

More: https://bit.ly/3wAUavW
🪅Insane Neural Sketching from #MIT🪅

👉Line drawing generation as unsupervised image translation with various losses

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Unpaired method for line drawing
✅Geometry loss to predict depth
✅Semantic loss to match CLIP feats
✅SOTA on unpaired translation/generation
✅Code and Models under MIT License

More: https://bit.ly/36JRr8A
🏔️MPS-Net: new SOTA for #3D human🏔️

👉MPS-Net: accurate & temporally coherent 3D human pose/shape from video

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅MoCA: visual cues from motion
✅HAFI to mix past/future feats
✅Stronger temporal correlation
✅SOTA on multiple datasets

More: https://bit.ly/3uAI5EB
🤿Transfiner: hyper-detailed segmentation🤿

👉Mask Transfiner: #AI for HQ & efficient instance segmentation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Transfiner: HQ segmentation
✅HQ seg. via quadtree structure
✅SOTA & extreme details
✅Code under MIT License

More: https://bit.ly/3KVzseM
🥙 DualStyleGAN: SOTA in style transfer 🥙

👉Flexible control over dual styles: the original face domain and the extended artistic portrait domain

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅High-resolution (1024×1024)
✅Intrinsic/extrinsic style path
✅Hierarchical style manipulation
✅Novel progressive fine-tuning
✅Source code under MIT License

More: https://bit.ly/3uS26Xp
🍚 GTR: Global Tracking Transformers 🍚

👉UTexas + Apple: transformer for global multi-object tracking

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅GTR operates on any object
✅Few frames->global trajectories
✅SOTA on detectors for any object
✅Code under Apache License 2.0

More: https://bit.ly/3DiqkxF
🧠E2E Perception for #selfdrivingcars🧠

👉HybridNets: multi-task net with several key optimizations

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅End-to-end perception network
✅Traffic, lane, object detection
✅Drivable area segmentation
✅Real-time on embedded systems
✅Source code under MIT License

More: https://bit.ly/3JMk8Az
🛩️Smart Parking with UAVs🛩️

👉A novel methodology to monitor car parking areas in real-time via Drones/UAVs

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅YOLOv3 + DeepSORT tracker
✅Vehicle detection/tracking
✅Occupancy estimation via RT
✅Four blocks, unique pipeline

More: https://bit.ly/3iJD8nm
👕 Detecting Events via #AI 👕

👉Localizing object states & corresponding state-modifying actions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Self-supervised learning of state-modifying actions
✅Noise adaptive weighting
✅ChangeIt: 2.6k+ hrs, 34k+ changes
✅Dataset, code, and model!

More: https://bit.ly/3uBwxkj
🌈🌈 Interactive Neural Labelling 🌈🌈

👉Dense labelling of geometry, color & semantics via #3D neural field

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅No training data
✅Dense labelling
✅Classes on the fly
✅Labelling at scale

More: https://bit.ly/36Y0faQ
♟️Neural RGB-D Reconstruction♟️

👉Novel approach to #3D reconstruction that mixes implicit surface representations with NeRFs

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅RGB-D based reconstruction
✅Leveraging color & depth
✅Depth into the NeRF
✅Pose & camera refinement

More: https://bit.ly/3iN6e54
🦓 Hyper-Fast Refinement 🦓

👉SharpContour: novel contour-based refinement for instance segmentation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Instance-aware Point Classifier
✅Deforming by discrete updating
✅Estimating offsets independently
✅Source code soon available!

More: https://bit.ly/3qL04GY
🥗 Neural Mesh via Text only 🥗

👉Zero-shot generation of 3D model using only a target text prompt

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ZS 3D model with text only
✅ZS text-guided generation
✅Meshes with texture/normal
✅Differentiable LLS implementation

More: https://bit.ly/3u0qnvb
🪆#3D, Materials, and Lighting from 2D🪆

👉NVIDIA: topology, materials & environment-map lighting recovered jointly from 2D images. INSANE 😮

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Topology, materials and lighting
✅Meshes with materials/lighting
✅Compact volumetric texturing
✅Differentiable all-frequency lighting
✅Code under #NVIDIA License

More: https://bit.ly/3IUoF2t