AI with Papers - Artificial Intelligence & Deep Learning
15.2K subscribers
136 photos
250 videos
14 files
1.31K links
All the AI with papers. Fresh daily updates on #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🪤Tracking over SOTA detectors🪤

👉Lightweight Python lib for real-time 2D object tracking 💥

Highlights:
✅Layer of tracking over SOTA detectors
✅Suitable for complex video processing
✅Source code under BSD 3-Clause
✅Maintained by Tryolabs team

More: https://bit.ly/3wKtGqg
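
To make "a layer of tracking over a detector" concrete, here is a minimal sketch of greedy nearest-centroid association in plain Python. It is not the library's API: the class name `CentroidTracker`, the `max_dist` threshold and the input format (a list of (x, y) points per frame) are assumptions made only for this illustration.

```python
import math
from itertools import count


class CentroidTracker:
    """Toy tracker (hypothetical): gives persistent IDs to per-frame detections."""

    def __init__(self, max_dist=50.0):
        self.max_dist = max_dist  # assumed gating distance, in pixels
        self.tracks = {}          # track id -> last known (x, y)
        self._ids = count(1)      # source of new track ids

    def update(self, detections):
        """Greedily match detections to tracks by distance; return {id: point}."""
        # All (distance, track id, detection index) candidates, closest first.
        pairs = sorted(
            (math.dist(point, det), tid, i)
            for tid, point in self.tracks.items()
            for i, det in enumerate(detections)
        )
        assigned, used_tracks, used_dets = {}, set(), set()
        for dist, tid, i in pairs:
            if dist > self.max_dist:
                break  # every remaining candidate is farther away
            if tid in used_tracks or i in used_dets:
                continue
            assigned[tid] = detections[i]
            used_tracks.add(tid)
            used_dets.add(i)
        # Detections that matched nothing start new tracks.
        for i, det in enumerate(detections):
            if i not in used_dets:
                assigned[next(self._ids)] = det
        # Tracks without a match are dropped right away (no coasting).
        self.tracks = assigned
        return assigned
```

Feeding it the detector output frame by frame, for example `tracker.update([(10, 10), (200, 200)])` followed by `tracker.update([(205, 198), (12, 11)])`, keeps the same two IDs attached to the same objects even though the detection order changed. A real tracker would add motion prediction and tolerate missed detections.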
🥷🏿 FCA: #3D Neural Camouflage 🥷🏿

👉#3D full-camouflage adversarial patch to fool neural detectors

Highlights:
✅Attack by diff-neural render
✅E2E physical adversarial attack
✅Envs, vehicles & detectors
✅Source code available!

More: https://bit.ly/38kKyfa
One-Shot Object Pose

👉A novel one-shot object pose estimator

Highlights:
✅Visual localization pipeline for object pose
✅Handling novel objects without CAD model
✅Novel graph attention for 2D-3D matching
✅Large dataset for one-shot object pose

More: https://bit.ly/3MTogjJ
☄️STEVE: Slot-TransformEr for VidEos☄️

👉STEVE: unsupervised model for object-centric learning in videos

Highlights:
✅Adoption of a slot decoder (SLATE)
✅SLATE with slot-level recurrence model
✅Complex and naturalistic videos
✅Significantly outperforms previous SOTA

More: https://bit.ly/3PNxxM3
🦔 CogVideo: insane text-to-clip 🦔

👉CogVideo: the world's first large-scale open-source text-to-video model, with 9B parameters 😵

Highlights:
✅Largest open-source T2C transformer
✅Finetuning of text-to-image model
✅Multi-frame-rate hierarchical training
✅From pretrained model CogView2

More: https://bit.ly/3Gzfl4n
🦄Time-Aware Neural Voxels🦄

👉TiNeuVox: "NeRF" with time-aware voxel features 😵

Highlights:
✅Dynamic scene w/ optimizable structure
✅Temporal information in radiance net
✅Small/large motion w/ single-res of feats
✅192× faster than previous Hyper-NeRF

More: https://bit.ly/3wR4O08
Neural Anomaly Detection by AWS

👉Ultra-competitive inference and SOTA for both detection and localization

Highlights:
✅Locally aggregated, mid-level feats patch
✅Maximizing nominal information at test time
✅Reducing biases towards ImageNet classes
✅Image-level anomaly AUROC of up to 99.6%

More: https://bit.ly/3t7Ndjg
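
One way to read the highlights above: keep a memory bank of patch features from nominal (defect-free) images, score a test image by how far its most unusual patch lies from that bank, and evaluate with AUROC. The sketch below is a simplified illustration under those assumptions, not the authors' implementation. The function names and the toy 2-D "features" are hypothetical; real features would come from a pretrained backbone.

```python
import math


def patch_score(patch, memory_bank):
    """Distance from one patch feature to its nearest nominal neighbor."""
    return min(math.dist(patch, nominal) for nominal in memory_bank)


def image_score(patches, memory_bank):
    """Image-level anomaly score: the score of the most anomalous patch."""
    return max(patch_score(patch, memory_bank) for patch in patches)


def auroc(scores, labels):
    """Chance that a random anomalous image (label 1) outscores a random
    nominal one (label 0); ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The per-patch scores also give a coarse localization map for free: the patches with the largest nearest-neighbor distance mark where the anomaly is.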
🛹 Project Skate from Google #AI 🛹

👉#AI tool to analyze the skateboarder's tricks in real-time

More: https://bit.ly/3zbQS3M
🧬Neural Text2Human Generation🧬

👉Text-driven neural human generation

Highlights:
✅Full-body from a given human pose
✅Hierarchical texture-aware codebook
✅DeepFashion -> 44k Hi-Res images
✅Code and models available!

More: https://bit.ly/3Mdnpt0
🧨EfficientFormers: 1.6ms inference 🧨

👉Transformers as fast as MobileNet? Snap shows it on #iphone!

Highlights:
✅Low latency on mobile, high performance!
✅Revisiting the design of ViT through latency
✅New dimension-consistent design paradigm
✅EfficientFormers: a new ViT for mobile!

More: https://bit.ly/3MdgW15
Transformer-Based Sens-Fusion

👉Updating TransFuser (CVPR21): image + LiDAR representations with self-attention

Highlights:
✅Existing approach can't handle traffic 😢
✅Novel multi-modal fusion transformer
✅The new SOTA in driving performance
✅Reducing avg collisions per KM by 48%
✅Insights on current limitations of E2E

More: https://bit.ly/391dmd6
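
As a rough picture of "image + LiDAR representations with self-attention", the sketch below concatenates tokens from both sensors and runs one round of scaled dot-product self-attention, so every image token can attend to every LiDAR token and vice versa. It is a toy under stated assumptions, not the paper's architecture: there is a single head, the query/key/value projections are the identity, and the names `self_attention` and `fuse` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity
    Q/K/V projections (a simplification; real layers learn them)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(logits)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def fuse(image_tokens, lidar_tokens):
    """Attend jointly over both modalities, then split the result back."""
    fused = self_attention(image_tokens + lidar_tokens)
    n = len(image_tokens)
    return fused[:n], fused[n:]
```

Each output token is a weighted average of all input tokens, which is what lets features from one sensor be conditioned on the other.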
🧘🏻‍♂️YogNet: neural yoga assistant🧘🏻‍♂️

👉Multi-person yoga neural expert for 20 asanas

Highlights:
✅CNNs & reg.LSTMs + 3D-CNNs
✅Multi-person asanas in real-time
✅YAR: dataset for yoga & posture
✅1206 videos, 2D RGB camera

More: https://bit.ly/3NncVbE
🔴 Geogram: geometric algos in C++ 🔴

👉Novel open-source programming library with (research) geometric algorithms in C++

Highlights:
✅Geometry Processing from #INRIA
✅30+ papers from SIGGRAPH, etc.
✅Grants: GOODSHAPE & VORPALINE
✅Code (mostly C++) under BSD 3

More: https://bit.ly/3mhS4L7
Open Source Vision from #Apple

👉CVNets: open-source (not a joke) lib for neural vision.

Highlights:
✅PyTorch-based neural lib. for vision
✅Train 2-4× longer w/ augmentations
✅Plug-and-play components for CV
✅Source code under a custom license

More: https://bit.ly/39d1dSj
🏇🏻Neural Clips by #Nvidia: INSANE 🏇🏻

👉Neural generation with changes in camera viewpoint & content that arises over time 🤯

Highlights:
✅Novel hierarchical generator architecture
✅Temp. receptive field + temporal embed.
✅Multi-res. with super-resolution network
✅SOTA in long clip with motion & changes
✅Code, data & models in August 2022 🏖️

More: https://bit.ly/3zroWsC
⚽ Zero to #Messi with #deeplearning ⚽

👉EA unveils a neural system to learn multiple soccer juggling skills

Highlights:
✅Learning difficult soccer juggling skills
✅Layer-wise mixture-of-experts architecture
✅Specialization arises naturally
✅Adaptive random walk training strategy

More: https://bit.ly/3mwRaL2
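
A mixture-of-experts layer, in its simplest form, runs several expert layers on the same input and blends their outputs with softmax gate weights. The sketch below shows one such layer in plain Python. It is only an illustration of the general idea and not the paper's exact design: the helper names, the output-blending choice and the hand-picked gate logits are assumptions (in a trained system a gating network would produce them).

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def linear(x, weights, bias):
    """Plain fully connected layer: weights is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def moe_layer(x, experts, gate_logits):
    """Blend the outputs of several expert layers with softmax gate weights.

    experts is a list of (weights, bias) pairs, one per expert.
    """
    gates = softmax(gate_logits)
    outputs = [linear(x, w, b) for w, b in experts]
    return [
        sum(g * out[j] for g, out in zip(gates, outputs))
        for j in range(len(outputs[0]))
    ]
```

Stacking such layers, each with its own gate weights, gives a "layer-wise" arrangement. When the gate puts nearly all its weight on one expert, that expert effectively specializes in the current situation.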
🏖️ HumanNeRF: source code is out! 🏖️

👉Pausing the video at any frame and rendering the subject from arbitrary views!

Highlights:
✅Synthesizing photorealistic humans
✅Synthesizing details, i.e. cloth & face
✅Volumetric canonical T-pose
✅Skeletal rigid/non-rigid decomposition

More: https://bit.ly/3NEkTNY
🎒 EG3D: source code is out! 🎒

👉#Nvidia just opened EG3D: real time multi-view faces w/ HQ #3D geometry!

Highlights:
✅Tri-plane-based 3D GAN framework
✅Pose-correlated attribute (expression)
✅SOTA in uncond. 3D-aware synthesis
✅Source code & models NOW available!

More: https://bit.ly/3aOfHs0
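
The tri-plane idea can be sketched in a few lines: instead of a dense 3D feature grid, keep three 2D feature planes (xy, xz, yz), project each 3D point onto all three, and combine the sampled features. The code below is a minimal illustration, not the released implementation. It assumes coordinates normalized to [0, 1], uses nearest-neighbor lookup where a real system would interpolate, combines by summation, and the function names and dictionary layout are hypothetical.

```python
def sample_plane(plane, u, v):
    """Nearest-neighbor lookup in a square feature plane; u, v in [0, 1]."""
    res = len(plane)
    i = min(int(u * res), res - 1)  # clamp so u == 1.0 stays in range
    j = min(int(v * res), res - 1)
    return plane[i][j]


def triplane_features(planes, point):
    """Project a 3D point onto the xy, xz and yz planes and sum the features."""
    x, y, z = point
    f_xy = sample_plane(planes["xy"], x, y)
    f_xz = sample_plane(planes["xz"], x, z)
    f_yz = sample_plane(planes["yz"], y, z)
    return [a + b + c for a, b, c in zip(f_xy, f_xz, f_yz)]
```

The appeal is memory: three planes of resolution N store on the order of 3·N² feature vectors, where a dense voxel grid would need N³. A small decoder then turns the summed feature into color and density.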
🔥One Millisecond Backbone. Fire!🔥

👉MobileOne by #Apple: efficient mobile backbone with inference <1 ms on #iPhone12!

Highlights:
✅75.9% top-1 accuracy on ImageNet
✅38× faster than MobileFormer net
✅Classification, detection & segmentation
✅Source code & model soon available!

More: https://bit.ly/3tsT7f2
🧨 Scaling Transformers to GigaPixels!🧨

👉Novel ViT called Hierarchical Image Pyramid Transformer (HIPT) -> Scaling to GigaPixels!

Highlights:
✅Gigapixel whole-slide imaging (WSI)
✅Leveraging natural hier. structure of WSI
✅Self-supervised Hi-Res representations
✅Source code and models available!

More: https://bit.ly/3xLuzkg