AI with Papers - Artificial Intelligence & Deep Learning
15.2K subscribers
135 photos
247 videos
14 files
1.31K links
AI with papers: fresh updates every day on #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🔥🔥 #META's DINOv2 is now commercial! 🔥🔥

👉Universal features for image classification, instance retrieval, video understanding, depth & semantic segmentation. Now suitable for commercial use.

😎Review https://t.ly/LNrGy
😎Paper arxiv.org/pdf/2304.07193.pdf
😎Code github.com/facebookresearch/dinov2
😎Demo dinov2.metademolab.com/
🧄FreeMan: towards #3D Humans 🧄

👉FreeMan: the first large-scale, real-world, multi-view dataset for #3D human pose estimation. 11M frames!

😎Review https://t.ly/ICxpA
😎Paper arxiv.org/pdf/2309.05073.pdf
😎Project wangjiongw.github.io/freeman
🦊 MagiCapture: HD Multi-Concept Portrait 🦊

👉KAIST unveils MagiCapture: integrating subject and style concepts to generate high-resolution portrait images using just a few subject and style references.

😎Review https://t.ly/c9rOo
😎Paper https://arxiv.org/pdf/2309.06895.pdf
⚽ Dynamic NeRFs for Soccer ⚽

👉SoccerNeRF: a first attempt at applying "cheap" NeRFs to football, reconstructing soccer replays in space and time.

😎Review https://t.ly/Ywcvk
😎Paper arxiv.org/pdf/2309.06802.pdf
😎Project https://soccernerfs.isach.be/
😎Code github.com/iSach/SoccerNeRFs
☢️ GlueStick: Graph Neural Matching ☢️

👉GlueStick is a joint deep matcher for points and lines that leverages the connectivity information between nodes to better glue them together.

😎Review https://t.ly/Atxqo
😎Paper arxiv.org/pdf/2304.02008.pdf
😎Code https://github.com/cvg/GlueStick
🫀CPR-Coach: Neural Cardiopulmonary Resuscitation🫀

👉CPR-Coach: fine-grained action recognition in cardiopulmonary resuscitation

😎Review https://t.ly/Qbg4K
😎Paper arxiv.org/pdf/2309.11718.pdf
😎Code github.com/Shunli-Wang/CPR-Coach
😎Project shunli-wang.github.io/CPR-Coach
🧪 NeuralLabeling with NeRF 🧪

👉Annotating a scene by generating segmentation masks, affordance maps, 2D bounding boxes, 3D bounding boxes, 6DOF poses, depth & meshes.

😎Review https://t.ly/1GPsj
😎Paper arxiv.org/pdf/2309.11966.pdf
😎Code github.com/FlorisE/neural-labeling
😎Project florise.github.io/neural_labeling_web
DE-ViT: detecting everything via DINOv2

👉DE-ViT: an open-set object detector based on a DINOv2 backbone. It's the new SOTA on the COCO & LVIS datasets.

😎Review https://t.ly/_DAmt
😎Paper arxiv.org/pdf/2309.12969.pdf
😎Code https://github.com/mlzxy/devit
🛵CoTracker: fast transformer-tracker🛵

👉META's CoTracker is a fast transformer-based model that can track any point in a video

😎Review https://t.ly/M36A_
😎Paper arxiv.org/pdf/2307.07635.pdf
😎Project https://co-tracker.github.io/
😎Code github.com/facebookresearch/co-tracker
🌬️ Neural Blowing in Still Photos 🌬️

👉 A novel approach to animate human hair (and clothes) in still portraits

😎Review https://t.ly/HKG0t
😎Paper arxiv.org/pdf/2309.14207.pdf
😎Project nevergiveu.github.io/AutomaticHairBlowing
🌮 OW Indoor Segmentation 🌮

👉3D-OWIS is a novel open-world 3D indoor instance segmentation method (with an auto-labeling scheme) that separates known from unknown category labels

😎Review https://t.ly/-7ALf
😎Paper arxiv.org/pdf/2309.14338.pdf
😎Code github.com/aminebdj/3D-OWIS
🧱 Generating Scenes from Touch 🧱

👉#AI for synthesizing images from tactile signals (and vice versa), applied to a number of visuo-tactile synthesis tasks

😎Review https://t.ly/Gxr0L
😎Paper https://arxiv.org/pdf/2309.15117.pdf
😎Project https://fredfyyang.github.io/vision-from-touch
😎Code https://github.com/fredfyyang/vision-from-touch
☕Decaf: 3D Face-Hand Interactions☕

👉The first learning-based MoCap to track human hands interacting with human faces in #3D from single monocular RGB videos

😎Review https://t.ly/070Tj
😎Paper arxiv.org/pdf/2309.16670.pdf
😎Project vcai.mpi-inf.mpg.de/projects/Decaf
🌱 Making LLaMA See and Draw 🌱

👉Tencent #AI planted a SEED of Vision in a Large Language Model, making LLaMA see 'n' draw stuff.

😎Review https://t.ly/QiCAv
😎Paper arxiv.org/pdf/2310.01218.pdf
😎Code github.com/AILab-CVC/SEED
🔥Visual-Math Q&A: MathVista is out! 🔥

👉 MathVista is a benchmark designed to combine challenges from diverse mathematical and visual tasks

😎Review https://t.ly/yfqHZ
😎Paper https://arxiv.org/pdf/2310.02255.pdf
😎Project https://mathvista.github.io/
😎Code github.com/lupantech/MathVista
💚💙 Where Is OpenCV 5? 💙💚

👉On October 24th, the organization is launching a crowdfunding campaign to raise funds for #OpenCV 5 development.

👆Me in 2008 during my thesis work on face tracking: up to 50x faster than the previous SOTA. It would not have been possible without the OpenCV library and the support of the community.

🔥Support #OpenCV 5 to create the next generation of researchers and scientists. Spread the word: https://t.ly/UTukV
🏊SwimXYZ: Synthetic Swim🏊

👉SwimXYZ: a synthetic dataset of monocular swimming videos annotated with ground-truth 2D and 3D joints

😎Review https://t.ly/F-rdF
😎Paper arxiv.org/pdf/2310.04360.pdf
😎Data g-fiche.github.io/research-pages/swimxyz
📊 TextPSG: PSG from Text 📊

👉A novel problem in #AI: Panoptic Scene Graph Generation from Purely Textual Descriptions (Caption-to-PSG)

😎Review https://t.ly/UXEmk
😎Paper arxiv.org/pdf/2310.07056.pdf
😎Project vis-www.cs.umass.edu/TextPSG
😎Code github.com/chengyzhao/TextPSG
🙋 Full Human Motion 🙋

👉OmniControl by Google is a novel framework for text-conditioned human motion generation based on a diffusion process

😎Review https://t.ly/F_0Ov
😎Paper arxiv.org/pdf/2310.08580.pdf
😎Project neu-vi.github.io/omnicontrol/
🦹‍♀️ Snap's Hyper-Realistic Human 🦹‍♀️

👉New diffusion-based #AI by Snap that generates in-the-wild human images with hyper-realism. Swipe the gallery, NUTS!👇

😎Gallery https://t.ly/cG74X
😎Paper arxiv.org/pdf/2310.08579.pdf
😎Project snap-research.github.io/HyperHuman
😎Code github.com/snap-research/HyperHuman