AI with Papers - Artificial Intelligence & Deep Learning
15.2K subscribers
135 photos
247 videos
14 files
1.31K links
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
☕Decaf: 3D Face-Hand Interactions☕

👉The first learning-based MoCap to track human hands interacting with human faces in #3D from single monocular RGB videos

😎Review https://t.ly/070Tj
😎Paper arxiv.org/pdf/2309.16670.pdf
😎Project vcai.mpi-inf.mpg.de/projects/Decaf
🌱 Making LLaMA See and Draw 🌱

👉Tencent #AI planted a SEED of Vision in a Large Language Model, making LLaMA see 'n' draw stuff.

😎Review https://t.ly/QiCAv
😎Paper arxiv.org/pdf/2310.01218.pdf
😎Code github.com/AILab-CVC/SEED
🔥Visual-Math Q&A: MathVista is out! 🔥

👉 MathVista is the ultimate benchmark designed to amalgamate challenges from diverse mathematical and visual tasks

😎Review https://t.ly/yfqHZ
😎Paper https://arxiv.org/pdf/2310.02255.pdf
😎Project https://mathvista.github.io/
😎Code github.com/lupantech/MathVista
💚💙 Where Is OpenCV 5? 💙💚

👉On October 24th, the organization is launching a crowdfunding campaign to raise funds for #OpenCV 5 development.

👆Me in 2008 during my thesis work on face tracking; up to 50x faster than the previous SOTA. I could never have done it without the OpenCV library and the support of the community.

🔥Support #OpenCV 5 to help create the next generation of researchers and scientists. Spread the word: https://t.ly/UTukV
🏊SwimXYZ: Synthetic Swim🏊

👉SwimXYZ: synthetic dataset for swimming, monocular videos annotated with ground truth 2D and 3D joints

😎Review https://t.ly/F-rdF
😎Paper arxiv.org/pdf/2310.04360.pdf
😎Data g-fiche.github.io/research-pages/swimxyz
📊 TextPSG: PSG from Text 📊

👉A novel problem in #AI: Panoptic Scene Graph Generation from Purely Textual Descriptions (Caption-to-PSG)

😎Review https://t.ly/UXEmk
😎Paper arxiv.org/pdf/2310.07056.pdf
😎Project vis-www.cs.umass.edu/TextPSG
😎Code github.com/chengyzhao/TextPSG
🙋 Full Human Motion 🙋

👉OmniControl by Google is a novel framework for text-conditioned human motion generation based on a diffusion process

😎Review https://t.ly/F_0Ov
😎Paper arxiv.org/pdf/2310.08580.pdf
😎Project neu-vi.github.io/omnicontrol/
🦹‍♀️ Snap's Hyper-Realistic Human 🦹‍♀️

👉New diffusion-based #AI by Snap that generates hyper-realistic in-the-wild human images. Swipe the gallery, NUTS!👇

😎Gallery https://t.ly/cG74X
😎Paper arxiv.org/pdf/2310.08579.pdf
😎Project snap-research.github.io/HyperHuman
😎Code github.com/snap-research/HyperHuman
👗AG3D: clothed avatars from 2D👗

👉The new SOTA in adversarial generation of realistic 3D people

😎Review https://t.ly/vnJO7
😎Project https://zj-dong.github.io/AG3D
😎Code https://github.com/zj-dong/AG3D
😎Paper zj-dong.github.io/AG3D/assets/paper.pdf
🌱Pose-Format: All-in-One Pose🌱

👉 Pose-format: a comprehensive toolkit designed for human pose: unified, flexible, and easy-to-use

😎Review https://t.ly/rFrhq
😎Paper arxiv.org/pdf/2310.09066.pdf
😎Code github.com/sign-language-processing/pose
😻 CatFLW: Cat Neural Landmarks 😻

👉Convolutional neural network-based landmark detection model for cat faces

😎Review https://t.ly/Y3mQ8
😎Paper arxiv.org/pdf/2305.04232.pdf
😎Dataset www.tech4animals.org/catflw
4K4D: Real-Time 4D at 4K

👉THE new SOTA in view synthesis of dynamic 3D scenes at 4K. 30x faster, up to 400 FPS. Nuts!

😎Review https://t.ly/6ddQh
😎Paper arxiv.org/pdf/2310.11448.pdf
😎Project zju3dv.github.io/4k4d/
😎Code github.com/zju3dv/4K4D
🛣️ Holistic Parking Detection (YOLO) 🛣️

👉 One-step Holistic Parking Slot Network: a tailor-made adaptation of the YOLOv4 algorithm for all-shaped parking slot detection

😎Review https://t.ly/2l4ZG
😎Paper arxiv.org/pdf/2310.11629.pdf
🐈 Cutie: VOS with heavy occlusions 🐈

👉Cutie: novel VOS for challenging scenarios with heavy occlusions & distractors

😎Review https://t.ly/W3FR-
😎Paper arxiv.org/pdf/2310.12982.pdf
😎Project https://hkchengrex.com/Cutie
😎Code https://github.com/hkchengrex/Cutie
🧡 Rotoscoping Prince Of Persia (1985) 🧡

👉 Rare footage shot for the animation of Prince of Persia (1989). Damn romantic.

😎 More https://t.ly/xJife
🪛PACE: new SOTA Motion🪛

👉#Nvidia unveils the new SOTA for estimating human motion in a global scene from moving cameras. Stunning results.

😎Review https://t.ly/20you
😎Project https://nvlabs.github.io/PACE
😎Paper https://arxiv.org/pdf/2310.13768.pdf
🥤NanoSAM: SAM on low-cost boards🥤

👉NanoSAM is a Segment Anything variant capable of running in real-time on #NVIDIA Jetson Orin with TensorRT

😎Review https://t.ly/UErq_
😎Tutorial https://github.com/NVIDIA-AI-IOT/nanosam
🧂 SOTA RGB-D Video Salient Object 🧂

👉 DCTNet+ (model) and RDVS (dataset) for a new SOTA in Video Salient Object Detection

😎Review https://t.ly/DapLV
😎Code github.com/kerenfu/RDVS
😎Paper arxiv.org/pdf/2310.15482.pdf
✌️ Relighted 3D Hands 🤞

👉#META unveils Re:InterHand: a large dataset of relighted 3D interacting hands

😎Review https://t.ly/I1dQk
😎Paper arxiv.org/pdf/2310.17768.pdf
😎Project mks0601.github.io/ReInterHand
😎Data github.com/mks0601/ReInterHand
Video Understanding with GPT-4V(ision)

👉 #Microsoft unveils MM-Vid, the most advanced video understanding framework (w/ #chatgpt4). Impressive results on long-form videos & intricate tasks such as audio description & multimodal high-level comprehension

😎Review https://t.ly/RISMm
😎Paper arxiv.org/pdf/2310.19773.pdf
😎Project https://multimodal-vid.github.io
👣 Foot via Synthetic Data 👣

👉 50,000 synthetic/photorealistic foot images + a novel SOTA library for feet

😎Review https://t.ly/TVanP
😎Paper https://arxiv.org/pdf/2310.18279.pdf
😎Project https://ollieboyne.github.io/FOUND
😎Code https://github.com/OllieBoyne/FOUND