AI with Papers - Artificial Intelligence & Deep Learning

🌈 Multi-View 3D Tracking 🌈

👉MVTracker is presented as the first data-driven multi-view 3D point tracker, designed to track arbitrary 3D points across multiple cameras. Repo available💙

👉Review https://t.ly/rISMR
👉Paper arxiv.org/pdf/2508.21060
👉Project https://lnkd.in/drHtAmRC
👉Repo https://lnkd.in/d4k8mg3B

❤️‍🔥PHD: Personalized 3D Humans❤️‍🔥

👉ETH & #Meta unveil PHD, a novel approach for personalized 3D human mesh recovery (HMR) and body fitting that leverages user-specific shape information. Code & models to be released💙

👉Review https://t.ly/IeRhH
👉Paper https://arxiv.org/pdf/2508.21257
👉Project https://phd-pose.github.io/
👉Repo TBA

🪴 Pixie: Physics from Pixels 🪴

👉UPenn + MIT unveil Pixie, which trains a neural network to map pretrained visual features (i.e., CLIP) to dense material fields of physical properties in a single forward pass, enabling real‑time physics simulations. Repo & Dataset under MIT license💙

👉Review https://t.ly/1W0n5
👉Paper https://lnkd.in/dsHAHDqM
👉Project https://lnkd.in/dwrHRbRc
👉Repo https://lnkd.in/dy7bvjsK

🫛TMR: Few-Shot Template-matching🫛

👉POSTECH unveils TMR, a novel and simple template-matching detector for few-shot pattern detection, achieving strong (and SOTA) results on diverse datasets. A new dataset (RPINE) released, repo soon💙

👉Review https://t.ly/WWAcL
👉Paper https://lnkd.in/dJbSu5vk
👉Project https://lnkd.in/dwcDnHHQ
👉Repo https://lnkd.in/dp7aw8Cs

Could you please help me with this poll?

https://t.ly/3c3Aa

Thanks,
A.

Alessandro Ferrari posted on LinkedIn

🧬 OpenVision 2 is out! 🧬

👉UCSC releases OpenVision2: a novel family of generative pretrained visual encoders that removes the text encoder and contrastive loss, training with caption-only supervision. Fully open, Apache 2.0💙

👉Review https://t.ly/Oma3w
👉Paper https://arxiv.org/pdf/2509.01644
👉Project https://ucsc-vlaa.github.io/OpenVision2/
👉Repo https://github.com/UCSC-VLAA/OpenVision

🐉 #DoubleDragon with #AI 🐉

👉What would Double Dragon look like in real life? Each character has been transformed with #AI to capture their style, fighting spirit, and charisma, as if they had stepped right out of the game’s streets into the real world. AUDIO ON. Damn romantic💙

#artificialintelligence #machinelearning #ml #AI #deeplearning #computervision #AIwithPapers #metaverse #LLM

👉Post https://t.ly/0IpER
👉Channel https://www.youtube.com/@iaiaoh84

🍐 Promptable Human Mesh 🍐

👉PromptHMR is a promptable human pose/shape (HPS) estimation method that processes images with spatial or semantic prompts. It takes “side information” readily available from vision-language models or user input to improve the accuracy and robustness of 3D HPS. Code released💙

👉Review https://t.ly/zJ7S-
👉Paper arxiv.org/pdf/2504.06397
👉Project yufu-wang.github.io/phmr-page/
👉Repo github.com/yufu-wang/PromptHMR

🔥WebEyeTrack: real-time/web eye🔥

👉WebEyeTrack is a novel framework that integrates lightweight SOTA gaze estimation models directly in the browser, bringing deep‑learning gaze estimation to the web while explicitly accounting for head pose. Source code released under MIT license💙

👉Review https://t.ly/Xon9h
👉Paper https://arxiv.org/pdf/2508.19544
👉Project redforestai.github.io/WebEyeTrack/
👉Repo github.com/RedForestAi/WebEyeTrack

✂️ AI Open-Source Annotation ✂️

👉VisioFirm by TOELT is a fully open-source, AI-powered image annotation tool designed to accelerate labeling for computer vision tasks such as object detection, oriented bounding boxes, and segmentation. Source code released under Apache 2.0💙

👉Review https://t.ly/MoMvv
👉Paper https://lnkd.in/dxTncSgv
👉Repo https://lnkd.in/dCWMXp3x

Friends,
I’ve just opened my IG account: https://www.instagram.com/aleferra.ig | Feel free to add me

What about posting stuff about AI on IG? Thoughts?

🖌️Real-Time Drag-Based Editing🖌️

👉The Visual AI Lab unveils Inpaint4Drag, a novel framework inspired by elastic object deformation that decomposes drag-based editing into pixel-space bidirectional warping and inpainting. Demo and Code released (unknown license)💙

👉Review https://t.ly/H5nlR
👉Paper https://arxiv.org/pdf/2509.04582
👉Project https://visual-ai.github.io/inpaint4drag/
👉Repo https://github.com/Visual-AI/Inpaint4Drag
👉Demo https://colab.research.google.com/drive/1fzoyNzcJNZjM1_08FE9V2V20EQxGf4PH

🩸Foundation Red Blood Cells🩸

👉RedDino from the University of Cagliari is a self-supervised foundation model designed for red blood cell (RBC) morphology analysis. Trained on 1.25M RBC images, it's the new SOTA in shape classification. Code & Models released under Apache 2.0💙

👉Review https://t.ly/uWAch
👉Paper arxiv.org/pdf/2508.08180
👉Code github.com/Snarci/RedDino
👉Models huggingface.co/collections/Snarcy/reddino-689a13e29241d2e5690202fc

👻 From Skin to Skeleton 👻

👉This paper unifies the SMPL body model with BSM, a new Biomechanical Skeleton Model. The resulting SKEL model is animatable like SMPL but with fewer, biomechanically realistic degrees of freedom. Model, code, and data available for research💙

👉Review https://t.ly/JsI8M
👉Paper arxiv.org/pdf/2509.06607
👉Project https://skel.is.tue.mpg.de/

🌱 FoMo4Wheat Foundational Model 🌱

👉PheniX Lab et al. unveil a novel family of foundational models tailored for wheat image tasks, suitable for classification, detection, counting and segmentation. Demo, Dataset, Model & Code under MIT💙

👉Review https://t.ly/UzM-Z
👉Paper arxiv.org/pdf/2509.06907
👉Project fomo4wheat.phenix-lab.com/
👉Repo github.com/PheniX-Lab/FoMo4Wheat
👉Demo fomo4wheat.phenix-lab.com/demos

🐙Human-Centric Video Generation🐙

👉Tsinghua & #ByteDance unveil HuMo: a unified, human-centric video generation framework designed to produce high-quality, fine-grained, and controllable human videos from multimodal inputs, with text-prompt following, consistent subject preservation, and synchronized audio-driven motion. Repo released under Apache 2.0💙

👉Review https://t.ly/3S8Yb
👉Paper https://arxiv.org/pdf/2509.08519
👉Project https://phantom-video.github.io/HuMo/
👉Repo https://github.com/Phantom-video/HuMo

🔥 21,000+ Hours Dataset 🔥

👉SpatialVID is a novel large-scale video dataset with explicit spatial annotations including camera poses, depth maps, structured captions and serialized motion instructions. Built from more than 21,000 hours of raw video, the curated dataset consists of 7,089 hours of real-world dynamic scenes. Repo & Dataset Apache-2.0 💙

👉Review https://t.ly/Y9o5k
👉Paper arxiv.org/pdf/2509.09676
👉Project nju-3dv.github.io/projects/SpatialVID/
👉Repo github.com/NJU-3DV/spatialVID

🦠 Segment & Track Any Cell 🦠

👉RWTH unveils a novel zero-shot cell tracking framework by integrating Segment Anything 2 (SAM2) into the tracking pipeline. Source Code released💙

👉Review https://t.ly/n_srg
👉Paper https://arxiv.org/pdf/2509.09943
👉Repo https://github.com/zhuchen96/sam4celltracking

🔥 How We Use ChatGPT 🔥

👉By July 2025, ChatGPT had 700M+ users sending more than 2.5B messages per day, or about 29,000 messages per second. This paper documents eight important facts about ChatGPT usage over the last three years. 63 pages of impressive statistics. To read.💙

👉Review https://t.ly/QYHSi
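
The per-second figure follows directly from the daily total. A minimal sketch of that arithmetic, assuming the post's round figure of 2.5B messages per day and a uniform rate across the day (the function and variable names are illustrative, not taken from the paper):

```python
# Back-of-the-envelope check of the "about 29,000 messages per second" figure.
# Assumption: 2.5e9 messages/day (the post's round number), spread evenly.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def messages_per_second(messages_per_day: float) -> float:
    """Average message rate per second for a given daily total."""
    return messages_per_day / SECONDS_PER_DAY


daily_messages = 2.5e9  # hypothetical input taken from the post's headline number
rate = messages_per_second(daily_messages)
print(f"{rate:,.0f} messages per second")  # prints "28,935 messages per second"
```

Real traffic is not uniform over the day, so this is only an average and peak rates would be higher.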

🛡️3D Prompted Vision-LLM🛡️

👉#Nvidia unveils SR-3D, a novel 3D-aware vision-language model that connects single-view 2D images and multi-view 3D data through a shared visual token space. It supports flexible region prompting, allowing users to annotate regions with bounding boxes or segmentation masks on any frame, or directly in 3D, without the need for exhaustive multi-frame labeling. Code & Dataset announced💙

👉Review https://t.ly/5Y2c5
👉Paper https://arxiv.org/pdf/2509.13317
👉Project https://www.anjiecheng.me/sr3d
👉Repo TBA
