AI with Papers - Artificial Intelligence & Deep Learning
15.5K subscribers
145 photos
256 videos
14 files
1.34K links
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🌱PlanarTrack: Large Planar Tracking🌱

👉PlanarTrack is a large-scale, high-quality, and challenging benchmark for planar tracking: 1,150 sequences with 733K+ frames, including 1,000 short-term & 150 long-term videos. Repo & Dataset available💙

👉Review https://t.ly/mYNi7
👉Paper arxiv.org/pdf/2510.23368
👉Repo https://lnkd.in/edb3GMyT
👉Project https://lnkd.in/eC-hVB-U
👉Data https://lnkd.in/eew2j4tM
👢Generative View Stitching 👢

👉GVS is a novel approach that enables collision-free, camera-guided video generation for predefined trajectories; it is a non-autoregressive alternative to video length extrapolation. Full repo under MIT💙

👉Review https://t.ly/TiN_5
👉Paper https://arxiv.org/pdf/2510.24718
👉Project https://andrewsonga.github.io/gvs/
👉Repo github.com/andrewsonga/generative_view_stitching
Greetings from the SMART CITY WORLD CONGRESS in Barcelona. If you are around, ping me ;)
🔪Tracking Object Transformations🔪

👉"Track Any State": tracking objects through transformations while detecting/describing state changes. Repo & Dataset available under MIT💙

👉Review https://t.ly/NPyW4
👉Paper https://lnkd.in/d4pA3bXJ
👉Project https://lnkd.in/dgbNfCuj
👉Repo https://lnkd.in/dtVWq2z7
🔥🔥 Sunday mood 🔥🔥
🎸Another BRIXEL in the Wall 🎸

👉BRIXEL allows the user to produce high-resolution feature maps using the DINOv3 backbone without requiring large amounts of compute. Repo released💙

👉Review https://t.ly/fZPwC
👉Paper arxiv.org/pdf/2511.05168
👉Repo github.com/alexanderlappe/BRIXEL
Pixel-Dense Embedding

👉FlowFeat is a novel high-resolution and multi-task feature representation that embeds a distribution of plausible apparent motions, or motion profiles. Repo available under 💙

👉Review https://t.ly/aUx_U
👉Paper arxiv.org/pdf/2511.07696
👉Project tum-vision.github.io/flowfeat
👉Repo github.com/tum-vision/flowfeat
🍿🍿🍿
🚨 Announcement 🚨

I’ve received numerous reports of people blatantly copying my content on LinkedIn just to get a few likes.

Let me be very clear: I put a great deal of time and effort into reviewing papers and creating original, meaningful content. It’s disappointing to see professionals (some of whom are even members of this group or my connections) resorting to plagiarism instead of contributing their own ideas.

👉 Starting today, I’ll be removing these connections from LinkedIn and banning such individuals from this group.

📢 I also encourage everyone to report these cases whenever you come across them. Every single report helps stop this bad habit and keeps our community fair, respectful, and authentic.
🟩 Foundational Humanoid 🟩

👉#NVIDIA unveils SONIC, a novel foundation model for high-precision teleoperation & interactive control (running, jumping, crawling) with natural, human-like movements. Code announced💙

👉Review https://t.ly/_3wnt
👉Paper https://lnkd.in/dctfShu8
👉Project https://lnkd.in/d_inmA2p
🔥Depth Anything 3 is out🔥

👉ByteDance unveils Depth Anything 3 (DA3), a model that predicts spatially consistent geometry from arbitrary visual inputs, with or without known camera poses. Repo under Apache 2.0💙

👉Review https://t.ly/AOPu7
👉Paper arxiv.org/pdf/2511.10647
👉Project https://lnkd.in/dnByyn2z
👉Repo https://lnkd.in/daCVz_4a
👉Demo https://lnkd.in/dKUZiJt
🌩️ It's "Time-to-Move" 🌩️

👉Technion + Nvidia unveil Time-to-Move (TTM), a training-free, plug-and-play framework for motion- and appearance-controlled video generation with I2V diffusion models (Wan 2.2, CogVideoX, & Stable Video Diffusion). Impressive results!

👉Review https://t.ly/0pwXm
👉Paper https://lnkd.in/dxD3uHYb
👉Project https://lnkd.in/dcE5juyM
👉Repo https://lnkd.in/dMMUjybJ
⌚ Multi-Shot Video Segmentation ⌚

👉Fudan tackles the underexplored task of multi-shot video object segmentation (MVOS). Benchmark and repo (an extension of SAM) available under Apache 2.0💙

👉Review https://t.ly/WBW00
👉Paper https://arxiv.org/pdf/2511.13715
👉Project https://henghuiding.com/SAAS/
👉Repo https://github.com/FudanCVL/SAAS
🔥 SAM 3/3D are OUT!! 🔥

👉#META released SAM 3, a unified model for detection, segmentation, and tracking of objects in images & video using text, exemplar & visual prompts. Repo/Models under proprietary license💙

👉Review https://t.ly/lnRZN
👉Paper https://t.ly/5tq9N
👉Project https://ai.meta.com/sam3/
👉Demo https://segment-anything.com
👉Repo https://github.com/facebookresearch/sam3
Unwrapping of 3D Meshes

👉PartUV is a novel part-based UV unwrapping method for 3D meshes; it combines learned part priors with geometric cues to generate a compact set of part-aligned charts. Repo released💙

👉Review https://t.ly/8dNIY
👉Paper arxiv.org/pdf/2511.16659
👉Project www.zhaoningwang.com/PartUV/
👉Repo github.com/EricWang12/PartUV
Upsample Anything

👉Upsample Anything is a novel, universal, training-free upsampler based on lightweight test-time optimization. No code, but it's a relevant paper💙

👉Review https://t.ly/7LE6G
👉Paper https://lnkd.in/dsUfdtih
🦞Single Synthetic Image per Class🦞

👉MIT unveils Linear Gradient Matching (H/T Torralba), a novel distillation method that uses a single synthetic image per class to train linear classifiers (and more). Repo available💙

👉Review https://t.ly/dD3un
👉Paper arxiv.org/pdf/2511.16674
👉Project linear-gradient-matching.github.io/
👉Repo github.com/GeorgeCazenavette/linear-gradient-matching
🧪 EfficientSAM3 is out 🧪

👉Bristol announces EfficientSAM3, a family of efficient models built on Progressive Hierarchical Distillation that transfers capability from SAM3 to lightweight students. Code coming (in sync with SAM3 release)💙

👉Review https://t.ly/bfXP2
👉Paper arxiv.org/pdf/2511.15833
👉Project simonzeng7108.github.io/efficientsam3/
👉Repo github.com/SimonZeng7108/efficientsam3
🌩️ Cloud4D in time 🌩️

👉Cloud4D reconstructs physically realistic 3D cloud fields from ground-based cameras at 25 m spatial resolution and 5 s temporal resolution. Repo coming, Data released💙

👉Review https://t.ly/w7Zly
👉Paper arxiv.org/pdf/2511.19431
👉Project cloud4d.jacob-lin.com/
👉Data https://drive.google.com/drive/folders/1QU_0kIUXIVt8h3uqygBeaF3Gvr_L5SdX?usp=drive_link
👉Repo TBA