AI with Papers - Artificial Intelligence & Deep Learning
15.5K subscribers
145 photos
256 videos
14 files
1.34K links
All the AI with papers. Every day fresh updates about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🌩️ It's "Time-to-Move" 🌩️

πŸ‘‰Technion + Nvidia Time-to-Move (TTM) is a training-free, plug-and-play framework for motion- and appearance-controlled video generation with I2V diffusion models (Wan 2.2, CogVideoX, & Stable VD). Impressive results!

πŸ‘‰Review https://t.ly/0pwXm
πŸ‘‰Paper https://lnkd.in/dxD3uHYb
πŸ‘‰Project https://lnkd.in/dcE5juyM
πŸ‘‰Repo https://lnkd.in/dMMUjybJ
1πŸ‘2πŸ”₯2❀1
This media is not supported in your browser
VIEW IN TELEGRAM
⌚ Multi-Shot Video Segmentation ⌚

πŸ‘‰Fudan focuses on an underexplored task of multi-shot video object segmentation (MVOS). Benchmark and repo available (the extension part of SAM) under Apache 2.0πŸ’™

πŸ‘‰Review https://t.ly/WBW00
πŸ‘‰Paper https://arxiv.org/pdf/2511.13715
πŸ‘‰Project https://henghuiding.com/SAAS/
πŸ‘‰Repo https://github.com/FudanCVL/SAAS
1πŸ”₯6❀2
This media is not supported in your browser
VIEW IN TELEGRAM
πŸ”₯ SAM 3/3D are OUT!! πŸ”₯

πŸ‘‰#META released SAM 3, a unified model for detection, segmentation, tracking of objects in images & video using text, exemplar & visual prompts. Repo/Models under proprietary licenseπŸ’™

πŸ‘‰Review https://t.ly/lnRZN
πŸ‘‰Paper https://t.ly/5tq9N
πŸ‘‰Project https://ai.meta.com/sam3/
πŸ‘‰Demo: https://segment-anything.com
πŸ‘‰Repo https://github.com/facebookresearch/sam3
πŸ”₯22❀6πŸ‘1
This media is not supported in your browser
VIEW IN TELEGRAM
🍯Unwrapping of 3D Meshes🍯

πŸ‘‰PartUV is a novel part-based UV unwrapping method for 3D meshes; it combines learned part priors with geometric cues to generate a compact set of part-aligned charts. Repo releasedπŸ’™

πŸ‘‰Review https://t.ly/8dNIY
πŸ‘‰Paper arxiv.org/pdf/2511.16659
πŸ‘‰Project www.zhaoningwang.com/PartUV/
πŸ‘‰Repo github.com/EricWang12/PartUV
❀15πŸ‘2πŸ”₯2
πŸ• Upsample Anything πŸ•

πŸ‘‰Upsample Anything, a novel universal, training-free up-sampler via lightweight test-time optimization. No code but it's a relevant paperπŸ’™

πŸ‘‰Review https://t.ly/7LE6G
πŸ‘‰Paper https://lnkd.in/dsUfdtih
πŸ”₯8❀4πŸ‘2πŸ‘1
This media is not supported in your browser
VIEW IN TELEGRAM
🦞Single Synthetic Image per Class🦞

πŸ‘‰MIT unveils Linear Gradient Matching (H/T Torralba), a novel method of distillation to use a single synthetic image per class for linear classifiers training (and more). Repo availableπŸ’™

πŸ‘‰Review https://t.ly/dD3un
πŸ‘‰Paper arxiv.org/pdf/2511.16674
πŸ‘‰Project linear-gradient-matching.github.io/
πŸ‘‰Repo github.com/GeorgeCazenavette/linear-gradient-matching
1❀6πŸ”₯2πŸ‘1😍1
This media is not supported in your browser
VIEW IN TELEGRAM
πŸ§ͺ EfficientSAM3 is out πŸ§ͺ

πŸ‘‰Bristol announces EfficientSAM3, a family of efficient models built on Progressive Hierarchical Distillation that transfers capability from SAM3 to lightweight students. Code coming (in sync with SAM3 release)πŸ’™

πŸ‘‰Review https://t.ly/bfXP2
πŸ‘‰Paper arxiv.org/pdf/2511.15833
πŸ‘‰Project simonzeng7108.github.io/efficientsam3/
πŸ‘‰Repo github.com/SimonZeng7108/efficientsam3
❀5πŸ‘2πŸ”₯1πŸ‘1🀩1
This media is not supported in your browser
VIEW IN TELEGRAM
🌩️ Cloud4D in time 🌩️

πŸ‘‰Cloud4D: physically-realistic 3D cloud fields using ground-based cameras at a 25 m spatial resolution and 5 s temporal resolution. Repo coming, Data releasedπŸ’™

πŸ‘‰Review https://t.ly/w7Zly
πŸ‘‰Paper arxiv.org/pdf/2511.19431
πŸ‘‰Project cloud4d.jacob-lin.com/
πŸ‘‰Data https://drive.google.com/drive/folders/1QU_0kIUXIVt8h3uqygBeaF3Gvr_L5SdX?usp=drive_link
πŸ‘‰Repo TBA
πŸ”₯7❀1
This media is not supported in your browser
VIEW IN TELEGRAM
πŸ“MotionV2V: Editing Motion in VideoπŸ“

πŸ‘‰ Google unveils motion edits, a new approach for editing videos by controlling the change in motion from the original to the edited video using diffusion models. Impressive results. Repo released soonπŸ’™

πŸ‘‰Review https://t.ly/s0sIT
πŸ‘‰Paper https://arxiv.org/pdf/2511.20640
πŸ‘‰Project https://ryanndagreat.github.io/MotionV2V/
πŸ‘‰Repo https://github.com/RyannDaGreat/MotionV2V
❀6πŸ”₯1
This media is not supported in your browser
VIEW IN TELEGRAM
πŸ”₯ Smell Like Vision Spirit πŸ”₯

πŸ‘‰New York Smells is a novel large-scale dataset of paired vision and olfaction captured in-the-wild, enabling the new task of cross-modal learning between smell and sight. With the lights out, it's less dangerous. Dataset availableπŸ’™

πŸ‘‰Review https://t.ly/Ycn_B
πŸ‘‰Paper arxiv.org/pdf/2511.20544
πŸ‘‰Project smell.cs.columbia.edu/
❀8πŸ”₯2πŸ‘1
This media is not supported in your browser
VIEW IN TELEGRAM
πŸ•ΆοΈ Seeing without Pixels πŸ•ΆοΈ

πŸ‘‰Is it possible to perceive a video’s content without seeing its pixels, just from the camera trajectory? Deepmind (+ UTexas) is the first to systematically investigate this seemingly implausible questionπŸ’™

πŸ‘‰Review https://t.ly/Ymd1c
πŸ‘‰Paper arxiv.org/pdf/2511.21681
πŸ‘‰Project sites.google.com/view/seeing-without-pixels
πŸ”₯5❀1