AI with Papers - Artificial Intelligence & Deep Learning
All the AI, with papers. Fresh updates every day about #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🌈 #Nvidia Foundation ZS-Stereo 🌈

👉Nvidia unveils FoundationStereo, a foundation model for stereo depth estimation with strong zero-shot generalization, plus a large-scale (1M stereo pairs) synthetic training dataset featuring high diversity and photorealism. Code, model & dataset to be released💙 A disparity-to-depth sketch follows the links.

👉Review https://t.ly/rfBr5
👉Paper arxiv.org/pdf/2501.09898
👉Project nvlabs.github.io/FoundationStereo/
👉Repo github.com/NVlabs/FoundationStereo/tree/master
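
👉Context, not from the paper's release: stereo networks predict per-pixel disparity, and metric depth then follows from the rectified-stereo relation depth = fx · baseline / disparity. A minimal NumPy sketch with hypothetical calibration values:

```python
import numpy as np

def disparity_to_depth(disparity: np.ndarray, fx: float, baseline_m: float) -> np.ndarray:
    """Convert a disparity map (pixels) to metric depth (meters).

    Standard rectified-stereo relation: depth = fx * baseline / disparity.
    """
    eps = 1e-6  # guard against zero disparity in invalid/textureless regions
    return fx * baseline_m / np.maximum(disparity, eps)

# Hypothetical calibration: 720 px focal length, 12 cm baseline.
disparity = np.random.uniform(1.0, 64.0, size=(480, 640)).astype(np.float32)
depth = disparity_to_depth(disparity, fx=720.0, baseline_m=0.12)
print(depth.min(), depth.max())
```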
โค6๐Ÿ”ฅ6๐Ÿคฉ1
This media is not supported in your browser
VIEW IN TELEGRAM
🔥 [SOTA] Long-Video Depth Anything 🔥

👉ByteDance unveils Video Depth Anything: high-quality, temporally consistent depth estimation on SUPER-long videos (several minutes) without sacrificing efficiency. Built on Depth Anything V2 with a novel, efficient spatial-temporal head. Repo available under Apache 2.0💙 A generic sliding-window sketch follows the links.

👉Review https://t.ly/Q4ZZd
👉Paper arxiv.org/pdf/2501.12375
👉Project https://lnkd.in/dKNwJzbM
👉Repo https://lnkd.in/ddfwwpCj
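
👉The paper's exact inference scheme is its own; as generic background, long-video depth models are often run over overlapping temporal windows whose predictions are averaged so neighboring frames stay consistent. A minimal sketch of that pattern, with a hypothetical predict_window model call:

```python
import numpy as np

def depth_over_long_video(frames, predict_window, win=32, overlap=8):
    """Windowed depth over a long video, averaging overlapping predictions.

    frames: list of HxW(x3) images; predict_window: hypothetical callable
    mapping a list of frames to a (T, H, W) depth array.
    """
    T = len(frames)
    H, W = frames[0].shape[:2]
    acc = np.zeros((T, H, W))
    hits = np.zeros(T)
    start = 0
    while True:
        end = min(start + win, T)
        acc[start:end] += predict_window(frames[start:end])
        hits[start:end] += 1.0
        if end == T:
            break
        start += win - overlap  # slide forward, keeping `overlap` shared frames
    return acc / hits[:, None, None]
```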
🧵Time-Aware Pts-Tracking🧵

👉Chrono: a feature backbone specifically designed for point tracking with built-in temporal awareness. Long-term temporal context enables precise prediction even without refinement stages. Code announced💙 A correlation-tracking sketch follows the links.

👉Review https://t.ly/XAL7G
👉Paper arxiv.org/pdf/2501.12218
👉Project cvlab-kaist.github.io/Chrono/
👉Repo github.com/cvlab-kaist/Chrono
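
👉Not Chrono's actual head, just the generic idea behind tracking from a dense feature backbone: correlate the query point's descriptor against every frame's feature map and take the argmax. A minimal sketch (features assumed L2-normalized, from any backbone):

```python
import numpy as np

def track_point(feat_maps: np.ndarray, query_xy, query_frame: int = 0):
    """Track one point by nearest-neighbor feature correlation.

    feat_maps: (T, H, W, C) L2-normalized dense features.
    query_xy: (x, y) location of the point in feat_maps[query_frame].
    Returns (T, 2) tracked (x, y) positions, one per frame.
    """
    T, H, W, C = feat_maps.shape
    qx, qy = query_xy
    q = feat_maps[query_frame, qy, qx]         # (C,) query descriptor
    sims = feat_maps.reshape(T, H * W, C) @ q  # cosine similarity per pixel
    best = sims.argmax(axis=1)                 # best match in each frame
    return np.stack([best % W, best // W], axis=1)
```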
โค5๐Ÿ”ฅ5๐Ÿ‘3๐Ÿ˜1
This media is not supported in your browser
VIEW IN TELEGRAM
🎤EMO2: Audio-Driven Avatar🎤

👉Alibaba previews a novel audio-driven talking-head method capable of simultaneously generating highly expressive facial expressions and hand gestures. Turn your audio ON. Stunning results but no code 🥺

👉Review https://t.ly/x8slQ
👉Paper arxiv.org/pdf/2501.10687
👉Project humanaigc.github.io/emote-portrait-alive-2/
👉Repo 🥺
๐Ÿคฏ7โค6๐Ÿ‘2๐Ÿคฉ1
This media is not supported in your browser
VIEW IN TELEGRAM
🦠A-Life with Foundation Models🦠

👉A super-team unveils ASAL, a new paradigm for Artificial Life research, spanning a diverse range of ALife substrates including Boids, Particle Life, Game of Life, Lenia & Neural Cellular Automata. Code under Apache 2.0💙 A tiny Game of Life sketch follows the links.

👉Review https://t.ly/7SZ8A
👉Paper arxiv.org/pdf/2412.17799
👉Project https://pub.sakana.ai/asal/
👉Repo https://lnkd.in/dP5yxKtw
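
👉For readers new to these substrates: Conway's Game of Life, one of the systems listed above, fits in a few lines and shows the kind of grid dynamics ASAL searches over. A self-contained NumPy sketch, independent of the ASAL codebase:

```python
import numpy as np

def life_step(grid: np.ndarray) -> np.ndarray:
    """One Game of Life update on a toroidal 0/1 grid."""
    # Count the 8 neighbors by summing shifted copies of the grid.
    neighbors = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)
    )
    # A cell survives with 2-3 neighbors and is born with exactly 3.
    return ((neighbors == 3) | ((grid == 1) & (neighbors == 2))).astype(np.uint8)

grid = (np.random.rand(64, 64) < 0.2).astype(np.uint8)
for _ in range(100):
    grid = life_step(grid)
print(grid.sum(), "cells alive after 100 steps")
```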
โค11โšก2๐Ÿคฉ2
This media is not supported in your browser
VIEW IN TELEGRAM
🔥 The code of DynOMo is out 🔥

👉DynOMo is a novel model able to track any point in a dynamic scene over time through 3D reconstruction from monocular video: 2D and 3D point tracking from unposed monocular camera input.

👉Review https://t.ly/t5pCf
👉Paper https://lnkd.in/dwhzz4_t
👉Repo github.com/dvl-tum/DynOMo
👉Project https://lnkd.in/dMyku2HW
๐Ÿ”ฅ7โค5๐Ÿ˜5๐Ÿ‘2๐Ÿคฉ2๐Ÿพ2โšก1
This media is not supported in your browser
VIEW IN TELEGRAM
🪆SOTA Points Segmentation🪆

👉VGG Oxford unveils a novel loss to segment objects in videos based on their motion and NO other form of supervision! The network is trained with long-term point trajectories as a supervisory signal to complement optical flow. New SOTA! A toy trajectory-grouping sketch follows the links.

👉Review https://t.ly/8Bsbt
👉Paper https://arxiv.org/pdf/2501.12392
👉Code https://github.com/karazijal/lrtl
👉Project www.robots.ox.ac.uk/~vgg/research/lrtl/
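
👉The paper's loss itself is more subtle; as a toy analogue only, here is how long-term trajectories can be split into two motion groups (e.g., object vs. background) with plain k-means on their displacements:

```python
import numpy as np

def group_trajectories(tracks: np.ndarray, iters: int = 20) -> np.ndarray:
    """Toy 2-way k-means over trajectory motion. tracks: (N, T, 2) positions."""
    feats = np.diff(tracks, axis=1).reshape(len(tracks), -1)  # per-step motion
    centers = feats[np.random.choice(len(feats), 2, replace=False)].copy()
    for _ in range(iters):
        dists = ((feats[:, None] - centers[None]) ** 2).sum(-1)  # (N, 2)
        labels = dists.argmin(axis=1)
        for k in range(2):
            if (labels == k).any():
                centers[k] = feats[labels == k].mean(axis=0)
    return labels  # 0/1 motion group per trajectory
```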
๐Ÿ”ฅ3โค2๐Ÿคฉ1
This media is not supported in your browser
VIEW IN TELEGRAM
🎨MatAnyone: Human Matting🎨

👉MatAnyone is a novel approach for human video matting that supports target assignment. Stable tracking in long videos even with complex/ambiguous backgrounds. Code & 🤗-Demo announced💙

👉Review https://t.ly/NVXsT
👉Paper arxiv.org/pdf/2501.14677
👉Project pq-yang.github.io/projects/MatAnyone
👉Repo TBA
โค15๐Ÿ‘2๐Ÿคฉ2๐Ÿ‘1๐Ÿ”ฅ1
This media is not supported in your browser
VIEW IN TELEGRAM
🦕[SOTA] Visual Grounding VOS🦕

👉ReferDINO is the first end-to-end approach for adapting foundational visual grounding models to referring video object segmentation (RVOS). Code & models to be released soon💙

👉Review https://t.ly/SDFy9
👉Paper arxiv.org/pdf/2501.14607
👉Project isee-laboratory.github.io/ReferDINO/
👉Repo github.com/iSEE-Laboratory/ReferDINO
๐Ÿคฏ4โค1๐Ÿ”ฅ1๐Ÿคฉ1
This media is not supported in your browser
VIEW IN TELEGRAM
☀️ Relightable Full-Body Avatars ☀️

👉#Meta unveils the first approach ever to jointly model the relightable appearance of the body, face, and hands of drivable avatars.

👉Review https://t.ly/kx9gf
👉Paper arxiv.org/pdf/2501.14726
👉Project neuralbodies.github.io/RFGCA
โค3๐Ÿ‘3๐Ÿ”ฅ3โšก1๐Ÿคฏ1๐Ÿ˜ข1๐Ÿคฉ1
This media is not supported in your browser
VIEW IN TELEGRAM
🌅 Generative Human Mesh Recovery 🌅

👉GenHMR is a novel generative framework that reformulates monocular human mesh recovery (HMR) as an image-conditioned generative task, explicitly modeling and mitigating uncertainties in the 2D-to-3D mapping process. Impressive results but no code announced 🥺

👉Review https://t.ly/Rrzpj
👉Paper https://arxiv.org/pdf/2412.14444
👉Project m-usamasaleem.github.io/publication/GenHMR/GenHMR.html
๐Ÿ”ฅ6๐Ÿ‘2โค1๐Ÿคฏ1๐Ÿพ1
Everyone's social feed is flooded with unnecessary opinions about DeepSeek. Your wish:
Anonymous Poll
37%
🛑 STOP posting about it!
63%
🟩 Keep posting, we want more!
๐Ÿ‘1
💎AI-driven Docs Conversion💎

👉Docling, by IBM, is the all-in-one, open-source solution for documents, parsing several popular formats into a unified, richly structured representation. Powered by SOTA models for layout (DocLayNet) and table structure (TableFormer), it runs efficiently on low-cost hardware. Code under MIT💙 A minimal usage sketch follows the links.

👉Review https://t.ly/nSCfT
👉Paper https://lnkd.in/dc5Kpc2F
👉Repo https://lnkd.in/d9gvw9bt
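
👉A minimal usage sketch based on the repo's README (assuming pip install docling; check the repo for the current API):

```python
from docling.document_converter import DocumentConverter

# Parse a document (PDF, DOCX, HTML, ...) into Docling's unified representation.
converter = DocumentConverter()
result = converter.convert("report.pdf")  # hypothetical local file

# Export the structured document, e.g. to Markdown.
print(result.document.export_to_markdown())
```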
โค18๐Ÿ‘8๐Ÿ”ฅ1๐Ÿพ1
This media is not supported in your browser
VIEW IN TELEGRAM
🈯 SOTA 0-Shot Multi-View 🈯

👉MVGD by #TOYOTA is the SOTA method for generating images and scale-consistent depth maps from novel viewpoints, given an arbitrary number of posed input views. A novel diffusion-based architecture capable of direct pixel-level generation. Code announced 💙

👉Review https://t.ly/_ecKl
👉Paper arxiv.org/pdf/2501.18804
👉Project mvgd.github.io/
👉Repo TBA
๐Ÿ”ฅ8โค1๐Ÿ˜1
This media is not supported in your browser
VIEW IN TELEGRAM
๐Ÿ™MambaGlue: SOTA feats. matching๐Ÿ™

👉MambaGlue is a hybrid neural network combining the Mamba and Transformer architectures to match local features. Source code announced, to be released💙 A classical matching-baseline sketch follows the links.

👉Review https://shorturl.at/LxDG1
👉Paper arxiv.org/pdf/2502.00462
👉Repo https://lnkd.in/dAujfGZQ
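
👉MambaGlue's matcher is learned; the classical baseline such matchers improve on is mutual nearest-neighbor matching of local descriptors. A minimal sketch:

```python
import numpy as np

def mutual_nn_matches(desc_a: np.ndarray, desc_b: np.ndarray) -> np.ndarray:
    """Match L2-normalized descriptor sets of shape (Na, C) and (Nb, C).

    Keeps only pairs that are each other's nearest neighbor: the classical
    baseline that learned matchers aim to beat.
    """
    sim = desc_a @ desc_b.T                 # cosine similarities (Na, Nb)
    nn_ab = sim.argmax(axis=1)              # best b for each a
    nn_ba = sim.argmax(axis=0)              # best a for each b
    mutual = nn_ba[nn_ab] == np.arange(len(desc_a))
    return np.stack([np.nonzero(mutual)[0], nn_ab[mutual]], axis=1)  # (M, 2)
```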
๐Ÿคฉ9โค3๐Ÿ”ฅ2๐Ÿ‘2๐Ÿ‘1๐Ÿพ1
This media is not supported in your browser
VIEW IN TELEGRAM
🛸Real-Time Differentiable Ray Tracing🛸

👉Radiant Foam is a novel scene representation that leverages a decades-old, efficient volumetric mesh ray-tracing algorithm largely overlooked in recent research. It performs like Gaussian Splatting, without the constraints of rasterization. Code announced💙

👉Review https://shorturl.at/26U06
👉Paper https://arxiv.org/pdf/2502.01157
👉Project https://radfoam.github.io/
👉Repo https://github.com/theialab/radfoam
๐Ÿ”ฅ7โค1๐Ÿ˜1
This media is not supported in your browser
VIEW IN TELEGRAM
🔥 VideoJAM: #META's Video Model (SOTA) 🔥

👉#META's VideoJAM: the new SOTA (by a large margin) in motion coherence for video generation, much better than SORA! It instills a strong motion prior into any video-gen model. Impressive results, no code announced🥲

👉Review https://shorturl.at/id7Bt
👉Paper https://arxiv.org/pdf/2502.02492
👉Project https://hila-chefer.github.io/videojam-paper.github.io/
๐Ÿ”ฅ9โค4๐Ÿ‘1๐Ÿ‘1๐Ÿคฉ1
This media is not supported in your browser
VIEW IN TELEGRAM
👗3D Dynamic Garments👗

👉UCLA introduces Dress-1-to-3, a novel pipeline that reconstructs physics-plausible, simulation-ready separated garments (with sewing patterns) and humans from an in-the-wild image.

👉Review https://t.ly/qciHV
👉Paper arxiv.org/pdf/2502.03449
👉Project dress-1-to-3.github.io
๐Ÿ”ฅ8โค3๐Ÿ‘3๐Ÿ‘2๐Ÿคฉ1๐Ÿ˜1
This media is not supported in your browser
VIEW IN TELEGRAM
🤖 META Human-Robot 🤖

👉#META's PARTNR: a novel benchmark for Planning And Reasoning Tasks in humaN-Robot collaboration. The largest benchmark of its kind: 100,000+ natural language tasks spanning 60 houses and 5,819 unique objects. Code & Data (🤗) under MIT💙 A download sketch follows the links.

👉Review https://t.ly/zcN0K
👉Paper arxiv.org/pdf/2411.00081
👉Repo github.com/facebookresearch/partnr-planner
🤗Data huggingface.co/datasets/ai-habitat/partnr_episodes
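
👉A minimal sketch for pulling the episodes locally with the standard Hugging Face Hub client (repo id from the link above; the internal file layout is an assumption to verify):

```python
from huggingface_hub import snapshot_download

# Download the PARTNR episodes dataset repo into the local HF cache.
local_dir = snapshot_download(
    repo_id="ai-habitat/partnr_episodes",
    repo_type="dataset",
)
print("Dataset files at:", local_dir)
```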
💃HumanDiT: Long-Form Human Video💃

👉HumanDiT is a novel pose-guided Diffusion Transformer trained on a large, in-the-wild dataset of 14,000 hours of HQ video to produce HD videos with fine-grained body detail. Stunning results but no code announced🥲

👉Review https://t.ly/7rTRr
👉Paper https://arxiv.org/pdf/2502.04847
👉Project https://agnjason.github.io/HumanDiT-page/
โค6๐Ÿ”ฅ3๐Ÿ‘2๐Ÿ‘1๐Ÿคฏ1