AI with Papers - Artificial Intelligence & Deep Learning
All the AI, with papers. Fresh daily updates on #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🦜 ColorDiffuser: Text-to-Video Colorization 🦜

👉HK University unveils ColorDiffuser: adapting a pre-trained text-to-image latent diffusion model for video colorization

😎Review https://t.ly/XGv_
😎Paper arxiv.org/pdf/2306.01732.pdf
😎Project colordiffuser.github.io/
😎Code github.com/ColorDiffuser/ColorDiffuser
🌻 Extending Mona Lisa with AI 🌻

👉 A Reddit user extends the Mona Lisa painting with #Photoshop AI. The result is surprising.

😎More https://t.ly/j_2r
🏸 Segment Anything in HQ 🏸

👉HQ-SAM: SAM with the ability to accurately segment objects, while maintaining its promptable design, efficiency, and zero-shot generalizability

😎Review https://t.ly/GxX5B
😎Paper arxiv.org/pdf/2306.01567.pdf
😎Models github.com/SysCV/SAM-HQ
🌈 Track Everything Everywhere 🌈

👉#Google unveils OmniMotion: full-length motion tracking for every pixel in every frame of video.

😎Review https://t.ly/Krvw
😎Paper arxiv.org/pdf/2306.05422.pdf
😎Project omnimotion.github.io/
😎Demo omnimotion.github.io/#interactive_demo
😎Code github.com/qianqianwang68/omnimotion
👁️ Scene Five: Through Her Eyes 👁️

👉 #3D scene reconstruction of what a person is observing, using only the reflections in their eyes

😎Review https://t.ly/uBO6
😎Paper arxiv.org/pdf/2306.09348.pdf
😎Project https://world-from-eyes.github.io/
🫣 Text-Guided Adversarial Makeup 🫣

👉Novel facial privacy protection via adversarial latent codes. Makeup vs Face Recognition.

😎Review https://t.ly/pBCP
😎Paper arxiv.org/pdf/2306.10008.pdf
😎Code github.com/fahadshamshad/Clip2Protect
🦷 Few-Shot Geometry-Aware Keypoints 🦷

👉UBC (+Flawless AI) unveils the new SOTA in semantic keypoint localization. Suitable for faces, animals, cars, mouths, teeth & more

😎Review https://t.ly/-0qN
😎Paper arxiv.org/pdf/2303.17216.pdf
😎Project xingzhehe.github.io/FewShot3DKP/
🚔 Fooling Neural Forensic Classifiers 🚔

👉Adversarial faces that fool forensic classifiers while remaining undetectable by humans

😎Review https://t.ly/33Cc
😎Paper arxiv.org/pdf/2306.13091.pdf
😎Project koushiksrivats.github.io/face_attribute_attack
😎Code github.com/koushiksrivats/face_attribute_attack
🍥 PanoHead: 3D Full-Head Synthesis 🍥

👉#ByteDance (+UW-M) unveils PanoHead: 360° view-consistent portraits from a single-view image

😎Review https://t.ly/MrLNR
😎Paper arxiv.org/pdf/2303.13071.pdf
😎Project sizhean.github.io/panohead
😎Code github.com/sizhean/panohead
🔮SAM-PT: Segment Anything+Tracking🔮

👉SAM-PT is the first method to utilize sparse point propagation for Video Object Segmentation (VOS).

😎Review https://t.ly/QLMG
😎Paper arxiv.org/pdf/2307.01197.pdf
😎Project www.vis.xyz/pub/sam-pt/
😎Code github.com/SysCV/sam-pt
🪩 DISCO: Human Dance Generation 🪩

👉NTU (+ #Microsoft) unveils DISCO: a big step towards human dance generation.

😎Review https://t.ly/cNGX
😎Paper arxiv.org/pdf/2307.00040.pdf
😎Project disco-dance.github.io/
😎Code github.com/Wangt-CN/DisCo
🛣️ STAR: 3D tracking w/ attention paradigm 🛣️

👉#Mercedes STAR: e2e 3D object tracking that follows the tracking-by-attention paradigm

😎Review https://t.ly/JoGj
😎Paper arxiv.org/pdf/2306.17602.pdf
😎Project simondoll.github.io/publications/star_track
🍡 Text2Cinemagraphs: Cinemagraph from text 🍡

👉CMU (+ #Snap) unveils a fully automated method for creating cinemagraphs from text descriptions

😎Review https://t.ly/BwZs6
😎Paper arxiv.org/pdf/2307.03190.pdf
😎Project text2cinemagraph.github.io/website
😎Code github.com/text2cinemagraph/text2cinemagraph
🔥Test-Time Training on fire 🔥

👉Extending test-time training (TTT) to the streaming setting. Suitable for panoptic segmentation, instance segmentation & colorization.

😎Review https://t.ly/eZYA
😎Paper arxiv.org/pdf/2307.05014.pdf
😎Project https://video-ttt.github.io/
😎Code github.com/renwang435/video-ttt-release
🃏 Deepfake via casual self-scan 🃏

👉TAU presents a novel approach to reenact an identity using only a casual self-scan

😎Review https://t.ly/9T8Wi
😎Paper arxiv.org/pdf/2307.06307.pdf
😎Project arielazary.github.io/PGR
🎪 Extreme Human Pose Estimation 🎪

👉RePoGen: novel synthetic data generator of extreme/realistic poses of humans

😎Review https://t.ly/ecBvM
😎Paper arxiv.org/pdf/2307.06737.pdf
😎Project mirapurkrabek.github.io/RePoGen-paper
😎Code github.com/MiraPurkrabek/RePoGen
🧯Neural Focal Modulation VAR🧯

👉A novel architecture for video recognition that models both local and global context

😎Review https://t.ly/rF_fk
😎Paper arxiv.org/pdf/2307.06947.pdf
😎Project talalwasim.github.io/Video-FocalNets
😎Code github.com/TalalWasim/Video-FocalNets