AI with Papers - Artificial Intelligence & Deep Learning
15.2K subscribers
135 photos
247 videos
14 files
1.31K links
All the AI with papers. Fresh daily updates on #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
🥕 Scenimefy: I-2-I for anime 🥕

👉S-Lab unveils a novel semi-supervised image-to-image (I-2-I) translation framework + HD dataset for anime

😎Review https://t.ly/IsdEG
😎Paper arxiv.org/pdf/2308.12968.pdf
😎Code https://github.com/Yuxinn-J/Scenimefy
😎Project https://yuxinn-j.github.io/projects/Scenimefy.html
๐Ÿจ Watch Your Steps: Editing by Text ๐Ÿจ

๐Ÿ‘‰The novel SOTA in image & scene (text) editing via denoising diffusion models

๐Ÿ˜ŽReview https://t.ly/fv9wn
๐Ÿ˜ŽPaper arxiv.org/pdf/2308.08947.pdf
๐Ÿ˜ŽProject ashmrz.github.io/WatchYourSteps
โค4๐Ÿ‘3๐Ÿคฏ3๐Ÿ”ฅ1
This media is not supported in your browser
VIEW IN TELEGRAM
💡 Relighting NeRF 💡

👉Neural implicit radiance representation for free-viewpoint relighting of an object lit by a moving point light

😎Review https://t.ly/J-3_L
😎Project nrhints.github.io
😎Code github.com/iamNCJ/NRHints
😎Paper nrhints.github.io/pdfs/nrhints-sig23.pdf
🪶 ReST: Multi-Camera MOT 🪶

👉Novel reconfigurable two-step graph model for multi-camera multi-object video tracking (MC-MOT)

😎Review https://t.ly/3C5tb
😎Paper arxiv.org/pdf/2308.13229.pdf
😎Code github.com/chengche6230/ReST
🌲 MagicEdit: Magic Video Edit 🌲

👉MagicEdit: explicitly disentangling content, structure & motion for Hi-Fi and temporally coherent video editing

😎Report https://t.ly/tREX4
😎Paper arxiv.org/pdf/2308.14749.pdf
😎Project magic-edit.github.io
😎Code github.com/magic-research/magic-edit
✂️ VideoCutLER: Simple UVIS ✂️

👉VideoCutLER is a simple unsupervised video instance segmentation (UVIS) method that does not rely on optical flow

😎Review https://t.ly/PBBjG
😎Paper arxiv.org/pdf/2308.14710.pdf
😎Project people.eecs.berkeley.edu/~xdwang/projects/CutLER
😎Code github.com/facebookresearch/CutLER/tree/main/videocutler
๐Ÿฆ 3D Pigeons Pose & Tracking ๐Ÿฆ

๐Ÿ‘‰ 3D-MuPPET: estimate and track 3D poses of pigeons with multiple-views

๐Ÿ˜ŽReview https://t.ly/jfAJJ
๐Ÿ˜ŽPaper arxiv.org/pdf/2308.15316.pdf
๐Ÿ˜ŽCode github.com/alexhang212/3D-MuPPET/
RoboTAP: Dense Tracking for Few-Shot Imitation

👉RoboTAP: novel dense tracking representation for robotic arms

😎Review https://t.ly/MCO_V
😎Paper arxiv.org/pdf/2308.15975.pdf
😎Project https://robotap.github.io/
😎Code github.com/deepmind/tapnet
⛺ FACET: Fairness in Computer Vision ⛺

👉#META AI opens a large, publicly available dataset for classification, detection & segmentation, built to probe potential performance disparities & challenges across sensitive demographic attributes

😎Review https://t.ly/mKn-t
😎Paper arxiv.org/pdf/2309.00035.pdf
😎Dataset https://facet.iss.onetademolab.com/
♊️ Doppelgangers in Structures ♊️

👉A novel learning-based approach for visual disambiguation: distinguishing illusory matches to produce correct, disambiguated #3D reconstructions

😎Review https://t.ly/9yLot
😎Paper arxiv.org/pdf/2309.02420.pdf
😎Code github.com/RuojinCai/Doppelgangers
😎Project doppelgangers-3d.github.io/
๐Ÿƒ Tracking Anything with Decoupled VOS ๐Ÿƒ

๐Ÿ‘‰A novel VOS approach that extends SAM for open-world video segmentation with no user input required

๐Ÿ˜ŽReview https://t.ly/xeobR
๐Ÿ˜ŽPaper arxiv.org/pdf/2309.03903.pdf
๐Ÿ˜ŽProject hkchengrex.com/Tracking-Anything-with-DEVA
๐Ÿ˜ŽCode github.com/hkchengrex/Tracking-Anything-with-DEVA
๐Ÿ˜ŽColab https://colab.research.google.com/drive/1OsyNVoV_7ETD1zIE8UWxL3NXxu12m_YZ
🪷 Diffusive Consistent Video Editing 🪷

👉Weizmann Institute of Science unveils TokenFlow, a novel framework that harnesses a text-to-image diffusion model for text-driven video editing

😎Review https://t.ly/ru8km
😎Paper arxiv.org/pdf/2307.10373.pdf
😎Project diffusion-tokenflow.github.io
😎Code github.com/omerbt/TokenFlow
โค9๐Ÿ‘6๐Ÿ”ฅ2๐Ÿคฏ1๐Ÿ˜ฑ1๐Ÿ˜ข1
This media is not supported in your browser
VIEW IN TELEGRAM
🔥🔥 #META's DINOv2 is now commercial! 🔥🔥

👉Universal features for image classification, instance retrieval, video understanding, depth & semantic segmentation. Now suitable for commercial use.

😎Review https://t.ly/LNrGy
😎Paper arxiv.org/pdf/2304.07193.pdf
😎Code github.com/facebookresearch/dinov2
😎Demo dinov2.metademolab.com/
🧄 FreeMan: towards #3D Humans 🧄

👉FreeMan: the first large-scale, real-world, multi-view dataset for #3D human pose estimation. 11M frames!

😎Review https://t.ly/ICxpA
😎Paper arxiv.org/pdf/2309.05073.pdf
😎Project wangjiongw.github.io/freeman
๐Ÿ‘6๐Ÿคฏ4๐Ÿฅฐ1
🦊 MagiCapture: HD Multi-Concept Portrait 🦊

👉KAIST unveils MagiCapture: integrating subject and style concepts to generate high-resolution portrait images using just a few subject and style references

😎Review https://t.ly/c9rOo
😎Paper https://arxiv.org/pdf/2309.06895.pdf
โค5๐Ÿฅฐ1
This media is not supported in your browser
VIEW IN TELEGRAM
⚽ Dynamic NeRFs for Soccer ⚽

👉SoccerNeRF: a first attempt at applying "cheap" NeRFs to football, reconstructing soccer replays in space and time.

😎Review https://t.ly/Ywcvk
😎Paper arxiv.org/pdf/2309.06802.pdf
😎Project https://soccernerfs.isach.be/
😎Code github.com/iSach/SoccerNeRFs
☢️ GlueStick: Graph Neural Matching ☢️

👉GlueStick is a joint deep matcher for points and lines that leverages the connectivity information between nodes to better glue them together

😎Review https://t.ly/Atxqo
😎Paper arxiv.org/pdf/2304.02008.pdf
😎Code https://github.com/cvg/GlueStick
🫀 CPR-Coach: Neural Cardiopulmonary Resuscitation 🫀

👉CPR-Coach: fine-grained action recognition in cardiopulmonary resuscitation

😎Review https://t.ly/Qbg4K
😎Paper arxiv.org/pdf/2309.11718.pdf
😎Code github.com/Shunli-Wang/CPR-Coach
😎Project shunli-wang.github.io/CPR-Coach
โค7๐Ÿ”ฅ3๐Ÿ‘1
🧪 NeuralLabeling with NeRF 🧪

👉Annotating a scene by generating segmentation masks, affordance maps, 2D bounding boxes, 3D bounding boxes, 6DOF poses, depth & meshes.

😎Review https://t.ly/1GPsj
😎Paper arxiv.org/pdf/2309.11966.pdf
😎Code github.com/FlorisE/neural-labeling
😎Project florise.github.io/neural_labeling_web
๐Ÿ‘5๐Ÿคฏ3๐Ÿ”ฅ2โค1๐Ÿฅฐ1
๐ŸŸ DE-ViT: detecting everything via DINOv2 ๐ŸŸ

๐Ÿ‘‰DE-ViT: open-set object detector based on DINOv2 backbone. It's the new SOTA on COCO & LVIS dataset

๐Ÿ˜ŽReview https://t.ly/_DAmt
๐Ÿ˜ŽPaper arxiv.org/pdf/2309.12969.pdf
๐Ÿ˜ŽCode https://github.com/mlzxy/devit