OpenSeeD
A Simple Framework for Open-Vocabulary Segmentation and Detection
GitHub: https://github.com/idea-research/openseed
Paper: https://arxiv.org/abs/2303.08131v2
Dataset: https://paperswithcode.com/dataset/objects365
https://t.iss.one/DataScienceT
Contrastive Semi-supervised Learning for Underwater Image Restoration via Reliable Bank
GitHub: https://github.com/huang-shirui/semi-uir
Paper: https://arxiv.org/abs/2303.09101v1
Dataset: https://paperswithcode.com/dataset/uieb
https://t.iss.one/DataScienceT
WebSHAP: Towards Explaining Any Machine Learning Models Anywhere
GitHub: https://github.com/poloclub/webshap
Paper: https://arxiv.org/abs/2303.09545v1
Project: https://poloclub.github.io/webshap
https://t.iss.one/DataScienceT
GigaGAN - PyTorch
Implementation of GigaGAN, a new state-of-the-art GAN from Adobe.
https://github.com/lucidrains/gigagan-pytorch
https://t.iss.one/DataScienceT
Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation (CVPR 2023)
A novel Diffusion Audio-Gesture Transformer is devised to better attend to information from multiple modalities and to model long-term temporal dependencies; a minimal sketch of the conditioning idea follows the links below.
GitHub: https://github.com/advocate99/diffgesture
Paper: https://arxiv.org/abs/2303.09119v1
Dataset: https://paperswithcode.com/dataset/beat
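Purely illustrative sketch (not the authors' code): one denoiser block that uses self-attention over the gesture sequence to model long-term temporal dependency and cross-attention to audio features to attend to the second modality. All class, module, and tensor names here are hypothetical.

```python
# Illustrative only: names are hypothetical, not taken from the DiffGesture repo.
import torch
import torch.nn as nn

class AudioConditionedDenoiserBlock(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.norm1, self.norm2, self.norm3 = nn.LayerNorm(dim), nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.time_mlp = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim))

    def forward(self, noisy_gesture, audio_feats, t_emb):
        # Inject the diffusion timestep embedding into every gesture token.
        x = noisy_gesture + self.time_mlp(t_emb).unsqueeze(1)
        # Self-attention over the gesture sequence models long-term temporal dependency.
        h = self.norm1(x)
        x = x + self.self_attn(h, h, h)[0]
        # Cross-attention lets gesture tokens attend to the audio modality.
        x = x + self.cross_attn(self.norm2(x), audio_feats, audio_feats)[0]
        return x + self.ff(self.norm3(x))

block = AudioConditionedDenoiserBlock()
out = block(torch.randn(2, 64, 256),    # noisy gesture tokens (batch, frames, dim)
            torch.randn(2, 128, 256),   # audio feature tokens (batch, audio steps, dim)
            torch.randn(2, 256))        # diffusion timestep embedding (batch, dim)
print(out.shape)                        # torch.Size([2, 64, 256])
```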
https://t.iss.one/DataScienceT
Deep Metric Learning for Unsupervised Change Detection (CD)
GitHub: https://github.com/wgcban/metric-cd
Paper: https://arxiv.org/abs/2303.09536v1
https://t.iss.one/DataScienceT
ViperGPT: Visual Inference via Python Execution for Reasoning
ViperGPT is a framework that leverages code-generation models to compose vision-and-language models into subroutines and produce a result for any visual query; a toy sketch of the idea follows the links below.
GitHub: https://github.com/cvlab-columbia/viper
Paper: https://arxiv.org/pdf/2303.08128.pdf
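Toy illustration of the core idea only, not the official ViperGPT API: a code LLM writes a short Python program against a small library of vision primitives, and executing that program answers the query. The helpers `detect` and `vqa` below are hypothetical stand-ins for real detection and VQA modules.

```python
# Toy sketch with hypothetical helpers; the real system wires in actual vision models.
from dataclasses import dataclass

@dataclass
class Box:
    label: str
    x: int
    y: int
    w: int
    h: int

def detect(image, label):
    """Stand-in for an open-vocabulary detector returning boxes for `label`."""
    fake = {"mug": [Box("mug", 10, 40, 30, 30), Box("mug", 120, 42, 28, 29)]}
    return fake.get(label, [])

def vqa(image, question):
    """Stand-in for a visual question answering model run on the image."""
    return "blue"

# A program a code LLM might generate for the query:
# "How many mugs are there, and what colour is the left-most one?"
generated_program = """
mugs = detect(image, "mug")
leftmost = min(mugs, key=lambda b: b.x)
colour = vqa(image, "What colour is the left-most mug?")
result = f"{len(mugs)} mugs; the left-most one is {colour}."
"""

scope = {"image": None, "detect": detect, "vqa": vqa}
exec(generated_program, scope)  # the generated code is run by the Python interpreter
print(scope["result"])          # -> "2 mugs; the left-most one is blue."
```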
https://t.iss.one/DataScienceT
Zero-1-to-3: Zero-shot One Image to 3D Object
Zero-1-to-3 is a framework for changing the camera viewpoint of an object given just a single RGB image.
GitHub: https://github.com/cvlab-columbia/zero123
Hugging Face: https://huggingface.co/spaces/cvlab/zero123-live
Paper: https://arxiv.org/abs/2303.11328v1
Project: https://zero123.cs.columbia.edu/
Demo: https://huggingface.co/spaces/cvlab/zero123
https://t.iss.one/DataScienceT
MIT Introduction to Deep Learning - 2023 starting soon! MIT Intro to DL is one of the most concise AI courses on the web, covering basic deep learning techniques, architectures, and applications.
2023 lectures are starting in just one day, Jan 9th!
Link to register:
https://introtodeeplearning.com
The 2022 lectures can be found here:
https://m.youtube.com/playlist?list=PLtBw6njQRU-rwp5__7C0oIVt26ZgjG9NI
https://t.iss.one/DataScienceT
Train your ControlNet with diffusers
ControlNet is a neural network structure that allows fine-grained control of diffusion models by adding extra conditioning inputs; a minimal inference sketch follows the links below.
Hugging Face blog: https://huggingface.co/blog/train-your-controlnet
GitHub: https://github.com/huggingface/blog/blob/main/train-your-controlnet.md
ControlNet training example: https://github.com/huggingface/diffusers/tree/main/examples/controlnet
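A minimal inference sketch with the diffusers ControlNet pipeline, assuming the public SD 1.5 and canny-edge ControlNet checkpoints and a CUDA GPU; the conditioning image path is a placeholder. Training itself is covered by the example linked above.

```python
# Minimal sketch: checkpoints and the conditioning image path are assumptions; adjust to your setup.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# The extra condition: here, a canny edge map the generated image has to follow.
condition = load_image("my_canny_edges.png")  # placeholder path to your own edge map
image = pipe(
    "a futuristic living room, photorealistic",
    image=condition,
    num_inference_steps=30,
).images[0]
image.save("controlnet_out.png")
```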
https://t.iss.one/DataScienceT
Fix the Noise: Disentangling Source Feature for Controllable Domain Translation
A new approach for high-quality domain translation with better controllability.
GitHub: https://github.com/LeeDongYeun/FixNoise
Paper: https://arxiv.org/abs/2303.11545v1
Dataset: https://paperswithcode.com/dataset/metfaces
https://t.iss.one/DataScienceT
Example prompt: "A panda is playing guitar on Times Square."
Text2Video-Zero
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
Paper: https://arxiv.org/abs/2303.13439
Source code: https://github.com/picsart-ai-research/text2video-zero
https://t.iss.one/DataScienceT
Conditional Image-to-Video Generation with Latent Flow Diffusion Models
A new approach to conditional image-to-video generation (cI2V) using novel latent flow diffusion models (LFDM), which synthesize an optical-flow sequence in the latent space based on the given condition and use it to warp the given image; a toy sketch of the warping step follows the links below.
GitHub: https://github.com/nihaomiao/cvpr23_lfdm
Paper: https://arxiv.org/abs/2303.13744v1
Dataset: https://drive.google.com/file/d/1dRn1wl5TUaZJiiDpIQADt1JJ0_q36MVG/view?usp=share_link
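Illustrative sketch of the warping step only (not the authors' code): given a latent image and one step of a predicted latent flow field, warp the latent with grid_sample. In LFDM the flow sequence comes from a conditional diffusion model; random tensors stand in for it here.

```python
# Illustrative warping step; the flow would come from the (hypothetical) latent flow diffusion model.
import torch
import torch.nn.functional as F

def warp_with_flow(latent, flow):
    """latent: (B, C, H, W); flow: (B, 2, H, W) with (dx, dy) offsets in pixels."""
    B, _, H, W = latent.shape
    # Base sampling grid, later normalized to [-1, 1] as grid_sample expects.
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).float().unsqueeze(0).expand(B, -1, -1, -1)
    grid = base + flow.permute(0, 2, 3, 1)          # add per-pixel offsets
    grid[..., 0] = 2 * grid[..., 0] / (W - 1) - 1   # normalize x to [-1, 1]
    grid[..., 1] = 2 * grid[..., 1] / (H - 1) - 1   # normalize y to [-1, 1]
    return F.grid_sample(latent, grid, align_corners=True)

latent = torch.randn(1, 4, 32, 32)          # encoded frame
flow = torch.randn(1, 2, 32, 32)            # one step of a latent flow sequence (random here)
print(warp_with_flow(latent, flow).shape)   # torch.Size([1, 4, 32, 32])
```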
https://t.iss.one/DataScienceT
Test of Time: Instilling Video-Language Models with a Sense of Time
GPT-5 will likely have video abilities, but will it have a sense of time? This #CVPR2023 paper from the University of Amsterdam answers that question and shows how to instil a sense of time into video-language foundation models.
Paper: https://arxiv.org/abs/2301.02074
Code: https://github.com/bpiyush/TestOfTime
Project Page: https://bpiyush.github.io/testoftime-website/
https://t.iss.one/DataScienceT
One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer
GitHub: https://github.com/IDEA-Research/OSX
Paper: https://arxiv.org/abs/2303.16160
Project: https://osx-ubody.github.io
Dataset: https://paperswithcode.com/dataset/expose
https://t.iss.one/DataScienceT
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research
Proposes a three-stage processing pipeline for filtering noisy data and generating high-quality captions, with ChatGPT assisting the filtering and caption generation.
GitHub: https://github.com/xinhaomei/wavcaps
Paper: https://arxiv.org/abs/2303.17395v1
Dataset: https://paperswithcode.com/dataset/sounddescs
https://t.iss.one/DataScienceT
DPF: Learning Dense Prediction Fields with Weak Supervision
GitHub: https://github.com/cxx226/dpf
Paper: https://arxiv.org/abs/2303.16890v1
Dataset: https://paperswithcode.com/dataset/pascal-context
https://t.iss.one/DataScienceT