"A panda is playing guitar on times square"
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
Paper: https://arxiv.org/abs/2303.13439
Video Result: video result link
Source code: https://github.com/picsart-ai-research/text2video-zero
https://t.iss.one/DataScienceT
Conditional Image-to-Video Generation with Latent Flow Diffusion Models
A new approach to conditional image-to-video generation (cI2V) using novel latent flow diffusion models (LFDM), which synthesize an optical-flow sequence in the latent space, based on the given condition, to warp the given image.
🖥 Github: https://github.com/nihaomiao/cvpr23_lfdm
⏩ Paper: https://arxiv.org/abs/2303.13744v1
💨 Dataset: https://drive.google.com/file/d/1dRn1wl5TUaZJiiDpIQADt1JJ0_q36MVG/view?usp=share_link
https://t.iss.one/DataScienceT
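The warping step at the heart of LFDM — applying a predicted flow field to an image — can be sketched in plain NumPy. This is a toy nearest-neighbour warp in pixel space, not the paper's latent-space implementation:

```python
import numpy as np

def warp(image: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Backward-warp `image` (H, W) with `flow` (H, W, 2):
    out[y, x] = image[y + dy, x + dx], nearest-neighbour, clipped."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    return image[src_y, src_x]

# Shift a small image one pixel to the left: every output pixel
# samples from one column to its right (edges clip).
img = np.arange(16, dtype=float).reshape(4, 4)
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0  # dx = +1 everywhere
shifted = warp(img, flow)
```

LFDM predicts a whole sequence of such flows (in latent space) from the condition, then warps the given image frame by frame.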
Test of Time: Instilling Video-Language Models with a Sense of Time
GPT-5 will likely have video abilities, but will it have a sense of time? This #CVPR2023 paper from the University of Amsterdam answers that question, showing how to instil a sense of time into video-language foundation models.
Paper:
https://arxiv.org/abs/2301.02074
Code:
https://github.com/bpiyush/TestOfTime
Project Page:
https://bpiyush.github.io/testoftime-website/
https://t.iss.one/DataScienceT
One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer
🖥 Github: https://github.com/IDEA-Research/OSX
⏩ Paper: https://arxiv.org/abs/2303.16160
⭐️ Project: https://osx-ubody.github.io
💨 Dataset: https://paperswithcode.com/dataset/expose
https://t.iss.one/DataScienceT
ViperGPT: Visual Inference via Python Execution for Reasoning
ViperGPT is a framework that leverages code-generation models to compose vision-and-language models into subroutines, producing a result for any visual query.
Github:
https://github.com/cvlab-columbia/viper
Paper:
https://arxiv.org/pdf/2303.08128.pdf
Project:
https://paperswithcode.com/dataset/beat
https://t.iss.one/DataScienceT
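The pattern is easy to sketch: a code-generation model writes a short Python program that composes module calls, and the framework executes it. Here the "LLM" and the vision modules are stubs invented purely for illustration:

```python
def find(image, name):               # stub for an open-vocabulary detector
    return [obj for obj in image["objects"] if obj["name"] == name]

def count(objects):                  # stub for a counting subroutine
    return len(objects)

def fake_codegen(query: str) -> str:
    # A real system would prompt an LLM with the module API and the query.
    return "result = count(find(image, 'muffin'))"

def run_query(image, query):
    program = fake_codegen(query)
    scope = {"image": image, "find": find, "count": count}
    exec(program, scope)             # execute the generated program
    return scope["result"]

image = {"objects": [{"name": "muffin"}, {"name": "muffin"}, {"name": "dog"}]}
answer = run_query(image, "How many muffins are there?")
```

The generated program, not a monolithic model, carries the reasoning — which also makes the answer inspectable step by step.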
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research
Proposes a three-stage processing pipeline for filtering noisy data and generating high-quality captions, in which ChatGPT is leveraged to transform raw descriptions into captions.
🖥 Github: https://github.com/xinhaomei/wavcaps
⏩ Paper: https://arxiv.org/abs/2303.17395v1
💨 Dataset: https://paperswithcode.com/dataset/sounddescs
https://t.iss.one/DataScienceT
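A hedged sketch of what such a three-stage pipeline can look like — filter, clean, rewrite — with a placeholder function standing in for the ChatGPT call (the stage logic here is invented for illustration, not the paper's exact rules):

```python
import re

def stage1_filter(desc: str) -> bool:
    # Drop raw descriptions that are too short or are just URLs.
    return len(desc.split()) >= 3 and not desc.startswith("http")

def stage2_clean(desc: str) -> str:
    # Strip bracketed tags and collapse whitespace.
    return re.sub(r"\s+", " ", re.sub(r"\[.*?\]", "", desc)).strip()

def stage3_caption(desc: str) -> str:
    # Placeholder for the LLM rewrite into a natural caption.
    return f"The sound of {desc.lower().rstrip('.')}."

raw = ["[field recording] Rain falling on a tin roof.", "http://x", "dog"]
captions = [stage3_caption(stage2_clean(d)) for d in raw if stage1_filter(d)]
```

Only the clean, sufficiently descriptive entries survive to the captioning stage.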
DPF: Learning Dense Prediction Fields with Weak Supervision
🖥 Github: https://github.com/cxx226/dpf
⏩ Paper: https://arxiv.org/abs/2303.16890v1
💨 Dataset: https://paperswithcode.com/dataset/pascal-context
https://t.iss.one/DataScienceT
Human Guided Ground-truth Generation for Realistic Image Super-resolution
🖥 Github: https://github.com/chrisdud0257/hggt
⏩ Paper: https://arxiv.org/abs/2303.13069
💨 Dataset: https://paperswithcode.com/dataset/div2k
https://t.iss.one/DataScienceT
ImageNet-E: Benchmarking Neural Network Robustness via Attribute Editing
🖥 Github: https://github.com/alibaba/easyrobust/tree/main/benchmarks/imagenet-e
⏩ Paper: https://arxiv.org/abs/2303.17096v1
💨 Dataset: https://paperswithcode.com/dataset/objectnet
https://t.iss.one/DataScienceT
⚡️Token Merging for Stable Diffusion
Token Merging (ToMe) speeds up transformers by merging redundant tokens, which means the transformer has to do less work.
🖥 Github: https://github.com/dbolya/tomesd
⏩ Paper: https://arxiv.org/abs/2303.17604v1
💨 Blog: https://research.facebook.com/blog/2023/2/token-merging-your-vit-but-faster/
pip install tomesd
https://t.iss.one/DataScienceT
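The merging idea in miniature: repeatedly average the two most similar tokens. This toy is for intuition only — the actual tomesd package implements the paper's bipartite soft matching inside the attention blocks:

```python
import numpy as np

def merge_tokens(tokens: np.ndarray, r: int) -> np.ndarray:
    """Perform r merges: each step averages the most cosine-similar pair."""
    tokens = tokens.copy()
    for _ in range(r):
        x = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
        sim = x @ x.T
        np.fill_diagonal(sim, -np.inf)          # ignore self-similarity
        i, j = np.unravel_index(np.argmax(sim), sim.shape)
        merged = (tokens[i] + tokens[j]) / 2    # average the closest pair
        keep = [k for k in range(len(tokens)) if k not in (i, j)]
        tokens = np.vstack([tokens[keep], merged])
    return tokens

toks = np.array([[1.0, 0.0], [0.99, 0.01], [0.0, 1.0], [0.0, 0.9]])
out = merge_tokens(toks, r=2)   # 4 tokens -> 2 tokens
```

Fewer tokens means quadratic attention has less work to do, which is where the speedup comes from.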
⭐️ HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace
Language serves as an interface for LLMs to connect numerous AI models for solving complicated AI tasks!
🖥 Github: https://github.com/microsoft/JARVIS
⏩ Paper: https://arxiv.org/abs/2303.17580
https://t.iss.one/DataScienceT
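The loop HuggingGPT runs — task planning, model selection, execution, response generation — can be mocked in a few lines; the planner and model zoo below are invented stubs, not the real system:

```python
MODEL_ZOO = {
    "image-classification": lambda inp: "cat",
    "text-to-speech": lambda inp: f"<audio saying '{inp}'>",
}

def plan(request: str) -> list[str]:
    # A real system prompts the LLM to parse the request into tasks.
    tasks = []
    if "what is in" in request.lower():
        tasks.append("image-classification")
    if "say" in request.lower():
        tasks.append("text-to-speech")
    return tasks

def run(request: str, payload):
    result = payload
    for task in plan(request):                 # 1. task planning
        model = MODEL_ZOO[task]                # 2. model selection
        result = model(result)                 # 3. task execution
    return f"Answer: {result}"                 # 4. response generation

out = run("What is in this image? Then say it aloud.", "image.png")
```

Language is the glue: the LLM never runs the models itself, it only decides which ones to chain and in what order.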
WeakTr: Exploring Plain Vision Transformer for Weakly-supervised Semantic Segmentation
🖥 Github: https://github.com/hustvl/weaktr
⏩ Paper: https://arxiv.org/abs/2304.01184v1
💨 Dataset: https://paperswithcode.com/dataset/imagenet
https://t.iss.one/DataScienceT
Segment Anything
The Segment Anything Model (SAM) produces high-quality object masks from input prompts such as points or boxes, and can be used to generate masks for all objects in an image.
🖥 Github: https://github.com/facebookresearch/segment-anything
⭐️ Project: https://segment-anything.com/
⏩ Paper: https://arxiv.org/abs/2304.02643v1
💨 Dataset: https://segment-anything.com/dataset/index.html
https://t.iss.one/DataScienceT
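The promptable interface — a point in, a binary mask out — can be illustrated with a toy: a flood fill over a label image stands in for SAM's image encoder and mask decoder, purely to show the prompt-to-mask contract:

```python
import numpy as np
from collections import deque

def predict_mask(labels: np.ndarray, point: tuple[int, int]) -> np.ndarray:
    """Return the connected region of `labels` containing the point prompt."""
    h, w = labels.shape
    target = labels[point]
    mask = np.zeros((h, w), dtype=bool)
    queue = deque([point])
    while queue:
        y, x = queue.popleft()
        if 0 <= y < h and 0 <= x < w and not mask[y, x] and labels[y, x] == target:
            mask[y, x] = True
            queue.extend([(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)])
    return mask

labels = np.array([[1, 1, 0],
                   [1, 0, 0],
                   [0, 0, 2]])
mask = predict_mask(labels, (0, 0))   # "click" on the top-left object
```

SAM replaces the flood fill with a ViT image encoder and a lightweight mask decoder, but the contract is the same: prompt in, mask out.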
Painter → SegGPT: Vision Foundation Models from BAAI
SegGPT is a generalist model for segmenting everything in context.
🖥 Github: https://github.com/baaivision/painter
⏩ Paper: https://arxiv.org/abs/2304.03284v1
⏩ Demo: https://huggingface.co/spaces/BAAI/SegGPT
💨 Dataset: https://paperswithcode.com/dataset/youtube-vos
https://t.iss.one/DataScienceT
To watch paid channel content
All you have to do is subscribe to the paid channel, which includes many extensive programming courses, as well as highly useful books that are not available for free anywhere else.
To request a subscription: talk to @Hussein_Sheikho
Channel link: https://t.iss.one/+LnCmAFJO3tNmYjUy
⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️
Some time ago we launched a special bot for downloading scientific, software, and mathematics books. The bot contains more than thirty million books, with new books added first, and it can also download all articles and scientific papers for free.
To request a subscription: talk to @Hussein_Sheikho
Instruction Tuning with GPT-4
The first attempt to use GPT-4 to generate instruction-following data for LLM fine-tuning.
🖥 Github: https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM
⏩ Paper: https://arxiv.org/abs/2304.03277v1
⏩ Project: https://instruction-tuning-with-gpt-4.github.io/
https://t.iss.one/DataScienceT
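The generated data follows the Alpaca-style (instruction, input, output) record format; below is a minimal sketch of one record and a fine-tuning prompt built from it (the record contents and prompt template here are illustrative, not taken from the released data):

```python
import json

record = {
    "instruction": "Classify the sentiment of the sentence.",
    "input": "I loved this movie.",
    "output": "positive",
}

def to_prompt(rec: dict) -> str:
    # Build the text the model is fine-tuned on; the "output" field
    # becomes the target completion after "### Response:".
    prompt = f"### Instruction:\n{rec['instruction']}\n"
    if rec["input"]:
        prompt += f"### Input:\n{rec['input']}\n"
    return prompt + "### Response:"

prompt = to_prompt(record)
serialized = json.dumps([record])   # datasets ship as JSON lists of records
```

The paper's contribution is having GPT-4, rather than humans or weaker models, fill in the `output` (and often the `instruction`) fields at scale.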
⚜️ OpenAGI: When LLM Meets Domain Experts
A Reinforcement Learning from Task Feedback (RLTF) mechanism, which uses the task-solving result as feedback to improve the LLM's task-solving ability.
git clone https://github.com/agiresearch/OpenAGI.git
🖥 Github: https://github.com/agiresearch/openagi
⏩ Paper: https://arxiv.org/pdf/2304.04370.pdf
⭐️ Dataset: https://drive.google.com/drive/folders/1AjT6y7qLIMxcmHhUBG5IE1_5SnCPR57e?usp=share_link
https://t.iss.one/DataScienceT
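Result-as-feedback can be illustrated with a deterministic toy bandit: each candidate plan is scored by its task-solving result and reinforced accordingly (the plans and scores here are invented, not the paper's LLM setup):

```python
plans = ["colorizer->caption", "caption->colorizer"]
weights = {p: 1.0 for p in plans}

def task_score(plan: str) -> float:
    # Stand-in for evaluating how well the composed pipeline solved the task.
    return 1.0 if plan == "colorizer->caption" else 0.2

for _ in range(10):                        # feedback loop
    for p in plans:
        weights[p] += 0.5 * task_score(p)  # reward proportional to result

best = max(weights, key=weights.get)       # the better ordering wins out
```

In OpenAGI the "plan" is the model chain the LLM proposes, and the score comes from benchmarking the chain's actual output on the task.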
⭐️ Hard Patches Mining for Masked Image Modeling
We observe that the reconstruction loss can naturally serve as a metric of the difficulty of the pre-training task.
🖥 Github: https://github.com/haochen-wang409/hpm
⏩ Paper: https://arxiv.org/abs/2304.05919v1
⭐️ Dataset: https://paperswithcode.com/dataset/ade20k
https://t.iss.one/DataScienceT
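The observation translates directly into a masking rule: treat per-patch reconstruction loss as a difficulty score and mask the hardest patches. A NumPy toy, assuming precomputed losses rather than the paper's learned loss predictor:

```python
import numpy as np

def hard_patch_mask(recon_loss: np.ndarray, mask_ratio: float) -> np.ndarray:
    """Mask the `mask_ratio` fraction of patches with the highest loss."""
    n = recon_loss.size
    k = int(n * mask_ratio)
    hardest = np.argsort(recon_loss)[-k:]     # indices of the hardest patches
    mask = np.zeros(n, dtype=bool)
    mask[hardest] = True
    return mask

loss = np.array([0.1, 0.9, 0.3, 0.8, 0.2, 0.7, 0.4, 0.6])
mask = hard_patch_mask(loss, mask_ratio=0.5)  # mask the 4 hardest patches
```

Masking what the model reconstructs worst forces pre-training to spend its capacity on the informative, hard-to-predict regions instead of easy textures.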