Multi-scale Multi-band DenseNets for Audio Source Separation
This paper deals with the problem of audio source separation.
https://github.com/Anjok07/ultimatevocalremovergui
GUI for a Vocal Remover that uses Deep Neural Networks. - Anjok07/ultimatevocalremovergui
CelebV-HQ: A Large-Scale Video Facial Attributes Dataset
Large-scale datasets have played indispensable roles in the recent success of face generation/editing and significantly facilitated the advances of emerging research fields.
https://github.com/celebv-hq/celebv-hq
[ECCV 2022] CelebV-HQ: A Large-Scale Video Facial Attributes Dataset - CelebV-HQ/CelebV-HQ
DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection
As a result, DEVIANT is equivariant to the depth translations in the projective manifold whereas vanilla networks are not.
https://github.com/abhi1kumar/deviant
GitHub - abhi1kumar/DEVIANT: [ECCV 2022] Official PyTorch Code of DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection
When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition
Recently, most handwritten mathematical expression recognition (HMER) methods adopt encoder-decoder networks, which directly predict the markup sequences from formula images with an attention mechanism.
https://github.com/lbh1024/can
When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition (ECCV’2022 Poster). - LBH1024/CAN
In Defense of Online Models for Video Instance Segmentation
In recent years, video instance segmentation (VIS) has been largely advanced by offline models, while online models have gradually attracted less attention, possibly due to their inferior performance.
https://github.com/wjf5203/vnext
Next-generation Video instance recognition framework on top of Detectron2 which supports InstMove (CVPR 2023), SeqFormer (ECCV Oral), and IDOL (ECCV Oral) - wjf5203/VNext
Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild
Omni3D re-purposes and combines existing datasets, resulting in 234k images annotated with more than 3 million instances and 97 categories. 3D detection at such scale is challenging due to variations in camera intrinsics and the rich diversity of scene and object types.
https://github.com/facebookresearch/omni3d
YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
📝YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS and has the highest accuracy, 56.8% AP, among all known real-time object detectors running at 30 FPS or higher on a V100 GPU.
https://github.com/wongkinyiu/yolov7
Generative Multiplane Images: Making a 2D GAN 3D-Aware
📝What is really needed to make an existing 2D GAN 3D-aware?
https://github.com/apple/ml-gmpi
GitHub - apple/ml-gmpi: [ECCV 2022, Oral Presentation] Official PyTorch implementation of GMPI
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
📝In this work, we propose RePaint: A Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks.
https://github.com/andreas128/RePaint
Official PyTorch Code and Models of "RePaint: Inpainting using Denoising Diffusion Probabilistic Models", CVPR 2022 - andreas128/RePaint
Reconstructing 3D Human Pose by Watching Humans in the Mirror
In this paper, we introduce the new task of reconstructing 3D human pose from a single image in which we can see the person and the person's image through a mirror.
https://github.com/zju3dv/EasyMocap
Pretraining is All You Need for Image-to-Image Translation
We propose to use pretraining to boost general image-to-image translation.
https://github.com/PITI-Synthesis/PITI
Elucidating the Design Space of Diffusion-Based Generative Models
We argue that the theory and practice of diffusion-based generative models are currently unnecessarily convoluted and seek to remedy the situation by presenting a design space that clearly separates the concrete design choices.
https://github.com/lucidrains/imagen-pytorch
Ivy: Templated Deep Learning for Inter-Framework Portability
We introduce Ivy, a templated Deep Learning (DL) framework which abstracts existing DL frameworks.
https://github.com/ivy-dl/ivy
Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models
In RDMs, a set of nearest neighbors is retrieved from an external database during training for each training instance, and the diffusion model is conditioned on these informative samples.
https://github.com/compvis/latent-diffusion
Flow-Guided Transformer for Video Inpainting
In the spatial transformer in particular, we design a dual-perspective spatial MHSA that integrates global tokens into the window-based attention.
https://github.com/hitachinsk/fgt
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
📝We develop a procedure for Int8 matrix multiplication for feed-forward and attention projection layers in transformers, which cuts the memory needed for inference by half while retaining full-precision performance.
https://github.com/timdettmers/bitsandbytes
GitHub - TimDettmers/bitsandbytes: 8-bit CUDA functions for PyTorch
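To make the idea of Int8 matrix multiplication concrete, here is a minimal pure-Python sketch of generic absmax quantization, where each row of the left matrix and each column of the right matrix gets its own scale. This is an illustration only: it is not the paper's full procedure and not the bitsandbytes API, and the helper names `absmax_quantize` and `int8_matmul` are hypothetical.

```python
def absmax_quantize(vec):
    """Map a list of floats to integers in [-127, 127] plus a scale factor."""
    # Fall back to a scale of 1.0 for an all-zero vector to avoid dividing by zero.
    scale = max(abs(x) for x in vec) / 127.0 or 1.0
    return [round(x / scale) for x in vec], scale


def int8_matmul(a, b):
    """Approximate a @ b using per-row (a) and per-column (b) int8 quantization."""
    a_q = [absmax_quantize(row) for row in a]
    b_q = [absmax_quantize(list(col)) for col in zip(*b)]
    out = []
    for row_q, row_scale in a_q:
        out_row = []
        for col_q, col_scale in b_q:
            # Accumulate in integers, then rescale back to floating point.
            acc = sum(x * y for x, y in zip(row_q, col_q))
            out_row.append(acc * row_scale * col_scale)
        out.append(out_row)
    return out


a = [[1.0, -2.0], [0.5, 4.0]]
b = [[3.0, 0.0], [1.0, -1.0]]
approx = int8_matmul(a, b)  # close to the exact product [[1.0, 2.0], [5.5, -4.0]]
```

The result carries a small rounding error compared with the exact product, which is the trade-off for storing the operands in 8 bits.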
KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints
📝In this work, we investigate common issues with existing spatial encodings and propose a simple yet highly effective approach to modeling high-fidelity volumetric humans from sparse views.
https://github.com/facebookresearch/KeypointNeRF
Collaborative Neural Rendering using Anime Character Sheets
📝Drawing images of characters with desired poses is an essential but laborious task in anime production.
https://github.com/megvii-research/conr
Deep Patch Visual Odometry
📝We propose Deep Patch Visual Odometry (DPVO), a new deep learning system for monocular Visual Odometry (VO).
https://github.com/princeton-vl/dpvo
StyleFaceV: Face Video Generation via Decomposing and Recomposing Pretrained StyleGAN3
📝Notably, StyleFaceV is capable of generating realistic $1024\times1024$ face videos even without high-resolution training videos.
https://github.com/arthur-qiu/stylefacev