Multiface: A Dataset for Neural Face Rendering
Along with the release of the dataset, we conduct ablation studies on how different model architectures affect the model's ability to interpolate to novel viewpoints and expressions.
https://github.com/facebookresearch/multiface
Multi-scale Multi-band DenseNets for Audio Source Separation
This paper deals with the problem of audio source separation.
https://github.com/Anjok07/ultimatevocalremovergui
CelebV-HQ: A Large-Scale Video Facial Attributes Dataset
Large-scale datasets have played indispensable roles in the recent success of face generation/editing and significantly facilitated the advances of emerging research fields.
https://github.com/celebv-hq/celebv-hq
DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection
As a result, DEVIANT is equivariant to depth translations in the projective manifold, whereas vanilla networks are not.
https://github.com/abhi1kumar/deviant
When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition
Recently, most handwritten mathematical expression recognition (HMER) methods adopt encoder-decoder networks, which directly predict markup sequences from formula images via an attention mechanism.
https://github.com/lbh1024/can
In Defense of Online Models for Video Instance Segmentation
In recent years, video instance segmentation (VIS) has been largely advanced by offline models, while online models have gradually attracted less attention, possibly due to their inferior performance.
https://github.com/wjf5203/vnext
Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild
Omni3D repurposes and combines existing datasets, resulting in 234k images annotated with more than 3 million instances across 97 categories. 3D detection at this scale is challenging due to variations in camera intrinsics and the rich diversity of scene and object types.
https://github.com/facebookresearch/omni3d
YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
📝YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS, and has the highest accuracy (56.8% AP) among all known real-time object detectors running at 30 FPS or higher on a V100 GPU.
https://github.com/wongkinyiu/yolov7
Generative Multiplane Images: Making a 2D GAN 3D-Aware
📝What is really needed to make an existing 2D GAN 3D-aware?
https://github.com/apple/ml-gmpi
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
📝In this work, we propose RePaint: a Denoising Diffusion Probabilistic Model (DDPM)-based inpainting approach that is applicable even to extreme masks.
https://github.com/andreas128/RePaint
Reconstructing 3D Human Pose by Watching Humans in the Mirror
In this paper, we introduce the new task of reconstructing 3D human pose from a single image in which we can see the person and the person's image through a mirror.
https://github.com/zju3dv/EasyMocap
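The core geometry behind the mirror setting can be sketched in a few lines. This is an illustrative sketch of the mirror-symmetry constraint, not the paper's actual method or code: the reflected person is the real person mirrored across the mirror plane n·x + d = 0, which couples the two observed poses.

```python
import numpy as np

def reflect(points, n, d):
    """Reflect 3D points across the plane n.x + d = 0 (n is normalized first)."""
    n = n / np.linalg.norm(n)
    dist = points @ n + d               # signed distance of each point to the plane
    return points - 2.0 * dist[:, None] * n

pose = np.array([[0.1, 1.6, 2.0], [0.2, 1.4, 2.1]])   # toy 3D joints
mirror_n, mirror_d = np.array([0.0, 0.0, 1.0]), -3.0  # plane z = 3
mirrored = reflect(pose, mirror_n, mirror_d)
```

Because reflection is an involution, applying it twice recovers the original pose; a reconstruction method can exploit this as a consistency constraint between the person and their mirror image.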
Pretraining is All You Need for Image-to-Image Translation
We propose to use pretraining to boost general image-to-image translation.
https://github.com/PITI-Synthesis/PITI
Elucidating the Design Space of Diffusion-Based Generative Models
We argue that the theory and practice of diffusion-based generative models are currently unnecessarily convoluted and seek to remedy the situation by presenting a design space that clearly separates the concrete design choices.
https://github.com/lucidrains/imagen-pytorch
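The "design space" framing means the noise schedule, the denoiser, and the sampler are independent, swappable choices. A minimal sketch of that modularity, assuming a Karras-style sigma schedule and a plain Euler integration of the probability-flow ODE (the function names and defaults here are illustrative, not the paper's implementation):

```python
import numpy as np

def sigma_schedule(n_steps, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Noise levels interpolated in sigma**(1/rho) space, from high to low."""
    ramp = np.linspace(0, 1, n_steps)
    inv = sigma_max ** (1 / rho) + ramp * (sigma_min ** (1 / rho) - sigma_max ** (1 / rho))
    return inv ** rho

def euler_sample(denoise, x, sigmas):
    """Deterministic Euler steps along the ODE; `denoise(x, sigma)` is any
    learned denoiser, treated here as a black box."""
    for i in range(len(sigmas) - 1):
        d = (x - denoise(x, sigmas[i])) / sigmas[i]  # ODE derivative dx/dsigma
        x = x + d * (sigmas[i + 1] - sigmas[i])      # step to the next noise level
    return x

# Toy usage with an "oracle" denoiser that always predicts the zero image.
sigmas = sigma_schedule(18)
rng = np.random.default_rng(0)
start = rng.standard_normal(4) * sigmas[0]
x0 = euler_sample(lambda x, s: np.zeros_like(x), start, sigmas)
```

Swapping the schedule, the denoiser parameterization, or the integrator changes one function each, which is exactly the separation of concerns the paper advocates.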
Ivy: Templated Deep Learning for Inter-Framework Portability
We introduce Ivy, a templated Deep Learning (DL) framework which abstracts existing DL frameworks.
https://github.com/ivy-dl/ivy
Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models
In retrieval-augmented diffusion models (RDMs), a set of nearest neighbors is retrieved from an external database for each training instance, and the diffusion model is conditioned on these informative samples.
https://github.com/compvis/latent-diffusion
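The retrieval step itself is just nearest-neighbor search in an embedding space. A hypothetical sketch, assuming a precomputed embedding database and cosine similarity (names, shapes, and the choice of similarity are illustrative, not the paper's actual interface):

```python
import numpy as np

def retrieve_neighbors(query, database, k=4):
    """Return the k database rows with highest cosine similarity to `query`."""
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    sims = db @ q                 # cosine similarity to every database entry
    idx = np.argsort(-sims)[:k]   # indices of the k nearest neighbors
    return database[idx]

rng = np.random.default_rng(0)
database = rng.standard_normal((1000, 64))             # stand-in for image embeddings
query = database[42] + 0.01 * rng.standard_normal(64)  # a query close to entry 42
neighbors = retrieve_neighbors(query, database, k=4)
# The diffusion model would then be conditioned on `neighbors`
# (e.g. via cross-attention) rather than on the raw query alone.
```

At test time the same mechanism allows swapping the database, which is what enables text-guided retrieval without retraining the generator.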
Flow-Guided Transformer for Video Inpainting
In particular, in the spatial transformer we design a dual-perspective spatial MHSA that integrates global tokens into the window-based attention.
https://github.com/hitachinsk/fgt