Ivy: Templated Deep Learning for Inter-Framework Portability 
We introduce Ivy, a templated Deep Learning (DL) framework which abstracts existing DL frameworks.
https://github.com/ivy-dl/ivy
  Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models 
In RDMs, a set of nearest neighbors is retrieved from an external database during training for each training instance, and the diffusion model is conditioned on these informative samples.
https://github.com/compvis/latent-diffusion
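The retrieval step described above can be sketched as a nearest-neighbor lookup over embeddings. This is a minimal illustration, not code from the latent-diffusion repository: the toy 2-D embeddings, the `retrieve_neighbors` helper, and the choice of cosine similarity are assumptions made for the example, and conditioning the diffusion model on the retrieved samples is not shown.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve_neighbors(query, database, k):
    """Return indices of the k database entries most similar to the query."""
    scored = [(cosine_similarity(query, item), idx)
              for idx, item in enumerate(database)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]

# Hypothetical external database of embeddings (one per stored sample).
database = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [-1.0, 0.0],
]
query = [0.8, 0.1]  # embedding of the current training instance
neighbors = retrieve_neighbors(query, database, k=2)
```

In a real system the linear scan would likely be replaced by an approximate nearest-neighbor index, since the database is meant to be large.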
  Flow-Guided Transformer for Video Inpainting
In the spatial transformer in particular, we design a dual-perspective spatial MHSA that integrates global tokens into the window-based attention.
https://github.com/hitachinsk/fgt
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
📝We develop a procedure for Int8 matrix multiplication for feed-forward and attention projection layers in transformers, which cuts the memory needed for inference by half while retaining full precision performance.
https://github.com/timdettmers/bitsandbytes
  
  GitHub - TimDettmers/bitsandbytes: 8-bit CUDA functions for PyTorch
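The core idea of an Int8 matrix multiplication can be sketched with absmax quantization: scale each row of the left matrix and each column of the right matrix into the int8 range, multiply with integer accumulation, then rescale. This is a simplified sketch in plain Python, not the bitsandbytes API and not the full procedure from the paper; the helper names are hypothetical, and any special handling of outlier values is left out.

```python
def quantize_rows(matrix):
    """Absmax-quantize each row to integers in [-127, 127]; return (ints, scales)."""
    quantized, scales = [], []
    for row in matrix:
        absmax = max(abs(v) for v in row) or 1.0  # avoid dividing by zero
        scale = absmax / 127.0
        quantized.append([round(v / scale) for v in row])
        scales.append(scale)
    return quantized, scales

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def int8_matmul(a, b):
    """Approximate a @ b using int8-range operands and integer accumulation."""
    a_q, a_scales = quantize_rows(a)               # one scale per row of a
    b_q_t, b_scales = quantize_rows(transpose(b))  # one scale per column of b
    out = []
    for row, row_scale in zip(a_q, a_scales):
        out_row = []
        for col, col_scale in zip(b_q_t, b_scales):
            acc = sum(x * y for x, y in zip(row, col))   # integer dot product
            out_row.append(acc * row_scale * col_scale)  # dequantize
        out.append(out_row)
    return out

a = [[0.5, -1.0], [2.0, 0.25]]
b = [[1.0, 0.0], [-0.5, 4.0]]
approx = int8_matmul(a, b)
exact = [[sum(x * y for x, y in zip(row, col)) for col in transpose(b)]
         for row in a]
```

Comparing `approx` with `exact` shows the small rounding error that quantization introduces on this toy input.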
  KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints
📝In this work, we investigate common issues with existing spatial encodings and propose a simple yet highly effective approach to modeling high-fidelity volumetric humans from sparse views.
https://github.com/facebookresearch/KeypointNeRF
  
  Collaborative Neural Rendering using Anime Character Sheets
📝Drawing images of characters with desired poses is an essential but laborious task in anime production.
https://github.com/megvii-research/conr
  
  Deep Patch Visual Odometry
📝We propose Deep Patch Visual Odometry (DPVO), a new deep learning system for monocular Visual Odometry (VO).
https://github.com/princeton-vl/dpvo
  
  StyleFaceV: Face Video Generation via Decomposing and Recomposing Pretrained StyleGAN3
📝Notably, StyleFaceV is capable of generating realistic $1024\times1024$ face videos even without high-resolution training videos.
https://github.com/arthur-qiu/stylefacev
  
  Multi-instrument Music Synthesis with Spectrogram Diffusion 
📝An ideal music synthesizer should be both interactive and expressive, generating high-fidelity audio in realtime for arbitrary combinations of instruments and notes.
https://github.com/magenta/music-spectrogram-diffusion
  
  MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction 
📝We adopt a hierarchical query embedding scheme to flexibly encode structured map information and perform hierarchical bipartite matching for map element learning.
https://github.com/hustvl/maptr
  
  GitHub - hustvl/MapTR: [ICLR'23 Spotlight] MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction
  YOLOPv2: Better, Faster, Stronger for Panoptic Driving Perception 
📝Over the last decade, multi-task learning approaches have achieved promising results in solving panoptic driving perception problems, providing both high-precision and high-efficiency performance.
https://github.com/CAIC-AD/YOLOPv2
  
  Topology-aware Convolutional Neural Network for Efficient Skeleton-based Action Recognition 
📝In particular, we develop a novel cross-channel feature augmentation module, which is a combo of map-attend-group-map operations.
https://github.com/hikvision-research/skelact
  
  GitHub - hikvision-research/skelact: Skeleton-based action recognition models in PyTorch, including Two-Stream CNN, HCN, HCN-Baseline…
  FILM: Frame Interpolation for Large Motion
 
📝Recent methods use multiple networks to estimate optical flow or depth and a separate network dedicated to frame synthesis.
https://github.com/google-research/frame-interpolation
  
  GitHub - google-research/frame-interpolation: FILM: Frame Interpolation for Large Motion, In ECCV 2022.
  Online Decision Transformer 
📝Recent work has shown that offline reinforcement learning (RL) can be formulated as a sequence modeling problem (Chen et al., 2021; Janner et al., 2021) and solved via approaches similar to large-scale language modeling.
https://github.com/facebookresearch/online-dt
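One ingredient of the sequence-modeling formulation in Decision Transformer (Chen et al., 2021) is the return-to-go: the sum of rewards from a timestep to the end of the trajectory, which the model is conditioned on. A minimal sketch follows, using a hypothetical reward list; it is not taken from the online-dt repository.

```python
def returns_to_go(rewards):
    """Suffix sums: entry t is the total reward from timestep t to the end."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running += rewards[t]
        rtg[t] = running
    return rtg

rewards = [1.0, 0.0, 2.0, 1.0]  # hypothetical per-step rewards
rtg = returns_to_go(rewards)    # [4.0, 3.0, 3.0, 1.0]
```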
  
  YOLOX-PAI: An Improved YOLOX Version by PAI 
📝We develop an all-in-one computer vision toolbox named EasyCV to facilitate the use of various SOTA computer vision methods.
https://github.com/alibaba/EasyCV
  
  PeRFception: Perception using Radiance Fields 
📝The recent progress in implicit 3D representation, i.e., Neural Radiance Fields (NeRFs), has made accurate and photorealistic 3D reconstruction possible in a differentiable manner.
https://github.com/POSTECH-CVLab/PeRFception
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation 
📝Once the subject is embedded in the output domain of the model, the unique identifier can then be used to synthesize fully-novel photorealistic images of the subject contextualized in different scenes.
https://github.com/XavierXiao/Dreambooth-Stable-Diffusion
  
  GitHub - XavierXiao/Dreambooth-Stable-Diffusion: Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
  CLIP-Mesh: Generating textured meshes from text using pretrained image-text models 
📝We present a technique for zero-shot generation of a 3D model using only a target text prompt.
https://github.com/NasirKhalid24/CLIP-Mesh
  
  A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification 
📝Conformal prediction is a user-friendly paradigm for creating statistically rigorous uncertainty sets/intervals for the predictions of such models.
https://github.com/aangelopoulos/conformal-prediction
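As a concrete illustration, split conformal prediction for regression can be sketched in a few lines: compute absolute residuals on a held-out calibration set, take a finite-sample-corrected quantile of them, and widen each new prediction by that amount. This is a minimal sketch under stated assumptions, not code from the linked repository; the calibration numbers and helper names are hypothetical, and the coverage guarantee relies on calibration and test points being exchangeable.

```python
import math

def conformal_quantile(scores, alpha):
    """k-th smallest calibration score with k = ceil((n + 1) * (1 - alpha))."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[k - 1]

def prediction_interval(prediction, qhat):
    """Symmetric interval around a point prediction."""
    return (prediction - qhat, prediction + qhat)

# Hypothetical calibration data: model predictions and true labels.
cal_predictions = [2.0, 3.5, 1.0, 4.0, 2.5, 3.0, 5.0, 1.5, 4.5]
cal_labels = [2.3, 3.0, 1.1, 4.8, 2.4, 3.6, 4.0, 1.7, 4.9]

# Conformity score: absolute residual on each calibration point.
scores = [abs(y - p) for p, y in zip(cal_predictions, cal_labels)]
alpha = 0.2  # target miscoverage rate
qhat = conformal_quantile(scores, alpha)
interval = prediction_interval(3.2, qhat)  # interval for a new prediction
```

The same recipe applies to classification by swapping in a different score function; only the quantile step stays the same.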
  
  Transformers are Sample Efficient World Models 
📝Deep reinforcement learning agents are notoriously sample inefficient, which considerably limits their application to real-world problems.
https://github.com/eloialonso/iris
  
  GitHub - eloialonso/iris: Transformers are Sample-Efficient World Models. ICLR 2023, notable top 5%.
  Text-Free Learning of a Natural Language Interface for Pretrained Face Generators 
📝We propose Fast text2StyleGAN, a natural language interface that adapts pre-trained GANs for text-guided human face synthesis.
https://github.com/duxiaodan/fast_text2stylegan
  
  