Multi-instrument Music Synthesis with Spectrogram Diffusion
📝An ideal music synthesizer should be both interactive and expressive, generating high-fidelity audio in realtime for arbitrary combinations of instruments and notes.
https://github.com/magenta/music-spectrogram-diffusion
MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction
📝We adopt a hierarchical query embedding scheme to flexibly encode structured map information and perform hierarchical bipartite matching for map element learning.
https://github.com/hustvl/maptr
GitHub - hustvl/MapTR: [ICLR'23 Spotlight] MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction
YOLOPv2: Better, Faster, Stronger for Panoptic Driving Perception
📝Over the last decade, multi-tasking learning approaches have achieved promising results in solving panoptic driving perception problems, providing both high-precision and high-efficiency performance.
https://github.com/CAIC-AD/YOLOPv2
Topology-aware Convolutional Neural Network for Efficient Skeleton-based Action Recognition
📝In particular, we develop a novel cross-channel feature augmentation module, which is a combo of map-attend-group-map operations.
https://github.com/hikvision-research/skelact
Skeleton-based action recognition models in PyTorch, including Two-Stream CNN, HCN, HCN-Baseline, Ta-CNN and Dynamic GCN - hikvision-research/skelact
FILM: Frame Interpolation for Large Motion
📝Recent methods use multiple networks to estimate optical flow or depth and a separate network dedicated to frame synthesis.
https://github.com/google-research/frame-interpolation
FILM: Frame Interpolation for Large Motion, In ECCV 2022. - google-research/frame-interpolation
Online Decision Transformer
📝Recent work has shown that offline reinforcement learning (RL) can be formulated as a sequence modeling problem (Chen et al., 2021; Janner et al., 2021) and solved via approaches similar to large-scale language modeling.
https://github.com/facebookresearch/online-dt
YOLOX-PAI: An Improved YOLOX Version by PAI
📝We develop an all-in-one computer vision toolbox named EasyCV to facilitate the use of various SOTA computer vision methods.
https://github.com/alibaba/EasyCV
PeRFception: Perception using Radiance Fields
📝The recent progress in implicit 3D representation, i.e., Neural Radiance Fields (NeRFs), has made accurate and photorealistic 3D reconstruction possible in a differentiable manner.
https://github.com/POSTECH-CVLab/PeRFception
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
📝Once the subject is embedded in the output domain of the model, the unique identifier can then be used to synthesize fully-novel photorealistic images of the subject contextualized in different scenes.
https://github.com/XavierXiao/Dreambooth-Stable-Diffusion
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion - XavierXiao/Dreambooth-Stable-Diffusion
CLIP-Mesh: Generating textured meshes from text using pretrained image-text models
📝We present a technique for zero-shot generation of a 3D model using only a target text prompt.
https://github.com/NasirKhalid24/CLIP-Mesh
Official implementation of CLIP-Mesh: Generating textured meshes from text using pretrained image-text models - NasirKhalid24/CLIP-Mesh
A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification
📝Conformal prediction is a user-friendly paradigm for creating statistically rigorous uncertainty sets/intervals for the predictions of such models.
https://github.com/aangelopoulos/conformal-prediction
Lightweight, useful implementation of conformal prediction on real data. - aangelopoulos/conformal-prediction
Transformers are Sample Efficient World Models
📝Deep reinforcement learning agents are notoriously sample inefficient, which considerably limits their application to real-world problems.
https://github.com/eloialonso/iris
Transformers are Sample-Efficient World Models. ICLR 2023, notable top 5%. - eloialonso/iris
Text-Free Learning of a Natural Language Interface for Pretrained Face Generators
📝We propose Fast text2StyleGAN, a natural language interface that adapts pre-trained GANs for text-guided human face synthesis.
https://github.com/duxiaodan/fast_text2stylegan
Official repo of Text-Free Learning of a Natural Language Interface for Pretrained Face Generators - duxiaodan/Fast_text2StyleGAN
Behavior Trees in Robotics and AI: An Introduction
📝A Behavior Tree (BT) is a way to structure the switching between different tasks in an autonomous agent, such as a robot or a virtual entity in a computer game.
https://github.com/BehaviorTree/BehaviorTree.CPP
Behavior Trees Library in C++. Batteries included. - BehaviorTree/BehaviorTree.CPP
FedBN: Federated Learning on Non-IID Features via Local Batch Normalization
📝The emerging paradigm of federated learning (FL) strives to enable collaborative training of deep models on the network edge without centrally aggregating raw data and hence improving data privacy.
https://github.com/adap/flower
GitHub - adap/flower: Flower: A Friendly Federated AI Framework
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech
📝Through the preliminary study on diffusion model parameterization, we find that previous gradient-based TTS models require hundreds or thousands of iterations to guarantee high sample quality, which poses a challenge for accelerating sampling.
https://github.com/Rongjiehuang/ProDiff
PyTorch Implementation of ProDiff (ACM-MM'22) with a Extremely-Fast diffusion speech synthesis pipeline - Rongjiehuang/ProDiff
Text2Light: Zero-Shot Text-Driven HDR Panorama Generation
📝To achieve super-resolution inverse tone mapping, we derive a continuous representation of 360-degree imaging from the LDR panorama as a set of structured latent codes anchored to the sphere.
https://github.com/frozenburning/text2light
GitHub - FrozenBurning/Text2Light: [SIGGRAPH Asia 2022] Text2Light: Zero-Shot Text-Driven HDR Panorama Generation
SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation
📝Notably, SegNeXt outperforms EfficientNet-L2 w/ NAS-FPN and achieves 90.6% mIoU on the Pascal VOC 2012 test leaderboard using only 1/10 of its parameters.
https://github.com/visual-attention-network/segnext
Official Pytorch implementations for "SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation" (NeurIPS 2022) - Visual-Attention-Network/SegNeXt
Robust Speech Recognition via Large-Scale Weak Supervision
📝We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet.
https://github.com/openai/whisper
Diffusion Models: A Comprehensive Survey of Methods and Applications
📝Diffusion models are a class of deep generative models that have shown impressive results on various tasks with a solid theoretical foundation.
https://github.com/YangLing0818/Diffusion-Models-Papers-Survey-Taxonomy
Plenoxels: Radiance Fields without Neural Networks
📝We introduce Plenoxels (plenoptic voxels), a system for photorealistic view synthesis.
https://github.com/kakaobrain/NeRF-Factory
GitHub - kakaobrain/nerf-factory: An awesome PyTorch NeRF library