Machine Learning Projects
Multi-instrument Music Synthesis with Spectrogram Diffusion

📝An ideal music synthesizer should be both interactive and expressive, generating high-fidelity audio in real time for arbitrary combinations of instruments and notes.

https://github.com/magenta/music-spectrogram-diffusion
YOLOPv2: Better, Faster, Stronger for Panoptic Driving Perception

📝Over the last decade, multi-task learning approaches have achieved promising results in solving panoptic driving perception problems, providing both high-precision and high-efficiency performance.

https://github.com/CAIC-AD/YOLOPv2
Online Decision Transformer

📝Recent work has shown that offline reinforcement learning (RL) can be formulated as a sequence modeling problem (Chen et al., 2021; Janner et al., 2021) and solved via approaches similar to large-scale language modeling.

https://github.com/facebookresearch/online-dt
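A minimal sketch of how an RL trajectory can be turned into a sequence for a language-model-style predictor, assuming the return-to-go conditioning described by Chen et al. (2021). The function names `returns_to_go` and `interleave` are hypothetical and are not taken from the linked repository.

```python
def returns_to_go(rewards, gamma=1.0):
    """Return, for every timestep, the (discounted) sum of rewards from that step onward."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each entry reuses the sum already computed for the next step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


def interleave(rtg, states, actions):
    """Flatten a trajectory into (kind, value) tokens: return-to-go, state, action, repeated."""
    tokens = []
    for r, s, a in zip(rtg, states, actions):
        tokens.extend([("return_to_go", r), ("state", s), ("action", a)])
    return tokens


# Illustrative trajectory; the state and action labels are made up.
rewards = [1, 0, 2]
sequence = interleave(returns_to_go(rewards), ["s0", "s1", "s2"], ["a0", "a1", "a2"])
```

A sequence model trained on such tokens predicts each action from the tokens that precede it, so asking for a high return-to-go at test time steers it toward high-reward behavior.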
PeRFception: Perception using Radiance Fields

📝The recent progress in implicit 3D representation, i.e., Neural Radiance Fields (NeRFs), has made accurate and photorealistic 3D reconstruction possible in a differentiable manner.

https://github.com/POSTECH-CVLab/PeRFception
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

📝Once the subject is embedded in the output domain of the model, its unique identifier can be used to synthesize entirely novel photorealistic images of the subject in different scenes.

https://github.com/XavierXiao/Dreambooth-Stable-Diffusion
A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification

📝Conformal prediction is a user-friendly paradigm for creating statistically rigorous uncertainty sets/intervals for the predictions of machine learning models.
https://github.com/aangelopoulos/conformal-prediction
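A minimal sketch of split conformal prediction for regression, assuming absolute residuals as the conformity score and a held-out calibration set. The function names are hypothetical and do not come from the linked repository.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: only an unbounded interval is valid.
        return math.inf
    return sorted(scores)[k - 1]


def split_conformal_interval(cal_preds, cal_targets, new_pred, alpha=0.1):
    """Build an interval around new_pred that targets coverage of at least 1 - alpha."""
    # Conformity score: absolute error of the model on each calibration point.
    scores = [abs(y - p) for p, y in zip(cal_preds, cal_targets)]
    q = conformal_quantile(scores, alpha)
    return new_pred - q, new_pred + q
```

The coverage guarantee is marginal and relies on the calibration points and the new point being exchangeable; the underlying model is treated as a black box throughout.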
Transformers are Sample Efficient World Models

📝Deep reinforcement learning agents are notoriously sample inefficient, which considerably limits their application to real-world problems.
https://github.com/eloialonso/iris
Behavior Trees in Robotics and AI: An Introduction

📝A Behavior Tree (BT) is a way to structure the switching between different tasks in an autonomous agent, such as a robot or a virtual entity in a computer game.
https://github.com/BehaviorTree/BehaviorTree.CPP
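A minimal sketch of how a behavior tree switches between tasks, assuming the common three-valued tick (success, failure, running) and the two standard composite nodes, Sequence and Fallback. This is plain Python for illustration only and does not use the BehaviorTree.CPP API; the class names and the door example are hypothetical.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3


class Action:
    """Leaf node: wraps a callable that returns a Status when ticked."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def tick(self):
        return self.fn()


class Sequence:
    """Ticks children in order; stops at the first child that does not succeed."""

    def __init__(self, children):
        self.children = children

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != Status.SUCCESS:
                return status
        return Status.SUCCESS


class Fallback:
    """Ticks children in order; stops at the first child that does not fail."""

    def __init__(self, children):
        self.children = children

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != Status.FAILURE:
                return status
        return Status.FAILURE


# Illustrative tree: pass through the door if it is open, otherwise unlock it first.
tree = Fallback([
    Action("door_is_open", lambda: Status.FAILURE),
    Sequence([
        Action("have_key", lambda: Status.SUCCESS),
        Action("unlock_door", lambda: Status.RUNNING),
    ]),
])
result = tree.tick()
```

Because the whole tree is re-ticked from the root on every cycle, a higher-priority branch can take over as soon as its condition changes, which is what makes the task switching reactive.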
FedBN: Federated Learning on Non-IID Features via Local Batch Normalization

📝The emerging paradigm of federated learning (FL) strives to enable collaborative training of deep models on the network edge without centrally aggregating raw data, thereby improving data privacy.
https://github.com/adap/flower
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech

📝Through the preliminary study on diffusion model parameterization, we find that previous gradient-based TTS models require hundreds or thousands of iterations to guarantee high sample quality, which poses a challenge for accelerating sampling.
https://github.com/Rongjiehuang/ProDiff
Robust Speech Recognition via Large-Scale Weak Supervision

📝We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet.

https://github.com/openai/whisper