Machine Learning Projects

High Fidelity Neural Audio Compression

📝We introduce a state-of-the-art, real-time, high-fidelity audio codec leveraging neural networks.
https://github.com/facebookresearch/encodec

GitHub - facebookresearch/encodec: State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Vox-Fusion: Dense Tracking and Mapping with Voxel-based Neural Implicit Representation

📝In this work, we present a dense tracking and mapping system named Vox-Fusion, which seamlessly fuses neural implicit representations with traditional volumetric fusion methods.
https://github.com/zju3dv/vox-fusion

GitHub - zju3dv/Vox-Fusion: Code for "Dense Tracking and Mapping with Voxel-based Neural Implicit Representation", ISMAR 2022

Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training

📝The success of Transformer models has pushed the deep learning model scale to billions of parameters.
https://github.com/hpcaitech/colossalai

GitHub - hpcaitech/ColossalAI: Making large AI models cheaper, faster and more accessible

Towards High-Quality Neural TTS for Low-Resource Languages by Learning Compact Speech Representations

📝Moreover, we optimize the training strategy by leveraging more audio to better learn MSMCRs (Multi-Stage Multi-Codebook Representations) for low-resource languages.
https://github.com/hhguo/msmc-tts

GitHub - hhguo/MSMC-TTS: Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS

Referring Image Matting

📝Image matting refers to extracting the accurate foregrounds in the image.
https://github.com/jizhizili/rim

GitHub - JizhiziLi/RIM: The official repo for the paper "Referring Image Matting".

What Makes Convolutional Models Great on Long Sequence Modeling?

📝We focus on the structure of the convolution kernel and identify two critical but intuitive principles enjoyed by S4 that are sufficient to make up an effective global convolutional model: 1) The parameterization of the convolutional kernel needs to be efficient in the sense that the number of parameters should scale sub-linearly with sequence length.
https://github.com/ctlllll/sgconv
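
The first principle quoted above (a kernel whose parameter count grows sub-linearly with sequence length) can be illustrated with a small sketch. This is not the repository's code: the function names, the nearest-neighbour stretching, and the optional `decay` factor are assumptions made for illustration. Each extra scale adds a fixed number of parameters but doubles the span it covers, so a kernel of length L needs only on the order of log L sub-kernels.

```python
def build_multiscale_kernel(sub_kernels, decay=1.0):
    """Concatenate sub-kernels, stretching the i-th one by 2**max(i - 1, 0).

    `decay` is a hypothetical per-scale damping factor (1.0 disables it).
    """
    kernel = []
    for i, weights in enumerate(sub_kernels):
        repeat = 2 ** max(i - 1, 0)   # how many positions each weight covers
        scale = decay ** i
        for w in weights:
            kernel.extend([scale * w] * repeat)
    return kernel


def causal_conv(signal, kernel):
    """Direct O(L^2) causal convolution: y[t] = sum_s kernel[s] * signal[t - s]."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for s in range(min(t + 1, len(kernel))):
            acc += kernel[s] * signal[t - s]
        out.append(acc)
    return out


# 4 scales x 2 weights = 8 parameters describe a kernel of length 16.
sub_kernels = [[1.0, 0.5], [0.25, 0.25], [0.1, 0.1], [0.05, 0.05]]
kernel = build_multiscale_kernel(sub_kernels)
num_params = sum(len(w) for w in sub_kernels)

# Convolving a unit impulse with the kernel reproduces the kernel itself.
impulse = [1.0] + [0.0] * (len(kernel) - 1)
response = causal_conv(impulse, kernel)
```

Adding a fifth sub-kernel would extend the kernel to length 32 at the cost of two more parameters. The direct double loop is used here only for readability; long sequences would normally call for an FFT-based convolution.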

MetaFormer Baselines for Vision

📝By simply applying depthwise separable convolutions as the token mixer in the bottom stages and vanilla self-attention in the top stages, the resulting model CAFormer sets a new record on ImageNet-1K: it achieves an accuracy of 85.5% at 224x224 resolution, under normal supervised training without external data or distillation.
https://github.com/sail-sg/metaformer

GitHub - sail-sg/metaformer: MetaFormer Baselines for Vision (TPAMI 2024)

Real-Time Target Sound Extraction

📝We present the first neural network model to achieve real-time and streaming target sound extraction.
https://github.com/vb000/waveformer

GitHub - vb000/Waveformer: A deep neural network architecture for low-latency audio processing

Poisson Flow Generative Models

📝We interpret the data points as electrical charges on the $z=0$ hyperplane in a space augmented with an additional dimension $z$, generating a high-dimensional electric field (the gradient of the solution to the Poisson equation).
https://github.com/newbeeer/poisson_flow
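
To make the charge picture concrete, here is a minimal sketch of an empirical field built from a handful of data points. It is an illustration under stated assumptions, not the repository's implementation: the name `poisson_field` is hypothetical, the physical normalisation constant is dropped, and the sign convention is chosen so that the field points away from the charges. Each N-dimensional data point is placed at $z=0$ in an (N+1)-dimensional space, and the contribution of a charge at distance r falls off as 1/r^N in magnitude.

```python
import math


def poisson_field(point, data, eps=1e-12):
    """Unnormalised empirical field at `point` (length N + 1, last entry is z).

    `data` holds N-dimensional points, treated as unit charges on z = 0.
    """
    dim = len(point)                      # N + 1
    field = [0.0] * dim
    for y in data:
        charge = list(y) + [0.0]          # embed the data point at z = 0
        diff = [p - c for p, c in zip(point, charge)]
        dist = math.sqrt(sum(d * d for d in diff))
        weight = 1.0 / (max(dist, eps) ** dim)   # diff / dist**(N + 1)
        for k in range(dim):
            field[k] += weight * diff[k]
    return [f / len(data) for f in field]


# Two charges placed symmetrically in the plane; query a point above their midpoint.
data = [[-1.0, 0.0], [1.0, 0.0]]
field = poisson_field([0.0, 0.0, 1.0], data)
```

By symmetry the in-plane components cancel at the query point and only the z component survives, which matches the intuition that the field pushes away from the data hyperplane.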

GitHub - Newbeeer/Poisson_flow: Code for NeurIPS 2022 Paper, "Poisson Flow Generative Models" (PFGM)

TAP-Vid: A Benchmark for Tracking Any Point in a Video

📝Generic motion understanding from video involves not only tracking objects, but also perceiving how their surfaces deform and move.
https://github.com/deepmind/tapnet

GitHub - google-deepmind/tapnet: Tracking Any Point (TAP)

OneFlow: Redesign the Distributed Deep Learning Framework from Scratch

📝Aiming at a simple, neat redesign of distributed deep learning frameworks for various parallelism paradigms, we present OneFlow, a novel distributed training framework based on an SBP (split, broadcast and partial-value) abstraction and the actor model.
https://github.com/Oneflow-Inc/oneflow
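
The three SBP states named above can be illustrated on a toy matrix multiply. The sketch below does not use OneFlow's API; every helper name is hypothetical, and plain nested lists stand in for tensors placed on two devices. It shows how a logical result can be recovered either by concatenating split shards or by summing partial values.

```python
def matmul(a, b):
    """Plain matrix multiply on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_rows(m, parts):
    step = len(m) // parts
    return [m[i * step:(i + 1) * step] for i in range(parts)]


def split_cols(m, parts):
    step = len(m[0]) // parts
    return [[row[i * step:(i + 1) * step] for row in m] for i in range(parts)]


def elementwise_sum(mats):
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*mats)]


x = [[1, 2, 3, 4], [5, 6, 7, 8]]
w = [[1, 0], [0, 1], [1, 1], [2, 0]]
expected = matmul(x, w)  # the logical (single-device) result

# Case 1: x is split by rows, w is broadcast (each device holds a full copy).
# Each device produces a row shard of the result, so the output is "split".
y_shards = [matmul(x_shard, w) for x_shard in split_rows(x, 2)]
y_from_split = [row for shard in y_shards for row in shard]

# Case 2: x is split by columns and w by rows. Each device produces a
# full-shape matrix holding only part of every sum: a "partial value".
y_partials = [matmul(xs, ws) for xs, ws in zip(split_cols(x, 2), split_rows(w, 2))]
y_from_partial = elementwise_sum(y_partials)
```

In both cases the per-device computation is an ordinary local matmul; only the rule for reassembling the logical tensor differs, which is the kind of bookkeeping an SBP-style abstraction is meant to track.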

Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese

📝The tremendous success of CLIP (Radford et al., 2021) has promoted the research and application of contrastive learning for vision-language pretraining.
https://github.com/ofa-sys/chinese-clip

GitHub - OFA-Sys/Chinese-CLIP: Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

PhaseAug: A Differentiable Augmentation for Speech Synthesis to Simulate One-to-Many Mapping

📝Previous generative adversarial network (GAN)-based neural vocoders are trained to reconstruct the exact ground truth waveform from the paired mel-spectrogram and do not consider the one-to-many relationship of speech synthesis.
https://github.com/mindslab-ai/phaseaug

GitHub - maum-ai/phaseaug: ICASSP 2023 Accepted

Example-Based Named Entity Recognition

📝We present a novel approach to named entity recognition (NER) in the presence of scarce data that we call example-based NER.
https://github.com/sayef/fsner

GitHub - sayef/fsner: Few-shot Named Entity Recognition

Fine-Tuning Language Models from Human Preferences

📝Most work on reward learning has used simulated environments, but complex information about values is often expressed in natural language, and we believe reward learning for language is a key to making RL practical and safe for real-world tasks.
https://github.com/lvwerra/trl

GitHub - huggingface/trl: Train transformer language models with reinforcement learning.

DI-engine

📝OpenDILab Decision AI Engine
https://github.com/opendilab/DI-engine

GitHub - opendilab/DI-engine: OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.

DiffusionInst: Diffusion Model for Instance Segmentation

📝This paper proposes DiffusionInst, a novel framework that represents instances as instance-aware filters and formulates instance segmentation as a noise-to-filter denoising process.
https://github.com/chenhaoxing/DiffusionInst

GitHub - chenhaoxing/DiffusionInst: This repo is the code of paper "DiffusionInst: Diffusion Model for Instance Segmentation" (ICASSP'24).

DAMO-YOLO : A Report on Real-Time Object Detection Design

📝In this report, we present a fast and accurate object detection method dubbed DAMO-YOLO, which achieves higher performance than the state-of-the-art YOLO series.
https://github.com/tinyvision/damo-yolo

GitHub - tinyvision/DAMO-YOLO: DAMO-YOLO: a fast and accurate object detection method with some new techs, including NAS backbones, efficient RepGFPN, ZeroHead, AlignedOTA, and distillation enhancement.

Programming Is Hard -- Or at Least It Used to Be: Educational Opportunities And Challenges of AI Code Generation

📝The introductory programming sequence has been the focus of much research in computing education.
https://github.com/deepmind/code_contests

Images Speak in Images: A Generalist Painter for In-Context Visual Learning

📝In this work, we present Painter, a generalist model which addresses these obstacles with an "image"-centric solution, that is, to redefine the output of core vision tasks as images, and specify task prompts as also images.
https://github.com/baaivision/painter

GitHub - baaivision/Painter: Painter & SegGPT Series: Vision Foundation Models from BAAI

ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency

📝In the learning phase, each agent minimizes the TD error that is dependent on how the subsequent agents have reacted to their chosen action.
https://github.com/opendilab/ace

GitHub - opendilab/ACE: [AAAI 2023] Official PyTorch implementation of paper "ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency".
