Machine Learning Projects
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale

📝We launch EVA, a vision-centric foundation model to explore the limits of visual representation at scale using only publicly accessible data.

https://github.com/baaivision/eva
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

📝This co-design of self-supervised learning techniques and architectural improvement results in a new model family called ConvNeXt V2, which significantly improves the performance of pure ConvNets on various recognition benchmarks, including ImageNet classification, COCO detection, and ADE20K segmentation.

https://github.com/facebookresearch/convnext-v2
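The architectural improvement behind ConvNeXt V2 is a Global Response Normalization (GRN) layer added to each block to increase feature competition across channels. A minimal PyTorch sketch of the idea, assuming channels-last (N, H, W, C) tensors; the eps constant is illustrative, see the repo for the reference implementation:

```python
import torch
import torch.nn as nn

class GRN(nn.Module):
    """Global Response Normalization (ConvNeXt V2), channels-last input."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1, 1, 1, dim))
        self.beta = nn.Parameter(torch.zeros(1, 1, 1, dim))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Aggregate: per-channel L2 norm over the spatial dimensions.
        gx = torch.norm(x, p=2, dim=(1, 2), keepdim=True)     # (N, 1, 1, C)
        # Normalize: each channel's norm relative to the mean across channels.
        nx = gx / (gx.mean(dim=-1, keepdim=True) + self.eps)  # (N, 1, 1, C)
        # Calibrate: rescale features, with a learnable affine and a residual.
        return self.gamma * (x * nx) + self.beta + x
```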
Cramming: Training a Language Model on a Single GPU in One Day

📝Recent trends in language modeling have focused on increasing performance through scaling, and have resulted in an environment where training language models is out of reach for most researchers and practitioners. This paper asks the opposite question: how far can a transformer-based language model get when trained from scratch on a single GPU for one day?

https://github.com/jonasgeiping/cramming
Muse: Text-To-Image Generation via Masked Generative Transformers

📝Compared to pixel-space diffusion models such as Imagen and DALL-E 2, Muse is significantly more efficient because it uses discrete tokens and needs fewer sampling iterations; compared to autoregressive models such as Parti, Muse is more efficient because it decodes tokens in parallel.

https://github.com/lucidrains/muse-pytorch
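The parallel decoding that makes Muse fast works roughly like this: all image tokens start masked, the transformer predicts every position in a single forward pass, the most confident predictions are committed, and the rest are re-masked on a shrinking schedule. A minimal sketch assuming a MaskGIT-style cosine schedule and a `model(tokens)` stand-in that returns (seq_len, vocab) logits:

```python
import math
import torch

def parallel_decode(model, seq_len, mask_id, steps=12):
    """Iterative parallel decoding sketch: commit confident tokens,
    re-mask the rest, repeat for a small fixed number of steps."""
    tokens = torch.full((seq_len,), mask_id, dtype=torch.long)
    for step in range(1, steps + 1):
        masked = tokens == mask_id
        logits = model(tokens)                      # one parallel forward pass
        conf, pred = logits.softmax(-1).max(-1)     # per-token confidence
        tokens = torch.where(masked, pred, tokens)  # keep committed tokens
        # Cosine schedule: the fraction left masked shrinks to zero.
        n_mask = int(seq_len * math.cos(math.pi / 2 * step / steps))
        if n_mask > 0:
            conf[~masked] = float("inf")            # never re-mask committed tokens
            tokens[conf.argsort()[:n_mask]] = mask_id
    return tokens
```

A dozen or so such passes replace the hundreds of denoising steps of a pixel-space diffusion model or the one-token-at-a-time loop of an autoregressive decoder.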
A Survey for In-context Learning

📝With the increasing ability of large language models (LLMs), in-context learning (ICL) has become a new paradigm for natural language processing (NLP), where LLMs make predictions only based on contexts augmented with a few training examples.

https://github.com/dqxiu/icl_paperlist
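In practice, ICL means prepending a handful of labelled demonstrations to the query and letting the frozen model complete the pattern, with no gradient updates. A toy sentiment-classification example (the template and labels are illustrative):

```python
def build_icl_prompt(demos, query):
    """Few-shot prompt: labelled demonstrations followed by the query."""
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in demos]
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)

demos = [("A delightful, moving film.", "positive"),
         ("Two hours I will never get back.", "negative")]
print(build_icl_prompt(demos, "Surprisingly good."))
```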
BanditPAM: Almost Linear Time k-Medoids Clustering via Multi-Armed Bandits

📝In these experiments, we observe that BanditPAM returns the same results as state-of-the-art PAM-like algorithms up to 4x faster while performing up to 200x fewer distance computations.

https://github.com/ThrunGroup/BanditPAM
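The bandit framing treats each candidate medoid as an arm whose loss is its mean distance to the data; that mean is estimated from sampled reference points with confidence bounds, so weak candidates are eliminated after a few samples instead of a full pass over all n points. A rough NumPy sketch of this idea for the BUILD step's first medoid (batch size, confidence constant, and stopping rule are illustrative, not the paper's):

```python
import numpy as np

def bandit_first_medoid(X, batch=20, delta=1e-3, seed=0):
    """Successive elimination over candidate medoids using sampled distances."""
    rng = np.random.default_rng(seed)
    n = len(X)
    alive = np.arange(n)   # surviving candidate medoids ("arms")
    est = np.zeros(n)      # running mean distance per candidate
    pulls = np.zeros(n)    # reference points sampled per candidate
    while len(alive) > 1 and pulls[alive[0]] < n:
        refs = rng.integers(0, n, size=batch)   # sampled reference points
        d = np.linalg.norm(X[alive][:, None] - X[refs][None], axis=-1)
        est[alive] = (est[alive] * pulls[alive] + d.sum(axis=1)) / (pulls[alive] + batch)
        pulls[alive] += batch
        # Confidence radius; real BanditPAM scales this by an estimated sigma.
        ci = np.sqrt(np.log(1.0 / delta) / pulls[alive])
        best_ucb = (est[alive] + ci).min()
        alive = alive[(est[alive] - ci) <= best_ucb]   # drop dominated arms
    return alive[np.argmin(est[alive])]
```

The full algorithm applies the same elimination inside every BUILD and SWAP step of PAM, which is where the up-to-200x savings in distance computations come from.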
SegGPT: Segmenting Everything In Context

📝We unify various segmentation tasks into a generalist in-context learning framework that accommodates different kinds of segmentation data by transforming them into the same format of images.

https://github.com/baaivision/painter
Instruction Tuning with GPT-4

📝Prior work has shown that finetuning large language models (LLMs) using machine-generated instruction-following data enables such models to achieve remarkable zero-shot capabilities on new tasks, and no human-written instructions are needed.

https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM
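The machine-generated data in the linked repo follows the Alpaca-style (instruction, input, output) schema. A hypothetical record and one common way to render it into a finetuning prompt (the exact template may differ from the repo's):

```python
# Illustrative record; not taken from the released dataset.
example = {
    "instruction": "Summarize the following paragraph in one sentence.",
    "input": "Large language models can follow natural-language instructions ...",
    "output": "LLMs can be steered with plain-language instructions.",
}

def to_prompt(rec):
    """Render one record into a (prompt, target) pair for finetuning."""
    prompt = f"### Instruction:\n{rec['instruction']}\n\n"
    if rec["input"]:
        prompt += f"### Input:\n{rec['input']}\n\n"
    return prompt + "### Response:", rec["output"]

prompt, target = to_prompt(example)
print(prompt, target, sep="\n")
```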
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers

📝This paper proposes GPTQ, a new one-shot weight quantization method based on approximate second-order information that is both highly accurate and highly efficient.

https://github.com/thudm/chatglm-6b
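At its core, GPTQ quantizes a weight matrix one column at a time and uses inverse-Hessian information from a small calibration set to push each column's rounding error onto the columns not yet quantized. A heavily simplified sketch (no blocking, no lazy batching; `h_inv` stands in for the inverse-Hessian factor, which real GPTQ obtains via a Cholesky decomposition):

```python
import torch

def gptq_like(w: torch.Tensor, h_inv: torch.Tensor, bits: int = 4):
    """Column-wise quantization with second-order error feedback (sketch)."""
    w = w.clone()
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().amax(dim=1, keepdim=True) / qmax          # per-row scale
    for j in range(w.shape[1]):
        # Round-to-nearest quantization of one weight column.
        q = (w[:, j] / scale[:, 0]).round().clamp(-qmax - 1, qmax) * scale[:, 0]
        err = (w[:, j] - q) / h_inv[j, j]
        w[:, j] = q
        # Spread this column's rounding error over later columns.
        w[:, j + 1:] -= err[:, None] * h_inv[j, j + 1:][None, :]
    return w
```

Plain round-to-nearest is the special case where the error-feedback line is dropped; the second-order correction is what lets GPTQ stay accurate down to 3-4 bits.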
SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

📝We present SadTalker, which generates 3D motion coefficients (head pose, expression) of the 3DMM from audio and implicitly modulates a novel 3D-aware face renderer for talking head generation.

https://github.com/winfredy/sadtalker

Segment Everything Everywhere All at Once

https://github.com/ux-decoder/segment-everything-everywhere-all-at-once