Data Science by ODS.ai 🦜

🚀🎉 Another exciting day for multimodal AI! The MiniCPM-V repository by OpenBMB is trending on GitHub.

🤯 Impressive Results:
👉MiniCPM-Llama3-V 2.5 (8B) surpasses GPT-4V, Gemini Pro, & Claude 3
👉MiniCPM-V 2.0 (2B) surpasses Yi-VL 34B, CogVLM-Chat 17B, & Qwen-VL-Chat 10B

MiniCPM-V is efficiently deployable on end-side devices🤖📱 Read more: https://github.com/OpenBMB/MiniCPM-V

🚀 MiniCPM-V builds its demo with Gradio, showcasing the framework's flexibility for creating powerful AI vision apps. Local Gradio demo: https://github.com/OpenBMB/MiniCPM-V?tab=readme-ov-file#webui-demo

@opendatascience

Data Science by ODS.ai 🦜

Forwarded from Machinelearning

🔥🔥🔥 YOLOv10: Real-Time End-to-End Object Detection

⚡️ YOLOv10, a new version of the YOLO object detector, has been released.

It adds real-time end-to-end object detection. The code is released under the GNU GPL v3.0 license.

▪Paper: arxiv.org/pdf/2405.14458
▪Github: https://github.com/THU-MIG/yolov10/
▪Demo: https://huggingface.co/spaces/kadirnar/Yolov10
▪Colab: https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/train-yolov10-object-detection-on-custom-dataset.ipynb#scrollTo=SaKTSzSWnG7s

@ai_machinelearning_big_data

Data Science by ODS.ai 🦜

Forwarded from Machinelearning

⚡️ Qwen2: the coolest open LLM release since Llama 3!

Alibaba has just released its new family of multilingual models, which outperform Llama 3 in many respects.

🤯 Qwen2 comes in 5 sizes and is trained on 29 languages!

5️⃣ Sizes: 0.5B, 1.5B, 7B, 57B-A14B (MoE), 72B.
✅ Context: 32k for 0.5B and 1.5B, 64k for the 57B MoE, 128k for 7B and 72B.
✅ Supports 29 languages.
📜 Released under the Apache 2.0 license, except for the 72B version.

📖 BLOG: https://qwenlm.github.io/blog/qwen2/
🤗 HF collection: https://huggingface.co/collections/Qwen/qwen2-6659360b33528ced941e557f
🤖 ModelScope: https://modelscope.cn/organization/qwen
💻 GitHub: https://github.com/QwenLM/Qwen2

@ai_machinelearning_big_data

Data Science by ODS.ai 🦜

Forwarded from Анализ данных (Data analysis)

🎨 pypalettes: A large (+2500) collection of color maps for matplotlib/seaborn.

Finding the perfect colors for your Python chart can be tricky. Picking colors by hand often means cycling through many unsuitable options.

pypalettes is a new package that provides a collection of more than 2,500 palettes, carefully curated by hundreds of experts.

The app lets you effortlessly browse the palettes and pick the best options.

It takes just two lines of code to import and works with Matplotlib charts.

Find a color palette that makes your chart stand out! 😍

pip install git+https://github.com/JosephBARBIERDARNAL/pypalettes.git

▪GitHub
▪Project

@data_analysis_ml

Data Science by ODS.ai 🦜

Open-MAGVIT2: Democratizing Autoregressive Visual Generation 🔥

VQGAN remains essential in autoregressive visual generation, despite limitations in codebook size and utilization that underestimate its capabilities. MAGVIT2 addresses these issues with a lookup-free technique and a large codebook, showing promising results in image and video generation, and playing a key role in VideoPoet.
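
To make the "lookup-free" idea concrete, here is a minimal sketch of lookup-free quantization as it is commonly described for MAGVIT-v2: each latent dimension is quantized to its sign, and the token index is read off as a binary number, so no nearest-neighbor search over a learned codebook is needed. The function names and the example dimensions are illustrative assumptions, not the Open-MAGVIT2 API.

```python
from typing import List, Tuple


def lookup_free_quantize(latent: List[float]) -> Tuple[List[int], int]:
    """Quantize each latent dimension to -1/+1 and derive the token index.

    Hypothetical helper for illustration only; not part of Open-MAGVIT2.
    """
    # Sign quantization: positive values map to +1, everything else to -1.
    codes = [1 if x > 0 else -1 for x in latent]
    # Treat each +1 as a set bit; dimension i contributes 2**i to the index.
    index = sum((1 << i) for i, c in enumerate(codes) if c == 1)
    return codes, index


def implicit_codebook_size(num_dims: int) -> int:
    """With one bit per dimension, the implicit codebook has 2**num_dims entries."""
    return 1 << num_dims


codes, index = lookup_free_quantize([0.7, -1.2, 0.1, -0.3])
print(codes, index)
print(implicit_codebook_size(18))
```

Because the codebook is implicit, its size grows exponentially with the number of latent dimensions without any embedding table to store or search.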

https://github.com/TencentARC/Open-MAGVIT2

@opendatascience

Data Science by ODS.ai 🦜

Forwarded from Machine learning Interview

🔥 Interview questions on DS, AI, ML, DL, NLP, Python, computer vision.

This large collection of interview questions will help you prepare for interviews in data science, artificial intelligence, machine learning, deep learning, natural language processing, and computer vision.

▪100 Data Science interview questions

▪100 machine learning interview questions for 2024

▪100+ Python interview questions, with walkthroughs of real questions

▪50 computer vision interview questions for 2024

▪50 deep learning interview questions for 2024

▪50 NLP (natural language processing) interview questions for 2024

▪Top 60 R interview questions

@machinelearning_interview

Data Science by ODS.ai 🦜

Yandex introduces YaFSDP, a method for faster and more efficient LLM training

This enhanced version of FSDP significantly improves LLM training efficiency by optimizing memory management, reducing unnecessary computations, and streamlining communication and synchronization. Here’s an overview of YaFSDP based on this Medium article.

How it works:

- Layer sharding: YaFSDP shards entire layers for efficient communication and reduced redundancy, minimizing memory usage across GPUs.
- Buffer pre-allocation: YaFSDP pre-allocates buffers for all necessary data, eliminating inefficiencies. This method uses two buffers for intermediate weights and gradients, alternating between odd and even layers.
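
The buffer alternation described above can be sketched as follows. This is only a toy model under stated assumptions: `LayerBuffers` and its methods are hypothetical names, plain `bytearray` objects stand in for GPU memory, and the real YaFSDP implementation is not shown in this post.

```python
from typing import List


class LayerBuffers:
    """Two pre-allocated buffers reused by alternating layers (illustrative)."""

    def __init__(self, buffer_size: int) -> None:
        # Allocate both buffers once, up front, instead of once per layer.
        self.buffers: List[bytearray] = [
            bytearray(buffer_size),
            bytearray(buffer_size),
        ]

    def buffer_index(self, layer_idx: int) -> int:
        # Even layers use buffer 0, odd layers use buffer 1.
        return layer_idx % 2

    def buffer_for(self, layer_idx: int) -> bytearray:
        # Return the pre-allocated buffer assigned to this layer.
        return self.buffers[self.buffer_index(layer_idx)]


pool = LayerBuffers(buffer_size=1024)
schedule = [pool.buffer_index(i) for i in range(6)]
print(schedule)
```

With two buffers, neighboring layers never share memory, so one layer's data can presumably be prepared while the previous layer is still in use.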

Using CUDA streams, YaFSDP effectively manages concurrent computations and communications. Furthermore, the method ensures that data transfers occur only when necessary and minimizes redundant operations. To optimize memory consumption, YaFSDP employs sharding and efficient buffer use while reducing the number of stored activations.

YaFSDP has demonstrated a speedup of up to 26% over standard FSDP and can save up to 20% of GPU resources. In a pre-training scenario with a 70-billion-parameter model, it can save the resources of approximately 150 GPUs monthly.
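
One way to read these figures together is the back-of-the-envelope calculation below. It is our own arithmetic, not Yandex's: it assumes the 26% speedup is a throughput gain on a fixed amount of work, and the baseline cluster size it derives is an inference, not a number from the post.

```python
def gpu_time_saved(speedup: float) -> float:
    """Fraction of GPU time saved when throughput rises by `speedup` (0.26 = 26%)."""
    return 1.0 - 1.0 / (1.0 + speedup)


def implied_cluster_size(gpus_saved: float, saved_fraction: float) -> float:
    """Baseline GPU count implied by an absolute and a relative saving."""
    return gpus_saved / saved_fraction


saved = gpu_time_saved(0.26)
print(f"{saved:.1%}")
print(round(implied_cluster_size(150, 0.20)))
```

Under that assumption, a 26% speedup corresponds to roughly a fifth less GPU time, which is consistent with the "up to 20%" figure.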

For those interested in implementing this method, Yandex has made it open-source and available on GitHub:
https://github.com/yandex/YaFSDP

More reviews of NLP papers, in Russian, in the Telegram channel @StuffyNLP

Data Science by ODS.ai 🦜

⚡️ BERGEN: A Benchmarking Library for Retrieval-Augmented Generation

Naver introduces a Python library for standardizing RAG experiments and reveals key insights through extensive benchmarking.

📝https://arxiv.org/abs/2407.01102
👨🏽‍💻https://github.com/naver/bergen

@opendatascience

Data Science by ODS.ai 🦜

Meta-prompting Optimized Retrieval-augmented Generation

Proposes a method to enhance RAG by refining retrieved content using meta-prompting optimization, demonstrating a 30% performance improvement in multi-hop QA tasks.

📝https://arxiv.org/abs/2407.03955
👨🏽‍💻https://github.com/nlx-group/rag-meta-prompt

@opendatascience

Data Science by ODS.ai 🦜

Smol Model 🚨: Danube 3 0.5B & 4B LLMs by H2o! 🔥

> Beats Qwen 2 0.5B and competitive with Phi3 4B
> Apache 2.0 licensed checkpoints ⚡
> Uses Llama architecture w/ Mistral tokenizer (32K vocabulary)
> 8192 context length along with Grouped Query Attention
> 4B trained on 6T tokens and 0.5B on 4T tokens with multiple stages

https://huggingface.co/collections/h2oai/h2o-danube3-6687a993641452457854c609

@opendatascience

Data Science by ODS.ai 🦜

⚡️ Google presents YouTube-SL-25

A Large-Scale, Open-Domain Multilingual Sign Language Parallel Corpus

Even for better-studied sign languages like American Sign Language (ASL), data is the bottleneck for machine learning research.

The situation is worse yet for the many other sign languages used by Deaf/Hard of Hearing communities around the world. In this paper, we present YouTube-SL-25, a large-scale, open-domain multilingual corpus of sign language videos with seemingly well-aligned captions drawn from YouTube. With >3000 hours of videos across >25 sign languages, YouTube-SL-25 is a) >3x the size of YouTube-ASL, b) the largest parallel sign language dataset to date, and c) the first or largest parallel dataset for many of its component languages.

We provide baselines for sign-to-text tasks using a unified multilingual multitask model based on T5 and report scores on benchmarks across 4 sign languages. The results demonstrate that multilingual transfer benefits both higher- and lower-resource sign languages within YouTube-SL-25.

https://huggingface.co/papers/2407.11144

@opendatascience

Data Science by ODS.ai 🦜

Forwarded from Machinelearning

🌟 FoleyCrafter: generating sound effects for silent videos.

FoleyCrafter is a method for automatically generating sound effects synchronized with a target video.
The architecture is built on a pre-trained text-to-audio (Text2Audio) model. The system has two key components:

🟢Semantic adapter: uses parallel cross-attention layers to condition audio generation on video features, so that the generated sounds match the visual content semantically.
🟢Temporal controller: a timestamp detector analyzes the video and predicts intervals of sound and silence. A temporal adapter then synchronizes the audio with the video based on the timestamps the detector produces.
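
As a rough illustration of the kind of output a timestamp detector feeds to the temporal adapter, the sketch below merges per-frame sound/silence flags into time intervals. It is a hypothetical post-processing helper written for this note; the function name, the boolean-flag input, and the frame rate are assumptions, not FoleyCrafter code.

```python
from typing import List, Optional, Tuple


def sound_intervals(frame_has_sound: List[bool], fps: float) -> List[Tuple[float, float]]:
    """Merge per-frame sound/silence flags into (start, end) intervals in seconds."""
    intervals: List[Tuple[float, float]] = []
    start: Optional[int] = None
    for i, active in enumerate(frame_has_sound):
        if active and start is None:
            # A sounding segment begins at this frame.
            start = i
        elif not active and start is not None:
            # The segment ended just before this frame.
            intervals.append((start / fps, i / fps))
            start = None
    if start is not None:
        # Close a segment that runs to the end of the video.
        intervals.append((start / fps, len(frame_has_sound) / fps))
    return intervals


flags = [False, True, True, False, False, True]
print(sound_intervals(flags, fps=2.0))
```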

Both components are trainable modules that take video as input for audio synthesis. The Text2Audio model itself stays frozen to preserve its ability to synthesize audio of consistent quality.

The FoleyCrafter developers ran quantitative and qualitative experiments on the VGGSound and AVSync15 datasets, measuring semantic alignment (MKL, CLIP Score, FID) and temporal synchronization (Onset ACC, Onset AP).
FoleyCrafter outperformed existing video-to-audio methods (SpecVQGAN, Diff-Foley, and V2A-Mapper).

▶️ Running locally with the Gradio UI:


# Clone the Repository
git clone https://github.com/open-mmlab/foleycrafter.git

# Navigate to the repository (git clone creates a "foleycrafter" directory)
cd foleycrafter

# Create the Conda environment and install dependencies
conda env create -f requirements/environment.yaml
conda activate foleycrafter

# Install Git LFS
conda install git-lfs
git lfs install

# Download checkpoints 
git clone https://huggingface.co/auffusion/auffusion-full-no-adapter checkpoints/auffusion
git clone https://huggingface.co/ymzhang319/FoleyCrafter checkpoints/

# Run Gradio
python app.py --share

🔗 License: Apache-2.0

🔗Project page
🔗Arxiv
🔗Models on HF
🔗Demo
🔗GitHub [Stars: 272 | Issues: 4 | Forks: 15]

@ai_machinelearning_big_data

#AI #Text2Audio #FoleyCrafter #ML
