Data Science by ODS.ai 🦜
First Telegram Data Science channel. Covering all technical and popular stuff related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math, and applications of the former. To reach the editors, contact: @malev
Data Fest returns! 🎉 And pretty soon

📅 From May 22nd until June 19th we are hosting an online Fest, just like last year:

🔸Our YouTube livestreams return to the zoo-forest with 🦙🦌 and, this time, 🐻 a bear cub! (RU)

🔸Unlimited networking in our spatial.chat - May 22nd will be the real community maelstrom (RU & EN)

🔸Tracks on our ODS.AI platform, with new types of activities and tons of new features (RU & EN)

Registration is live! Check out the Data Fest 2021 website for the astonishing tracks we have in our programme and all the details 🤩
GAN Prior Embedded Network for Blind Face Restoration in the Wild

The newly proposed method allows the authors to improve the quality of old photos.

ArXiV: https://arxiv.org/abs/2105.06070
Github: https://github.com/yangxy/GPEN

#GAN #GPEN #blind_face_restoration #CV #DL
Forwarded from Binary Tree
Testing Python Applications with Pytest.

Pytest is a testing framework and test runner for Python. In this guide we will have a look at the most useful and common configurations and usage patterns, including several pytest plugins and external libraries. Although Python comes with a unittest module in the standard library, and there are other Python test frameworks like nose2 or Ward, pytest remains my favourite. The beauty of using simple functions instead of class hierarchies, a single plain assert instead of many different assert functions, built-in parametrized testing, a nice system of fixtures, and the number of available plugins make it a pleasure to use.
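
To make that concrete, here is a minimal, self-contained sketch in the style the guide describes: plain functions, a bare assert, parametrization, and a fixture (the function names are illustrative, not taken from the guide):

```python
import pytest

def add(a, b):
    return a + b

def test_add_simple():
    # one plain assert, no assertEqual and friends
    assert add(2, 3) == 5

@pytest.mark.parametrize("a,b,expected", [(0, 0, 0), (1, -1, 0), (2, 3, 5)])
def test_add_parametrized(a, b, expected):
    # the same test body is run for every parameter tuple
    assert add(a, b) == expected

@pytest.fixture
def numbers():
    # fixtures provide reusable test data or resources
    return [1, 2, 3]

def test_sum_with_fixture(numbers):
    assert sum(numbers) == 6
```

Run it with `pytest -v` in the file's directory; pytest discovers the `test_*` functions automatically.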

#guide #testing #python #pytest
Last Call: register to participate in the EMERGENCY DATAHACK

An online hackathon for data scientists and specialists in the fields of machine learning, geography, and geology.

The best solutions designed by contestants during the event will later be used by the Ministry of the Russian Federation for Civil Defence, Emergencies and Elimination of Consequences of Natural Disasters (EMERCOM).

Contestants will be able to research and analyze data provided by the Ministry for the first time. They will also be able to work with data provided by the event's partners: the Federal Service for Hydrometeorology (Roshydromet), the Federal Road Agency (Rosavtodor), GLONASS BDD, Tele2, Rostelecom, the Federal Water Resources Agency (Rosvodresources), and the Main Directorate for Traffic Safety of Russia.

Date: May 28 – 30
Format: online
Registration: open until May 24 (inclusive)

Link: https://emergencydatahack.ru

The total prize fund for the event is 12,200 USD (paid in the national currency).
​​Long Text Generation by Modeling Sentence-Level and Discourse-Level Coherence

Modern NLP models still struggle with generating long and coherent texts, especially for open-ended generation tasks such as story generation. The authors of the paper suggest a new model architecture, HINT (a generation model equipped with HIgh-level representations for loNg Text generation), with two pre-training objectives to improve language generation models: predicting inter-sentence semantic similarity and distinguishing between normal and shuffled sentence orders. Experiments and ablation studies show that these improvements result in more coherent texts than state-of-the-art baselines.
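
For intuition, here is a toy sketch (our illustration, not the authors' code) of how training pairs for the sentence-order objective could be built: the original passage is a positive example, a shuffled copy is a negative one.

```python
import random

def make_order_examples(sentences, seed=0):
    """Build one positive (natural order) and one negative (shuffled) example."""
    rng = random.Random(seed)
    shuffled = sentences[:]
    while shuffled == sentences and len(sentences) > 1:
        rng.shuffle(shuffled)
    return [(" ".join(sentences), 1),   # natural order  -> label 1
            (" ".join(shuffled), 0)]    # shuffled order -> label 0

examples = make_order_examples(
    ["She opened the door.", "The room was dark.", "She turned on the light."])
for text, label in examples:
    print(label, text)
```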


Paper: https://arxiv.org/abs/2105.08963

Code: https://github.com/thu-coai/HINT

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-hint

#deeplearning #nlp #nlg #pretraining
​​ByT5: Towards a token-free future with pre-trained byte-to-byte models

Pre-trained language models usually operate on sequences of tokens, which are based on words or subword units.

Token-free models operate directly on the raw text (characters or bytes) instead. They can work with any language, are more robust to noise, and don't require preprocessing.

The authors use a modified mT5 architecture and show that their approach is competitive with token-level models.
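
To get a feel for what "token-free" input means, here is a tiny illustration: the model consumes raw UTF-8 bytes, so any script maps onto the same small byte vocabulary (ByT5's actual tokenizer additionally reserves a few special ids, which we skip here).

```python
# Any text, in any language, becomes a sequence over at most 256 byte values.
text = "Привет, ByT5! 🦜"
byte_ids = list(text.encode("utf-8"))
print(byte_ids[:8])                          # first few UTF-8 byte values
print(len(text), "characters ->", len(byte_ids), "bytes")
```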

Paper: https://arxiv.org/abs/2105.13626
Code: https://github.com/google-research/byt5

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-byt5

#nlp #deeplearning #transformer #pretraining
​​CoAtNet: Marrying Convolution and Attention for All Data Sizes

This is a paper by Google Research on combining CNNs and attention for Computer Vision tasks.

The authors unify depthwise convolutions and self-attention via relative attention and vertically stack attention and convolutional layers in a specific way.
The resulting CoAtNets have good generalization, capacity, and efficiency.
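
A rough PyTorch sketch of the stacking idea (our simplification, not the paper's code: plain multi-head attention stands in for the paper's relative attention, and the blocks are stripped down to the essentials).

```python
import torch
import torch.nn as nn

class DepthwiseConvBlock(nn.Module):
    """Residual depthwise-separable conv block, used on high-resolution stages."""
    def __init__(self, channels):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.pw = nn.Conv2d(channels, channels, 1)
        self.norm = nn.BatchNorm2d(channels)

    def forward(self, x):
        return x + self.pw(self.norm(self.dw(x)))

class AttentionBlock(nn.Module):
    """Residual self-attention block, applied to flattened (downsampled) tokens."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):                      # x: (B, N, dim)
        h = self.norm(x)
        return x + self.attn(h, h, h)[0]

# toy usage: conv stage on a feature map, then attention on flattened tokens
x = torch.randn(2, 64, 16, 16)
x = DepthwiseConvBlock(64)(x)
tokens = x.flatten(2).transpose(1, 2)          # (B, 256, 64)
tokens = AttentionBlock(64)(tokens)
print(tokens.shape)
```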

CoAtNet achieves 86.0% ImageNet top-1 accuracy without extra data and 89.77% with extra JFT data, outperforming the prior state of the art of both convolutional networks and Transformers. Notably, when pre-trained with 13M images from ImageNet-21K, CoAtNet achieves 88.56% top-1 accuracy, matching ViT-huge pre-trained with 300M images from JFT while using 23x less data.

Paper: https://arxiv.org/abs/2106.04803

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-coatnet

#cv #deeplearning #transformer #pretraining
Forwarded from Gradient Dude
Chinese researchers are very fond of doing extensive surveys of a particular sub-field of machine learning, listing the main works and the major breakthrough ideas. There are so many articles published every day, and it is impossible to read everything. Therefore, such reviews are valuable (if they are well written, of course, which is quite rare).

Recently there was a very good paper reviewing various variants of Transformers with a focus on language modeling (NLP). This is a must-read for anyone getting into the world of NLP and interested in Transformers. The paper discusses the basic principles of self-attention and such details of modern variants of Transformers as architecture modifications, pre-training, and various applications.

📝Paper: A Survey of Transformers.
Color2Style: Real-Time Exemplar-Based Image Colorization with Self-Reference Learning and Deep Feature Modulation

ArXiV: https://arxiv.org/pdf/2106.08017.pdf

#colorization #dl
​​Semi-Autoregressive Transformer for Image Captioning

Current state-of-the-art image captioning models use autoregressive decoders - they generate one word after another, which leads to heavy latency during inference. Non-autoregressive models predict all the words in parallel; however, they suffer from quality degradation as they remove word dependence excessively.

The authors suggest a semi-autoregressive approach to image captioning to improve the trade-off between speed and quality: the model keeps the autoregressive property globally but generates words in parallel locally. Experiments on MSCOCO show that SATIC can achieve a better trade-off without bells and whistles.
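
A toy sketch of the decoding loop this implies (illustrative only, not the SATIC implementation): groups of K words are emitted one group at a time, while the words within each group come out in parallel.

```python
def semi_autoregressive_decode(predict_group, max_len=12, group_size=3):
    """Decode group-by-group: groups are produced autoregressively,
    but the tokens inside each group are predicted in parallel."""
    tokens = []
    while len(tokens) < max_len:
        group = predict_group(tokens, group_size)   # K tokens at once, given the prefix
        tokens.extend(group)
        if "<eos>" in group:
            break
    return tokens

# dummy group predictor, just to show the control flow
dummy = lambda prefix, k: [f"w{len(prefix) + i}" for i in range(k)]
print(semi_autoregressive_decode(dummy))
```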

Paper: https://arxiv.org/abs/2106.09436

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-satic

#imagecaptioning #deeplearning #transformer
Forwarded from Towards NLP🇺🇦
DocNLI

Natural Language Inference (NLI) is the task of determining whether a “hypothesis” is true (entailment), false (contradiction), or undetermined (neutral) given a “premise”.

Previously, this task was solved for sentence-level texts. A new work, "DocNLI: A Large-scale Dataset for Document-level Natural Language Inference", to appear at ACL 2021, extends the study to document/paragraph-level NLI:
https://arxiv.org/abs/2106.09449v1

In the GitHub repo you can find the data and pretrained RoBERTa weights:
https://github.com/salesforce/DocNLI
For a HuggingFace release we will probably have to wait...
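
In the meantime, the premise/hypothesis interface can be tried with an off-the-shelf sentence-level checkpoint such as roberta-large-mnli (a stand-in, not the DocNLI model; the document-level weights come from the repo above).

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# sentence-level stand-in model; swap in the DocNLI weights from the repo
name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

premise = "The company reported record profits in 2020."
hypothesis = "The company lost money in 2020."

inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)[0]
print({model.config.id2label[i]: round(p.item(), 3) for i, p in enumerate(probs)})
```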

P.S. I am already looking forward to testing this setup for fake news detection 🙃