Yandex team talk at NeurIPS. The talk will be most interesting for those working on critical aspects of successful data collection and labeling.
The moderation team will focus on:
- Remoteness. A discussion of the effectiveness and efficiency of remote work on crowdsourcing platforms.
- Fairness. How the working environment (e.g., a crowdsourcing platform) can give workers flexibility in choosing and switching tasks and working hours.
- Mechanisms. A discussion of bilateral mechanisms that not only give performers flexibility, but also guarantee the quality of the result and the efficiency of the process to the customers.
Toloka's Crowd Science Workshop info: https://clck.ru/SNwi3
#NeurIPS2020 #labeling #Yandex
Supporting content decision makers with machine learning
#Netflix shared a post on how they research and prepare data for new title production.
Link: https://netflixtechblog.com/supporting-content-decision-makers-with-machine-learning-995b7b76006f
#NLU #NLP #recommendation #embeddings
🔥Everything You Always Wanted To Know About GitHub (But Were Afraid To Ask)
The ClickHouse team published extensive statistics on GitHub, including the distribution of repositories by star count, top repositories by stars, affinity lists, top labels, and more.
All the data is available for download, with instructions for importing it into ClickHouse.
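As a taste of what the imported data lets you do, here is a toy query in the spirit of their examples (a sketch assuming the table and column are named github_events and event_type, as in the public dataset; adjust names to your import):

SELECT repo_name, count() AS stars
FROM github_events
WHERE event_type = 'WatchEvent'  -- a WatchEvent corresponds to a star on GitHub
GROUP BY repo_name
ORDER BY stars DESC
LIMIT 10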
Link: https://gh.clickhouse.tech/explorer/
#GitHub #ClickHouse #Yandex #statistics #EDA #engineerketing
If you know someone who might like our channel @opendatascience, please invite them or share this post
Forwarded from Graph Machine Learning
GML Newsletter - Issue #5: Was 2020 a good year for graph research?
My new newsletter is out! 🔥 Talking about my predictions for 2020, NeurIPS recordings, ICLR submissions, and a few links that you have probably seen already, my friends!
Forwarded from Karim Iskakov - канал (Karim Iskakov)
A new method from Samsung AI (Moscow) for creating relightable 3D selfies. You have: a single smartphone video of a head with a blinking flash. You get: sharp renderings under any lighting and from any viewpoint.
🌐 saic-violet.github.io/relightable-portrait
📝 arxiv.org/abs/2012.09963
📉 @loss_function_porn
Hey, fellow researchers, engineers and students.
We can recommend another great, frequently updated channel covering machine learning and deep learning: @ai_machinelearning_big_data
Forwarded from Machinelearning
YolactEdge: Real-time Instance Segmentation on the Edge
Github: https://github.com/haotian-liu/yolact_edge
Demo: https://www.youtube.com/watch?v=GBCK9SrcCLM
Paper: https://arxiv.org/abs/2012.12259
@ai_machinelearning_big_data
QVMix and QVMix-Max: Extending the Deep Quality-Value Family of Algorithms to Cooperative Multi-Agent Reinforcement Learning
The paper extends the Deep Quality-Value (DQV) family of algorithms to cooperative multi-agent reinforcement learning and outperforms the #SOTA.
ArXiV: https://arxiv.org/abs/2012.12062
#DQV #RL #Starcraft
Solving Mixed Integer Programs Using Neural Networks
An article on speeding up Mixed Integer Program (MIP) solving with ML. MIPs are usually NP-hard and cover:
- Problems modeled with linear programming plus integrality constraints
- Production planning (pipeline optimization)
- Scheduling / dispatching
Or any problem where integers represent discrete decisions (including some of the graph problems); a toy example follows below.
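To make the problem class concrete, here is a minimal sketch of a tiny MIP solved in Python; it assumes scipy>=1.9 (which introduced milp) and is purely illustrative, not the paper's method:

import numpy as np
from scipy.optimize import milp, LinearConstraint

# Toy MIP: maximize x + 2y subject to x + y <= 4, with x, y >= 0 and integer.
# milp minimizes, so we negate the objective.
c = np.array([-1.0, -2.0])
constraints = LinearConstraint(np.array([[1.0, 1.0]]), -np.inf, 4.0)
res = milp(c, constraints=constraints, integrality=np.ones(2))
print(res.x)  # an optimal integer assignment, here [0. 4.]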
ArXiV: https://arxiv.org/abs/2012.13349
Wikipedia on Mixed Integer Programming: https://en.wikipedia.org/wiki/Integer_programming
#NPhard #MILP #DeepMind #productionml #linearprogramming #optimizationproblem
Forwarded from Machinelearning
🧠 2020: A Year Full of Amazing AI Papers — A Review
https://www.kdnuggets.com/2020/12/2020-amazing-ai-papers.html
@ai_machinelearning_big_data
🔥New breakthrough on text2image generation by #OpenAI
DALL·E: Creating Images from Text
This architecture is capable of understanding style descriptions as well as complex relationships between objects in context.
That opens a whole new perspective for digital agencies, potentially threatening stock photo sites, and creates new opportunities for regulators and lawyers to work on.
Interesting times!
Website: https://openai.com/blog/dall-e/
#GAN #GPT3 #openai #dalle #DL
Forwarded from Towards NLP🇺🇦
🤗 multilingual datasets
- 611 datasets you can download in one line of Python;
- 467 languages covered, 99 with at least 10 datasets;
- efficient pre-processing to free you from memory constraints;
https://github.com/huggingface/datasets
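The advertised one-liner looks like this (a minimal sketch; "squad" is just one example dataset id from the hub):

from datasets import load_dataset

# downloads, caches, and returns the dataset splits
dataset = load_dataset("squad")
print(dataset["train"][0])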
Open Software Packaging for Science
#opensource alternative to #conda.
Mamba (drop-in replacement) direct link: https://github.com/TheSnakePit/mamba
Link: https://medium.com/@QuantStack/open-software-packaging-for-science-61cecee7fc23
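Since it is a drop-in replacement, the CLI mirrors conda's; a quick sketch, assuming a working conda base environment (the package names are just examples):

conda install mamba -n base -c conda-forge
mamba install numpy pandas  # same syntax you would pass to conda install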
#python #packagemanagement
Characterising Bias in Compressed Models
Popular compression techniques turned out to amplify bias in deep neural networks.
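For context, a hedged sketch of one such popular compression technique, magnitude pruning, in PyTorch (illustrative only, not the paper's exact setup):

import torch.nn as nn
import torch.nn.utils.prune as prune

# Zero out the 90% of weights with the smallest magnitude.
layer = nn.Linear(256, 10)
prune.l1_unstructured(layer, name="weight", amount=0.9)
print(float((layer.weight == 0).float().mean()))  # sparsity, roughly 0.9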
ArXiV: https://arxiv.org/abs/2010.03058
#NN #DL #bias
Interactive and explorable explanations
A collection of links to interactive explanations of how things work.
Link: https://explorabl.es
How network effects (ideas, diseases) work: https://meltingasphalt.com/interactive/going-critical/
How trust works: https://ncase.me/trust/
#howstuffworks #explanations
Forwarded from Towards NLP🇺🇦
Choosing Transfer Languages for Cross-Lingual Learning
Given a particular low-resource language and an NLP task, how can we determine which languages we should transfer from?
If we train models on the top K transfer languages suggested by the ranking model and pick the best one, how good is the best model expected to be?
In the era of transfer learning, we no longer have to collect massive data for each language; starting from an already pretrained model, we can achieve good scores by training on smaller data. But how should we choose the language to transfer knowledge from? Is it okay to transfer from English to Chinese, or from Russian to Turkish?
The paper investigates this question. The features the authors created to identify the best transfer language are as follows (two of them are sketched in code after this list):
* Dataset size: as simple as it sounds; do we have enough data in the transfer language relative to the task language?
* Type-token ratio: the lexical diversity of both languages;
* Word overlap and subword overlap: a measure of language similarity; it is very good if the two languages share as many words as possible;
* Geographic distance: do the languages come from territories that are close on the Earth's surface?
* Genetic distance: how close are the languages in the genealogical language tree?
* Inventory distance: how similar do the languages sound (distance between phonological inventories)?
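A rough sketch of the two lexical features under common definitions (the paper's exact formulas may differ):

def type_token_ratio(tokens):
    # lexical diversity: distinct word types over total token count
    return len(set(tokens)) / len(tokens)

def word_overlap(vocab_a, vocab_b):
    # shared vocabulary relative to the combined vocabulary
    return len(vocab_a & vocab_b) / len(vocab_a | vocab_b)

print(word_overlap({"water", "sun", "hand"}, {"water", "hand", "moon"}))  # 0.5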
The idea is pretty simple and clear, but very important for studies of multilingual models.
The post is based on a reading task from the Multilingual NLP course by CMU.