Data Science by ODS.ai 🦜
46K subscribers
677 photos
77 videos
7 files
1.75K links
First Telegram Data Science channel. Covering all technical and popular staff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of former. To reach editors contact: @malev
Download Telegram
​​QVMix and QVMix-Max: Extending the Deep Quality-Value Family of Algorithms to Cooperative Multi-Agent Reinforcement Learning

Paper extends the Deep Quality-Value (DQV) family of al-
gorithms to multi-agent  reinforcement learning and outperforms #SOTA

ArXiV: https://arxiv.org/abs/2012.12062

#DQV #RL #Starcraft
​​Solving Mixed Integer Programs Using Neural Networks

Article on speeding up Mixed Integer Programs with ML. Mixed Integer Programs are usually NP-hard problems:

- Problems solved with linear programming
- Production planning (pipeline optimization)
- Scheduling / Dispatching

Or any problems where integers represent various decisions (including some of the graph problems).

ArXiV: https://arxiv.org/abs/2012.13349
Wikipedia on Mixed Integer Programming: https://en.wikipedia.org/wiki/Integer_programming

#NPhard #MILP #DeepMind #productionml #linearprogramming #optimizationproblem
​​this comic does not exist

horror dataset + stylegan2


pdf book here

#gan #comix #book
​​🔥New breakthrough on text2image generation by #OpenAI

DALL·E: Creating Images from Text

This architecture is capable of understanding style descriptions as well as complex relationship between objects in context.

That opens whole new perspective for digital agencies, potentially threatening stock photo sites and new opportunies for regulations and lawers to work on.

Interesting times!

Website: https://openai.com/blog/dall-e/

#GAN #GPT3 #openai #dalle #DL
​​Characterising Bias in Compressed Models

Popular compression techniques turned out to amplify bias in deep neural networks.

ArXiV: https://arxiv.org/abs/2010.03058

#NN #DL #bias
​​Interactive and explorable explanations

Collection of links to different explanations of how things work.

Link: https://explorabl.es
How network effect (ideas, diseases) works: https://meltingasphalt.com/interactive/going-critical/
How trust works: https://ncase.me/trust/

#howstuffworks #explanations
1
Forwarded from Towards NLP🇺🇦
Choosing Transfer Languages for Cross-Lingual Learning

Given a particular task low-resource language and NLP task, how can we determine which languages we should be performing transfer from?
If we train models on the top K transfer languages suggested by the ranking model and pick the best one, how good is the best model expected to be?

In the era of transfer learning now we have a possibility not to collect the massive data for each language, but using already pretrained model achieve good scores training on smaller data. But how should we choose the language from which we can transfer knowledge? Will it be okay to transfer from English to Chinese or from Russian to Turkish?

The paper investigate on this question. The features the authors created to detect the best transfer language are the follows:

* Dataset Size: as simple as it is — do we have enough data in transfer language with respect to ratio to train language?
* Type-Token Ratio: diversity of both languages;
* Word Overlap and Subword Overlap: kind of similarity of languages; it is very good if both languages have as much the same words as possible;
* Geographic distance: are the languages from the territories that are close on the Earth surface?
* Genetic distance: are they close to each other in terms of language genealogical tree?
* Inventory distance: are they sound familiar?

The idea is pretty simple and clear but very important for studies of multilingual models.

The post is based on reading task from Multilingual NLP course by CMU (from the post).
Forwarded from Towards NLP🇺🇦
NLP Highlights of 2020

by Sebastian Ruder:

1. Scaling up—and down
2. Retrieval augmentation
3. Few-shot learning
4. Contrastive learning
5. Evaluation beyond accuracy
6. Practical concerns of large
7. LMs
8. Multilinguality
9. Image Transformers
10. ML for science
11. Reinforcement learning

https://ruder.io/research-highlights-2020/
S+SSPR Workshop: An online workshop on Statistical techniques in Pattern Recognition and Structural and Syntactic Pattern Recognition.

The event is free to attend, it is happening today and tomorrow (online) with a fantastic list of keynotes: Nicholas Carlini, Michael Bronstein, Max Welling, Fabio Roli — professors and researcher in the field of geometric deep learning, pattern recognition and adversarial learning.

Live YouTube Streaming: https://www.youtube.com/channel/UCjA0Mhynad2FDlNaxzqGLhQ

Official Program here: https://www.dais.unive.it/sspr2020/program/

Don't miss it!
👍3
Dear AC, who just submitted link through @opendatasciencebot, can you please do it once again and include your telegram handle?

The link you provided is incorrect and we can’t reach you
Forwarded from Graph Machine Learning
Course: ODS Knowledge Graphs

Michael Galkin starts a self-paced course on knowledge graphs. For now, it's only in Russian, with the plan to make it in English after the first iteration. The first introduction lecture is available on YouTube. You can join discussion group for all your questions and proposals: @kg_course. The first lecture starts this Thursday, more in the channel @kg_course.

Course curriculum:
* Knowledge representations (RDF, RDFS, OWL)
* Storage and queries (SPARQL, Graph DBs)
* Consistency (RDF*, SHACL, ShEx)
* Semantic Data Integration
* Graph theory intro
* KG embeddings
* GNNs for KGs
* Applications: Question Answering, Query Embeddings
​​The new year is a good reason to rearrange things

From now on we will post all reports, ML trainings, and other videos in English on the YouTube channel ODS AI Global 🌐. All English videos from Data Fest Online 2020 are already there – check them out and don't forget to subscribe! 👀

P.S.
All content in Russian will be posted on ODS AI RU 🇷🇺 as always.
​​JigsawGAN: Self-supervised Learning for Solving Jigsaw Puzzles with Generative Adversarial Networks

The authors suggest a GAN-based approach for solving jigsaw puzzles. JigsawGAN is a self-supervised method with a multi-task pipeline: classification branch classifies jigsaw permutations, GAN branch recovers features to images with the correct order.
The proposed method can solve jigsaw puzzles efficiently by utilizing both semantic information and edge information simultaneously.


Paper: https://arxiv.org/abs/2101.07555

#deeplearning #jigsaw #selfsupervised #gan
Forwarded from Towards NLP🇺🇦
Open Datasets for Research

During last week there were several news about newly open datasets for researchers.

1. Twitter opened “full history of public conversation” for academics (specifically, for academics):
https://www.theverge.com/2021/1/26/22250203/twitter-academic-research-public-tweet-archive-free-access
We can happily conduct researches about social networks graphs, users behavior and fake news (especially fake news🙃) without fighting with Twitter API.

2. Papers with code are now also Papers with Datasets:
https://www.paperswithcode.com/datasets
Not for only NLP, but for all fields structured for easy search and download.