Forwarded from Towards NLP🇺🇦
NLP Highlights of 2020
by Sebastian Ruder:
1. Scaling up—and down
2. Retrieval augmentation
3. Few-shot learning
4. Contrastive learning
5. Evaluation beyond accuracy
6. Practical concerns of large LMs
7. Multilinguality
8. Image Transformers
9. ML for science
10. Reinforcement learning
https://ruder.io/research-highlights-2020/
ruder.io
ML and NLP Research Highlights of 2020
This post summarizes progress in 10 exciting and impactful directions in ML and NLP in 2020.
S+SSPR Workshop: An online workshop on Statistical techniques in Pattern Recognition and Structural and Syntactic Pattern Recognition.
The event is free to attend and is happening today and tomorrow (online), with a fantastic list of keynotes: Nicholas Carlini, Michael Bronstein, Max Welling, and Fabio Roli, professors and researchers in the fields of geometric deep learning, pattern recognition, and adversarial learning.
Live YouTube Streaming: https://www.youtube.com/channel/UCjA0Mhynad2FDlNaxzqGLhQ
Official Program here: https://www.dais.unive.it/sspr2020/program/
Don't miss it!
Dear AC, who just submitted a link through @opendatasciencebot: can you please submit it once again and include your Telegram handle?
The link you provided is incorrect, and we can't reach you.
Forwarded from Graph Machine Learning
Course: ODS Knowledge Graphs
Michael Galkin is starting a self-paced course on knowledge graphs. For now it's only in Russian, with a plan to offer it in English after the first iteration. The introductory lecture is available on YouTube. You can join the discussion group for all your questions and proposals: @kg_course. The first lecture starts this Thursday; more details in the channel @kg_course.
Course curriculum (a small RDF/SPARQL sketch follows the list):
* Knowledge representations (RDF, RDFS, OWL)
* Storage and queries (SPARQL, Graph DBs)
* Consistency (RDF*, SHACL, ShEx)
* Semantic Data Integration
* Graph theory intro
* KG embeddings
* GNNs for KGs
* Applications: Question Answering, Query Embeddings
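To give a taste of the first two modules, here is a minimal RDF + SPARQL sketch in Python. The use of rdflib and the toy triples are our own assumptions for illustration, not course materials:

```python
# A minimal knowledge graph: build two RDF triples, then query them with SPARQL.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/")  # hypothetical namespace for the example
g = Graph()

# Two triples: "Kyiv is a City" and "Kyiv has population 2,950,000".
g.add((EX.Kyiv, RDF.type, EX.City))
g.add((EX.Kyiv, EX.population, Literal(2950000)))

# A SPARQL SELECT over the same graph.
results = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?city ?pop WHERE {
        ?city a ex:City ;
              ex:population ?pop .
    }
""")
for city, pop in results:
    print(city, pop)
```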
YouTube
Mikhail Galkin - Knowledge Graphs Course Announcement
Mikhail and his colleagues have prepared a course on knowledge graphs: https://ods.ai/tracks/kgcourse2021
A brief course announcement will be given, along with answers to questions from the seminar participants.
Graph Representation Learning (GRL) is one of the fastest-growing topics…
The new year is a good reason to rearrange things
From now on we will post all reports, ML trainings, and other videos in English on the YouTube channel ODS AI Global 🌐. All English videos from Data Fest Online 2020 are already there – check them out and don't forget to subscribe! 👀
P.S.
All content in Russian will be posted on ODS AI RU 🇷🇺 as always.
JigsawGAN: Self-supervised Learning for Solving Jigsaw Puzzles with Generative Adversarial Networks
The authors suggest a GAN-based approach to solving jigsaw puzzles. JigsawGAN is a self-supervised method with a multi-task pipeline: the classification branch classifies jigsaw permutations, while the GAN branch recovers the features into images with the correct piece order.
The proposed method solves jigsaw puzzles efficiently by exploiting semantic information and edge information simultaneously; the self-supervised setup is sketched below.
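To make the pretext task concrete, here is a minimal sketch of the data-generation step only (our own illustration, not the paper's code): an image is cut into a grid and shuffled with a known permutation, which becomes the label for the classification branch. The GAN branch is not reproduced, and the paper classifies over a fixed candidate set of permutations rather than shuffling freely as we do here.

```python
# Generate a jigsaw pretext example: shuffle 3x3 tiles with a known permutation.
import numpy as np

rng = np.random.default_rng(0)

def make_jigsaw(image: np.ndarray, grid: int = 3):
    """Split `image` (H, W, C) into grid*grid tiles and reassemble them in a
    random order. Returns the shuffled image and the permutation, which
    serves as the classification label."""
    h, w = image.shape[0] // grid, image.shape[1] // grid
    tiles = [image[r * h:(r + 1) * h, c * w:(c + 1) * w]
             for r in range(grid) for c in range(grid)]
    perm = rng.permutation(grid * grid)
    rows = [np.concatenate([tiles[perm[r * grid + c]] for c in range(grid)], axis=1)
            for r in range(grid)]
    return np.concatenate(rows, axis=0), perm

image = rng.random((96, 96, 3))      # stand-in for a real photo
shuffled, perm = make_jigsaw(image)
print(shuffled.shape, perm)          # (96, 96, 3) and a permutation of 0..8
```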
Paper: https://arxiv.org/abs/2101.07555
#deeplearning #jigsaw #selfsupervised #gan
Forwarded from Towards NLP🇺🇦
Open Datasets for Research
Last week brought several pieces of news about newly opened datasets for researchers.
1. Twitter opened the “full history of public conversation” for academics (and specifically for academics):
https://www.theverge.com/2021/1/26/22250203/twitter-academic-research-public-tweet-archive-free-access
We can now happily conduct research on social network graphs, user behavior, and fake news (especially fake news🙃) without fighting the Twitter API.
2. Papers with Code is now also Papers with Datasets:
https://www.paperswithcode.com/datasets
Not only for NLP but for all fields, structured for easy search and download.
The Verge
Twitter is opening up its full tweet archive to academic researchers for free
A full searchable archive of public tweets will now be available for free.
ObjectAug: Object-level Data Augmentation for Semantic Image Segmentation
The authors suggest ObjectAug, a method that performs object-level augmentation for semantic image segmentation.
This approach has the following steps:
- decouple the image into individual objects and the background using the semantic labels;
- augment each object separately;
- restore the black area brought by object augmentation using image inpainting;
- assemble the augmented objects and the background.
Because the objects are decoupled, different augmentations can be applied to different categories and combined with image-level augmentation methods; a skeleton of the pipeline is sketched below.
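A hedged skeleton of the four steps for a single object, where `augment` and `inpaint` are placeholders for the paper's geometric augmentations and learned inpainting model (this is our own sketch, not the authors' code):

```python
# ObjectAug skeleton: decouple -> augment object -> inpaint hole -> reassemble.
import numpy as np

def objectaug_single(image, mask, augment, inpaint):
    """image: (H, W, C) floats; mask: (H, W) bool mask of one object.
    augment(img, mask) -> (img, mask); inpaint(img, hole_mask) -> img."""
    # 1) decouple: cut the object out of the image using its semantic mask
    obj_img = np.where(mask[..., None], image, 0.0)
    background = np.where(mask[..., None], 0.0, image)
    # 2) augment the object independently (e.g. shift, scale, flip)
    obj_img, new_mask = augment(obj_img, mask)
    # 3) restore the hole the object left behind via image inpainting
    background = inpaint(background, mask)
    # 4) assemble the augmented object back onto the inpainted background
    return np.where(new_mask[..., None], obj_img, background), new_mask

# Toy usage: mirror the object; "inpaint" by mean-filling the hole.
flip = lambda img, m: (img[:, ::-1], m[:, ::-1])
mean_fill = lambda img, hole: np.where(hole[..., None], img[~hole].mean(), img)

img = np.random.rand(64, 64, 3)
msk = np.zeros((64, 64), dtype=bool)
msk[20:40, 10:30] = True
aug_img, aug_msk = objectaug_single(img, msk, flip, mean_fill)
print(aug_img.shape, int(aug_msk.sum()))   # (64, 64, 3) 400
```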
Paper: https://arxiv.org/abs/2102.00221
#deeplearning #augmentation #imageinpainting #imagesegmentation
Call for speakers for Machine Learning REPA Week 2021
The ML REPA and LeanDS communities are organizing an international online conference, Machine Learning REPA Week 2021.
We are inviting speakers to give talks or workshops on Machine Learning Engineering, Automation, MLOps, and Management topics.
CALL FOR SPEAKERS
Conference language: ENGLISH
Dates: 5 - 11 April 2021 (7 pm - 9 pm Moscow time, GMT+3)
Format: Online (Zoom)
Content: Talks up to 30 min, workshops / demos up to 60 min
Topics: Management, Version Control, Pipelines Automation, MLOps, Testing, Monitoring
Deadline: 15 March 2021
URL to apply: https://mlrepa.com/mlrepa-week-2021
#conference #callforspeakers
Data Science by ODS.ai 🦜
Ultimate post on where to start learning DS The most common request we received through the years was to share insights and advice on how to start a career in data science and to recommend decent courses. Apparently, using the hashtag #wheretostart wasn't enough…
Hands on ML notebook series
Updated our ultimate post with a series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn and TensorFlow.
Link: https://github.com/ageron/handson-ml
#wheretostart #opensource #jupyter
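In the spirit of those notebooks, here is a minimal Scikit-Learn train/evaluate loop (our own toy snippet, not taken from the repo):

```python
# Fit a classifier on the iris dataset and report held-out accuracy.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
print(accuracy_score(y_test, clf.predict(X_test)))
```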
GitHub
GitHub - ageron/handson-ml: ⛔️ DEPRECATED – See https://github.com/ageron/handson-ml3 instead.
neuroplanets
Taken from the channel https://t.iss.one/NeuralShit
Full video: https://www.youtube.com/watch?v=tPyPwW7W1GM
Pie chart naming in different languages.
Credit: https://twitter.com/ElephantEating/status/1360988590814023683
Forwarded from Towards NLP🇺🇦
2020 in ML and NLP publications
By conferences, countries, companies, universities, and the most productive scientists:
https://www.marekrei.com/blog/ml-and-nlp-publications-in-2020/
Marek Rei
ML and NLP Publications in 2020 - Marek Rei
I ran my paper analysis pipeline once again in order to get statistics for 2020. It certainly was an unusual year. While ML and NLP…
Introducing Model Search: An Open Source Platform for Finding Optimal ML Models
#Google has released an open source #AutoML framework capable of hyperparameter tuning and ensembling.
Blog post: https://ai.googleblog.com/2021/02/introducing-model-search-open-source.html
Repo: https://github.com/google/model_search
Towards Causal Representation Learning
Work on how neural networks derive causal variables from low-level observations.
Link: https://arxiv.org/abs/2102.11107
#causallearning #bengio #nn #DL
Real-World Super-Resolution of Face-Images from Surveillance Cameras
Most SR methods are trained on LR (low-resolution) data that is downsampled from HR (high-resolution) data using bicubic interpolation, but real-life LR images are usually different, so models perform worse on them. In this paper, the authors suggest using blur kernels, noise, and JPEG compression artifacts to generate LR images similar to real ones.
Another suggested improvement is using ESRGAN with the VGG-loss replaced by an LPIPS-loss, as well as adding PatchGAN.
In addition, the authors show that the NIMA metric correlates better with human perception (mean opinion rank) than traditional Image Quality Assessment methods do.
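For illustration, here is a hedged sketch of such a degradation pipeline using OpenCV; the blur kernel, noise level, and JPEG quality below are arbitrary values of our choosing, not the paper's settings:

```python
# Degrade an HR crop: blur -> downsample -> Gaussian noise -> JPEG artifacts.
import cv2
import numpy as np

def degrade(hr: np.ndarray, scale: int = 4) -> np.ndarray:
    """hr: uint8 BGR image. Returns a more realistic LR counterpart."""
    # 1) blur (the paper samples varied blur kernels; one Gaussian here)
    img = cv2.GaussianBlur(hr, ksize=(7, 7), sigmaX=1.6)
    # 2) downsample, instead of relying on clean bicubic-only degradation
    img = cv2.resize(img, (hr.shape[1] // scale, hr.shape[0] // scale),
                     interpolation=cv2.INTER_CUBIC)
    # 3) add sensor-like Gaussian noise
    noisy = img.astype(np.float32) + np.random.normal(0, 4, img.shape)
    img = np.clip(noisy, 0, 255).astype(np.uint8)
    # 4) introduce JPEG compression artifacts via an encode/decode round-trip
    _, buf = cv2.imencode(".jpg", img, [cv2.IMWRITE_JPEG_QUALITY, 40])
    return cv2.imdecode(buf, cv2.IMREAD_COLOR)

hr = np.random.randint(0, 256, (128, 128, 3), dtype=np.uint8)  # stand-in crop
print(degrade(hr).shape)  # (32, 32, 3)
```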
Paper: https://arxiv.org/abs/2102.03113
#deeplearning #superresolution #gan #facesuperresolution
Forwarded from Spark in me (Alexander)
Ukrainian Open STT 1000 Hours
Following the path of Open STT in Russian, now you can enjoy a similar dataset in Ukrainian:
- Torrent Link
- GitHub Link
Congratulations to our Ukrainian friends for finally publishing a diverse, easily downloadable dataset!
Their pages / dataset UX is still a bit rough around the edges, but compared with how fast, for example, Common Voice accumulates data (130 hours for Russian and 43 hours for Ukrainian), UA Open STT and Open STT remain the best resources for their respective languages to date.
Also, unlike the majority of STT datasets, which (i) sit behind a paywall or are sponsored by corporations, (ii) have limited scope / domains, or (iii) fit some sort of agenda (i.e. use more GPUs than necessary, use our bloated tools, etc.), this dataset is legit made by real people.
Also, corporations have recently taken up the trend of rehashing publicly available data, which is cool, but unique data is still nowhere to be seen, for obvious reasons (except for Common Voice, which is decent only for English).
#dataset
GitHub
GitHub - snakers4/open_stt: Open STT
Open STT. Contribute to snakers4/open_stt development by creating an account on GitHub.
Forwarded from Towards NLP🇺🇦
Recent Advances in Language Model Fine-tuning
By Sebastian Ruder:
https://ruder.io/recent-advances-lm-fine-tuning/
ruder.io
Recent Advances in Language Model Fine-tuning
This post provides an overview of recent methods to fine-tune large pre-trained language models.