Transformers from scratch
Modern transformers are conceptually simple, so they can be explained in a straightforward way
Blog by Peter Bloem, with PyTorch code: https://peterbloem.nl/blog/transformers
#MachineLearning #PyTorch #Transformers
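For a taste of how simple the core is, here is a minimal single-head self-attention sketch in PyTorch, along the lines of the basic version the blog builds up (simplified; function name ours, see the post for the full multi-head module):

import torch
import torch.nn.functional as F

def basic_self_attention(x):
    # x: (batch, seq_len, dim); queries, keys and values are all x itself,
    # as in the blog's "basic self-attention" starting point
    raw = torch.bmm(x, x.transpose(1, 2))  # (batch, seq_len, seq_len) raw scores
    weights = F.softmax(raw, dim=2)        # each row sums to one
    return torch.bmm(weights, x)           # weighted sum of value vectors

x = torch.randn(2, 5, 16)
print(basic_self_attention(x).shape)  # torch.Size([2, 5, 16])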
Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT
Blog by Victor Sanh: https://medium.com/huggingface/distilbert-8cf3380435b5
#MachineLearning #NLP #Bert #Distillation #Transformers
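The core trick is knowledge distillation: the small student is trained to match the teacher's softened output distribution. A minimal sketch of that loss term (illustrative names and temperature; not the exact DistilBERT training code, which also combines a masked-LM loss and a cosine-embedding loss):

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL divergence between temperature-softened teacher and student
    # distributions, scaled by T**2 as in Hinton et al.'s formulation
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T ** 2)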
The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives
Elena Voita, Rico Sennrich, Ivan Titov
Blog: https://lena-voita.github.io/posts/emnlp19_evolution.html
Paper: https://arxiv.org/abs/1909.01380
#ArtificialIntelligence #MachineLearning #Transformers
"Hierarchical Reinforcement Learning for Open-Domain Dialog"
Abdelrhman Saleh, Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Rosalind Picard: https://arxiv.org/abs/1909.07547
Code: https://github.com/natashamjaques/neural_chat
Bots! https://neural.chat/vhrl_techniques/
#MachineLearning #ReinforcementLearning #Transformers
Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch
By 🤗 Hugging Face: https://huggingface.co/transformers
#Transformers #MachineLearning #NLP
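A quick usage sketch with the library's high-level pipeline API (the exact model downloaded and the printed score are illustrative):

from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # pulls a default pretrained model
print(classifier("Transformers make state-of-the-art NLP easy to use."))
# e.g. [{'label': 'POSITIVE', 'score': 0.9998}]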
Attention? Attention!
Blog by Lilian Weng: https://lilianweng.github.io/lil-log/2018/06/24/attention-attention.html
#MachineLearning #NeuralNetwork #Transformers
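Among the variants the post surveys, the transformer's scaled dot-product attention fits in a few lines; a small PyTorch sketch with distinct queries, keys and values (function and argument names ours):

import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    # q, k: (batch, seq, d_k); v: (batch, seq, d_v)
    scores = torch.bmm(q, k.transpose(1, 2)) / math.sqrt(k.size(-1))
    return torch.bmm(F.softmax(scores, dim=-1), v)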
Transformers: State-of-the-art Natural Language Processing
Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, Jamie Brew: https://arxiv.org/abs/1910.03771
#Transformers #NaturalLanguageProcessing #PyTorch #TensorFlow
Stabilizing Transformers for Reinforcement Learning
Parisotto et al.: https://arxiv.org/abs/1910.06764
#DeepLearning #Transformers #ReinforcementLearning
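A central change in the paper is replacing the transformer's residual connections with a GRU-style gating layer whose bias initializes the gate near the identity. A rough PyTorch sketch of that gating as we read the paper (simplified; parameter names and the bias value are ours):

import torch
import torch.nn as nn

class GRUGate(nn.Module):
    def __init__(self, dim, bias_init=2.0):
        super().__init__()
        self.Wr = nn.Linear(dim, dim, bias=False)
        self.Ur = nn.Linear(dim, dim, bias=False)
        self.Wz = nn.Linear(dim, dim, bias=False)
        self.Uz = nn.Linear(dim, dim, bias=False)
        self.Wg = nn.Linear(dim, dim, bias=False)
        self.Ug = nn.Linear(dim, dim, bias=False)
        # positive bias keeps the gate close to the identity at initialization
        self.bg = nn.Parameter(torch.full((dim,), bias_init))

    def forward(self, x, y):
        # x: skip-path input; y: sublayer (attention or feed-forward) output
        r = torch.sigmoid(self.Wr(y) + self.Ur(x))
        z = torch.sigmoid(self.Wz(y) + self.Uz(x) - self.bg)
        h = torch.tanh(self.Wg(y) + self.Ug(r * x))
        return (1 - z) * x + z * h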
Language Models as Knowledge Bases?
Petroni et al.: https://arxiv.org/abs/1909.01066
#Transformers #NaturalLanguageProcessing #MachineLearning
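The probing setup is essentially cloze-style querying of a pretrained masked language model; a minimal illustration with the Hugging Face fill-mask pipeline (model choice ours; not the paper's own probing code):

from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-cased")
for pred in fill("Dante was born in [MASK]."):
    print(pred["token_str"], round(pred["score"], 3))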