Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet

16 February 2021
M. O. Topal, Anil Bas, Imke van Heerden
LLMAG · AI4CE

Papers citing "Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet"

20 / 20 papers shown

Language Models are Few-Shot Learners
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
BDL · 739 · 41,894 · 0 · 28 May 2020

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
M. Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdel-rahman Mohamed, Omer Levy, Veselin Stoyanov, Luke Zettlemoyer
AIMat · VLM · 241 · 10,815 · 0 · 29 Oct 2019

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
223 · 7,498 · 0 · 02 Oct 2019

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut
SSL · AIMat · 346 · 6,448 · 0 · 26 Sep 2019

RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov
AIMat · 587 · 24,422 · 0 · 26 Jul 2019

XLNet: Generalized Autoregressive Pretraining for Language Understanding
Zhilin Yang, Zihang Dai, Yiming Yang, J. Carbonell, Ruslan Salakhutdinov, Quoc V. Le
AI4CE · 227 · 8,424 · 0 · 19 Jun 2019

DocBERT: BERT for Document Classification
Ashutosh Adhikari, Achyudh Ram, Raphael Tang, Jimmy J. Lin
LLMAG · VLM · 36 · 298 · 0 · 17 Apr 2019

An Attentive Survey of Attention Models
S. Chaudhari, Varun Mithal, Gungor Polatkan, R. Ramanath
124 · 657 · 0 · 05 Apr 2019

Attention in Natural Language Processing
Andrea Galassi, Marco Lippi, Paolo Torroni
GNN · 53 · 477 · 0 · 04 Feb 2019

BioBERT: a pre-trained biomedical language representation model for biomedical text mining
Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, Jaewoo Kang
OOD · 149 · 5,643 · 0 · 25 Jan 2019

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai, Zhilin Yang, Yiming Yang, J. Carbonell, Quoc V. Le, Ruslan Salakhutdinov
VLM · 216 · 3,726 · 0 · 09 Jan 2019

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
VLM · SSL · SSeg · 1.7K · 94,729 · 0 · 11 Oct 2018

Attention Is All You Need
Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin
3DV · 677 · 131,414 · 0 · 12 Jun 2017

Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
Albert Gatt, E. Krahmer
LM&MA · ELM · 101 · 823 · 0 · 29 Mar 2017

Listen, Attend and Spell
William Chan, Navdeep Jaitly, Quoc V. Le, Oriol Vinyals
RALM · 153 · 2,266 · 0 · 05 Aug 2015

End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results
J. Chorowski, Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio
92 · 470 · 0 · 04 Dec 2014

On the Properties of Neural Machine Translation: Encoder-Decoder Approaches
Kyunghyun Cho, B. V. Merrienboer, Dzmitry Bahdanau, Yoshua Bengio
AI4CE · AIMat · 237 · 6,775 · 0 · 03 Sep 2014

Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio
AIMat · 541 · 27,295 · 0 · 01 Sep 2014

Efficient Estimation of Word Representations in Vector Space
Tomas Mikolov, Kai Chen, G. Corrado, J. Dean
3DV · 655 · 31,490 · 0 · 16 Jan 2013

On the difficulty of training Recurrent Neural Networks
Razvan Pascanu, Tomas Mikolov, Yoshua Bengio
ODL · 190 · 5,342 · 0 · 21 Nov 2012