1909.11942
Cited By
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
Papers citing
"ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"
50 / 2,913 papers shown
Title
What Happens To BERT Embeddings During Fine-tuning?
Amil Merchant
Elahe Rahimtoroghi
Ellie Pavlick
Ian Tenney
20
176
0
29 Apr 2020
General Purpose Text Embeddings from Pre-trained Language Models for Scalable Inference
Jingfei Du
Myle Ott
Haoran Li
Xing Zhou
Veselin Stoyanov
AI4CE
6
10
0
29 Apr 2020
Zero-Shot Learning and its Applications from Autonomous Vehicles to COVID-19 Diagnosis: A Review
Mahdi Rezaei
Mahsa Shahidi
24
53
0
29 Apr 2020
Benchmarking Robustness of Machine Reading Comprehension Models
Chenglei Si
Ziqing Yang
Yiming Cui
Wentao Ma
Ting Liu
Shijin Wang
ELM
AAML
21
42
0
29 Apr 2020
Knowledgeable Dialogue Reading Comprehension on Key Turns
Junlong Li
ZhuoSheng Zhang
Hai Zhao
37
1
0
29 Apr 2020
BURT: BERT-inspired Universal Representation from Twin Structure
Yian Li
Hai Zhao
19
0
0
29 Apr 2020
Revisiting Pre-Trained Models for Chinese Natural Language Processing
Yiming Cui
Wanxiang Che
Ting Liu
Bing Qin
Shijin Wang
Guoping Hu
34
682
0
29 Apr 2020
DomBERT: Domain-oriented Language Model for Aspect-based Sentiment Analysis
Hu Xu
Bing-Quan Liu
Lei Shu
Philip S. Yu
8
51
0
28 Apr 2020
ColBERT: Using BERT Sentence Embedding in Parallel Neural Networks for Computational Humor
Issa Annamoradnejad
Gohar Zoghi
32
25
0
27 Apr 2020
Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting
Sanyuan Chen
Yutai Hou
Yiming Cui
Wanxiang Che
Ting Liu
Xiangzhan Yu
KELM
CLL
21
212
0
27 Apr 2020
Cross-lingual Information Retrieval with BERT
Zhuolin Jiang
A. El-Jaroudi
William Hartmann
Damianos G. Karakos
Lingjun Zhao
28
55
0
24 Apr 2020
Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order
Yi-Lun Liao
Xin Jiang
Qun Liu
25
40
0
24 Apr 2020
UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer Networks for Offensive Language Detection
Gregor Wiedemann
Seid Muhie Yimam
Christian Biemann
17
28
0
23 Apr 2020
DuReader_robust: A Chinese Dataset Towards Evaluating Robustness and Generalization of Machine Reading Comprehension in Real-World Applications
Hongxuan Tang
Hongyu Li
Jing Liu
Yu Hong
Hua Wu
Haifeng Wang
11
18
0
23 Apr 2020
QURIOUS: Question Generation Pretraining for Text Generation
Shashi Narayan
Gonçalo Simões
Ji Ma
Hannah Craighead
Ryan T. McDonald
37
15
0
23 Apr 2020
Experience Grounds Language
Yonatan Bisk
Ari Holtzman
Jesse Thomason
Jacob Andreas
Yoshua Bengio
...
Angeliki Lazaridou
Jonathan May
Aleksandr Nisnevich
Nicolas Pinto
Joseph P. Turian
21
351
0
21 Apr 2020
Attention is Not Only a Weight: Analyzing Transformers with Vector Norms
Goro Kobayashi
Tatsuki Kuribayashi
Sho Yokoi
Kentaro Inui
30
15
0
21 Apr 2020
A Generic Network Compression Framework for Sequential Recommender Systems
Yang Sun
Fajie Yuan
Ming Yang
Guoao Wei
Zhou Zhao
Duo Liu
26
54
0
21 Apr 2020
Investigating the Effectiveness of Representations Based on Pretrained Transformer-based Language Models in Active Learning for Labelling Text Datasets
Jinghui Lu
B. MacNamee
20
19
0
21 Apr 2020
Fine-tuning Multi-hop Question Answering with Hierarchical Graph Network
Guanming Xiong
26
0
0
20 Apr 2020
The Cost of Training NLP Models: A Concise Overview
Or Sharir
Barak Peleg
Y. Shoham
40
210
0
19 Apr 2020
ETC: Encoding Long and Structured Inputs in Transformers
Joshua Ainslie
Santiago Ontanon
Chris Alberti
Vaclav Cvicek
Zachary Kenneth Fisher
Philip Pham
Anirudh Ravula
Sumit Sanghai
Qifan Wang
Li Yang
20
54
0
17 Apr 2020
Highway Transformer: Self-Gating Enhanced Self-Attentive Networks
Yekun Chai
Jin Shuo
Xinwen Hou
23
16
0
17 Apr 2020
Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning
Joongbo Shin
Yoonhyung Lee
Seunghyun Yoon
Kyomin Jung
OOD
23
12
0
17 Apr 2020
Transform and Tell: Entity-Aware News Image Captioning
Alasdair Tran
A. Mathews
Lexing Xie
VLM
17
95
0
17 Apr 2020
Training with Quantization Noise for Extreme Model Compression
Angela Fan
Pierre Stock
Benjamin Graham
Edouard Grave
Remi Gribonval
Hervé Jégou
Armand Joulin
MQ
24
242
0
15 Apr 2020
lamBERT: Language and Action Learning Using Multimodal BERT
Kazuki Miyazawa
Tatsuya Aoki
Takato Horii
Takayuki Nagai
SSL
LM&Ro
16
12
0
15 Apr 2020
TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue
Chien-Sheng Wu
Guosheng Lin
R. Socher
Caiming Xiong
28
319
0
15 Apr 2020
Cascade Neural Ensemble for Identifying Scientifically Sound Articles
Ashwin Karthik Ambalavanan
M. Devarakonda
6
0
0
13 Apr 2020
Robustly Pre-trained Neural Model for Direct Temporal Relation Extraction
Hong Guan
Jianfu Li
Hua Xu
M. Devarakonda
10
10
0
13 Apr 2020
Pretrained Transformers Improve Out-of-Distribution Robustness
Dan Hendrycks
Xiaoyuan Liu
Eric Wallace
Adam Dziedzic
R. Krishnan
D. Song
OOD
15
429
0
13 Apr 2020
CLUE: A Chinese Language Understanding Evaluation Benchmark
Liang Xu
Hai Hu
Xuanwei Zhang
Lu Li
Chenjie Cao
...
Cong Yue
Xinrui Zhang
Zhen-Yi Yang
Kyle Richardson
Zhenzhong Lan
ELM
45
377
0
13 Apr 2020
Explaining Question Answering Models through Text Generation
Veronica Latcinnik
Jonathan Berant
LRM
16
51
0
12 Apr 2020
Multimodal Categorization of Crisis Events in Social Media
Mahdi Abavisani
Liwei Wu
Shengli Hu
Joel R. Tetreault
A. Jaimes
29
87
0
10 Apr 2020
Designing Precise and Robust Dialogue Response Evaluators
Tianyu Zhao
Divesh Lala
Tatsuya Kawahara
19
53
0
10 Apr 2020
Telling BERT's full story: from Local Attention to Global Aggregation
Damian Pascual
Gino Brunner
Roger Wattenhofer
25
19
0
10 Apr 2020
Injecting Numerical Reasoning Skills into Language Models
Mor Geva
Ankit Gupta
Jonathan Berant
AIMat
LRM
12
220
0
09 Apr 2020
Generating Counter Narratives against Online Hate Speech: Data and Strategies
Serra Sinem Tekiroğlu
Yi-Ling Chung
Marco Guerini
14
108
0
08 Apr 2020
DynaBERT: Dynamic BERT with Adaptive Width and Depth
Lu Hou
Zhiqi Huang
Lifeng Shang
Xin Jiang
Xiao Chen
Qun Liu
MQ
20
319
0
08 Apr 2020
Analyzing Redundancy in Pretrained Transformer Models
Fahim Dalvi
Hassan Sajjad
Nadir Durrani
Yonatan Belinkov
22
2
0
08 Apr 2020
On the Effect of Dropping Layers of Pre-trained Transformer Models
Hassan Sajjad
Fahim Dalvi
Nadir Durrani
Preslav Nakov
31
132
0
08 Apr 2020
DialBERT: A Hierarchical Pre-Trained Model for Conversation Disentanglement
Tianda Li
Jia-Chen Gu
Xiao-Dan Zhu
Quan Liu
Zhenhua Ling
Zhiming Su
Si Wei
29
27
0
08 Apr 2020
Towards Evaluating the Robustness of Chinese BERT Classifiers
Wei Ping
Boyuan Pan
Xin Li
Bo-wen Li
AAML
26
8
0
07 Apr 2020
Byte Pair Encoding is Suboptimal for Language Model Pretraining
Kaj Bostrom
Greg Durrett
14
200
0
07 Apr 2020
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering
Changmao Li
Jinho Choi
9
26
0
07 Apr 2020
A Few Topical Tweets are Enough for Effective User-Level Stance Detection
Younes Samih
Kareem Darwish
11
7
0
07 Apr 2020
Deep Learning Based Text Classification: A Comprehensive Review
Shervin Minaee
Nal Kalchbrenner
Min Zhang
Narjes Nikzad
M. Asgari-Chenaghlu
Jianfeng Gao
AILaw
VLM
AI4TS
19
1,090
0
06 Apr 2020
Continual Domain-Tuning for Pretrained Language Models
Subendhu Rongali
Abhyuday N. Jagannatha
Bhanu Pratap Singh Rawat
Hong-ye Yu
CLL
KELM
6
7
0
05 Apr 2020
FastBERT: a Self-distilling BERT with Adaptive Inference Time
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Haotang Deng
Qi Ju
57
354
0
05 Apr 2020
Finding Black Cat in a Coal Cellar -- Keyphrase Extraction & Keyphrase-Rubric Relationship Classification from Complex Assignments
Manikandan Ravikiran
6
0
0
03 Apr 2020