ResearchTrend.AI
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
    SSL
    AIMat

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,913 papers shown
What Happens To BERT Embeddings During Fine-tuning?
Amil Merchant
Elahe Rahimtoroghi
Ellie Pavlick
Ian Tenney
20
176
0
29 Apr 2020
General Purpose Text Embeddings from Pre-trained Language Models for Scalable Inference
Jingfei Du
Myle Ott
Haoran Li
Xing Zhou
Veselin Stoyanov
AI4CE
6
10
0
29 Apr 2020
Zero-Shot Learning and its Applications from Autonomous Vehicles to COVID-19 Diagnosis: A Review
Mahdi Rezaei
Mahsa Shahidi
24
53
0
29 Apr 2020
Benchmarking Robustness of Machine Reading Comprehension Models
Chenglei Si
Ziqing Yang
Yiming Cui
Wentao Ma
Ting Liu
Shijin Wang
ELM
AAML
21
42
0
29 Apr 2020
Knowledgeable Dialogue Reading Comprehension on Key Turns
Junlong Li
ZhuoSheng Zhang
Hai Zhao
37
1
0
29 Apr 2020
BURT: BERT-inspired Universal Representation from Twin Structure
Yian Li
Hai Zhao
19
0
0
29 Apr 2020
Revisiting Pre-Trained Models for Chinese Natural Language Processing
Yiming Cui
Wanxiang Che
Ting Liu
Bing Qin
Shijin Wang
Guoping Hu
34
682
0
29 Apr 2020
DomBERT: Domain-oriented Language Model for Aspect-based Sentiment Analysis
Hu Xu
Bing-Quan Liu
Lei Shu
Philip S. Yu
8
51
0
28 Apr 2020
ColBERT: Using BERT Sentence Embedding in Parallel Neural Networks for Computational Humor
Issa Annamoradnejad
Gohar Zoghi
32
25
0
27 Apr 2020
Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting
Sanyuan Chen
Yutai Hou
Yiming Cui
Wanxiang Che
Ting Liu
Xiangzhan Yu
KELM
CLL
21
212
0
27 Apr 2020
Cross-lingual Information Retrieval with BERT
Zhuolin Jiang
A. El-Jaroudi
William Hartmann
Damianos G. Karakos
Lingjun Zhao
28
55
0
24 Apr 2020
Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order
Yi-Lun Liao
Xin Jiang
Qun Liu
25
40
0
24 Apr 2020
UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer Networks for Offensive Language Detection
Gregor Wiedemann
Seid Muhie Yimam
Christian Biemann
17
28
0
23 Apr 2020
DuReader_robust: A Chinese Dataset Towards Evaluating Robustness and Generalization of Machine Reading Comprehension in Real-World Applications
Hongxuan Tang
Hongyu Li
Jing Liu
Yu Hong
Hua Wu
Haifeng Wang
11
18
0
23 Apr 2020
QURIOUS: Question Generation Pretraining for Text Generation
Shashi Narayan
Gonçalo Simões
Ji Ma
Hannah Craighead
Ryan T. McDonald
37
15
0
23 Apr 2020
Experience Grounds Language
Yonatan Bisk
Ari Holtzman
Jesse Thomason
Jacob Andreas
Yoshua Bengio
...
Angeliki Lazaridou
Jonathan May
Aleksandr Nisnevich
Nicolas Pinto
Joseph P. Turian
21
351
0
21 Apr 2020
Attention is Not Only a Weight: Analyzing Transformers with Vector Norms
Goro Kobayashi
Tatsuki Kuribayashi
Sho Yokoi
Kentaro Inui
30
15
0
21 Apr 2020
A Generic Network Compression Framework for Sequential Recommender Systems
Yang Sun
Fajie Yuan
Ming Yang
Guoao Wei
Zhou Zhao
Duo Liu
26
54
0
21 Apr 2020
Investigating the Effectiveness of Representations Based on Pretrained Transformer-based Language Models in Active Learning for Labelling Text Datasets
Jinghui Lu
B. MacNamee
20
19
0
21 Apr 2020
Fine-tuning Multi-hop Question Answering with Hierarchical Graph Network
Guanming Xiong
26
0
0
20 Apr 2020
The Cost of Training NLP Models: A Concise Overview
Or Sharir
Barak Peleg
Y. Shoham
40
210
0
19 Apr 2020
ETC: Encoding Long and Structured Inputs in Transformers
Joshua Ainslie
Santiago Ontanon
Chris Alberti
Vaclav Cvicek
Zachary Kenneth Fisher
Philip Pham
Anirudh Ravula
Sumit Sanghai
Qifan Wang
Li Yang
20
54
0
17 Apr 2020
Highway Transformer: Self-Gating Enhanced Self-Attentive Networks
Yekun Chai
Jin Shuo
Xinwen Hou
23
16
0
17 Apr 2020
Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning
Joongbo Shin
Yoonhyung Lee
Seunghyun Yoon
Kyomin Jung
OOD
23
12
0
17 Apr 2020
Transform and Tell: Entity-Aware News Image Captioning
Alasdair Tran
A. Mathews
Lexing Xie
VLM
17
95
0
17 Apr 2020
Training with Quantization Noise for Extreme Model Compression
Angela Fan
Pierre Stock
Benjamin Graham
Edouard Grave
Remi Gribonval
Hervé Jégou
Armand Joulin
MQ
24
242
0
15 Apr 2020
lamBERT: Language and Action Learning Using Multimodal BERT
Kazuki Miyazawa
Tatsuya Aoki
Takato Horii
Takayuki Nagai
SSL
LM&Ro
16
12
0
15 Apr 2020
TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue
Chien-Sheng Wu
Guosheng Lin
R. Socher
Caiming Xiong
28
319
0
15 Apr 2020
Cascade Neural Ensemble for Identifying Scientifically Sound Articles
Ashwin Karthik Ambalavanan
M. Devarakonda
6
0
0
13 Apr 2020
Robustly Pre-trained Neural Model for Direct Temporal Relation Extraction
Hong Guan
Jianfu Li
Hua Xu
M. Devarakonda
10
10
0
13 Apr 2020
Pretrained Transformers Improve Out-of-Distribution Robustness
Dan Hendrycks
Xiaoyuan Liu
Eric Wallace
Adam Dziedzic
R. Krishnan
D. Song
OOD
15
429
0
13 Apr 2020
CLUE: A Chinese Language Understanding Evaluation Benchmark
Liang Xu
Hai Hu
Xuanwei Zhang
Lu Li
Chenjie Cao
...
Cong Yue
Xinrui Zhang
Zhen-Yi Yang
Kyle Richardson
Zhenzhong Lan
ELM
45
377
0
13 Apr 2020
Explaining Question Answering Models through Text Generation
Veronica Latcinnik
Jonathan Berant
LRM
16
51
0
12 Apr 2020
Multimodal Categorization of Crisis Events in Social Media
Mahdi Abavisani
Liwei Wu
Shengli Hu
Joel R. Tetreault
A. Jaimes
29
87
0
10 Apr 2020
Designing Precise and Robust Dialogue Response Evaluators
Tianyu Zhao
Divesh Lala
Tatsuya Kawahara
19
53
0
10 Apr 2020
Telling BERT's full story: from Local Attention to Global Aggregation
Damian Pascual
Gino Brunner
Roger Wattenhofer
25
19
0
10 Apr 2020
Injecting Numerical Reasoning Skills into Language Models
Mor Geva
Ankit Gupta
Jonathan Berant
AIMat
LRM
12
220
0
09 Apr 2020
Generating Counter Narratives against Online Hate Speech: Data and Strategies
Serra Sinem Tekiroğlu
Yi-Ling Chung
Marco Guerini
14
108
0
08 Apr 2020
DynaBERT: Dynamic BERT with Adaptive Width and Depth
Lu Hou
Zhiqi Huang
Lifeng Shang
Xin Jiang
Xiao Chen
Qun Liu
MQ
20
319
0
08 Apr 2020
Analyzing Redundancy in Pretrained Transformer Models
Fahim Dalvi
Hassan Sajjad
Nadir Durrani
Yonatan Belinkov
22
2
0
08 Apr 2020
On the Effect of Dropping Layers of Pre-trained Transformer Models
Hassan Sajjad
Fahim Dalvi
Nadir Durrani
Preslav Nakov
31
132
0
08 Apr 2020
DialBERT: A Hierarchical Pre-Trained Model for Conversation Disentanglement
Tianda Li
Jia-Chen Gu
Xiao-Dan Zhu
Quan Liu
Zhenhua Ling
Zhiming Su
Si Wei
29
27
0
08 Apr 2020
Towards Evaluating the Robustness of Chinese BERT Classifiers
Wei Ping
Boyuan Pan
Xin Li
Bo-wen Li
AAML
26
8
0
07 Apr 2020
Byte Pair Encoding is Suboptimal for Language Model Pretraining
Kaj Bostrom
Greg Durrett
14
200
0
07 Apr 2020
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering
Changmao Li
Jinho Choi
9
26
0
07 Apr 2020
A Few Topical Tweets are Enough for Effective User-Level Stance Detection
Younes Samih
Kareem Darwish
11
7
0
07 Apr 2020
Deep Learning Based Text Classification: A Comprehensive Review
Shervin Minaee
Nal Kalchbrenner
Min Zhang
Narjes Nikzad
M. Asgari-Chenaghlu
Jianfeng Gao
AILaw
VLM
AI4TS
19
1,090
0
06 Apr 2020
Continual Domain-Tuning for Pretrained Language Models
Subendhu Rongali
Abhyuday N. Jagannatha
Bhanu Pratap Singh Rawat
Hong-ye Yu
CLL
KELM
6
7
0
05 Apr 2020
FastBERT: a Self-distilling BERT with Adaptive Inference Time
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Haotang Deng
Qi Ju
57
354
0
05 Apr 2020
Finding Black Cat in a Coal Cellar -- Keyphrase Extraction & Keyphrase-Rubric Relationship Classification from Complex Assignments
Manikandan Ravikiran
6
0
0
03 Apr 2020