ResearchTrend.AI
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
    SSL
    AIMat

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,913 papers shown
Quantitative Argument Summarization and Beyond: Cross-Domain Key Point Analysis
Roy Bar-Haim
Yoav Kantor
Lilach Eden
Roni Friedman
Dan Lahav
Noam Slonim
40
43
0
11 Oct 2020
InfoMiner at WNUT-2020 Task 2: Transformer-based Covid-19 Informative Tweet Extraction
Hansi Hettiarachchi
Tharindu Ranasinghe
MedIm
10
21
0
11 Oct 2020
SMYRF: Efficient Attention using Asymmetric Clustering
Giannis Daras
Nikita Kitaev
Augustus Odena
A. Dimakis
31
44
0
11 Oct 2020
SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task
Z. Li
Hai Zhao
Rui Wang
Kehai Chen
Masao Utiyama
Eiichiro Sumita
36
15
0
11 Oct 2020
Hierarchical Evidence Set Modeling for Automated Fact Extraction and Verification
Shyam Subramanian
Kyumin Lee
11
21
0
10 Oct 2020
On the Importance of Adaptive Data Collection for Extremely Imbalanced Pairwise Tasks
Stephen Mussmann
Robin Jia
Percy Liang
29
15
0
10 Oct 2020
Compressing Transformer-Based Semantic Parsing Models using Compositional Code Embeddings
P. Prakash
Saurabh Kumar Shashidhar
Wenlong Zhao
Subendhu Rongali
Haidar Khan
Michael Kayser
22
5
0
10 Oct 2020
Adversarial Self-Supervised Data-Free Distillation for Text Classification
Xinyin Ma
Yongliang Shen
Gongfan Fang
Chen Chen
Chenghao Jia
Weiming Lu
33
24
0
10 Oct 2020
Relation Classification as Two-way Span-Prediction
Amir D. N. Cohen
Shachar Rosenman
Yoav Goldberg
19
18
0
09 Oct 2020
Learning Binary Decision Trees by Argmin Differentiation
Valentina Zantedeschi
Matt J. Kusner
Vlad Niculae
34
13
0
09 Oct 2020
TurboTransformers: An Efficient GPU Serving System For Transformer Models
Jiarui Fang
Yang Yu
Chen-liang Zhao
Jie Zhou
9
138
0
09 Oct 2020
Lightweight, Dynamic Graph Convolutional Networks for AMR-to-Text Generation
Yan Zhang
Zhijiang Guo
Zhiyang Teng
Wei Lu
Shay B. Cohen
Zuozhu Liu
Lidong Bing
GNN
24
18
0
09 Oct 2020
Deep Learning Meets Projective Clustering
Alaa Maalouf
Harry Lang
Daniela Rus
Dan Feldman
24
9
0
08 Oct 2020
Two are Better than One: Joint Entity and Relation Extraction with Table-Sequence Encoders
Jue Wang
Wei Lu
26
225
0
08 Oct 2020
Infusing Disease Knowledge into BERT for Health Question Answering, Medical Inference and Disease Name Recognition
Yun He
Ziwei Zhu
Yin Zhang
Qin Chen
James Caverlee
AI4MH
36
108
0
08 Oct 2020
PARADE: A New Dataset for Paraphrase Identification Requiring Computer Science Domain Knowledge
Yun He
Zhuoer Wang
Yin Zhang
Ruihong Huang
James Caverlee
28
22
0
08 Oct 2020
Don't Parse, Insert: Multilingual Semantic Parsing with Insertion Based Decoding
Qile Zhu
Haidar Khan
Saleh Soltan
Stephen Rawls
Wael Hamza
27
24
0
08 Oct 2020
AxFormer: Accuracy-driven Approximation of Transformers for Faster, Smaller and more Accurate NLP Models
Amrit Nagarajan
Sanchari Sen
Jacob R. Stevens
A. Raghunathan
16
3
0
07 Oct 2020
Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data
Shachar Rosenman
Alon Jacovi
Yoav Goldberg
19
28
0
07 Oct 2020
A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks
Nikunj Saunshi
Sadhika Malladi
Sanjeev Arora
33
87
0
07 Oct 2020
SRLGRN: Semantic Role Labeling Graph Reasoning Network
Chen Zheng
Parisa Kordjamshidi
17
22
0
07 Oct 2020
A Self-supervised Approach for Semantic Indexing in the Context of COVID-19 Pandemic
Nima Ebadi
Peyman Najafirad
OOD
17
2
0
07 Oct 2020
Galileo at SemEval-2020 Task 12: Multi-lingual Learning for Offensive Language Identification using Pre-trained Language Models
Shuohuan Wang
Jiaxiang Liu
Ouyang Xuan
Yu Sun
36
36
0
07 Oct 2020
What Can We Learn from Collective Human Opinions on Natural Language Inference Data?
Yixin Nie
Xiang Zhou
Joey Tianyi Zhou
29
129
0
07 Oct 2020
CATBERT: Context-Aware Tiny BERT for Detecting Social Engineering Emails
Younghoon Lee
Joshua Saxe
Richard E. Harang
11
25
0
07 Oct 2020
Vector-Vector-Matrix Architecture: A Novel Hardware-Aware Framework for Low-Latency Inference in NLP Applications
Matthew Khoury
Rumen Dangovski
L. Ou
Preslav Nakov
Yichen Shen
L. Jing
23
0
0
06 Oct 2020
Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation
Minki Kang
Moonsu Han
Sung Ju Hwang
OOD
25
18
0
06 Oct 2020
On the Sparsity of Neural Machine Translation Models
Yong Wang
Longyue Wang
V. Li
Zhaopeng Tu
MoE
17
11
0
06 Oct 2020
On the Interplay Between Fine-tuning and Sentence-level Probing for Linguistic Knowledge in Pre-trained Transformers
Marius Mosbach
A. Khokhlova
Michael A. Hedderich
Dietrich Klakow
25
44
0
06 Oct 2020
LEGAL-BERT: The Muppets straight out of Law School
Ilias Chalkidis
Manos Fergadiotis
Prodromos Malakasiotis
Nikolaos Aletras
Ion Androutsopoulos
AILaw
14
255
0
06 Oct 2020
GRUEN for Evaluating Linguistic Quality of Generated Text
Wanzheng Zhu
S. Bhat
33
60
0
06 Oct 2020
Pretrained Language Model Embryology: The Birth of ALBERT
Cheng-Han Chiang
Sung-Feng Huang
Hung-yi Lee
29
39
0
06 Oct 2020
KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation
Wenhu Chen
Yu-Chuan Su
Xifeng Yan
Wenjie Wang
VLM
21
20
0
05 Oct 2020
Pareto Probing: Trading Off Accuracy for Complexity
Tiago Pimentel
Naomi Saphra
Adina Williams
Ryan Cotterell
34
60
0
05 Oct 2020
Second-Order NLP Adversarial Examples
John X. Morris
AAML
20
0
0
05 Oct 2020
LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention
Ikuya Yamada
Akari Asai
Hiroyuki Shindo
Hideaki Takeda
Yuji Matsumoto
34
663
0
02 Oct 2020
Which *BERT? A Survey Organizing Contextualized Encoders
Patrick Xia
Shijie Wu
Benjamin Van Durme
26
50
0
02 Oct 2020
Data Transfer Approaches to Improve Seq-to-Seq Retrosynthesis
Katsuhiko Ishiguro
K. Ujihara
R. Sawada
Hirotaka Akita
Masaaki Kotera
32
6
0
02 Oct 2020
Phonemer at WNUT-2020 Task 2: Sequence Classification Using COVID Twitter BERT and Bagging Ensemble Technique based on Plurality Voting
Anshul Wadhawan
24
7
0
01 Oct 2020
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models
Nikita Nangia
Clara Vania
Rasika Bhalerao
Samuel R. Bowman
41
645
0
30 Sep 2020
NatCat: Weakly Supervised Text Classification with Naturally Annotated Resources
Zewei Chu
K. Stratos
Kevin Gimpel
6
5
0
29 Sep 2020
Attention that does not Explain Away
Nan Ding
Xinjie Fan
Zhenzhong Lan
Dale Schuurmans
Radu Soricut
27
3
0
29 Sep 2020
Contrastive Distillation on Intermediate Representations for Language Model Compression
S. Sun
Zhe Gan
Yu Cheng
Yuwei Fang
Shuohang Wang
Jingjing Liu
VLM
28
69
0
29 Sep 2020
Improve Transformer Models with Better Relative Position Embeddings
Zhiheng Huang
Davis Liang
Peng Xu
Bing Xiang
ViT
26
127
0
28 Sep 2020
What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams
Di Jin
Eileen Pan
Nassim Oufattole
W. Weng
Hanyi Fang
Peter Szolovits
FaML
ELM
LM&MA
31
709
0
28 Sep 2020
TernaryBERT: Distillation-aware Ultra-low Bit BERT
Wei Zhang
Lu Hou
Yichun Yin
Lifeng Shang
Xiao Chen
Xin Jiang
Qun Liu
MQ
33
209
0
27 Sep 2020
Topic-Aware Multi-turn Dialogue Modeling
Yi Xu
Hai Zhao
Zhuosheng Zhang
21
73
0
26 Sep 2020
BET: A Backtranslation Approach for Easy Data Augmentation in Transformer-based Paraphrase Identification Context
Jean-Philippe Corbeil
Hadi Abdi Ghadivel
6
27
0
25 Sep 2020
Attention Meets Perturbations: Robust and Interpretable Attention with Adversarial Training
Shunsuke Kitada
Hitoshi Iyatomi
OOD
AAML
22
26
0
25 Sep 2020
AnchiBERT: A Pre-Trained Model for Ancient Chinese Language Understanding and Generation
Huishuang Tian
Kexin Yang
Dayiheng Liu
Jiancheng Lv
33
31
0
24 Sep 2020