ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL, AIMat
ArXiv (abs) | PDF | HTML | GitHub (3271★)

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,935 papers shown
Title
Towards Data Distillation for End-to-end Spoken Conversational Question Answering
Chenyu You
Nuo Chen
Fenglin Liu
Dongchao Yang
Yuexian Zou
77
48
0
18 Oct 2020
HABERTOR: An Efficient and Effective Deep Hatespeech Detector
T. Tran
Yifan Hu
Changwei Hu
Kevin Yen
Fei Tan
Kyumin Lee
Serim Park
VLM
90
32
0
17 Oct 2020
Hierarchical Multitask Learning Approach for BERT
Çagla Aksoy
Alper Ahmetoglu
Tunga Güngör
SSL
60
5
0
17 Oct 2020
TweetBERT: A Pretrained Language Representation Model for Twitter Text Analysis
Mohiuddin Md Abdul Qudar
Vijay K. Mago
SSeg
62
38
0
17 Oct 2020
Delaying Interaction Layers in Transformer-based Encoders for Efficient Open Domain Question Answering
W. Siblini
Mohamed Challal
Charlotte Pasqual
56
3
0
16 Oct 2020
Detecting ESG topics using domain-specific language models and data augmentation approaches
Timothy Nugent
N. Stelea
Jochen L. Leidner
67
13
0
16 Oct 2020
A Transformer Based Pitch Sequence Autoencoder with MIDI Augmentation
Mingshuo Ding
Yi Ma
22
1
0
15 Oct 2020
TopicBERT for Energy Efficient Document Classification
Yatin Chaudhary
Pankaj Gupta
Khushbu Saxena
Vivek Kulkarni
Thomas Runkler
Hinrich Schütze
70
21
0
15 Oct 2020
Text Classification Using Label Names Only: A Language Model Self-Training Approach
Yu Meng
Yunyi Zhang
Jiaxin Huang
Chenyan Xiong
Heng Ji
Chao Zhang
Jiawei Han
VLM
88
76
0
14 Oct 2020
An Investigation on Different Underlying Quantization Schemes for Pre-trained Language Models
Zihan Zhao
Yuncong Liu
Lu Chen
Qi Liu
Rao Ma
Kai Yu
MQ
46
12
0
14 Oct 2020
Weight Squeezing: Reparameterization for Knowledge Transfer and Model Compression
Artem Chumachenko
Daniil Gavrilov
Nikita Balagansky
Pavel Kalaidin
56
1
0
14 Oct 2020
Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision
Hao Tan
Mohit Bansal
CLIP
89
121
0
14 Oct 2020
With Little Power Comes Great Responsibility
Dallas Card
Peter Henderson
Urvashi Khandelwal
Robin Jia
Kyle Mahowald
Dan Jurafsky
277
119
0
13 Oct 2020
Pretrained Transformers for Text Ranking: BERT and Beyond
Jimmy J. Lin
Rodrigo Nogueira
Andrew Yates
VLM
393
628
0
13 Oct 2020
A Wrong Answer or a Wrong Question? An Intricate Relationship between Question Reformulation and Answer Selection in Conversational Question Answering
Svitlana Vakulenko
Shayne Longpre
Zhucheng Tu
R. Anantha
60
13
0
13 Oct 2020
Improving Text Generation Evaluation with Batch Centering and Tempered Word Mover Distance
Xi Chen
Nan Ding
Tomer Levinboim
Radu Soricut
37
5
0
13 Oct 2020
Oort: Efficient Federated Learning via Guided Participant Selection
Fan Lai
Xiangfeng Zhu
H. Madhyastha
Mosharaf Chowdhury
FedML, OODD
133
275
0
12 Oct 2020
Zero-shot Entity Linking with Efficient Long Range Sequence Modeling
Zonghai Yao
Liangliang Cao
Huapu Pan
VLM
105
21
0
12 Oct 2020
Improving Self-supervised Pre-training via a Fully-Explored Masked Language Model
Ming Zheng
Dinghan Shen
Yelong Shen
Weizhu Chen
Lin Xiao
SSL
29
4
0
12 Oct 2020
Measuring and Reducing Gendered Correlations in Pre-trained Models
Kellie Webster
Xuezhi Wang
Ian Tenney
Alex Beutel
Emily Pitler
Ellie Pavlick
Jilin Chen
Ed Chi
Slav Petrov
FaML
97
260
0
12 Oct 2020
EFSG: Evolutionary Fooling Sentences Generator
Marco Di Giovanni
Marco Brambilla
AAML
61
3
0
12 Oct 2020
A BERT-based Distractor Generation Scheme with Multi-tasking and Negative Answer Training Strategies
Ho-Lam Chung
Ying-Hong Chan
Yao-Chung Fan
86
41
0
12 Oct 2020
Quantitative Argument Summarization and Beyond: Cross-Domain Key Point Analysis
Roy Bar-Haim
Yoav Kantor
Lilach Eden
Roni Friedman
Dan Lahav
Noam Slonim
82
47
0
11 Oct 2020
InfoMiner at WNUT-2020 Task 2: Transformer-based Covid-19 Informative Tweet Extraction
Hansi Hettiarachchi
Tharindu Ranasinghe
MedIm
36
21
0
11 Oct 2020
SMYRF: Efficient Attention using Asymmetric Clustering
Giannis Daras
Nikita Kitaev
Augustus Odena
A. Dimakis
106
46
0
11 Oct 2020
SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task
Z. Li
Hai Zhao
Rui Wang
Kehai Chen
Masao Utiyama
Eiichiro Sumita
62
15
0
11 Oct 2020
Hierarchical Evidence Set Modeling for Automated Fact Extraction and Verification
Shyam Subramanian
Kyumin Lee
68
23
0
10 Oct 2020
On the Importance of Adaptive Data Collection for Extremely Imbalanced Pairwise Tasks
Stephen Mussmann
Robin Jia
Percy Liang
83
15
0
10 Oct 2020
Compressing Transformer-Based Semantic Parsing Models using Compositional Code Embeddings
P. Prakash
Saurabh Kumar Shashidhar
Wenlong Zhao
Subendhu Rongali
Haidar Khan
Michael Kayser
39
5
0
10 Oct 2020
Adversarial Self-Supervised Data-Free Distillation for Text Classification
Xinyin Ma
Yongliang Shen
Gongfan Fang
Chen Chen
Chenghao Jia
Weiming Lu
124
24
0
10 Oct 2020
Relation Classification as Two-way Span-Prediction
Amir D. N. Cohen
Shachar Rosenman
Yoav Goldberg
79
18
0
09 Oct 2020
Learning Binary Decision Trees by Argmin Differentiation
Valentina Zantedeschi
Matt J. Kusner
Vlad Niculae
62
13
0
09 Oct 2020
TurboTransformers: An Efficient GPU Serving System For Transformer Models
Jiarui Fang
Yang Yu
Chen-liang Zhao
Jie Zhou
84
140
0
09 Oct 2020
Lightweight, Dynamic Graph Convolutional Networks for AMR-to-Text Generation
Yan Zhang
Zhijiang Guo
Zhiyang Teng
Wei Lu
Shay B. Cohen
Zuozhu Liu
Lidong Bing
GNN
88
19
0
09 Oct 2020
Deep Learning Meets Projective Clustering
Alaa Maalouf
Harry Lang
Daniela Rus
Dan Feldman
113
9
0
08 Oct 2020
Two are Better than One: Joint Entity and Relation Extraction with Table-Sequence Encoders
Jue Wang
Wei Lu
66
231
0
08 Oct 2020
Infusing Disease Knowledge into BERT for Health Question Answering, Medical Inference and Disease Name Recognition
Yun He
Ziwei Zhu
Yin Zhang
Qin Chen
James Caverlee
AI4MH
87
109
0
08 Oct 2020
PARADE: A New Dataset for Paraphrase Identification Requiring Computer Science Domain Knowledge
Yun He
Zhuoer Wang
Yin Zhang
Ruihong Huang
James Caverlee
51
23
0
08 Oct 2020
Don't Parse, Insert: Multilingual Semantic Parsing with Insertion Based Decoding
Qile Zhu
Haidar Khan
Saleh Soltan
Stephen Rawls
Wael Hamza
79
24
0
08 Oct 2020
AxFormer: Accuracy-driven Approximation of Transformers for Faster, Smaller and more Accurate NLP Models
Amrit Nagarajan
Sanchari Sen
Jacob R. Stevens
A. Raghunathan
22
3
0
07 Oct 2020
Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data
Shachar Rosenman
Alon Jacovi
Yoav Goldberg
79
29
0
07 Oct 2020
A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks
Nikunj Saunshi
Sadhika Malladi
Sanjeev Arora
87
89
0
07 Oct 2020
SRLGRN: Semantic Role Labeling Graph Reasoning Network
Chen Zheng
Parisa Kordjamshidi
51
22
0
07 Oct 2020
A Self-supervised Approach for Semantic Indexing in the Context of COVID-19 Pandemic
Nima Ebadi
Peyman Najafirad
OOD
37
2
0
07 Oct 2020
Galileo at SemEval-2020 Task 12: Multi-lingual Learning for Offensive Language Identification using Pre-trained Language Models
Shuohuan Wang
Jiaxiang Liu
Ouyang Xuan
Yu Sun
68
36
0
07 Oct 2020
What Can We Learn from Collective Human Opinions on Natural Language Inference Data?
Yixin Nie
Xiang Zhou
Mohit Bansal
106
138
0
07 Oct 2020
CATBERT: Context-Aware Tiny BERT for Detecting Social Engineering Emails
Younghoon Lee
Joshua Saxe
Richard E. Harang
51
25
0
07 Oct 2020
Vector-Vector-Matrix Architecture: A Novel Hardware-Aware Framework for Low-Latency Inference in NLP Applications
Matthew Khoury
Rumen Dangovski
L. Ou
Preslav Nakov
Yichen Shen
L. Jing
44
0
0
06 Oct 2020
Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation
Minki Kang
Moonsu Han
Sung Ju Hwang
OOD
73
18
0
06 Oct 2020
On the Sparsity of Neural Machine Translation Models
Yong Wang
Longyue Wang
Victor O.K. Li
Zhaopeng Tu
MoE
58
11
0
06 Oct 2020