ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL, AIMat

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,935 papers shown
Title
Fast and Accurate FSA System Using ELBERT: An Efficient and Lightweight BERT
Siyuan Lu
Chenchen Zhou
Keli Xie
Jun Lin
Zhongfeng Wang
42
1
0
16 Nov 2022
MEAL: Stable and Active Learning for Few-Shot Prompting
Abdullatif Köksal
Timo Schick
Hinrich Schütze
98
26
0
15 Nov 2022
GLUE-X: Evaluating Natural Language Understanding Models from an Out-of-distribution Generalization Perspective
Linyi Yang
Shuibai Zhang
Libo Qin
Yafu Li
Yidong Wang
Hanmeng Liu
Jindong Wang
Xingxu Xie
Yue Zhang
ELM
185
82
0
15 Nov 2022
DeepParliament: A Legal domain Benchmark & Dataset for Parliament Bills Prediction
Ankit Pal
AILaw
106
1
0
15 Nov 2022
A Survey for Efficient Open Domain Question Answering
Qin Zhang
Shan Chen
Dongkuan Xu
Qingqing Cao
Xiaojun Chen
Trevor Cohn
Meng Fang
90
36
0
15 Nov 2022
Language models are good pathologists: using attention-based sequence reduction and text-pretrained transformers for efficient WSI classification
Juan Pisula
Katarzyna Bozek
VLM, MedIm
83
3
0
14 Nov 2022
ALBERT with Knowledge Graph Encoder Utilizing Semantic Similarity for Commonsense Question Answering
Byeongmin Choi
Yong-Sook Lee
Yeunwoong Kyung
Eunchan Kim
55
10
0
14 Nov 2022
TIER-A: Denoising Learning Framework for Information Extraction
Yongkang Li
M. Zhang
40
0
0
13 Nov 2022
Dark patterns in e-commerce: a dataset and its baseline evaluations
Yukiharu Yada
J. Feng
Tsuneo Matsumoto
Naotake Fukushima
Fuyuko Kido
Hayato Yamana
98
14
0
12 Nov 2022
The Architectural Bottleneck Principle
Tiago Pimentel
Josef Valvoda
Niklas Stoehr
Ryan Cotterell
52
5
0
11 Nov 2022
Zero-shot Visual Commonsense Immorality Prediction
Yujin Jeong
Seongbeom Park
Suhong Moon
Jinkyu Kim
VLM
39
2
0
10 Nov 2022
Can Transformers Reason in Fragments of Natural Language?
Viktor Schlegel
Kamen V. Pavlov
Ian Pratt-Hartmann
LRM, ReLM
77
7
0
10 Nov 2022
LERT: A Linguistically-motivated Pre-trained Language Model
Yiming Cui
Wanxiang Che
Shijin Wang
Ting Liu
91
25
0
10 Nov 2022
Collateral facilitation in humans and language models
J. Michaelov
Benjamin Bergen
110
11
0
09 Nov 2022
Detecting Languages Unintelligible to Multilingual Models through Local Structure Probes
Louis Clouâtre
Prasanna Parthasarathi
Payel Das
Sarath Chandar
75
3
0
09 Nov 2022
Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token
Baohao Liao
David Thulke
Sanjika Hewavitharana
Hermann Ney
Christof Monz
75
9
0
09 Nov 2022
Nested Named Entity Recognition from Medical Texts: An Adaptive Shared Network Architecture with Attentive CRF
Jun Jiang
Mingyue Cheng
Qi Liu
Zhi Li
Enhong Chen
56
1
0
09 Nov 2022
Learning Semantic Textual Similarity via Topic-informed Discrete Latent Variables
Erxin Yu
Lan Du
Yuan Jin
Zhepei Wei
Yi-Ju Chang
BDL
47
7
0
07 Nov 2022
On the Domain Adaptation and Generalization of Pretrained Language Models: A Survey
Xu Guo
Han Yu
LM&MA, VLM
145
30
0
06 Nov 2022
Prompt-based Text Entailment for Low-Resource Named Entity Recognition
Dongfang Li
Baotian Hu
Qingcai Chen
69
6
0
06 Nov 2022
HERB: Measuring Hierarchical Regional Bias in Pre-trained Language Models
Yizhi Li
Ge Zhang
Bohao Yang
Chenghua Lin
Shi Wang
Anton Ragni
Jie Fu
58
10
0
05 Nov 2022
BERT-Deep CNN: State-of-the-Art for Sentiment Analysis of COVID-19 Tweets
Javad Hassannataj Joloudari
Sadiq Hussain
M. Nematollahi
Rouhollah Bagheri
Fatemeh Fazl
R. Alizadehsani
Reza Lashgari
Ashis Talukder
55
47
0
04 Nov 2022
Query-based Instance Discrimination Network for Relational Triple Extraction
Zeqi Tan
Yongliang Shen
Xuming Hu
Wenqi Zhang
Xiaoxia Cheng
Weiming Lu
Yueting Zhuang
ISeg
85
9
0
03 Nov 2022
MPCFormer: fast, performant and private Transformer inference with MPC
Dacheng Li
Rulin Shao
Hongyi Wang
Han Guo
Eric P. Xing
Haotong Zhang
92
87
0
02 Nov 2022
The future is different: Large pre-trained language models fail in prediction tasks
K. Cvejoski
Ramses J. Sanchez
C. Ojeda
87
4
0
01 Nov 2022
An Empirical Study on Data Leakage and Generalizability of Link Prediction Models for Issues and Commits
Maliheh Izadi
Pooya Rostami Mazrae
T. Mens
Arie van Deursen
53
2
0
01 Nov 2022
Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages
Anusha Prakash
Arun Kumar
Ashish Seth
Bhagyashree Mukherjee
Ishika Gupta
...
D. Sharma
H. Murthy
P. Bhattacharya
S. Umesh
R. Sangal
79
5
0
01 Nov 2022
Data-Efficient Cross-Lingual Transfer with Language-Specific Subnetworks
Rochelle Choenni
Dan Garrette
Ekaterina Shutova
92
2
0
31 Oct 2022
1Cademy @ Causal News Corpus 2022: Enhance Causal Span Detection via Beam-Search-based Position Selector
Xingran Chen
Ge Zhang
A. Nik
Mingyu Li
Jie Fu
80
5
0
31 Oct 2022
Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change
Zhao-yu Su
Zecheng Tang
Xinyan Guan
Juntao Li
Lijun Wu
Hao Fei
CLL, AI4CE
90
23
0
31 Oct 2022
Character-level White-Box Adversarial Attacks against Transformers via Attachable Subwords Substitution
Aiwei Liu
Honghai Yu
Xuming Hu
Shuang Li
Li Lin
Fukun Ma
Yawen Yang
Lijie Wen
86
35
0
31 Oct 2022
Parameter-Efficient Tuning Makes a Good Classification Head
Zhuoyi Yang
Ming Ding
Yanhui Guo
Qingsong Lv
Jie Tang
VLM
108
14
0
30 Oct 2022
BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model
Yosuke Higuchi
Brian Yan
Siddhant Arora
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
118
26
0
29 Oct 2022
Auxo: Efficient Federated Learning via Scalable Client Clustering
Jiachen Liu
Fan Lai
Yinwei Dai
Aditya Akella
H. Madhyastha
Mosharaf Chowdhury
120
10
0
29 Oct 2022
Empirical Evaluation of Post-Training Quantization Methods for Language Tasks
Ting Hu
Christoph Meinel
Haojin Yang
MQ
96
3
0
29 Oct 2022
Spectrograms Are Sequences of Patches
Leyi Zhao
Yi Li
SSL
57
0
0
28 Oct 2022
RoChBert: Towards Robust BERT Fine-tuning for Chinese
Zihan Zhang
Jinfeng Li
Ning Shi
Bo Yuan
Xiangyu Liu
Rong Zhang
Hui Xue
Donghong Sun
Chao Zhang
AAML
61
4
0
28 Oct 2022
Reinforced Question Rewriting for Conversational Question Answering
Zhiyu Zoey Chen
Jie Zhao
Anjie Fang
B. Fetahu
Oleg Rokhlenko
S. Malmasi
59
27
0
27 Oct 2022
COST-EFF: Collaborative Optimization of Spatial and Temporal Efficiency with Slenderized Multi-exit Language Models
Bowen Shen
Zheng Lin
Yuanxin Liu
Zhengxiao Liu
Lei Wang
Weiping Wang
VLM
77
5
0
27 Oct 2022
MorphTE: Injecting Morphology in Tensorized Embeddings
Guobing Gan
Peng Zhang
Sunzhu Li
Xiuqing Lu
Benyou Wang
80
6
0
27 Oct 2022
Unsupervised Boundary-Aware Language Model Pretraining for Chinese Sequence Labeling
Peijie Jiang
Dingkun Long
Yanzhao Zhang
Pengjun Xie
Meishan Zhang
Hao Fei
SSL
53
13
0
27 Oct 2022
Multilevel Transformer For Multimodal Emotion Recognition
Junyi He
Meimei Wu
Meng Li
Xiaobo Zhu
Feng Ye
64
6
0
26 Oct 2022
Compressing And Debiasing Vision-Language Pre-Trained Models for Visual Question Answering
Q. Si
Yuanxin Liu
Zheng Lin
Peng Fu
Weiping Wang
VLM
120
1
0
26 Oct 2022
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu
Sang Michael Xie
Zhiyuan Li
Tengyu Ma
AI4CE
135
55
0
25 Oct 2022
PolyHope: Two-Level Hope Speech Detection from Tweets
F. Balouchzahi
Grigori Sidorov
Alexander Gelbukh
53
50
0
25 Oct 2022
This joke is [MASK]: Recognizing Humor and Offense with Prompting
Junze Li
Mengjie Zhao
Yubo Xie
Antonis Maronikolakis
Pearl Pu
Hinrich Schütze
AAML
61
1
0
25 Oct 2022
DialogConv: A Lightweight Fully Convolutional Network for Multi-view Response Selection
Yongkang Liu
Shi Feng
Wei Gao
Daling Wang
Yifei Zhang
57
4
0
25 Oct 2022
Effective Pre-Training Objectives for Transformer-based Autoencoders
Luca Di Liello
Matteo Gabburo
Alessandro Moschitti
41
3
0
24 Oct 2022
Investigating the detection of Tortured Phrases in Scientific Literature
Puthineath Lay
M. Lentschat
Cyril Labbe
59
5
0
24 Oct 2022
Multi-Type Conversational Question-Answer Generation with Closed-ended and Unanswerable Questions
Seonjeong Hwang
Yunsu Kim
G. G. Lee
43
3
0
24 Oct 2022