arXiv: 1909.11942
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
    SSL
    AIMat

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,913 papers shown
A Graph-guided Multi-round Retrieval Method for Conversational Open-domain Question Answering
Yongqi Li
Wenjie Li
Liqiang Nie
RALM
19
10
0
17 Apr 2021
Capturing Row and Column Semantics in Transformer Based Question Answering over Tables
Michael R. Glass
Mustafa Canim
A. Gliozzo
Saneem A. Chemmengath
Vishwajeet Kumar
Rishav Chakravarti
Avirup Sil
FeiFei Pan
Samarth Bharadwaj
Nicolas Rodolfo Fauceglia
LMTD
26
54
0
16 Apr 2021
AMMU : A Survey of Transformer-based Biomedical Pretrained Language Models
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
LM&MA
MedIm
34
164
0
16 Apr 2021
TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval
Ioana Croitoru
Simion-Vlad Bogolin
Marius Leordeanu
Hailin Jin
Andrew Zisserman
Samuel Albanie
Yang Liu
VGen
21
124
0
16 Apr 2021
Condenser: a Pre-training Architecture for Dense Retrieval
Luyu Gao
Jamie Callan
AI4CE
36
253
0
16 Apr 2021
$Q^{2}$: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering
Or Honovich
Leshem Choshen
Roee Aharoni
Ella Neeman
Idan Szpektor
Omri Abend
HILM
36
138
0
16 Apr 2021
Back to Square One: Artifact Detection, Training and Commonsense Disentanglement in the Winograd Schema
Yanai Elazar
Hongming Zhang
Yoav Goldberg
Dan Roth
ReLM
LRM
47
44
0
16 Apr 2021
Gradient-based Adversarial Attacks against Text Transformers
Chuan Guo
Alexandre Sablayrolles
Hervé Jégou
Douwe Kiela
SILM
109
230
0
15 Apr 2021
Syntactic Perturbations Reveal Representational Correlates of Hierarchical Phrase Structure in Pretrained Language Models
Matteo Alleman
J. Mamou
Miguel Rio
Hanlin Tang
Yoon Kim
SueYeon Chung
NAI
46
17
0
15 Apr 2021
Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models
Karolina Stańczak
Sagnik Ray Choudhury
Tiago Pimentel
Ryan Cotterell
Isabelle Augenstein
30
23
0
15 Apr 2021
Unmasking the Mask -- Evaluating Social Biases in Masked Language Models
Masahiro Kaneko
Danushka Bollegala
29
69
0
15 Apr 2021
TransferNet: An Effective and Transparent Framework for Multi-hop Question Answering over Relation Graph
Jiaxin Shi
S. Cao
Lei Hou
Juan-Zi Li
Hanwang Zhang
GNN
34
105
0
15 Apr 2021
Text Guide: Improving the quality of long text classification by a text selection method based on feature importance
K. Fiok
W. Karwowski
Edgar Gutierrez-Franco
Mohammad Reza Davahli
Maciej Wilamowski
T. Ahram
Awad M. Aljuaid
Jozef Zurada
VLM
28
33
0
15 Apr 2021
Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese Pre-trained Language Models
Yuxuan Lai
Yijia Liu
Yansong Feng
Songfang Huang
Dongyan Zhao
VLM
AI4CE
40
37
0
15 Apr 2021
TWEAC: Transformer with Extendable QA Agent Classifiers
Gregor Geigle
Nils Reimers
Andreas Rucklé
Iryna Gurevych
ViT
27
22
0
14 Apr 2021
The Surprising Performance of Simple Baselines for Misinformation Detection
Kellin Pelrine
Jacob Danovitch
Reihaneh Rabbany
30
63
0
14 Apr 2021
I Wish I Would Have Loved This One, But I Didn't -- A Multilingual Dataset for Counterfactual Detection in Product Reviews
James O'Neill
Polina Rozenshtein
Ryuichi Kiryo
Motoko Kubota
Danushka Bollegala
40
26
0
14 Apr 2021
AR-LSAT: Investigating Analytical Reasoning of Text
Wanjun Zhong
Siyuan Wang
Duyu Tang
Zenan Xu
Daya Guo
Jiahai Wang
Jian Yin
Ming Zhou
Nan Duan
ELM
27
41
0
14 Apr 2021
Demystifying BERT: Implications for Accelerator Design
Suchita Pati
Shaizeen Aga
Nuwan Jayasena
Matthew D. Sinclair
LLMAG
40
17
0
14 Apr 2021
QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering
Michihiro Yasunaga
Hongyu Ren
Antoine Bosselut
Percy Liang
J. Leskovec
RALM
LMTD
AI4MH
LRM
21
577
0
13 Apr 2021
Lessons on Parameter Sharing across Layers in Transformers
Sho Takase
Shun Kiyono
25
85
0
13 Apr 2021
Restoring and Mining the Records of the Joseon Dynasty via Neural Language Modeling and Machine Translation
Kyeongpil Kang
Kyohoon Jin
Soyoung Yang
Show-Ling Jang
Jaegul Choo
Youngbin Kim
MU
19
16
0
13 Apr 2021
Semantic maps and metrics for science Semantic maps and metrics for science using deep transformer encoders
Brendan Chambers
James A. Evans
MedIm
13
0
0
13 Apr 2021
Discourse Probing of Pretrained Language Models
Fajri Koto
Jey Han Lau
Tim Baldwin
36
53
0
13 Apr 2021
Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews
M. Hadi
Fatemeh H. Fard
26
30
0
12 Apr 2021
SpartQA: A Textual Question Answering Benchmark for Spatial Reasoning
Roshanak Mirzaee
Hossein Rajaby Faghihi
Qiang Ning
Parisa Kordjamshidi
26
77
0
12 Apr 2021
DATE: Detecting Anomalies in Text via Self-Supervision of Transformers
Andrei Manolache
Florin Brad
Elena Burceanu
UQCV
46
33
0
12 Apr 2021
Factual Probing Is [MASK]: Learning vs. Learning to Recall
Zexuan Zhong
Dan Friedman
Danqi Chen
16
403
0
12 Apr 2021
Not All Attention Is All You Need
Hongqiu Wu
Hai Zhao
Min Zhang
22
9
0
10 Apr 2021
WLV-RIT at SemEval-2021 Task 5: A Neural Transformer Framework for Detecting Toxic Spans
Tharindu Ranasinghe
Diptanu Sarkar
Marcos Zampieri
Alexander Ororbia
MedIm
27
13
0
09 Apr 2021
AdCOFE: Advanced Contextual Feature Extraction in Conversations for emotion classification
Vaibhav Bhat
Anita Yadav
Sonal Yadav
Dhivya Chandrasekaran
Vijay K. Mago
35
4
0
09 Apr 2021
Did they answer? Subjective acts and intents in conversational discourse
Elisa Ferracane
Greg Durrett
Junjie Li
K. Erk
18
19
0
09 Apr 2021
Larger-Context Tagging: When and Why Does It Work?
Jinlan Fu
Liangjing Feng
Qi Zhang
Xuanjing Huang
Pengfei Liu
27
5
0
09 Apr 2021
Transformers: "The End of History" for NLP?
Anton Chernyavskiy
Dmitry Ilvovsky
Preslav Nakov
52
30
0
09 Apr 2021
Layer Reduction: Accelerating Conformer-Based Self-Supervised Model via Layer Consistency
Jinchuan Tian
Rongzhi Gu
Helin Wang
Yuexian Zou
26
0
0
08 Apr 2021
A Question-answering Based Framework for Relation Extraction Validation
Cheng Jiayang
Haiyun Jiang
Deqing Yang
Yanghua Xiao
17
11
0
07 Apr 2021
Creativity and Machine Learning: A Survey
Giorgio Franceschelli
Mirco Musolesi
VLM
AI4CE
34
40
0
06 Apr 2021
Blow the Dog Whistle: A Chinese Dataset for Cant Understanding with Common Sense and World Knowledge
Canwen Xu
Wangchunshu Zhou
Tao Ge
Ke Xu
Julian McAuley
Furu Wei
23
16
0
06 Apr 2021
Extremely Low Footprint End-to-End ASR System for Smart Device
Zhifu Gao
Yiwu Yao
Shiliang Zhang
Jun Yang
Ming Lei
Ian McLoughlin
24
12
0
06 Apr 2021
CodeTrans: Towards Cracking the Language of Silicon's Code Through Self-Supervised Deep Learning and High Performance Computing
Ahmed Elnaggar
Wei Ding
Llion Jones
Tom Gibbs
Tamas B. Fehér
Christoph Angerer
Silvia Severini
Florian Matthes
B. Rost
28
72
0
06 Apr 2021
Automating Transfer Credit Assessment in Student Mobility -- A Natural Language Processing-based Approach
Dhivya Chandrasekaran
Vijay K. Mago
25
2
0
05 Apr 2021
Explainability-aided Domain Generalization for Image Classification
Robin M. Schmidt
FAtt
OOD
27
1
0
05 Apr 2021
MCL@IITK at SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation using Augmented Data, Signals, and Transformers
Rohan Gupta
Jay Mundra
Deepak Mahajan
Ashutosh Modi
22
3
0
04 Apr 2021
ReCAM@IITK at SemEval-2021 Task 4: BERT and ALBERT based Ensemble for Abstract Word Prediction
Abhishek Mittal
Ashutosh Modi
22
2
0
04 Apr 2021
Exploring the Role of BERT Token Representations to Explain Sentence Probing Results
Hosein Mohebbi
Ali Modarressi
Mohammad Taher Pilehvar
MILM
27
24
0
03 Apr 2021
Humor@IITK at SemEval-2021 Task 7: Large Language Models for Quantifying Humor and Offensiveness
Aishwarya Gupta
Avik Pal
Bholeshwar Khurana
Lakshay Tyagi
Ashutosh Modi
34
6
0
02 Apr 2021
Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems
Derek Chen
Howard Chen
Yi Yang
A. Lin
Zhou Yu
25
65
0
01 Apr 2021
Self-Supervised Euphemism Detection and Identification for Content Moderation
Wanzheng Zhu
Hongyu Gong
Rohan Bansal
Zachary Weinberg
Nicolas Christin
Giulia Fanti
S. Bhat
31
40
0
31 Mar 2021
Pre-training for low resource speech-to-intent applications
Pu Wang
Hugo Van hamme
14
4
0
30 Mar 2021
XRJL-HKUST at SemEval-2021 Task 4: WordNet-Enhanced Dual Multi-head Co-Attention for Reading Comprehension of Abstract Meaning
Yuxin Jiang
Ziyi Shou
Qijun Wang
Hao Wu
Fangzhen Lin
RALM
17
2
0
30 Mar 2021