ResearchTrend.AI
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
    SSL
    AIMat

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,918 papers shown
SVG-Net: An SVG-based Trajectory Prediction Model
Mohammadhossein Bahari
Vahid Zehtab
Sadegh Khorasani
Sana Ayromlou
Saeed Saadatnejad
Alexandre Alahi
3DPC
29
3
0
07 Oct 2021
Noisy Text Data: Achilles' Heel of popular transformer based NLP models
Kartikay Bagla
Ankit Kumar
Shivam Gupta
Anuj Gupta
34
5
0
07 Oct 2021
Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees
Aleksandr Beznosikov
Peter Richtárik
Michael Diskin
Max Ryabinin
Alexander Gasnikov
FedML
30
21
0
07 Oct 2021
GNN is a Counter? Revisiting GNN for Question Answering
Kuan-Chieh Wang
Yuyu Zhang
Diyi Yang
Le Song
Tao Qin
LMTD
34
30
0
07 Oct 2021
A Comparative Study of Transformer-Based Language Models on Extractive Question Answering
Kate Pearce
Tiffany Zhan
Aneesh Komanduri
J. Zhan
ELM
35
33
0
07 Oct 2021
Weakly-supervised Text Classification Based on Keyword Graph
Lu Zhang
Jiandong Ding
Yi Xu
Yingyao Liu
Shuigeng Zhou
OffRL
OOD
44
56
0
06 Oct 2021
KNN-BERT: Fine-Tuning Pre-Trained Models with KNN Classifier
Linyang Li
Demin Song
Ruotian Ma
Xipeng Qiu
Xuanjing Huang
40
21
0
06 Oct 2021
PoNet: Pooling Network for Efficient Token Mixing in Long Sequences
Chao-Hong Tan
Qian Chen
Wen Wang
Qinglin Zhang
Siqi Zheng
Zhenhua Ling
ViT
27
11
0
06 Oct 2021
BERT Attends the Conversation: Improving Low-Resource Conversational ASR
Pablo Ortiz
Simen Burud
39
4
0
05 Oct 2021
Is Attention always needed? A Case Study on Language Identification from Speech
A. Mandal
Santanu Pal
Indranil Dutta
Mahidas Bhattacharya
S. Naskar
32
6
0
05 Oct 2021
Investigating the Impact of Pre-trained Language Models on Dialog Evaluation
Chen Zhang
L. F. D’Haro
Yiming Chen
Thomas Friedrichs
Haizhou Li
27
5
0
05 Oct 2021
A Survey On Neural Word Embeddings
Erhan Sezerer
Selma Tekir
AI4TS
43
12
0
05 Oct 2021
Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics
Prajjwal Bhargava
Aleksandr Drozd
Anna Rogers
104
103
0
04 Oct 2021
Revisiting Self-Training for Few-Shot Learning of Language Model
Yiming Chen
Yan Zhang
Chen Zhang
Grandee Lee
Ran Cheng
Haizhou Li
30
42
0
04 Oct 2021
Text-based automatic personality prediction: A bibliographic review
Ali Reza Feizi Derakhshi
M. Feizi-Derakhshi
Majid Ramezani
Narjes Nikzad Khasmakhi
M. Asgari-Chenaghlu
Taymaz Akan
Mehrdad Ranjbar-Khadivi
Elnaz Zafarni-Moattar
Zoleikha Jahanbakhsh-Naghadeh
50
20
0
04 Oct 2021
Scheduling Optimization Techniques for Neural Network Training
Hyungjun Oh
Junyeol Lee
HyeongJu Kim
Jiwon Seo
26
0
0
03 Oct 2021
Swiss-Judgment-Prediction: A Multilingual Legal Judgment Prediction Benchmark
Joel Niklaus
Ilias Chalkidis
Matthias Sturmer
ELM
AILaw
22
69
0
02 Oct 2021
Towards Efficient Post-training Quantization of Pre-trained Language Models
Haoli Bai
Lu Hou
Lifeng Shang
Xin Jiang
Irwin King
Michael R. Lyu
MQ
82
47
0
30 Sep 2021
Structural Persistence in Language Models: Priming as a Window into Abstract Language Representations
Arabella J. Sinclair
Jaap Jumelet
Willem H. Zuidema
Raquel Fernández
63
38
0
30 Sep 2021
Prose2Poem: The Blessing of Transformers in Translating Prose to Persian Poetry
Reza Khanmohammadi
Mitra Sadat Mirshafiee
Yazdan Rezaee Jouryabi
Seyed Abolghasem Mirroshandel
23
7
0
30 Sep 2021
Using Pause Information for More Accurate Entity Recognition
Sahas Dendukuri
Pooja Chitkara
Joel Ruben Antony Moniz
Xiao Yang
M. Tsagkias
S. Pulman
26
5
0
27 Sep 2021
Context-guided Triple Matching for Multiple Choice Question Answering
Xun Yao
Junlong Ma
Xinrong Hu
Junping Liu
Jie Yang
Wanqing Li
29
2
0
27 Sep 2021
Pragmatic competence of pre-trained language models through the lens of discourse connectives
Lalchand Pandia
Yan Cong
Allyson Ettinger
14
25
0
27 Sep 2021
Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
30
135
0
27 Sep 2021
Click-through Rate Prediction with Auto-Quantized Contrastive Learning
Yujie Pan
Jiangchao Yao
Bo Han
Kunyang Jia
Ya Zhang
Hongxia Yang
MQ
39
18
0
27 Sep 2021
Multiplicative Position-aware Transformer Models for Language Understanding
Zhiheng Huang
Davis Liang
Peng Xu
Bing Xiang
17
1
0
27 Sep 2021
On the Prunability of Attention Heads in Multilingual BERT
Aakriti Budhraja
Madhura Pande
Pratyush Kumar
Mitesh M. Khapra
57
4
0
26 Sep 2021
Improving Question Answering Performance Using Knowledge Distillation and Active Learning
Yasaman Boreshban
Seyed Morteza Mirbostani
Gholamreza Ghassem-Sani
Seyed Abolghasem Mirroshandel
Shahin Amiriparian
49
15
0
26 Sep 2021
Self-Supervised Video Representation Learning by Video Incoherence Detection
Haozhi Cao
Yuecong Xu
Jianfei Yang
K. Mao
Lihua Xie
Jianxiong Yin
Simon See
SSL
33
6
0
26 Sep 2021
MINIMAL: Mining Models for Data Free Universal Adversarial Triggers
Swapnil Parekh
Yaman Kumar Singla
Somesh Singh
Changyou Chen
Balaji Krishnamurthy
R. Shah
AAML
32
3
0
25 Sep 2021
DziriBERT: a Pre-trained Language Model for the Algerian Dialect
Amine Abdaoui
Mohamed Berrimi
Mourad Oussalah
A. Moussaoui
37
44
0
25 Sep 2021
Finetuning Transformer Models to Build ASAG System
Mithun Thakkar
8
2
0
25 Sep 2021
Monolingual and Cross-Lingual Acceptability Judgments with the Italian CoLA corpus
Daniela Trotta
R. Guarasci
Elisa Leonardelli
Sara Tonelli
55
30
0
24 Sep 2021
DACT-BERT: Differentiable Adaptive Computation Time for an Efficient BERT Inference
Cristobal Eyzaguirre
Felipe del-Rio
Vladimir Araujo
Alvaro Soto
16
7
0
24 Sep 2021
Finding a Balanced Degree of Automation for Summary Evaluation
Shiyue Zhang
Joey Tianyi Zhou
55
43
0
23 Sep 2021
Automated Fact-Checking: A Survey
Xia Zeng
Amani S. Abumansour
A. Zubiaga
HILM
201
96
0
23 Sep 2021
BiRdQA: A Bilingual Dataset for Question Answering on Tricky Riddles
Yunxiang Zhang
Xiaojun Wan
68
12
0
23 Sep 2021
Small-Bench NLP: Benchmark for small single GPU trained models in Natural Language Processing
K. Kanakarajan
Bhuvana Kundumani
Malaikannan Sankarasubbu
ALM
MoE
19
5
0
22 Sep 2021
Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
Yi Tay
Mostafa Dehghani
J. Rao
W. Fedus
Samira Abnar
Hyung Won Chung
Sharan Narang
Dani Yogatama
Ashish Vaswani
Donald Metzler
210
112
0
22 Sep 2021
Distilling Relation Embeddings from Pre-trained Language Models
Asahi Ushio
Jose Camacho-Collados
Steven Schockaert
32
21
0
21 Sep 2021
Survey: Transformer based Video-Language Pre-training
Ludan Ruan
Qin Jin
VLM
ViT
72
44
0
21 Sep 2021
BERT Has Uncommon Sense: Similarity Ranking for Word Sense BERTology
Luke Gessler
Nathan Schneider
49
7
0
20 Sep 2021
Dyadformer: A Multi-modal Transformer for Long-Range Modeling of Dyadic Interactions
D. Curto
Albert Clapés
Javier Selva
Sorina Smeureanu
Julio C. S. Jacques Junior
...
G. Guilera
D. Leiva
T. Moeslund
Sergio Escalera
Cristina Palmero
51
29
0
20 Sep 2021
Commonsense Knowledge in Word Associations and ConceptNet
Chunhua Liu
Trevor Cohn
Lea Frermann
57
8
0
20 Sep 2021
CLIFF: Contrastive Learning for Improving Faithfulness and Factuality in Abstractive Summarization
Shuyang Cao
Lu Wang
HILM
39
178
0
19 Sep 2021
Navigating the Kaleidoscope of COVID-19 Misinformation Using Deep Learning
Yuanzhi Chen
Mohammad Rashedul Hasan
26
4
0
19 Sep 2021
Augmenting semantic lexicons using word embeddings and transfer learning
Thayer Alshaabi
C. V. Oort
M. Fudolig
M. V. Arnold
C. Danforth
P. Dodds
40
4
0
18 Sep 2021
Fine-Tuned Transformers Show Clusters of Similar Representations Across Layers
Jason Phang
Haokun Liu
Samuel R. Bowman
35
26
0
17 Sep 2021
Distilling Linguistic Context for Language Model Compression
Geondo Park
Gyeongman Kim
Eunho Yang
48
38
0
17 Sep 2021
Task-adaptive Pre-training of Language Models with Word Embedding Regularization
Kosuke Nishida
Kyosuke Nishida
Sen Yoshida
VLM
50
8
0
17 Sep 2021