ResearchTrend.AI

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
    SSL
    AIMat

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,916 papers shown
Matching with Transformers in MELT
S. Hertling
Jan Portisch
Heiko Paulheim
44
9
0
15 Sep 2021
Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training
Bo Zheng
Li Dong
Shaohan Huang
Saksham Singhal
Wanxiang Che
Ting Liu
Xia Song
Furu Wei
VLM
26
22
0
15 Sep 2021
EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation
Chenhe Dong
Guangrun Wang
Hang Xu
Jiefeng Peng
Xiaozhe Ren
Xiaodan Liang
41
28
0
15 Sep 2021
Transformer-based Language Models for Factoid Question Answering at BioASQ9b
Urvashi Khanna
Diego Mollá Aliod
41
5
0
15 Sep 2021
Incorporating Residual and Normalization Layers into Analysis of Masked Language Models
Goro Kobayashi
Tatsuki Kuribayashi
Sho Yokoi
Kentaro Inui
171
46
0
15 Sep 2021
Towards Document-Level Paraphrase Generation with Sentence Rewriting and Reordering
Zhe Lin
Yitao Cai
Xiaojun Wan
45
13
0
15 Sep 2021
Will this Question be Answered? Question Filtering via Answer Model Distillation for Efficient Question Answering
Siddhant Garg
Alessandro Moschitti
34
26
0
14 Sep 2021
Explainable Identification of Dementia from Transcripts using Transformer Networks
Loukas Ilias
D. Askounis
31
39
0
14 Sep 2021
ePiC: Employing Proverbs in Context as a Benchmark for Abstract Language Understanding
Sayan Ghosh
Shashank Srivastava
29
11
0
14 Sep 2021
STraTA: Self-Training with Task Augmentation for Better Few-shot Learning
Tu Vu
Minh-Thang Luong
Quoc V. Le
Grady Simon
Mohit Iyyer
131
61
0
13 Sep 2021
Packed Levitated Marker for Entity and Relation Extraction
Deming Ye
Yankai Lin
Peng Li
Maosong Sun
146
106
0
13 Sep 2021
Compute and Energy Consumption Trends in Deep Learning Inference
Radosvet Desislavov
Fernando Martínez-Plumed
José Hernández-Orallo
35
113
0
12 Sep 2021
"Let Your Characters Tell Their Story": A Dataset for Character-Centric Narrative Understanding
Faeze Brahman
Meng Huang
Oyvind Tafjord
Chao Zhao
Mrinmaya Sachan
Snigdha Chaturvedi
32
53
0
12 Sep 2021
FBERT: A Neural Transformer for Identifying Offensive Content
Diptanu Sarkar
Marcos Zampieri
Tharindu Ranasinghe
Alexander Ororbia
VLM
41
55
0
10 Sep 2021
Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training
Yu Meng
Yunyi Zhang
Jiaxin Huang
Xuan Wang
Yu Zhang
Heng Ji
Jiawei Han
51
71
0
10 Sep 2021
On the validity of pre-trained transformers for natural language processing in the software engineering domain
Julian von der Mosel
Alexander Trautsch
Steffen Herbold
45
67
0
10 Sep 2021
Knowledge-Aware Meta-learning for Low-Resource Text Classification
Huaxiu Yao
Yingxin Wu
Maruan Al-Shedivat
Eric Xing
VLM
CLIP
72
11
0
10 Sep 2021
EfficientCLIP: Efficient Cross-Modal Pre-training by Ensemble Confident Learning and Language Modeling
Jue Wang
Haofan Wang
Jincan Deng
Weijia Wu
Debing Zhang
VLM
CLIP
72
19
0
10 Sep 2021
Query-driven Segment Selection for Ranking Long Documents
Youngwoo Kim
Razieh Rahimi
Hamed Bonab
James Allan
RALM
30
5
0
10 Sep 2021
Augmenting BERT-style Models with Predictive Coding to Improve Discourse-level Representations
Vladimir Araujo
Andrés Villa
Marcelo Mendoza
Marie-Francine Moens
Alvaro Soto
44
7
0
10 Sep 2021
Is Attention Better Than Matrix Decomposition?
Zhengyang Geng
Meng-Hao Guo
Hongxu Chen
Xia Li
Ke Wei
Zhouchen Lin
62
139
0
09 Sep 2021
Bag of Tricks for Optimizing Transformer Efficiency
Ye Lin
Yanyang Li
Tong Xiao
Jingbo Zhu
36
6
0
09 Sep 2021
Graph Based Network with Contextualized Representations of Turns in Dialogue
Bongseok Lee
Y. Choi
66
69
0
09 Sep 2021
What's Hidden in a One-layer Randomly Weighted Transformer?
Sheng Shen
Z. Yao
Douwe Kiela
Kurt Keutzer
Michael W. Mahoney
39
4
0
08 Sep 2021
A Bayesian Framework for Information-Theoretic Probing
Tiago Pimentel
Ryan Cotterell
35
24
0
08 Sep 2021
Self- and Pseudo-self-supervised Prediction of Speaker and Key-utterance for Multi-party Dialogue Reading Comprehension
Yiyang Li
Hai Zhao
32
23
0
08 Sep 2021
Label Verbalization and Entailment for Effective Zero- and Few-Shot Relation Extraction
Oscar Sainz
Oier López de Lacalle
Gorka Labaka
Ander Barrena
Eneko Agirre
16
117
0
08 Sep 2021
ArchivalQA: A Large-scale Benchmark Dataset for Open Domain Question Answering over Historical News Collections
Jiexin Wang
Adam Jatowt
Masatoshi Yoshikawa
52
33
0
08 Sep 2021
Self-supervised Contrastive Cross-Modality Representation Learning for Spoken Question Answering
Chenyu You
Nuo Chen
Yuexian Zou
SSL
32
63
0
08 Sep 2021
NumGPT: Improving Numeracy Ability of Generative Pre-trained Models
Zhihua Jin
Xin Jiang
Xingbo Wang
Qun Liu
Yong Wang
Xiaozhe Ren
Huamin Qu
24
19
0
07 Sep 2021
IndicBART: A Pre-trained Model for Indic Natural Language Generation
Raj Dabre
Himani Shrotriya
Anoop Kunchukuttan
Ratish Puduppully
Mitesh M. Khapra
Pratyush Kumar
57
71
0
07 Sep 2021
Sent2Span: Span Detection for PICO Extraction in the Biomedical Text without Span Annotations
Shifeng Liu
Yifang Sun
Bing Li
Wei Wang
Florence T. Bourgeois
A. Dunn
24
14
0
06 Sep 2021
STaCK: Sentence Ordering with Temporal Commonsense Knowledge
Deepanway Ghosal
Navonil Majumder
Rada Mihalcea
Soujanya Poria
50
11
0
06 Sep 2021
Re-entry Prediction for Online Conversations via Self-Supervised Learning
Lingzhi Wang
Xingshan Zeng
Huang Hu
Kam-Fai Wong
Daxin Jiang
40
6
0
05 Sep 2021
FewshotQA: A simple framework for few-shot learning of question answering tasks using pre-trained text-to-text models
Rakesh Chada
P. Natarajan
41
45
0
04 Sep 2021
Frustratingly Simple Pretraining Alternatives to Masked Language Modeling
Atsuki Yamaguchi
G. Chrysostomou
Katerina Margatina
Nikolaos Aletras
32
25
0
04 Sep 2021
Do Prompt-Based Models Really Understand the Meaning of their Prompts?
Albert Webson
Ellie Pavlick
LRM
66
359
0
02 Sep 2021
So Cloze yet so Far: N400 Amplitude is Better Predicted by Distributional Information than Human Predictability Judgements
J. Michaelov
S. Coulson
Benjamin Bergen
24
44
0
02 Sep 2021
Fight Fire with Fire: Fine-tuning Hate Detectors using Large Samples of Generated Hate Speech
Tomer Wullach
A. Adler
Einat Minkov
11
41
0
01 Sep 2021
Does Knowledge Help General NLU? An Empirical Study
Ruochen Xu
Yuwei Fang
Chenguang Zhu
Michael Zeng
ELM
34
9
0
01 Sep 2021
What Have Been Learned & What Should Be Learned? An Empirical Study of How to Selectively Augment Text for Classification
Biyang Guo
S. Han
Hailiang Huang
19
5
0
01 Sep 2021
It's not Rocket Science : Interpreting Figurative Language in Narratives
Tuhin Chakrabarty
Yejin Choi
Vered Shwartz
29
55
0
31 Aug 2021
Effectiveness of Deep Networks in NLP using BiDAF as an example architecture
Soumyendu Sarkar
34
2
0
31 Aug 2021
Thermostat: A Large Collection of NLP Model Explanations and Analysis Tools
Nils Feldhus
Robert Schwarzenberg
Sebastian Möller
37
14
0
31 Aug 2021
DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion
Wei Niu
Jiexiong Guan
Yanzhi Wang
G. Agrawal
Bin Ren
AI4CE
35
147
0
30 Aug 2021
ASR-GLUE: A New Multi-task Benchmark for ASR-Robust Natural Language Understanding
Lingyun Feng
Jianwei Yu
Deng Cai
Songxiang Liu
Haitao Zheng
Yan Wang
ELM
79
14
0
30 Aug 2021
Shatter: An Efficient Transformer Encoder with Single-Headed Self-Attention and Relative Sequence Partitioning
Ran Tian
Joshua Maynez
Ankur P. Parikh
ViT
42
2
0
30 Aug 2021
Generating Answer Candidates for Quizzes and Answer-Aware Question Generators
Kristiyan Vachev
Momchil Hardalov
Georgi Karadzhov
Georgi Georgiev
Ivan Koychev
Preslav Nakov
AI4Ed
31
5
0
29 Aug 2021
Span Fine-tuning for Pre-trained Language Models
Rongzhou Bao
Zhuosheng Zhang
Hai Zhao
19
2
0
29 Aug 2021
Analyzing and Mitigating Interference in Neural Architecture Search
Jin Xu
Xu Tan
Kaitao Song
Renqian Luo
Yichong Leng
Tao Qin
Tie-Yan Liu
Jian Li
MoMe
39
29
0
29 Aug 2021