1909.11942
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"
50 / 2,913 papers shown
Title
On the Importance of Local Information in Transformer Based Models
Madhura Pande
Aakriti Budhraja
Preksha Nema
Pratyush Kumar
Mitesh M. Khapra
33
2
0
13 Aug 2020
Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit
Zhen Zeng
Jianzong Wang
Ning Cheng
Jing Xiao
21
8
0
13 Aug 2020
Compression of Deep Learning Models for Text: A Survey
Manish Gupta
Puneet Agrawal
VLM
MedIm
AI4CE
22
115
0
12 Aug 2020
SemEval-2020 Task 8: Memotion Analysis -- The Visuo-Lingual Metaphor!
Chhavi Sharma
Deepesh Bhageria
W. Scott
Srinivas Pykl
A. Das
Tanmoy Chakraborty
Viswanath Pulabaigari
Björn Gambäck
20
169
0
09 Aug 2020
SemEval-2020 Task 10: Emphasis Selection for Written Text in Visual Media
Amirreza Shirani
Franck Dernoncourt
Nedim Lipka
P. Asente
J. Echevarria
Thamar Solorio
23
21
0
07 Aug 2020
ConvBERT: Improving BERT with Span-based Dynamic Convolution
Zihang Jiang
Weihao Yu
Daquan Zhou
Yunpeng Chen
Jiashi Feng
Shuicheng Yan
43
157
0
06 Aug 2020
Aligning AI With Shared Human Values
Dan Hendrycks
Collin Burns
Steven Basart
Andrew Critch
Jingkai Li
D. Song
Jacob Steinhardt
63
522
0
05 Aug 2020
DeLighT: Deep and Light-weight Transformer
Sachin Mehta
Marjan Ghazvininejad
Srini Iyer
Luke Zettlemoyer
Hannaneh Hajishirzi
VLM
33
32
0
03 Aug 2020
SemEval-2020 Task 5: Counterfactual Recognition
Xiaoyu Yang
Stephen Obadinma
Huasha Zhao
Qiong Zhang
Stan Matwin
Xiao-Dan Zhu
11
42
0
02 Aug 2020
A Survey on Text Classification: From Shallow to Deep Learning
Qian Li
Hao Peng
Jianxin Li
Congyin Xia
Renyu Yang
Lichao Sun
Philip S. Yu
Lifang He
VLM
28
329
0
02 Aug 2020
On Learning Universal Representations Across Languages
Xiangpeng Wei
Rongxiang Weng
Yue Hu
Luxi Xing
Heng Yu
Weihua Luo
SSL
VLM
35
85
0
31 Jul 2020
Language Modelling for Source Code with Transformer-XL
Thomas D. Dowdell
Hongyu Zhang
8
8
0
31 Jul 2020
Deep Learning Brasil -- NLP at SemEval-2020 Task 9: Overview of Sentiment Analysis of Code-Mixed Tweets
Manoel Veríssimo dos Santos Neto
Ayrton Amaral
Nádia Félix F. da Silva
A. S. Soares
16
4
0
28 Jul 2020
TensorCoder: Dimension-Wise Attention via Tensor Representation for Natural Language Modeling
Shuai Zhang
Peng Zhang
Xindian Ma
Junqiu Wei
Ning Wang
Qun Liu
19
5
0
28 Jul 2020
ECNU-SenseMaker at SemEval-2020 Task 4: Leveraging Heterogeneous Knowledge Resources for Commonsense Validation and Explanation
Qiang Zhao
Siyu Tao
Jie Zhou
Linlin Wang
Xin Lin
Liang He
37
8
0
28 Jul 2020
BUT-FIT at SemEval-2020 Task 5: Automatic detection of counterfactual statements with deep pre-trained language representation models
Martin Fajcik
Josef Jon
Martin Docekal
Pavel Smrz
22
11
0
28 Jul 2020
Variants of BERT, Random Forests and SVM approach for Multimodal Emotion-Target Sub-challenge
Hoang Manh Hung
Hyung-Jeong Yang
Soohyung Kim
Gueesang Lee
15
0
0
28 Jul 2020
Public Sentiment Toward Solar Energy: Opinion Mining of Twitter Using a Transformer-Based Language Model
Serena Y Kim
K. Ganesan
P. Dickens
S. Panda
28
58
0
27 Jul 2020
Self-supervised Learning for Large-scale Item Recommendations
Tiansheng Yao
Xinyang Yi
D. Cheng
Felix X. Yu
Ting-Li Chen
...
Lichan Hong
Ed H. Chi
S. Tjoa
Jieqi Kang
Evan Ettinger
SSL
25
48
0
25 Jul 2020
Multi-task learning for natural language processing in the 2020s: where are we going?
Joseph Worsham
Jugal Kalita
AIMat
24
76
0
22 Jul 2020
XD at SemEval-2020 Task 12: Ensemble Approach to Offensive Language Identification in Social Media Using Transformer Encoders
Xiangjue Dong
Jinho Choi
10
1
0
21 Jul 2020
CS-NET at SemEval-2020 Task 4: Siamese BERT for ComVE
S. Dash
Sandeep K. Routray
P. Varshney
Ashutosh Modi
28
3
0
21 Jul 2020
PanRep: Graph neural networks for extracting universal node embeddings in heterogeneous graphs
V. Ioannidis
Da Zheng
George Karypis
SSL
22
4
0
20 Jul 2020
Mono vs Multilingual Transformer-based Models: a Comparison across Several Language Tasks
Diego de Vargas Feijó
V. Moreira
MILM
19
7
0
19 Jul 2020
Fighting the COVID-19 Infodemic in Social Media: A Holistic Perspective and a Call to Arms
Firoj Alam
Fahim Dalvi
Shaden Shaar
Nadir Durrani
Hamdy Mubarak
...
Giovanni Da San Martino
Ahmed Abdelali
Hassan Sajjad
Kareem Darwish
Preslav Nakov
22
102
0
15 Jul 2020
COBE: Contextualized Object Embeddings from Narrated Instructional Video
Gedas Bertasius
Lorenzo Torresani
11
24
0
14 Jul 2020
ProtTrans: Towards Cracking the Language of Life's Code Through Self-Supervised Deep Learning and High Performance Computing
Ahmed Elnaggar
M. Heinzinger
Christian Dallago
Ghalia Rehawi
Yu Wang
...
Tamas B. Fehér
Christoph Angerer
Martin Steinegger
D. Bhowmik
B. Rost
DRL
20
917
0
13 Jul 2020
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech
Andy T. Liu
Shang-Wen Li
Hung-yi Lee
SSL
62
356
0
12 Jul 2020
HyperGrid: Efficient Multi-Task Transformers with Grid-wise Decomposable Hyper Projections
Yi Tay
Zhe Zhao
Dara Bahri
Donald Metzler
Da-Cheng Juan
48
9
0
12 Jul 2020
Deep or Simple Models for Semantic Tagging? It Depends on your Data [Experiments]
Jinfeng Li
Yuliang Li
Xiaolan Wang
W. Tan
VLM
14
9
0
11 Jul 2020
Alleviating the Burden of Labeling: Sentence Generation by Attention Branch Encoder-Decoder Network
Tadashi Ogura
A. Magassouba
K. Sugiura
Tsubasa Hirakawa
Takayoshi Yamashita
H. Fujiyoshi
Hisashi Kawai
24
11
0
09 Jul 2020
IQ-VQA: Intelligent Visual Question Answering
Vatsal Goel
Mohit Chandak
A. Anand
Prithwijit Guha
28
5
0
08 Jul 2020
Remix: Rebalanced Mixup
Hsin-Ping Chou
Shih-Chieh Chang
Jia-Yu Pan
Wei Wei
Da-Cheng Juan
36
232
0
08 Jul 2020
Pre-Trained Models for Heterogeneous Information Networks
Yang Fang
Xiang Zhao
Yifan Chen
W. Xiao
Maarten de Rijke
SSL
43
1
0
07 Jul 2020
Efficient Conformal Prediction via Cascaded Inference with Expanded Admission
Adam Fisch
Tal Schuster
Tommi Jaakkola
Regina Barzilay
16
1
0
06 Jul 2020
LMVE at SemEval-2020 Task 4: Commonsense Validation and Explanation using Pretraining Language Model
Shilei Liu
Yu Guo
Bochao Li
Feiliang Ren
LRM
31
4
0
06 Jul 2020
Reading Comprehension in Czech via Machine Translation and Cross-lingual Transfer
K. Macková
Milan Straka
16
12
0
03 Jul 2020
SemEval-2020 Task 4: Commonsense Validation and Explanation
Cunxiang Wang
Shuailong Liang
Yili Jin
Yilong Wang
Xiao-Dan Zhu
Yue Zhang
LRM
25
98
0
01 Jul 2020
Transferability of Natural Language Inference to Biomedical Question Answering
Minbyul Jeong
Mujeen Sung
Gangwoo Kim
Donghyeon Kim
Wonjin Yoon
J. Yoo
Jaewoo Kang
19
38
0
01 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov
Nikoli Dryden
Tal Ben-Nun
Shigang Li
Torsten Hoefler
36
131
0
30 Jun 2020
SE3M: A Model for Software Effort Estimation Using Pre-trained Embedding Models
E. M. D. B. Fávero
Dalcimar Casanova
Andrey R. Pimentel
30
11
0
30 Jun 2020
Multi-Head Attention: Collaborate Instead of Concatenate
Jean-Baptiste Cordonnier
Andreas Loukas
Martin Jaggi
6
108
0
29 Jun 2020
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
Angelos Katharopoulos
Apoorv Vyas
Nikolaos Pappas
François Fleuret
65
1,680
0
29 Jun 2020
BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
Chen Liang
Yue Yu
Haoming Jiang
Siawpeng Er
Ruijia Wang
T. Zhao
Chao Zhang
OffRL
19
234
0
28 Jun 2020
Video-Grounded Dialogues with Pretrained Generation Language Models
Hung Le
Guosheng Lin
34
28
0
27 Jun 2020
BERTology Meets Biology: Interpreting Attention in Protein Language Models
Jesse Vig
Ali Madani
Lav Varshney
Caiming Xiong
R. Socher
Nazneen Rajani
31
289
0
26 Jun 2020
Train and You'll Miss It: Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings
Mayee F. Chen
Daniel Y. Fu
Frederic Sala
Sen Wu
Ravi Teja Mullapudi
Fait Poms
Kayvon Fatahalian
Christopher Ré
27
10
0
26 Jun 2020
The Depth-to-Width Interplay in Self-Attention
Yoav Levine
Noam Wies
Or Sharir
Hofit Bata
Amnon Shashua
30
45
0
22 Jun 2020
MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of Gradients
Chenfei Zhu
Yu Cheng
Zhe Gan
Furong Huang
Jingjing Liu
Tom Goldstein
ODL
35
2
0
21 Jun 2020
SqueezeBERT: What can computer vision teach NLP about efficient neural networks?
F. Iandola
Albert Eaton Shaw
Ravi Krishna
Kurt Keutzer
VLM
28
127
0
19 Jun 2020