1907.11692
Cited By
RoBERTa: A Robustly Optimized BERT Pretraining Approach
26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
Papers citing
"RoBERTa: A Robustly Optimized BERT Pretraining Approach"
50 / 10,878 papers shown
Title
Debiasing isn't enough! -- On the Effectiveness of Debiasing MLMs and their Social Biases in Downstream Tasks
Masahiro Kaneko
Danushka Bollegala
Naoaki Okazaki
102
46
0
06 Oct 2022
Binding Language Models in Symbolic Languages
Zhoujun Cheng
Tianbao Xie
Peng Shi
Chengzu Li
Rahul Nadkarni
...
Dragomir R. Radev
Mari Ostendorf
Luke Zettlemoyer
Noah A. Smith
Tao Yu
LMTD
263
215
0
06 Oct 2022
Time Will Change Things: An Empirical Study on Dynamic Language Understanding in Social Media Classification
Yuji Zhang
Jing Li
85
5
0
06 Oct 2022
XDoc: Unified Pre-training for Cross-Format Document Understanding
Jingye Chen
Tengchao Lv
Lei Cui
Changrong Zhang
Furu Wei
108
14
0
06 Oct 2022
Just ClozE! A Novel Framework for Evaluating the Factual Consistency Faster in Abstractive Summarization
Yiyang Li
Lei Li
Marina Litvak
N. Vanetik
Dingxing Hu
Yuze Li
Yanquan Zhou
HILM
95
0
0
06 Oct 2022
Revisiting Structured Dropout
Yiren Zhao
Oluwatomisin Dada
Xitong Gao
Robert D. Mullins
BDL
77
2
0
05 Oct 2022
"No, they did not": Dialogue response dynamics in pre-trained language models
Sanghee Kim
Lang-Chi Yu
Allyson Ettinger
57
1
0
05 Oct 2022
GAPX: Generalized Autoregressive Paraphrase-Identification X
Yi Zhou
Renyu Li
Hayden Housen
Ser-Nam Lim
BDL
79
0
0
05 Oct 2022
Recitation-Augmented Language Models
Zhiqing Sun
Xuezhi Wang
Yi Tay
Yiming Yang
Denny Zhou
RALM
281
65
0
04 Oct 2022
Understanding Prior Bias and Choice Paralysis in Transformer-based Language Representation Models through Four Experimental Probes
Ke Shen
Mayank Kejriwal
89
4
0
03 Oct 2022
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
136
250
0
03 Oct 2022
Community Learning: Understanding A Community Through NLP for Positive Impact
Towhid Chowdhury
Naveen Sharma
48
0
0
02 Oct 2022
ReAct: A Review Comment Dataset for Actionability (and more)
G. Choudhary
Natwar Modani
Nitish Maurya
54
3
0
02 Oct 2022
Differentially Private Optimization on Large Model at Small Cost
Zhiqi Bu
Yu Wang
Sheng Zha
George Karypis
130
55
0
30 Sep 2022
Differentially Private Bias-Term Fine-tuning of Foundation Models
Zhiqi Bu
Yu Wang
Sheng Zha
George Karypis
146
48
0
30 Sep 2022
Relative representations enable zero-shot latent space communication
Luca Moschella
Valentino Maiorca
Marco Fumero
Antonio Norelli
Francesco Locatello
Emanuele Rodolà
130
109
0
30 Sep 2022
Construction and Applications of Billion-Scale Pre-Trained Multimodal Business Knowledge Graph
Shumin Deng
Chengming Wang
Zhoubo Li
Ningyu Zhang
Zelin Dai
...
Mosha Chen
Jiaoyan Chen
Jeff Z. Pan
Bryan Hooi
Huajun Chen
VLM
117
22
0
30 Sep 2022
What Makes Pre-trained Language Models Better Zero-shot Learners?
Jinghui Lu
Dongsheng Zhu
Weidong Han
Rui Zhao
Brian Mac Namee
Fei Tan
104
24
0
30 Sep 2022
Self-Distillation for Further Pre-training of Transformers
Seanie Lee
Minki Kang
Juho Lee
Sung Ju Hwang
Kenji Kawaguchi
118
8
0
30 Sep 2022
Depth-Wise Attention (DWAtt): A Layer Fusion Method for Data-Efficient Classification
Muhammad N. ElNokrashy
Badr AlKhamissi
Mona T. Diab
MoMe
90
5
0
30 Sep 2022
PART: Pre-trained Authorship Representation Transformer
Javier Huertas-Tato
Álvaro Huertas-García
Alejandro Martín
137
9
0
30 Sep 2022
How to tackle an emerging topic? Combining strong and weak labels for Covid news NER
Aleksander Ficek
Fangyu Liu
Nigel Collier
79
1
0
29 Sep 2022
Improving Molecular Pretraining with Complementary Featurizations
Yanqiao Zhu
Dingshuo Chen
Yuanqi Du
Yingze Wang
Qiang Liu
Shu Wu
AI4CE
82
7
0
29 Sep 2022
Few-shot Text Classification with Dual Contrastive Consistency
Liwen Sun
Jiawei Han
59
0
0
29 Sep 2022
NAG-GS: Semi-Implicit, Accelerated and Robust Stochastic Optimizer
Valentin Leplat
D. Merkulov
Aleksandr Katrutsa
Daniel Bershatsky
Olga Tsymboi
Ivan Oseledets
137
3
0
29 Sep 2022
polyBERT: A chemical language model to enable fully machine-driven ultrafast polymer informatics
Christopher Kuenneth
R. Ramprasad
109
113
0
29 Sep 2022
Make-A-Video: Text-to-Video Generation without Text-Video Data
Uriel Singer
Adam Polyak
Thomas Hayes
Xiaoyue Yin
Jie An
...
Oron Ashual
Oran Gafni
Devi Parikh
Sonal Gupta
Yaniv Taigman
DiffM
VGen
147
1,441
0
29 Sep 2022
Perturbations and Subpopulations for Testing Robustness in Token-Based Argument Unit Recognition
Jonathan Kamp
Lisa Beinborn
Antske Fokkens
52
0
0
29 Sep 2022
Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts
Timo Spinde
Manuel Plank
Jan-David Krieger
Terry Ruas
Bela Gipp
Akiko Aizawa
72
76
0
29 Sep 2022
Bidirectional Language Models Are Also Few-shot Learners
Ajay Patel
Bryan Li
Mohammad Sadegh Rasooli
Noah Constant
Colin Raffel
Chris Callison-Burch
LRM
140
47
0
29 Sep 2022
Multi-stage Information Retrieval for Vietnamese Legal Texts
Nhat-Minh Pham
Nguyen Ha Thanh
Trong-Hop Do
AILaw
46
3
0
29 Sep 2022
Downstream Datasets Make Surprisingly Good Pretraining Corpora
Kundan Krishna
Saurabh Garg
Jeffrey P. Bigham
Zachary Chase Lipton
108
33
0
28 Sep 2022
Audio Barlow Twins: Self-Supervised Audio Representation Learning
Jonah Anton
H. Coppock
Pancham Shukla
Bjorn W. Schuller
BDL
SSL
85
8
0
28 Sep 2022
Causal Proxy Models for Concept-Based Model Explanations
Zhengxuan Wu
Karel D'Oosterlinck
Atticus Geiger
Amir Zur
Christopher Potts
MILM
134
37
0
28 Sep 2022
Audio Retrieval with WavText5K and CLAP Training
Soham Deshmukh
Benjamin Elizalde
Huaming Wang
3DV
CLIP
187
53
0
28 Sep 2022
TVLT: Textless Vision-Language Transformer
Zineng Tang
Jaemin Cho
Yixin Nie
Joey Tianyi Zhou
VLM
137
31
0
28 Sep 2022
CEFER: A Four Facets Framework based on Context and Emotion embedded features for Implicit and Explicit Emotion Recognition
Fereshte Khoshnam
Ahmad Baraani-Dastjerdi
M. J. Liaghatdar
71
0
0
28 Sep 2022
YATO: Yet Another deep learning based Text analysis Open toolkit
Zeqiang Wang
Yile Wang
Jiageng Wu
Zhiyang Teng
Jie Yang
115
3
0
28 Sep 2022
Using contradictions improves question answering systems
Étienne Fortier-Dubois
Domenic Rosati
102
0
0
28 Sep 2022
METS-CoV: A Dataset of Medical Entity and Targeted Sentiment on COVID-19 Related Tweets
Peilin Zhou
Zeqiang Wang
Dading Chong
Zhijiang Guo
Yining Hua
Zichang Su
Zhiyang Teng
Jiageng Wu
Jie Yang
85
23
0
28 Sep 2022
Information Extraction and Human-Robot Dialogue towards Real-life Tasks: A Baseline Study with the MobileCS Dataset
Hong Liu
Hao Peng
Zhijian Ou
Juan-Zi Li
Yi Huang
Junlan Feng
101
7
0
27 Sep 2022
Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models
Xiuying Wei
Yunchen Zhang
Xiangguo Zhang
Ruihao Gong
Shanghang Zhang
Qi Zhang
F. Yu
Xianglong Liu
MQ
164
153
0
27 Sep 2022
Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding
Yang Jin
Yongzhi Li
Zehuan Yuan
Yadong Mu
83
34
0
27 Sep 2022
Regularized Contrastive Learning of Semantic Search
Mingxi Tan
Alexis Rolland
Andong Tian
65
0
0
27 Sep 2022
A general-purpose material property data extraction pipeline from large polymer corpora using Natural Language Processing
Pranav Shetty
Arunkumar Chitteth Rajan
Christopher Kuenneth
Sonakshi Gupta
L. P. Panchumarti
Lauren Holm
Chaoran Zhang
R. Ramprasad
73
71
0
27 Sep 2022
Lex2Sent: A bagging approach to unsupervised sentiment analysis
Kai-Robin Lange
Jonas Rieger
Carsten Jentsch
SSL
42
2
0
26 Sep 2022
Word to Sentence Visual Semantic Similarity for Caption Generation: Lessons Learned
Ahmed Sabir
137
0
0
26 Sep 2022
Fast-FNet: Accelerating Transformer Encoder Models via Efficient Fourier Layers
Nurullah Sevim
Ege Ozan Özyedek
Furkan Şahinuç
Aykut Koç
95
12
0
26 Sep 2022
Do ever larger octopi still amplify reporting biases? Evidence from judgments of typical colour
Fangyu Liu
Julian Martin Eisenschlos
Jeremy R. Cole
Nigel Collier
96
4
0
26 Sep 2022
Text Summarization with Oracle Expectation
Yumo Xu
Mirella Lapata
VLM
79
4
0
26 Sep 2022