1907.11692
Cited By
RoBERTa: A Robustly Optimized BERT Pretraining Approach
26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
Papers citing
"RoBERTa: A Robustly Optimized BERT Pretraining Approach"
50 / 10,839 papers shown
Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation
Jiangyong Huang
William Zhu
Baoxiong Jia
Zan Wang
Xiaojian Ma
Qing Li
Siyuan Huang
123
5
0
28 Nov 2022
Revisiting Distance Metric Learning for Few-Shot Natural Language Classification
Witold Sosnowski
Anna Wróblewska
Karolina Seweryn
P. Gawrysiak
63
0
0
28 Nov 2022
Large Pre-Trained Models with Extra-Large Vocabularies: A Contrastive Analysis of Hebrew BERT Models and a New One to Outperform Them All
Eylon Guetta
Avi Shmidman
Shaltiel Shmidman
C. Shmidman
Joshua Guedalia
Moshe Koppel
Dan Bareket
Amit Seker
Reut Tsarfaty
VLM
64
15
0
28 Nov 2022
Distance Metric Learning Loss Functions in Few-Shot Scenarios of Supervised Language Models Fine-Tuning
Witold Sosnowski
Karolina Seweryn
Anna Wróblewska
P. Gawrysiak
76
0
0
28 Nov 2022
Topic Segmentation in the Wild: Towards Segmentation of Semi-structured & Unstructured Chats
R. Ghosh
Harjeet Singh Kajal
Sharanya Kamath
Dhuri Shrivastava
S. Basu
Soundararajan Srinivasan
53
4
0
27 Nov 2022
Detect-Localize-Repair: A Unified Framework for Learning to Debug with CodeT5
Nghi D. Q. Bui
Yue Wang
Steven C. H. Hoi
82
17
0
27 Nov 2022
Understanding BLOOM: An empirical study on diverse NLP tasks
Parag Dakle
Sai Krishna Rallabandi
Preethi Raghavan
AI4CE
96
4
0
27 Nov 2022
Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models
Lei Wang
Jian He
Xingdong Xu
Ning Liu
Hui-juan Liu
90
2
0
27 Nov 2022
MNER-QG: An End-to-End MRC framework for Multimodal Named Entity Recognition with Query Grounding
Meihuizi Jia
Lei Shen
Xin Shen
L. Liao
Meng Chen
Xiaodong He
Zhen-Heng Chen
Jiaqi Li
78
43
0
27 Nov 2022
Gender Biases Unexpectedly Fluctuate in the Pre-training Stage of Masked Language Models
Kenan Tang
Hanchun Jiang
AI4CE
96
1
0
26 Nov 2022
Asymmetric Cross-Scale Alignment for Text-Based Person Search
Zhong Ji
Junhua Hu
Deyin Liu
Yuan Wu
Ye Zhao
108
46
0
26 Nov 2022
Lexicon-injected Semantic Parsing for Task-Oriented Dialog
Xiaojun Meng
Wenlin Dai
Yasheng Wang
Baojun Wang
Zhiyong Wu
Xin Jiang
Qun Liu
66
2
0
26 Nov 2022
Transformer-based Model for Word Level Language Identification in Code-mixed Kannada-English Texts
A. Tonja
M. Yigezu
Olga Kolesnikova
Moein Shahiki Tash
Grigori Sidorov
Alexander Gelbukh
94
23
0
26 Nov 2022
Finetuning BERT on Partially Annotated NER Corpora
Viktor Scherbakov
V. Mayorov
33
1
0
25 Nov 2022
Picking on the Same Person: Does Algorithmic Monoculture lead to Outcome Homogenization?
Rishi Bommasani
Kathleen A. Creel
Ananya Kumar
Dan Jurafsky
Percy Liang
92
89
0
25 Nov 2022
Testing the effectiveness of saliency-based explainability in NLP using randomized survey-based experiments
Adel Rahimi
Shaurya Jain
FAtt
106
0
0
25 Nov 2022
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism
Xupeng Miao
Yujie Wang
Youhe Jiang
Chunan Shi
Xiaonan Nie
Hailin Zhang
Tengjiao Wang
GNN
MoE
110
64
0
25 Nov 2022
Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods
Xiang Dai
Sarvnaz Karimi
70
3
0
24 Nov 2022
Using Selective Masking as a Bridge between Pre-training and Fine-tuning
Tanish Lad
Himanshu Maheshwari
Shreyas Kottukkal
R. Mamidi
83
3
0
24 Nov 2022
Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning
Yatai Ji
Rong-Cheng Tu
Jie Jiang
Weijie Kong
Chengfei Cai
Wenzhe Zhao
Hongfa Wang
Yujiu Yang
Wei Liu
VLM
85
15
0
24 Nov 2022
A Report on the Euphemisms Detection Shared Task
Patrick Lee
Anna Feldman
J. Peng
96
9
0
23 Nov 2022
Proceedings of the 4th International Workshop on Reading Music Systems
Jorge Calvo-Zaragoza
Alexander Pacha
Elona Shatri
59
0
0
23 Nov 2022
Mask the Correct Tokens: An Embarrassingly Simple Approach for Error Correction
Kai Shen
Yichong Leng
Xuejiao Tan
Si-Qi Tang
Yuan Zhang
Wenjie Liu
Ed Lin
69
15
0
23 Nov 2022
TorchScale: Transformers at Scale
Shuming Ma
Hongyu Wang
Shaohan Huang
Wenhui Wang
Zewen Chi
...
Alon Benhaim
Barun Patra
Vishrav Chaudhary
Xia Song
Furu Wei
AI4CE
64
10
0
23 Nov 2022
Sarcasm Detection Framework Using Context, Emotion and Sentiment Features
O. Vitman
Ye. Kostiuk
Grigori Sidorov
Alexander Gelbukh
36
24
0
23 Nov 2022
Agent-Specific Deontic Modality Detection in Legal Language
Abhilasha Sancheti
Aparna Garimella
Balaji Vasan Srinivasan
Rachel Rudinger
AILaw
73
6
0
23 Nov 2022
Word-Level Representation From Bytes For Language Modeling
Chul Lee
Qipeng Guo
Xipeng Qiu
77
1
0
23 Nov 2022
DyRRen: A Dynamic Retriever-Reranker-Generator Model for Numerical Reasoning over Tabular and Textual Data
Xiao Li
Sichen Liu
Kaiwen Shi
Jiangzhou Ju
Yuzhong Qu
Gong Cheng
AIMat
RALM
LRM
75
16
0
23 Nov 2022
Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions
S. Bhattamishra
Arkil Patel
Varun Kanade
Phil Blunsom
131
49
0
22 Nov 2022
A Scope Sensitive and Result Attentive Model for Multi-Intent Spoken Language Understanding
Lizhi Cheng
Wenmian Yang
Weijia Jia
64
10
0
22 Nov 2022
Event Causality Identification with Causal News Corpus -- Shared Task 3, CASE 2022
Fiona Anting Tan
Hansi Hettiarachchi
Ali Hurriyetouglu
Tommaso Caselli
Onur Uca
Farhana Ferdousi Liza
Nelleke Oostdijk
77
27
0
22 Nov 2022
Linear Interpolation In Parameter Space is Good Enough for Fine-Tuned Language Models
Mark Rofin
Nikita Balagansky
Daniil Gavrilov
MoMe
KELM
96
7
0
22 Nov 2022
Compiler Provenance Recovery for Multi-CPU Architectures Using a Centrifuge Mechanism
Yuhei Otsubo
Akira Otsuka
M. Mimura
58
3
0
22 Nov 2022
Evaluating the Knowledge Dependency of Questions
Hyeongdon Moon
Yoonseok Yang
Jamin Shin
Hangyeol Yu
Seunghyun Lee
Myeongho Jeong
Juneyoung Park
Minsam Kim
Seungtaek Choi
AI4Ed
68
11
0
21 Nov 2022
Enhancing Self-Consistency and Performance of Pre-Trained Language Models through Natural Language Inference
E. Mitchell
Joseph J. Noh
Siyan Li
William S. Armstrong
Ananth Agarwal
Patrick Liu
Chelsea Finn
Christopher D. Manning
90
35
0
21 Nov 2022
Unsupervised extraction, labelling and clustering of segments from clinical notes
Petr Zelina
J. Halámková
V. Nováček
51
3
0
21 Nov 2022
Teaching Structured Vision&Language Concepts to Vision&Language Models
Sivan Doveh
Assaf Arbelle
Sivan Harary
Yikang Shen
Roei Herzig
...
Donghyun Kim
Raja Giryes
Rogerio Feris
S. Ullman
Leonid Karlinsky
VLM
CoGe
133
72
0
21 Nov 2022
Legal and Political Stance Detection of SCOTUS Language
Noah Bergam
Emily Allaway
Kathleen McKeown
AILaw
ELM
59
6
0
21 Nov 2022
Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention
Zineng Tang
Jaemin Cho
Jie Lei
Joey Tianyi Zhou
VLM
84
9
0
21 Nov 2022
Deanthropomorphising NLP: Can a Language Model Be Conscious?
Matthew Shardlow
Piotr Przybyła
77
7
0
21 Nov 2022
L3Cube-HindBERT and DevBERT: Pre-Trained BERT Transformer models for Devanagari based Hindi and Marathi Languages
Raviraj Joshi
97
62
0
21 Nov 2022
AF Adapter: Continual Pretraining for Building Chinese Biomedical Language Model
Yongyu Yan
Kui Xue
Xiaoming Shi
Qi Ye
Jingping Liu
Tong Ruan
CLL
85
2
0
21 Nov 2022
TCBERT: A Technical Report for Chinese Topic Classification BERT
Ting Han
Kunhao Pan
Xinyu Chen
Dingjie Song
Yuchen Fan
Xinyu Gao
Ruyi Gan
Jiaxing Zhang
VLM
67
1
0
21 Nov 2022
Multi-Level Knowledge Distillation for Out-of-Distribution Detection in Text
Qianhui Wu
Huiqiang Jiang
Haonan Yin
Börje F. Karlsson
Chin-Yew Lin
120
12
0
21 Nov 2022
Cross-Modal Contrastive Learning for Robust Reasoning in VQA
Qinjie Zheng
Chaoyue Wang
Daqing Liu
Dadong Wang
Dacheng Tao
LRM
66
0
0
21 Nov 2022
Unsupervised Explanation Generation via Correct Instantiations
Sijie Cheng
Zhiyong Wu
Jiangjie Chen
Zhixing Li
Yang Liu
Lingpeng Kong
ReLM
LRM
80
5
0
21 Nov 2022
Unifying Vision-Language Representation Space with Single-tower Transformer
Jiho Jang
Chaerin Kong
D. Jeon
Seonhoon Kim
Nojun Kwak
113
21
0
21 Nov 2022
An Algorithm for Routing Vectors in Sequences
Franz A. Heinsen
50
18
0
20 Nov 2022
Embracing Ambiguity: Improving Similarity-oriented Tasks with Contextual Synonym Knowledge
Yongqian Li
Jiaoyan Chen
Hai-Tao Zheng
Tianyu Yu
Xi Chen
Haitao Zheng
82
14
0
20 Nov 2022
How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation
Jie Ruan
Yue Wu
Xiaojun Wan
Yuesheng Zhu
74
1
0
20 Nov 2022