RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 10,764 papers shown
Title
DocFormerv2: Local Features for Document Understanding
Srikar Appalaraju
Peng Tang
Qi Dong
Nishant Sankaran
Yichu Zhou
R. Manmatha
105
41
0
02 Jun 2023
Improving Generalization in Task-oriented Dialogues with Workflows and Action Plans
Stefania Raimondo
C. Pal
Xiaotian Liu
David Vazquez
Héctor Palacios
50
2
0
02 Jun 2023
OMNI: Open-endedness via Models of human Notions of Interestingness
Jenny Zhang
Joel Lehman
Kenneth O. Stanley
Jeff Clune
LRM
121
36
0
02 Jun 2023
Learning Multi-Step Reasoning by Solving Arithmetic Tasks
Tianduo Wang
Wei Lu
ReLM
LRM
73
16
0
02 Jun 2023
SourceP: Detecting Ponzi Schemes on Ethereum with Source Code
Pengcheng Lu
Liang Cai
Keting Yin
AI4TS
91
4
0
02 Jun 2023
DiffusEmp: A Diffusion Model-Based Framework with Multi-Grained Control for Empathetic Response Generation
Guanqun Bi
Lei Shen
Yanan Cao
Meng Chen
Yuqiang Xie
Zheng Lin
Xiao-feng He
DiffM
138
14
0
02 Jun 2023
Centered Self-Attention Layers
Ameen Ali
Tomer Galanti
Lior Wolf
140
8
0
02 Jun 2023
Evaluating Machine Translation Quality with Conformal Predictive Distributions
Patrizio Giovannotti
UQLM
105
7
0
02 Jun 2023
CLIPGraphs: Multimodal Graph Networks to Infer Object-Room Affinities
A. Agrawal
Raghav Arora
Ahana Datta
Snehasis Banerjee
Brojeshwar Bhowmick
Krishna Murthy Jatavallabhula
Mohan Sridharan
Madhava Krishna
77
2
0
02 Jun 2023
Supervised Adversarial Contrastive Learning for Emotion Recognition in Conversations
Dou Hu
Yinan Bao
Lingwei Wei
Wei Zhou
Song Hu
105
56
0
02 Jun 2023
Data-Efficient French Language Modeling with CamemBERTa
Wissam Antoun
Benoît Sagot
Djamé Seddah
52
7
0
02 Jun 2023
LyricSIM: A novel Dataset and Benchmark for Similarity Detection in Spanish Song LyricS
Alejandro Benito-Santos
Adrián Ghajari
Pedro Hernández
Víctor Fresno-Fernández
Salvador Ros
E. González-Blanco
11
1
0
02 Jun 2023
Towards Sustainable Learning: Coresets for Data-efficient Deep Learning
Yu Yang
Hao Kang
Baharan Mirzasoleiman
86
35
0
02 Jun 2023
Examining the Causal Effect of First Names on Language Models: The Case of Social Commonsense Reasoning
Sullam Jeoung
Jana Diesner
H. Kilicoglu
LRM
36
5
0
01 Jun 2023
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Guilherme Penedo
Quentin Malartic
Daniel Hesslow
Ruxandra-Aimée Cojocaru
Alessandro Cappelli
Hamza Alobeidli
B. Pannier
Ebtesam Almazrouei
Julien Launay
182
778
0
01 Jun 2023
Towards Learning Discrete Representations via Self-Supervision for Wearables-Based Human Activity Recognition
H. Haresamudram
Irfan Essa
Thomas Ploetz
98
8
0
01 Jun 2023
UCAS-IIE-NLP at SemEval-2023 Task 12: Enhancing Generalization of Multilingual BERT for Low-resource Sentiment Analysis
Dou Hu
Lingwei Wei
Yaxin Liu
Wei Zhou
Songlin Hu
108
2
0
01 Jun 2023
Quantization-Aware and Tensor-Compressed Training of Transformers for Natural Language Understanding
Ziao Yang
Samridhi Choudhary
Siegfried Kunzmann
Zheng Zhang
MQ
81
3
0
01 Jun 2023
TimelineQA: A Benchmark for Question Answering over Timelines
W. Tan
Jane Dwivedi-Yu
Yuliang Li
Lambert Mathias
Marzieh Saeidi
J. Yan
A. Halevy
LMTD
57
12
0
01 Jun 2023
UniDiff: Advancing Vision-Language Models with Generative and Discriminative Learning
Xiao Dong
Runhu Huang
Xiaoyong Wei
Zequn Jie
Jianxing Yu
Jian Yin
Xiaodan Liang
VLM
DiffM
69
1
0
01 Jun 2023
Topic-Guided Sampling For Data-Efficient Multi-Domain Stance Detection
Erik Arakelyan
Arnav Arora
Isabelle Augenstein
56
10
0
01 Jun 2023
Column Type Annotation using ChatGPT
Keti Korini
Christian Bizer
LMTD
116
28
0
01 Jun 2023
Boosting the Performance of Transformer Architectures for Semantic Textual Similarity
Ivan Rep
V. Ceperic
26
0
0
01 Jun 2023
Encoder-decoder multimodal speaker change detection
Jee-weon Jung
Soonshin Seo
Hee-Soo Heo
Geon-min Kim
You Jin Kim
Youngki Kwon
Min-Ji Lee
Bong-Jin Lee
49
2
0
01 Jun 2023
Towards Argument-Aware Abstractive Summarization of Long Legal Opinions with Summary Reranking
Mohamed S. Elaraby
Yang Zhong
Diane Litman
AILaw
ELM
71
10
0
01 Jun 2023
Explanation Graph Generation via Generative Pre-training over Synthetic Graphs
H. Cui
Sha Li
Yu Zhang
Qi Shi
117
1
0
01 Jun 2023
Contextual Distortion Reveals Constituency: Masked Language Models are Implicit Parsers
Jiaxi Li
Wei Lu
58
6
0
01 Jun 2023
ReviewerGPT? An Exploratory Study on Using Large Language Models for Paper Reviewing
Ryan Liu
Nihar B. Shah
ELM
112
76
0
01 Jun 2023
A Call for Standardization and Validation of Text Style Transfer Evaluation
Phil Ostheimer
Mayank Nagda
Marius Kloft
Sophie Fellenz
201
15
0
01 Jun 2023
Layout and Task Aware Instruction Prompt for Zero-shot Document Image Question Answering
Wenjin Wang
Yunhao Li
Yixin Ou
Yin Zhang
VLM
131
26
0
01 Jun 2023
Revisiting Event Argument Extraction: Can EAE Models Learn Better When Being Aware of Event Co-occurrences?
Yuxin He
Jing-Hao Hu
Buzhou Tang
74
30
0
01 Jun 2023
Inspecting Spoken Language Understanding from Kids for Basic Math Learning at Home
Eda Okur
Roddy Fuentes Alba
Saurav Sahay
L. Nachman
63
0
0
01 Jun 2023
Make Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning
Baohao Liao
Shaomu Tan
Christof Monz
KELM
105
30
0
01 Jun 2023
How Many Answers Should I Give? An Empirical Study of Multi-Answer Reading Comprehension
Chen Zhang
Jiuheng Lin
Xiao Liu
Yuxuan Lai
Yansong Feng
Dongyan Zhao
65
4
0
01 Jun 2023
Divide, Conquer, and Combine: Mixture of Semantic-Independent Experts for Zero-Shot Dialogue State Tracking
Qingyue Wang
Liang Ding
Yanan Cao
Yibing Zhan
Zheng Lin
Shi Wang
Dacheng Tao
Li Guo
MoMe
MoE
88
12
0
01 Jun 2023
Adapting Pre-trained Language Models to Vision-Language Tasks via Dynamic Visual Prompting
Shubin Huang
Qiong Wu
Yiyi Zhou
Weijie Chen
Rongsheng Zhang
Xiaoshuai Sun
Rongrong Ji
VLM
VPVLM
LRM
43
0
0
01 Jun 2023
Preference-grounded Token-level Guidance for Language Model Fine-tuning
Shentao Yang
Shujian Zhang
Congying Xia
Yihao Feng
Caiming Xiong
Mi Zhou
146
28
0
01 Jun 2023
Better Context Makes Better Code Language Models: A Case Study on Function Call Argument Completion
Hengzhi Pei
Jinman Zhao
Leonard Lausen
Sheng Zha
George Karypis
ELM
LRM
67
21
0
01 Jun 2023
Focused Prefix Tuning for Controllable Text Generation
Congda Ma
Tianyu Zhao
Makoto Shing
Kei Sawada
Manabu Okumura
70
8
0
01 Jun 2023
Prompt Algebra for Task Composition
Pramuditha Perera
Matthew Trager
Luca Zancato
Alessandro Achille
Stefano Soatto
VLM
77
8
0
01 Jun 2023
FEED PETs: Further Experimentation and Expansion on the Disambiguation of Potentially Euphemistic Terms
Patrick Lee
Iyanuoluwa Shode
Alain Chirino Trujillo
Yuan Zhao
O. E. Ojo
Diana Cuervas Plancarte
Anna Feldman
J. Peng
68
6
0
31 May 2023
Inconsistency, Instability, and Generalization Gap of Deep Neural Network Training
Rie Johnson
Tong Zhang
43
6
0
31 May 2023
Measuring the Robustness of NLP Models to Domain Shifts
Nitay Calderon
Naveh Porat
Eyal Ben-David
Alexander Chapanin
Zorik Gekhman
Nadav Oved
Vitaly Shalumov
Roi Reichart
124
8
0
31 May 2023
Mechanic: A Learning Rate Tuner
Ashok Cutkosky
Aaron Defazio
Harsh Mehta
OffRL
127
18
0
31 May 2023
ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning
Xiao Xu
Bei Li
Chenfei Wu
Shao-Yen Tseng
Anahita Bhiwandiwalla
Shachar Rosenman
Vasudev Lal
Wanxiang Che
Nan Duan
AIFin
VLM
73
4
0
31 May 2023
Findings of the VarDial Evaluation Campaign 2023
Noëmi Aepli
Çagri Çöltekin
Rob van der Goot
T. Jauhiainen
Mourhaf Kazzaz
Nikola Ljubesic
Kai North
Barbara Plank
Yves Scherrer
Marcos Zampieri
81
31
0
31 May 2023
Adam Accumulation to Reduce Memory Footprints of both Activations and Gradients for Large-scale DNN Training
Yijia Zhang
Yibo Han
Shijie Cao
Guohao Dai
Youshan Miao
Ting Cao
Fan Yang
Ningyi Xu
59
4
0
31 May 2023
MedNgage: A Dataset for Understanding Engagement in Patient-Nurse Conversations
Yan Wang
H. Donovan
Sabit Hassan
Mailhe Alikhani
85
3
0
31 May 2023
Correcting Semantic Parses with Natural Language through Dynamic Schema Encoding
Parker Glenn
Parag Dakle
Preethi Raghavan
41
3
0
31 May 2023
How to Plant Trees in Language Models: Data and Architectural Effects on the Emergence of Syntactic Inductive Biases
Aaron Mueller
Tal Linzen
AI4CE
62
21
0
31 May 2023