RoBERTa: A Robustly Optimized BERT Pretraining Approach


26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 10,798 papers shown
NoisywikiHow: A Benchmark for Learning with Real-world Noisy Labels in Natural Language Processing
Tingting Wu
Xiao Ding
Minji Tang
Haotian Zhang
Bing Qin
Ting Liu
NoLa
94
11
0
18 May 2023
ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval
Yue Yu
Yuchen Zhuang
Rongzhi Zhang
Yu Meng
Jiaming Shen
Chao Zhang
VLM
89
37
0
18 May 2023
A Better Way to Do Masked Language Model Scoring
Carina Kauf
Anna A. Ivanova
94
27
0
17 May 2023
Incorporating Attribution Importance for Improving Faithfulness Metrics
Zhixue Zhao
Nikolaos Aletras
111
13
0
17 May 2023
Accelerating Transformer Inference for Translation via Parallel Decoding
Andrea Santilli
Silvio Severino
Emilian Postolache
Valentino Maiorca
Michele Mancusi
R. Marin
Emanuele Rodolà
130
90
0
17 May 2023
Large-Scale Text Analysis Using Generative Language Models: A Case Study in Discovering Public Value Expressions in AI Patents
Sergio Pelaez
Gaurav Verma
Barbara Ribeiro
P. Shapira
103
15
0
17 May 2023
G-Adapter: Towards Structure-Aware Parameter-Efficient Transfer Learning for Graph Transformer Networks
Anchun Gui
Jinqiang Ye
Han Xiao
80
22
0
17 May 2023
UniEX: An Effective and Efficient Framework for Unified Information Extraction via a Span-extractive Perspective
Ping Yang
Junyu Lu
Ruyi Gan
Junjie Wang
Yuxiang Zhang
Jiaxing Zhang
Pingjian Zhang
71
11
0
17 May 2023
Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks
Anas Himmi
Ekhine Irurozki
Nathan Noiry
Stephan Clémençon
Pierre Colombo
193
9
0
17 May 2023
M3KE: A Massive Multi-Level Multi-Subject Knowledge Evaluation Benchmark for Chinese Large Language Models
Chuang Liu
Renren Jin
Yuqi Ren
Linhao Yu
Tianyu Dong
...
Peiyi Zhang
Qingqing Lyu
Xiaowen Su
Qun Liu
Deyi Xiong
ELM ALM
119
26
0
17 May 2023
Language Model Tokenizers Introduce Unfairness Between Languages
Aleksandar Petrov
Emanuele La Malfa
Philip Torr
Adel Bibi
128
113
0
17 May 2023
OpenSLU: A Unified, Modularized, and Extensible Toolkit for Spoken Language Understanding
Libo Qin
Qiguang Chen
Xiao Xu
Yunlong Feng
Wanxiang Che
ELM VLM
44
4
0
17 May 2023
Shielded Representations: Protecting Sensitive Attributes Through Iterative Gradient-Based Projection
Shadi Iskander
Kira Radinsky
Yonatan Belinkov
152
19
0
17 May 2023
Boosting Distress Support Dialogue Responses with Motivational Interviewing Strategy
A. Welivita
Pearl Pu
OffRL
98
17
0
17 May 2023
Knowledge-enhanced Mixed-initiative Dialogue System for Emotional Support Conversations
Yang Deng
Wenxuan Zhang
Yifei Yuan
W. Lam
101
38
0
17 May 2023
Use of a Taxonomy of Empathetic Response Intents to Control and Interpret Empathy in Neural Chatbots
A. Welivita
Pearl Pu
35
1
0
17 May 2023
When Gradient Descent Meets Derivative-Free Optimization: A Match Made in Black-Box Scenario
Chengcheng Han
Liqing Cui
Renyu Zhu
Jiadong Wang
Nuo Chen
Qiushi Sun
Xiang Li
Ming Gao
82
7
0
17 May 2023
AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression
Siyue Wu
Hongzhan Chen
Xiaojun Quan
Qifan Wang
Rui Wang
VLM
86
20
0
17 May 2023
"I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation
Anaelia Ovalle
Palash Goyal
Jwala Dhamala
Zachary Jaggers
Kai-Wei Chang
Aram Galstyan
R. Zemel
Rahul Gupta
97
73
0
17 May 2023
Balancing Lexical and Semantic Quality in Abstractive Summarization
Jeewoo Sul
Y. Choi
81
6
0
17 May 2023
Clustering-Aware Negative Sampling for Unsupervised Sentence Representation
Jinghao Deng
Fanqi Wan
Tao Yang
Xiaojun Quan
Rui Wang
SSL
48
11
0
17 May 2023
Semantic Similarity Measure of Natural Language Text through Machine Learning and a Keyword-Aware Cross-Encoder-Ranking Summarizer -- A Case Study Using UCGIS GIS&T Body of Knowledge
Yuanyuan Tian
Wenwen Li
Sizhe Wang
Zhining Gu
66
3
0
17 May 2023
Machine-Made Media: Monitoring the Mobilization of Machine-Generated Articles on Misinformation and Mainstream News Websites
Hans W. A. Hanley
Zakir Durumeric
DeLMO
65
32
0
16 May 2023
On Dataset Transferability in Active Learning for Transformers
Fran Jelenić
Josip Jukić
Nina Drobac
Jan Šnajder
69
2
0
16 May 2023
Distilling Semantic Concept Embeddings from Contrastively Fine-Tuned Language Models
Na Li
Hanane Kteich
Zied Bouraoui
Steven Schockaert
64
9
0
16 May 2023
Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation
Yuxin Ren
Zi-Qi Zhong
Xingjian Shi
Yi Zhu
Chun Yuan
Mu Li
105
7
0
16 May 2023
UOR: Universal Backdoor Attacks on Pre-trained Language Models
Wei Du
Peixuan Li
Yue Liu
Haodong Zhao
Gongshen Liu
AAML
59
9
0
16 May 2023
Adapting Sentence Transformers for the Aviation Domain
Liya Wang
Jason Chou
Dave Rouck
A. Tien
Diane M. Baumgartner
AI4TS
55
4
0
16 May 2023
Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding
ShuWei Feng
Tianyang Zhan
Zhanming Jie
Trung Quoc Luong
Xiaoran Jin
51
1
0
16 May 2023
Consistent Multi-Granular Rationale Extraction for Explainable Multi-hop Fact Verification
Jiasheng Si
Yingjie Zhu
Deyu Zhou
AAML
131
4
0
16 May 2023
UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning
Heqing Zou
Meng Shen
Chen Chen
Yuchen Hu
D. Rajan
Chng Eng Siong
SSL
100
17
0
16 May 2023
Pre-Training to Learn in Context
Yuxian Gu
Li Dong
Furu Wei
Minlie Huang
CLIP LRM ReLM
167
38
0
16 May 2023
Weight-Inherited Distillation for Task-Agnostic BERT Compression
Taiqiang Wu
Cheng-An Hou
Shanshan Lao
Jiayi Li
Ngai Wong
Zhe Zhao
Yujiu Yang
136
10
0
16 May 2023
Small Models are Valuable Plug-ins for Large Language Models
Canwen Xu
Yichong Xu
Shuohang Wang
Yang Liu
Chenguang Zhu
Julian McAuley
LLMAG
85
50
0
15 May 2023
Exploring In-Context Learning Capabilities of Foundation Models for Generating Knowledge Graphs from Text
H. Khorashadizadeh
Nandana Mihindukulasooriya
Sanju Tiwari
Jinghua Groppe
Sven Groppe
73
23
0
15 May 2023
Question-Answering System Extracts Information on Injection Drug Use from Clinical Notes
Maria Mahbub
Ian Goethert
Ioana Danciu
Kathryn Knight
Sudarshan Srinivasan
...
Hugo Solares
Susana Martins
Jodie Trafton
Edmon Begoli
Gregory D. Peterson
45
4
0
15 May 2023
Knowledge Rumination for Pre-trained Language Models
Yunzhi Yao
Peng Wang
Shengyu Mao
Chuanqi Tan
Fei Huang
Huajun Chen
Ningyu Zhang
KELM
74
4
0
15 May 2023
Recyclable Tuning for Continual Pre-training
Yujia Qin
Cheng Qian
Xu Han
Yankai Lin
Huadong Wang
Ruobing Xie
Zhiyuan Liu
Maosong Sun
Jie Zhou
CLL
66
13
0
15 May 2023
Continual Multimodal Knowledge Graph Construction
Xiang Chen
Jintian Zhang
Xiaohan Wang
Ningyu Zhang
Tongtong Wu
Luo Si
Yongheng Wang
Huajun Chen
KELM CLL
87
15
0
15 May 2023
Unsupervised Sentence Representation Learning with Frequency-induced Adversarial Tuning and Incomplete Sentence Filtering
Bing Wang
Ximing Li
Zhiyao Yang
Yuanyuan Guan
Jiayin Li
Sheng-sheng Wang
77
7
0
15 May 2023
AdamR at SemEval-2023 Task 10: Solving the Class Imbalance Problem in Sexism Detection with Ensemble Learning
Adam Rydelek
Daryna Dementieva
Georg Groh
33
2
0
15 May 2023
Adam-Smith at SemEval-2023 Task 4: Discovering Human Values in Arguments with Ensembles of Transformer-based Models
Daniel Schroter
Daryna Dementieva
Georg Groh
63
9
0
15 May 2023
DarkBERT: A Language Model for the Dark Side of the Internet
Youngjin Jin
Eugene Jang
Jian Cui
Jin-Woo Chung
Yongjae Lee
Seung-Eui Shin
56
36
0
15 May 2023
Measuring Consistency in Text-based Financial Forecasting Models
Linyi Yang
Yingpeng Ma
Yue Zhang
59
4
0
15 May 2023
Taxi1500: A Multilingual Dataset for Text Classification in 1500 Languages
Chunlan Ma
Ayyoob Imani
Haotian Ye
Renhao Pei
Ehsaneddin Asgari
Hinrich Schütze
82
25
0
15 May 2023
TESS: Text-to-Text Self-Conditioned Simplex Diffusion
Rabeeh Karimi Mahabadi
Hamish Ivison
Jaesung Tae
James Henderson
Iz Beltagy
Matthew E. Peters
Arman Cohan
104
28
0
15 May 2023
Text Classification via Large Language Models
Xiaofei Sun
Xiaoya Li
Jiwei Li
Leilei Gan
Shangwei Guo
Tianwei Zhang
Guoyin Wang
RALM LRM
104
150
0
15 May 2023
SuperDialseg: A Large-scale Dataset for Supervised Dialogue Segmentation
Junfeng Jiang
Chengzhang Dong
Sadao Kurohashi
Akiko Aizawa
35
7
0
15 May 2023
Coreference-aware Double-channel Attention Network for Multi-party Dialogue Reading Comprehension
Yanling Li
Bowei Zou
Yifan Fan
Mengxing Dong
Yu Hong
76
4
0
15 May 2023
From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models
Shangbin Feng
Chan Young Park
Yuhan Liu
Yulia Tsvetkov
104
248
0
15 May 2023