RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 10,822 papers shown
Improved knowledge distillation by utilizing backward pass knowledge in neural networks
A. Jafari
Mehdi Rezagholizadeh
A. Ghodsi
39
1
0
27 Jan 2023
EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval
Seungyeon Kim
A. S. Rawat
Manzil Zaheer
Sadeep Jayasumana
Veeranjaneyulu Sadhanala
Wittawat Jitkrittum
A. Menon
Rob Fergus
Surinder Kumar
FedML
91
7
0
27 Jan 2023
Understanding the Effectiveness of Very Large Language Models on Dialog Evaluation
Jessica Huynh
Cathy Jiao
Prakhar Gupta
Shikib Mehri
Payal Bajaj
Vishrav Chaudhary
M. Eskénazi
ELM LM&MA
73
17
0
27 Jan 2023
Prompt-Based Editing for Text Style Transfer
Guoqing Luo
Yu Tong Han
Lili Mou
Mauajama Firdaus
98
26
0
27 Jan 2023
Case-Based Reasoning with Language Models for Classification of Logical Fallacies
Zhivar Sourati
Filip Ilievski
Hông-Ân Sandlin
Alain Mermoud
LRM
67
13
0
27 Jan 2023
A Comparative Study of Pretrained Language Models for Long Clinical Text
Yikuan Li
R. M. Wehbe
F. Ahmad
Hanyin Wang
Yuan Luo
LM&MA ELM VLM MedIm
93
86
0
27 Jan 2023
Learning the Effects of Physical Actions in a Multi-modal Environment
Gautier Dagan
Frank Keller
A. Lascarides
LM&Ro
94
4
0
27 Jan 2023
Call for Papers -- The BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus
Alex Warstadt
Leshem Choshen
Aaron Mueller
Adina Williams
Ethan Gotlieb Wilcox
Chengxu Zhuang
115
57
0
27 Jan 2023
Graph Attention with Hierarchies for Multi-hop Question Answering
Yunjie He
P. Gorinski
Ieva Staliunaite
Pontus Stenetorp
63
3
0
27 Jan 2023
Towards Personalized Review Summarization by Modeling Historical Reviews from Customer and Product Separately
Xin Cheng
Shen Gao
Yuchi Zhang
Yongliang Wang
Preslav Nakov
Mingzhe Li
Dongyan Zhao
Rui Yan
94
10
0
27 Jan 2023
Down the Rabbit Hole: Detecting Online Extremism, Radicalisation, and Politicised Hate Speech
Jarod Govers
Philip G. Feldman
Aaron Dant
Panos Patros
62
29
0
27 Jan 2023
Semi-Parametric Video-Grounded Text Generation
Sungdong Kim
Jin-Hwa Kim
Jiyoung Lee
Minjoon Seo
VGen
80
14
0
27 Jan 2023
Open Problems in Applied Deep Learning
M. Raissi
AI4CE
117
2
0
26 Jan 2023
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
E. Mitchell
Yoonho Lee
Alexander Khazatsky
Christopher D. Manning
Chelsea Finn
137
633
0
26 Jan 2023
Characterizing the Entities in Harmful Memes: Who is the Hero, the Villain, the Victim?
Shivam Sharma
Atharva Kulkarni
Tharun Suresh
Himanshi Mathur
Preslav Nakov
Md. Shad Akhtar
Tanmoy Chakraborty
113
17
0
26 Jan 2023
Improving Text-based Early Prediction by Distillation from Privileged Time-Series Text
Jinghui Liu
Daniel Capurro
Anthony N. Nguyen
Karin Verspoor
AI4TS
42
3
0
26 Jan 2023
Backward Compatibility During Data Updates by Weight Interpolation
Raphael Schumann
Elman Mansimov
Yi-An Lai
Nikolaos Pappas
Xibin Gao
Yi Zhang
49
5
0
25 Jan 2023
Multi-Tenant Optimization For Few-Shot Task-Oriented FAQ Retrieval
Asha Vishwanathan
R. Warrier
G. V. Suresh
Chandrashekhar Kandpal
58
3
0
25 Jan 2023
Knowledge-augmented Graph Neural Networks with Concept-aware Attention for Adverse Drug Event Detection
Shaoxiong Ji
Ya Gao
Pekka Marttinen
GNN
85
4
0
25 Jan 2023
ViDeBERTa: A powerful pre-trained language model for Vietnamese
Cong Dao Tran
Nhut Huy Pham
Anh-Viêt Nguyên
Truong-Son Hy
Tu Vu
67
17
0
25 Jan 2023
One Model for All Domains: Collaborative Domain-Prefix Tuning for Cross-Domain NER
Xiang Chen
Lei Li
Q. Fei
Ningyu Zhang
Chuanqi Tan
Yong Jiang
Fei Huang
Huajun Chen
106
24
0
25 Jan 2023
Language Model Detoxification in Dialogue with Contextualized Stance Control
Jingu Qian
Xifeng Yan
56
1
0
25 Jan 2023
A Watermark for Large Language Models
John Kirchenbauer
Jonas Geiping
Yuxin Wen
Jonathan Katz
Ian Miers
Tom Goldstein
VLM WaLM
181
511
0
24 Jan 2023
Conclusion-based Counter-Argument Generation
Milad Alshomary
Henning Wachsmuth
LRM
68
6
0
24 Jan 2023
A Stability Analysis of Fine-Tuning a Pre-Trained Model
Z. Fu
Anthony Man-Cho So
Nigel Collier
79
3
0
24 Jan 2023
Model Agnostic Sample Reweighting for Out-of-Distribution Learning
Xiao Zhou
Yong Lin
Renjie Pi
Weizhong Zhang
Renzhe Xu
Peng Cui
Tong Zhang
OODD
102
61
0
24 Jan 2023
PrimeQA: The Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development
Avirup Sil
Jaydeep Sen
Bhavani Iyer
M. Franz
Kshitij P. Fadnis
...
Yulong Li
Md Arafat Sultan
Riyaz Ahmad Bhat
Radu Florian
Salim Roukos
79
4
0
23 Jan 2023
WDC Products: A Multi-Dimensional Entity Matching Benchmark
Ralph Peeters
Reng Chiz Der
Christian Bizer
57
16
0
23 Jan 2023
Deep Learning Mental Health Dialogue System
L. Brocki
George C. Dyer
A. Gładka
N. C. Chung
AI4MH
14
15
0
23 Jan 2023
Lexi: Self-Supervised Learning of the UI Language
Pratyay Banerjee
Shweti Mahajan
Kushal Arora
Chitta Baral
Oriana Riva
68
17
0
23 Jan 2023
StockEmotions: Discover Investor Emotions for Financial Sentiment Analysis and Multivariate Time Series
Jean Lee
Hoyoul Luis Youn
Josiah Poon
S. Han
AIFin
69
8
0
23 Jan 2023
AttMEMO : Accelerating Transformers with Memoization on Big Memory Systems
Yuan Feng
Hyeran Jeon
F. Blagojevic
Cyril Guyot
Qing Li
Dong Li
GNN
72
3
0
23 Jan 2023
An Empirical Study of Metrics to Measure Representational Harms in Pre-Trained Language Models
Saghar Hosseini
Hamid Palangi
Ahmed Hassan Awadallah
65
24
0
22 Jan 2023
Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation
Razvan-George Pasca
Alexey Gavryushin
Muhammad Hamza
Yen-Ling Kuo
Kaichun Mo
Luc Van Gool
Otmar Hilliges
Xi Wang
169
14
0
22 Jan 2023
SPEC5G: A Dataset for 5G Cellular Network Protocol Analysis
Imtiaz Karim
Kazi Samin Mubasshir
Mirza Masfiqur Rahman
Elisa Bertino
59
25
0
22 Jan 2023
Transfer Knowledge from Natural Language to Electrocardiography: Can We Detect Cardiovascular Disease Through Language Models?
Jielin Qiu
William Jongwon Han
Jiacheng Zhu
Mengdi Xu
Michael A. Rosenberg
Emerson Liu
Douglas Weber
Ding Zhao
94
23
0
21 Jan 2023
Adapting a Language Model While Preserving its General Knowledge
Zixuan Ke
Yijia Shao
Haowei Lin
Hu Xu
Lei Shu
Bin Liu
KELM CLL VLM
68
21
0
21 Jan 2023
Exploring Methods for Building Dialects-Mandarin Code-Mixing Corpora: A Case Study in Taiwanese Hokkien
Sin-En Lu
Bo-Han Lu
Chaohong Lu
Richard Tzong-Han Tsai
67
6
0
21 Jan 2023
Unifying Structure Reasoning and Language Model Pre-training for Complex Reasoning
Siyuan Wang
Zhongyu Wei
Jiarong Xu
Taishan Li
Zhihao Fan
LRM
93
5
0
21 Jan 2023
Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness
Shuaichen Chang
Jun Wang
Mingwen Dong
Lin Pan
Henghui Zhu
...
William Yang Wang
Zhiguo Wang
Vittorio Castelli
Patrick Ng
Bing Xiang
OOD
105
35
0
21 Jan 2023
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions
Yinghao Aaron Li
Cong Han
Xilin Jiang
N. Mesgarani
66
24
0
20 Jan 2023
Visual Semantic Relatedness Dataset for Image Captioning
Ahmed Sabir
Francesc Moreno-Noguer
Lluís Padró
CoGe VLM
75
3
0
20 Jan 2023
Can Peanuts Fall in Love with Distributional Semantics?
J. Michaelov
S. Coulson
Benjamin Bergen
MILM
74
8
0
20 Jan 2023
Neural Architecture Search: Insights from 1000 Papers
Colin White
Mahmoud Safari
R. Sukthanker
Binxin Ru
T. Elsken
Arber Zela
Debadeepta Dey
Frank Hutter
3DV AI4CE
131
143
0
20 Jan 2023
JCSE: Contrastive Learning of Japanese Sentence Embeddings and Its Applications
Zihao Chen
H. Handa
Kimiaki Shirahama
63
2
0
19 Jan 2023
Reversing The Twenty Questions Game
Parth Parikh
Anisha Gupta
33
1
0
19 Jan 2023
Learning-Rate-Free Learning by D-Adaptation
Aaron Defazio
Konstantin Mishchenko
110
85
0
18 Jan 2023
How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection
Biyang Guo
Xin Zhang
Ziyuan Wang
Minqi Jiang
Jinran Nie
Yuxuan Ding
Jianwei Yue
Yupeng Wu
DeLMO ELM
135
622
0
18 Jan 2023
Towards a Holistic Understanding of Mathematical Questions with Contrastive Pre-training
Yuting Ning
Zhenya Huang
Xin Lin
Enhong Chen
Shiwei Tong
Zheng Gong
Shijin Wang
AIMat
73
7
0
18 Jan 2023
KILDST: Effective Knowledge-Integrated Learning for Dialogue State Tracking using Gazetteer and Speaker Information
Hyungtak Choi
Hyeonmok Ko
Gurpreet Kaur
Lohith Ravuru
Kiranmayi Gandikota
Manisha Jhawar
S. Dharani
Pranamya Patil
96
0
0
18 Jan 2023