RoBERTa: A Robustly Optimized BERT Pretraining Approach


26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 10,831 papers shown
KILDST: Effective Knowledge-Integrated Learning for Dialogue State Tracking using Gazetteer and Speaker Information
Hyungtak Choi
Hyeonmok Ko
Gurpreet Kaur
Lohith Ravuru
Kiranmayi Gandikota
Manisha Jhawar
S. Dharani
Pranamya Patil
99
0
0
18 Jan 2023
Effective End-to-End Vision Language Pretraining with Semantic Visual Loss
Xiaofeng Yang
Fayao Liu
Guosheng Lin
VLM
49
8
0
18 Jan 2023
BERT-ERC: Fine-tuning BERT is Enough for Emotion Recognition in Conversation
Xiangyu Qin
Zhiyu Wu
J. Cui
Ting Zhang
Yanran Li
Jian Luan
Bin Wang
L. xilinx Wang
75
26
0
17 Jan 2023
ClassBases at CASE-2022 Multilingual Protest Event Detection Tasks: Multilingual Protest News Detection and Automatically Replicating Manually Created Event Datasets
Peratham Wiriyathammabhum
56
3
0
16 Jan 2023
TEDB System Description to a Shared Task on Euphemism Detection 2022
Peratham Wiriyathammabhum
57
4
0
16 Jan 2023
XNLI 2.0: Improving XNLI dataset and performance on Cross Lingual Understanding (XLU)
A. Upadhyay
Harsit Kumar Upadhya
36
1
0
16 Jan 2023
Computational Assessment of Hyperpartisanship in News Titles
Hanjia Lyu
Jinsheng Pan
Zichen Wang
Jiebo Luo
98
6
0
16 Jan 2023
EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records
Gyubok Lee
Hyeonji Hwang
Seongsu Bae
Yeonsu Kwon
W. Shin
Seongjun Yang
Minjoon Seo
Jong-Yeup Kim
Edward Choi
98
24
0
16 Jan 2023
Bike Frames: Understanding the Implicit Portrayal of Cyclists in the News
Xingmeng Zhao
Dan Schumacher
Sashank Nalluri
Xavier Walton
Suhana Shrestha
Anthony Rios
62
2
0
15 Jan 2023
It's Just a Matter of Time: Detecting Depression with Time-Enriched Multimodal Transformers
Ana-Maria Bucur
Adrian Cosma
Paolo Rosso
Liviu P. Dinu
88
34
0
13 Jan 2023
Adversarial Adaptation for French Named Entity Recognition
Arjun Choudhry
Inder Khatri
Pankaj Gupta
Aaryan Gupta
Maxime Nicol
Marie-Jean Meurs
Dinesh Kumar Vishwakarma
68
0
0
12 Jan 2023
Toward Building General Foundation Models for Language, Vision, and Vision-Language Understanding Tasks
Xinsong Zhang
Yan Zeng
Jipeng Zhang
Hang Li
VLM
AI4CE
LRM
128
17
0
12 Jan 2023
Everyone's Voice Matters: Quantifying Annotation Disagreement Using Demographic Information
Ruyuan Wan
Jaehyung Kim
Dongyeop Kang
78
38
0
12 Jan 2023
Self-Attention Amortized Distributional Projection Optimization for Sliced Wasserstein Point-Cloud Reconstruction
Khai Nguyen
Dang Nguyen
N. Ho
80
9
0
12 Jan 2023
NarrowBERT: Accelerating Masked Language Model Pretraining and Inference
Haoxin Li
Phillip Keung
Daniel Cheng
Jungo Kasai
Noah A. Smith
65
4
0
11 Jan 2023
Few-shot Learning for Cross-Target Stance Detection by Aggregating Multimodal Embeddings
Parisa Jamadi Khiabani
A. Zubiaga
90
12
0
11 Jan 2023
Counteracts: Testing Stereotypical Representation in Pre-trained Language Models
Damin Zhang
Julia Taylor Rayz
Romila Pradhan
79
2
0
11 Jan 2023
Topics in Contextualised Attention Embeddings
Mozhgan Talebpour
A. G. S. D. Herrera
Shoaib Jameel
71
2
0
11 Jan 2023
MGeo: Multi-Modal Geographic Pre-Training Method
Ruixue Ding
Boli Chen
Pengjun Xie
Fei Huang
Xin Li
Qiang-Wei Zhang
Yao Xu
108
20
0
11 Jan 2023
ODIM: Outlier Detection via Likelihood of Under-Fitted Generative Models
Dongha Kim
Jaesung Hwang
Jongjin Lee
Kunwoong Kim
Yongdai Kim
OODD
69
2
0
11 Jan 2023
Towards Answering Climate Questionnaires from Unstructured Climate Reports
Daniel M. Spokoyny
Tanmay Laud
Thomas W. Corringham
Taylor Berg-Kirkpatrick
79
7
0
11 Jan 2023
Structured Case-based Reasoning for Inference-time Adaptation of Text-to-SQL parsers
Abhijeet Awasthi
Soumen Chakrabarti
Sunita Sarawagi
94
5
0
10 Jan 2023
Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension
Zhuosheng Zhang
Hai Zhao
Longxiang Liu
BDL
99
2
0
10 Jan 2023
CHRONOS: Time-Aware Zero-Shot Identification of Libraries from Vulnerability Reports
Yu-zeng Lyu
Thanh Le-Cong
Hong Jin Kang
Ratnadira Widyasari
Zhipeng Zhao
X. Le
Ming Li
David Lo
103
18
0
10 Jan 2023
Understanding the Complexity and Its Impact on Testing in ML-Enabled Systems
Junming Cao
Bihuan Chen
Longjie Hu
Jie Ying Gao
Kaifeng Huang
Xin Peng
72
3
0
10 Jan 2023
Recommending Root-Cause and Mitigation Steps for Cloud Incidents using Large Language Models
Toufique Ahmed
Supriyo Ghosh
Chetan Bansal
Thomas Zimmermann
Xuchao Zhang
Saravan Rajmohan
AI4CE
77
59
0
10 Jan 2023
Cross-Model Comparative Loss for Enhancing Neuronal Utility in Language Understanding
Yunchang Zhu
Liang Pang
Kangxi Wu
Yanyan Lan
Huawei Shen
Xueqi Cheng
AAML
ELM
63
2
0
10 Jan 2023
Neighborhood-Regularized Self-Training for Learning with Few Labels
Ran Xu
Yue Yu
Hejie Cui
Xuan Kan
Yanqiao Zhu
Joyce C. Ho
Chao Zhang
Carl Yang
SSL
114
25
0
10 Jan 2023
Transfer learning for conflict and duplicate detection in software requirement pairs
G. Malik
Savaş Yıldırım
Mucahit Cevik
A. Bener
D. Parikh
56
4
0
09 Jan 2023
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling
Keyu Tian
Yi Jiang
Qishuai Diao
Chen Lin
Liwei Wang
Zehuan Yuan
99
106
0
09 Jan 2023
ERNIE 3.0 Tiny: Frustratingly Simple Method to Improve Task-Agnostic Distillation Generalization
Weixin Liu
Xuyi Chen
Jiaxiang Liu
Shi Feng
Yu Sun
Hao Tian
Hua Wu
97
2
0
09 Jan 2023
AI2: The next leap toward native language based and explainable machine learning framework
J. Dessureault
Daniel Massicotte
53
1
0
09 Jan 2023
Universal Multimodal Representation for Language Understanding
Zhuosheng Zhang
Kehai Chen
Rui Wang
Masao Utiyama
Eiichiro Sumita
Z. Li
Hai Zhao
SSL
109
22
0
09 Jan 2023
Universal Information Extraction as Unified Semantic Matching
Jie Lou
Yaojie Lu
Dai Dai
Wei Jia
Hongyu Lin
Xianpei Han
Le Sun
Hua Wu
82
72
0
09 Jan 2023
Removing Non-Stationary Knowledge From Pre-Trained Language Models for Entity-Level Sentiment Classification in Finance
Seunghyeok Hong
Hanwool Albert Lee
Nahyeon Kang
Moonjeong Hahm
67
8
0
09 Jan 2023
Mitigating Human and Computer Opinion Fraud via Contrastive Learning
Yuliya Tukmacheva
Ivan Oseledets
Evgeny Frolov
27
1
0
08 Jan 2023
InPars-Light: Cost-Effective Unsupervised Training of Efficient Rankers
Leonid Boytsov
Preksha Patel
Vivek Sourabh
Riddhi Nisar
Sayan Kundu
R. Ramanathan
Eric Nyberg
83
20
0
08 Jan 2023
Mind Reasoning Manners: Enhancing Type Perception for Generalized Zero-shot Logical Reasoning over Text
Fangzhi Xu
Jun Liu
Qika Lin
Tianzhe Zhao
Jian Zhang
Lingling Zhang
ReLM
LRM
71
4
0
08 Jan 2023
Traditional Readability Formulas Compared for English
Bruce W. Lee
J. Lee
AIMat
91
6
0
08 Jan 2023
Why do Nearest Neighbor Language Models Work?
Frank F. Xu
Uri Alon
Graham Neubig
RALM
81
23
0
07 Jan 2023
Facilitating Contrastive Learning of Discourse Relational Senses by Exploiting the Hierarchy of Sense Relations
Wanqiu Long
Bonnie Webber
114
34
0
06 Jan 2023
Does compressing activations help model parallel training?
S. Bian
Dacheng Li
Hongyi Wang
Eric P. Xing
Shivaram Venkataraman
81
9
0
06 Jan 2023
Causal Categorization of Mental Health Posts using Transformers
Simranjeet Kaur
Ritika Bhardwaj
Aastha Jain
Muskan Garg
Chandni Saxena
AI4MH
97
1
0
06 Jan 2023
In Defense of Structural Symbolic Representation for Video Event-Relation Prediction
Andrew Lu
Xudong Lin
Yulei Niu
Shih-Fu Chang
106
2
0
06 Jan 2023
Stealthy Backdoor Attack for Code Models
Zhou Yang
Bowen Xu
Jie M. Zhang
Hong Jin Kang
Jieke Shi
Junda He
David Lo
AAML
60
68
0
06 Jan 2023
OPD@NL4Opt: An ensemble approach for the NER task of the optimization problem
Kangxu Wang
Ze Chen
Jiewen Zheng
64
6
0
06 Jan 2023
Text2Poster: Laying out Stylized Texts on Retrieved Images
Chuhao Jin
Hongteng Xu
Ruihua Song
Zhiwu Lu
DiffM
70
8
0
06 Jan 2023
CiT: Curation in Training for Effective Vision-Language Data
Hu Xu
Saining Xie
Po-Yao (Bernie) Huang
Licheng Yu
Russ Howes
Gargi Ghosh
Luke Zettlemoyer
Christoph Feichtenhofer
VLM
DiffM
71
26
0
05 Jan 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Chengyi Wang
Sanyuan Chen
Yu-Huan Wu
Zi-Hua Zhang
Long Zhou
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
199
727
0
05 Jan 2023
Test of Time: Instilling Video-Language Models with a Sense of Time
Piyush Bagad
Makarand Tapaswi
Cees G. M. Snoek
191
37
0
05 Jan 2023