RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 10,764 papers shown
Never Train from Scratch: Fair Comparison of Long-Sequence Models Requires Data-Driven Priors
Ido Amos
Jonathan Berant
Ankit Gupta
112
29
0
04 Oct 2023
Out-of-Distribution Detection by Leveraging Between-Layer Transformation Smoothness
Fran Jelenić
Josip Jukić
Martin Tutek
Mate Puljiz
Jan Šnajder
OODD
86
7
0
04 Oct 2023
Multimodal Prompt Transformer with Hybrid Contrastive Learning for Emotion Recognition in Conversation
Shihao Zou
Xianying Huang
Xudong Shen
74
10
0
04 Oct 2023
MIDDAG: Where Does Our News Go? Investigating Information Diffusion via Community-Level Information Pathways
Mingyu Derek Ma
Alexander K. Taylor
Nuan Wen
Yanchen Liu
Po-Nien Kung
...
Azure Zhou
Diyi Yang
Xuezhe Ma
Nanyun Peng
Wei Wang
48
2
0
04 Oct 2023
On the Cognition of Visual Question Answering Models and Human Intelligence: A Comparative Study
Liben Chen
Long Chen
Tian Ellison-Chen
Zhuoyuan Xu
LRM
36
0
0
04 Oct 2023
A Deep Reinforcement Learning Approach for Interactive Search with Sentence-level Feedback
Jianghong Zhou
Joyce C. Ho
Chen Lin
Eugene Agichtein
61
0
0
03 Oct 2023
Talk2BEV: Language-enhanced Bird's-eye View Maps for Autonomous Driving
Tushar Choudhary
Vikrant Dewangan
Shivam Chandhok
Shubham Priyadarshan
Anushka Jain
A. K. Singh
Siddharth Srivastava
Krishna Murthy Jatavallabhula
K. M. Krishna
102
66
0
03 Oct 2023
Editing Personality for Large Language Models
Shengyu Mao
Xiaohan Wang
Meng Wang
Yong Jiang
Pengjun Xie
Yan Zhang
Ningyu Zhang
KELM
88
11
0
03 Oct 2023
Sieve: Multimodal Dataset Pruning Using Image Captioning Models
Anas Mahmoud
Mostafa Elhoushi
Amro Abbas
Yu Yang
Newsha Ardalani
Hugh Leather
Ari S. Morcos
VLM, CLIP
80
21
0
03 Oct 2023
The Inhibitor: ReLU and Addition-Based Attention for Efficient Transformers
Rickard Brannvall
46
0
0
03 Oct 2023
Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion
Alexandru Meterez
Amir Joudaki
Francesco Orabona
Alexander Immer
Gunnar Rätsch
Hadi Daneshmand
71
8
0
03 Oct 2023
Selective Feature Adapter for Dense Vision Transformers
XueQing Deng
Qi Fan
Xiaojie Jin
Linjie Yang
Peng Wang
63
0
0
03 Oct 2023
Zero-Shot Continuous Prompt Transfer: Generalizing Task Semantics Across Language Models
Zijun Wu
Yongkang Wu
Lili Mou
VLM
79
5
0
02 Oct 2023
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
Pingzhi Li
Zhenyu Zhang
Prateek Yadav
Yi-Lin Sung
Yu Cheng
Mohit Bansal
Tianlong Chen
MoMe
85
39
0
02 Oct 2023
FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language Models
Jingwei Sun
Ziyue Xu
Hongxu Yin
Dong Yang
Daguang Xu
Yiran Chen
Holger R. Roth
VLM
144
26
0
02 Oct 2023
On the Generalization of Training-based ChatGPT Detection Methods
Han Xu
Jie Ren
Pengfei He
Shenglai Zeng
Yingqian Cui
Amy Liu
Hui Liu
Jiliang Tang
DeLMO
77
13
0
02 Oct 2023
SPELL: Semantic Prompt Evolution based on a LLM
Yujian Betterest Li
Kai Wu
96
12
0
02 Oct 2023
ScaLearn: Simple and Highly Parameter-Efficient Task Transfer by Learning to Scale
Markus Frohmann
Carolin Holtermann
Shahed Masoudian
Anne Lauscher
Navid Rekabsaz
96
2
0
02 Oct 2023
From Bricks to Bridges: Product of Invariances to Enhance Latent Space Communication
Irene Cannistraci
Luca Moschella
Marco Fumero
Valentino Maiorca
Emanuele Rodolà
108
14
0
02 Oct 2023
Label Supervised LLaMA Finetuning
Zongxi Li
Xianming Li
Yuzhang Liu
Haoran Xie
Jing Li
F. Wang
Qing Li
Xiaoqin Zhong
ALM
64
23
0
02 Oct 2023
Gotcha! This Model Uses My Code! Evaluating Membership Leakage Risks in Code Models
Zhou Yang
Zhipeng Zhao
Chenyu Wang
Jieke Shi
Dongsun Kim
Donggyun Han
David Lo
SILM, AAML, MIACV
113
12
0
02 Oct 2023
Synthetic Data Generation in Low-Resource Settings via Fine-Tuning of Large Language Models
Jean Kaddour
Qi Liu
SyDa
61
2
0
02 Oct 2023
Language Model Decoding as Direct Metrics Optimization
Haozhe Ji
Pei Ke
Hongning Wang
Minlie Huang
61
7
0
02 Oct 2023
EALM: Introducing Multidimensional Ethical Alignment in Conversational Information Retrieval
Yiyao Yu
Junjie Wang
Yuxiang Zhang
Lin Zhang
Yujiu Yang
Tetsuya Sakai
74
1
0
02 Oct 2023
Resolving Knowledge Conflicts in Large Language Models
Yike Wang
Shangbin Feng
Heng Wang
Weijia Shi
Vidhisha Balachandran
Tianxing He
Yulia Tsvetkov
108
20
0
02 Oct 2023
Fooling the Textual Fooler via Randomizing Latent Representations
Duy C. Hoang
Quang H. Nguyen
Saurav Manchanda
MinLong Peng
Kok-Seng Wong
Khoa D. Doan
SILM, AAML
70
0
0
02 Oct 2023
DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models
Yongchan Kwon
Eric Wu
K. Wu
James Zou
DiffM, TDI
96
68
0
02 Oct 2023
Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models
Man Luo
Shrinidhi Kumbhar
Ming Shen
Mihir Parmar
Neeraj Varshney
Pratyay Banerjee
Somak Aditya
Chitta Baral
ReLM, ELM, LRM
137
31
0
02 Oct 2023
TRAM: Benchmarking Temporal Reasoning for Large Language Models
Yuqing Wang
Yun Zhao
LRM
111
14
0
02 Oct 2023
LEGO-Prover: Neural Theorem Proving with Growing Libraries
Haiming Wang
Huajian Xin
Chuanyang Zheng
Lin Li
Zhengying Liu
...
Enze Xie
Jian Yin
Zhenguo Li
Heng Liao
Xiaodan Liang
LRM
127
74
0
01 Oct 2023
PETA: Parameter-Efficient Trojan Attacks
Lauren Hong
Ting Wang
AAML
92
1
0
01 Oct 2023
Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals
Y. Gat
Nitay Calderon
Amir Feder
Alexander Chapanin
Amit Sharma
Roi Reichart
133
36
0
01 Oct 2023
City Foundation Models for Learning General Purpose Representations from OpenStreetMap
Pasquale Balsebre
Weiming Huang
Gao Cong
Yi Li
AI4CE
110
18
0
01 Oct 2023
A Brief History of Prompt: Leveraging Language Models. (Through Advanced Prompting)
G. Muktadir
SILM
52
10
0
30 Sep 2023
It HAS to be Subjective: Human Annotator Simulation via Zero-shot Density Estimation
Wen Wu
Jiajun He
Chuxu Zhang
P. Woodland
60
1
0
30 Sep 2023
Question-Answering Model for Schizophrenia Symptoms and Their Impact on Daily Life using Mental Health Forums Data
Christian Internò
Eloisa Ambrosini
AI4MH
115
0
0
30 Sep 2023
Enhancing Representation Generalization in Authorship Identification
Haining Wang
64
0
0
30 Sep 2023
Unlocking Bias Detection: Leveraging Transformer-Based Models for Content Analysis
Shaina Raza
Oluwanifemi Bamgbose
Veronica Chatrath
Shardul Ghuge
Yan Sidyakin
Abdullah Y. Muaad
87
13
0
30 Sep 2023
RelBERT: Embedding Relations with Language Models
Asahi Ushio
Jose Camacho-Collados
Steven Schockaert
KELM
80
1
0
30 Sep 2023
Active Learning Based Fine-Tuning Framework for Speech Emotion Recognition
Dongyuan Li
Yusong Wang
Kotaro Funakoshi
Manabu Okumura
103
4
0
30 Sep 2023
Bridging the Gap Between Foundation Models and Heterogeneous Federated Learning
Sixing Yu
J. P. Muñoz
Ali Jannesari
AI4CE
124
8
0
30 Sep 2023
STRONG -- Structure Controllable Legal Opinion Summary Generation
Yang Zhong
Diane Litman
ELM, AILaw
60
3
0
29 Sep 2023
GAIA-1: A Generative World Model for Autonomous Driving
Masane Fuchi
Lloyd Russell
Hudson Yeo
Zak Murez
Hiroto Minami
Alex Kendall
Tomohiro Takagi
Gianluca Corrado
VGen
130
252
0
29 Sep 2023
Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks
Hao Chen
Jindong Wang
Ankit Shah
Ran Tao
Hongxin Wei
Berfin Şimşek
Masashi Sugiyama
Bhiksha Raj
108
31
0
29 Sep 2023
Junk DNA Hypothesis: Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs
Lu Yin
Ajay Jaiswal
Shiwei Liu
Souvik Kundu
Zhangyang Wang
74
7
0
29 Sep 2023
Qwen Technical Report
Jinze Bai
Shuai Bai
Yunfei Chu
Zeyu Cui
Kai Dang
...
Zhenru Zhang
Chang Zhou
Jingren Zhou
Xiaohuan Zhou
Tianhang Zhu
OSLM
369
1,921
0
28 Sep 2023
Navigating Healthcare Insights: A Birds Eye View of Explainability with Knowledge Graphs
Satvik Garg
Anh Nguyen
Somya Garg
83
2
0
28 Sep 2023
Unsupervised Pretraining for Fact Verification by Language Model Distillation
A. Bazaga
Pietro Lio
Bo Dai
HILM
104
2
0
28 Sep 2023
Prompt-and-Align: Prompt-Based Social Alignment for Few-Shot Fake News Detection
Jiaying Wu
Xinyu Chen
Haobin Yang
Qi Zhao
Yuhui Shi
AAML
85
12
0
28 Sep 2023
Leveraging Pre-trained Language Models for Time Interval Prediction in Text-Enhanced Temporal Knowledge Graphs
Duygu Sezen Islakoglu
Mel Chekol
Yannis Velegrakis
56
1
0
28 Sep 2023