Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1907.11692
Cited By
RoBERTa: A Robustly Optimized BERT Pretraining Approach
26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
Re-assign community
ArXiv
PDF
HTML
Papers citing
"RoBERTa: A Robustly Optimized BERT Pretraining Approach"
50 / 4,654 papers shown
Title
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
Yifei He
Yuzheng Hu
Yong Lin
Tong Zhang
Han Zhao
FedML
MoMe
70
19
0
08 Jan 2025
BabyLMs for isiXhosa: Data-Efficient Language Modelling in a Low-Resource Context
Alexis Matzopoulos
Charl Hendriks
Hishaam Mahomed
Francois Meyer
30
0
0
08 Jan 2025
IntegrityAI at GenAI Detection Task 2: Detecting Machine-Generated Academic Essays in English and Arabic Using ELECTRA and Stylometry
Mohammad AL-Smadi
38
0
0
07 Jan 2025
Trust Modeling in Counseling Conversations: A Benchmark Study
Aseem Srivastava
Zuhair Hasan Shaik
Tanmoy Chakraborty
Md. Shad Akhtar
43
0
0
06 Jan 2025
Tougher Text, Smarter Models: Raising the Bar for Adversarial Defence Benchmarks
Yang Wang
Chenghua Lin
ELM
40
0
0
05 Jan 2025
Leveraging Large Language Models and Machine Learning for Smart Contract Vulnerability Detection
S M Mostaq Hossain
Amani Altarawneh
Jesse Roberts
38
0
0
04 Jan 2025
Mathematical Language Models: A Survey
Wen Liu
Hanglei Hu
Jie Zhou
Yuyang Ding
Junsong Li
...
Mengliang He
Qin Chen
Bo Jiang
Aimin Zhou
Liang He
LRM
84
13
0
03 Jan 2025
Efficient support ticket resolution using Knowledge Graphs
Sherwin Varghese
James Tian
39
0
0
03 Jan 2025
Toward Corpus Size Requirements for Training and Evaluating Depression Risk Models Using Spoken Language
Tomek Rutowski
Amir Harati
Elizabeth Shriberg
Yang Lu
Piotr Chlebek
Ricardo Oliveira
53
7
0
03 Jan 2025
U-GIFT: Uncertainty-Guided Firewall for Toxic Speech in Few-Shot Scenario
Jiaxin Song
Xinyu Wang
Yihao Wang
Yifan Tang
Ru Zhang
Jianyi Liu
Gongshen Liu
AAML
50
0
0
03 Jan 2025
On Adversarial Robustness of Language Models in Transfer Learning
Bohdan Turbal
Anastasiia Mazur
Jiaxu Zhao
Mykola Pechenizkiy
AAML
49
0
0
03 Jan 2025
Evaluating Concurrent Robustness of Language Models Across Diverse Challenge Sets
Vatsal Gupta
Pranshu Pandya
Tushar Kataria
Vivek Gupta
Dan Roth
AAML
66
1
0
03 Jan 2025
Proof Recommendation System for the HOL4 Theorem Prover
Nour Dekhil
Adnan Rashid
Sofiene Tahar
51
1
0
31 Dec 2024
Deep Neural Networks and Brain Alignment: Brain Encoding and Decoding (Survey)
S. Oota
Zijiao Chen
Manish Gupta
R. Bapi
G. Jobard
F. Alexandre
X. Hinaut
3DV
AI4CE
49
11
0
31 Dec 2024
Disentangling Preference Representation and Text Generation for Efficient Individual Preference Alignment
Jianfei Zhang
Jun Bai
Yangqiu Song
Yanmeng Wang
Rumei Li
Chenghua Lin
Wenge Rong
44
0
0
31 Dec 2024
Out-of-distribution generalization via composition: a lens through induction heads in Transformers
Jiajun Song
Zhuoyan Xu
Yiqiao Zhong
88
4
0
31 Dec 2024
AdaRankGrad: Adaptive Gradient-Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning
Yehonathan Refael
Jonathan Svirsky
Boris Shustin
Wasim Huleihel
Ofir Lindenbaum
47
3
0
31 Dec 2024
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin
Yuncheng Yang
Pengcheng Guo
Gang Li
Hang Shao
Yuchen Shi
Zihan Xu
Yun Gu
Ke Li
Xing Sun
ALM
96
12
0
31 Dec 2024
CodeIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code
Batu Guan
Yao Wan
Zhangqian Bi
Zheng Wang
Hongyu Zhang
Yulei Sui
Pan Zhou
45
9
0
31 Dec 2024
Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Tim Tsz-Kit Lau
Weijian Li
Chenwei Xu
Han Liu
Mladen Kolar
221
0
0
30 Dec 2024
ELECTRA and GPT-4o: Cost-Effective Partners for Sentiment Analysis
James P. Beno
VLM
47
0
0
29 Dec 2024
LLM Reasoning Engine: Specialized Training for Enhanced Mathematical Reasoning
Shuguang Chen
Guang Lin
LRM
230
0
0
28 Dec 2024
WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models
Huawen Feng
Pu Zhao
Qingfeng Sun
Can Xu
Fangkai Yang
...
Qianli Ma
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
Qi Zhang
AAML
ALM
62
0
0
23 Dec 2024
Grams: Gradient Descent with Adaptive Momentum Scaling
Yang Cao
Xiaoyu Li
Zhao Song
ODL
98
2
0
22 Dec 2024
VaeDiff-DocRE: End-to-end Data Augmentation Framework for Document-level Relation Extraction
Khai Phan Tran
Wen Hua
Xue Li
SyDa
93
0
0
18 Dec 2024
Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection
Le Yang
Ziwei Zheng
Boxu Chen
Zhengyu Zhao
Chenhao Lin
Chao Shen
VLM
145
3
0
18 Dec 2024
RelationField: Relate Anything in Radiance Fields
Sebastian Koch
Johanna Wald
Mirco Colosi
Narunas Vaskevicius
Pedro Hermosilla
F. Tombari
Timo Ropinski
119
1
0
18 Dec 2024
ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models
Yuxi Sun
Wei Gao
Jing Ma
Hongzhan Lin
Ziyang Luo
Wenxuan Zhang
ELM
82
0
0
17 Dec 2024
Can Neural Decompilation Assist Vulnerability Prediction on Binary Code?
D. Cotroneo
F. C. Grasso
R. Natella
V. Orbinato
96
0
0
10 Dec 2024
A Multi-way Parallel Named Entity Annotated Corpus for English, Tamil and Sinhala
Surangika Ranathunga
Asanka Ranasinghea
Janaka Shamala
Ayodya Dandeniyaa
Rashmi Galappaththia
Malithi Samaraweeraa
86
0
0
03 Dec 2024
CPRM: A LLM-based Continual Pre-training Framework for Relevance Modeling in Commercial Search
Kaixin Wu
Yixin Ji
Ziyang Chen
Qiang Wang
Cunxiang Wang
...
Jia Xu
Zhongyi Liu
Jinjie Gu
Yuan Zhou
Linjian Mo
KELM
CLL
97
0
0
02 Dec 2024
Referring Video Object Segmentation via Language-aligned Track Selection
Seongchan Kim
Woojeong Jin
Sangbeom Lim
Heeji Yoon
Hyunwook Choi
Seungryong Kim
VOS
97
0
0
02 Dec 2024
SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model
Chunlin Yu
Hanqing Wang
Ye Shi
Haoyang Luo
Sibei Yang
Jingyi Yu
Jingya Wang
LRM
LM&Ro
99
1
0
02 Dec 2024
ChemTEB: Chemical Text Embedding Benchmark, an Overview of Embedding Models Performance & Efficiency on a Specific Domain
Ali Shiraee Kasmaee
Mohammad Khodadad
Mohammad Arshi Saloot
Nick Sherck
Stephen Dokas
H. Mahyar
Soheila Samiee
ELM
263
0
0
30 Nov 2024
Train Once for All: A Transitional Approach for Efficient Aspect Sentiment Triplet Extraction
Xinmeng Hou
Lingyue Fu
Chenhao Meng
Hai Hu
Wuqi Wang
Hai Hu
75
0
0
29 Nov 2024
GREAT: Geometry-Intention Collaborative Inference for Open-Vocabulary 3D Object Affordance Grounding
Yawen Shao
Wei-dong Zhai
Yuhang Yang
Hongchen Luo
Yang Cao
Zheng-jun Zha
100
1
0
29 Nov 2024
Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning
Kaustubh Ponkshe
Raghav Singhal
Eduard A. Gorbunov
Alexey Tumanov
Samuel Horváth
Praneeth Vepakomma
76
3
0
29 Nov 2024
EzSQL: An SQL intermediate representation for improving SQL-to-text Generation
Meher Bhardwaj
Hrishikesh Ethari
Dennis Singh Moirangthem
AI4TS
82
0
0
28 Nov 2024
MetaphorShare: A Dynamic Collaborative Repository of Open Metaphor Datasets
Joanne Boisson
Arif Mehmood
Jose Camacho-Collados
69
0
0
27 Nov 2024
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation
Claudia Cuttano
Gabriele Trivigno
Gabriele Rosi
Carlo Masone
Giuseppe Averta
VOS
112
2
0
26 Nov 2024
KinMo: Kinematic-aware Human Motion Understanding and Generation
Pengfei Zhang
Pinxin Liu
Hyeongwoo Kim
Pablo Garrido
Bindita Chaudhuri
96
1
0
23 Nov 2024
Comparative Analysis of Pooling Mechanisms in LLMs: A Sentiment Analysis Perspective
Jinming Xing
Ruilin Xing
Yan Sun
Ruilin Xing
LLMAG
76
2
0
22 Nov 2024
Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios
Shantanu Jaiswal
Debaditya Roy
Basura Fernando
Cheston Tan
ReLM
LRM
79
2
0
20 Nov 2024
Just KIDDIN: Knowledge Infusion and Distillation for Detection of INdecent Memes
Rahul Garg
Trilok Padhi
Hemang Jain
Ugur Kursuncu
Ponnurangam Kumaraguru
83
3
0
19 Nov 2024
Hysteresis Activation Function for Efficient Inference
Moshe Kimhi
Idan Kashani
A. Mendelson
Chaim Baskin
LLMSV
49
0
0
15 Nov 2024
BabyLM Challenge: Exploring the Effect of Variation Sets on Language Model Training Efficiency
Akari Haga
Akiyo Fukatsu
Miyu Oba
Arianna Bisazza
Yohei Oseki
35
1
0
14 Nov 2024
Prompt-enhanced Network for Hateful Meme Classification
Junxi Liu
Yanyan Feng
Jiehai Chen
Yun Xue
Fenghuan Li
VLM
63
0
0
12 Nov 2024
HeartBERT: A Self-Supervised ECG Embedding Model for Efficient and Effective Medical Signal Analysis
Saedeh Tahery
Fatemeh Hamid Akhlaghi
Termeh Amirsoleimani
OOD
88
1
0
08 Nov 2024
Beemo: Benchmark of Expert-edited Machine-generated Outputs
Ekaterina Artemova
Jason Samuel Lucas
Saranya Venkatraman
Jooyoung Lee
Sergei Tilga
Adaku Uchendu
Vladislav Mikhailov
DeLMO
MoE
71
6
0
06 Nov 2024
Confidence Calibration of Classifiers with Many Classes
Adrien LeCoz
Stéphane Herbin
Faouzi Adjed
UQCV
39
1
0
05 Nov 2024
Previous
1
2
3
...
5
6
7
...
92
93
94
Next