2303.00293
How Robust is GPT-3.5 to Predecessors? A Comprehensive Study on Language Understanding Tasks
1 March 2023
Xuanting Chen
Junjie Ye
Can Zu
Nuo Xu
Rui Zheng
Minlong Peng
Jie Zhou
Tao Gui
Qi Zhang
Xuanjing Huang
AI4MH
ELM
Papers citing
"How Robust is GPT-3.5 to Predecessors? A Comprehensive Study on Language Understanding Tasks"
44 / 44 papers shown
Title
SignX: The Foundation Model for Sign Recognition
Sen Fang
Chunyu Sui
Hongwei Yi
C. Neidle
Dimitris N. Metaxas
SLR
40
0
0
22 Apr 2025
Assessing how hyperparameters impact Large Language Models' sarcasm detection performance
Montgomery Gole
Andriy Miranskyy
AI4MH
23
0
0
08 Apr 2025
Enhancing NER Performance in Low-Resource Pakistani Languages using Cross-Lingual Data Augmentation
Toqeer Ehsan
Thamar Solorio
190
0
0
07 Apr 2025
Pastiche Novel Generation Creating: Fan Fiction You Love in Your Favorite Author's Style
Xueran Han
Yuhan Liu
Mingzhe Li
Wei Liu
Sen Hu
Rui Yan
Zhiqiang Xu
Preslav Nakov
69
0
0
24 Feb 2025
LIFT: Improving Long Context Understanding Through Long Input Fine-Tuning
Yansheng Mao
Jiaqi Li
Fanxu Meng
Jing Xiong
Zilong Zheng
Muhan Zhang
LLMAG
RALM
101
1
0
18 Dec 2024
A Novel Psychometrics-Based Approach to Developing Professional Competency Benchmark for Large Language Models
Elena Kardanova
Alina Ivanova
Ksenia Tarasova
Taras Pashchenko
Aleksei Tikhoniuk
Elen Yusupova
Anatoly Kasprzhak
Yaroslav Kuzminov
Ekaterina Kruchinskaia
Irina Brun
47
1
0
29 Oct 2024
Robots in the Middle: Evaluating LLMs in Dispute Resolution
Jinzhe Tan
Hannes Westermann
Nikhil Reddy Pottanigari
Jaromír Šavelka
Sébastien Meeùs
Mia Godet
Karim Benyekhlef
47
1
0
09 Oct 2024
Unraveling the Mechanics of Learning-Based Demonstration Selection for In-Context Learning
Hui Liu
Wenya Wang
Hao Sun
Chris Xing Tian
Chenqi Kong
Xin Dong
Haoliang Li
54
5
0
14 Jun 2024
PostDoc: Generating Poster from a Long Multimodal Document Using Deep Submodular Optimization
Vijay Jaisankar
Sambaran Bandyopadhyay
Kalp Vyas
Varre Chaitanya
Shwetha Somasundaram
32
2
0
30 May 2024
Comparative Study of Domain Driven Terms Extraction Using Large Language Models
Sandeep Chataut
Tuyen Do
Bichar Dip Shrestha Gurung
Shiva Aryal
Anup Khanal
Carol Lushbough
Etienne Z. Gnimpieba
30
10
0
02 Apr 2024
Continual Few-shot Event Detection via Hierarchical Augmentation Networks
Chenlong Zhang
Pengfei Cao
Yubo Chen
Kang Liu
Qing Cui
Mengshu Sun
Jun Zhao
38
3
0
26 Mar 2024
Subspace Defense: Discarding Adversarial Perturbations by Learning a Subspace for Clean Signals
Rui Zheng
Yuhao Zhou
Zhiheng Xi
Tao Gui
Qi Zhang
Xuanjing Huang
AAML
55
0
0
24 Mar 2024
Motion Generation from Fine-grained Textual Descriptions
Kunhang Li
Yansong Feng
DiffM
VGen
37
1
0
20 Mar 2024
SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models
Xiang Gao
Jiaxin Zhang
Lalla Mouatadid
Kamalika Das
29
10
0
04 Mar 2024
RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions
Yuan Zhang
Xiao Wang
Zhiheng Xi
Han Xia
Tao Gui
Qi Zhang
Xuanjing Huang
51
3
0
26 Feb 2024
LLM-DA: Data Augmentation via Large Language Models for Few-Shot Named Entity Recognition
Junjie Ye
Nuo Xu
Yikun Wang
Jie Zhou
Qi Zhang
Tao Gui
Xuanjing Huang
31
15
0
22 Feb 2024
Exploring ChatGPT for Next-generation Information Retrieval: Opportunities and Challenges
Yizheng Huang
Jimmy X. Huang
35
10
0
17 Feb 2024
ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages
Junjie Ye
Sixian Li
Guanyu Li
Caishuang Huang
Songyang Gao
Yilong Wu
Qi Zhang
Tao Gui
Xuanjing Huang
LLMAG
38
17
0
16 Feb 2024
Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs
Simone Balloccu
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
SILM
ELM
PILM
26
158
0
06 Feb 2024
RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning
Junjie Ye
Yilong Wu
Songyang Gao
Caishuang Huang
Sixian Li
Guanyu Li
Xiaoran Fan
Qi Zhang
Tao Gui
Xuanjing Huang
AAML
35
16
0
16 Jan 2024
LLM-MARS: Large Language Model for Behavior Tree Generation and NLP-enhanced Dialogue in Multi-Agent Robot Systems
Artem Lykov
Maria Dronova
Nikolay Naglov
Mikhail Litvinov
Sergei Satsevich
Artem Bazhenov
Vladimir Berman
Aleksei Shcherbak
Dzmitry Tsetserukou
LLMAG
LM&Ro
29
14
0
14 Dec 2023
On Sarcasm Detection with OpenAI GPT-based Models
Montgomery Gole
Williams-Paul Nwadiugwu
Andriy Miranskyy
21
8
0
07 Dec 2023
LooGLE: Can Long-Context Language Models Understand Long Contexts?
Jiaqi Li
Mengmeng Wang
Zilong Zheng
Muhan Zhang
ELM
RALM
40
107
0
08 Nov 2023
Boosting Data Analytics With Synthetic Volume Expansion
Xiaotong Shen
Yifei Liu
Rex Shen
19
3
0
27 Oct 2023
Evaluating General-Purpose AI with Psychometrics
Xiting Wang
Liming Jiang
Jose Hernandez-Orallo
David Stillwell
Luning Sun
Fang Luo
Xing Xie
AI4MH
ELM
38
12
0
25 Oct 2023
Counter Turing Test CT^2: AI-Generated Text Detection is Not as Easy as You May Think -- Introducing AI Detectability Index
Megha Chakraborty
S.M. Towhidul Islam Tonmoy
S. M. Mehedi
Krish Sharma
Niyar R. Barman
...
Tanay Kumar
Vinija Jain
Aman Chadha
Amit P. Sheth
Amitava Das
DeLMO
22
21
0
08 Oct 2023
Knowledge Graph Question Answering for Materials Science (KGQA4MAT): Developing Natural Language Interface for Metal-Organic Frameworks Knowledge Graph (MOF-KG) Using LLM
Yuan An
Jane Greenberg
Alexander Kalinowski
Xintong Zhao
Xiaohua Hu
F. Uribe-Romo
Kyle Langlois
Jacob Furst
Diego A. Gómez-Gualdrón
32
3
0
20 Sep 2023
Rethinking STS and NLI in Large Language Models
Yuxia Wang
Minghan Wang
Preslav Nakov
LRM
20
3
0
16 Sep 2023
Automatic Scam-Baiting Using ChatGPT
P. Bajaj
Matthew Edwards
24
6
0
04 Sep 2023
Language models as master equation solvers
Chuanbo Liu
Jin Wang
41
0
0
29 Jul 2023
Look Before You Leap: An Exploratory Study of Uncertainty Measurement for Large Language Models
Yuheng Huang
Jiayang Song
Zhijie Wang
Shengming Zhao
Huaming Chen
Felix Juefei-Xu
Lei Ma
28
34
0
16 Jul 2023
GPTAraEval: A Comprehensive Evaluation of ChatGPT on Arabic NLP
Md. Tawkat Islam Khondaker
Abdul Waheed
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
ELM
LM&MA
32
62
0
24 May 2023
Robust Prompt Optimization for Large Language Models Against Distribution Shifts
Moxin Li
Wenjie Wang
Fuli Feng
Yixin Cao
Jizhi Zhang
Tat-Seng Chua
OffRL
42
15
0
23 May 2023
Distilling Robustness into Natural Language Inference Models with Domain-Targeted Augmentation
Joe Stacey
Marek Rei
32
2
0
22 May 2023
Sensitivity and Robustness of Large Language Models to Prompt Template in Japanese Text Classification Tasks
Chengguang Gan
Tatsunori Mori
AAML
22
17
0
15 May 2023
InstructUIE: Multi-task Instruction Tuning for Unified Information Extraction
Xiao Wang
Wei Zhou
Can Zu
Han Xia
Tianze Chen
...
Tao Gui
Jihua Kang
J. Yang
Siyuan Li
Chunsai Du
39
147
0
17 Apr 2023
ArguGPT: evaluating, understanding and identifying argumentative essays generated by GPT models
Yikang Liu
Ziyin Zhang
Wanyang Zhang
Shisen Yue
Xiaojing Zhao
Xinyuan Cheng
Yiwen Zhang
Hai Hu
DeLMO
19
49
0
16 Apr 2023
Towards Interpretable Mental Health Analysis with Large Language Models
Kailai Yang
Shaoxiong Ji
Tianlin Zhang
Qianqian Xie
Zi-Zhou Kuang
Sophia Ananiadou
ELM
AI4MH
LRM
35
59
0
06 Apr 2023
A Comprehensive Capability Analysis of GPT-3 and GPT-3.5 Series Models
Junjie Ye
Xuanting Chen
Nuo Xu
Can Zu
Zekai Shao
...
Jie Zhou
Siming Chen
Tao Gui
Qi Zhang
Xuanjing Huang
ELM
38
309
0
18 Mar 2023
Can ChatGPT Replace Traditional KBQA Models? An In-depth Analysis of the Question Answering Performance of the GPT LLM Family
Yiming Tan
Dehai Min
Y. Li
Wenbo Li
Nan Hu
Yongrui Chen
Guilin Qi
AI4MH
ELM
49
95
0
14 Mar 2023
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
215
1,661
0
15 Oct 2021
Trustworthy AI: A Computational Perspective
Haochen Liu
Yiqi Wang
Wenqi Fan
Xiaorui Liu
Yaxin Li
Shaili Jain
Yunhao Liu
Anil K. Jain
Jiliang Tang
FaML
104
196
0
12 Jul 2021
DynaSent: A Dynamic Benchmark for Sentiment Analysis
Christopher Potts
Zhengxuan Wu
Atticus Geiger
Douwe Kiela
230
77
0
30 Dec 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
264
4,505
0
23 Jan 2020