E-Bench: Towards Evaluating the Ease-of-Use of Large Language Models

16 June 2024
Zhenyu Zhang
Bingguang Hao
Jinpeng Li
Zekai Zhang
Dongyan Zhao
arXiv: 2406.10950

Papers citing "E-Bench: Towards Evaluating the Ease-of-Use of Large Language Models"

13 / 13 papers shown
SUGARCREPE++ Dataset: Vision-Language Model Sensitivity to Semantic and Lexical Alterations
Sri Harsha Dumpala
Aman Jaiswal
Chandramouli Shama Sastry
E. Milios
Sageev Oore
Hassan Sajjad
CoGe
83
12
0
17 Jun 2024
How are Prompts Different in Terms of Sensitivity?
Sheng Lu
Hendrik Schuff
Iryna Gurevych
71
19
0
13 Nov 2023
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
Boxin Wang
Weixin Chen
Hengzhi Pei
Chulin Xie
Mintong Kang
...
Zinan Lin
Yuk-Kit Cheng
Sanmi Koyejo
Basel Alomair
Yue Liu
119
430
0
20 Jun 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG, MLLM
1.5K
14,748
0
15 Mar 2023
On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective
Jindong Wang
Xixu Hu
Wenxin Hou
Hao Chen
Runkai Zheng
...
Weirong Ye
Xiubo Geng
Binxing Jiao
Yue Zhang
Xingxu Xie
AI4MH
139
236
0
22 Feb 2023
GLUE-X: Evaluating Natural Language Understanding Models from an Out-of-distribution Generalization Perspective
Linyi Yang
Shuibai Zhang
Libo Qin
Yafu Li
Yidong Wang
Hanmeng Liu
Jindong Wang
Xingxu Xie
Yue Zhang
ELM
118
82
0
15 Nov 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM
888
13,207
0
04 Mar 2022
Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models
Boxin Wang
Chejian Xu
Shuohang Wang
Zhe Gan
Yu Cheng
Jianfeng Gao
Ahmed Hassan Awadallah
Yangqiu Song
VLM, ELM, AAML
72
225
0
04 Nov 2021
RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems
Baolin Peng
Chunyuan Li
Zhu Zhang
Chenguang Zhu
Jinchao Li
Jianfeng Gao
61
50
0
29 Dec 2020
OpenAttack: An Open-source Textual Adversarial Attack Toolkit
Guoyang Zeng
Fanchao Qi
Qianrui Zhou
Ting Zhang
Zixian Ma
Bairu Hou
Yuan Zang
Zhiyuan Liu
Maosong Sun
AAML
174
124
0
19 Sep 2020
TextBugger: Generating Adversarial Text Against Real-world Applications
Jinfeng Li
S. Ji
Tianyu Du
Bo Li
Ting Wang
SILM, AAML
216
747
0
13 Dec 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.1K
7,201
0
20 Apr 2018
Generating Natural Adversarial Examples
Zhengli Zhao
Dheeru Dua
Sameer Singh
GAN, AAML
186
601
0
31 Oct 2017