2306.05685
Cited By
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
9 June 2023
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
Yonghao Zhuang
Zi Lin
Zhuohan Li
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
Papers citing
"Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena"
50 / 2,990 papers shown
Title
Tiny LVLM-eHub: Early Multimodal Experiments with Bard
Wenqi Shao
Yutao Hu
Peng Gao
Meng Lei
Kaipeng Zhang
...
Peng Xu
Siyuan Huang
Hongsheng Li
Yuning Qiao
Ping Luo
VLM
MLLM
40
2
0
07 Aug 2023
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
Weihao Yu
Zhengyuan Yang
Linjie Li
Jianfeng Wang
Kevin Qinghong Lin
Zicheng Liu
Xinchao Wang
Lijuan Wang
MLLM
60
623
0
04 Aug 2023
Text2KGBench: A Benchmark for Ontology-Driven Knowledge Graph Generation from Text
Nandana Mihindukulasooriya
Sanju Tiwari
Carlos F. Enguix
K. Lata
44
55
0
04 Aug 2023
ESRL: Efficient Sampling-based Reinforcement Learning for Sequence Generation
Chenglong Wang
Hang Zhou
Yimin Hu
Yi Huo
Bei Li
Tongran Liu
Tong Xiao
Jingbo Zhu
32
8
0
04 Aug 2023
A Survey of Spanish Clinical Language Models
Guillem García Subies
Á. Jiménez
Paloma Martínez
LM&MA
ELM
LRM
34
0
0
04 Aug 2023
Wider and Deeper LLM Networks are Fairer LLM Evaluators
Xinghua Zhang
Yu Bowen
Haiyang Yu
Yangyu Lv
Tingwen Liu
Fei Huang
Hongbo Xu
Yongbin Li
ALM
85
84
0
03 Aug 2023
ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation
Xueying Du
Mingwei Liu
Kaixin Wang
Hanlin Wang
Junwei Liu
Yixuan Chen
Jiayi Feng
Chaofeng Sha
Xin Peng
Yiling Lou
ELM
ALM
31
141
0
03 Aug 2023
Local Large Language Models for Complex Structured Medical Tasks
V. Bumgardner
Aaron D. Mullen
Samuel E. Armstrong
Caylin D. Hickey
Jeffrey A. Talbert
41
5
0
03 Aug 2023
DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales
Z. Yao
Reza Yazdani Aminabadi
Olatunji Ruwase
Samyam Rajbhandari
Xiaoxia Wu
...
Heyang Qin
Masahiro Tanaka
Shuai Che
Shuaiwen Leon Song
Yuxiong He
ALM
OffRL
48
69
0
02 Aug 2023
Evaluating Instruction-Tuned Large Language Models on Code Comprehension and Generation
Zhiqiang Yuan
Junwei Liu
Qiancheng Zi
Mingwei Liu
Xin Peng
Yiling Lou
ALM
ELM
LRM
36
74
0
02 Aug 2023
Generative Models as a Complex Systems Science: How can we make sense of large language model behavior?
Ari Holtzman
Peter West
Luke Zettlemoyer
AI4CE
52
14
0
31 Jul 2023
Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection
Jun Yan
Vikas Yadav
Shiyang Li
Lichang Chen
Zheng Tang
Hai Wang
Vijay Srinivasan
Xiang Ren
Hongxia Jin
SILM
44
85
0
31 Jul 2023
NLLG Quarterly arXiv Report 06/23: What are the most influential current AI Papers?
Steffen Eger
Christoph Leiter
Jonas Belouadi
Ran Zhang
Aida Kostikova
Daniil Larionov
Yanran Chen
Vivian Fresen
AI4CE
48
4
0
31 Jul 2023
Camoscio: an Italian Instruction-tuned LLaMA
Andrea Santilli
Emanuele Rodolà
37
26
0
31 Jul 2023
HouYi: An open-source large language model specially designed for renewable energy and carbon neutrality field
Mingliang Bai
Zhihao Zhou
Ruidong Wang
Yusheng Yang
Zizhen Qin
Yunxia Chen
Chunjin Mu
Jinfu Liu
Daren Yu
37
2
0
31 Jul 2023
CHATREPORT: Democratizing Sustainability Disclosure Analysis through LLM-based Tools
Jingwei Ni
J. Bingler
Chiara Colesanti-Senni
Mathias Kraus
Glen Gostlow
...
Qian Wang
Nicolas Webersinke
Tobias Wekhof
Ting Yu
Markus Leippold
39
29
0
28 Jul 2023
Universal and Transferable Adversarial Attacks on Aligned Language Models
Andy Zou
Zifan Wang
Nicholas Carlini
Milad Nasr
J. Zico Kolter
Matt Fredrikson
121
1,328
0
27 Jul 2023
SuperCLUE: A Comprehensive Chinese Large Language Model Benchmark
Liang Xu
Anqi Li
Lei Zhu
Han Xue
Changtai Zhu
Kangkang Zhao
Hao He
Xuanwei Zhang
Qiyue Kang
Zhenzhong Lan
RALM
ELM
LRM
20
51
0
27 Jul 2023
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Muhammad Awais
Muzammal Naseer
Salman Khan
Rao Muhammad Anwer
Hisham Cholakkal
M. Shah
Ming-Hsuan Yang
Fahad Shahbaz Khan
VLM
51
120
0
25 Jul 2023
Fashion Matrix: Editing Photos by Just Talking
Zheng Chong
Xujie Zhang
Fuwei Zhao
Zhenyu Xie
Xiaodan Liang
DiffM
36
2
0
25 Jul 2023
L-Eval: Instituting Standardized Evaluation for Long Context Language Models
Chen An
Shansan Gong
Ming Zhong
Xingjian Zhao
Mukai Li
Jun Zhang
Lingpeng Kong
Xipeng Qiu
ELM
ALM
53
138
0
20 Jul 2023
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
Seonghyeon Ye
Doyoung Kim
Sungdong Kim
Hyeonbin Hwang
Seungone Kim
Yongrae Jo
James Thorne
Juho Kim
Minjoon Seo
ALM
60
101
0
20 Jul 2023
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI
Jianguo Zhang
Kun Qian
Zhiwei Liu
Shelby Heinecke
Rui Meng
Ye Liu
Zhou Yu
Huan Wang
Silvio Savarese
Caiming Xiong
44
22
0
19 Jul 2023
Code Detection for Hardware Acceleration Using Large Language Models
Pablo Antonio Martínez
Gregorio Bernabé
J. M. García
31
2
0
19 Jul 2023
Emotional Intelligence of Large Language Models
Xuena Wang
Xueting Li
Zi Yin
Yue Wu
Tsinghua University
30
77
0
18 Jul 2023
AlpaGasus: Training A Better Alpaca with Fewer Data
Lichang Chen
Shiyang Li
Jun Yan
Hai Wang
Kalpa Gunaratna
...
Zheng Tang
Vijay Srinivasan
Dinesh Manocha
Heng-Chiao Huang
Hongxia Jin
ALM
59
0
0
17 Jul 2023
MasterKey: Automated Jailbreak Across Multiple Large Language Model Chatbots
Gelei Deng
Yi Liu
Yuekang Li
Kailong Wang
Ying Zhang
Zefeng Li
Haoyu Wang
Tianwei Zhang
Yang Liu
SILM
42
122
0
16 Jul 2023
LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models
Adian Liusie
Potsawee Manakul
Mark Gales
ELM
39
36
0
15 Jul 2023
Large Language Models Understand and Can be Enhanced by Emotional Stimuli
Cheng-rong Li
Jindong Wang
Yixuan Zhang
Kaijie Zhu
Wenxin Hou
Jianxun Lian
Fang Luo
Qiang Yang
Xingxu Xie
LRM
80
122
0
14 Jul 2023
MMBench: Is Your Multi-modal Model an All-around Player?
Yuanzhan Liu
Haodong Duan
Yuanhan Zhang
Yue Liu
Songyang Zhang
...
Jiaqi Wang
Conghui He
Ziwei Liu
Kai-xiang Chen
Dahua Lin
29
933
0
12 Jul 2023
Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration
Zhenhailong Wang
Shaoguang Mao
Wenshan Wu
Tao Ge
Furu Wei
Heng Ji
LLMAG
LRM
34
134
0
11 Jul 2023
Emu: Generative Pretraining in Multimodality
Quan-Sen Sun
Qiying Yu
Yufeng Cui
Fan Zhang
Xiaosong Zhang
Yueze Wang
Hongcheng Gao
Jingjing Liu
Tiejun Huang
Xinlong Wang
MLLM
45
128
0
11 Jul 2023
Piecing Together Clues: A Benchmark for Evaluating the Detective Skills of Large Language Models
Zhouhong Gu
Lin Zhang
Jiangjie Chen
Haoning Ye
Xiaoxuan Zhu
...
Jianchen Wang
Yikai Zhang
Wenhao Huang
Yanghua Xiao
Hongwei Feng
RALM
ELM
36
0
0
11 Jul 2023
Secrets of RLHF in Large Language Models Part I: PPO
Rui Zheng
Shihan Dou
Songyang Gao
Yuan Hua
Wei Shen
...
Hang Yan
Tao Gui
Qi Zhang
Xipeng Qiu
Xuanjing Huang
ALM
OffRL
55
159
0
11 Jul 2023
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
Shilong Zhang
Pei Sun
Shoufa Chen
Min Xiao
Wenqi Shao
Wenwei Zhang
Yu Liu
Kai-xiang Chen
Ping Luo
VLM
MLLM
92
226
0
07 Jul 2023
A Survey on Evaluation of Large Language Models
Yu-Chu Chang
Xu Wang
Jindong Wang
Yuanyi Wu
Linyi Yang
...
Yue Zhang
Yi-Ju Chang
Philip S. Yu
Qian Yang
Xingxu Xie
ELM
LM&MA
ALM
85
1,551
0
06 Jul 2023
Style Over Substance: Evaluation Biases for Large Language Models
Minghao Wu
Alham Fikri Aji
ALM
ELM
51
43
0
06 Jul 2023
What Should Data Science Education Do with Large Language Models?
Xinming Tu
James Zou
Weijie J. Su
Linjun Zhang
AI4Ed
47
33
0
06 Jul 2023
Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning
Deepanway Ghosal
Yew Ken Chia
Navonil Majumder
Soujanya Poria
ALM
LRM
38
17
0
05 Jul 2023
Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models
Jinhao Duan
Hao-Ran Cheng
Shiqi Wang
Alex Zavalny
Chenan Wang
Renjing Xu
B. Kailkhura
Kaidi Xu
61
37
0
03 Jul 2023
Visual Instruction Tuning with Polite Flamingo
Delong Chen
Jianfeng Liu
Wenliang Dai
Baoyuan Wang
MLLM
39
43
0
03 Jul 2023
Preference Ranking Optimization for Human Alignment
Feifan Song
Yu Bowen
Minghao Li
Haiyang Yu
Fei Huang
Yongbin Li
Houfeng Wang
ALM
34
240
0
30 Jun 2023
On the Exploitability of Instruction Tuning
Manli Shu
Jiong Wang
Chen Zhu
Jonas Geiping
Chaowei Xiao
Tom Goldstein
SILM
58
93
0
28 Jun 2023
Composing Parameter-Efficient Modules with Arithmetic Operations
Jinghan Zhang
Shiqi Chen
Junteng Liu
Junxian He
KELM
MoMe
36
111
0
26 Jun 2023
H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Zhenyu Zhang
Ying Sheng
Dinesh Manocha
Tianlong Chen
Lianmin Zheng
...
Yuandong Tian
Christopher Ré
Clark W. Barrett
Zhangyang Wang
Beidi Chen
VLM
71
263
0
24 Jun 2023
Computron: Serving Distributed Deep Learning Models with Model Parallel Swapping
Daniel Zou
X. Jin
Xueyang Yu
Haotian Zhang
J. Demmel
MoE
32
0
0
24 Jun 2023
Towards Regulatable AI Systems: Technical Gaps and Policy Opportunities
Xudong Shen
H. Brown
Jiashu Tao
Martin Strobel
Yao Tong
Akshay Narayan
Harold Soh
Finale Doshi-Velez
49
3
0
22 Jun 2023
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models
Shizhe Diao
Rui Pan
Hanze Dong
Kashun Shum
Jipeng Zhang
Wei Xiong
Tong Zhang
ALM
42
63
0
21 Jun 2023
Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts
Xuan-Phi Nguyen
Sharifah Mahani Aljunied
Shafiq Joty
Lidong Bing
55
33
0
20 Jun 2023
CHORUS: Foundation Models for Unified Data Discovery and Exploration
Moe Kayali
A. Lykov
Ilias Fountalis
N. Vasiloglou
Dan Olteanu
Dan Suciu
40
22
0
16 Jun 2023