Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2009.03300
Cited By
v1
v2
v3 (latest)
Measuring Massive Multitask Language Understanding
7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM
RALM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Measuring Massive Multitask Language Understanding"
50 / 3,408 papers shown
Title
ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic
Fajri Koto
Haonan Li
Sara Shatnawi
Jad Doughman
Abdelrahman Boda Sadallah
...
Neha Sengupta
Shady Shehata
Nizar Habash
Preslav Nakov
Timothy Baldwin
ELM
LRM
155
44
0
20 Feb 2024
Me LLaMA: Foundation Large Language Models for Medical Applications
Qianqian Xie
Qingyu Chen
Aokun Chen
C.A.I. Peng
Yan Hu
...
Huan He
Lucila Ohno-Machido
Yonghui Wu
Hua Xu
Jiang Bian
LM&MA
AI4MH
131
4
0
20 Feb 2024
Thermometer: Towards Universal Calibration for Large Language Models
Maohao Shen
Subhro Das
Kristjan Greenewald
P. Sattigeri
Greg Wornell
Soumya Ghosh
100
11
0
20 Feb 2024
HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts
Hao Zhao
Zihan Qiu
Huijia Wu
Zili Wang
Zhaofeng He
Jie Fu
MoE
127
13
0
20 Feb 2024
Reflect-RL: Two-Player Online RL Fine-Tuning for LMs
Runlong Zhou
Simon S. Du
Beibin Li
OffRL
89
4
0
20 Feb 2024
Defending Jailbreak Prompts via In-Context Adversarial Game
Yujun Zhou
Yufei Han
Haomin Zhuang
Kehan Guo
Zhenwen Liang
Hongyan Bao
Xiangliang Zhang
LLMAG
AAML
117
15
0
20 Feb 2024
Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question?
Nishant Balepur
Abhilasha Ravichander
Rachel Rudinger
ELM
122
28
0
19 Feb 2024
LoRA+: Efficient Low Rank Adaptation of Large Models
Soufiane Hayou
Nikhil Ghosh
Bin Yu
AI4CE
120
188
0
19 Feb 2024
Empirical Study on Updating Key-Value Memories in Transformer Feed-forward Layers
Zihan Qiu
Zeyu Huang
Youcheng Huang
Jie Fu
KELM
72
5
0
19 Feb 2024
Stick to Your Role! Context-dependence and Stability of Personal Value Expression in Large Language Models
Grgur Kovač
Rémy Portelas
Masataka Sawayama
Peter Ford Dominey
Pierre-Yves Oudeyer
LLMAG
57
1
0
19 Feb 2024
Learning to Edit: Aligning LLMs with Knowledge Editing
Yuxin Jiang
Yufei Wang
Chuhan Wu
Wanjun Zhong
Xingshan Zeng
...
Xin Jiang
Lifeng Shang
Ruiming Tang
Qun Liu
Wei Wang
KELM
96
30
0
19 Feb 2024
Automating Dataset Updates Towards Reliable and Timely Evaluation of Large Language Models
Jiahao Ying
Yixin Cao
Yushi Bai
Qianru Sun
Bo Wang
Wei Tang
Zhaojun Ding
Yizhe Yang
Xuanjing Huang
Shuicheng Yan
KELM
59
10
0
19 Feb 2024
Revisiting Knowledge Distillation for Autoregressive Language Models
Qihuang Zhong
Liang Ding
Li Shen
Juhua Liu
Bo Du
Dacheng Tao
KELM
117
19
0
19 Feb 2024
ROSE Doesn't Do That: Boosting the Safety of Instruction-Tuned Large Language Models with Reverse Prompt Contrastive Decoding
Qihuang Zhong
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
LM&MA
100
23
0
19 Feb 2024
InMD-X: Large Language Models for Internal Medicine Doctors
Hansle Gwon
Imjin Ahn
Hyoje Jung
Byeolhee Kim
Young-Hak Kim
Tae Joon Jun
LM&MA
85
1
0
19 Feb 2024
Head-wise Shareable Attention for Large Language Models
Zouying Cao
Yifei Yang
Hai Zhao
71
4
0
19 Feb 2024
FIPO: Free-form Instruction-oriented Prompt Optimization with Preference Dataset and Modular Fine-tuning Schema
Junru Lu
Siyu An
Min Zhang
Yulan He
Di Yin
Xing Sun
127
2
0
19 Feb 2024
Generation Meets Verification: Accelerating Large Language Model Inference with Smart Parallel Auto-Correct Decoding
Hanling Yi
Feng-Huei Lin
Hongbin Li
Peiyang Ning
Xiaotian Yu
Rong Xiao
LRM
82
13
0
19 Feb 2024
Uncertainty quantification in fine-tuned LLMs using LoRA ensembles
Oleksandr Balabanov
Hampus Linander
UQCV
116
20
0
19 Feb 2024
Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once?
Seunghyeok Hong
Sangwon Baek
Sangdae Nam
Guijin Son
Seungone Kim
ELM
LRM
119
17
0
18 Feb 2024
KMMLU: Measuring Massive Multitask Language Understanding in Korean
Seunghyeok Hong
Hanwool Albert Lee
Sungdong Kim
Seungone Kim
Niklas Muennighoff
Taekyoon Choi
Cheonbok Park
Kang Min Yoo
Stella Biderman
ALM
RALM
ELM
116
44
0
18 Feb 2024
Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation
Xunjian Yin
Xu Zhang
Jie Ruan
Xiaojun Wan
ELM
112
24
0
18 Feb 2024
SciAgent: Tool-augmented Language Models for Scientific Reasoning
Yubo Ma
Zhibin Gou
Junheng Hao
Ruochen Xu
Shuohang Wang
...
Yujiu Yang
Yixin Cao
Aixin Sun
Hany Awadalla
Weizhu Chen
RALM
LRM
LLMAG
128
24
0
18 Feb 2024
Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation
Siyuan Wang
Zhuohan Long
Zhihao Fan
Zhongyu Wei
Xuanjing Huang
LLMAG
102
38
0
18 Feb 2024
Self-seeding and Multi-intent Self-instructing LLMs for Generating Intent-aware Information-Seeking dialogs
Arian Askari
Roxana Petcu
Chuan Meng
Mohammad Aliannejadi
Amin Abolghasemi
Evangelos Kanoulas
Suzan Verberne
115
9
0
18 Feb 2024
CliqueParcel: An Approach For Batching LLM Prompts That Jointly Optimizes Efficiency And Faithfulness
Jiayi Liu
Tinghan Yang
Jennifer Neville
56
11
0
17 Feb 2024
OneBit: Towards Extremely Low-bit Large Language Models
Yuzhuang Xu
Xu Han
Zonghan Yang
Shuo Wang
Qingfu Zhu
Zhiyuan Liu
Weidong Liu
Wanxiang Che
MQ
119
46
0
17 Feb 2024
Multi-Perspective Consistency Enhances Confidence Estimation in Large Language Models
Pei Wang
Yejie Wang
Muxi Diao
Keqing He
Guanting Dong
Weiran Xu
167
0
0
17 Feb 2024
Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents
Wenkai Yang
Xiaohan Bi
Yankai Lin
Sishuo Chen
Jie Zhou
Xu Sun
LLMAG
AAML
135
71
0
17 Feb 2024
LaCo: Large Language Model Pruning via Layer Collapse
Yifei Yang
Zouying Cao
Hai Zhao
90
64
0
17 Feb 2024
Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models
Zihao Lin
Mohammad Beigi
Hongxuan Li
Yufan Zhou
Yuxiang Zhang
Qifan Wang
Wenpeng Yin
Lifu Huang
KELM
70
9
0
16 Feb 2024
Language Models as Science Tutors
Alexis Chevalier
Jiayi Geng
Alexander Wettig
Howard Chen
Sebastian Mizera
...
Jiatong Yu
Jun-Jie Zhu
Z. Ren
Sanjeev Arora
Danqi Chen
ELM
74
13
0
16 Feb 2024
Humans or LLMs as the Judge? A Study on Judgement Biases
Guiming Hardy Chen
Shunian Chen
Ziche Liu
Feng Jiang
Benyou Wang
208
113
0
16 Feb 2024
AbsInstruct: Eliciting Abstraction Ability from LLMs through Explanation Tuning with Plausibility Estimation
Zhaowei Wang
Wei Fan
Qing Zong
Hongming Zhang
Sehyun Choi
Tianqing Fang
Xin Liu
Yangqiu Song
Ginny Wong
Simon See
94
14
0
16 Feb 2024
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation
Dayou Du
Yijia Zhang
Shijie Cao
Jiaqi Guo
Ting Cao
Xiaowen Chu
Ningyi Xu
MQ
114
37
0
16 Feb 2024
Enhancing Role-playing Systems through Aggressive Queries: Evaluation and Improvement
Yihong Tang
Jiao Ou
Che Liu
Fuzheng Zhang
Di Zhang
Kun Gai
101
5
0
16 Feb 2024
Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements
Ming Li
Jiuhai Chen
Lichang Chen
Dinesh Manocha
148
21
0
16 Feb 2024
Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models
Hanxing Ding
Liang Pang
Zihao Wei
Huawei Shen
Xueqi Cheng
HILM
RALM
146
18
0
16 Feb 2024
QDyLoRA: Quantized Dynamic Low-Rank Adaptation for Efficient Large Language Model Tuning
Hossein Rajabzadeh
Mojtaba Valipour
Tianshu Zhu
Marzieh S. Tahaei
Hyock Ju Kwon
Ali Ghodsi
Boxing Chen
Mehdi Rezagholizadeh
67
10
0
16 Feb 2024
BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains
Yanis Labrak
Adrien Bazoge
Emmanuel Morin
P. Gourraud
Mickael Rouvier
Richard Dufour
223
228
0
15 Feb 2024
SportsMetrics: Blending Text and Numerical Data to Understand Information Fusion in LLMs
Yebowen Hu
Kaiqiang Song
Sangwoo Cho
Xiaoyang Wang
H. Foroosh
Dong Yu
Fei Liu
84
9
0
15 Feb 2024
A StrongREJECT for Empty Jailbreaks
Alexandra Souly
Qingyuan Lu
Dillon Bowen
Tu Trinh
Elvis Hsieh
...
Pieter Abbeel
Justin Svegliato
Scott Emmons
Olivia Watkins
Sam Toyer
113
98
0
15 Feb 2024
Data Engineering for Scaling Language Models to 128K Context
Yao Fu
Yikang Shen
Xinyao Niu
Xiang Yue
Hanna Hajishirzi
Yoon Kim
Hao-Chun Peng
MoE
116
145
0
15 Feb 2024
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
Ming Li
Lichang Chen
Jiuhai Chen
Shwai He
Jiuxiang Gu
Dinesh Manocha
149
59
0
15 Feb 2024
Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence
Weixiang Zhao
Zhuojun Li
Shilong Wang
Yang Wang
Yulin Hu
Yanyan Zhao
Chen Wei
Bing Qin
116
5
0
15 Feb 2024
Inadequacies of Large Language Model Benchmarks in the Era of Generative Artificial Intelligence
Timothy R. McIntosh
Teo Susnjak
Tong Liu
Paul Watters
Malka N. Halgamuge
ALM
ELM
108
58
0
15 Feb 2024
NutePrune: Efficient Progressive Pruning with Numerous Teachers for Large Language Models
Shengrui Li
Junzhe Chen
Xueting Han
Jing Bai
83
6
0
15 Feb 2024
How to Train Data-Efficient LLMs
Noveen Sachdeva
Benjamin Coleman
Wang-Cheng Kang
Jianmo Ni
Lichan Hong
Ed H. Chi
James Caverlee
Julian McAuley
D. Cheng
104
64
0
15 Feb 2024
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
Xiaoying Zhang
Baolin Peng
Ye Tian
Jingyan Zhou
Lifeng Jin
Linfeng Song
Haitao Mi
Helen Meng
HILM
86
52
0
14 Feb 2024
Instruction Tuning for Secure Code Generation
Jingxuan He
Mark Vero
Gabriela Krasnopolska
Martin Vechev
94
24
0
14 Feb 2024
Previous
1
2
3
...
51
52
53
...
67
68
69
Next