2009.03300
Cited By
Measuring Massive Multitask Language Understanding
7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM
RALM
Papers citing
"Measuring Massive Multitask Language Understanding"
50 / 3,408 papers shown
Title
Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration
Qintong Li
Jiahui Gao
Sheng Wang
Renjie Pi
Xueliang Zhao
Chuan Wu
Xin Jiang
Zhiyu Li
Lingpeng Kong
SyDa
103
3
0
22 Oct 2024
Influential Language Data Selection via Gradient Trajectory Pursuit
Zhiwei Deng
Tao Li
Yang Li
67
1
0
22 Oct 2024
PLDR-LLM: Large Language Model from Power Law Decoder Representations
Burc Gokden
59
1
0
22 Oct 2024
Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes
Bryan R Christ
Zack Gottesman
Jonathan Kropko
Thomas Hartvigsen
LRM
138
4
0
22 Oct 2024
ToW: Thoughts of Words Improve Reasoning in Large Language Models
Zhikun Xu
Ming Shen
Jacob Dineen
Zhaonan Li
Xiao Ye
Shijie Lu
Aswin Rrv
Chitta Baral
Ben Zhou
LRM
454
1
0
21 Oct 2024
Pre-training Distillation for Large Language Models: A Design Space Exploration
Hao Peng
Xin Lv
Yushi Bai
Zijun Yao
Jing Zhang
Lei Hou
Juanzi Li
74
4
0
21 Oct 2024
RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
Yantao Liu
Zijun Yao
Rui Min
Yixin Cao
Lei Hou
Juanzi Li
OffRL
ALM
128
42
0
21 Oct 2024
Exploring Continual Fine-Tuning for Enhancing Language Ability in Large Language Model
Divyanshu Aggarwal
Sankarshan Damle
Navin Goyal
Satya Lokam
Sunayana Sitaram
CLL
65
1
0
21 Oct 2024
Do Large Language Models Have an English Accent? Evaluating and Improving the Naturalness of Multilingual LLMs
Yanzhu Guo
Simone Conia
Zelin Zhou
Min Li
Saloni Potdar
Henry Xiao
85
3
0
21 Oct 2024
Residual vector quantization for KV cache compression in large language model
Ankur Kumar
MQ
104
0
0
21 Oct 2024
OpenMU: Your Swiss Army Knife for Music Understanding
Mengjie Zhao
Zhi-Wei Zhong
Zhuoyuan Mao
Shiqi Yang
Wei-Hsiang Liao
Shusuke Takahashi
Hiromi Wakaki
Yuki Mitsufuji
OSLM
103
8
0
21 Oct 2024
BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data
Wenkai Li
Jiarui Liu
Andy Liu
Xuhui Zhou
Mona Diab
Maarten Sap
165
11
0
21 Oct 2024
Catastrophic Failure of LLM Unlearning via Quantization
Zhiwei Zhang
Fali Wang
Xiaomin Li
Zongyu Wu
Xianfeng Tang
Hui Liu
Qi He
Wenpeng Yin
Suhang Wang
MU
97
18
0
21 Oct 2024
Compute-Constrained Data Selection
Junjie Oscar Yin
Alexander M. Rush
194
1
0
21 Oct 2024
Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence
Norbert Tihanyi
Tamás Bisztray
Richard A. Dubniczky
Rebeka Tóth
B. Borsos
...
Ryan Marinelli
Lucas C. Cordeiro
Merouane Debbah
Vasileios Mavroeidis
Audun Josang
95
5
0
20 Oct 2024
The Best Defense is a Good Offense: Countering LLM-Powered Cyberattacks
Daniel Ayzenshteyn
Roy Weiss
Yisroel Mirsky
AAML
56
2
0
20 Oct 2024
When Machine Unlearning Meets Retrieval-Augmented Generation (RAG): Keep Secret or Forget Knowledge?
Shang Wang
Tianqing Zhu
Dayong Ye
Wanlei Zhou
MU
80
5
0
20 Oct 2024
Lossless KV Cache Compression to 2%
Zhen Yang
Jizong Han
Kan Wu
Ruobing Xie
An Wang
Xingwu Sun
Zhanhui Kang
VLM
MQ
80
2
0
20 Oct 2024
M-RewardBench: Evaluating Reward Models in Multilingual Settings
Srishti Gureja
Lester James V. Miranda
Shayekh Bin Islam
Rishabh Maheshwary
Drishti Sharma
Gusti Winata
Nathan Lambert
Sebastian Ruder
Sara Hooker
Marzieh Fadaee
LRM
142
24
0
20 Oct 2024
Mitigating Forgetting in LLM Supervised Fine-Tuning and Preference Learning
H. Fernando
Han Shen
Parikshit Ram
Yi Zhou
Horst Samulowitz
Nathalie Baracaldo
Tianyi Chen
CLL
169
4
0
20 Oct 2024
Ichigo: Mixed-Modal Early-Fusion Realtime Voice Assistant
Alan Dao
Dinh Bach Vu
Huy Hoang Ha
AuLLM
VLM
141
5
0
20 Oct 2024
An Electoral Approach to Diversify LLM-based Multi-Agent Collective Decision-Making
Xiutian Zhao
Ke Wang
Wei Peng
98
4
0
19 Oct 2024
CAP: Data Contamination Detection via Consistency Amplification
Yi Zhao
Jing Li
Linyi Yang
55
1
0
19 Oct 2024
SPRIG: Improving Large Language Model Performance by System Prompt Optimization
Lechen Zhang
Tolga Ergen
Lajanugen Logeswaran
Moontae Lee
David Jurgens
LRM
79
9
0
18 Oct 2024
MoDification: Mixture of Depths Made Easy
C. Zhang
M. Zhong
Qimeng Wang
Xuantao Lu
Zheyu Ye
...
Yan Gao
Yao Hu
Kehai Chen
Min Zhang
Dawei Song
VLM
MoE
59
2
0
18 Oct 2024
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Xiaochuan Li
Zichun Yu
Chenyan Xiong
SyDa
82
1
0
18 Oct 2024
Make LLMs better zero-shot reasoners: Structure-orientated autonomous reasoning
Pengfei He
Zitao Li
Yue Xing
Yaling Li
Jiliang Tang
Bolin Ding
LLMAG
LRM
43
3
0
18 Oct 2024
SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment
Qin Liu
Fei Wang
Chaowei Xiao
Muhao Chen
457
2
0
18 Oct 2024
Step Guided Reasoning: Improving Mathematical Reasoning using Guidance Generation and Step Reasoning
Lang Cao
Chao Peng
Renhong Chen
Wu Ning
Yingtian Zou
Yitong Li
LRM
111
0
0
18 Oct 2024
Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs
Runchu Tian
Yanghao Li
Yuepeng Fu
Siyang Deng
Qinyu Luo
...
Zhong Zhang
Yesai Wu
Yankai Lin
Huadong Wang
Xiaojiang Liu
131
1
0
18 Oct 2024
RiTeK: A Dataset for Large Language Models Complex Reasoning over Textual Knowledge Graphs
Jiatan Huang
Mingchen Li
Zonghai Yao
Zhichao Yang
Yongkang Xiao
Feiyun Ouyang
Xiaohan Li
Shuo Han
Hong-ye Yu
RALM
140
3
0
17 Oct 2024
Accounting for Sycophancy in Language Model Uncertainty Estimation
Anthony Sicilia
Mert Inan
Malihe Alikhani
71
2
0
17 Oct 2024
A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models
Qiaoyu Tang
Le Yu
Bowen Yu
Hongyu Lin
Keming Lu
Yaojie Lu
Jia Zheng
Le Sun
MoMe
83
1
0
17 Oct 2024
Unearthing Skill-Level Insights for Understanding Trade-Offs of Foundation Models
Mazda Moayeri
Vidhisha Balachandran
Varun Chandrasekaran
Safoora Yousefi
Thomas Fel
Soheil Feizi
Besmira Nushi
Neel Joshi
Vibhav Vineet
71
5
0
17 Oct 2024
BenTo: Benchmark Task Reduction with In-Context Transferability
Hongyu Zhao
Ming Li
Lichao Sun
Tianyi Zhou
98
0
0
17 Oct 2024
Looking Inward: Language Models Can Learn About Themselves by Introspection
Felix J Binder
James Chua
Tomek Korbak
Henry Sleight
John Hughes
Robert Long
Ethan Perez
Miles Turpin
Owain Evans
KELM
AIFin
LRM
95
17
0
17 Oct 2024
Help Me Identify: Is an LLM+VQA System All We Need to Identify Visual Concepts?
Shailaja Keyur Sampat
Maitreya Patel
Yezhou Yang
Chitta Baral
35
0
0
17 Oct 2024
IterSelectTune: An Iterative Training Framework for Efficient Instruction-Tuning Data Selection
Jielin Song
Siyu Liu
Bin Zhu
Yanghui Rao
40
3
0
17 Oct 2024
MedINST: Meta Dataset of Biomedical Instructions
Wenhan Han
Meng Fang
Zihan Zhang
Yu Yin
Zirui Song
Ling-Hao Chen
Mykola Pechenizkiy
Qingyu Chen
LM&MA
69
3
0
17 Oct 2024
Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models
Chengyu Du
Jinyi Han
Yizhou Ying
Aili Chen
Qianyu He
...
Haoran Guo
Jiaqing Liang
Zulong Chen
Liangyue Li
Yanghua Xiao
KELM
CLL
LRM
69
1
0
17 Oct 2024
MoR: Mixture of Ranks for Low-Rank Adaptation Tuning
Chuanyu Tang
Yilong Chen
Zhenyu Zhang
Junyuan Shang
Wenyuan Zhang
Yong Huang
Tingwen Liu
MoE
59
0
0
17 Oct 2024
Breaking Chains: Unraveling the Links in Multi-Hop Knowledge Unlearning
Minseok Choi
C. Park
Dohyun Lee
Jaegul Choo
KELM
MU
55
1
0
17 Oct 2024
From Babbling to Fluency: Evaluating the Evolution of Language Models in Terms of Human Language Acquisition
Qiyuan Yang
Pengda Wang
Luke D. Plonsky
Frederick L. Oswald
Hanjie Chen
ELM
77
2
0
17 Oct 2024
LocateBench: Evaluating the Locating Ability of Vision Language Models
Ting-Rui Chiang
Joshua Robinson
Xinyan Velocity Yu
Dani Yogatama
VLM
ELM
70
0
0
17 Oct 2024
Balancing Label Quantity and Quality for Scalable Elicitation
Alex Troy Mallen
Nora Belrose
75
2
0
17 Oct 2024
Retrieval-Enhanced Named Entity Recognition
Enzo Shiraishi
Raphael Y. de Camargo
Henrique L. P. Silva
Ronaldo C. Prati
RALM
118
0
0
17 Oct 2024
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
Xinze Li
Sen Mei
Zhenghao Liu
Yukun Yan
Shuo Wang
...
Haotian Chen
Ge Yu
Zhiyuan Liu
Maosong Sun
Chenyan Xiong
110
12
0
17 Oct 2024
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation
Yiming Wang
Pei Zhang
Baosong Yang
Derek F. Wong
Rui Wang
LRM
115
15
0
17 Oct 2024
Limits to scalable evaluation at the frontier: LLM as Judge won't beat twice the data
Florian E. Dorner
Vivian Y. Nastl
Moritz Hardt
ELM
ALM
121
10
0
17 Oct 2024
Learning to Route LLMs with Confidence Tokens
Yu-Neng Chuang
Helen Zhou
Prathusha Kameswara Sarma
Parikshit Gopalan
John Boccio
Sara Bolouki
Helen Zhou
85
0
0
17 Oct 2024