2009.03300
Measuring Massive Multitask Language Understanding
7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM
RALM
Papers citing
"Measuring Massive Multitask Language Understanding"
50 / 3,408 papers shown
Title
OntoTune: Ontology-Driven Self-training for Aligning Large Language Models
Zhiqiang Liu
Chengtao Gan
Junjie Wang
Yanzhe Zhang
Zhongpu Bo
Mengshu Sun
Ningyu Zhang
Wen Zhang
120
2
0
08 Feb 2025
BCQ: Block Clustered Quantization for 4-bit (W4A4) LLM Inference
Reena Elangovan
Charbel Sakr
A. Raghunathan
Brucek Khailany
MQ
121
1
0
07 Feb 2025
M-IFEval: Multilingual Instruction-Following Evaluation
Antoine Dussolle
Andrea Cardeña Díaz
Shota Sato
Peter Devine
ELM
174
0
0
07 Feb 2025
Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents
Ilia Karmanov
A. Deshmukh
Lukas Voegtle
Philipp Fischer
Kateryna Chumachenko
...
Jarno Seppänen
Jupinder Parmar
Pritam Gundecha
Andrew Tao
Karan Sapra
135
1
0
06 Feb 2025
Decoder-Only LLMs are Better Controllers for Diffusion Models
Ziyi Dong
Yao Xiao
Pengxu Wei
Liang Lin
DiffM
216
0
0
06 Feb 2025
The Order Effect: Investigating Prompt Sensitivity to Input Order in LLMs
Bryan Guan
Tanya Roosta
Peyman Passban
Mehdi Rezagholizadeh
131
0
0
06 Feb 2025
Training an LLM-as-a-Judge Model: Pipeline, Insights, and Practical Lessons
Renjun Hu
Yi Cheng
Libin Meng
Jiaxin Xia
Yi Zong
Xing Shi
Wei Lin
164
2
0
05 Feb 2025
Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning
Yibo Yan
Shen Wang
Jiahao Huo
Jingheng Ye
Zhendong Chu
Xuming Hu
Philip S. Yu
Carla P. Gomes
B. Selman
Qingsong Wen
LRM
223
17
0
05 Feb 2025
Can LLMs Maintain Fundamental Abilities under KV Cache Compression?
Xiang Liu
Zhenheng Tang
Hong Chen
Peijie Dong
Zeyu Li
Xiuze Zhou
Bo Li
Xuming Hu
Xiaowen Chu
475
7
0
04 Feb 2025
Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs
Angelina Wang
Michelle Phan
Daniel E. Ho
Sanmi Koyejo
144
2
0
04 Feb 2025
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
Maohao Shen
Guangtao Zeng
Zhenting Qi
Zhang-Wei Hong
Zhenfang Chen
Wei Lu
G. Wornell
Subhro Das
David D. Cox
Chuang Gan
LRM
LLMAG
561
18
0
04 Feb 2025
Are Language Models Up to Sequential Optimization Problems? From Evaluation to a Hegelian-Inspired Enhancement
Soheil Abbasloo
LRM
70
0
0
04 Feb 2025
Evalita-LLM: Benchmarking Large Language Models on Italian
Bernardo Magnini
Roberto Zanoli
Michele Resta
Martin Cimmino
Paolo Albano
Marco Madeddu
V. Patti
173
1
0
04 Feb 2025
Robust Federated Finetuning of LLMs via Alternating Optimization of LoRA
Shuangyi Chen
Yuanxin Guo
Yue Ju
Harik Dalal
Ashish Khisti
115
2
0
03 Feb 2025
Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities
Zora Che
Stephen Casper
Robert Kirk
Anirudh Satheesh
Stewart Slocum
...
Zikui Cai
Bilal Chughtai
Y. Gal
Furong Huang
Dylan Hadfield-Menell
MU
AAML
ELM
181
7
0
03 Feb 2025
Evaluation of Large Language Models via Coupled Token Generation
N. C. Benz
Stratis Tsirtsis
Eleni Straitouri
Ivi Chatzi
Ander Artola Velasco
Suhas Thejaswi
Manuel Gomez Rodriguez
112
1
0
03 Feb 2025
PARA: Parameter-Efficient Fine-tuning with Prompt Aware Representation Adjustment
Zequan Liu
Yi Zhao
Ming Tan
Wei Zhu
Aaron Xuxiang Tian
162
0
0
03 Feb 2025
QLESS: A Quantized Approach for Data Valuation and Selection in Large Language Model Fine-Tuning
Moses Ananta
Muhammad Farid Adilazuarda
Zayd Muhammad Kawakibi Zuhri
Ayu Purwarianti
Alham Fikri Aji
MQ
161
0
0
03 Feb 2025
Develop AI Agents for System Engineering in Factorio
Neel Kant
121
1
0
03 Feb 2025
Breaking Focus: Contextual Distraction Curse in Large Language Models
Yue Huang
Yanbo Wang
Zixiang Xu
Chujie Gao
Siyuan Wu
Jiayi Ye
Preslav Nakov
Pin-Yu Chen
Wei Wei
AAML
109
4
0
03 Feb 2025
Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial?
Wenzhe Li
Yong Lin
Mengzhou Xia
Chi Jin
MoE
148
4
0
02 Feb 2025
LLM-Powered Benchmark Factory: Reliable, Generic, and Efficient
Peiwen Yuan
Shaoxiong Feng
Yiwei Li
Xiaobei Wang
Y. Zhang
Jiayi Shi
Chuyi Tan
Boyuan Pan
Yao Hu
Kan Li
107
4
0
02 Feb 2025
Advanced Weakly-Supervised Formula Exploration for Neuro-Symbolic Mathematical Reasoning
Yuxuan Wu
Hideki Nakayama
NAI
91
1
0
02 Feb 2025
PolarQuant: Leveraging Polar Transformation for Efficient Key Cache Quantization and Decoding Acceleration
Songhao Wu
Ang Lv
Xiao Feng
Yanzhe Zhang
Xun Zhang
Guojun Yin
Wei Lin
Rui Yan
MQ
96
1
0
01 Feb 2025
UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models
Xin Xu
Qiyun Xu
Tong Xiao
Tianhao Chen
Yuchen Yan
Jiaxin Zhang
Shizhe Diao
Can Yang
Yang Wang
LRM
AI4CE
ELM
282
8
0
01 Feb 2025
CoddLLM: Empowering Large Language Models for Data Analytics
Jiani Zhang
Hengrui Zhang
Rishav Chakravarti
Yiqun Hu
Patrick Ng
Asterios Katsifodimos
Huzefa Rangwala
George Karypis
Alon Halevy
SyDa
ELM
463
0
0
01 Feb 2025
Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment
Gregor Bachmann
Sotiris Anagnostidis
Albert Pumarola
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Edgar Schönfeld
Ali K. Thabet
Jonas Kohler
ALM
BDL
162
16
0
31 Jan 2025
Memory-Efficient Fine-Tuning of Transformers via Token Selection
Antoine Simoulin
Namyong Park
Xiaoyi Liu
Grey Yang
195
1
0
31 Jan 2025
Ensembles of Low-Rank Expert Adapters
Yinghao Li
Vianne Gao
Chao Zhang
MohamadAli Torkamani
169
0
0
31 Jan 2025
Improving Your Model Ranking on Chatbot Arena by Vote Rigging
Rui Min
Tianyu Pang
Chao Du
Qian Liu
Minhao Cheng
Min Lin
AAML
108
4
0
29 Jan 2025
DFPE: A Diverse Fingerprint Ensemble for Enhancing LLM Performance
Seffi Cohen
Niv Goldshlager
Nurit Cohen-Inger
Bracha Shapira
Lior Rokach
150
1
0
29 Jan 2025
SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains
Ran Xu
Hui Liu
Sreyashi Nag
Zhenwei Dai
Yaochen Xie
...
Chen Luo
Yang Li
Joyce C. Ho
Carl Yang
Qi He
RALM
179
11
0
28 Jan 2025
BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models
Yibin Wang
Haizhou Shi
Ligong Han
Dimitris N. Metaxas
Hao Wang
BDL
UQLM
233
13
0
28 Jan 2025
Beyond Benchmarks: On The False Promise of AI Regulation
Gabriel Stanovsky
Renana Keydar
Gadi Perl
Eliya Habba
92
2
0
28 Jan 2025
A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics
Kai He
Rui Mao
Qika Lin
Yucheng Ruan
Xiang Lan
Mengling Feng
Min Zhang
LM&MA
AILaw
232
177
0
28 Jan 2025
OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas
Xiaoyang Wang
Han Zhang
Tao Ge
Wenhao Yu
Dian Yu
Dong Yu
AI4CE
141
3
0
28 Jan 2025
Mix-of-Granularity: Optimize the Chunking Granularity for Retrieval-Augmented Generation
Zijie Zhong
Hanwen Liu
Xiaoya Cui
Xiaofan Zhang
Zengchang Qin
145
8
0
28 Jan 2025
Unleashing the Potential of Large Language Models as Prompt Optimizers: Analogical Analysis with Gradient-based Model Optimizers
Xinyu Tang
Xiaolei Wang
Wayne Xin Zhao
Siyuan Lu
Yaliang Li
Ji-Rong Wen
LRM
133
18
0
28 Jan 2025
StringLLM: Understanding the String Processing Capability of Large Language Models
Xilong Wang
Hao Fu
Jindong Wang
Neil Zhenqiang Gong
179
0
0
28 Jan 2025
Baichuan-Omni-1.5 Technical Report
Yadong Li
Qingbin Liu
Tao Zhang
Tao Zhang
Tian Jin
...
Jianhua Xu
Haoze Sun
Mingan Lin
Guosheng Dong
Xin Wu
AuLLM
184
23
0
28 Jan 2025
LCTG Bench: LLM Controlled Text Generation Benchmark
Kemal Kurniawan
Masato Mita
Peinan Zhang
S. Sasaki
Ryosuke Ishigami
Naoaki Okazaki
117
0
0
28 Jan 2025
GUIDE: A Global Unified Inference Engine for Deploying Large Language Models in Heterogeneous Environments
Yanyu Chen
Ganhong Huang
160
0
0
28 Jan 2025
SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
Mohammad Mozaffari
Amir Yazdanbakhsh
Zhao Zhang
M. Dehnavi
146
7
0
28 Jan 2025
AdaCoT: Rethinking Cross-Lingual Factual Reasoning through Adaptive Chain-of-Thought
Xin Huang
Tarun K. Vangani
Zhengyuan Liu
Bowei Zou
Ai Ti Aw
LRM
AI4CE
173
2
0
27 Jan 2025
Complete Chess Games Enable LLM Become A Chess Master
Yinqi Zhang
Xintian Han
Haolong Li
Kedi Chen
Shaohui Lin
ReLM
ELM
106
0
0
26 Jan 2025
You Only Prune Once: Designing Calibration-Free Model Compression With Policy Learning
Ayan Sengupta
Siddhant Chaudhary
Tanmoy Chakraborty
122
4
0
25 Jan 2025
LLM Evaluation Based on Aerospace Manufacturing Expertise: Automated Generation and Multi-Model Question Answering
Beiming Liu
Zhizhuo Cui
Siteng Hu
Xiaohua Li
Haifeng Lin
Zhengxin Zhang
ELM
43
2
0
25 Jan 2025
Option-ID Based Elimination For Multiple Choice Questions
Zhenhao Zhu
Bulou Liu
Qingyao Ai
Yang Liu
135
0
0
25 Jan 2025
SEAL: Scaling to Emphasize Attention for Long-Context Retrieval
Changhun Lee
Jun-gyu Jin
Younghyun Cho
Eunhyeok Park
RALM
LRM
119
0
0
25 Jan 2025
The Karp Dataset
Mason DiCicco
Eamon Worden
Conner Olsen
Nikhil Gangaram
Daniel Reichman
Neil T. Heffernan
ReLM
LRM
138
0
0
24 Jan 2025