Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2009.03300
Cited By
v1
v2
v3 (latest)
Measuring Massive Multitask Language Understanding
7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM
RALM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Measuring Massive Multitask Language Understanding"
50 / 3,408 papers shown
Title
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
Xinze Li
Sen Mei
Zhenghao Liu
Yukun Yan
Shuo Wang
...
Haotian Chen
Ge Yu
Zhiyuan Liu
Maosong Sun
Chenyan Xiong
110
12
0
17 Oct 2024
Hiding-in-Plain-Sight (HiPS) Attack on CLIP for Targetted Object Removal from Images
Arka Daw
Megan Hong-Thanh Chung
Maria Mahbub
Amir Sadovnik
AAML
82
0
0
16 Oct 2024
Self-Pluralising Culture Alignment for Large Language Models
Shaoyang Xu
Yongqi Leng
Linhao Yu
Deyi Xiong
56
3
0
16 Oct 2024
Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization
Phillip Guo
Aaquib Syed
Abhay Sheshadri
Aidan Ewart
Gintare Karolina Dziugaite
KELM
MU
107
10
0
16 Oct 2024
Merge to Learn: Efficiently Adding Skills to Language Models with Model Merging
Jacob Morrison
Noah A. Smith
Hannaneh Hajishirzi
Pang Wei Koh
Jesse Dodge
Pradeep Dasigi
KELM
MoMe
CLL
107
5
0
16 Oct 2024
Iter-AHMCL: Alleviate Hallucination for Large Language Model via Iterative Model-level Contrastive Learning
Huiwen Wu
Xiaohan Li
Xiaogang Xu
Xiaogang Xu
Deyi Zhang
Zhe Liu
MLLM
CLL
VLM
87
0
0
16 Oct 2024
Agent Skill Acquisition for Large Language Models via CycleQD
So Kuroki
Taishi Nakamura
Takuya Akiba
Yujin Tang
MoMe
158
2
0
16 Oct 2024
Open Ko-LLM Leaderboard2: Bridging Foundational and Practical Evaluation for Korean LLMs
Hyeonwoo Kim
Dahyun Kim
Jihoo Kim
Sukyung Lee
Y. Kim
Chanjun Park
99
0
0
16 Oct 2024
Semantics-Adaptive Activation Intervention for LLMs via Dynamic Steering Vectors
Weixuan Wang
J. Yang
Wei Peng
LLMSV
112
4
0
16 Oct 2024
JudgeBench: A Benchmark for Evaluating LLM-based Judges
Sijun Tan
Siyuan Zhuang
Kyle Montgomery
William Y. Tang
Alejandro Cuadron
Chenguang Wang
Raluca A. Popa
Ion Stoica
ELM
ALM
155
52
0
16 Oct 2024
Conformity in Large Language Models
Xiaochen Zhu
Caiqi Zhang
Tom Stafford
Nigel Collier
Andreas Vlachos
129
0
0
16 Oct 2024
FVEval: Understanding Language Model Capabilities in Formal Verification of Digital Hardware
Minwoo Kang
Mingjie Liu
Ghaith Bany Hamad
Syed Suhaib
Haoxing Ren
LRM
46
3
0
15 Oct 2024
Concept-Reversed Winograd Schema Challenge: Evaluating and Improving Robust Reasoning in Large Language Models via Abstraction
Kaiqiao Han
Tianqing Fang
Zhaowei Wang
Yangqiu Song
Mark Steedman
LRM
146
4
0
15 Oct 2024
MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router
Yanyue Xie
Zhi Zhang
Ding Zhou
Cong Xie
Ziang Song
Xin Liu
Yanzhi Wang
Xue Lin
An Xu
LLMAG
89
5
0
15 Oct 2024
Black-box Uncertainty Quantification Method for LLM-as-a-Judge
Nico Wagner
Michael Desmond
Rahul Nair
Zahra Ashktorab
Elizabeth M. Daly
Qian Pan
Martin Santillan Cooper
James M. Johnson
Werner Geyer
ELM
UQCV
75
5
0
15 Oct 2024
Tending Towards Stability: Convergence Challenges in Small Language Models
Richard Diehl Martinez
Pietro Lesci
P. Buttery
90
4
0
15 Oct 2024
TSDS: Data Selection for Task-Specific Model Finetuning
Zifan Liu
Amin Karbasi
Theodoros Rekatsinas
78
6
0
15 Oct 2024
Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence
Shangbin Feng
Zifeng Wang
Yike Wang
Sayna Ebrahimi
Hamid Palangi
...
Nathalie Rauschmayr
Yejin Choi
Yulia Tsvetkov
Chen-Yu Lee
Tomas Pfister
MoMe
103
9
0
15 Oct 2024
G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks
Guibin Zhang
Xinfeng Li
Xiangguo Sun
Guancheng Wan
Miao Yu
Sihang Li
Kun Wang
Dawei Cheng
Dawei Cheng
AAML
AI4CE
201
20
0
15 Oct 2024
MIND: Math Informed syNthetic Dialogues for Pretraining LLMs
Syeda Nahida Akter
Shrimai Prabhumoye
John Kamalu
S. Satheesh
Eric Nyberg
M. Patwary
Mohammad Shoeybi
Bryan Catanzaro
LRM
SyDa
ReLM
167
2
0
15 Oct 2024
In-context KV-Cache Eviction for LLMs via Attention-Gate
Zihao Zeng
Bokai Lin
Tianqi Hou
Hao Zhang
Zhijie Deng
125
2
0
15 Oct 2024
MoH: Multi-Head Attention as Mixture-of-Head Attention
Peng Jin
Bo Zhu
Li Yuan
Shuicheng Yan
MoE
105
18
0
15 Oct 2024
Gender Bias in Decision-Making with Large Language Models: A Study of Relationship Conflicts
Sharon Levy
William D. Adler
T. Karver
Mark Dredze
Michelle R. Kaufman
70
2
0
14 Oct 2024
WILT: A Multi-Turn, Memorization-Robust Inductive Logic Benchmark for LLMs
Eryk Banatt
Jonathan Cheng
Skanda Vaidyanath
Tiffany Hwu
LRM
39
3
0
14 Oct 2024
TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models
Mu Cai
Reuben Tan
Jianrui Zhang
Bocheng Zou
Kai Zhang
...
Yao Dou
J. Park
Jianfeng Gao
Yong Jae Lee
Jianwei Yang
103
22
0
14 Oct 2024
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations
Litu Rout
Yujia Chen
Nataniel Ruiz
Constantine Caramanis
Sanjay Shakkottai
Wen-Sheng Chu
DiffM
104
31
0
14 Oct 2024
DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model
Yuqi Wang
Ke Cheng
Jiawei He
Qitai Wang
Hengchen Dai
Yuntao Chen
Fei Xia
Zhaoxiang Zhang
VGen
72
1
0
14 Oct 2024
Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
Tongtian Yue
Longteng Guo
Jie Cheng
Xuange Gao
Qingbin Liu
MoE
67
3
0
14 Oct 2024
Jailbreak Instruction-Tuned LLMs via end-of-sentence MLP Re-weighting
Yifan Luo
Zhennan Zhou
Meitan Wang
Bin Dong
95
1
0
14 Oct 2024
SGLP: A Similarity Guided Fast Layer Partition Pruning for Compressing Large Deep Models
Yuqi Li
Yao Lu
Zhihong Zhu
Chuanguang Yang
Yihao Chen
Jianping Gou
66
6
0
14 Oct 2024
Divide, Reweight, and Conquer: A Logit Arithmetic Approach for In-Context Learning
Chengsong Huang
Langlin Huang
Jiaxin Huang
MoMe
137
2
0
14 Oct 2024
Persistent Topological Features in Large Language Models
Yuri Gardinazzi
Giada Panerai
Karthik Viswanathan
A. Ansuini
Alberto Cazzaniga
Matteo Biagetti
156
2
0
14 Oct 2024
Adapt-
∞
\infty
∞
: Scalable Continual Multimodal Instruction Tuning via Dynamic Data Selection
A. Maharana
Jaehong Yoon
Tianlong Chen
Joey Tianyi Zhou
85
0
0
14 Oct 2024
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts
Guorui Zheng
Xidong Wang
Juhao Liang
Nuo Chen
Yuping Zheng
Benyou Wang
MoE
136
5
0
14 Oct 2024
A Unified Approach to Routing and Cascading for LLMs
Jasper Dekoninck
Maximilian Baader
Martin Vechev
141
2
0
14 Oct 2024
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
Peng Xia
Siwei Han
Shi Qiu
Yiyang Zhou
Zhaoyang Wang
...
Chenhang Cui
Mingyu Ding
Linjie Li
Lijuan Wang
Huaxiu Yao
163
16
0
14 Oct 2024
Taming Overconfidence in LLMs: Reward Calibration in RLHF
Jixuan Leng
Chengsong Huang
Banghua Zhu
Jiaxin Huang
128
16
0
13 Oct 2024
ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains
Yein Park
Chanwoong Yoon
Jungwoo Park
Donghyeon Lee
Minbyul Jeong
Jaewoo Kang
KELM
149
2
0
13 Oct 2024
Reverse Modeling in Large Language Models
S. Yu
Yuanchen Xu
Cunxiao Du
Yanying Zhou
Minghui Qiu
Q. Sun
Hao Zhang
Jiawei Wu
162
2
0
13 Oct 2024
Self-Data Distillation for Recovering Quality in Pruned Large Language Models
Vithursan Thangarasa
Ganesh Venkatesh
Mike Lasby
Nish Sinnadurai
Sean Lie
SyDa
177
2
0
13 Oct 2024
Boosting Deductive Reasoning with Step Signals In RLHF
Jiajun Li
Yipin Zhang
Wei Shen
Yuzi Yan
Jian Xie
Dong Yan
LRM
ReLM
62
1
0
12 Oct 2024
Rethinking Data Selection at Scale: Random Selection is Almost All You Need
Tingyu Xia
Bowen Yu
K. Dang
An Yang
Yuan Wu
Yuan Tian
Yi-Ju Chang
Junyang Lin
ALM
73
6
0
12 Oct 2024
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
HyoJung Han
Akiko Eriguchi
Haoran Xu
Hieu T. Hoang
Marine Carpuat
Huda Khayrallah
VLM
91
3
0
12 Oct 2024
Enterprise Benchmarks for Large Language Model Evaluation
Bing Zhang
Mikio Takeuchi
Ryo Kawahara
Shubhi Asthana
Md. Maruf Hossain
Guang-Jie Ren
Kate Soule
Yada Zhu
ELM
82
3
0
11 Oct 2024
NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models
Zheng Yi Ho
Siyuan Liang
Sen Zhang
Yibing Zhan
Dacheng Tao
69
2
0
11 Oct 2024
Developing a Pragmatic Benchmark for Assessing Korean Legal Language Understanding in Large Language Models
Yeeun Kim
Young Rok Choi
Eunkyung Choi
Jinhwan Choi
H. Park
Wonseok Hwang
ELM
AILaw
75
1
0
11 Oct 2024
QEFT: Quantization for Efficient Fine-Tuning of LLMs
Changhun Lee
Jun-gyu Jin
Jun-gyu Jin
Eunhyeok Park
MQ
82
2
0
11 Oct 2024
JurEE not Judges: safeguarding llm interactions with small, specialised Encoder Ensembles
Dom Nasrabadi
87
1
0
11 Oct 2024
Do Unlearning Methods Remove Information from Language Model Weights?
Aghyad Deeb
Fabien Roger
AAML
MU
113
29
0
11 Oct 2024
Language Imbalance Driven Rewarding for Multilingual Self-improving
Wen Yang
Junhong Wu
Chen Wang
Chengqing Zong
J.N. Zhang
ALM
LRM
215
7
0
11 Oct 2024
Previous
1
2
3
...
26
27
28
...
67
68
69
Next