Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2009.03300
Cited By
v1
v2
v3 (latest)
Measuring Massive Multitask Language Understanding
7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM
RALM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Measuring Massive Multitask Language Understanding"
50 / 3,408 papers shown
Title
EFSA: Towards Event-Level Financial Sentiment Analysis
Tianyu Chen
Yiming Zhang
Guoxin Yu
Dapeng Zhang
Li Zeng
Qing He
Xiang Ao
84
4
0
08 Apr 2024
SEER-MoE: Sparse Expert Efficiency through Regularization for Mixture-of-Experts
Alexandre Muzio
Alex Sun
Churan He
MoE
65
12
0
07 Apr 2024
Multicalibration for Confidence Scoring in LLMs
Gianluca Detommaso
Martín Bertrán
Riccardo Fogliato
Aaron Roth
116
19
0
06 Apr 2024
PhyloLM : Inferring the Phylogeny of Large Language Models and Predicting their Performances in Benchmarks
Nicolas Yax
Pierre-Yves Oudeyer
Stefano Palminteri
137
6
0
06 Apr 2024
Parameter Efficient Quasi-Orthogonal Fine-Tuning via Givens Rotation
Xinyu Ma
Xu Chu
Zhibang Yang
Yang Lin
Xin Gao
Junfeng Zhao
96
10
0
05 Apr 2024
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
Xinrun Du
Zhouliang Yu
Songyang Gao
Ding Pan
Yuyang Cheng
...
Tianyu Zheng
Xinchen Luo
Guorui Zhou
Wenhu Chen
Ge Zhang
130
20
0
05 Apr 2024
Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer
Hele-Andra Kuulmets
Taido Purason
Agnes Luhtaru
Mark Fishel
81
19
0
05 Apr 2024
Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data
Jingyu Zhang
Marc Marone
Tianjian Li
Benjamin Van Durme
Daniel Khashabi
195
9
0
05 Apr 2024
CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues
Makesh Narsimhan Sreedhar
Traian Rebedea
Shaona Ghosh
Jiaqi Zeng
Christopher Parisien
ALM
103
6
0
04 Apr 2024
Evaluating LLMs at Detecting Errors in LLM Responses
Ryo Kamoi
Sarkar Snigdha Sarathi Das
Renze Lou
Jihyun Janice Ahn
Yilun Zhao
...
Salika Dave
Shaobo Qin
Arman Cohan
Wenpeng Yin
Rui Zhang
86
25
0
04 Apr 2024
Investigating Regularization of Self-Play Language Models
Réda Alami
Abdalgader Abubaker
Mastane Achab
M. Seddik
Salem Lahlou
75
3
0
04 Apr 2024
MIMIR: A Streamlined Platform for Personalized Agent Tuning in Domain Expertise
Chunyuan Deng
Xiangru Tang
Yilun Zhao
Hanming Wang
Haoran Wang
Wangchunshu Zhou
Arman Cohan
Mark B. Gerstein
LLMAG
MLLM
43
2
0
03 Apr 2024
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline
Yifan Xu
Xiao Liu
Xinghan Liu
Zhenyu Hou
Yueyan Li
...
Aohan Zeng
Zhengxiao Du
Wenyi Zhao
Jie Tang
Yuxiao Dong
LRM
103
42
0
03 Apr 2024
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
Qi Luo
Hengxu Yu
Xiao Li
92
6
0
03 Apr 2024
Calibrating the Confidence of Large Language Models by Eliciting Fidelity
Mozhi Zhang
Mianqiu Huang
Rundong Shi
Linsen Guo
Chong Peng
Peng Yan
Yaqian Zhou
Xipeng Qiu
86
13
0
03 Apr 2024
Measuring Social Norms of Large Language Models
Ye Yuan
Kexin Tang
Jianhao Shen
Ming Zhang
Chenguang Wang
ELM
60
8
0
03 Apr 2024
MOPAR: A Model Partitioning Framework for Deep Learning Inference Services on Serverless Platforms
Jiaang Duan
Shiyou Qian
Dingyu Yang
Hanwen Hu
Jian Cao
Guangtao Xue
MoE
65
2
0
03 Apr 2024
HyperCLOVA X Technical Report
Kang Min Yoo
Jaegeun Han
Sookyo In
Heewon Jeon
Jisu Jeong
...
Hyunkyung Noh
Se-Eun Choi
Sang-Woo Lee
Jung Hwa Lim
Nako Sung
VLM
88
9
0
02 Apr 2024
PATCH! {P}sychometrics-{A}ssis{T}ed Ben{CH}marking of Large Language Models against Human Populations: A Case Study of Proficiency in 8th Grade Mathematics
Qixiang Fang
Daniel L. Oberski
Dong Nguyen
104
3
0
02 Apr 2024
Poro 34B and the Blessing of Multilinguality
Risto Luukkonen
Jonathan Burdge
Elaine Zosa
Aarne Talman
Ville Komulainen
Vaino Hatanpaa
Peter Sarlin
S. Pyysalo
AI4CE
102
14
0
02 Apr 2024
Exploring the Mystery of Influential Data for Mathematical Reasoning
Xinzhe Ni
Yeyun Gong
Zhibin Gou
Yelong Shen
Yujiu Yang
Nan Duan
Weizhu Chen
91
10
0
01 Apr 2024
Evalverse: Unified and Accessible Library for Large Language Model Evaluation
Jihoo Kim
Wonho Song
Dahyun Kim
Yunsu Kim
Yungi Kim
Chanjun Park
ELM
103
5
0
01 Apr 2024
Evaluating the Factuality of Large Language Models using Large-Scale Knowledge Graphs
Xiaoze Liu
Feijie Wu
Tianyang Xu
Zhuo Chen
Yichi Zhang
Xiaoqian Wang
Jing Gao
HILM
100
10
0
01 Apr 2024
LLaMA-Excitor: General Instruction Tuning via Indirect Feature Interaction
Bo Zou
Chao Yang
Yu Qiao
Chengbin Quan
Youjian Zhao
105
6
0
01 Apr 2024
Learning to Plan for Language Modeling from Unlabeled Data
Nathan Cornille
Marie-Francine Moens
Florian Mai
69
10
0
31 Mar 2024
Extensive Self-Contrast Enables Feedback-Free Language Model Alignment
Xiao Liu
Xixuan Song
Yuxiao Dong
Jie Tang
SyDa
64
5
0
31 Mar 2024
Algorithmic Collusion by Large Language Models
Sara Fish
Yannai A. Gonczarowski
Ran I. Shorrer
140
13
0
31 Mar 2024
NumeroLogic: Number Encoding for Enhanced LLMs' Numerical Reasoning
Eli Schwartz
Leshem Choshen
J. Shtok
Sivan Doveh
Leonid Karlinsky
Assaf Arbelle
106
16
0
30 Mar 2024
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
Taishi Nakamura
Mayank Mishra
Simone Tedeschi
Yekun Chai
Jason T Stillerman
...
Virendra Mehta
Matthew Blumberg
Victor May
Huu Nguyen
S. Pyysalo
LRM
93
8
0
30 Mar 2024
Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks
Hyunjae Kim
Hyeon Hwang
Jiwoo Lee
Sihyeon Park
Dain Kim
Taewhoo Lee
Chanwoong Yoon
Jiwoong Sohn
Donghee Choi
Jaewoo Kang
ELM
AI4MH
LRM
127
22
0
30 Mar 2024
ReALM: Reference Resolution As Language Modeling
Joel Ruben Antony Moniz
Soundarya Krishnan
Melis Ozyildirim
Prathamesh Saraf
Halim Cagri Ates
Yuan-kang Zhang
Hong-ye Yu
Nidhi Rajshree
82
7
0
29 Mar 2024
Latxa: An Open Language Model and Evaluation Suite for Basque
Julen Etxaniz
Oscar Sainz
Naiara Pérez
Itziar Aldabe
German Rigau
Eneko Agirre
Aitor Ormazabal
Mikel Artetxe
A. Soroa
ELM
69
32
0
29 Mar 2024
Measuring Taiwanese Mandarin Language Understanding
Po-Heng Chen
Sijia Cheng
Wei-Lin Chen
Yen-Ting Lin
Yun-Nung Chen
ELM
119
2
0
29 Mar 2024
Constructing Multilingual Visual-Text Datasets Revealing Visual Multilingual Ability of Vision Language Models
Jesse Atuhurra
Iqra Ali
Tatsuya Hiraoka
Hidetaka Kamigaito
Tomoya Iwakura
Taro Watanabe
108
1
0
29 Mar 2024
MANGO: A Benchmark for Evaluating Mapping and Navigation Abilities of Large Language Models
Peng Ding
Jiading Fang
Peng Li
Kangrui Wang
Xiaochen Zhou
Mo Yu
Jing Li
Matthew R. Walter
Hongyuan Mei
RALM
ELM
97
6
0
29 Mar 2024
Jamba: A Hybrid Transformer-Mamba Language Model
Opher Lieber
Barak Lenz
Hofit Bata
Gal Cohen
Jhonathan Osin
...
Nir Ratner
N. Rozen
Erez Shwartz
Mor Zusman
Y. Shoham
124
229
0
28 Mar 2024
Croissant: A Metadata Format for ML-Ready Datasets
Mubashara Akhtar
Omar Benjelloun
Costanza Conforti
Pieter Gijsbers
Joan Giner-Miguelez
...
Slava Tykhonov
Joaquin Vanschoren
Jos van der Velde
Steffen Vogler
Carole-Jean Wu
82
40
0
28 Mar 2024
A Review of Multi-Modal Large Language and Vision Models
Kilian Carolan
Laura Fennelly
Alan F. Smeaton
VLM
186
28
0
28 Mar 2024
Fine-Tuning Language Models with Reward Learning on Policy
Hao Lang
Fei Huang
Yongbin Li
ALM
67
7
0
28 Mar 2024
sDPO: Don't Use Your Data All at Once
Dahyun Kim
Yungi Kim
Wonho Song
Hyeonwoo Kim
Yunsu Kim
Sanghoon Kim
Chanjun Park
78
35
0
28 Mar 2024
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models
Patrick Chao
Edoardo Debenedetti
Alexander Robey
Maksym Andriushchenko
Francesco Croce
...
Nicolas Flammarion
George J. Pappas
F. Tramèr
Hamed Hassani
Eric Wong
ALM
ELM
AAML
135
143
0
28 Mar 2024
Learning From Correctness Without Prompting Makes LLM Efficient Reasoner
Yuxuan Yao
Han Wu
Zhijiang Guo
Biyan Zhou
Jiahui Gao
Sichun Luo
Hanxu Hou
Xiaojin Fu
Linqi Song
LLMAG
LRM
130
10
0
28 Mar 2024
Checkpoint Merging via Bayesian Optimization in LLM Pretraining
Deyuan Liu
Zecheng Wang
Bingning Wang
Weipeng Chen
Chunshan Li
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
MoMe
99
18
0
28 Mar 2024
Non-Linear Inference Time Intervention: Improving LLM Truthfulness
Jakub Hoscilowicz
Adam Wiacek
Jan Chojnacki
Adam Cieślak
Leszek Michon
Vitalii Urbanevych
Artur Janicki
KELM
73
4
0
27 Mar 2024
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
Elliot Bolton
Abhinav Venigalla
Michihiro Yasunaga
David Leo Wright Hall
Betty Xiong
...
R. Daneshjou
Jonathan Frankle
Percy Liang
Michael Carbin
Christopher D. Manning
LM&MA
MedIm
101
64
0
27 Mar 2024
Few-Shot Recalibration of Language Models
Xiang Lisa Li
Urvashi Khandelwal
Kelvin Guu
96
5
0
27 Mar 2024
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
Boyao Wang
Xiang Liu
Shizhe Diao
Renjie Pi
Jipeng Zhang
Chi Han
Tong Zhang
106
55
0
26 Mar 2024
Naive Bayes-based Context Extension for Large Language Models
Jianlin Su
Murtadha Ahmed
Wenbo Luo
Abhishek Rao
Denny Zhou
Hyeontaek Lim
76
6
0
26 Mar 2024
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov
Kushal Tirumala
Hassan Shapourian
Paolo Glorioso
Daniel A. Roberts
161
106
0
26 Mar 2024
An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing
Ziwei Chai
Guoyin Wang
Jing Su
Tianjie Zhang
Xuanwen Huang
...
Jingjing Xu
Jianbo Yuan
Hongxia Yang
Leilei Gan
Yang Yang
98
7
0
25 Mar 2024
Previous
1
2
3
...
46
47
48
...
67
68
69
Next