2009.03300
Measuring Massive Multitask Language Understanding
7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM
RALM
Papers citing
"Measuring Massive Multitask Language Understanding"
50 / 3,408 papers shown
Title
HLB: Benchmarking LLMs' Humanlikeness in Language Use
Xufeng Duan
Bei Xiao
Xuemei Tang
Zhenguang G. Cai
71
4
0
24 Sep 2024
Small Language Models: Survey, Measurements, and Insights
Zhenyan Lu
Xiang Li
Dongqi Cai
Rongjie Yi
Fangming Liu
Xiwen Zhang
Nicholas D. Lane
Mengwei Xu
ObjD
LRM
173
58
0
24 Sep 2024
Eagle: Efficient Training-Free Router for Multi-LLM Inference
Zesen Zhao
Shuowei Jin
Z. Morley Mao
69
5
0
23 Sep 2024
GenAI Advertising: Risks of Personalizing Ads with LLMs
Brian Tang
Kaiwen Sun
Noah T. Curran
F. Schaub
Kang G. Shin
SILM
65
2
0
23 Sep 2024
Inference-Friendly Models With MixAttention
Shashank Rajput
Ying Sheng
Sean Owen
Vitaliy Chiley
165
2
0
23 Sep 2024
Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs
Clément Christophe
Tathagata Raha
Svetlana Maslenkova
Muhammad Umar Salman
Praveen K Kanithi
Marco AF Pimentel
Shadab Khan
LM&MA
68
2
0
23 Sep 2024
Pareto-Optimized Open-Source LLMs for Healthcare via Context Retrieval
Jordi Bayarri-Planas
Ashwin Kumar Gururajan
Dario Garcia-Gasulla
77
1
0
23 Sep 2024
PROMPTFUZZ: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs
Jiahao Yu
Yangguang Shao
Hanwen Miao
Junzheng Shi
SILM
AAML
169
11
0
23 Sep 2024
Investigating Layer Importance in Large Language Models
Yang Zhang
Yanfei Dong
Kenji Kawaguchi
FAtt
98
10
0
22 Sep 2024
The Ability of Large Language Models to Evaluate Constraint-satisfaction in Agent Responses to Open-ended Requests
Lior Madmoni
Amir Zait
Ilia Labzovsky
Danny Karmon
ELM
61
0
0
22 Sep 2024
GroupDebate: Enhancing the Efficiency of Multi-Agent Debate Using Group Discussion
Tongxuan Liu
Xingyu Wang
Weizhe Huang
Wenjiang Xu
Yuting Zeng
Lei Jiang
Hailong Yang
Jing Li
LLMAG
79
13
0
21 Sep 2024
Can LLMs replace Neil deGrasse Tyson? Evaluating the Reliability of LLMs as Science Communicators
Prasoon Bajpai
Niladri Chatterjee
Subhabrata Dutta
Tanmoy Chakraborty
ELM
99
2
0
21 Sep 2024
ChemEval: A Comprehensive Multi-Level Chemical Evaluation for Large Language Models
Yuqing Huang
Rongyang Zhang
Xiaoxiao He
Xuyang Zhi
Hao Wang
...
Guoping Hu
Guiquan Liu
Qi Liu
Defu Lian
Enhong Chen
ELM
90
8
0
21 Sep 2024
Uncovering Latent Chain of Thought Vectors in Language Models
Jason Zhang
Scott Viteri
LLMSV
LRM
145
3
0
21 Sep 2024
Co-occurrence is not Factual Association in Language Models
Xiao Zhang
Miao Li
Ji Wu
KELM
172
4
0
21 Sep 2024
Beyond Accuracy Optimization: Computer Vision Losses for Large Language Model Fine-Tuning
Daniele Rege Cambrin
Giuseppe Gallipoli
Irene Benedetto
Luca Cagliero
Paolo Garza
55
0
0
20 Sep 2024
JMedBench: A Benchmark for Evaluating Japanese Biomedical Large Language Models
Junfeng Jiang
Jiahao Huang
Akiko Aizawa
LM&MA
77
4
0
20 Sep 2024
CFSP: An Efficient Structured Pruning Framework for LLMs with Coarse-to-Fine Activation Information
Yuxin Wang
Minghua Ma
Zekun Wang
Jingchang Chen
Huiming Fan
Liping Shan
Qing Yang
Dongliang Xu
Ming Liu
Bing Qin
79
4
0
20 Sep 2024
OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
Stephen Zhang
Vardan Papyan
VLM
166
3
0
20 Sep 2024
Cross-Domain Content Generation with Domain-Specific Small Language Models
Ankit Maloo
Abhinav Garg
CLL
47
0
0
19 Sep 2024
Guided Profile Generation Improves Personalization with LLMs
Jiarui Zhang
80
7
0
19 Sep 2024
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning
Xiaotian Han
Yiren Jian
Xuefeng Hu
Haogeng Liu
Yiqi Wang
...
Yuang Ai
Huaibo Huang
Ran He
Zhenheng Yang
Quanzeng You
LRM
AI4CE
59
22
0
19 Sep 2024
Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data
Jiaming Zhou
Abbas Ghaddar
Ge Zhang
Liheng Ma
Yaochen Hu
Soumyasundar Pal
Mark Coates
Bin Wang
Yingxue Zhang
Jianye Hao
ReLM
LRM
100
4
0
19 Sep 2024
Strategic Collusion of LLM Agents: Market Division in Multi-Commodity Competitions
Ryan Y. Lin
Siddhartha Ojha
Kevin Cai
Maxwell F. Chen
100
4
0
19 Sep 2024
Edu-Values: Towards Evaluating the Chinese Education Values of Large Language Models
Peiyi Zhang
Yazhou Zhang
Bo Wang
Lu Rong
Jing Qin
AI4Ed
ELM
145
2
0
19 Sep 2024
Qwen2.5-Coder Technical Report
Binyuan Hui
Jian Yang
Zeyu Cui
Jiaxi Yang
Dayiheng Liu
...
Fei Huang
Xingzhang Ren
Xuancheng Ren
Jingren Zhou
Junyang Lin
OSLM
121
337
0
18 Sep 2024
MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning
Justin Chih-Yao Chen
Archiki Prasad
Swarnadeep Saha
Elias Stengel-Eskin
Joey Tianyi Zhou
LRM
95
13
0
18 Sep 2024
Linguini: A benchmark for language-agnostic linguistic reasoning
Eduardo Sánchez
Belen Alastruey
C. Ropers
Pontus Stenetorp
Mikel Artetxe
Marta R. Costa-jussá
ReLM
ELM
LRM
96
8
0
18 Sep 2024
Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement
An Yang
Beichen Zhang
Binyuan Hui
Bofei Gao
Bowen Yu
...
Mingfeng Xue
Runji Lin
Tianyu Liu
Xingzhang Ren
Zhenru Zhang
OSLM
LRM
162
321
0
18 Sep 2024
Development and bilingual evaluation of Japanese medical large language model within reasonably low computational resources
Issey Sukeda
ELM
92
2
0
18 Sep 2024
Multitask Mayhem: Unveiling and Mitigating Safety Gaps in LLMs Fine-tuning
Essa Jan
Nouar Aldahoul
Moiz Ali
Faizan Ahmad
Fareed Zaffar
Yasir Zaki
57
3
0
18 Sep 2024
Enabling Real-Time Conversations with Minimal Training Costs
Wang Xu
Shuo Wang
Weilin Zhao
Xu Han
Yukun Yan
Yudi Zhang
Zhe Tao
Zhiyuan Liu
Wanxiang Che
64
5
0
18 Sep 2024
Reward-Robust RLHF in LLMs
Yuzi Yan
Xingzhou Lou
Jialian Li
Yiping Zhang
Jian Xie
Chao Yu
Yu Wang
Dong Yan
Yuan Shen
101
13
0
18 Sep 2024
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Zayne Sprague
Fangcong Yin
Juan Diego Rodriguez
Dongwei Jiang
Manya Wadhwa
Prasann Singhal
Xinyu Zhao
Xi Ye
Kyle Mahowald
Greg Durrett
ReLM
LRM
247
132
0
18 Sep 2024
Egalitarian Language Representation in Language Models: It All Begins with Tokenizers
Menan Velayuthan
Kengatharaiyer Sarveswaran
106
7
0
17 Sep 2024
NVLM: Open Frontier-Class Multimodal LLMs
Wenliang Dai
Nayeon Lee
Wei Ping
Zhuoling Yang
Zihan Liu
Jon Barker
Tuomas Rintamaki
Mohammad Shoeybi
Bryan Catanzaro
Ming-Yu Liu
MLLM
VLM
LRM
123
73
0
17 Sep 2024
Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement
Simon Yu
Liangyu Chen
Sara Ahmadian
Marzieh Fadaee
80
7
0
17 Sep 2024
LLM-as-a-Judge & Reward Model: What They Can and Cannot Do
Guijin Son
Hyunwoo Ko
Hoyoung Lee
Yewon Kim
Seunghyeok Hong
ALM
ELM
100
11
0
17 Sep 2024
CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios
Luning Wang
Shiyao Li
Xuefei Ning
Zhihang Yuan
Shengen Yan
Guohao Dai
Yu Wang
76
0
0
16 Sep 2024
The 20 questions game to distinguish large language models
Gurvan Richardeau
Erwan Le Merrer
C. Penzo
Gilles Tredan
108
1
0
16 Sep 2024
MindGuard: Towards Accessible and Stigma-free Mental Health First Aid via Edge LLM
Sijie Ji
Xinzhe Zheng
Jiawei Sun
Renqi Chen
Wei Gao
Mani Srivastava
AI4MH
62
4
0
16 Sep 2024
Towards Data Contamination Detection for Modern Large Language Models: Limitations, Inconsistencies, and Oracle Challenges
Vinay Samuel
Yue Zhou
Henry Peng Zou
AAML
66
8
0
16 Sep 2024
SFR-RAG: Towards Contextually Faithful LLMs
Xuan-Phi Nguyen
Shrey Pandit
Senthil Purushwalkam
Austin Xu
Hailin Chen
Yifei Ming
Zixuan Ke
Silvio Savarese
Caiming Xiong
Shafiq Joty
RALM
132
10
0
16 Sep 2024
Flash STU: Fast Spectral Transform Units
Y. Isabel Liu
Windsor Nguyen
Yagiz Devre
Evan Dogariu
Anirudha Majumdar
Elad Hazan
AI4TS
160
1
0
16 Sep 2024
Eir: Thai Medical Large Language Models
Yutthakorn Thiprak
Rungtam Ngodngamthaweesuk
Songtam Ngodngamtaweesuk
LM&MA
ELM
145
0
0
13 Sep 2024
Cracking the Code: Multi-domain LLM Evaluation on Real-World Professional Exams in Indonesia
Fajri Koto
ELM
164
3
0
13 Sep 2024
Synthetic continued pretraining
Zitong Yang
Neil Band
Shuangping Li
Emmanuel Candès
Tatsunori Hashimoto
CLL
SyDa
100
16
0
11 Sep 2024
Gated Slot Attention for Efficient Linear-Time Sequence Modeling
Yu Zhang
Aaron Courville
Ruijie Zhu
Yue Zhang
Leyang Cui
...
Freda Shi
Bailin Wang
Wei Bi
P. Zhou
Guohong Fu
117
24
0
11 Sep 2024
A Practice of Post-Training on Llama-3 70B with Optimal Selection of Additional Language Mixture Ratio
Ningyuan Xi
Yetao Wu
Kun Fan
Teng Chen
Qingqing Gu
...
Jinxian Qu
Chenxi Liu
Zhonglin Jiang
Yong Chen
Luo Ji
ALM
55
0
0
10 Sep 2024
Questioning Internal Knowledge Structure of Large Language Models Through the Lens of the Olympic Games
Juhwan Choi
Youngbin Kim
91
1
0
10 Sep 2024