2009.03300
Measuring Massive Multitask Language Understanding
7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM
RALM
Papers citing
"Measuring Massive Multitask Language Understanding"
50 / 3,408 papers shown
Title
HumorReject: Decoupling LLM Safety from Refusal Prefix via A Little Humor
Zihui Wu
Haichang Gao
Jiacheng Luo
Zhaoxiang Liu
155
0
0
23 Jan 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
395
2,031
0
22 Jan 2025
NExtLong: Toward Effective Long-Context Training without Long Documents
Chaochen Gao
Xing Wu
Zijia Lin
Debing Zhang
Songlin Hu
SyDa
189
2
0
22 Jan 2025
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Kimi Team
Angang Du
Bofei Gao
Bowei Xing
Changjiu Jiang
...
Zihao Huang
Ziyao Xu
Zhiyong Yang
Zonghan Yang
Zongyu Lin
OffRL
ALM
AI4TS
VLM
LRM
351
338
0
22 Jan 2025
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
Yilun Zhao
Lujing Xie
Haowei Zhang
Guo Gan
Yitao Long
...
Xiangru Tang
Zhenwen Liang
Yongxu Liu
Chen Zhao
Arman Cohan
139
19
0
21 Jan 2025
You Can't Eat Your Cake and Have It Too: The Performance Degradation of LLMs with Jailbreak Defense
Wuyuao Mai
Geng Hong
Pei Chen
Xudong Pan
Baojun Liu
Y. Zhang
Haixin Duan
Min Yang
AAML
133
1
0
21 Jan 2025
Improving Influence-based Instruction Tuning Data Selection for Balanced Learning of Diverse Capabilities
Qirun Dai
Dylan Zhang
Jiaqi W. Ma
Hao Peng
TDI
105
1
0
21 Jan 2025
From Drafts to Answers: Unlocking LLM Potential via Aggregation Fine-Tuning
Yafu Li
Zhilin Wang
Tingchen Fu
Ganqu Cui
Sen Yang
Yu Cheng
93
4
0
21 Jan 2025
Is your LLM trapped in a Mental Set? Investigative study on how mental sets affect the reasoning capabilities of LLMs
Saiful Haq
Niyati Chhaya
Piyush Pandey
Pushpak Bhattacharyya
ELM
LRM
36
0
0
21 Jan 2025
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models
Zihan Qiu
Zeyu Huang
Jian Xu
Kaiyue Wen
Zhaoxiang Wang
Rui Men
Ivan Titov
Dayiheng Liu
Jingren Zhou
Junyang Lin
MoE
139
7
0
21 Jan 2025
A Collection of Question Answering Datasets for Norwegian
Vladislav Mikhailov
Petter Mæhlum
Victoria Ovedie Chruickshank Langø
Erik Velldal
Lilja Øvrelid
RALM
100
4
0
19 Jan 2025
Knowledge Retrieval Based on Generative AI
Te-Lun Yang
Jyi-Shane Liu
Yuen-Hsien Tseng
Jyh-Shing Roger Jang
3DV
RALM
229
2
0
17 Jan 2025
Aligning Instruction Tuning with Pre-training
Yiming Liang
Tianyu Zheng
Xinrun Du
Ge Zhang
Qingbin Liu
...
Zhaoxiang Zhang
Wenhao Huang
Jiajun Zhang
Xiang Yue
185
4
0
16 Jan 2025
Optimization Strategies for Enhancing Resource Efficiency in Transformers & Large Language Models
Tom Wallace
Naser Ezzati-Jivan
Beatrice Ombuki-Berman
MQ
71
1
0
16 Jan 2025
HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location
Ting Sun
Penghan Wang
Fan Lai
551
2
0
15 Jan 2025
Enhancing Retrieval-Augmented Generation: A Study of Best Practices
Siran Li
Linus Stenzel
Carsten Eickhoff
Seyed Ali Bahrainian
RALM
3DV
114
8
0
13 Jan 2025
FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings
Tong Liu
Xiao Yu
Wenxuan Zhou
Jindong Gu
Volker Tresp
82
1
0
11 Jan 2025
Tensor Product Attention Is All You Need
Yifan Zhang
Yifeng Liu
Huizhuo Yuan
Zhen Qin
Yang Yuan
Q. Gu
Andrew Chi-Chih Yao
234
15
0
11 Jan 2025
PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms
Yilong Li
Jingyu Liu
Hao Zhang
M Badri Narayanan
Utkarsh Sharma
Shuai Zhang
Pan Hu
Yijing Zeng
Jayaram Raghuram
Suman Banerjee
MQ
142
4
0
10 Jan 2025
Boosting of Thoughts: Trial-and-Error Problem Solving with Large Language Models
Sijia Chen
Baochun Li
Di Niu
LLMAG
LRM
AI4CE
128
14
0
08 Jan 2025
Scaling Large Language Model Training on Frontier with Low-Bandwidth Partitioning
Lang Xu
Quentin G. Anthony
Jacob Hatef
Hari Subramoni
Dhabaleswar K. Panda
103
0
0
08 Jan 2025
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
Yifei He
Yuzheng Hu
Yong Lin
Tong Zhang
Han Zhao
FedML
MoMe
129
25
0
08 Jan 2025
HuRef: HUman-REadable Fingerprint for Large Language Models
Boyi Zeng
Cheng Zhou
Yuncong Hu
Yi Xu
Chenghu Zhou
Xiang Wang
Yu Yu
Zhouhan Lin
141
12
0
08 Jan 2025
Clinical Insights: A Comprehensive Review of Language Models in Medicine
Nikita Neveditsin
Pawan Lingras
V. Mago
LM&MA
127
5
0
08 Jan 2025
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models
Mingyang Song
Zhaochen Su
Xiaoye Qu
Jiawei Zhou
Yu Cheng
LRM
166
40
0
06 Jan 2025
Understand, Solve and Translate: Bridging the Multilingual Mathematical Reasoning Gap
Hyunwoo Ko
Guijin Son
Dasol Choi
RALM
LRM
154
12
0
05 Jan 2025
Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense
Yang Ouyang
Hengrui Gu
Shuhang Lin
Wenyue Hua
Jie Peng
B. Kailkhura
Tianlong Chen
Kaixiong Zhou
AAML
117
3
0
05 Jan 2025
Towards Omni-RAG: Comprehensive Retrieval-Augmented Generation for Large Language Models in Medical Applications
Zhe Chen
Yusheng Liao
Shuyang Jiang
Pingjie Wang
Yu Guo
Yucheng Wang
Yu Wang
91
3
0
05 Jan 2025
Optimizing Small Language Models for In-Vehicle Function-Calling
Yahya Sowti Khiabani
Farris Atif
Chieh Hsu
Sven Stahlmann
Tobias Michels
Sebastian Kramer
Benedikt Heidrich
M. Saquib Sarfraz
Julian Merten
Faezeh Tafazzoli
82
1
0
04 Jan 2025
LLMzSzŁ: a comprehensive LLM benchmark for Polish
Krzysztof Jassem
Michał Ciesiółka
Filip Graliński
Piotr Jabłoński
Jakub Pokrywka
Marek Kubis
Monika Jabłońska
Ryszard Staruch
132
2
0
04 Jan 2025
Mathematical Language Models: A Survey
Wen Liu
Hanglei Hu
Jie Zhou
Yuyang Ding
Junsong Li
...
Mengliang He
Qin Chen
Bo Jiang
Aimin Zhou
Liang He
LRM
235
14
0
03 Jan 2025
PRD: Peer Rank and Discussion Improve Large Language Model based Evaluations
Ruosen Li
Teerth Patel
Xinya Du
LLMAG
ALM
185
102
0
03 Jan 2025
An Empirical Evaluation of Large Language Models on Consumer Health Questions
Moaiz Abrar
Y. Sermet
Ibrahim Demir
AI4MH
LM&MA
ELM
83
4
0
03 Jan 2025
Navigating Nuance: In Quest for Political Truth
Soumyadeep Sar
Dwaipayan Roy
66
0
0
03 Jan 2025
Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding
Jiajun Zhu
Peihao Wang
Ruisi Cai
Jason D. Lee
Pan Li
Ziyi Wang
KELM
112
1
0
03 Jan 2025
Setting Standards in Turkish NLP: TR-MMLU for Large Language Model Evaluation
M. Ali Bayram
Ali Arda Fincan
Ahmet Semih Gümüş
Banu Diri
Savaş Yıldırım
Öner Aytaş
ELM
56
1
0
31 Dec 2024
Towards Effective Discrimination Testing for Generative AI
Thomas P. Zollo
Nikita Rajaneesh
Richard Zemel
Talia B. Gillis
Emily Black
166
2
0
31 Dec 2024
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models
Mahir Labib Dihan
Md Tanvir Hassan
Md Tanvir Parvez
Md Hasebul Hasan
Md Almash Alam
Muhammad Aamir Cheema
Mohammed Eunus Ali
Md. Rizwan Parvez
ELM
LRM
120
3
0
31 Dec 2024
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs
LLM-jp
Akiko Aizawa
Eiji Aramaki
Bowen Chen
Fei Cheng
...
Yuya Yamamoto
Yusuke Yamauchi
Hitomi Yanaka
Rio Yokota
Koichiro Yoshino
111
17
0
31 Dec 2024
Enhancing AI Safety Through the Fusion of Low Rank Adapters
Satya Swaroop Gudipudi
Sreeram Vipparla
Harpreet Singh
Shashwat Goel
Ponnurangam Kumaraguru
MoMe
AAML
86
3
0
30 Dec 2024
ComparisonQA: Evaluating Factuality Robustness of LLMs Through Knowledge Frequency Control and Uncertainty
Qing Zong
Zhaoxiang Wang
Tianshi Zheng
Xiyu Ren
Yangqiu Song
155
3
0
28 Dec 2024
Toward Adaptive Reasoning in Large Language Models with Thought Rollback
Sijia Chen
Baochun Li
KELM
LRM
128
8
0
27 Dec 2024
SlimGPT: Layer-wise Structured Pruning for Large Language Models
Gui Ling
Ziyang Wang
Yuliang Yan
Qingwen Liu
99
10
0
24 Dec 2024
A Statistical Framework for Ranking LLM-Based Chatbots
Siavash Ameli
Siyuan Zhuang
Ion Stoica
Michael W. Mahoney
ELM
89
3
0
24 Dec 2024
Boosting LLM via Learning from Data Iteratively and Selectively
Qi Jia
Siyu Ren
Ziheng Qin
Fuzhao Xue
Jinjie Ni
Yang You
52
0
0
23 Dec 2024
SilVar: Speech Driven Multimodal Model for Reasoning Visual Question Answering and Object Localization
Tan-Hanh Pham
Hoang-Nam Le
Phu-Vinh Nguyen
Chris Ngo
Truong-Son Hy
AuLLM
LRM
144
1
0
21 Dec 2024
OpenAI o1 System Card
OpenAI
Aaron Jaech
Adam Tauman Kalai
Adam Lerer
...
Yuchen He
Yuchen Zhang
Yunyun Wang
Zheng Shao
Zhuohan Li
ELM
LRM
AI4CE
132
1
0
21 Dec 2024
Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying
Federico Castagna
I. Sassoon
Simon Parsons
LRM
132
0
0
19 Dec 2024
Maximize Your Data's Potential: Enhancing LLM Accuracy with Two-Phase Pretraining
Steven Feng
Shrimai Prabhumoye
John Kamalu
Jane Polak Scowcroft
M. Patwary
Mohammad Shoeybi
Bryan Catanzaro
125
5
0
18 Dec 2024
Pipeline Analysis for Developing Instruct LLMs in Low-Resource Languages: A Case Study on Basque
Ander Corral
Ixak Sarasua
Xabier Saralegi
79
2
0
18 Dec 2024