Measuring Massive Multitask Language Understanding

7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM RALM

Papers citing "Measuring Massive Multitask Language Understanding"

50 / 3,408 papers shown
Memorization vs. Reasoning: Updating LLMs with New Knowledge
Aochong Oliver Li
Tanya Goyal
KELM
123
3
0
16 Apr 2025
Nemotron-CrossThink: Scaling Self-Learning beyond Math Reasoning
Syeda Nahida Akter
Shrimai Prabhumoye
Matvei Novikov
Seungju Han
Ying Lin
...
Eric Nyberg
Yejin Choi
M. Patwary
Mohammad Shoeybi
Bryan Catanzaro
ReLM OffRL LRM
466
4
1
15 Apr 2025
TextArena
Leon Guertler
Bobby Cheng
Simon Yu
Bo Liu
Leshem Choshen
Cheston Tan
LLMAG
116
2
0
15 Apr 2025
PuzzleBench: A Fully Dynamic Evaluation Framework for Large Multimodal Models on Puzzle Solving
Zeyu Zhang
Zhongfu Chen
Zicheng Zhang
Yuze Sun
Yuan Tian
Ziheng Jia
Chunyi Li
Xiaohong Liu
Xiongkuo Min
Guangtao Zhai
MLLM
67
1
0
15 Apr 2025
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
Tianyi Zhang
Yang Sui
Shaochen Zhong
Vipin Chaudhary
Helen Zhou
Anshumali Shrivastava
MQ
80
2
0
15 Apr 2025
DataDecide: How to Predict Best Pretraining Data with Small Experiments
Ian H. Magnusson
Nguyen Tai
Ben Bogin
David Heineman
Jena D. Hwang
...
Dirk Groeneveld
Oyvind Tafjord
Noah A. Smith
Pang Wei Koh
Jesse Dodge
ALM
83
3
0
15 Apr 2025
Transferable text data distillation by trajectory matching
Rong Yao
Hailin Hu
Yifei Fu
Hanting Chen
Wenyi Fang
Fanyi Du
Kai Han
Yunhe Wang
89
0
0
14 Apr 2025
Resampling Benchmark for Efficient Comprehensive Evaluation of Large Vision-Language Models
Teppei Suzuki
Keisuke Ozawa
VLM
180
0
0
14 Apr 2025
HELIOS: Adaptive Model And Early-Exit Selection for Efficient LLM Inference Serving
Avinash Kumar
Shashank Nag
Jason Clemons
L. John
Poulami Das
107
0
0
14 Apr 2025
DICE: A Framework for Dimensional and Contextual Evaluation of Language Models
Aryan Shrivastava
Paula Akemi Aoyagui
104
0
0
14 Apr 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu
Weiyun Wang
Zhe Chen
Ziwei Liu
Shenglong Ye
...
Dahua Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
Wei Wang
MLLM VLM
221
132
1
14 Apr 2025
The Jailbreak Tax: How Useful are Your Jailbreak Outputs?
Kristina Nikolić
Luze Sun
Jie Zhang
F. Tramèr
64
3
0
14 Apr 2025
EMAFusion: A Self-Optimizing System for Seamless LLM Selection and Integration
Soham Shah
Kumar Shridhar
Surojit Chatterjee
Souvik Sen
86
0
0
14 Apr 2025
SaRO: Enhancing LLM Safety through Reasoning-based Alignment
Yutao Mou
Yuxiao Luo
Shikun Zhang
Wei Ye
LLMSV LRM
61
2
0
13 Apr 2025
Improving Multilingual Capabilities with Cultural and Local Knowledge in Large Language Models While Enhancing Native Performance
Ram Mohan Rao Kadiyala
Siddartha Pullakhandam
Siddhant Gupta
Drishti Sharma
Jebish Purbey
Kanwal Mehreen
Muhammad Arham
Hamza Farooq
130
0
0
13 Apr 2025
Can the capability of Large Language Models be described by human ability? A Meta Study
Mingrui Zan
Yunquan Zhang
Boyang Zhang
Fangming Liu
Daning Cheng
ELM LM&MA
84
1
0
13 Apr 2025
Leveraging Reasoning Model Answers to Enhance Non-Reasoning Model Capability
Haotian Wang
Han Zhao
Shuaiting Chen
Xiaoyu Tian
Sitong Zhao
Yunjie Ji
Yiping Peng
Xiangang Li
ReLM LRM
104
0
0
13 Apr 2025
Alleviating the Fear of Losing Alignment in LLM Fine-tuning
Kang Yang
Guanhong Tao
X. Chen
Jun Xu
81
1
0
13 Apr 2025
Short-Path Prompting in LLMs: Analyzing Reasoning Instability and Solutions for Robust Performance
Zuoli Tang
Junjie Ou
Kaiqin Hu
Chunwei Wu
Zhaoxin Huan
Chilin Fu
Xiaolu Zhang
Jun Zhou
Chenliang Li
ReLM LRM
63
0
0
13 Apr 2025
DL-QAT: Weight-Decomposed Low-Rank Quantization-Aware Training for Large Language Models
Wenjin Ke
Zhe Li
D. Li
Lu Tian
E. Barsoum
MQ
101
3
0
12 Apr 2025
A Short Survey on Small Reasoning Models: Training, Inference, Applications and Research Directions
Chengyu Wang
Taolin Zhang
Richang Hong
Jun Huang
ReLM LRM
105
2
0
12 Apr 2025
A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis
Xin Gao
Qizhi Pei
Zinan Tang
Yongqian Li
Honglin Lin
Jiang Wu
Conghui He
Lijun Wu
SyDa
104
0
0
11 Apr 2025
Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning
FangZhi Xu
Hang Yan
Chang Ma
Haiteng Zhao
Qiushi Sun
Kanzhi Cheng
Junxian He
Jun Liu
Zhiyong Wu
LRM
71
5
0
11 Apr 2025
SAEs $\textit{Can}$ Improve Unlearning: Dynamic Sparse Autoencoder Guardrails for Precision Unlearning in LLMs
Aashiq Muhamed
Jacopo Bonato
Mona Diab
Virginia Smith
MU
147
6
0
11 Apr 2025
SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting
Jiaming Xu
Jiayi Pan
Yongkang Zhou
Siming Chen
Jiajian Li
Yaoxiu Lian
Junyi Wu
Guohao Dai
LRM
65
0
0
11 Apr 2025
Large Language Models Could Be Rote Learners
Yuyang Xu
Renjun Hu
Haochao Ying
Jian Wu
Xing Shi
Wei Lin
ELM
438
0
0
11 Apr 2025
Fast-Slow-Thinking: Complex Task Solving with Large Language Models
Yiliu Sun
Yanfang Zhang
Zicheng Zhao
Sheng Wan
Dacheng Tao
Chen Gong
LRM
86
0
0
11 Apr 2025
SortBench: Benchmarking LLMs based on their ability to sort lists
Steffen Herbold
RALM LRM
60
0
0
11 Apr 2025
NorEval: A Norwegian Language Understanding and Generation Evaluation Benchmark
Vladislav Mikhailov
Tita Ranveig Enstad
David Samuel
Hans Christian Farsethås
Andrey Kutuzov
Erik Velldal
Lilja Øvrelid
ELM
113
1
0
10 Apr 2025
Task-Circuit Quantization: Leveraging Knowledge Localization and Interpretability for Compression
Hanqi Xiao
Yi-Lin Sung
Elias Stengel-Eskin
Joey Tianyi Zhou
MQ
104
0
0
10 Apr 2025
Cluster-Driven Expert Pruning for Mixture-of-Experts Large Language Models
Hongcheng Guo
Juntao Yao
Boyang Wang
Junjia Du
Shaosheng Cao
Donglin Di
Shun Zhang
Zehan Li
MoE
105
0
0
10 Apr 2025
Model Utility Law: Evaluating LLMs beyond Performance through Mechanism Interpretable Metric
Yixin Cao
Jiahao Ying
Yansen Wang
Xipeng Qiu
Xuanjing Huang
Yugang Jiang
ELM
99
2
0
10 Apr 2025
Supervised Optimism Correction: Be Confident When LLMs Are Sure
Jing Zhang
Rushuai Yang
Shunyu Liu
Ting-En Lin
Fei Huang
Yi Chen
Yongqian Li
Dacheng Tao
OffRL
89
0
0
10 Apr 2025
SD$^2$: Self-Distilled Sparse Drafters
Mike Lasby
Nish Sinnadurai
Valavan Manohararajah
Sean Lie
Yani Andrew Ioannou
Vithursan Thangarasa
417
1
0
10 Apr 2025
What the HellaSwag? On the Validity of Common-Sense Reasoning Benchmarks
Pavel Chizhov
Mattia Nee
Pierre-Carl Langlais
Ivan P. Yamshchikov
ReLM ELM LRM
98
1
0
10 Apr 2025
A Neuro-inspired Interpretation of Unlearning in Large Language Models through Sample-level Unlearning Difficulty
Xiaohua Feng
Yuyuan Li
C. Wang
Junlin Liu
Lulu Zhang
Chaochao Chen
MU
57
0
0
09 Apr 2025
RAISE: Reinforced Adaptive Instruction Selection For Large Language Models
Lv Qingsong
Yangning Li
Zihua Lan
Zishan Xu
Jiwei Tang
Hai-Tao Zheng
Wenhao Jiang
Wanshi Xu
Philip S. Yu
169
2
0
09 Apr 2025
MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning
Yangning Li
Zihua Lan
Lv Qingsong
Hai-Tao Zheng
110
0
0
09 Apr 2025
Lugha-Llama: Adapting Large Language Models for African Languages
Happy Buzaaba
Alexander Wettig
David Ifeoluwa Adelani
Christiane Fellbaum
94
0
0
09 Apr 2025
GAAPO: Genetic Algorithmic Applied to Prompt Optimization
Xavier Sécheresse
Jacques-Yves Guilbert--Ly
Antoine Villedieu de Torcy
119
0
0
09 Apr 2025
SEE: Continual Fine-tuning with Sequential Ensemble of Experts
Zhilin Wang
Yafu Li
Xiaoye Qu
Yu Cheng
CLL KELM
123
0
0
09 Apr 2025
FuseRL: Dense Preference Optimization for Heterogeneous Model Fusion
Longguang Zhong
Fanqi Wan
Ziyi Yang
Guosheng Liang
Tianyuan Shi
Xiaojun Quan
MoMe
124
1
0
09 Apr 2025
Persona Dynamics: Unveiling the Impact of Personality Traits on Agents in Text-Based Games
Seungwon Lim
Seungbeen Lee
Dongjun Min
Youngjae Yu
AI4CE
124
0
0
09 Apr 2025
From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models
C. Xu
Ming-Yu Liu
Peng Xu
Ziwei Liu
Wei Ping
Mohammad Shoeybi
Bo Li
Bryan Catanzaro
123
4
0
08 Apr 2025
Llama-3-Nanda-10B-Chat: An Open Generative Large Language Model for Hindi
Monojit Choudhury
Shivam Chauhan
Rocktim Jyoti Das
Dhruv Sahnan
Xudong Han
...
Rituraj Joshi
Gurpreet Gosal
Avraham Sheinin
Natalia Vassilieva
Preslav Nakov
99
1
0
08 Apr 2025
Knowledge-Instruct: Effective Continual Pre-training from Limited Data using Instructions
O. Ovadia
Meni Brief
Rachel Lemberg
Eitam Sheetrit
CLL KELM
74
1
0
08 Apr 2025
Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs
Dongyang Fan
Vinko Sabolčec
Matin Ansaripour
Ayush Kumar Tarun
Martin Jaggi
Antoine Bosselut
Imanol Schlag
63
1
0
08 Apr 2025
Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs
Will Cai
Tianneng Shi
Xuandong Zhao
Dawn Song
78
6
0
07 Apr 2025
Achieving binary weight and activation for LLMs using Post-Training Quantization
Siqing Song
Chuang Wang
Ruiqi Wang
Yi Yang
Xuyao Zhang
MQ
132
0
0
07 Apr 2025
Do PhD-level LLMs Truly Grasp Elementary Addition? Probing Rule Learning vs. Memorization in Large Language Models
Yang Yan
Yu Lu
Renjun Xu
Zhenzhong Lan
LRM
121
4
0
07 Apr 2025