Measuring Massive Multitask Language Understanding
7 September 2020
Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt
ELM, RALM

Papers citing "Measuring Massive Multitask Language Understanding"

50 / 3,408 papers shown
Automatic Instruction Evolving for Large Language Models
Weihao Zeng, Can Xu, Yingxiu Zhao, Jianguang Lou, Weizhu Chen
SyDa · 02 Jun 2024
Brainstorming Brings Power to Large Language Models of Knowledge Reasoning
Zining Qin, Chenhao Wang, Huiling Qin, Weijia Jia
LRM · 02 Jun 2024
LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models
Liang Zhao, Tianwen Wei, Liang Zeng, Cheng Cheng, Liu Yang, ..., Yimeng Gan, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou
LLMAG, SyDa · 02 Jun 2024
Empirical influence functions to understand the logic of fine-tuning
Jordan K Matelsky, Lyle Ungar, Konrad Paul Kording
01 Jun 2024
Are Large Vision Language Models up to the Challenge of Chart Comprehension and Reasoning? An Extensive Investigation into the Capabilities and Limitations of LVLMs
Mohammed Saidul Islam, Raian Rahman, Ahmed Masry, Md Tahmid Rahman Laskar, Mir Tafseer Nayeem, Enamul Hoque
LRM, ELM · 01 Jun 2024
Direct Alignment of Language Models via Quality-Aware Self-Refinement
Runsheng Yu, Yong Wang, Xiaoqi Jiao, Youzhi Zhang, James T. Kwok
31 May 2024
clembench-2024: A Challenging, Dynamic, Complementary, Multilingual Benchmark and Underlying Flexible Framework for LLMs as Multi-Action Agents
Anne Beyer, Kranti Chalamalasetti, Sherzod Hakimov, Brielen Madureira, P. Sadler, David Schlangen
LLMAG · 31 May 2024
Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment
Yueqin Yin, Zhendong Wang, Yujia Xie, Weizhu Chen, Mingyuan Zhou
31 May 2024
Unveiling the Lexical Sensitivity of LLMs: Combinatorial Optimization for Prompt Enhancement
Pengwei Zhan, Zhen Xu, Qian Tan, Jie Song, Ru Xie
31 May 2024
UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation
Hanzhang Zhou, Zijian Feng, Zixiao Zhu, Junlang Qian, Kezhi Mao
31 May 2024
Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark
Chanjun Park, Hyeonwoo Kim, Dahyun Kim, Seonghwan Cho, Sanghoon Kim, Sukyung Lee, Yungi Kim, Hwalsuk Lee
ELM, ALM · 31 May 2024
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models
Zachary Ankner, Cody Blakeney, Kartik K. Sreenivasan, Max Marion, Matthew L. Leavitt, Mansheej Paul
30 May 2024
Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools
Varun Magesh, Faiz Surani, Matthew Dahl, Mirac Suzgun, Christopher D. Manning, Daniel E. Ho
HILM, ELM, AILaw · 30 May 2024
ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections
Massimo Bini, Karsten Roth, Zeynep Akata, Anna Khoreva
30 May 2024
Auto Arena of LLMs: Automating LLM Evaluations with Agent Peer-battles and Committee Discussions
Ruochen Zhao, Wenxuan Zhang, Yew Ken Chia, Deli Zhao, Lidong Bing
30 May 2024
One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments
Ke Yi, Yuhui Xu, Heng Chang, Chen Tang, Yuan Meng, Tong Zhang, Jia Li
MQ · 30 May 2024
TAIA: Large Language Models are Out-of-Distribution Data Learners
Shuyang Jiang, Yusheng Liao, Ya Zhang, Yu Wang, Yanfeng Wang
30 May 2024
A Survey Study on the State of the Art of Programming Exercise Generation using Large Language Models
Eduard Frankford, Ingo Höhn, Clemens Sauerwein, Ruth Breu
ELM · 30 May 2024
Would I Lie To You? Inference Time Alignment of Language Models using Direct Preference Heads
Avelina Asada Hadji-Kyriacou, Ognjen Arandjelović
30 May 2024
PertEval: Unveiling Real Knowledge Capacity of LLMs with Knowledge-Invariant Perturbations
Jiatong Li, Renjun Hu, Kunzhe Huang, Zhuang Yan, Qi Liu, Mengxiao Zhu, Xing Shi, Wei Lin
KELM · 30 May 2024
One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models
Yutao Zhu, Zhaoheng Huang, Zhicheng Dou, Ji-Rong Wen
RALM · 30 May 2024
Stress-Testing Capability Elicitation With Password-Locked Models
Ryan Greenblatt, Fabien Roger, Dmitrii Krasheninnikov, David M. Krueger
29 May 2024
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, Chenghua Lin, ..., Zi-Kai Zhao, Jiajun Zhang, Wanli Ouyang, Wenhao Huang, Wenhu Chen
ELM · 29 May 2024
Are Large Language Models Chameleons?
Mingmeng Geng, Sihong He, Roberto Trotta
29 May 2024
AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data
Zifan Song, Yudong Wang, Wenwei Zhang, Kuikun Liu, Chengqi Lyu, ..., Qipeng Guo, Hang Yan, Dahua Lin, Kai-xiang Chen, Cairong Zhao
SyDa · 29 May 2024
To FP8 and Back Again: Quantifying Reduced Precision Effects on LLM Training Stability
Joonhyung Lee, Jeongin Bae, Byeongwook Kim, S. Kwon, Dongsoo Lee
MQ · 29 May 2024
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution
Minghan Li, Xilun Chen, Ari Holtzman, Beidi Chen, Jimmy Lin, Wen-tau Yih, Xi Lin
RALM, BDL · 29 May 2024
Why are Visually-Grounded Language Models Bad at Image Classification?
Yuhui Zhang, Alyssa Unell, Xiaohan Wang, Dhruba Ghosh, Yuchang Su, Ludwig Schmidt, Serena Yeung-Levy
VLM · 28 May 2024
Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass
Ethan Shen, Alan Fan, Sarah M Pratt, Jae Sung Park, Matthew Wallingford, Sham Kakade, Ari Holtzman, Ranjay Krishna, Ali Farhadi, Aditya Kusupati
28 May 2024
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
Alexander Hägele, Elie Bakouch, Atli Kosson, Loubna Ben Allal, Leandro von Werra, Martin Jaggi
28 May 2024
LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models
Anthony Sarah, S. N. Sridhar, Maciej Szankin, Sairam Sundaresan
28 May 2024
PromptWizard: Task-Aware Agent-driven Prompt Optimization Framework
Eshaan Agarwal, Vivek Dani, T. Ganu, A. Nambi
LLMAG · 28 May 2024
Intelligent Clinical Documentation: Harnessing Generative AI for Patient-Centric Clinical Note Generation
Anjanava Biswas, Wrick Talukdar
28 May 2024
FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models
Yang Zhang, Yawei Li, Xinpeng Wang, Qianli Shen, Barbara Plank, Bernd Bischl, Mina Rezaei, Kenji Kawaguchi
28 May 2024
Exploiting LLM Quantization
Kazuki Egashira, Mark Vero, Robin Staab, Jingxuan He, Martin Vechev
MQ · 28 May 2024
Spanish and LLM Benchmarks: is MMLU Lost in Translation?
Irene Plaza, Nina Melero, Cristina del Pozo, Javier Conde, Pedro Reviriego, Marina Mayor-Rocher, María Grandury
ELM · 28 May 2024
Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
Keming Lu, Bowen Yu, Fei Huang, Yang Fan, Runji Lin, Chang Zhou
MoMe · 28 May 2024
Long Context is Not Long at All: A Prospector of Long-Dependency Data for Large Language Models
Longze Chen, Ziqiang Liu, Wanwei He, Yunshui Li, Run Luo, Min Yang
28 May 2024
Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment
Jiaxiang Li, Siliang Zeng, Hoi-To Wai, Chenliang Li, Alfredo García, Mingyi Hong
28 May 2024
Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization
Yuanpu Cao, Tianrong Zhang, Bochuan Cao, Ziyi Yin, Lu Lin, Fenglong Ma, Jinghui Chen
LLMSV · 28 May 2024
Exploring Activation Patterns of Parameters in Language Models
Yudong Wang, Damai Dai, Zhifang Sui
28 May 2024
Linguistic Collapse: Neural Collapse in (Large) Language Models
Robert Wu, Vardan Papyan
28 May 2024
LoRA-Switch: Boosting the Efficiency of Dynamic LLM Adapters via System-Algorithm Co-design
Rui Kong, Qiyang Li, Xinyu Fang, Qingtian Feng, Qingfeng He, Yazhu Dong, Weijun Wang, Yuanchun Li, Linghe Kong, Yunxin Liu
MoE · 28 May 2024
Learning diverse attacks on large language models for robust red-teaming and safety tuning
Seanie Lee, Minsu Kim, Lynn Cherif, David Dobre, Juho Lee, ..., Kenji Kawaguchi, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Moksh Jain
AAML · 28 May 2024
Outlier-weighed Layerwise Sampling for LLM Fine-tuning
Pengxiang Li, L. Yin, Xiaowei Gao, Shiwei Liu
28 May 2024
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
Zhen Qin, Weigao Sun, Dong Li, Xuyang Shen, Weixuan Sun, Yiran Zhong
27 May 2024
Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models
Sheng-Hsuan Peng, Pin-Yu Chen, Matthew Hull, Duen Horng Chau
27 May 2024
Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning
Runqian Wang, Soumya Ghosh, David D. Cox, Diego Antognini, Aude Oliva, Rogerio Feris, Leonid Karlinsky
27 May 2024
Efficient multi-prompt evaluation of LLMs
Felipe Maia Polo, Ronald Xu, Lucas Weber, Mírian Silva, Onkar Bhardwaj, Leshem Choshen, Allysson Flavio Melo de Oliveira, Yuekai Sun, Mikhail Yurochkin
27 May 2024
Phase Transitions in the Output Distribution of Large Language Models
Julian Arnold, Flemming Holtorf, Frank Schafer, Niels Lörch
27 May 2024