ResearchTrend.AI

Measuring Massive Multitask Language Understanding (arXiv: 2009.03300)

7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM, RALM

Papers citing "Measuring Massive Multitask Language Understanding"

50 / 3,408 papers shown
Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration
Qintong Li
Jiahui Gao
Sheng Wang
Renjie Pi
Xueliang Zhao
Chuan Wu
Xin Jiang
Zhiyu Li
Lingpeng Kong
SyDa
103
3
0
22 Oct 2024
Influential Language Data Selection via Gradient Trajectory Pursuit
Zhiwei Deng
Tao Li
Yang Li
67
1
0
22 Oct 2024
PLDR-LLM: Large Language Model from Power Law Decoder Representations
Burc Gokden
59
1
0
22 Oct 2024
Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes
Bryan R Christ
Zack Gottesman
Jonathan Kropko
Thomas Hartvigsen
LRM
138
4
0
22 Oct 2024
ToW: Thoughts of Words Improve Reasoning in Large Language Models
Zhikun Xu
Ming Shen
Jacob Dineen
Zhaonan Li
Xiao Ye
Shijie Lu
Aswin Rrv
Chitta Baral
Ben Zhou
LRM
454
1
0
21 Oct 2024
Pre-training Distillation for Large Language Models: A Design Space Exploration
Hao Peng
Xin Lv
Yushi Bai
Zijun Yao
Jing Zhang
Lei Hou
Juanzi Li
74
4
0
21 Oct 2024
RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
Yantao Liu
Zijun Yao
Rui Min
Yixin Cao
Lei Hou
Juanzi Li
OffRL, ALM
128
42
0
21 Oct 2024
Exploring Continual Fine-Tuning for Enhancing Language Ability in Large Language Model
Divyanshu Aggarwal
Sankarshan Damle
Navin Goyal
Satya Lokam
Sunayana Sitaram
CLL
65
1
0
21 Oct 2024
Do Large Language Models Have an English Accent? Evaluating and Improving the Naturalness of Multilingual LLMs
Yanzhu Guo
Simone Conia
Zelin Zhou
Min Li
Saloni Potdar
Henry Xiao
85
3
0
21 Oct 2024
Residual vector quantization for KV cache compression in large language model
Ankur Kumar
MQ
104
0
0
21 Oct 2024
OpenMU: Your Swiss Army Knife for Music Understanding
Mengjie Zhao
Zhi-Wei Zhong
Zhuoyuan Mao
Shiqi Yang
Wei-Hsiang Liao
Shusuke Takahashi
Hiromi Wakaki
Yuki Mitsufuji
OSLM
103
8
0
21 Oct 2024
BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data
Wenkai Li
Jiarui Liu
Andy Liu
Xuhui Zhou
Mona Diab
Maarten Sap
165
11
0
21 Oct 2024
Catastrophic Failure of LLM Unlearning via Quantization
Zhiwei Zhang
Fali Wang
Xiaomin Li
Zongyu Wu
Xianfeng Tang
Hui Liu
Qi He
Wenpeng Yin
Suhang Wang
MU
97
18
0
21 Oct 2024
Compute-Constrained Data Selection
Junjie Oscar Yin
Alexander M. Rush
194
1
0
21 Oct 2024
Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence
Norbert Tihanyi
Tamás Bisztray
Richard A. Dubniczky
Rebeka Tóth
B. Borsos
...
Ryan Marinelli
Lucas C. Cordeiro
Merouane Debbah
Vasileios Mavroeidis
Audun Josang
95
5
0
20 Oct 2024
The Best Defense is a Good Offense: Countering LLM-Powered Cyberattacks
Daniel Ayzenshteyn
Roy Weiss
Yisroel Mirsky
AAML
56
2
0
20 Oct 2024
When Machine Unlearning Meets Retrieval-Augmented Generation (RAG): Keep Secret or Forget Knowledge?
Shang Wang
Tianqing Zhu
Dayong Ye
Wanlei Zhou
MU
80
5
0
20 Oct 2024
Lossless KV Cache Compression to 2%
Zhen Yang
Jizong Han
Kan Wu
Ruobing Xie
An Wang
Xingwu Sun
Zhanhui Kang
VLM, MQ
80
2
0
20 Oct 2024
M-RewardBench: Evaluating Reward Models in Multilingual Settings
Srishti Gureja
Lester James V. Miranda
Shayekh Bin Islam
Rishabh Maheshwary
Drishti Sharma
Gusti Winata
Nathan Lambert
Sebastian Ruder
Sara Hooker
Marzieh Fadaee
LRM
142
24
0
20 Oct 2024
Mitigating Forgetting in LLM Supervised Fine-Tuning and Preference Learning
H. Fernando
Han Shen
Parikshit Ram
Yi Zhou
Horst Samulowitz
Nathalie Baracaldo
Tianyi Chen
CLL
169
4
0
20 Oct 2024
Ichigo: Mixed-Modal Early-Fusion Realtime Voice Assistant
Alan Dao
Dinh Bach Vu
Huy Hoang Ha
AuLLM, VLM
141
5
0
20 Oct 2024
An Electoral Approach to Diversify LLM-based Multi-Agent Collective Decision-Making
Xiutian Zhao
Ke Wang
Wei Peng
98
4
0
19 Oct 2024
CAP: Data Contamination Detection via Consistency Amplification
Yi Zhao
Jing Li
Linyi Yang
55
1
0
19 Oct 2024
SPRIG: Improving Large Language Model Performance by System Prompt Optimization
Lechen Zhang
Tolga Ergen
Lajanugen Logeswaran
Moontae Lee
David Jurgens
LRM
79
9
0
18 Oct 2024
MoDification: Mixture of Depths Made Easy
C. Zhang
M. Zhong
Qimeng Wang
Xuantao Lu
Zheyu Ye
...
Yan Gao
Yao Hu
Kehai Chen
Min Zhang
Dawei Song
VLM, MoE
59
2
0
18 Oct 2024
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Xiaochuan Li
Zichun Yu
Chenyan Xiong
SyDa
82
1
0
18 Oct 2024
Make LLMs better zero-shot reasoners: Structure-orientated autonomous reasoning
Pengfei He
Zitao Li
Yue Xing
Yaling Li
Jiliang Tang
Bolin Ding
LLMAG, LRM
43
3
0
18 Oct 2024
SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment
Qin Liu
Fei Wang
Chaowei Xiao
Muhao Chen
457
2
0
18 Oct 2024
Step Guided Reasoning: Improving Mathematical Reasoning using Guidance Generation and Step Reasoning
Lang Cao
Chao Peng
Renhong Chen
Wu Ning
Yingtian Zou
Yitong Li
LRM
111
0
0
18 Oct 2024
Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs
Runchu Tian
Yanghao Li
Yuepeng Fu
Siyang Deng
Qinyu Luo
...
Zhong Zhang
Yesai Wu
Yankai Lin
Huadong Wang
Xiaojiang Liu
131
1
0
18 Oct 2024
RiTeK: A Dataset for Large Language Models Complex Reasoning over Textual Knowledge Graphs
Jiatan Huang
Mingchen Li
Zonghai Yao
Zhichao Yang
Yongkang Xiao
Feiyun Ouyang
Xiaohan Li
Shuo Han
Hong-ye Yu
RALM
140
3
0
17 Oct 2024
Accounting for Sycophancy in Language Model Uncertainty Estimation
Anthony Sicilia
Mert Inan
Malihe Alikhani
71
2
0
17 Oct 2024
A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models
Qiaoyu Tang
Le Yu
Bowen Yu
Hongyu Lin
Keming Lu
Yaojie Lu
Jia Zheng
Le Sun
MoMe
83
1
0
17 Oct 2024
Unearthing Skill-Level Insights for Understanding Trade-Offs of Foundation Models
Mazda Moayeri
Vidhisha Balachandran
Varun Chandrasekaran
Safoora Yousefi
Thomas Fel
Soheil Feizi
Besmira Nushi
Neel Joshi
Vibhav Vineet
71
5
0
17 Oct 2024
BenTo: Benchmark Task Reduction with In-Context Transferability
Hongyu Zhao
Ming Li
Lichao Sun
Tianyi Zhou
98
0
0
17 Oct 2024
Looking Inward: Language Models Can Learn About Themselves by Introspection
Felix J Binder
James Chua
Tomek Korbak
Henry Sleight
John Hughes
Robert Long
Ethan Perez
Miles Turpin
Owain Evans
KELM, AIFin, LRM
95
17
0
17 Oct 2024
Help Me Identify: Is an LLM+VQA System All We Need to Identify Visual Concepts?
Shailaja Keyur Sampat
Maitreya Patel
Yezhou Yang
Chitta Baral
35
0
0
17 Oct 2024
IterSelectTune: An Iterative Training Framework for Efficient Instruction-Tuning Data Selection
Jielin Song
Siyu Liu
Bin Zhu
Yanghui Rao
40
3
0
17 Oct 2024
MedINST: Meta Dataset of Biomedical Instructions
Wenhan Han
Meng Fang
Zihan Zhang
Yu Yin
Zirui Song
Ling-Hao Chen
Mykola Pechenizkiy
Qingyu Chen
LM&MA
69
3
0
17 Oct 2024
Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models
Chengyu Du
Jinyi Han
Yizhou Ying
Aili Chen
Qianyu He
...
Haoran Guo
Jiaqing Liang
Zulong Chen
Liangyue Li
Yanghua Xiao
KELM, CLL, LRM
69
1
0
17 Oct 2024
MoR: Mixture of Ranks for Low-Rank Adaptation Tuning
Chuanyu Tang
Yilong Chen
Zhenyu Zhang
Junyuan Shang
Wenyuan Zhang
Yong Huang
Tingwen Liu
MoE
59
0
0
17 Oct 2024
Breaking Chains: Unraveling the Links in Multi-Hop Knowledge Unlearning
Minseok Choi
C. Park
Dohyun Lee
Jaegul Choo
KELM, MU
55
1
0
17 Oct 2024
From Babbling to Fluency: Evaluating the Evolution of Language Models in Terms of Human Language Acquisition
Qiyuan Yang
Pengda Wang
Luke D. Plonsky
Frederick L. Oswald
Hanjie Chen
ELM
77
2
0
17 Oct 2024
LocateBench: Evaluating the Locating Ability of Vision Language Models
Ting-Rui Chiang
Joshua Robinson
Xinyan Velocity Yu
Dani Yogatama
VLM, ELM
70
0
0
17 Oct 2024
Balancing Label Quantity and Quality for Scalable Elicitation
Alex Troy Mallen
Nora Belrose
75
2
0
17 Oct 2024
Retrieval-Enhanced Named Entity Recognition
Enzo Shiraishi
Raphael Y. de Camargo
Henrique L. P. Silva
Ronaldo C. Prati
RALM
118
0
0
17 Oct 2024
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
Xinze Li
Sen Mei
Zhenghao Liu
Yukun Yan
Shuo Wang
...
Haotian Chen
Ge Yu
Zhiyuan Liu
Maosong Sun
Chenyan Xiong
110
12
0
17 Oct 2024
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation
Yiming Wang
Pei Zhang
Baosong Yang
Derek F. Wong
Rui Wang
LRM
115
15
0
17 Oct 2024
Limits to scalable evaluation at the frontier: LLM as Judge won't beat twice the data
Florian E. Dorner
Vivian Y. Nastl
Moritz Hardt
ELM, ALM
121
10
0
17 Oct 2024
Learning to Route LLMs with Confidence Tokens
Yu-Neng Chuang
Helen Zhou
Prathusha Kameswara Sarma
Parikshit Gopalan
John Boccio
Sara Bolouki
Helen Zhou
85
0
0
17 Oct 2024