Measuring Massive Multitask Language Understanding

7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM, RALM

Papers citing "Measuring Massive Multitask Language Understanding"

50 / 3,408 papers shown
A Novel Psychometrics-Based Approach to Developing Professional Competency Benchmark for Large Language Models
Elena Kardanova
Alina Ivanova
Ksenia Tarasova
Taras Pashchenko
Aleksei Tikhoniuk
Elen Yusupova
Anatoly Kasprzhak
Yaroslav Kuzminov
Ekaterina Kruchinskaia
Irina Brun
127
1
0
29 Oct 2024
Beyond Text: Optimizing RAG with Multimodal Inputs for Industrial Applications
Monica Riedler
Stefan Langer
VLM
85
18
0
29 Oct 2024
Dreaming Out Loud: A Self-Synthesis Approach For Training Vision-Language Models With Developmentally Plausible Data
Badr AlKhamissi
Yingtian Tang
Abdülkadir Gökce
Johannes Mehrer
Martin Schrimpf
VLM
104
0
0
29 Oct 2024
CFSafety: Comprehensive Fine-grained Safety Assessment for LLMs
Zhihao Liu
Chenhui Hu
ALM, ELM
75
1
0
29 Oct 2024
Do Large Language Models Align with Core Mental Health Counseling Competencies?
Viet Cuong Nguyen
Mohammad Taher
Dongwan Hong
Vinicius Konkolics Possobom
Vibha Thirunellayi Gopalakrishnan
...
Zihang Li
H. J. Soled
Michael L. Birnbaum
Srijan Kumar
M. D. Choudhury
ELM, LM&MA, AI4MH
100
4
0
29 Oct 2024
Project MPG: towards a generalized performance benchmark for LLM capabilities
Lucas Spangher
Tianle Li
William Arnold
Nick Masiewicki
Xerxes Dotiwalla
Rama Parusmathi
Peter Grabowski
Eugene Ie
Dan Gruhl
57
0
0
28 Oct 2024
TransformLLM: Adapting Large Language Models via LLM-Transformed Reading Comprehension Text
Iftach Arbel
Yehonathan Refael
Ofir Lindenbaum
AILaw
55
0
0
28 Oct 2024
HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation
Yuhan Chen
Ang Lv
Jian Luan
Bin Wang
Wen Liu
66
5
0
28 Oct 2024
CURATe: Benchmarking Personalised Alignment of Conversational AI Assistants
Lize Alberts
Benjamin Ellis
Andrei Lupu
Jakob Foerster
ELM
91
2
0
28 Oct 2024
LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment
Ge Yang
Changyi He
Jinpei Guo
Jianyu Wu
Yifu Ding
Aishan Liu
Haotong Qin
Pengliang Ji
Xianglong Liu
MQ
98
7
0
28 Oct 2024
Large Language Model Benchmarks in Medical Tasks
Lawrence K. Q. Yan
Ming Li
Yize Zhang
Caitlyn Heqi Yin
Cheng Fei
...
Ziqian Bi
Pohsun Feng
Keyu Chen
Junyu Liu
Qian Niu
LM&MA, AI4MH
117
9
0
28 Oct 2024
NewTerm: Benchmarking Real-Time New Terms for Large Language Models with Annual Updates
Hexuan Deng
Wenxiang Jiao
Xuebo Liu
Min Zhang
Zhaopeng Tu
104
4
0
28 Oct 2024
Matryoshka: Learning to Drive Black-Box LLMs with LLMs
Changhao Li
Yuchen Zhuang
Rushi Qiang
Haotian Sun
H. Dai
Chao Zhang
Bo Dai
LRM
48
6
0
28 Oct 2024
Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models
Yilun Jin
Zheng Li
Chenwei Zhang
Tianyu Cao
Yifan Gao
...
Yi Xu
Kai Chen
Qiang Yang
Meng Jiang
Bing Yin
RALM
107
3
0
28 Oct 2024
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
Yongchang Hao
Yanshuai Cao
Lili Mou
MQ
76
4
0
28 Oct 2024
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
Hanyu Wang
Saksham Suri
Yixuan Ren
Hao Chen
Abhinav Shrivastava
VGen
107
12
0
28 Oct 2024
Transferable Post-training via Inverse Value Learning
Xinyu Lu
Xueru Wen
Yaojie Lu
Bowen Yu
Hongyu Lin
Haiyang Yu
Le Sun
Jia Zheng
Yongbin Li
42
1
0
28 Oct 2024
Long Sequence Modeling with Attention Tensorization: From Sequence to Tensor Learning
Aosong Feng
Rex Ying
Leandros Tassiulas
58
2
0
28 Oct 2024
Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation
Maohao Shen
Shun Zhang
Jilong Wu
Zhiping Xiu
Ehab AlBadawy
Yiting Lu
M. Seltzer
Qing He
70
2
0
27 Oct 2024
Fine-Tuning and Evaluating Open-Source Large Language Models for the Army Domain
Daniel C. Ruiz
John Sell
47
1
0
27 Oct 2024
Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse
Ryan Liu
Jiayi Geng
Addison J. Wu
Ilia Sucholutsky
Tania Lombrozo
Thomas Griffiths
ReLM, LRM
155
33
0
27 Oct 2024
DAWN-ICL: Strategic Planning of Problem-solving Trajectories for Zero-Shot In-Context Learning
Xinyu Tang
Xiaolei Wang
Wayne Xin Zhao
Ji-Rong Wen
121
6
0
26 Oct 2024
Layer by Layer: Uncovering Where Multi-Task Learning Happens in Instruction-Tuned Large Language Models
Zheng Zhao
Yftah Ziser
Shay B. Cohen
63
2
0
25 Oct 2024
2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision
Shilong Li
Yancheng He
Hui Huang
Xingyuan Bu
Qingbin Liu
Hangyu Guo
Weixun Wang
Jihao Gu
Wenbo Su
Bo Zheng
98
7
0
25 Oct 2024
Investigating the Role of Prompting and External Tools in Hallucination Rates of Large Language Models
Liam Barkley
Brink van der Merwe
HILM, LLMAG, LRM
50
4
0
25 Oct 2024
Graph Linearization Methods for Reasoning on Graphs with Large Language Models
Christos Xypolopoulos
Guokan Shang
Xiao Fei
Giannis Nikolentzos
Hadi Abdine
Iakovos Evdaimon
Michail Chatzianastasis
Giorgos Stamou
Michalis Vazirgiannis
119
1
0
25 Oct 2024
FairMT-Bench: Benchmarking Fairness for Multi-turn Dialogue in Conversational LLMs
Zhiting Fan
Ruizhe Chen
Tianxiang Hu
Zuozhu Liu
74
13
0
25 Oct 2024
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark
S. Sakshi
Utkarsh Tyagi
Sonal Kumar
Ashish Seth
Ramaneswaran Selvakumar
Oriol Nieto
R. Duraiswami
Sreyan Ghosh
Dinesh Manocha
AuLLM, ELM
148
46
0
24 Oct 2024
Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design
Ruisi Cai
Yeonju Ro
Geon-Woo Kim
Peihao Wang
Babak Ehteshami Bejnordi
Aditya Akella
Ziyi Wang
MoE
80
6
0
24 Oct 2024
Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance
Omer Nahum
Nitay Calderon
Orgad Keller
Idan Szpektor
Roi Reichart
66
4
0
24 Oct 2024
From English-Centric to Effective Bilingual: LLMs with Custom Tokenizers for Underrepresented Languages
Artur Kiulian
Anton Polishko
M. Khandoga
Yevhen Kostiuk
Guillermo Gabrielli
...
Hrishikesh Garud
Wendy Wing Yee Mak
Dmytro Chaplynskyi
Selma Belhadj Amor
Grigol Peradze
87
0
0
24 Oct 2024
Delving into the Reversal Curse: How Far Can Large Language Models Generalize?
Zhengkai Lin
Z. Fu
Kai Liu
Liang Xie
Binbin Lin
Wenxiao Wang
D. Cai
Yue Wu
Jieping Ye
LRM
128
3
0
24 Oct 2024
LOGO -- Long cOntext aliGnment via efficient preference Optimization
Zecheng Tang
Zechen Sun
Juntao Li
Qiaoming Zhu
Min Zhang
79
2
0
24 Oct 2024
KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing
Yifei Yang
Zouying Cao
Qiguang Chen
L. Qin
Dongjie Yang
Hai Zhao
Zhi Chen
66
6
0
24 Oct 2024
Dialog2Flow: Pre-training Soft-Contrastive Action-Driven Sentence Embeddings for Automatic Dialog Flow Extraction
Sergio Burdisso
S. Madikeri
P. Motlíček
78
3
0
24 Oct 2024
GrammaMT: Improving Machine Translation with Grammar-Informed In-Context Learning
Rita Ramos
Everlyn Asiko Chimoto
Maartje ter Hoeve
Natalie Schluter
115
2
0
24 Oct 2024
Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
Liwen Wang
Sheng Chen
Linnan Jiang
Shu Pan
Runze Cai
Sen Yang
Fei Yang
184
7
0
24 Oct 2024
Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model
Wenhong Zhu
Zhiwei He
Xiaofeng Wang
Pengfei Liu
Rui Wang
OSLM
109
7
0
24 Oct 2024
Scaling up Masked Diffusion Models on Text
Shen Nie
Fengqi Zhu
Chao Du
Tianyu Pang
Qian Liu
Guangtao Zeng
Min Lin
Chongxuan Li
AI4CE
217
30
0
24 Oct 2024
LEGO: Language Model Building Blocks
Shrenik Bhansali
Alwin Jin
Tyler Lizzo
Larry Heck
31
0
0
23 Oct 2024
MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning
Jingfan Zhang
Yi Zhao
Dan Chen
Xing Tian
Huanran Zheng
Wei Zhu
MoE
134
17
0
23 Oct 2024
DataTales: A Benchmark for Real-World Intelligent Data Narration
Yajing Yang
Qian Liu
Min-Yen Kan
43
0
0
23 Oct 2024
CLR-Bench: Evaluating Large Language Models in College-level Reasoning
Junnan Dong
Zijin Hong
Yuanchen Bei
Feiran Huang
Xinrun Wang
Xiao Huang
ELM, LRM
51
2
0
23 Oct 2024
WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models
Jinghan Jia
Jiancheng Liu
Yihua Zhang
Parikshit Ram
Nathalie Baracaldo
Sijia Liu
MU
160
8
0
23 Oct 2024
VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning
Yifan Peng
Krishna Puvvada
Zhehuai Chen
Piotr Żelasko
He Huang
Kunal Dhawan
Ke Hu
Shinji Watanabe
Jagadeesh Balam
Boris Ginsburg
171
5
0
23 Oct 2024
Understanding Layer Significance in LLM Alignment
Guangyuan Shi
Zexin Lu
Xiaoyu Dong
Wenlong Zhang
Xuanyu Zhang
Yujie Feng
Xiao-Ming Wu
147
3
0
23 Oct 2024
Beware of Calibration Data for Pruning Large Language Models
Yixin Ji
Yang Xiang
Juntao Li
Qingrong Xia
Ping Li
Xinyu Duan
Zhefeng Wang
Min Zhang
96
2
0
23 Oct 2024
Exploring Forgetting in Large Language Model Pre-Training
Chonghua Liao
Ruobing Xie
Xingwu Sun
Haowen Sun
Zhanhui Kang
CLL
82
1
0
22 Oct 2024
Learning Mathematical Rules with Large Language Models
Antoine Gorceix
Bastien Le Chenadec
Ahmad Rammal
N. Vadori
Manuela Veloso
60
1
0
22 Oct 2024
Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning
Zongmeng Zhang
Yufeng Shi
Jinhua Zhu
Wengang Zhou
Xiang Qi
Peng Zhang
Haoyang Li
RALM, HILM
39
0
0
22 Oct 2024