ResearchTrend.AI
Measuring Massive Multitask Language Understanding

7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM, RALM

Papers citing "Measuring Massive Multitask Language Understanding"

50 / 3,408 papers shown
Title
CBQ: Cross-Block Quantization for Large Language Models
Xin Ding
Xiaoyu Liu
Zhijun Tu
Yun-feng Zhang
Wei Li
...
Hanting Chen
Yehui Tang
Zhiwei Xiong
Baoqun Yin
Yunhe Wang
MQ
146
17
0
13 Dec 2023
Large language models in healthcare and medical domain: A review
Zabir Al Nazi
Wei Peng
LM&MA
85
165
0
12 Dec 2023
VILA: On Pre-training for Visual Language Models
Ji Lin
Hongxu Yin
Ming-Yu Liu
Yao Lu
Pavlo Molchanov
Andrew Tao
Huizi Mao
Jan Kautz
Mohammad Shoeybi
Song Han
MLLM, VLM
173
430
0
12 Dec 2023
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Keivan Alizadeh-Vahid
Iman Mirzadeh
Dmitry Belenko
Karen Khatamifard
Minsik Cho
C. C. D. Mundo
Mohammad Rastegari
Mehrdad Farajtabar
132
130
0
12 Dec 2023
LLMEval: A Preliminary Study on How to Evaluate Large Language Models
Yue Zhang
Ming Zhang
Haipeng Yuan
Shichun Liu
Yongyao Shi
Tao Gui
Qi Zhang
Xuanjing Huang
ALM, ELM
69
15
0
12 Dec 2023
SGLang: Efficient Execution of Structured Language Model Programs
Lianmin Zheng
Liangsheng Yin
Zhiqiang Xie
Chuyue Sun
Jeff Huang
...
Christos Kozyrakis
Ion Stoica
Joseph E. Gonzalez
Clark W. Barrett
Ying Sheng
LRM
142
174
0
12 Dec 2023
Alignment for Honesty
Yuqing Yang
Ethan Chern
Xipeng Qiu
Graham Neubig
Pengfei Liu
87
35
0
12 Dec 2023
ComplexityNet: Increasing LLM Inference Efficiency by Learning Task Complexity
Henry Bae
Aghyad Deeb
Alex Fleury
Kehang Zhu
38
3
0
12 Dec 2023
SM70: A Large Language Model for Medical Devices
Anubhav Bhatti
Surajsinh Parmar
San Lee
LM&MA, AI4MH
40
2
0
12 Dec 2023
Rethinking the Instruction Quality: LIFT is What You Need
Yang Xu
Yongqiang Yao
Yufan Huang
Mengnan Qi
Maoquan Wang
Bin Gu
Neel Sundaresan
ALM
80
35
0
12 Dec 2023
EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models
Samuel J. Paech
AI4MH
88
16
0
11 Dec 2023
Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs
O. Ovadia
Menachem Brief
Moshik Mishaeli
Oren Elisha
RALM
111
153
0
10 Dec 2023
ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models
Zhihang Yuan
Yuzhang Shang
Yue Song
Qiang Wu
Yan Yan
Guangyu Sun
MQ
124
61
0
10 Dec 2023
Steering Llama 2 via Contrastive Activation Addition
Nina Rimsky
Nick Gabrieli
Julian Schulz
Meg Tong
Evan Hubinger
Alexander Matt Turner
LLMSV
61
226
0
09 Dec 2023
PaperQA: Retrieval-Augmented Generative Agent for Scientific Research
Jakub Lála
Odhran O'Donoghue
Aleksandar Shtedritski
Sam Cox
Samuel G. Rodriques
Andrew D. White
RALM
151
89
0
08 Dec 2023
Learning to Break: Knowledge-Enhanced Reasoning in Multi-Agent Debate System
Haotian Wang
Xiyuan Du
Weijiang Yu
Qianglong Chen
Kun Zhu
Zheng Chu
Lian Yan
Yi Guan
107
13
0
08 Dec 2023
An LLM Compiler for Parallel Function Calling
Sehoon Kim
Suhong Moon
Ryan Tabrizi
Nicholas Lee
Michael W. Mahoney
Kurt Keutzer
A. Gholami
LRM
74
65
0
07 Dec 2023
Testing LLM performance on the Physics GRE: some observations
Pranav Gupta
ELM
39
2
0
07 Dec 2023
Prompt Optimization via Adversarial In-Context Learning
Do Xuan Long
Yiran Zhao
Hannah Brown
Yuxi Xie
James Xu Zhao
Nancy F. Chen
Kenji Kawaguchi
Michael Qizhe Xie
Junxian He
152
16
0
05 Dec 2023
ASPEN: High-Throughput LoRA Fine-Tuning of Large Language Models with a Single GPU
Zhengmao Ye
Dengchun Li
Jingqi Tian
Tingfeng Lan
Jie Zuo
...
Hui Lu
Yexi Jiang
Jian Sha
Ke Zhang
Mingjie Tang
151
5
0
05 Dec 2023
MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following
Renze Lou
Kai Zhang
Jian Xie
Yuxuan Sun
Janice Ahn
Hanzi Xu
Yu Su
Wenpeng Yin
115
30
0
05 Dec 2023
Efficient Online Data Mixing For Language Model Pre-Training
Alon Albalak
Liangming Pan
Colin Raffel
Wenjie Wang
101
46
0
05 Dec 2023
Physics simulation capabilities of LLMs
M. Ali-Dib
Kristen Menou
ELM, AI4CE
46
0
0
04 Dec 2023
Jellyfish: A Large Language Model for Data Preprocessing
Haochen Zhang
Yuyang Dong
Chuan Xiao
Masafumi Oyamada
115
27
0
04 Dec 2023
CLAMP: Contrastive LAnguage Model Prompt-tuning
Piotr Teterwak
Ximeng Sun
Bryan A. Plummer
Kate Saenko
Ser-Nam Lim
MLLM, VLM
82
1
0
04 Dec 2023
SymNoise: Advancing Language Model Fine-tuning with Symmetric Noise
A. Yadav
Arjun Singh
92
2
0
03 Dec 2023
From Beginner to Expert: Modeling Medical Knowledge into General LLMs
Qiang Li
Xiaoyan Yang
Haowen Wang
Qin Wang
Lei Liu
...
Wangshu Zhang
Teng Xu
Jinjie Gu
Jing Zheng
Guannan Zhang
LM&MA, ELM, AI4MH
128
16
0
02 Dec 2023
SeaLLMs -- Large Language Models for Southeast Asia
Xuan-Phi Nguyen
Wenxuan Zhang
Xin Li
Mahani Aljunied
Zhiqiang Hu
...
Yue Deng
Sen Yang
Chaoqun Liu
Hang Zhang
Li Bing
LRM
114
85
0
01 Dec 2023
Hashmarks: Privacy-Preserving Benchmarks for High-Stakes AI Evaluation
P. Bricman
55
0
0
01 Dec 2023
Instruction-tuning Aligns LLMs to the Human Brain
Khai Loong Aw
Syrielle Montariol
Badr AlKhamissi
Martin Schrimpf
Antoine Bosselut
149
22
0
01 Dec 2023
CoLLiE: Collaborative Training of Large Language Models in an Efficient Way
Kai Lv
Shuo Zhang
Tianle Gu
Shuhao Xing
Jiawei Hong
...
Tengxiao Liu
Yu Sun
Penousal Machado
Hang Yan
Xipeng Qiu
87
7
0
01 Dec 2023
LinguaLinked: A Distributed Large Language Model Inference System for Mobile Devices
Junchen Zhao
Yurun Song
Simeng Liu
Ian G. Harris
Sangeetha Abdu Jyothi
84
6
0
01 Dec 2023
The Philosopher's Stone: Trojaning Plugins of Large Language Models
Tian Dong
Minhui Xue
Guoxing Chen
Rayne Holland
Shaofeng Li
Yan Meng
Zhen Liu
Haojin Zhu
AAML
154
14
0
01 Dec 2023
Mark My Words: Analyzing and Evaluating Language Model Watermarks
Julien Piet
Chawin Sitawarin
Vivian Fang
Norman Mu
David Wagner
WaLM
144
36
0
01 Dec 2023
TaskBench: Benchmarking Large Language Models for Task Automation
Yongliang Shen
Kaitao Song
Xu Tan
Wenqi Zhang
Kan Ren
Siyu Yuan
Weiming Lu
Dongsheng Li
Yueting Zhuang
123
67
0
30 Nov 2023
AlignBench: Benchmarking Chinese Alignment of Large Language Models
Xiao Liu
Xuanyu Lei
Sheng-Ping Wang
Yue Huang
Zhuoer Feng
...
Hongning Wang
Jing Zhang
Minlie Huang
Yuxiao Dong
Jie Tang
ELM, LM&MA, ALM
187
50
0
30 Nov 2023
CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation
Pei Ke
Bosi Wen
Andrew Feng
Xiao-Yang Liu
Xuanyu Lei
...
Aohan Zeng
Yuxiao Dong
Hongning Wang
Jie Tang
Minlie Huang
ELM, ALM
132
35
0
30 Nov 2023
ArcMMLU: A Library and Information Science Benchmark for Large Language Models
Shitou Zhang
Zuchao Li
Xingshen Liu
Liming Yang
Ping Wang
ELM
49
0
0
30 Nov 2023
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models
Marwa Abdulhai
Isadora White
Charles Burton Snell
Charles Sun
Joey Hong
Yuexiang Zhai
Kelvin Xu
Sergey Levine
LLMAG, OffRL, LRM
87
42
0
30 Nov 2023
TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models
Zheng Chu
Jingchang Chen
Qianglong Chen
Weijiang Yu
Haotian Wang
Ming Liu
Bing Qin
LRM, ELM
124
15
0
29 Nov 2023
CLOMO: Counterfactual Logical Modification with Large Language Models
Yinya Huang
Ruixin Hong
Hongming Zhang
Wei Shao
Zhicheng YANG
Dong Yu
Changshui Zhang
Xiaodan Liang
Linqi Song
LRM
66
9
0
29 Nov 2023
Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs
Andries P. Smit
Paul Duckworth
Nathan Grinsztajn
Thomas D. Barrett
Arnu Pretorius
98
27
0
29 Nov 2023
ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?
Hailin Chen
Fangkai Jiao
Xingxuan Li
Chengwei Qin
Mathieu Ravaut
Ruochen Zhao
Caiming Xiong
Shafiq Joty
ELM, CLL, AI4MH, LRM, ALM
146
28
0
28 Nov 2023
Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine
Harsha Nori
Yin Tat Lee
Sheng Zhang
Dean Carignan
Richard Edgar
...
Hoifung Poon
Tao Qin
Naoto Usuyama
Chris White
Eric Horvitz
LM&MA, AI4MH, MedIm, ELM
104
328
0
28 Nov 2023
CDEval: A Benchmark for Measuring the Cultural Dimensions of Large Language Models
Yuhang Wang
Yanxu Zhu
Chao Kong
Shuyu Wei
Xiaoyuan Yi
Xing Xie
Jitao Sang
ALM, VLM, ELM
62
8
0
28 Nov 2023
Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models
Munan Ning
Bin Zhu
Yujia Xie
Bin Lin
Jiaxi Cui
Lu Yuan
Dongdong Chen
Li-ming Yuan
ELM, MLLM
80
66
0
27 Nov 2023
InstructMol: Multi-Modal Integration for Building a Versatile and Reliable Molecular Assistant in Drug Discovery
He Cao
Zijing Liu
Xingyu Lu
Yuan Yao
Yu-Feng Li
112
68
0
27 Nov 2023
WorldSense: A Synthetic Benchmark for Grounded Reasoning in Large Language Models
Youssef Benchekroun
Megi Dervishi
Mark Ibrahim
Jean-Baptiste Gaya
Xavier Martinet
Grégoire Mialon
Thomas Scialom
Emmanuel Dupoux
Dieuwke Hupkes
Pascal Vincent
LRM
68
8
0
27 Nov 2023
Towards Vision Enhancing LLMs: Empowering Multimodal Knowledge Storage and Sharing in LLMs
Yunxin Li
Baotian Hu
Wei Wang
Xiaochun Cao
Min Zhang
77
5
0
27 Nov 2023
Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation
Yuhui Zhang
Brandon McKinzie
Zhe Gan
Vaishaal Shankar
Alexander Toshev
40
3
0
27 Nov 2023