Measuring Massive Multitask Language Understanding

7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM, RALM

Papers citing "Measuring Massive Multitask Language Understanding"

50 / 3,408 papers shown
MoleculeQA: A Dataset to Evaluate Factual Accuracy in Molecular Comprehension
Xingyu Lu
He Cao
Zijing Liu
Shengyuan Bai
Leqing Chen
Yuan Yao
Hai-Tao Zheng
Yu-Feng Li
HILM
81
10
0
13 Mar 2024
Legally Binding but Unfair? Towards Assessing Fairness of Privacy Policies
Vincent Freiberger
Erik Buchmann
AILaw
70
5
0
12 Mar 2024
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension
Fangyun Wei
Xi Chen
Linzi Luo
ELM, ALM, LRM
63
8
0
12 Mar 2024
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
Sainbayar Sukhbaatar
O. Yu. Golovneva
Vasu Sharma
Hu Xu
Xi Lin
...
Jacob Kahn
Shang-Wen Li
Wen-tau Yih
Jason Weston
Xian Li
MoMe, OffRL, MoE
98
69
0
12 Mar 2024
Fine-tuning Large Language Models with Sequential Instructions
Hanxu Hu
Simon Yu
Pinzhen Chen
Edoardo Ponti
ALM, LRM
137
15
0
12 Mar 2024
ORPO: Monolithic Preference Optimization without Reference Model
Jiwoo Hong
Noah Lee
James Thorne
OSLM
113
268
0
12 Mar 2024
$\mathbf{(N,K)}$-Puzzle: A Cost-Efficient Testbed for Benchmarking Reinforcement Learning Algorithms in Generative Language Model
Yufeng Zhang
Liyu Chen
Boyi Liu
Yingxiang Yang
Qiwen Cui
Yunzhe Tao
Hongxia Yang
227
0
0
11 Mar 2024
Academically intelligent LLMs are not necessarily socially intelligent
Ruoxi Xu
Hongyu Lin
Xianpei Han
Le Sun
Yingfei Sun
ELM
65
7
0
11 Mar 2024
AC-EVAL: Evaluating Ancient Chinese Language Understanding in Large Language Models
Yuting Wei
Yuanxing Xu
Xinru Wei
Simin Yang
Yangfu Zhu
Yuqing Li
Di Liu
Bin Wu
ELM
42
0
0
11 Mar 2024
CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean
Eunsu Kim
Juyoung Suk
Philhoon Oh
Haneul Yoo
James Thorne
Alice Oh
ELM
149
23
0
11 Mar 2024
Amharic LLaMA and LLaVA: Multimodal LLMs for Low Resource Languages
Michael Andersland
33
0
0
11 Mar 2024
Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small Language Models
Minjie Zhu
Yichen Zhu
Xin Liu
Ning Liu
Zhiyuan Xu
Yaxin Peng
Chaomin Shen
Zhicai Ou
Feifei Feng
Jian Tang
VLM
102
22
0
10 Mar 2024
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM
Hao Kang
Qingru Zhang
Souvik Kundu
Geonhwa Jeong
Zaoxing Liu
Tushar Krishna
Tuo Zhao
MQ
175
94
0
08 Mar 2024
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Haoyu Lu
Wen Liu
Bo Zhang
Bing-Li Wang
Kai Dong
...
Yaofeng Sun
Chengqi Deng
Hanwei Xu
Zhenda Xie
Chong Ruan
VLM
131
373
0
08 Mar 2024
Unfamiliar Finetuning Examples Control How Language Models Hallucinate
Katie Kang
Eric Wallace
Claire Tomlin
Aviral Kumar
Sergey Levine
HILM, LRM
106
58
0
08 Mar 2024
ChatUIE: Exploring Chat-based Unified Information Extraction using Large Language Models
Jun Xu
Mengshu Sun
Qing Cui
Jun Zhou
75
1
0
08 Mar 2024
Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought
James Chua
Edward Rees
Hunar Batra
Samuel R. Bowman
Julian Michael
Ethan Perez
Miles Turpin
LRM
127
13
0
08 Mar 2024
Few shot chain-of-thought driven reasoning to prompt LLMs for open ended medical question answering
Ojas Gramopadhye
Saeel Sandeep Nachane
Prateek Chanda
Ganesh Ramakrishnan
Kshitij S. Jadhav
Yatin Nandwani
Dinesh Raghu
Sachindra Joshi
LM&MA, ELM, LRM
103
38
0
07 Mar 2024
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error
Boshi Wang
Hao Fang
Jason Eisner
Benjamin Van Durme
Yu-Chuan Su
CLL
67
10
0
07 Mar 2024
How Far Are We from Intelligent Visual Deductive Reasoning?
Yizhe Zhang
Richard He Bai
Ruixiang Zhang
Jiatao Gu
Shuangfei Zhai
J. Susskind
Navdeep Jaitly
ReLM, LRM
99
17
0
07 Mar 2024
Yi: Open Foundation Models by 01.AI
01.AI
Alex Young
Bei Chen
Chao Li
...
Yue Wang
Yuxuan Cai
Zhenyu Gu
Zhiyuan Liu
Zonghong Dai
OSLM, LRM
319
577
0
07 Mar 2024
CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios
Qilang Ye
Zitong Yu
Rui Shao
Xinyu Xie
Philip Torr
Xiaochun Cao
MLLM
112
30
0
07 Mar 2024
Enhancing Data Quality in Federated Fine-Tuning of Foundation Models
Wanru Zhao
Yaxin Du
Nicholas D. Lane
Siheng Chen
Yanfeng Wang
94
4
0
07 Mar 2024
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
Wei-Lin Chiang
Lianmin Zheng
Ying Sheng
Anastasios Nikolas Angelopoulos
Tianle Li
...
Hao Zhang
Banghua Zhu
Michael I. Jordan
Joseph E. Gonzalez
Ion Stoica
OSLM
184
603
0
07 Mar 2024
Can Large Language Models do Analytical Reasoning?
Yebowen Hu
Kaiqiang Song
Sangwoo Cho
Xiaoyang Wang
H. Foroosh
Dong Yu
Fei Liu
ELM, ReLM, LRM
42
2
0
06 Mar 2024
SaulLM-7B: A pioneering Large Language Model for Law
Pierre Colombo
T. Pires
Malik Boudiaf
Dominic Culver
Rui Melo
...
Andre F. T. Martins
Fabrizio Esposito
Vera Lúcia Raposo
Sofia Morgado
Michael Desa
ELM, AILaw
114
77
0
06 Mar 2024
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
Xin Men
Mingyu Xu
Qingyu Zhang
Bingning Wang
Hongyu Lin
Yaojie Lu
Xianpei Han
Weipeng Chen
117
142
0
06 Mar 2024
MedSafetyBench: Evaluating and Improving the Medical Safety of Large Language Models
Tessa Han
Aounon Kumar
Chirag Agarwal
Himabindu Lakkaraju
ELM, LM&MA, AI4MH
56
10
0
06 Mar 2024
Apollo: A Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People
Xidong Wang
Nuo Chen
Junying Chen
Yan Hu
Yidong Wang
Xiangbo Wu
Anningzhe Gao
Xiang Wan
Haizhou Li
Benyou Wang
LM&MA
101
28
0
06 Mar 2024
Negating Negatives: Alignment without Human Positive Samples via Distributional Dispreference Optimization
Shitong Duan
Xiaoyuan Yi
Peng Zhang
Tun Lu
Xing Xie
Ning Gu
71
7
0
06 Mar 2024
Do You Trust Your Model? Emerging Malware Threats in the Deep Learning Ecosystem
Dorjan Hitaj
Giulio Pagnotta
Fabio De Gaspari
Sediola Ruko
Briland Hitaj
Luigi V. Mancini
Fernando Perez-Cruz
118
6
0
06 Mar 2024
Guardrail Baselines for Unlearning in LLMs
Pratiksha Thaker
Yash Maurya
Shengyuan Hu
Zhiwei Steven Wu
Virginia Smith
MU
107
53
0
05 Mar 2024
The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning
Nathaniel Li
Alexander Pan
Anjali Gopal
Summer Yue
Daniel Berrios
...
Yan Shoshitaishvili
Jimmy Ba
K. Esvelt
Alexandr Wang
Dan Hendrycks
ELM
129
195
0
05 Mar 2024
Quantum Many-Body Physics Calculations with Large Language Models
Haining Pan
N. Mudur
Will Taranto
Maria Tikhanovskaya
Subhashini Venugopalan
Yasaman Bahri
Michael P. Brenner
Eun-Ah Kim
73
9
0
05 Mar 2024
Exploring the Limitations of Large Language Models in Compositional Relation Reasoning
Jinman Zhao
Xueyan Zhang
BDL, LRM
73
4
0
05 Mar 2024
Balancing Enhancement, Harmlessness, and General Capabilities: Enhancing Conversational LLMs with Direct RLHF
Chen Zheng
Ke Sun
Hang Wu
Chenguang Xi
Xun Zhou
107
12
0
04 Mar 2024
Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems
Lingjiao Chen
Jared Quincy Davis
Boris Hanin
Peter Bailis
Ion Stoica
Matei A. Zaharia
James Zou
LRM
76
0
0
04 Mar 2024
Birbal: An efficient 7B instruct-model fine-tuned with curated datasets
Ashvini Jindal
P. Rajpoot
Ankur P. Parikh
80
6
0
04 Mar 2024
Not All Layers of LLMs Are Necessary During Inference
Siqi Fan
Xin Jiang
Xiang Li
Xuying Meng
Peng Han
Shuo Shang
Aixin Sun
Yequan Wang
Zhongyuan Wang
126
44
0
04 Mar 2024
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models
Changyu Chen
Xiting Wang
Ting-En Lin
Ang Lv
Yuchuan Wu
Xin Gao
Ji-Rong Wen
Rui Yan
Yongbin Li
ReLM, LRM
89
14
0
04 Mar 2024
Large language models surpass human experts in predicting neuroscience results
Xiaoliang Luo
Akilles Rechardt
Guangzhi Sun
Kevin K. Nejad
Felipe Yáñez
...
Anna Behler
Chloe M. Hall
J. Dafflon
Sherry Dongqi Bao
Bradley C. Love
91
58
0
04 Mar 2024
SciAssess: Benchmarking LLM Proficiency in Scientific Literature Analysis
Hengxing Cai
Xiaochen Cai
Junhan Chang
Changhao Nai
Lin Yao
...
Changhong Chen
Zheng Cheng
Zifeng Zhao
Linfeng Zhang
Guolin Ke
ELM
83
25
0
04 Mar 2024
To Generate or to Retrieve? On the Effectiveness of Artificial Contexts for Medical Open-Domain Question Answering
Giacomo Frisoni
Alessio Cocchieri
Alex Presepi
Gianluca Moro
Zaiqiao Meng
RALM, MedIm
112
17
0
04 Mar 2024
Online Training of Large Language Models: Learn while chatting
Juhao Liang
Ziwei Wang
Zhuoheng Ma
Jianquan Li
Zhiyi Zhang
Xiangbo Wu
Benyou Wang
KELM
108
4
0
04 Mar 2024
An Improved Traditional Chinese Evaluation Suite for Foundation Model
Zhi Rui Tam
Ya-Ting Pai
Yen-Wei Lee
Jun-Da Chen
Wei-Min Chu
Sega Cheng
Hong-Han Shuai
ELM
125
12
0
04 Mar 2024
Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral
Yiming Cui
Xin Yao
37
5
0
04 Mar 2024
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention
Tianyi Zhang
Jonah Yi
Bowen Yao
Zhaozhuo Xu
Anshumali Shrivastava
MQ
104
7
0
02 Mar 2024
Dissecting Language Models: Machine Unlearning via Selective Pruning
Nicholas Pochinkov
Nandi Schoots
MILM, MU
75
23
0
02 Mar 2024
Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal
Jianheng Huang
Leyang Cui
Ante Wang
Chengyi Yang
Xinting Liao
Linfeng Song
Junfeng Yao
Jinsong Su
KELM, CLL
90
46
0
02 Mar 2024
IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact
Ruikang Liu
Haoli Bai
Haokun Lin
Yuening Li
Han Gao
Zheng-Jun Xu
Lu Hou
Jun Yao
Chun Yuan
MQ
84
32
0
02 Mar 2024