Measuring Massive Multitask Language Understanding

7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
    ELM, RALM

Papers citing "Measuring Massive Multitask Language Understanding"

50 / 3,408 papers shown
API Is Enough: Conformal Prediction for Large Language Models Without Logit-Access
Jiayuan Su
Jing Luo
Hongwei Wang
Lu Cheng
245
23
0
02 Mar 2024
RAGged Edges: The Double-Edged Sword of Retrieval-Augmented Chatbots
Philip G. Feldman
James R. Foulds
Shimei Pan
SILM
91
13
0
02 Mar 2024
STAR: Constraint LoRA with Dynamic Active Learning for Data-Efficient Fine-Tuning of Large Language Models
Linhai Zhang
Jialong Wu
Deyu Zhou
Guoqiang Xu
98
5
0
02 Mar 2024
LAB: Large-Scale Alignment for ChatBots
Shivchander Sudalairaj
Abhishek Bhandwaldar
Aldo Pareja
Kai Xu
David D. Cox
Akash Srivastava
OSLM
88
35
0
02 Mar 2024
Formulation Comparison for Timeline Construction using LLMs
Kimihiro Hasegawa
Nikhil Kandukuri
Susan Holm
Yukari Yamakawa
Teruko Mitamura
88
0
0
01 Mar 2024
ATP: Enabling Fast LLM Serving via Attention on Top Principal Keys
Yue Niu
Saurav Prakash
Salman Avestimehr
51
1
0
01 Mar 2024
Do Zombies Understand? A Choose-Your-Own-Adventure Exploration of Machine Cognition
Ariel Goldstein
Gabriel Stanovsky
61
1
0
01 Mar 2024
Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes
Xiaomeng Hu
Pin-Yu Chen
Tsung-Yi Ho
AAML
66
32
0
01 Mar 2024
FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition
Xiaoqiang Wang
Bang Liu
Lingfei Wu
89
0
0
29 Feb 2024
Lifelong Benchmarks: Efficient Model Evaluation in an Era of Rapid Progress
Ameya Prabhu
Vishaal Udandarao
Philip Torr
Matthias Bethge
Adel Bibi
Samuel Albanie
94
4
0
29 Feb 2024
Humanoid Locomotion as Next Token Prediction
Ilija Radosavovic
Bike Zhang
Baifeng Shi
Jathushan Rajasegaran
Sarthak Kamat
Trevor Darrell
Koushil Sreenath
Jitendra Malik
LM&Ro
96
67
0
29 Feb 2024
Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap
Saurabh Srivastava
B. AnnaroseM
V. AntoP
Shashank Menon
Ajay Sukumar
T. AdwaithSamod
Alan Philipose
Stevin Prince
Sooraj Thomas
ELM, ReLM, LRM
79
56
0
29 Feb 2024
OpenMedLM: Prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models
Jenish Maharjan
A. Garikipati
N. Singh
Leo Cyrus
Mayank Sharma
M. Ciobanu
G. Barnes
R. Thapa
Q. Mao
R. Das
LM&MA, ELM
84
31
0
29 Feb 2024
Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning
Weijieying Ren
Xinlong Li
Lei Wang
Tianxiang Zhao
Wei Qin
CLL, KELM
117
39
0
29 Feb 2024
FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability
Congying Xia
Chen Xing
Jiangshu Du
Xinyi Yang
Yihao Feng
Ran Xu
Wenpeng Yin
Caiming Xiong
ALM
92
54
0
28 Feb 2024
Language Models Represent Beliefs of Self and Others
Wentao Zhu
Zhining Zhang
Yizhou Wang
MILM, LRM
94
10
0
28 Feb 2024
Tokenization Is More Than Compression
Craig W. Schmidt
Varshini Reddy
Haoran Zhang
Alec Alameddine
Omri Uzan
Yuval Pinter
Chris Tanner
124
38
0
28 Feb 2024
Towards Generalist Prompting for Large Language Models by Mental Models
Haoxiang Guan
Jiyan He
Shuxin Zheng
En-Hong Chen
Weiming Zhang
Neng H. Yu
LRM
81
1
0
28 Feb 2024
Learning or Self-aligning? Rethinking Instruction Fine-tuning
Mengjie Ren
Boxi Cao
Hongyu Lin
Liu Cao
Xianpei Han
Ke Zeng
Guanglu Wan
Xunliang Cai
Le Sun
105
28
0
28 Feb 2024
CogBench: a large language model walks into a psychology lab
Julian Coda-Forno
Marcel Binz
Jane X. Wang
Eric Schulz
ELM, ALM, LLMAG, LM&MA
121
39
0
28 Feb 2024
LLM Task Interference: An Initial Study on the Impact of Task-Switch in Conversational History
Akash Gupta
Ivaxi Sheth
Vyas Raina
Mark Gales
Mario Fritz
86
6
0
28 Feb 2024
MIKO: Multimodal Intention Knowledge Distillation from Large Language Models for Social-Media Commonsense Discovery
Feihong Lu
Weiqi Wang
Yangyifei Luo
Ziqin Zhu
Qingyun Sun
...
Haochen Shi
Shiqi Gao
Qian Li
Yangqiu Song
Jianxin Li
VLM
130
6
0
28 Feb 2024
Evaluating Quantized Large Language Models
Shiyao Li
Xuefei Ning
Luning Wang
Tengxuan Liu
Xiangsheng Shi
Shengen Yan
Guohao Dai
Huazhong Yang
Yu Wang
MQ
119
53
0
28 Feb 2024
MedAide: Leveraging Large Language Models for On-Premise Medical Assistance on Edge Devices
Abdul Basit
Khizar Hussain
Muhammad Abdullah Hanif
Mohamed Bennai
LM&MA
49
5
0
28 Feb 2024
Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation
Shicheng Xu
Liang Pang
Mo Yu
Fandong Meng
Huawei Shen
Xueqi Cheng
Jie Zhou
RALM
79
15
0
28 Feb 2024
No Token Left Behind: Reliable KV Cache Compression via Importance-Aware Mixed Precision Quantization
J. Yang
Byeongwook Kim
Jeongin Bae
Beomseok Kwon
Gunho Park
Eunho Yang
S. Kwon
Dongsoo Lee
MQ
171
53
0
28 Feb 2024
Do Large Language Models Mirror Cognitive Language Processing?
Yuqi Ren
Renren Jin
Tongxuan Zhang
Deyi Xiong
154
6
0
28 Feb 2024
Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions
Hanjie Chen
Zhouxiang Fang
Yash Singla
Mark Dredze
ELM, AI4MH
145
43
0
28 Feb 2024
Can an LLM-Powered Socially Assistive Robot Effectively and Safely Deliver Cognitive Behavioral Therapy? A Study With University Students
Mina Kian
M. Zong
Katrin Fischer
Abhyuday Singh
Anna-Maria Velentza
...
Misha A. Faruki
Wallace Browning
Sebastien M. R. Arnold
Bhaskar Krishnamachari
Maja J. Matarić
127
5
0
27 Feb 2024
JMLR: Joint Medical LLM and Retrieval Training for Enhancing Reasoning and Professional Question Answering Capability
Junda Wang
Zhichao Yang
Zonghai Yao
Hong-ye Yu
BDL, AI4MH, LRM
107
35
0
27 Feb 2024
Stable LM 2 1.6B Technical Report
Marco Bellagente
J. Tow
Dakota Mahan
Duy Phung
Maksym Zhuravinskyi
...
Paulo Rocha
Harry Saini
H. Teufel
Niccoló Zanichelli
Carlos Riquelme
OSLM
106
58
0
27 Feb 2024
Prediction-Powered Ranking of Large Language Models
Ivi Chatzi
Eleni Straitouri
Suhas Thejaswi
Manuel Gomez Rodriguez
ALM
127
9
0
27 Feb 2024
Massive Activations in Large Language Models
Mingjie Sun
Xinlei Chen
J. Zico Kolter
Zhuang Liu
126
81
0
27 Feb 2024
AmbigNLG: Addressing Task Ambiguity in Instruction for NLG
Ayana Niwa
Hayate Iso
92
5
0
27 Feb 2024
KoDialogBench: Evaluating Conversational Understanding of Language Models with Korean Dialogue Benchmark
Seongbo Jang
Seonghyeon Lee
Hwanjo Yu
ELM
71
0
0
27 Feb 2024
RECOST: External Knowledge Guided Data-efficient Instruction Tuning
Qi Zhang
Yiming Zhang
Haobo Wang
Junbo Zhao
83
14
0
27 Feb 2024
Measuring Vision-Language STEM Skills of Neural Models
Jianhao Shen
Ye Yuan
Srbuhi Mirzoyan
Ming Zhang
Chenguang Wang
VLM
131
12
0
27 Feb 2024
MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning
Fajie Yuan
Chengshun Shi
Shiguang Wu
Mengqi Zhang
Zhaochun Ren
Maarten de Rijke
Zhumin Chen
Jiahuan Pei
MoE
211
13
0
27 Feb 2024
DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation
Sunghyeon Woo
Baeseong Park
Byeongwook Kim
Minjung Jo
S. Kwon
Dongsuk Jeon
Dongsoo Lee
132
3
0
27 Feb 2024
Asymmetry in Low-Rank Adapters of Foundation Models
Jiacheng Zhu
Kristjan Greenewald
Kimia Nadjahi
Haitz Sáez de Ocáriz Borde
Rickard Brüel-Gabrielsson
Leshem Choshen
Marzyeh Ghassemi
Mikhail Yurochkin
Justin Solomon
122
39
0
26 Feb 2024
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
Omkar Thawakar
Ashmal Vayani
Salman Khan
Hisham Cholakal
Rao M. Anwer
Michael Felsberg
Timothy Baldwin
Eric P. Xing
Fahad Shahbaz Khan
115
35
0
26 Feb 2024
Language Agents as Optimizable Graphs
Mingchen Zhuge
Wenyi Wang
Louis Kirsch
Francesco Faccio
Dmitrii Khizbullin
Jürgen Schmidhuber
LLMAG
103
22
0
26 Feb 2024
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
Mikayel Samvelyan
Sharath Chandra Raparthy
Andrei Lupu
Eric Hambro
Aram H. Markosyan
...
Minqi Jiang
Jack Parker-Holder
Jakob Foerster
Tim Rocktaschel
Roberta Raileanu
SyDa
117
89
0
26 Feb 2024
Nemotron-4 15B Technical Report
Jupinder Parmar
Shrimai Prabhumoye
Pritam Gundecha
M. Patwary
Sandeep Subramanian
...
Ashwath Aithal
Oleksii Kuchaiev
Mohammad Shoeybi
Jonathan Cohen
Bryan Catanzaro
101
23
0
26 Feb 2024
A Comprehensive Evaluation of Quantization Strategies for Large Language Models
Renren Jin
Jiangcun Du
Wuwei Huang
Wei Liu
Jian Luan
Bin Wang
Deyi Xiong
MQ
109
37
0
26 Feb 2024
Quantum linear algebra is all you need for Transformer architectures
Naixu Guo
Zhan Yu
Matthew Choi
Aman Agrawal
Kouhei Nakaji
Alán Aspuru-Guzik
Patrick Rebentrost
AI4CE
72
16
0
26 Feb 2024
SelectIT: Selective Instruction Tuning for LLMs via Uncertainty-Aware Self-Reflection
Liangxin Liu
Xuebo Liu
Derek F. Wong
Dongfang Li
Ziyi Wang
Baotian Hu
Min Zhang
104
21
0
26 Feb 2024
Towards Open-ended Visual Quality Comparison
Haoning Wu
Hanwei Zhu
Zicheng Zhang
Erli Zhang
Chaofeng Chen
...
Qiong Yan
Xiaohong Liu
Guangtao Zhai
Shiqi Wang
Weisi Lin
AAML
111
55
0
26 Feb 2024
LangGPT: Rethinking Structured Reusable Prompt Design Framework for LLMs from the Programming Language
Ming Wang
Yuanzhong Liu
Xiaoyu Liang
Songlian Li
Yijie Huang
...
Shi Feng
Chi Zhang
Yifei Zhang
Minghui Zheng
Jigang Li
133
15
0
26 Feb 2024
LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments
Junzhe Chen
Xuming Hu
Shuodi Liu
Shiyu Huang
Weijuan Tu
Zhaofeng He
Lijie Wen
ELM, LLMAG
90
13
0
26 Feb 2024