ResearchTrend.AI
Measuring Massive Multitask Language Understanding

7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
    ELM, RALM

Papers citing "Measuring Massive Multitask Language Understanding"

50 / 3,408 papers shown
Title
Scaling Laws for Predicting Downstream Performance in LLMs
Yangyi Chen
Binxuan Huang
Yifan Gao
Zhengyang Wang
Jingfeng Yang
Heng Ji
LRM
151
12
0
11 Oct 2024
Language Imbalance Driven Rewarding for Multilingual Self-improving
Wen Yang
Junhong Wu
Chen Wang
Chengqing Zong
J.N. Zhang
ALM, LRM
215
7
0
11 Oct 2024
Language model developers should report train-test overlap
Andy K. Zhang
Kevin Klyman
Yifan Mai
Yoav Levine
Yian Zhang
Rishi Bommasani
Percy Liang
VLM, ELM
76
9
0
10 Oct 2024
Do You Know What You Are Talking About? Characterizing Query-Knowledge Relevance For Reliable Retrieval Augmented Generation
Zhuohang Li
Jiaxin Zhang
Chao Yan
Kamalika Das
Sricharan Kumar
Murat Kantarcioglu
Bradley Malin
RALM
48
2
0
10 Oct 2024
Assessing Episodic Memory in LLMs with Sequence Order Recall Tasks
Mathis Pink
Vy A. Vo
Qinyuan Wu
Jianing Mu
Javier S. Turek
Uri Hasson
K. A. Norman
Sebastian Michelmann
Alexander G. Huth
Mariya Toneva
106
2
0
10 Oct 2024
Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System
Weize Chen
Jiarui Yuan
Chen Qian
Cheng Yang
Zhiyuan Liu
Maosong Sun
LLMAG
91
6
0
10 Oct 2024
Packing Analysis: Packing Is More Appropriate for Large Models or Datasets in Supervised Fine-tuning
Shuhe Wang
Guoyin Wang
Yucheng Wang
Jiwei Li
Eduard H. Hovy
Chen Guo
121
4
0
10 Oct 2024
GameTraversalBenchmark: Evaluating Planning Abilities Of Large Language Models Through Traversing 2D Game Maps
Muhammad Umair Nasir
Steven D. James
Julian Togelius
ELM, LRM
92
5
0
10 Oct 2024
SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture
Jiayi Han
Liang Du
Hongwei Du
Xiangguo Zhou
Yiwen Wu
Weibo Zheng
Donghong Han
CLL, MoMe, MoE
86
4
0
10 Oct 2024
AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories
Yifan Song
Weimin Xiong
Xiutian Zhao
Dawei Zhu
Wenhao Wu
Ke Wang
Cheng Li
Wei Peng
Sujian Li
LLMAG
64
11
0
10 Oct 2024
StablePrompt: Automatic Prompt Tuning using Reinforcement Learning for Large Language Models
Minchan Kwon
Gaeun Kim
Jongsuk Kim
Haeil Lee
Junmo Kim
OffRL, LRM, LLMAG
75
4
0
10 Oct 2024
Boosting Deep Ensembles with Learning Rate Tuning
Hongpeng Jin
Yanzhao Wu
71
0
0
10 Oct 2024
News Reporter: A Multi-lingual LLM Framework for Broadcast T.V News
Tarun Jain
Yufei Gao
Sridhar Vanga
Karan Singla
57
0
0
10 Oct 2024
CrossQuant: A Post-Training Quantization Method with Smaller Quantization Kernel for Precise Large Language Model Compression
Wenyuan Liu
Xindian Ma
Peng Zhang
Yan Wang
MQ
58
1
0
10 Oct 2024
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
Gen Luo
Xue Yang
Wenhan Dou
Zhaokai Wang
Jifeng Dai
Yu Qiao
Xizhou Zhu
VLM, MLLM
165
34
0
10 Oct 2024
Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models
Zhipeng Chen
Liang Song
K. Zhou
Wayne Xin Zhao
Binghai Wang
Weipeng Chen
Ji-Rong Wen
140
0
0
10 Oct 2024
Efficient Pretraining Data Selection for Language Models via Multi-Actor Collaboration
Tianyi Bai
Ling Yang
Zhen Hao Wong
Fupeng Sun
Jiahui Peng
...
Lijun Wu
Jiantao Qiu
Wentao Zhang
Binhang Yuan
Conghui He
LLMAG
79
6
0
10 Oct 2024
COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU Artificial Intelligence Act
Philipp Guldimann
Alexander Spiridonov
Robin Staab
Nikola Jovanović
Mark Vero
...
Mislav Balunović
Nikola Konstantinov
Pavol Bielik
Petar Tsankov
Martin Vechev
ELM
103
8
0
10 Oct 2024
A Closer Look at Machine Unlearning for Large Language Models
Xiaojian Yuan
Tianyu Pang
Chao Du
Kejiang Chen
Weiming Zhang
Min Lin
MU
259
13
0
10 Oct 2024
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
Peng Jin
Bo Zhu
Li Yuan
Shuicheng Yan
MoE
115
6
0
09 Oct 2024
Mental Disorders Detection in the Era of Large Language Models
Gleb Kuzmin
Petr Strepetov
Maksim Stankevich
Artem Shelmanov
Ivan Smirnov
46
1
0
09 Oct 2024
Capturing Bias Diversity in LLMs
Purva Prasad Gosavi
Vaishnavi Murlidhar Kulkarni
Alan F. Smeaton
33
0
0
09 Oct 2024
Mitigating the Language Mismatch and Repetition Issues in LLM-based Machine Translation via Model Editing
Weichuan Wang
Zhaoyi Li
Defu Lian
Chen Ma
Linqi Song
Ying Wei
103
8
0
09 Oct 2024
Self-Boosting Large Language Models with Synthetic Preference Data
Qingxiu Dong
Li Dong
Xingxing Zhang
Zhifang Sui
Furu Wei
SyDa
92
12
0
09 Oct 2024
Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning
Runchuan Zhu
Zhipeng Ma
Jiang Wu
Junyuan Gao
Jiaqi Wang
Dahua Lin
Conghui He
53
4
0
09 Oct 2024
Weak-eval-Strong: Evaluating and Eliciting Lateral Thinking of LLMs with Situation Puzzles
Qi Chen
Bowen Zhang
Gang Wang
Qi Wu
ReLM, LRM
80
6
0
09 Oct 2024
Chip-Tuning: Classify Before Language Models Say
Fangwei Zhu
Dian Li
Jiajun Huang
Gang Liu
Hui Wang
Zhifang Sui
62
0
0
09 Oct 2024
Do great minds think alike? Investigating Human-AI Complementarity in Question Answering with CAIMIRA
Maharshi Gor
Hal Daumé III
Dinesh Manocha
Jordan Boyd-Graber
ELM, AI4MH, LRM
70
4
0
09 Oct 2024
WAPITI: A Watermark for Finetuned Open-Source LLMs
Lingjie Chen
Ruizhong Qiu
Siyu Yuan
Zhining Liu
Tianxin Wei
Hyunsik Yoo
Zhichen Zeng
Deqing Yang
Hanghang Tong
WaLM
104
7
0
09 Oct 2024
LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints
Thomas Palmeira Ferraz
Kartik Mehta
Yu-Hsiang Lin
Haw-Shiuan Chang
Shereen Oraby
Sijia Liu
Vivek Subramanian
Tagyoung Chung
Mohit Bansal
Nanyun Peng
104
14
0
09 Oct 2024
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
Zhihao He
Hang Yu
Zi Gong
Shizhan Liu
Jia-Nan Li
Weiyao Lin
VLM
104
2
0
09 Oct 2024
Data Selection via Optimal Control for Language Models
Yuxian Gu
Li Dong
Hongning Wang
Y. Hao
Qingxiu Dong
Furu Wei
Minlie Huang
AI4CE
174
9
0
09 Oct 2024
Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs
Ruijia Niu
D. Wu
Rose Yu
Yi-An Ma
126
2
0
09 Oct 2024
MentalArena: Self-play Training of Language Models for Diagnosis and Treatment of Mental Health Disorders
Cheng-rong Li
May Fung
Qingyun Wang
Chi Han
Manling Li
Jindong Wang
Heng Ji
AI4MH
442
0
0
09 Oct 2024
Scaling Laws For Mixed Quantization
Zeyu Cao
Boyang Gu
Cheng Zhang
Pedro Gimenes
Jianqiao Lu
Jianyi Cheng
Xitong Gao
Yiren Zhao
MQ
86
1
0
09 Oct 2024
QERA: an Analytical Framework for Quantization Error Reconstruction
Cheng Zhang
Jeffrey T. H. Wong
Can Xiao
George A. Constantinides
Yiren Zhao
MQ
78
4
0
08 Oct 2024
Active Evaluation Acquisition for Efficient LLM Benchmarking
Yang Li
Jie Ma
Miguel Ballesteros
Yassine Benajiba
Graham Horwood
ELM
76
6
0
08 Oct 2024
KnowledgeSG: Privacy-Preserving Synthetic Text Generation with Knowledge Distillation from Server
Wenhao Wang
Xiaoyu Liang
Rui Ye
Jingyi Chai
Siheng Chen
Yanfeng Wang
SyDa
92
6
0
08 Oct 2024
DecorateLM: Data Engineering through Corpus Rating, Tagging, and Editing with Language Models
Ranchi Zhao
Zhen Leng Thai
Yifan Zhang
Shengding Hu
Yunqi Ba
Jie Zhou
Jie Cai
Zhiyuan Liu
Maosong Sun
147
1
0
08 Oct 2024
MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment
Amir Hossein Kargaran
Ali Modarressi
Nafiseh Nikeghbal
Jana Diesner
François Yvon
Hinrich Schütze
ELM
104
7
0
08 Oct 2024
Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification
Tao Meng
Ninareh Mehrabi
Palash Goyal
Anil Ramakrishna
Aram Galstyan
Richard Zemel
Kai-Wei Chang
Rahul Gupta
Charith Peris
28
1
0
07 Oct 2024
Superficial Safety Alignment Hypothesis
Jianwei Li
Jung-Eun Kim
65
3
0
07 Oct 2024
Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models
Fei Wang
Ninareh Mehrabi
Palash Goyal
Rahul Gupta
Kai-Wei Chang
Aram Galstyan
ALM
74
2
0
07 Oct 2024
CasiMedicos-Arg: A Medical Question Answering Dataset Annotated with Explanatory Argumentative Structures
Ekaterina Sviridova
Anar Yeginbergen
A. Estarrona
Elena Cabrio
S. Villata
Rodrigo Agerri
99
6
0
07 Oct 2024
Precise Model Benchmarking with Only a Few Observations
Riccardo Fogliato
Pratik Patil
Nil-Jana Akpinar
Mathew Monfort
77
0
0
07 Oct 2024
A Recipe For Building a Compliant Real Estate Chatbot
Navid Madani
Anusha Bagalkotkar
Supriya Anand
Gabriel Arnson
Rohini Srihari
K. Joseph
AI4TS
24
0
0
07 Oct 2024
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild
Xinyu Zhao
Guoheng Sun
Ruisi Cai
Yukun Zhou
Pingzhi Li
...
Binhang Yuan
Hongyi Wang
Ang Li
Zhangyang Wang
Tianlong Chen
MoMe, ALM
143
5
0
07 Oct 2024
Falcon Mamba: The First Competitive Attention-free 7B Language Model
Jingwei Zuo
Maksim Velikanov
Dhia Eddine Rhaiem
Ilyas Chahed
Younes Belkada
Guillaume Kunsch
Hakim Hacid
ALM
97
17
0
07 Oct 2024
MINER: Mining the Underlying Pattern of Modality-Specific Neurons in Multimodal Large Language Models
Kaichen Huang
Jiahao Huo
Yibo Yan
Kun Wang
Yutao Yue
Xuming Hu
80
2
0
07 Oct 2024
Representing the Under-Represented: Cultural and Core Capability Benchmarks for Developing Thai Large Language Models
Dahyun Kim
Sukyung Lee
Yungi Kim
Attapol Rutherford
Chanjun Park
ELM
55
1
0
07 Oct 2024