Measuring Massive Multitask Language Understanding

7 September 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM, RALM

Papers citing "Measuring Massive Multitask Language Understanding"

50 / 3,408 papers shown
Title
Energy-Based Preference Model Offers Better Offline Alignment than the Bradley-Terry Preference Model
Yuzhong Hong
Hanshan Zhang
Junwei Bao
Hongfei Jiang
Yang Song
OffRL
117
4
0
18 Dec 2024
Domain-adaptative Continual Learning for Low-resource Tasks: Evaluation on Nepali
Sharad Duwal
Suraj Prasai
Suresh Manandhar
CLL
131
1
0
18 Dec 2024
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN
Pengxiang Li
Lu Yin
Shiwei Liu
116
8
0
18 Dec 2024
MetaRuleGPT: Recursive Numerical Reasoning of Language Models Trained with Simple Rules
Kejie Chen
Lin Wang
Qinghai Zhang
Renjun Xu
ReLM, LRM
99
0
0
18 Dec 2024
AntiLeakBench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge
Xiaobao Wu
Liangming Pan
Yuxi Xie
Ruiwen Zhou
Shuai Zhao
Yubo Ma
Mingzhe Du
Rui Mao
Anh Tuan Luu
William Yang Wang
283
13
0
18 Dec 2024
Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation
Vera Neplenbroek
Arianna Bisazza
Raquel Fernández
213
1
0
18 Dec 2024
A Rose by Any Other Name: LLM-Generated Explanations Are Good Proxies for Human Explanations to Collect Label Distributions on NLI
Beiduo Chen
Siyao Peng
Anna Korhonen
Barbara Plank
139
2
0
18 Dec 2024
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces
Jihan Yang
Shusheng Yang
Anjali W. Gupta
Rilyn Han
Li Fei-Fei
Saining Xie
LRM
212
107
0
18 Dec 2024
Concept-ROT: Poisoning Concepts in Large Language Models with Model Editing
Keltin Grimes
Marco Christiani
David Shriver
Marissa Connor
KELM
124
4
0
17 Dec 2024
Unveiling the Secret Recipe: A Guide For Supervised Fine-Tuning Small LLMs
Aldo Pareja
Nikhil Shivakumar Nayak
Hao Wang
Krishnateja Killamsetty
Shivchander Sudalairaj
...
Guangxuan Xu
Kai Xu
Ligong Han
Luke Inglis
Akash Srivastava
197
7
0
17 Dec 2024
Preference-Oriented Supervised Fine-Tuning: Favoring Target Model Over Aligned Large Language Models
Yuchen Fan
Yuzhong Hong
Qiushi Wang
Junwei Bao
Hongfei Jiang
Yang Song
117
1
0
17 Dec 2024
A Survey of Calibration Process for Black-Box LLMs
Liangru Xie
Hui Liu
Jingying Zeng
Xianfeng Tang
Yan Han
Chen Luo
Jing Huang
Zhen Li
Suhang Wang
Qi He
142
4
0
17 Dec 2024
Activating Distributed Visual Region within LLMs for Efficient and Effective Vision-Language Training and Inference
Siyuan Wang
Dianyi Wang
Chengxing Zhou
Zejun Li
Zhihao Fan
Xuanjing Huang
Zhongyu Wei
VLM
525
0
0
17 Dec 2024
Wonderful Matrices: Combining for a More Efficient and Effective Foundation Model Architecture
Jingze Shi
Bingheng Wu
120
0
0
16 Dec 2024
QUENCH: Measuring the gap between Indic and Non-Indic Contextual General Reasoning in LLMs
Mohammad Aflah Khan
Neemesh Yadav
Sarah Masud
Md. Shad Akhtar
169
0
0
16 Dec 2024
Codenames as a Benchmark for Large Language Models
Matthew Stephenson
Matthew Sidji
Benoît Ronval
LLMAG, LRM, ELM
234
1
0
16 Dec 2024
DAOP: Data-Aware Offloading and Predictive Pre-Calculation for Efficient MoE Inference
Yujie Zhang
Shivam Aggarwal
T. Mitra
MoE
164
1
0
16 Dec 2024
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
Jiale Cheng
Xiao-Chang Liu
C. Wang
Xiaotao Gu
Yaojie Lu
Dan Zhang
Yuxiao Dong
J. Tang
Hongning Wang
Minlie Huang
LRM
189
4
0
16 Dec 2024
Generics are puzzling. Can language models find the missing piece?
Gustavo Cilleruelo Calderón
Emily Allaway
Barry Haddow
Alexandra Birch
129
0
0
15 Dec 2024
TrimLLM: Progressive Layer Dropping for Domain-Specific LLMs
Lanxiang Hu
Tajana Rosing
Hao Zhang
105
0
0
15 Dec 2024
Smaller Language Models Are Better Instruction Evolvers
Tingfeng Hui
Lulu Zhao
Guanting Dong
Yaqi Zhang
Hua Zhou
Sen Su
ALM
180
3
0
15 Dec 2024
The Superalignment of Superhuman Intelligence with Large Language Models
Minlie Huang
Yingkang Wang
Shiyao Cui
Pei Ke
J. Tang
176
1
0
15 Dec 2024
Does Representation Matter? Exploring Intermediate Layers in Large Language Models
Oscar Skean
Md Rifat Arefin
Yann LeCun
Ravid Shwartz-Ziv
134
12
0
12 Dec 2024
Phi-4 Technical Report
Marah Abdin
J. Aneja
Harkirat Singh Behl
Sébastien Bubeck
Ronen Eldan
...
Rachel A. Ward
Yue Wu
Dingli Yu
Cyril Zhang
Yi Zhang
ALM, SyDa
192
154
0
12 Dec 2024
HadaCore: Tensor Core Accelerated Hadamard Transform Kernel
Krish Agarwal
Rishi Astra
Adnan Hoque
Mudhakar Srivatsa
R. Ganti
Less Wright
Sijia Chen
127
4
0
12 Dec 2024
Learning to Reason via Self-Iterative Process Feedback for Small Language Models
Kaiyuan Chen
Jin Wang
Xuejie Zhang
LRM, ReLM
115
2
0
11 Dec 2024
BEIR-NL: Zero-shot Information Retrieval Benchmark for the Dutch Language
Nikolay Banar
Ehsan Lotfi
Walter Daelemans
81
2
0
11 Dec 2024
MoDULA: Mixture of Domain-Specific and Universal LoRA for Multi-Task Learning
Yufei Ma
Zihan Liang
Huangyu Dai
Bin Chen
D. Gao
...
Linbo Jin
Wen Jiang
Guannan Zhang
Xiaoyan Cai
Libin Yang
MoE, MoMe
153
1
0
10 Dec 2024
Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models
Haoran Lian
Junmin Chen
Wei Huang
Yizhe Xiong
Wenping Hu
...
Hui Chen
Jianwei Niu
Zijia Lin
Fuzheng Zhang
Di Zhang
127
0
0
10 Dec 2024
On Evaluating the Durability of Safeguards for Open-Weight LLMs
Xiangyu Qi
Boyi Wei
Nicholas Carlini
Yangsibo Huang
Tinghao Xie
Luxi He
Matthew Jagielski
Milad Nasr
Prateek Mittal
Peter Henderson
AAML
137
22
0
10 Dec 2024
AutoReason: Automatic Few-Shot Reasoning Decomposition
Arda Sevinc
A. Gumus
ReLM, LRM
102
0
0
09 Dec 2024
I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token
Roi Cohen
Konstantin Dobler
Eden Biran
Gerard de Melo
194
9
0
09 Dec 2024
SafeWorld: Geo-Diverse Safety Alignment
Da Yin
Haoyi Qiu
Kung-Hsiang Huang
Kai-Wei Chang
Nanyun Peng
121
8
0
09 Dec 2024
Sloth: scaling laws for LLM skills to predict multi-benchmark performance across families
Felipe Maia Polo
Shivalika Singh
Leshem Choshen
Yuekai Sun
Mikhail Yurochkin
232
8
0
09 Dec 2024
A Scoping Review of ChatGPT Research in Accounting and Finance
Mengming Michael Dong
Theophanis C. Stratopoulos
Victor Xiaoqi Wang
126
25
0
07 Dec 2024
CharacterBox: Evaluating the Role-Playing Capabilities of LLMs in Text-Based Virtual Worlds
Lei Wang
Jianxun Lian
Yi Huang
Yanqi Dai
Haoxuan Li
Xu Chen
Xing Xie
Ji-Rong Wen
LLMAG
154
7
0
07 Dec 2024
TransitGPT: A Generative AI-based framework for interacting with GTFS data using Large Language Models
Saipraneeth Devunuri
Lewis J. Lehe
LM&MA
165
2
0
07 Dec 2024
Training-Free Bayesianization for Low-Rank Adapters of Large Language Models
Haizhou Shi
Yibin Wang
Ligong Han
Huatian Zhang
Hao Wang
UQCV
231
2
0
07 Dec 2024
A Survey on Uncertainty Quantification of Large Language Models: Taxonomy, Open Research Challenges, and Future Directions
Ola Shorinwa
Zhiting Mei
Justin Lidard
Allen Z. Ren
Anirudha Majumdar
HILM, LRM
153
19
0
07 Dec 2024
SKIM: Any-bit Quantization Pushing The Limits of Post-Training Quantization
Runsheng Bai
Qiang Liu
B. Liu
MQ
135
2
0
05 Dec 2024
Reinforcement Learning Enhanced LLMs: A Survey
Shuhe Wang
Shengyu Zhang
Jing Zhang
Runyi Hu
Xiaoya Li
Tianwei Zhang
Jiwei Li
Leilei Gan
G. Wang
Eduard H. Hovy
OffRL
245
16
0
05 Dec 2024
Chatting with Logs: An exploratory study on Finetuning LLMs for LogQL
Vishwanath Seshagiri
Siddharth Balyan
Vaastav Anand
Kaustubh Dhole
Ishan Sharma
Avani Wildani
José Cambronero
Andreas Züfle
149
2
0
04 Dec 2024
Unifying KV Cache Compression for Large Language Models with LeanKV
Yanqi Zhang
Yuwei Hu
Runyuan Zhao
John C. S. Lui
Haibo Chen
MQ
286
7
0
04 Dec 2024
Optimizing Large Language Models for Turkish: New Methodologies in Corpus Selection and Training
Himmet Toprak Kesgin
M. K. Yuce
Eren Dogan
M. E. Uzun
Atahan Uz
Elif Ince
Yusuf Erdem
Osama Shbib
Ahmed Zeer
M. Fatih Amasyali
108
0
0
03 Dec 2024
The Vulnerability of Language Model Benchmarks: Do They Accurately Reflect True LLM Performance?
Sourav Banerjee
Ayushi Agarwal
Eishkaran Singh
ELM
105
3
0
02 Dec 2024
Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models
Cameron Tice
Philipp Alexander Kreer
Nathan Helm-Burger
Prithviraj Singh Shahani
Fedor Ryzhenkov
Jacob Haimes
Felix Hofstätter
Teun van der Weij
130
1
0
02 Dec 2024
Can We Afford The Perfect Prompt? Balancing Cost and Accuracy with the Economical Prompting Index
Tyler McDonald
Anthony Colosimo
Yifeng Li
Ali Emami
111
2
0
02 Dec 2024
SailCompass: Towards Reproducible and Robust Evaluation for Southeast Asian Languages
Jia Guo
Longxu Dou
Guangtao Zeng
Stanley Kok
Wei Lu
Qian Liu
ELM, LRM
129
2
0
02 Dec 2024
INTELLECT-1 Technical Report
Sami Jaghouar
Jack Min Ong
Manveer Basra
Fares Obeid
Jannik Straube
...
Lucas Atkins
Maziyar Panahi
Charles Goddard
Max Ryabinin
Johannes Hagemann
MoE
162
3
0
02 Dec 2024
TruncFormer: Private LLM Inference Using Only Truncations
Patrick Yubeaton
Jianqiao Mo
Karthik Garimella
N. Jha
Brandon Reagen
Chinmay Hegde
Siddharth Garg
108
0
0
02 Dec 2024