ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2306.09212
  4. Cited By
CMMLU: Measuring massive multitask language understanding in Chinese
v1v2 (latest)

CMMLU: Measuring massive multitask language understanding in Chinese

15 June 2023
Haonan Li
Yixuan Zhang
Fajri Koto
Yifei Yang
Hai Zhao
Yeyun Gong
Nan Duan
Tim Baldwin
    ALMELM
ArXiv (abs)PDFHTML

Papers citing "CMMLU: Measuring massive multitask language understanding in Chinese"

50 / 66 papers shown
Title
WiNGPT-3.0 Technical Report
Boqin Zhuang
Chenxiao Song
Huitong Lu
Jiacheng Qiao
Mingqian Liu
...
Xiaoxia Song
Xiangjun Xu
X. Chen
Yaoyao Ma
Y. Gao
LLMAGLM&MALRMAI4MHELM
70
0
0
23 May 2025
KaFT: Knowledge-aware Fine-tuning for Boosting LLMs' Domain-specific Question-Answering Performance
KaFT: Knowledge-aware Fine-tuning for Boosting LLMs' Domain-specific Question-Answering Performance
Qihuang Zhong
Liang Ding
Xiantao Cai
Juhua Liu
Bo Du
Dacheng Tao
82
0
0
21 May 2025
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought
Tencent Hunyuan Team
Ao Liu
Botong Zhou
Can Xu
Chayse Zhou
...
Bingxin Qu
Bolin Ni
Boyu Wu
Chen Li
Cheng-peng Jiang
MoELRMAI4CE
142
0
0
21 May 2025
Enhancing LLMs via High-Knowledge Data Selection
Enhancing LLMs via High-Knowledge Data Selection
Feiyu Duan
Xuemiao Zhang
Sirui Wang
Haoran Que
Yuqi Liu
Wenge Rong
Xunliang Cai
203
0
0
20 May 2025
ReplaceMe: Network Simplification via Depth Pruning and Transformer Block Linearization
ReplaceMe: Network Simplification via Depth Pruning and Transformer Block Linearization
Dmitriy Shopkhoev
Ammar Ali
Magauiya Zhussip
Valentin Malykh
Stamatios Lefkimmiatis
N. Komodakis
Sergey Zagoruyko
VLM
453
0
0
05 May 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu
Weiyun Wang
Zhe Chen
Ziwei Liu
Shenglong Ye
...
Dahua Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
Wei Wang
MLLMVLM
197
130
1
14 Apr 2025
Efficient Evaluation of Large Language Models via Collaborative Filtering
Efficient Evaluation of Large Language Models via Collaborative Filtering
Xu-Xiang Zhong
Chao Yi
Han-Jia Ye
100
0
0
05 Apr 2025
Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training
Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training
Zhijun Wang
Jiahuan Li
Hao Zhou
Rongxiang Weng
Jiadong Wang
Xin Huang
Xue Han
Junlan Feng
Chao Deng
Shujian Huang
LRM
106
3
0
02 Apr 2025
TLUE: A Tibetan Language Understanding Evaluation Benchmark
TLUE: A Tibetan Language Understanding Evaluation Benchmark
Fan Gao
Cheng Huang
Nyima Tashi
Xiangxiang Wang
Thupten Tsering
...
Gadeng Luosang
Rinchen Dongrub
Dorje Tashi
Xiao Feng
Yongbin Yu
ELM
220
2
0
15 Mar 2025
Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs
Ling Team
B. Zeng
Chenyu Huang
Chao Zhang
Changxin Tian
...
Zhaoxin Huan
Zujie Wen
Zhenhang Sun
Zhuoxuan Du
Z. He
MoEALM
172
5
0
07 Mar 2025
MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models
MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models
Zihao Wei
Jingcheng Deng
Liang Pang
Hanxing Ding
Huawei Shen
Xueqi Cheng
KELM
135
7
0
20 Feb 2025
Improve LLM-as-a-Judge Ability as a General Ability
Improve LLM-as-a-Judge Ability as a General Ability
Jiachen Yu
Shaoning Sun
Xiaohui Hu
Jiaxu Yan
Kaidong Yu
Xuelong Li
ELM
146
7
0
17 Feb 2025
Superpose Singular Features for Model Merging
Superpose Singular Features for Model Merging
Haiquan Qiu
You Wu
Quanming Yao
MoMe
147
0
0
15 Feb 2025
Large Language Diffusion Models
Large Language Diffusion Models
Shen Nie
Fengqi Zhu
Zebin You
Xiaolu Zhang
Jingyang Ou
Jun Hu
Jun Zhou
Yankai Lin
Ji-Rong Wen
Chongxuan Li
233
54
0
14 Feb 2025
Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning
Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning
Yibo Yan
Shen Wang
Jiahao Huo
Jingheng Ye
Zhendong Chu
Xuming Hu
Philip S. Yu
Carla P. Gomes
B. Selman
Qingsong Wen
LRM
211
17
0
05 Feb 2025
UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models
UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models
Xin Xu
Qiyun Xu
Tong Xiao
Tianhao Chen
Yuchen Yan
Jiaxin Zhang
Shizhe Diao
Can Yang
Yang Wang
LRMAI4CEELM
215
8
0
01 Feb 2025
Activating Distributed Visual Region within LLMs for Efficient and Effective Vision-Language Training and Inference
Activating Distributed Visual Region within LLMs for Efficient and Effective Vision-Language Training and Inference
Siyuan Wang
Dianyi Wang
Chengxing Zhou
Zejun Li
Zhihao Fan
Xuanjing Huang
Zhongyu Wei
VLM
480
0
0
17 Dec 2024
LLM-NEO: Parameter Efficient Knowledge Distillation for Large Language Models
LLM-NEO: Parameter Efficient Knowledge Distillation for Large Language Models
Runming Yang
Taiqiang Wu
Jiahao Wang
Pengfei Hu
Ngai Wong
Yujiu Yang
Yujiu Yang
427
1
0
11 Nov 2024
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts
Guorui Zheng
Xidong Wang
Juhao Liang
Nuo Chen
Yuping Zheng
Benyou Wang
MoE
120
5
0
14 Oct 2024
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
Gen Luo
Xue Yang
Wenhan Dou
Zhaokai Wang
Jifeng Dai
Jifeng Dai
Yu Qiao
Xizhou Zhu
VLMMLLM
138
33
0
10 Oct 2024
Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
Yuxuan Yao
Han Wu
Mingyang Liu
Sichun Luo
Xiongwei Han
Jie Liu
Zhijiang Guo
Linqi Song
96
7
0
03 Oct 2024
Cracking the Code: Multi-domain LLM Evaluation on Real-World Professional Exams in Indonesia
Cracking the Code: Multi-domain LLM Evaluation on Real-World Professional Exams in Indonesia
Fajri Koto
ELM
131
3
0
13 Sep 2024
Training on the Benchmark Is Not All You Need
Training on the Benchmark Is Not All You Need
Shiwen Ni
Xiangtao Kong
Chengming Li
Xiping Hu
Ruifeng Xu
Jia Zhu
Min Yang
116
6
0
03 Sep 2024
An Investigation of Warning Erroneous Chat Translations in Cross-lingual
  Communication
An Investigation of Warning Erroneous Chat Translations in Cross-lingual Communication
Yunmeng Li
Jun Suzuki
Makoto Morishita
Kaori Abe
Kentaro Inui
119
1
0
28 Aug 2024
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
Seungone Kim
Juyoung Suk
Ji Yong Cho
Shayne Longpre
Chaeeun Kim
...
Sean Welleck
Graham Neubig
Moontae Lee
Kyungjae Lee
Minjoon Seo
ELMALMLM&MA
183
44
0
09 Jun 2024
Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art
Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art
Chen Cecilia Liu
Iryna Gurevych
Anna Korhonen
137
6
0
06 Jun 2024
M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models
M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models
Hongyu Wang
Jiayu Xu
Senwei Xie
Ruiping Wang
Jialin Li
Zhaojie Xie
Bin Zhang
Chuyan Xiong
Xilin Chen
ELMVLMLRM
133
6
0
24 May 2024
Yi: Open Foundation Models by 01.AI
Yi: Open Foundation Models by 01.AI
01. AI
Alex Young
01.AI Alex Young
Bei Chen
Chao Li
...
Yue Wang
Yuxuan Cai
Zhenyu Gu
Zhiyuan Liu
Zonghong Dai
OSLMLRM
274
571
0
07 Mar 2024
Can Large Language Model Comprehend Ancient Chinese? A Preliminary Test
  on ACLUE
Can Large Language Model Comprehend Ancient Chinese? A Preliminary Test on ACLUE
Yixuan Zhang
Haonan Li
LRMELM
30
11
0
14 Oct 2023
Baichuan 2: Open Large-scale Language Models
Baichuan 2: Open Large-scale Language Models
Ai Ming Yang
Bin Xiao
Bingning Wang
Borong Zhang
Ce Bian
...
Youxin Jiang
Yuchen Gao
Yupeng Zhang
Guosheng Dong
Zhiying Wu
ELMLRM
209
748
0
19 Sep 2023
SuperCLUE: A Comprehensive Chinese Large Language Model Benchmark
SuperCLUE: A Comprehensive Chinese Large Language Model Benchmark
Liang Xu
Anqi Li
Lei Zhu
Han Xue
Changtai Zhu
Kangkang Zhao
Hao He
Xuanwei Zhang
Qiyue Kang
Zhenzhong Lan
RALMELMLRM
52
55
0
27 Jul 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MHALM
396
12,044
0
18 Jul 2023
BatGPT: A Bidirectional Autoregessive Talker from Generative Pre-trained
  Transformer
BatGPT: A Bidirectional Autoregessive Talker from Generative Pre-trained Transformer
Z. Li
Shitou Zhang
Hai Zhao
Yifei Yang
Dongjie Yang
LM&MA
101
15
0
01 Jul 2023
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora
  with Web Data, and Web Data Only
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Guilherme Penedo
Quentin Malartic
Daniel Hesslow
Ruxandra-Aimée Cojocaru
Alessandro Cappelli
Hamza Alobeidli
B. Pannier
Ebtesam Almazrouei
Julien Launay
123
775
0
01 Jun 2023
Bactrian-X: Multilingual Replicable Instruction-Following Models with
  Low-Rank Adaptation
Bactrian-X: Multilingual Replicable Instruction-Following Models with Low-Rank Adaptation
Haonan Li
Fajri Koto
Minghao Wu
Alham Fikri Aji
Timothy Baldwin
ALM
64
75
0
24 May 2023
M3KE: A Massive Multi-Level Multi-Subject Knowledge Evaluation Benchmark
  for Chinese Large Language Models
M3KE: A Massive Multi-Level Multi-Subject Knowledge Evaluation Benchmark for Chinese Large Language Models
Chuang Liu
Renren Jin
Yuqi Ren
Linhao Yu
Tianyu Dong
...
Peiyi Zhang
Qingqing Lyu
Xiaowen Su
Qun Liu
Deyi Xiong
ELMALM
103
26
0
17 May 2023
C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for
  Foundation Models
C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models
Yuzhen Huang
Yuzhuo Bai
Zhihao Zhu
Junlei Zhang
Jinghan Zhang
...
Yikai Zhang
Jiayi Lei
Yao Fu
Maosong Sun
Junxian He
ELMLRM
100
549
0
15 May 2023
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale
  Instructions
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
Minghao Wu
Abdul Waheed
Chiyu Zhang
Muhammad Abdul-Mageed
Alham Fikri Aji
ALM
172
127
0
27 Apr 2023
Measuring Massive Multitask Chinese Understanding
Measuring Massive Multitask Chinese Understanding
Hui Zeng
ALMELMAILaw
75
29
0
25 Apr 2023
Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca
Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca
Yiming Cui
Ziqing Yang
Xin Yao
ALM
74
315
0
17 Apr 2023
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models
Wanjun Zhong
Ruixiang Cui
Yiduo Guo
Yaobo Liang
Shuai Lu
Yanlin Wang
Amin Saied
Weizhu Chen
Nan Duan
ALMELM
107
548
0
13 Apr 2023
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
Yang Liu
Dan Iter
Yichong Xu
Shuohang Wang
Ruochen Xu
Chenguang Zhu
ELMALMLM&MA
185
1,208
0
29 Mar 2023
GPT-4 Technical Report
GPT-4 Technical Report
OpenAI OpenAI
OpenAI Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAGMLLM
1.5K
14,748
0
15 Mar 2023
LLaMA: Open and Efficient Foundation Language Models
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALMPILM
1.5K
13,437
0
27 Feb 2023
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
404
2,394
0
09 Nov 2022
GLM-130B: An Open Bilingual Pre-trained Model
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng Zhang
Yuxiao Dong
Jie Tang
BDLLRM
358
1,094
0
05 Oct 2022
News Summarization and Evaluation in the Era of GPT-3
News Summarization and Evaluation in the Era of GPT-3
Tanya Goyal
Junyi Jessy Li
Greg Durrett
ELM
110
409
0
26 Sep 2022
Beyond the Imitation Game: Quantifying and extrapolating the
  capabilities of language models
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Aarohi Srivastava
Abhinav Rastogi
Abhishek Rao
Abu Awal Md Shoeb
Abubakar Abid
...
Zhuoye Zhao
Zijian Wang
Zijie J. Wang
Zirui Wang
Ziyi Wu
ELM
211
1,775
0
09 Jun 2022
OPT: Open Pre-trained Transformer Language Models
OPT: Open Pre-trained Transformer Language Models
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
...
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLMOSLMAI4CE
362
3,695
0
02 May 2022
Training Verifiers to Solve Math Word Problems
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLMOffRLLRM
345
4,569
0
27 Oct 2021
12
Next