C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models

15 May 2023
Yuzhen Huang
Yuzhuo Bai
Zhihao Zhu
Junlei Zhang
Jinghan Zhang
Tangjun Su
Junteng Liu
Chuancheng Lv
Yikai Zhang
Jiayi Lei
Yao Fu
Maosong Sun
Junxian He
ELM, LRM

Papers citing "C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models"

5 / 105 papers shown
CMATH: Can Your Language Model Pass Chinese Elementary School Math Test?
Tianwen Wei
Jian Luan
Wen Liu
Shuang Dong
Bin Wang
ELM
79
36
0
29 Jun 2023
CMMLU: Measuring massive multitask language understanding in Chinese
Haonan Li
Yixuan Zhang
Fajri Koto
Yifei Yang
Hai Zhao
Yeyun Gong
Nan Duan
Tim Baldwin
ALM, ELM
118
273
0
15 Jun 2023
Benchmarking Large Language Models on CMExam -- A Comprehensive Chinese Medical Exam Dataset
Junling Liu
Peilin Zhou
Yining Hua
Dading Chong
Zhongyu Tian
...
Helin Wang
Chenyu You
Zhenhua Guo
Lei Zhu
Michael Lingzhi Li
LM&MA, ELM
111
79
0
05 Jun 2023
Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance
Yao Fu
Litu Ou
Mingyu Chen
Yuhao Wan
Hao-Chun Peng
Tushar Khot
LLMAG, ELM, LRM, ReLM
80
115
0
26 May 2023
Have LLMs Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models
Daman Arora
H. Singh
Mausam
ELM, LRM
134
55
0
24 May 2023