ResearchTrend.AI
Evaluatology: The Science and Engineering of Evaluation

19 March 2024
Jianfeng Zhan
Lei Wang
Wanling Gao
Hongxiao Li
Chenxi Wang
Yunyou Huang
Yatao Li
Zhengxin Yang
Guoxin Kang
Chunjie Luo
Hainan Ye
Shaopeng Dai
Zhifei Zhang

Papers citing "Evaluatology: The Science and Engineering of Evaluation"

5 / 5 papers shown
am-ELO: A Stable Framework for Arena-based LLM Evaluation
Zirui Liu
Jiatong Li
Yan Zhuang
Qiang Liu
Shuanghong Shen
Jie Ouyang
Mingyue Cheng
Shijin Wang
46
1
0
06 May 2025
A Computational Theory for Efficient Mini Agent Evaluation with Causal Guarantees
Hedong Yan
41
0
0
27 Mar 2025
Establishing Rigorous and Cost-effective Clinical Trials for Artificial Intelligence Models
Wanling Gao
Yunyou Huang
Dandan Cui
Zhuoming Yu
Wenjing Liu
...
Tianyi Wei
Suqin Tang
Bingjie Xia
Zhifei Zhang
Jianfeng Zhan
32
0
0
11 Jul 2024
AI.vs.Clinician: Unveiling Intricate Interactions Between AI and Clinicians through an Open-Access Database
Wanling Gao
Yuan Liu
Zhuoming Yu
Dandan Cui
Wenjing Liu
...
Chongrong Jiang
Tianyi Wei
Zhifei Zhang
Yunyou Huang
Jianfeng Zhan
22
2
0
11 Jun 2024
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
307
39,238
0
01 Sep 2014