Meta Semantic Template for Evaluation of Large Language Models

arXiv: 2310.01448
1 October 2023
Yachuan Liu
Liang Chen
Jindong Wang
Qiaozhu Mei
Xing Xie

Papers citing "Meta Semantic Template for Evaluation of Large Language Models"

10 / 10 papers shown
Evaluating Language Models for Mathematics through Interactions
Katherine M. Collins
Albert Q. Jiang
Simon Frieder
L. Wong
Miri Zilka
...
William Hart
T. Gowers
Wen-Ding Li
Adrian Weller
M. Jamnik
70
60
0
02 Jun 2023
Towards Robust Personalized Dialogue Generation via Order-Insensitive Representation Regularization
Liang Chen
Hongru Wang
Yang Deng
Wai-Chung Kwan
Zezhong Wang
Kam-Fai Wong
46
15
0
22 May 2023
On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective
Jindong Wang
Xixu Hu
Wenxin Hou
Hao Chen
Runkai Zheng
...
Wei Ye
Xiubo Geng
Binxing Jiao
Yue Zhang
Xing Xie
AI4MH
101
233
0
22 Feb 2023
GLUE-X: Evaluating Natural Language Understanding Models from an Out-of-distribution Generalization Perspective
Linyi Yang
Shuibai Zhang
Libo Qin
Yafu Li
Yidong Wang
Hanmeng Liu
Jindong Wang
Xing Xie
Yue Zhang
ELM
91
81
0
15 Nov 2022
Quantifying Memorization Across Neural Language Models
Nicholas Carlini
Daphne Ippolito
Matthew Jagielski
Katherine Lee
Florian Tramèr
Chiyuan Zhang
PILM
100
614
0
15 Feb 2022
Red Teaming Language Models with Language Models
Ethan Perez
Saffron Huang
Francis Song
Trevor Cai
Roman Ring
John Aslanides
Amelia Glaese
Nat McAleese
G. Irving
AAML
131
645
0
07 Feb 2022
Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei
Maarten Bosma
Vincent Zhao
Kelvin Guu
Adams Wei Yu
Brian Lester
Nan Du
Andrew M. Dai
Quoc V. Le
ALM
UQCV
116
3,723
0
03 Sep 2021
Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking
Zhiyi Ma
Kawin Ethayarajh
Tristan Thrush
Somya Jain
Ledell Yu Wu
Robin Jia
Christopher Potts
Adina Williams
Douwe Kiela
ELM
72
57
0
21 May 2021
Dynabench: Rethinking Benchmarking in NLP
Douwe Kiela
Max Bartolo
Yixin Nie
Divyansh Kaushik
Atticus Geiger
...
Pontus Stenetorp
Robin Jia
Joey Tianyi Zhou
Christopher Potts
Adina Williams
180
405
0
07 Apr 2021
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
Marco Tulio Ribeiro
Tongshuang Wu
Carlos Guestrin
Sameer Singh
ELM
190
1,100
0
08 May 2020