Measuring the Measuring Tools: An Automatic Evaluation of Semantic Metrics for Text Corpora

29 November 2022

Samuel Ackerman

Ateret Anaby-Tavor

Papers citing "Measuring the Measuring Tools: An Automatic Evaluation of Semantic Metrics for Text Corpora"

Improving and Assessing the Fidelity of Large Language Models Alignment to Online Communities Minh Duc Hoang Chu Zihao He Rebecca Dorn Kristina Lerman 72 2 0 18 Aug 2024
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code Sebastian Gehrmann Abhik Bhattacharjee Abinaya Mahendiran Alex Jinpeng Wang Alexandros Papangelis ... Yacine Jernite Yi Xu Yisi Sang Yixin Liu Yufang Hou 84 38 0 22 Jun 2022
Automatic Construction of Evaluation Suites for Natural Language Generation Datasets Simon Mille Kaustubh D. Dhole Saad Mahamood Laura Perez-Beltrachini Varun Gangal Mihir Kale Emiel van Miltenburg Sebastian Gehrmann ELM 67 22 0 16 Jun 2021
MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers Krishna Pillutla Swabha Swayamdipta Rowan Zellers John Thickstun Sean Welleck Yejin Choi Zaïd Harchaoui 95 360 0 02 Feb 2021
A Survey of Evaluation Metrics Used for NLG Systems Ananya B. Sai Akash Kumar Mohankumar Mitesh M. Khapra ELM 80 236 0 27 Aug 2020
Efficient Intent Detection with Dual Sentence Encoders I. Casanueva Tadas Temčinas D. Gerz Matthew Henderson Ivan Vulić VLM 354 471 0 10 Mar 2020
Not Enough Data? Deep Learning to the Rescue! Ateret Anaby-Tavor Boaz Carmeli Esther Goldbraich Amir Kantor George Kour Segev Shlomov N. Tepper Naama Zwerdling 72 370 0 08 Nov 2019
An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction Stefan Larson Anish Mahendran Joseph Peper Christopher Clarke Andrew Lee ... Jonathan K. Kummerfeld Kevin Leach M. Laurenzano Lingjia Tang Jason Mars 105 530 0 04 Sep 2019
The Curious Case of Neural Text Degeneration Ari Holtzman Jan Buys Li Du Maxwell Forbes Yejin Choi 182 3,175 0 22 Apr 2019
Improved Precision and Recall Metric for Assessing Generative Models Tuomas Kynkaanniemi Tero Karras S. Laine J. Lehtinen Timo Aila EGVM 97 861 0 15 Apr 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Jacob Devlin Ming-Wei Chang Kenton Lee Kristina Toutanova VLM SSL SSeg 1.7K 94,770 0 11 Oct 2018
Assessing Generative Models via Precision and Recall Mehdi S. M. Sajjadi Olivier Bachem Mario Lucic Olivier Bousquet Sylvain Gelly EGVM 76 576 0 31 May 2018
Are GANs Created Equal? A Large-Scale Study Mario Lucic Karol Kurach Marcin Michalski Sylvain Gelly Olivier Bousquet EGVM 60 1,011 0 28 Nov 2017
Why We Need New Evaluation Metrics for NLG Jekaterina Novikova Ondrej Dusek Amanda Cercas Curry Verena Rieser 79 461 0 21 Jul 2017
Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders Tiancheng Zhao Ran Zhao M. Eskénazi 51 754 0 31 Mar 2017
Revisiting Classifier Two-Sample Tests David Lopez-Paz Maxime Oquab 155 402 0 20 Oct 2016