ResearchTrend.AI
How not to Lie with a Benchmark: Rearranging NLP Leaderboards


2 December 2021
Tatiana Shavrina
Valentin Malykh
    ALM
    ELM

Papers citing "How not to Lie with a Benchmark: Rearranging NLP Leaderboards"

8 / 8 papers shown
LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain
Joel Niklaus
Veton Matoshi
Pooja Rani
Andrea Galassi
Matthias Stürmer
Ilias Chalkidis
ELM
AILaw
19
54
0
30 Jan 2023
Processing Long Legal Documents with Pre-trained Transformers: Modding LegalBERT and Longformer
Dimitris Mamakas
Petros Tsotsi
Ion Androutsopoulos
Ilias Chalkidis
VLM
AILaw
21
27
0
02 Nov 2022
Vote'n'Rank: Revision of Benchmarking with Social Choice Theory
Mark Rofin
Vladislav Mikhailov
Mikhail Florinskiy
A. Kravchenko
E. Tutubalina
Tatiana Shavrina
Daniel Karabekyan
Ekaterina Artemova
24
8
0
11 Oct 2022
Automatic Rule Induction for Interpretable Semi-Supervised Learning
Reid Pryzant
Ziyi Yang
Yichong Xu
Chenguang Zhu
Michael Zeng
28
9
0
18 May 2022
Slovene SuperGLUE Benchmark: Translation and Evaluation
Aleš Žagar
Marko Robnik-Šikonja
17
10
0
10 Feb 2022
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English
Ilias Chalkidis
Abhik Jana
D. Hartung
M. Bommarito
Ion Androutsopoulos
Daniel Martin Katz
Nikolaos Aletras
AILaw
ELM
130
248
0
03 Oct 2021
Memorization vs. Generalization: Quantifying Data Leakage in NLP Performance Evaluation
Aparna Elangovan
Jiayuan He
Karin Verspoor
TDI
FedML
167
89
0
03 Feb 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,956
0
20 Apr 2018