ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2102.01818
  4. Cited By
Memorization vs. Generalization: Quantifying Data Leakage in NLP
  Performance Evaluation

Memorization vs. Generalization: Quantifying Data Leakage in NLP Performance Evaluation

3 February 2021
Aparna Elangovan
Jiayuan He
Karin Verspoor
    TDI
    FedML
ArXivPDFHTML

Papers citing "Memorization vs. Generalization: Quantifying Data Leakage in NLP Performance Evaluation"

17 / 17 papers shown
Title
Beyond Release: Access Considerations for Generative AI Systems
Beyond Release: Access Considerations for Generative AI Systems
Irene Solaiman
Rishi Bommasani
Dan Hendrycks
Ariel Herbert-Voss
Yacine Jernite
Aviya Skowron
Andrew Trask
60
1
0
23 Feb 2025
Improving Model Evaluation using SMART Filtering of Benchmark Datasets
Improving Model Evaluation using SMART Filtering of Benchmark Datasets
Vipul Gupta
Candace Ross
David Pantoja
R. Passonneau
Megan Ung
Adina Williams
70
1
0
26 Oct 2024
Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions
Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions
Yujuan Fu
Özlem Uzuner
Meliha Yetisgen
Fei Xia
59
3
0
24 Oct 2024
Undesirable Memorization in Large Language Models: A Survey
Undesirable Memorization in Large Language Models: A Survey
Ali Satvaty
Suzan Verberne
Fatih Turkmen
ELM
PILM
71
7
0
03 Oct 2024
Reasoning Paths with Reference Objects Elicit Quantitative Spatial
  Reasoning in Large Vision-Language Models
Reasoning Paths with Reference Objects Elicit Quantitative Spatial Reasoning in Large Vision-Language Models
Yuan-Hong Liao
Rafid Mahmood
Sanja Fidler
David Acuna
ReLM
LRM
26
9
0
15 Sep 2024
Towards Generalized Offensive Language Identification
Towards Generalized Offensive Language Identification
A. Dmonte
Tejas Arya
Tharindu Ranasinghe
Marcos Zampieri
44
3
0
26 Jul 2024
On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards
On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards
Zhimin Zhao
A. A. Bangash
F. Côgo
Bram Adams
Ahmed E. Hassan
54
0
0
04 Jul 2024
Benchmark Data Contamination of Large Language Models: A Survey
Benchmark Data Contamination of Large Language Models: A Survey
Cheng Xu
Shuhao Guan
Derek Greene
Mohand-Tahar Kechadi
ELM
ALM
38
38
0
06 Jun 2024
When does compositional structure yield compositional generalization? A kernel theory
When does compositional structure yield compositional generalization? A kernel theory
Samuel Lippl
Kim Stachenfeld
NAI
CoGe
73
5
0
26 May 2024
Latent Feature-based Data Splits to Improve Generalisation Evaluation: A
  Hate Speech Detection Case Study
Latent Feature-based Data Splits to Improve Generalisation Evaluation: A Hate Speech Detection Case Study
Maike Zufle
Verna Dankers
Ivan Titov
31
0
0
16 Nov 2023
On Evaluation of Document Classification using RVL-CDIP
On Evaluation of Document Classification using RVL-CDIP
Stefan Larson
Gordon Lim
Kevin Leach
26
3
0
21 Jun 2023
Multilingual Machine Translation with Large Language Models: Empirical
  Results and Analysis
Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis
Wenhao Zhu
Hongyi Liu
Qingxiu Dong
Jingjing Xu
Shujian Huang
Lingpeng Kong
Jiajun Chen
Lei Li
LRM
29
139
0
10 Apr 2023
State-of-the-art generalisation research in NLP: A taxonomy and review
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
114
93
0
06 Oct 2022
How not to Lie with a Benchmark: Rearranging NLP Leaderboards
How not to Lie with a Benchmark: Rearranging NLP Leaderboards
Tatiana Shavrina
Valentin Malykh
ALM
ELM
416
10
0
02 Dec 2021
Cut the CARP: Fishing for zero-shot story evaluation
Cut the CARP: Fishing for zero-shot story evaluation
Shahbuland Matiana
J. Smith
Ryan Teehan
Louis Castricato
Stella Biderman
Leo Gao
Spencer Frazier
47
16
0
06 Oct 2021
A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots
  Matters
A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters
Mengjie Zhao
Yi Zhu
Ehsan Shareghi
Ivan Vulić
Roi Reichart
Anna Korhonen
Hinrich Schütze
19
64
0
31 Dec 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,956
0
20 Apr 2018
1