ResearchTrend.AI

Positional Fragility in LLMs: How Offset Effects Reshape Our Understanding of Memorization Risks (arXiv:2505.13171)

19 May 2025
Yixuan Xu
Antoni-Joan Solergibert i Llaquet
Antoine Bosselut
Imanol Schlag

Papers citing "Positional Fragility in LLMs: How Offset Effects Reshape Our Understanding of Memorization Risks"

22 papers shown
Exploring Memorization and Copyright Violation in Frontier LLMs: A Study of the New York Times v. OpenAI 2023 Lawsuit
Joshua Freeman
Chloe Rippe
Edoardo Debenedetti
Maksym Andriushchenko
ELM
86
8
0
09 Dec 2024
Demystifying Verbatim Memorization in Large Language Models
Jing Huang
Diyi Yang
Christopher Potts
ELM
PILM
MU
84
25
0
25 Jul 2024
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Guilherme Penedo
Hynek Kydlíček
Loubna Ben Allal
Anton Lozhkov
Margaret Mitchell
Colin Raffel
Leandro von Werra
Thomas Wolf
89
223
0
25 Jun 2024
Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
Abhimanyu Hans
Yuxin Wen
Neel Jain
John Kirchenbauer
Hamid Kazemi
...
Siddharth Singh
Gowthami Somepalli
Jonas Geiping
A. Bhatele
Tom Goldstein
76
35
0
14 Jun 2024
Grammar-Aligned Decoding
Kanghee Park
Jiayu Wang
Taylor Berg-Kirkpatrick
Nadia Polikarpova
Loris D'Antoni
64
12
0
31 May 2024
Copyright Violations and Large Language Models
Antonia Karamolegkou
Jiaang Li
Li Zhou
Anders Søgaard
39
60
0
20 Oct 2023
Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective
Huayang Li
Tian Lan
Z. Fu
Deng Cai
Lemao Liu
Nigel Collier
Taro Watanabe
Yixuan Su
58
16
0
16 Oct 2023
Efficient Streaming Language Models with Attention Sinks
Guangxuan Xiao
Yuandong Tian
Beidi Chen
Song Han
Mike Lewis
AI4TS
RALM
79
705
0
29 Sep 2023
Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4
Kent K. Chang
Mackenzie Cramer
Sandeep Soni
David Bamman
RALM
182
118
0
28 Apr 2023
Emergent and Predictable Memorization in Large Language Models
Stella Biderman
USVSN Sai Prashanth
Lintang Sutawika
Hailey Schoelkopf
Quentin G. Anthony
Shivanshu Purohit
Edward Raff
59
122
0
21 Apr 2023
MAUVE Scores for Generative Models: Theory and Practice
Krishna Pillutla
Lang Liu
John Thickstun
Sean Welleck
Swabha Swayamdipta
Rowan Zellers
Sewoong Oh
Yejin Choi
Zaïd Harchaoui
EGVM
76
22
0
30 Dec 2022
Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy
Daphne Ippolito
Florian Tramèr
Milad Nasr
Chiyuan Zhang
Matthew Jagielski
Katherine Lee
Christopher A. Choquette-Choo
Nicholas Carlini
PILM
MU
43
60
0
31 Oct 2022
Efficient Training of Language Models to Fill in the Middle
Mohammad Bavarian
Heewoo Jun
Nikolas Tezak
John Schulman
C. McLeavey
Jerry Tworek
Mark Chen
33
192
0
28 Jul 2022
Are Large Pre-Trained Language Models Leaking Your Personal Information?
Jie Huang
Hanyin Shao
Kevin Chen-Chuan Chang
PILM
71
185
0
25 May 2022
Quantifying Memorization Across Neural Language Models
Nicholas Carlini
Daphne Ippolito
Matthew Jagielski
Katherine Lee
Florian Tramèr
Chiyuan Zhang
PILM
85
603
0
15 Feb 2022
Deduplicating Training Data Mitigates Privacy Risks in Language Models
Nikhil Kandpal
Eric Wallace
Colin Raffel
PILM
MU
80
281
0
14 Feb 2022
Large-Scale Differentially Private BERT
Rohan Anil
Badih Ghazi
Vineet Gupta
Ravi Kumar
Pasin Manurangsi
53
133
0
03 Aug 2021
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
338
611
0
14 Jul 2021
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Úlfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
418
1,868
0
14 Dec 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
293
1,861
0
17 Sep 2019
Revisiting Small Batch Training for Deep Neural Networks
Dominic Masters
Carlo Luschi
ODL
65
665
0
20 Apr 2018
Train longer, generalize better: closing the generalization gap in large batch training of neural networks
Elad Hoffer
Itay Hubara
Daniel Soudry
ODL
146
799
0
24 May 2017