Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2308.04430
Cited By
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore
8 August 2023
Sewon Min
Suchin Gururangan
Eric Wallace
Hannaneh Hajishirzi
Noah A. Smith
Luke Zettlemoyer
AILaw
Re-assign community
ArXiv
PDF
HTML
Papers citing
"SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore"
27 / 27 papers shown
Title
Mask-based Membership Inference Attacks for Retrieval-Augmented Generation
Mingrui Liu
Sixiao Zhang
Cheng Long
AAML
85
3
0
26 Oct 2024
Undesirable Memorization in Large Language Models: A Survey
Ali Satvaty
Suzan Verberne
Fatih Turkmen
ELM
PILM
124
7
0
03 Oct 2024
Fantastic Copyrighted Beasts and How (Not) to Generate Them
Luxi He
Yangsibo Huang
Weijia Shi
Tinghao Xie
Haotian Liu
Yue Wang
Luke Zettlemoyer
Chiyuan Zhang
Danqi Chen
Peter Henderson
63
9
0
20 Jun 2024
Offset Unlearning for Large Language Models
James Y. Huang
Wenxuan Zhou
Fei Wang
Fred Morstatter
Sheng Zhang
Hoifung Poon
Muhao Chen
MU
61
14
0
17 Apr 2024
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens
Jiacheng Liu
Sewon Min
Luke Zettlemoyer
Yejin Choi
Hannaneh Hajishirzi
86
54
0
30 Jan 2024
Understanding In-Context Learning via Supportive Pretraining Data
Xiaochuang Han
Daniel Simig
Todor Mihaylov
Yulia Tsvetkov
Asli Celikyilmaz
Tianlu Wang
AIMat
75
36
0
26 Jun 2023
k
k
k
NN-Adapter: Efficient Domain Adaptation for Black-Box Language Models
Yangsibo Huang
Daogao Liu
Zexuan Zhong
Weijia Shi
Y. Lee
RALM
ALM
54
15
0
21 Feb 2023
Exploring the Benefits of Training Expert Language Models over Instruction Tuning
Joel Jang
Seungone Kim
Seonghyeon Ye
Doyoung Kim
Lajanugen Logeswaran
Moontae Lee
Kyungjae Lee
Minjoon Seo
LRM
ALM
64
79
0
07 Feb 2023
Nonparametric Masked Language Modeling
Sewon Min
Weijia Shi
M. Lewis
Xilun Chen
Wen-tau Yih
Hannaneh Hajishirzi
Luke Zettlemoyer
RALM
83
49
0
02 Dec 2022
Re-Imagen: Retrieval-Augmented Text-to-Image Generator
Wenhu Chen
Hexiang Hu
Chitwan Saharia
William W. Cohen
VLM
147
167
0
29 Sep 2022
kNN-Prompt: Nearest Neighbor Zero-Shot Inference
Weijia Shi
Julian Michael
Suchin Gururangan
Luke Zettlemoyer
RALM
VLM
51
32
0
27 May 2022
Training Language Models with Memory Augmentation
Zexuan Zhong
Tao Lei
Danqi Chen
RALM
267
131
0
25 May 2022
Lifting the Curse of Multilinguality by Pre-training Modular Transformers
Jonas Pfeiffer
Naman Goyal
Xi Lin
Xian Li
James Cross
Sebastian Riedel
Mikel Artetxe
LRM
49
143
0
12 May 2022
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
...
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
122
820
0
14 Apr 2022
Quantifying Memorization Across Neural Language Models
Nicholas Carlini
Daphne Ippolito
Matthew Jagielski
Katherine Lee
Florian Tramèr
Chiyuan Zhang
PILM
77
603
0
15 Feb 2022
Improving language models by retrieving from trillions of tokens
Sebastian Borgeaud
A. Mensch
Jordan Hoffmann
Trevor Cai
Eliza Rutherford
...
Simon Osindero
Karen Simonyan
Jack W. Rae
Erich Elsen
Laurent Sifre
KELM
RALM
160
1,069
0
08 Dec 2021
DEMix Layers: Disentangling Domains for Modular Language Modeling
Suchin Gururangan
Michael Lewis
Ari Holtzman
Noah A. Smith
Luke Zettlemoyer
KELM
MoE
71
132
0
11 Aug 2021
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
329
611
0
14 Jul 2021
Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus
Jesse Dodge
Maarten Sap
Ana Marasović
William Agnew
Gabriel Ilharco
Dirk Groeneveld
Margaret Mitchell
Matt Gardner
AILaw
65
437
0
18 Apr 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
378
2,051
0
31 Dec 2020
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
404
1,868
0
14 Dec 2020
MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer
Jonas Pfeiffer
Ivan Vulić
Iryna Gurevych
Sebastian Ruder
87
621
0
30 Apr 2020
REALM: Retrieval-Augmented Language Model Pre-Training
Kelvin Guu
Kenton Lee
Zora Tung
Panupong Pasupat
Ming-Wei Chang
RALM
86
2,050
0
10 Feb 2020
Generalization through Memorization: Nearest Neighbor Language Models
Urvashi Khandelwal
Omer Levy
Dan Jurafsky
Luke Zettlemoyer
M. Lewis
RALM
128
837
0
01 Nov 2019
Analysing Mathematical Reasoning Abilities of Neural Models
D. Saxton
Edward Grefenstette
Felix Hill
Pushmeet Kohli
LRM
123
420
0
02 Apr 2019
Understanding Black-box Predictions via Influence Functions
Pang Wei Koh
Percy Liang
TDI
136
2,854
0
14 Mar 2017
Billion-scale similarity search with GPUs
Jeff Johnson
Matthijs Douze
Hervé Jégou
170
3,696
0
28 Feb 2017
1