
On Provable Copyright Protection for Generative Models

Abstract

There is a growing concern that learned conditional generative models may output samples that are substantially similar to some copyrighted data $C$ that was in their training set. We give a formal definition of \textit{near access-freeness (NAF)} and prove bounds on the probability that a model satisfying this definition outputs a sample similar to $C$, even if $C$ is included in its training set. Roughly speaking, a generative model $p$ is \textit{$k$-NAF} if for every potentially copyrighted data $C$, the output of $p$ diverges by at most $k$ bits from the output of a model $q$ that \textit{did not access $C$ at all}. We also give generative model learning algorithms that efficiently modify the original learning algorithm in a black-box manner and output generative models with strong bounds on the probability of sampling protected content. Furthermore, we provide promising experiments for both language (transformers) and image (diffusion) generative models, showing minimal degradation in output quality while ensuring strong protections against sampling protected content.
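One way to read the informal definition above is the following sketch; the particular divergence $\Delta$ is an assumption here for illustration (the abstract only states that outputs diverge by at most $k$ bits from a model that never accessed $C$), with the maximum divergence given as an example of a divergence measured in bits:

\[
p \text{ is } k\text{-NAF} \;\;\Longleftrightarrow\;\;
\Delta\bigl(p(\cdot \mid x)\,\big\|\,q_C(\cdot \mid x)\bigr) \le k
\quad \text{for every copyrighted } C \text{ and every prompt } x,
\]
\[
\text{e.g. } \Delta_{\max}\bigl(p \,\|\, q\bigr) \;=\; \log_2 \max_{y} \frac{p(y)}{q(y)},
\]

where $q_C$ denotes a "safe" model trained without any access to $C$. Under such a bound, any event (such as producing a sample substantially similar to $C$) that has negligible probability under $q_C$ can have probability at most roughly $2^{k}$ times larger under $p$.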
