Does a machine learning algorithm copy the data it learns from?


I am not a programmer but a law student, and I am currently researching a project involving artificial intelligence and copyright law. Specifically, I am looking at whether the learning process of a machine learning algorithm may constitute copyright infringement when a protected work is used as training data. This turns on whether or not the algorithm copies the work or does something else.

Can anyone tell me whether machine learning algorithms typically copy the data (pictures/text/video/etc.) they analyse (even if only briefly), or whether they can obtain the required information through other methods that do not involve copying (akin to a human looking at a stop sign and recognising it as a stop sign without necessarily copying the image)?

Apologies for my lack of knowledge, and I’m sorry if any of my explanation flies in the face of established machine learning knowledge. As I said, I am merely a lowly law student.

Thanks in advance!


It really depends on what features you use to train the algorithm. For example, if you are using an algorithm to classify letters of an alphabet, you may be inputting the intensity values of the pixels, which could easily be visualised to see what letter is being classified. In that case, you could say the letter is “copied”. However, other cases use more abstract features that cannot be used to directly recompose the original form of the data; with neurological data, this is more often than not the case. Hope this helps!
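To make the pixel-intensity case concrete, here is a minimal sketch (the tiny 5×5 “letter” is my own toy example, not from any real dataset): if the features fed to the classifier are raw pixel values, simply reshaping the feature vector reconstructs the input exactly.

```python
import numpy as np

# A hypothetical 5x5 binary image of the letter "L", the kind of
# input a simple letter classifier might receive.
letter_L = np.array([
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
])

# Pixel-intensity features: the image flattened into a vector.
features = letter_L.flatten()

# Reshaping the feature vector recovers the original image exactly;
# in this sense, the input really has been "copied".
reconstructed = features.reshape(5, 5)
print(np.array_equal(reconstructed, letter_L))  # True
```

With more abstract features (averages, edge statistics, learned embeddings), this round trip is no longer possible, which is the distinction the paragraph above is drawing.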


Hi Schnise,

An ML algorithm does not typically copy the data it learns from.
Instead, it gleans information from the data set to build and store an abstraction relevant to minimising its loss function.

You can compare it to a student learning from a book: the student does not retain the exact text of the book, only the abstract concepts that will be on the exam.
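As a rough illustration of that point (a toy sketch with made-up data, not any particular real model): after training, what a simple model stores is just its fitted parameters, which can be vastly smaller than the training set and from which the original examples cannot be recovered.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1,000 noisy training examples of a linear relationship y ~ 3x + 1.
x = rng.uniform(0, 10, size=1000)
y = 3.0 * x + 1.0 + rng.normal(scale=0.5, size=1000)

# "Training" a linear model = minimising squared-error loss.
# Everything the model retains afterwards is two numbers.
slope, intercept = np.polyfit(x, y, deg=1)

print(f"training data: {x.size + y.size} numbers")
print(f"stored model:  2 numbers ({slope:.2f}, {intercept:.2f})")
# The 2,000 original numbers cannot be reconstructed from those 2.
```

Of course, very large models blur this picture, which is where the caveats below come in.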

Now, there are some caveats to this picture. Most notable in this context are model inversion attacks, but that gets a bit more technical.
Hope this helps.


This is a very interesting question, and I would like to hear your conclusions 🙂 A couple more thoughts to complement what my colleagues said.

First, some machine learning models definitely do copy their input data verbatim. That’s literally how k-NN classifiers work. You can check these lecture notes for technical details on k-NN and classification, but here is a high-level summary. Imagine an AI model trained to diagnose, say, cancer from SPECT images. When presented with a new case (a new image), it looks for the, say, k=10 cases seen during training that are closest to this new case. If the majority of those 10 images were benign → predict no cancer. If the majority were cancerous → predict cancer. So, at least for that particular type of model, sharing the model also means sharing a copy of all of the images seen during training…
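A minimal from-scratch sketch of this (the class and the toy 2-D data are my own invention, standing in for real images): notice that “training” is nothing more than storing a literal copy of every example.

```python
import numpy as np

class KNNClassifier:
    """Minimal k-NN sketch: 'training' is just storing the data."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # A verbatim copy of every training example is retained.
        self.X_train = np.asarray(X, dtype=float).copy()
        self.y_train = np.asarray(y).copy()
        return self

    def predict_one(self, x):
        # Distance from the query to every stored example.
        dists = np.linalg.norm(self.X_train - np.asarray(x, dtype=float), axis=1)
        nearest = self.y_train[np.argsort(dists)[: self.k]]
        # Majority vote among the k nearest neighbours.
        labels, counts = np.unique(nearest, return_counts=True)
        return labels[np.argmax(counts)]

# Toy data: two clusters standing in for "benign" (0) and "cancerous" (1).
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 1, 1, 1]
model = KNNClassifier(k=3).fit(X, y)
print(model.predict_one([0.5, 0.5]))  # → 0
print(model.predict_one([5.5, 5.5]))  # → 1

# Distributing this model necessarily distributes X verbatim:
print(np.array_equal(model.X_train, np.asarray(X, dtype=float)))  # True
```

The last line is the legally interesting part: the training data is not an incidental by-product here, it *is* the model.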

Now, k-NN really is a super basic algorithm, and one might think that modern deep learning models extract much more abstract representations of the input. Very much like the stop sign being represented in the brain as the “semantics” of stopping, rather than a literal pixel-to-pixel view of the stop sign (although early retinal activity, and early artificial-layer representations, pretty much are pixel-to-pixel mappings…). But it is clear that even modern deep learning architectures may store a verbatim representation of their training data, at least to some degree. Check for example this article on GPT-3, one of the most advanced and biggest deep language models: What happens when your massive text-generating neural net starts spitting out people's phone numbers? If you're OpenAI, you create a filter • The Register
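You can see how a language model regurgitates training text with an extremely simplified sketch (a toy bigram model trained on one made-up sentence, not how GPT-3 actually works): when a sequence appears only once in the training data, generation can reproduce it word for word.

```python
import random
from collections import defaultdict

# "Train" a toy bigram language model on a single document.
# The phone number is fictional, echoing the article above.
text = "my phone number is 555 0100".split()
model = defaultdict(list)
for a, b in zip(text, text[1:]):
    model[a].append(b)

# Generate from the model, starting at the first word. Because each
# word here has exactly one recorded successor, the model can only
# replay its training data verbatim.
word, output = text[0], [text[0]]
while word in model:
    word = random.choice(model[word])
    output.append(word)

print(" ".join(output))  # the training sentence, word for word
```

Real language models mix billions of such statistics, so memorisation is partial and probabilistic rather than guaranteed, but the mechanism is the same in spirit.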

So now, do all machine learning models store a copy of their input data? I don’t think so. To understand the representations learned by vision models, I would recommend this fantastic article by Chris Olah and colleagues, published in Distill. It looks like vision models build a progressively more abstract representation of input images. And, at least at the higher levels, these representations do not map cleanly to a normal image, but rather to freaky LSD hallucinations. And maybe, just maybe, to a concept like that of a stop sign, although such a semantic representation would be more likely to emerge in models trained on highly multimodal inputs, such as CLIP. This type of model feels like “fair use” of the training data to me: it is much more like an artistic derivation of the original work, with massive transformation and a mixing and matching of many influences into one.

I would like to conclude by saying that interpreting the features in deep learning models is not straightforward, and there is no obvious way to draw a line between verbatim copies and what I called LSD hallucinations. Which brings me back to my opener: I would love to hear your conclusions.

I hope this helps,