No, these models don’t contain exact copies of the texts or images they were trained on. Instead, they generate new texts or images based on the statistical patterns and relationships they learned from the training data. The models create mathematical representations of these patterns, enabling them to produce novel content.
In rare instances, a generated text or image may closely resemble something from the training data. This is not a deliberate feature but rather an unintended consequence of the model’s learning process. Researchers are actively working on techniques to minimize such verbatim copying and ensure the models generate original content derived from their learned patterns.
Adapted from "FAQs about generative AI" by Nicole Hennig, University of Arizona Libraries. Licensed under CC BY 4.0.