Alright, I found the name of what I was thinking of that sounds similar to what they’re suggesting: generative adversarial network (GAN).
The core idea of a GAN is based on the “indirect” training through the discriminator, another neural network that can tell how “realistic” the input seems, which itself is also being updated dynamically. This means that the generator is not trained to minimize the distance to a specific image, but rather to fool the discriminator. This enables the model to learn in an unsupervised manner.
At least it’s not “Source: I am a pedophile” lol