== Clip ==
Like the [[Variational Autoencoder, VAE|variational autoencoder (VAE)]], the vision model CLIP (contrastive language-image pre-training, cf. Radford et al. 2021), first released in 2021 by OpenAI, is largely unknown to the general public. Like the VAE, it is used in the image generation pipeline as a component that encodes input into latents. Its role is to transform input into embeddings, representations that can be operated upon in the [[Latent space|latent space]].
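
How CLIP turns input into such embeddings can be illustrated with a minimal sketch. It assumes the Hugging Face transformers library, the openai/clip-vit-base-patch32 checkpoint, and a local image file name, none of which are specified above.

<syntaxhighlight lang="python">
# Minimal sketch: CLIP maps text and images into a shared embedding space,
# where they can be compared or handed to other pipeline components.
# Assumes Hugging Face "transformers", torch, and Pillow are installed;
# the checkpoint and file name below are illustrative assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("example.jpg")  # any local image (hypothetical file name)
texts = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)

with torch.no_grad():
    # Project text and image into the shared latent (embedding) space.
    text_emb = model.get_text_features(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
    )
    image_emb = model.get_image_features(pixel_values=inputs["pixel_values"])

# The embeddings can now be operated upon directly,
# e.g. compared by cosine similarity.
text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)
print(image_emb @ text_emb.T)  # similarity of the image to each caption
</syntaxhighlight>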
[[Category:Objects of Interest and Necessity]]
