== Clip ==
Like the [[Variational Autoencoder, VAE|variational autoencoder (VAE)]], the vision model CLIP (contrastive language-image pre-training, cf. Radford et al. 2021), first released in 2021 by OpenAI, is largely unknown to the general public. Like the VAE, it is used in the image generation pipeline as a component that encodes input into latents. Its role is to transform input into embeddings, representations that can be operated upon in the [[Latent space|latent space]].
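
How CLIP turns input into such embeddings can be illustrated with a minimal sketch. It assumes the Hugging Face transformers library, the openai/clip-vit-base-patch32 checkpoint, and a local image file name, none of which are specified above.

<syntaxhighlight lang="python">
# Minimal sketch: CLIP maps text and images into a shared embedding space,
# where they can be compared or handed to other pipeline components.
# Assumes Hugging Face "transformers", torch, and Pillow are installed;
# the checkpoint and file name below are illustrative assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("example.jpg")  # any local image (hypothetical file name)
texts = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)

with torch.no_grad():
    # Project text and image into the shared latent (embedding) space.
    text_emb = model.get_text_features(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
    )
    image_emb = model.get_image_features(pixel_values=inputs["pixel_values"])

# The embeddings can now be operated upon directly,
# e.g. compared by cosine similarity.
text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)
print(image_emb @ text_emb.T)  # similarity of the image to each caption
</syntaxhighlight>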
[[Category:Objects of Interest and Necessity]]
