LoRA: Difference between revisions

From CTPwiki

To generate an image, one needs a model suited to the kind of picture one wants. There are different kinds of models. The best-known ones, such as Stable Diffusion or Flux, are general-purpose. They are called base or foundation models. They can generate images in many styles and handle a huge variety of prompts. But they may show limitations when a user wants a specific output, such as a particular genre of manga or a style that emulates black-and-white film noir, or when certain details need to improve (specific hand positions, legible text, etc.). At that point, there are several choices. The user can download a model that has been trained specifically for their needs. Unfortunately, in many cases no available model matches their expectations.
== What is a LoRA? ==
 
To generate an image, one needs a model suited to the kind of picture one wants. There are different kinds of models. The best-known ones, such as Stable Diffusion or Flux, are general-purpose. They are called base or foundation models. They can generate images in many styles and handle a huge variety of prompts. But they may show limitations when a user wants a specific output, such as a particular genre of manga or a style that emulates black-and-white film noir, or when certain details need to improve (specific hand positions, legible text, etc.). This is where LoRAs come in. A LoRA (Low-Rank Adaptation) is a small add-on model, created with a technique that improves the performance of a base model on a given task. Technically, training a LoRA keeps the weights of the existing model frozen and trains a much smaller component, a pair of low-rank matrices whose product is added to selected weight matrices, to adjust the model to a particular need. LoRAs are therefore lightweight while still leveraging the capabilities of the larger model. They are also much easier to train than foundation models. Users equipped with a consumer-grade GPU can train their own LoRAs reasonably fast (on a Mac with an M3 chip, a LoRA can be produced in about 30 minutes).
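
The idea of a frozen model plus a small adjusting component can be illustrated with a minimal sketch in plain Python. It is not taken from any particular training tool: the function names, the matrix sizes, and the `alpha / rank` scaling factor are assumptions chosen for illustration (the scaling follows the common LoRA convention). The sketch shows a frozen weight matrix `W`, two small trainable factors `A` and `B`, and how the effective weight is formed as `W + (alpha / rank) * B @ A`.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, rank):
    """Return W + (alpha / rank) * B @ A without modifying the frozen W."""
    delta = matmul(b, a)  # (d_out x rank) @ (rank x d_in) -> d_out x d_in
    scale = alpha / rank  # assumed scaling convention, not from the article
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


random.seed(0)
d_out, d_in, rank, alpha = 8, 8, 2, 2.0  # illustrative sizes only

# Frozen base weight: never updated while the LoRA is trained.
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]

# Trainable low-rank factors. B starts at zero, so the adapter is
# initially a no-op and the model behaves exactly like the base model.
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
B = [[0.0] * rank for _ in range(d_out)]

W_eff = lora_effective_weight(W, A, B, alpha, rank)

# Number of values in the base matrix versus the adapter.
base_params = d_out * d_in
lora_params = rank * (d_in + d_out)
print(base_params, lora_params)
```

In this toy example the adapter stores half as many values as the base matrix. For the much larger matrices found in real image models, with the rank kept small, the adapter becomes a tiny fraction of the base layer, which is why LoRAs are lightweight and comparatively cheap to train.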

Revision as of 13:56, 22 April 2025
