[[Category:Objects of Interest and Necessity]]


=== What is an interface to autonomous AI? ===
A computational model is not visible as such; a first encounter is therefore typically with an interface that renders the flow of data tangible in one form or another. Interfaces to generative AI come in many forms. There are interfaces to the models of generative AI. There are also interfaces between different types of software, for instance an API (Application Programming Interface) through which one can integrate a model into other software. On a [[Maps|material plane]], there are also interfaces to the racks of servers that run the models, or between them (i.e., [[GPU|GPUs]]).


What is of particular interest here – when navigating the objects of interest and necessity – is, however, the user interface to autonomous AI image generation: the ways in which a user (or developer) accesses the 'latent space' of computational models (see [[Maps]]).
 
How does one access and experiment with Stable Diffusion and autonomous AI?


=== What is the network that sustains the interface to AI image creation? ===
Most people who have experience with AI image creation will have used 'flagship' generators that are often also integrated into other services, such as Microsoft Bing, OpenAI's DALL-E, Adobe Firefly or Google Gemini. Microsoft Bing, for instance, is not merely a search engine, but also integrates all the other services offered by Microsoft, including Microsoft Image Creator. The Image Creator, as expressed in its interface, promises to make users "surprised" and "inspired", and invites them to "explore ideas" (to be creative). There is, in other words, an expected affective smoothness in the interface – a simplicity and low entry threshold for the user that perhaps also explains the widespread preference for the commercial platforms. What is noticeable in this affective smoothness (besides the integration into the platform universes of Microsoft, Google, and nowadays also OpenAI) is that users are offered very few parameters in the configuration; basically just a [[prompt]]. Interfaces to autonomous AI differ significantly from this in several ways.
[[File:Bing-interface.jpg|none|thumb|A screen shot of Microsoft Image Creator's interface, encouraging the user to 'be surprised' and 'explore ideas'. Elsewhere it also promises that the user will 'get inspired'.]]


First of all, not all of them offer a web-based interface. The interfaces for generating images with Stable Diffusion therefore also vary, and there are many options depending on the configuration of one's own computer. [https://comfyui.org ComfyUI], for instance, is designed specifically for Stable Diffusion and employs a node-based workflow (as may also be encountered in other software, such as MaxMSP), making it particularly suitable for reproducing 'pipelines' of models (see also [[Hugging Face]]). It works for both Windows and Ubuntu (Linux) users. [https://drawthings.ai Draw Things] is suitable for macOS users. [https://tinybots.net/artbot/create ArtBot] is another graphical user interface; it runs in the browser and integrates with [[Stable Horde]], allowing users to generate images on a peer-based infrastructure of GPUs (as an alternative to the commercial platforms' cloud infrastructure).
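
The notion of a 'pipeline' can be made concrete with a short code sketch. The example below, written in Python with the Hugging Face diffusers library, chains two models (the SDXL 'base' and 'refiner' checkpoints, following a pattern documented by that library) so that the output of one becomes the input of the other; ComfyUI expresses the same kind of chaining visually, as a node graph. It is a minimal sketch under stated assumptions – a machine with a CUDA-capable GPU, illustrative prompt and values – rather than a recipe.

<syntaxhighlight lang="python">
import torch
from diffusers import DiffusionPipeline

# Stage 1: a 'base' model denoises the first 80% of the steps and outputs latents.
base = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Stage 2: a 'refiner' model takes over and finishes the remaining 20%.
refiner = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2,  # reuse components to save memory
    vae=base.vae,
    torch_dtype=torch.float16,
).to("cuda")

prompt = "a woodcut print of a rack of servers, high contrast"
latents = base(prompt=prompt, denoising_end=0.8, output_type="latent").images
image = refiner(prompt=prompt, denoising_start=0.8, image=latents).images[0]
image.save("server-rack.png")
</syntaxhighlight>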


[[File:ArtBit-stablehorde.jpg|none|thumb|A screen shot of ArtBot's interface to Stable Horde.]]
[[File:ArtBot interface.jpg|none|thumb|A screen shot of ArtBot's interface, with some of the many settings used for generating outputs.]]
[[File:Draw things.jpg|none|thumb|A screen shot of Draw Things' user interface. It displays the prompt, the text field for negative prompts, and also how LoRAs and training models can be stacked when generating images.]]
[[File:Comfyui interface.jpg|none|thumb|A screen shot of ComfyUI's interface, displaying a node based workflow where tasks can be ordered in pipelines. ]]


Secondly, in autonomous AI image creation one finds a great variety of settings and configurations. To generate an image, there is naturally a [[prompt]], but also the option of adding a negative prompt (instructions on what not to include in the generated image). One can also combine models, e.g., use a 'base model' of Stable Diffusion and add [[LoRA|LoRAs]] imported from [[CivitAI]] or [[Hugging Face]]. There are also options for determining how much weight the added models should have in the generation of the image, the size of the image, or a 'seed' that allows for variations (of style, for instance) while maintaining some consistency in the image, and plenty more parameters to experiment with.
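
What these settings correspond to in code can be sketched as follows, again assuming the diffusers library in Python. The base model is a common public checkpoint, while the LoRA repository name is a placeholder; the values are illustrative, not recommendations.

<syntaxhighlight lang="python">
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Stack a LoRA on top of the base model (repository name is a placeholder).
pipe.load_lora_weights("some-user/some-style-lora")

# A fixed seed keeps the composition consistent across runs; changing only
# the seed produces controlled variations.
generator = torch.Generator(device="cuda").manual_seed(42)

image = pipe(
    prompt="portrait of a beekeeper, soft morning light",
    negative_prompt="blurry, text, watermark",   # what not to include
    width=512,
    height=512,
    guidance_scale=7.5,                          # how strictly to follow the prompt
    cross_attention_kwargs={"scale": 0.7},       # one way to weight the added LoRA
    generator=generator,
).images[0]
image.save("beekeeper.png")
</syntaxhighlight>

Graphical interfaces such as Draw Things or ArtBot expose essentially these parameters as sliders, text fields, and model pickers.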


Thirdly, like the commercial platforms, interfaces to autonomous AI also offer integration into other services, but with far fewer restrictions. Stability AI, for instance, offers an API (an Application Programming Interface), a programmatic interface that allows developers to integrate image generation capabilities into their own applications. Likewise, Hugging Face (a key hub for AI developers and innovation) provides an array of models released under different licences (some more open, some more restricted for, say, commercial use) which can be integrated into new tools and services.
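
As a sketch of what such integration looks like from a developer's side, the example below sends a text-to-image request to an HTTP API using Python's requests library. The endpoint URL, field names, and key are deliberately hypothetical placeholders; the actual paths and parameters are defined in the provider's (e.g., Stability AI's) API documentation.

<syntaxhighlight lang="python">
import requests

API_KEY = "sk-..."  # placeholder; a real key is obtained from the provider
URL = "https://api.example.com/v1/text-to-image"  # hypothetical endpoint

response = requests.post(
    URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "prompt": "a lighthouse in a storm, oil painting",
        "width": 512,
        "height": 512,
    },
    timeout=120,
)
response.raise_for_status()

# Assuming the hypothetical endpoint returns raw image bytes, save them so the
# result can be used inside one's own application.
with open("lighthouse.png", "wb") as f:
    f.write(response.content)
</syntaxhighlight>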


Fourthly, many of the interfaces are not just for generating images using the available models and settings. The visual cultural preferences for, say, a particular style of manga also lead to further complexity in the user interface. That is, interfaces like Draw Things and ComfyUI are simultaneously also interfaces for training one's own models (i.e., for building [[LoRA|LoRAs]]), and for possibly making them available on [[CivitAI]] or a similar platform, so that others who share an affinity for that particular visual style can use them.
[[File:DrawThings LoRAs.jpg|none|thumb|A screen shot of the interface for creating a LoRA in Draw Things. It displays a selection of some (of the many) parameters that users can define in the training, as well as the upload of images for the training dataset. The images also need to be annotated individually.]]


In short, interfaces to autonomous AI are open-ended in multiple ways, and typically sit not only between use ('pixel space') and model ('latent space'), but simultaneously also between the models and a developer space that ordinary users typically do not see. This doubleness allows users to specify the visual outputs in detail, including combining models or even building their own. It also allows for specific arrangements on a material plane, such as the use of one's own GPU or a GPU in Stable Horde's distributed network.


[IMAGE MAP - INTERFACE IN TWO INTERSECTIONS OR SPACES IN THE MAP]


=== How do interfaces to autonomous AI create value? ===
Using a commercial platform, one quickly experiences a need for '[[currencies]]'. In Microsoft Image Creator, for instance, there are 'credits' that give users a front pocket to the [[GPU]], speeding up an otherwise slow process of generating an image. These credits are called [https://rewards.bing.com/welcome Microsoft Reward points], and are basically earned either by waiting (as a kind of punishment) or by being a good customer who regularly uses Microsoft's other products. One earns points, for instance, by searching with Bing, using the Windows search box, buying Microsoft products, and playing games on Xbox. Use is, in other words, intrinsically related to a plane of business and value creation that capitalises on the use of energy and GPUs on a material plane (see [[Maps]]).


Like the commercial platform interfaces, the interfaces for Stable Diffusion also rely on a 'business plane' that organises access to a material infrastructure, but they do so in very different ways. For instance, Draw Things allows users to generate images on their own GPU (on a material plane), without the need for currencies. With ArtBot it is possible to connect to [[Stable Horde]], accessing the processing power of a distributed network. Here, users are also allowed a front pocket, but it is granted not on the basis of their loyalty as 'customers', but of their loyalty to the peer network. Allowing other users to access one's GPU is rewarded with 'Kudos', which can then be used to skip the waiting line when accessing other GPUs. A free ride is, in this sense, only available if the network makes it possible.
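
A rough sketch of how a client such as ArtBot talks to Stable Horde is given below, in Python. It follows the general shape of the public AI Horde API (an asynchronous request that is later polled for results, here with the anonymous API key); endpoint details and response fields may have changed since writing, so it should be read as illustrative rather than authoritative.

<syntaxhighlight lang="python">
import time
import requests

BASE = "https://stablehorde.net/api/v2"
HEADERS = {"apikey": "0000000000"}  # anonymous key; registered users earn and spend Kudos

# Submit an asynchronous generation request to the volunteer network.
job = requests.post(
    f"{BASE}/generate/async",
    headers=HEADERS,
    json={"prompt": "a beehive in the style of a woodcut",
          "params": {"width": 512, "height": 512}},
    timeout=60,
).json()

# Poll until a volunteer GPU somewhere in the network has produced the image.
while True:
    status = requests.get(f"{BASE}/generate/status/{job['id']}", timeout=60).json()
    if status.get("done"):
        print(status["generations"][0]["img"])  # link to (or encoding of) the result
        break
    time.sleep(5)
</syntaxhighlight>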


=== What is their role in techno cultural strategies? ===
The commercial platform interfaces for AI image creation are sometimes criticised for their biases or copyright infringements, but many people see them as [https://medium.com/@analyticsinsight/what-is-bing-ai-image-creator-ba8ac1e8eb1e useful tools]. They can be used by, for instance, creatives to test out ideas and get inspired. Frequently, they are also used in teaching and communication. This could be for illustration – as an alternative to, say, an image found on the internet, where use might otherwise violate copyright. They are increasingly also used to display complex ideas in an illustrative way. Often, the model will fail or reveal its cultural biases in the attempt, and at times (perhaps even as a new cultural genre) presentations include the failed attempts precisely to ridicule the AI model and laugh at how it handles the complexity of illustrating an idea.


[EXAMPLES / IMAGES FROM ACADEMICS - MY OWN + MAI'S // MAI'S DISPLAYS A HOMELESS PERSON - PERHAPS A HOMELESS PERSON FROM CIVITAI COULD BE USED INSTEAD OF THE ELVER (BELOW). E.G., FROM SESAME STREET https://civitai.com/images/2834126]


With their many settings, interfaces to autonomous AI accommodate a much more fine-grained visual culture. As previously mentioned, this can be found on sites such as [[CivitAI]] or [https://danbooru.donmai.us Danbooru]. Here one finds a visual culture that is invested not only in, say, manga, but often also in LoRAs. That is, on CivitAI there are images created with LoRAs to generate specific stylistic outputs, but also requests for specific LoRAs to be used to generate images.


[[File:CivitAI elven rembrandt-style.jpg|none|thumb|Screenshot of the interface of the platform CivitAI, displaying a [https://civitai.com/images/86073842 user-generated image of an elver]. It is made using the model FLUX and two LoRAs that make the image appear in the style of a Rembrandt painting. It is an example of one of the many very specific visual experiments on the platform.]]




The complex use of interfaces testifies to how highly skilled the practitioners are within the interface culture of autonomous AI image creation: when generating images, one has to understand how to make visual variations using 'seed' values, or how to make use of Stable Horde by spending Kudos (currencies) to speed up the process; when building and annotating datasets for LoRAs and creating 'trigger words', one has to understand how this ultimately relates to how one prompts when generating images with the LoRA; when setting 'learning rates' (in training LoRAs), one has to understand the implications for the use of processing power and energy; and much more. In other words, operating the interface demands not only a high degree of knowledge of visual culture, but also deep insights into how models work, and into the complex interdependencies of different planes of use, computation, social organisation, value creation, and material infrastructure.
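
To indicate what these training parameters look like in practice, the sketch below lists the kind of configuration a LoRA training run involves, written as a plain Python dictionary. The names, paths, and values are illustrative placeholders rather than any particular tool's actual settings; interfaces such as Draw Things expose comparable parameters through graphical controls.

<syntaxhighlight lang="python">
# Illustrative parameters for training a LoRA on top of a base model.
training_config = {
    "base_model": "runwayml/stable-diffusion-v1-5",  # the model being adapted
    "dataset_dir": "my_dataset/",       # images, each with its own caption/annotation
    "trigger_word": "xyzstyle",         # token later used in prompts to invoke the LoRA
    "resolution": 512,
    "learning_rate": 1e-4,              # higher rates train faster but less stably
    "max_train_steps": 1500,            # more steps mean more GPU time and energy
    "rank": 8,                          # size of the LoRA; affects file size and flexibility
}

# Each training image is annotated individually; the trigger word becomes part of
# the caption, and later part of the prompt when the trained LoRA is used.
example_caption = f"{training_config['trigger_word']}, portrait of a person in rembrandt lighting"
print(example_caption)
</syntaxhighlight>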


=== How do interfaces relate to autonomous infrastructures? ===
To conclude, interfaces to autonomous AI image generation seem to rely on a need for parameters and configurations that accommodate particular and highly specialised visual expressions, but also give rise to a highly specialised interface culture that possesses deep insights not only into visual culture, but also into the technology. Such skills are rarely afforded by the smooth commercial platforms that overflow visual culture with an abundance of AI-generated images. Interfaces to autonomous AI sometimes also build in a decentralisation of processing power (GPUs), either by letting users process the images on their own computers, or by accessing a peer network of GPUs. Despite this decentralisation, interfaces to autonomous AI are not detached from commercial interests and centralised infrastructures. The integration of, and dependency on, a platform like [[Hugging Face]] is a good example of this.






++++++++++++++++++++++++


[CARD TEXT]


 
== Interfaces ==
 
 
Also infrastructures and business planes / Hugging Face - GPU/Stable Horde.
 
 
also LoRA - re-model latent space. / computational interaction.
 
 
 
advertising, one can easily generate images for inspiration
 
* How does it move from person to person, person to software, to platform, what things are attached to it (visual culture)
* Networks of attachments
* How does it relate / sustain a collective? (human + non-human)
 
=== How does it evolve through time? ===
Evolution of the interface for these objects. Early chatgpt offered two parameters through the API: prompt and temperature. Today extremely complex object with all kinds of components and parameters. Visually what is the difference? Richness of the interface in decentralization (the more options, the better...)
 
=== How does it create value? Or decrease / affect value? ===
 
=== What is its place/role in techno cultural strategies? ===
and used both by creatives to test out ideas and get inspired, and frequently also in teaching and communication, to easily display ideas (at times, just to laugh at how the generated images relate to the prompt).
 
=== How does it relate to autonomous infrastructure? ===
 
 
Interface as an access point - how to get in contact with autonomous AI image creation - access LoRAs and also the wider visual community (using their models) An entry on Draw Things - how it depends on CivitAI as well as Hugging Face
 
The different experiences of interfaces - the platformed/'conversational' vs the history of one's prompts
 
How one prompts collaboratively with the model (e.g., triggering one's annotations of images), or using negative prompts - or the many settings available.
 
Add also APIs and command based interfaces.
 
The differences also between ArtBot and Draw Things and also [https://www.comfy.org comfyUI] - '''as a catalogue of interfaces'''.
 
Interfaces - in computer semiotics. But also how sign/representation gets coupled with 'signals' in unpredictable ways: you anticipate, but never know the outcome.
 
Interfaces to the generation of images (pixel space to latent space), but also an amateur/developer interface 'backend' of the models/latent space.
