Workshop on "Objects of Interest and Necessity", part 2

From CTPwiki

INTRO:

What is an "object of interest"?

The notion of an ‘object of interest’ likely brings to mind a guided tour of a place, a museum or a collection. One may easily read this compilation of texts as a catalogue for such a tour of a social and technical system, where we stop and wonder about the different objects that, in one way or another, take part in the generation of images with Stable Diffusion.

Perhaps a 'guided tour' also limits the understanding of what objects of interest are? Take for instance the famous Kepler telescope, whose mission was to search the Milky Way for exoplanets. Among all the stars, there is only a limited number of candidates for these so-called Kepler Objects of Interest (KOI).

What makes AI livable? What are the underlying dependencies on relations between communities, models, capital, technical units, and more in these technical objects of interest?

Objects also hold an associative power that can literally create memories and make a story come alive. These texts are therefore not just a collection of the objects that make generative AI images, but an exploration of an imaginary of AI image creation through the collection and exhibition of objects – and in particular, an imaginary of ‘autonomy’ from mainstream capital platforms.

Building an imaginary (in a Wiki)

What is 'autonomy'? What does it mean to separate from capital interest?

Not 'one thing': the range of agents, dependencies, flows of capital, and so on, can be difficult to comprehend and is in constant flux.

We have tried to capture this in our description of the objects, guided by a set of questions that we address directly or indirectly in the different entries of our 'tour guide':

  1. What is the network that sustains this object?
  2. How does it evolve through time?
  3. How does it create value? Or decrease / affect value?
  4. What is its place/role in techno-cultural strategies?
  5. How does it relate to autonomous infrastructure?

The wiki: https://ctp.cc.au.dk/w (choose "Objects of interest and necessity")

Please leave comments, if you like

The booklet / guide

- is made without Adobe --- using 'web-to-print' techniques

Agenda

Visiting the different 'lands' of AI imaging - looking at the map & practical experimentation (what one can do in this land)

  1. The Land of Pixels (Pixel Space) // installing and playing with software
  2. The Land of Latents (Latent Space) // making 'LoRAs' on our own machines
  3. The Land of GPUs (The Different 'Planes' of AI) // making images using our own computer or Stable Horde (P2P), and spending 'kudos'.
Tablecloth

PART ONE: 'The Land of Pixels'

A diagram of pixel space in AI imaging
A map of AI image generation separating 'pixel space' from 'latent space', with particular emphasis on the objects of pixel space, which are operated by the users. Pixel space is the home of both conventional visual culture and a more specialised visual culture. Conventionally, image generation involves simple 'prompt' interfaces, and models are built on accessible images, scraped from archives on the Internet, for instance. The specialised visual culture takes advantage of the openness of Stable Diffusion to, for instance, generate specific manga or gaming images with advanced settings and parameters. Often, users also build and share their own models, so-called 'LoRAs' (by Christian Ulrik Andersen, Nicolas Malevé, and Pablo Velasco).

Interfaces - 'out-of-the box prompting' vs. 'the expert interface'

Screenshot of chatGPT interface
Screenshot of artbot interface

https://tinybots.net/artbot/create

Example interface Draw Things

https://drawthings.ai/

Interface of NitroFusion

https://huggingface.co/spaces/ChenDY/NitroFusion_1step_T2I

Experiment

We will be using:

1. Artbot

2. https://drawthings.ai/

  1. Download from website (NOT app store)
  2. Do NOT use cloud version

3. NitroFusion

Play with models, prompts, negative prompts or perhaps other parameters.

NEXT: Where do the models come from? Creating annotated datasets from visual culture

Communities feeding generative AI

The Digital Photography Challenge community
Laughing at Grandpa's jokes by pmichaud on DP Challenge
DeviantArt interface
Trending on Art Station
My Little Pony's fans community
The Danbooru platform

Annotation labour

Interface of annotation
Interface of annotation, labels

Datasets as a site where visual cultures meet (and clash)

Images corresponding to the term Monet in LAION 5B
Images Monet, baby clothes, in LAION 5B
155 million images from Pinterest in LAION-5B, https://knowingmachines.org/models-all-the-way#section2
72 million pairs are from SlidePlayer, a platform for storing and sharing PowerPoint presentations. https://knowingmachines.org/models-all-the-way#section2

The cultures of image generation

Théâtre D'opéra Spatial by Jason Allen
CivitAI image stream
Pope in Balenciaga
https://religionnews.com/2025/09/17/charlie-kirks-ai-resurrection-reveals-new-era-of-digital-grief/
Fake ID card and KYC systems
Pablo Picasso as a CEO (artists as CEO)
Salvador Dali as a CEO (artists as CEO)
Mayor Thorkild Simonsen (1926-2022): respected politician by day, underground hip-hop enthusiast by night. This serious Social Democrat who led Aarhus for 15 years secretly spit rhymes about policy in basement venues, connecting with youth through beats and flows. Coming to @kunsthalaarhus this June. Lyrics/music: Niels Kern

https://makertube.net/w/7EZnBc7jpoZo9EwxUteSC9

Law&Crime is providing coverage of the Sean “Diddy” Combs’ sex trafficking trial by recreating proceedings with generative AI.
Midjourney list of styles
Rutkowski emulated by Midjourney

PART TWO: 'The Land of Latents'

A diagram of latent space in AI imaging
The map reflects the separation of pixel space, i.e., what is seen by users, from latent space, i.e., the more abstract space of models and computation. It particularly emphasises the objects involved in model training (the stacking of latent spaces), but also how latent space depends on an array of organisational, technical, textual and visual resources (by Christian Ulrik Andersen, Nicolas Malevé, and Pablo Velasco).

Diffusion

A diagram of the training process
The process of generating an image of a teddy bear with Stable Diffusion XL
Encoders and the loss of details, by rMada
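
As a rough companion to the diagrams above, here is a minimal sketch of the 'noising' half of diffusion training, assuming the standard DDPM formulation with a linear noise schedule. It is not the Stable Diffusion training code: the function names, the schedule values and the tiny four-value 'image' are illustrative assumptions. During training, a model learns to predict the added noise; during generation, that prediction is used to remove noise step by step.

```python
import math
import random

def linear_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Return (x_t, eps) with x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps."""
    a_bar = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e
           for x, e in zip(x0, eps)]
    return x_t, eps

alpha_bars = linear_alpha_bars()
rng = random.Random(0)
x0 = [0.9, -0.3, 0.5, 0.1]  # stand-in for pixel or latent values (made up)
for t in (0, 499, 999):
    x_t, eps = add_noise(x0, t, alpha_bars, rng)
    # The share of the original signal (alpha_bar) shrinks as t grows.
    print(t, round(alpha_bars[t], 4), [round(v, 2) for v in x_t])
```

Stable Diffusion applies this kind of process in latent space rather than directly on pixels, which is why an encoder (and its loss of detail) sits between the two spaces.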

Reflexive prompting

Prompting for bias

Men washing the dishes


Women fixing airplanes

Men fixing airplanes (by comparison)


Fixing a street in Aarhus?

Side by side a street in Switzerland and a street in the US over several centuries

Letting the model "speak"

Examples of images generated with the negative prompt "man"
Negative prompt: Woman
  • Empty prompt, what is the threshold?
  • Using generic words: someone, somewhere in summertime
  • Using only negative prompting, what is a negative "man"?
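
To make the last question a little more concrete: in many Stable Diffusion implementations, the negative prompt takes the place of the empty 'unconditional' prompt in classifier-free guidance. The sketch below shows only that arithmetic, on made-up numbers; the function name and the example vectors are hypothetical, and a real pipeline would obtain both predictions from the model at every denoising step.

```python
def guided_noise(eps_negative, eps_positive, guidance_scale=7.5):
    """Classifier-free guidance: start at the negative prediction and
    move towards the positive one, scaled by guidance_scale."""
    return [n + guidance_scale * (p - n)
            for n, p in zip(eps_negative, eps_positive)]

# Hypothetical noise predictions for three latent values.
eps_for_prompt = [0.2, -0.1, 0.4]    # conditioned on the (possibly empty) prompt
eps_for_negative = [0.5, -0.1, 0.1]  # conditioned on the negative prompt, e.g. "man"

print(guided_noise(eps_for_negative, eps_for_prompt))
print(guided_noise(eps_for_negative, eps_for_prompt, guidance_scale=1.0))
```

Read this way, a negative "man" is not an opposite of "man" but a direction: each step is pushed away from whatever the model predicts for "man" and towards whatever it predicts for the prompt, even when that prompt is empty.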

Examples of LoRAs (making stories and 'fixing' things with LoRAs)

LoRA "The Incredible Hulk (2008)."

LoRA "The Incredible Hulk (2008)."[1]

A kitchen in Budapest

A kitchen in Budapest, search results. See presence of
Kitchen in Budapest, before LoRA
Kitchen in Budapest, after LoRA

The representation of women artists, Louise Bourgeois


Movie is on OneDrive

What are we going to fix? What stories are we going to tell?


Practical example: make a LoRA with Draw Things

  1. Open Draw Things -> Go to 'PEFT'
  2. Select a "Base Model" (to fine-tune) // maybe install a 'light' one first to speed up things
  3. Name your LoRA & select a 'trigger word' // select one you will remember or note it down, and select one that is unique (e.g. "menwashingdishes" rather than "men", as the latter is too generic and too prevalent in the foundation model)
  4. Define 'Training Steps' (scroll down to find), set to e.g. 1,000 // the more steps, the longer the wait
  5. Add images to train on. 12 images will work. We have prepared a folder as an option.
  6. Annotate the images // what to think about when annotating?
  7. Start training ..... WAIT and WAIT and WAIT (for more than an hour)

NOTE: Annotating images is like writing a caption - or, think of the prompt you'd write yourself to generate this image. When prompting and enabling the LoRA, you can use the same words again. The prompt is the reverse annotation, and the annotation is the reverse prompt.
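
To give a sense of what 'baking' a LoRA changes: in the general LoRA technique, the base model's weights stay frozen and a small low-rank update is learned on top, W' = W + (alpha / r) * B * A. The sketch below shows that arithmetic with plain Python lists. The matrix sizes and values are illustrative assumptions and say nothing about how Draw Things implements training internally.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def apply_lora(base, lora_b, lora_a, alpha):
    """Return base + (alpha / rank) * (lora_b @ lora_a); base is left untouched."""
    rank = len(lora_a)
    delta = matmul(lora_b, lora_a)
    scale = alpha / rank
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(base, delta)]

def parameter_counts(rows, cols, rank):
    """Compare the size of a full weight matrix with its two LoRA factors."""
    return rows * cols, rank * (rows + cols)

base = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # frozen 3x3 weights
lora_b = [[1.0], [0.0], [2.0]]                                # 3x1, trained
lora_a = [[0.5, 0.0, -0.5]]                                   # 1x3, trained
print(apply_lora(base, lora_b, lora_a, alpha=1.0))
print(parameter_counts(768, 768, rank=8))
```

The second function hints at why a LoRA can be trained on one's own machine and shared as a small file: for a 768 x 768 layer at rank 8, the trained factors hold 12,288 values instead of 589,824.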

Prepared images:

Annotation software (optional):

https://www.makesense.ai/

  • Get started (bottom right)
  • When finished, Actions -> Export Annotations -> csv file
  • Share this file with us together with the images if you want to train on one of our machines.
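
If you want to turn an exported file into training captions yourself, something along these lines may help. This is a minimal sketch that assumes a CSV with a header row and two columns named `image_name` and `label_name`; the actual export may have no header row, other column names or extra columns, so check your file first. The file names, labels and the helper function are made up for illustration.

```python
import csv
import io
from collections import defaultdict

def captions_from_csv(csv_text, trigger_word):
    """Group labels per image and build one caption per image,
    starting with the LoRA trigger word."""
    labels = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = row["label_name"].strip()
        # Skip empty labels and labels already recorded for this image.
        if label and label not in labels[row["image_name"]]:
            labels[row["image_name"]].append(label)
    return {image: ", ".join([trigger_word] + found)
            for image, found in labels.items()}

# Hypothetical export; file names and labels are made up.
example = """image_name,label_name
kitchen01.jpg,man
kitchen01.jpg,sink
kitchen02.jpg,dishes
kitchen02.jpg,dishes
"""
for image, caption in captions_from_csv(example, "menwashingdishes").items():
    print(image, "->", caption)
```

Putting the trigger word first in every caption is one common convention, not a requirement.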

Discussion while 'baking' the LoRA

For example:

  • How to think of images?
  • How to work conceptually?
  • How to annotate and what to think of?
  • The relation between prompts and models?
  • How to make pipelines of models/dataset?
  • ?

PART THREE: 'The Land of GPUs (Materials/Organisations)' ... on the different 'planes' of AI

A diagram of the many planes of AI imaging
A map of how objects not only relate pixel space to latent space, but how they are always suspended between different planes - not only a technical one, but also an organisational one, a material one, and potentially many others (capital, labour, knowledge, governance, etc.) (by Christian Ulrik Andersen, Nicolas Malevé, and Pablo Velasco)
A diagram of the material plane of AI imaging
The map reflects the material organisation, or infrastructure, of AI image generation. This particularly concerns the use of GPUs and the processing power needed to generate images and train models and LoRAs. In conventional visual culture, users access cloud-based services (like OpenAI's DALL-E, Adobe Firefly or Microsoft Image Creator) to generate images. In this client-server relation, users do not know where the service is (in the 'cloud'). In autonomous AI visual culture, users benefit from each other's GPUs in Stable/AI Horde's peer-to-peer network, exchanging GPU time for the currency 'Kudos'. Knowing the location of the GPU is central. Users also train models, for instance on Hugging Face. Here, the infrastructure more closely resembles that of a platform (by Christian Ulrik Andersen, Nicolas Malevé, and Pablo Velasco).
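
For those who want to look beneath the Artbot interface, the following sketch builds (but deliberately does not send) a request to the AI Horde. The endpoint URL, the `apikey` header, the anonymous key and the `###` separator for negative prompts are written from memory and should be treated as assumptions to verify against the AI Horde API documentation; the helper name and default values are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint and anonymous key; verify both before relying on them.
HORDE_URL = "https://aihorde.net/api/v2/generate/async"
ANONYMOUS_KEY = "0000000000"

def build_request(prompt, negative_prompt="", steps=20, width=512, height=512,
                  api_key=ANONYMOUS_KEY):
    """Build an HTTP request describing one image job; nothing is sent here."""
    # Assumption: a negative prompt is appended to the prompt after '###'.
    text = prompt if not negative_prompt else f"{prompt} ### {negative_prompt}"
    payload = {
        "prompt": text,
        "params": {"steps": steps, "width": width, "height": height, "n": 1},
    }
    return urllib.request.Request(
        HORDE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "apikey": api_key},
        method="POST",
    )

request = build_request("a street in Aarhus", negative_prompt="cars")
print(request.full_url)
print(request.data.decode("utf-8"))
# Sending is left out on purpose: urllib.request.urlopen(request) would queue
# the job on someone else's GPU.
```

Larger images and more steps ask more of a volunteer's GPU, which is where kudos come into play.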
A diagram of the organisational plane in AI imaging
The map reflects the organisation of AI image generation. In conventional visual culture, users access cloud-based services like OpenAI's DALL-E, and may also share their images on social media. In autonomous AI visual culture, the platforms are more democratic in the sense that models or datasets are freely available for training LoRAs or other development. Users also share their own models, datasets, images, and knowledge on dedicated platforms, like CivitAI or Hugging Face. Many of the organisations from conventional visual culture (like Meta, which owns Instagram) also invest in the platforms of autonomous AI, and openness is not to be taken for granted (by Christian Ulrik Andersen, Nicolas Malevé, and Pablo Velasco).