Card texts



// THESE are the texts for the A4 + A3 printouts - approximately 1,300 characters

// A4: All of the objects

// A3: Objects/Maps – Pixel Space/Latent space – Prompt/LoRA – GPU/Currency – Stable Horde/Hugging Face

CivitAI --- MISSING!

CLIP --- MISSING!

Currencies [1,288 characters]

While a familiar object, currencies can take wild shapes. Traditionally, a currency is a material medium of exchange (commonly metal or paper): coins and bills are still a common form of currency. Digital platforms and services, however, have multiplied the forms these exchange objects can take. Videogames and blockchain technology have helped to explode what can be understood as currency, and have subverted the dependence on larger organisations: if traditionally states, banks, and large organisations designed and managed these objects, their digital counterparts tend to be more untamed and sometimes community oriented. Autonomous currencies have always existed, as exchange modes for small communities or as explicit counter-objects to hegemonic economies, but their digital versions are easy to set up, distribute, and share with communities around the world. Within our objects of interest, digital currencies make it possible to directly generate images, or to commodify the possibility of new ones (for example, as a bounty for those creating or fine-tuning models). Immaterial versions of currencies also act as the exchange medium for sharing non-hegemonic networks of image generation, adding value and circulation to a larger infrastructure of imaginaries of autonomy.

Datasets --- MISSING!

Diffusion --- MISSING!

GPU

The Graphics Processing Unit (GPU) is an electronic circuit for processing computer graphics. It was designed to add graphical interfaces to computing devices and expand the possibilities for interaction. While commonly used for videogames or 3D computer graphics, the GPU's particular way of computing has made it an object of interest and necessity for the cryptocurrency rush and, more recently, for the training and use of generative AI. One of the more material elements of our map, it is often an invisible piece of hardware, playing the crucial role of translating between human visual and textual understanding and computational models for prediction (for example, between existing images and the "calculation" of new ones). The GPU is both an object that sits in a desktop computer and one that populates massive cloud data centres racing towards the latest flagship large language model. It is, in a way, a domestic object, used by enthusiasts to play with and modify the landscape of AI, and at the same time the holy grail of the big tech AI industry.
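As a small, hedged illustration of this dual domestic/data-centre role, the sketch below (in Python, assuming the PyTorch library) shows how image-generation software typically picks whichever device is available – a discrete GPU, Apple Silicon, or a plain CPU:

```python
import torch

# Pick the fastest available device: a discrete GPU (CUDA), Apple
# Silicon (MPS, as used by macOS apps such as Draw Things), or the CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# Moving a tensor (or a whole model) to the GPU is a single call; the
# GPU's thousands of parallel cores then handle the matrix arithmetic
# that diffusion models consist of.
x = torch.randn(1, 4, 64, 64).to(device)  # a latent-sized tensor
```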

Hugging Face (1,192 characters)

Hugging Face started out in 2016 as a chatbot for teenagers, but is now the collaborative hub for AI development – for image creation, speech synthesis, text-to-video, image-to-video, and much more. As a platform, it serves as an infrastructure for experimentation with advanced configurations that the conventional platforms do not offer, and is used by amateurs and professionals alike.

By making AI models and datasets publicly available, it can also be understood as an attempt to democratise AI and delink from the mainstream platforms. Yet, at the same time, Hugging Face is deeply intertwined with commercial interests. It collaborates with Meta and Amazon Web Services, who want to take advantage of the company's expertise in handling AI models at large scale, and it has received venture capital from Amazon, Google, Intel, IBM, NVIDIA and other key corporations in AI; as of 2023, its estimated market value was $4.5 billion.

Hugging Face is a key example of how generative AI – also when seeking autonomy – depends on a specialised technical infrastructure. In a constantly evolving field, reliability, security, scalability and adaptability become important parameters, and Hugging Face offers these in the form of a platform.

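To make the platform role concrete, here is a minimal, hedged sketch of loading a Stable Diffusion model from the Hugging Face Hub with the company's `diffusers` library (the model id and prompt are illustrative assumptions):

```python
# Minimal sketch: pull a hosted Stable Diffusion checkpoint from the
# Hugging Face Hub and generate one image (model id is an assumption).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",  # a checkpoint hosted on the Hub
    torch_dtype=torch.float16,           # half precision for consumer GPUs
).to("cuda")                             # requires a CUDA-capable GPU

image = pipe("a watercolour map of a data centre").images[0]
image.save("map.png")
```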

IMAGES: Business + interface/draw things + mgane + historical/contemporary interfaces.

Interfaces (1,286 characters)

Many users are familiar with mainstream corporate interfaces, such as Microsoft Image Creator or OpenAI's DALL-E. They are typically cloud-based and easy to use – often requiring just a simple ‘prompt.’ Interfaces to autonomous AI differ significantly from this:

- There are many options depending on the configuration of one's own computer. Draw Things is a graphical interface suitable for macOS users; ComfyUI works for Windows/Linux; ArtBot has an advanced web interface and also integration with Stable Horde (a peer-based infrastructure of GPUs).

- There are plenty of parameters to experiment with. Users can prompt, but also define negative prompts (what not to include in the image), combine different models (including one's own), decide on the size of the image, and more (see the sketch after this list).

- They allow users to generate their own models based on the models of Stable Diffusion, aka LoRAs. LoRAs are used to make images in, say, a particular manga style, and are shared on dedicated platforms.

- They can be integrated into one's own applications. The developer platform Hugging Face, for instance, releases models with different licences, for integration into new tools and services.

In short, these interfaces demand not only insights into how models work, but also deep knowledge of visual culture and aesthetics.
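The sketch below illustrates the parameters listed above, again assuming the `diffusers` library; the model id, prompts and values are illustrative assumptions, not recommendations:

```python
# Sketch of the extra parameters that such interfaces expose
# (model id, prompts and values are illustrative assumptions).
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5"
).to("cuda")

# pipe.load_lora_weights("some-user/manga-style-lora")  # optionally apply a LoRA

image = pipe(
    prompt="a manga-style portrait of a cartographer",
    negative_prompt="blurry, low quality",  # what NOT to include in the image
    width=768, height=512,                  # decide on the size of the image
    num_inference_steps=30,                 # number of denoising steps
    guidance_scale=7.5,                     # how strongly to follow the prompt
).images[0]
image.save("portrait.png")
```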

[SCREEN SHOTS OF COMFYUI, DRAW THINGS, ARTBOT – optionally -> also LoRAs/images -> platforms/CivitAI]

LAION --- MISSING!

Latent space --- MISSING!

LoRA --- MISSING!

Maps [1,321 characters]

There is little knowledge of what AI really looks like. Perhaps because of this lack of insight, there is an abundance of maps – corporate landscapes, organisational diagrams, technical workflows, etc. – used to design, navigate, or criticise AI’s being in the world. If one does not confuse the map with the territory (reality with its representation), one begins to see how externalising abstractions of AI by way of cartography takes part in making AI a reality.

The maps presented here are attempts to abstract the different objects that one may come across when entering the world of autonomous AI image creation. They can serve as a practical guide to understanding what the objects of this world are called and how they connect to each other, to communities, or to underlying infrastructures – perhaps also as an outset for one's own abstractions.

A distinction between what you see and what you do not see can be useful: between 'pixel space' and 'latent space'.


Latent space refers to the invisible space that exists between the capture of images in datasets and the generation of new images. Images are encoded with 'noise', and the machine then learns how to decode them back into images (aka 'image diffusion'). Contrary to common belief, there is not just one dataset used to make image generation work, but multiple models and datasets to 'upscale' images of low resolution, 'refine' the details in the image, and much more. Behind every model and dataset there is a community and an organisation.
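A toy sketch of the 'noising' half of this process, in Python with PyTorch (the schedule value is an illustrative assumption, not Stable Diffusion's actual one):

```python
# Toy version of the forward "noising" step of image diffusion:
# mix an image with Gaussian noise; the model is then trained to
# learn the reverse, denoising direction. Values are illustrative.
import torch

x0 = torch.rand(3, 64, 64)       # a stand-in "image" in [0, 1]
alpha_bar = 0.5                  # fraction of signal surviving at step t
noise = torch.randn_like(x0)     # Gaussian noise

# Standard DDPM forward process: x_t = sqrt(a)*x_0 + sqrt(1-a)*noise
xt = (alpha_bar ** 0.5) * x0 + ((1 - alpha_bar) ** 0.5) * noise
```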

Pixel space is where one encounters objects of visual culture. Large-scale datasets are for instance compiled by crawling and scraping repositories of visual culture, such as museum collections. Whereas conventional interfaces for generating images only offer the possibility to 'prompt', interfaces to Stable Diffusion offer advanced parameters, as well as options to train one's own models, aka LoRAs. This demands technical insights into latent space as well as aesthetic/cultural understandings of visual culture (say, of manga, gaming or art).
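The 'LoRA' trick mentioned here can be sketched in a few lines: rather than retraining a model's full weight matrices, one trains a small low-rank update on top of them (shapes and rank below are illustrative assumptions):

```python
# Minimal sketch of the LoRA idea: instead of retraining a model's
# full weight matrix W, train a small low-rank update B @ A and add
# it on top. (Shapes and rank are illustrative assumptions.)
import torch

d, r = 768, 4                    # model width, LoRA rank (r << d)
W = torch.randn(d, d)            # frozen pretrained weight
A = torch.randn(r, d) * 0.01     # trainable down-projection
B = torch.zeros(d, r)            # trainable up-projection (starts at 0)

W_adapted = W + B @ A            # the adapted weight actually used
# Only A and B (2*d*r values) are trained and shared as "a LoRA",
# which is why LoRA files are small enough to swap on platforms.
```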

Both images and LoRAs are organised and shared on dedicated platforms (e.g., Danbooru or CivitAI). The generation of images and use of GPU/hardware can also be distributed to a community of users in a peer-to-peer network (Stable Horde). This points to how models, software, datasets and other objects always also exist suspended between different planes of dependencies – organisational, material, or other.

[Images: 'Our' map(s) - perhaps surround by other maps]

Model card --- MISSING!

Object of Interest/Necessity

Most people’s experiences with generative AI image creation come from platforms like OpenAI’s DALL-E or other services. Nevertheless, there are also communities who for different reasons seek some kind of independence and autonomy from the mainstream platforms. The outset for this catalogue is ‘Stable Diffusion’, a so-called Free and Open Source Software system for AI image creation.

With the notion of an ‘object of interest’, a guided tour of a place, a museum or a collection likely comes to mind. One may easily read this compilation of texts as a catalogue for such a tour of a social and technical system, where we stop and wonder about the different objects that, in one way or the other, take part in the generation of images with Stable Diffusion.

But does 'a guided tour' perhaps also limit the understanding of what objects of interest are? In science, for instance, an object of interest sometimes refers to what one might call the potentiality of an object. Take, for instance, the famous Kepler telescope, whose mission was to search the Milky Way for exoplanets (planets outside our own solar system). Among all the observed stars, there are candidates for this: so-called Kepler Objects of Interest (KOIs).

In similar ways, this catalogue is the outcome of an investigative process where we – by trying out different software, reading documentation and research, looking into communities of practice that experiment with AI image creation, and more – have sought to understand the things that make generative AI images with Stable Diffusion possible. We have tried to describe not only the objects, but also their underlying dependencies and relations between communities, models, capital, technical units, and more.

Objects, however, also carry an associative power that can create memories and make a story come alive. This catalogue is therefore not just a collection of the objects that make generative AI images possible, but an exploration of an imaginary of AI image creation through the collection and exhibition of objects – and in particular, an imaginary of ‘autonomy’ from mainstream capital platforms.

Pixel space --- MISSING!

Prompt --- MISSING!

Stable Horde

Horde AI or Stable Horde is a distributed cluster of GPUs. The project describes itself as a "volunteer crowd-sourced distributed cluster of image and text generation workers". This translates as a network of individual GPU users who "lend" their devices and stored models. This means that one can generate an image from any device connected to this network through an interface, e.g. a website accessed from a phone. While the visible effects are the same as using ChatGPT, Copilot or any other proprietary service, the images in this network are "community" generated: the request is not sent to a server farm or a company, but to a user who is willing to share their GPU power and stored models. Haidra, the non-profit associated with HordeAI, seeks to make AI free, open-source, and collaborative, effectively circumventing the reliance on AI big-tech players.

Projects like Stable Horde/HordeAI offer a glimpse into the possibilities of autonomy in the world of image generation, and offer other ways of volunteering through technical means. In a way, this project inherits some of the ethos of P2P sharing and recursive publics, yet updated for the world of LLMs. The GPU used in this project is (intermittently) part of the HordeAI network, generating and using the kudos currency.
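As a hedged sketch of what 'sending a request to a user rather than a server farm' looks like in practice, the snippet below submits a job to the Horde's public REST API (endpoint paths and response fields are assumptions based on the project's published API; "0000000000" is its anonymous key):

```python
# Hedged sketch of requesting an image from the Stable Horde / AI Horde
# REST API (paths and fields assumed from the project's public docs).
import time
import requests

BASE = "https://stablehorde.net/api/v2"
HEADERS = {"apikey": "0000000000"}  # the anonymous, lowest-priority key

# Submit an asynchronous generation request to the volunteer network.
r = requests.post(f"{BASE}/generate/async", headers=HEADERS,
                  json={"prompt": "a hand-drawn map of latent space"})
job_id = r.json()["id"]

# Poll until a volunteer worker has generated the image.
while True:
    status = requests.get(f"{BASE}/generate/status/{job_id}").json()
    if status.get("done"):
        print(status["generations"][0]["img"])  # URL/data for the image
        break
    time.sleep(5)
```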

Variational Autoencoder, VAE --- MISSING!