Maps

From CTPwiki

Revision as of 12:20, 7 July 2025

A Map of 'Objects of Interest and Necessity' [card text - needs shortening]

There is little knowledge of what AI really looks like. Perhaps for this reason there exists an abundance of maps that, each in their own way, seek to make AI real by way of abstraction. Like any of these maps, a map of 'Objects of Interest and Necessity' is not to be confused with the actual thing - 'the map is not the territory', as the saying goes. The map presented here is an attempt to abstract the different objects that one may come across when entering the world of autonomous and decentralised AI image creation (and in particular Stable Diffusion). It can serve as a useful guide to what the objects of this world are called, how they look, and how they connect to other objects, communities or underlying infrastructures – and perhaps also as a starting point for one's own experiences. In this map, there is a fundamental distinction between three territories.

Firstly, there is the space of the user, called 'pixel space'. In this space one finds, for instance, all the images that are generated by AI from textual input ('prompts') in one of the several 'interfaces' to Stable Diffusion, but also the many other images of visual culture that serve as the starting point (as 'datasets') for generating AI models. Images are 'scraped' from the Internet or different repositories using software (such as 'Clip') that can automatically capture images and generate categories and descriptions of what the images contain.

Secondly, one would also find a 'latent space'. Image latency refers to the space in between the capture of datasets and the generation of images. That is, it refers to a purely computational or algorithmic space of 'image diffusion' models where the images in the datasets are first encoded with 'noise', and the machine then learns how to de-code them back into images.

Working with autonomous and decentralised AI image generation, one will also find a third space of objects related to visual culture. This space is not typically seen when generating images in pixel space, but is used by communities who have an interest in, for instance, creating a so-called 'LoRA' to meet specific visual requirements. They re-model the models of latent space, so to speak. Both image creations and LoRAs are often shared on designated platforms (such as 'CivitAI') that offer an infrastructure for this. Generating an image depends on Graphics Processing Units ('GPUs') that demand lots of resources. The communities therefore sometimes also choose to share their GPUs in a distributed network ('Stable Horde') where they can use 'virtual currencies' to access each other's GPUs. The communities in this sense deeply care for the material or other infrastructures that sustain their own existence, but at the same time there are often also venture capital interests in these communities and their platforms.

As such, the many objects that one would encounter in one or the other of the three 'territories' connect to many different planes. They deeply depend on (and sometimes also reconfigure) material infrastructures, capital, the automation and extraction of labour, the organisation of knowledge, politics, regulation, and much more.

Mapping 'objects of interest and necessity'

If one considers generative AI as an object, there is also a world of ‘para objects’, surrounding AI and shaping its reception and interpretation in the form of maps or diagrams of AI. They are drawn by both amateurs and professionals who need to represent processes that are otherwise sealed off in technical systems, but more generally reflect a need for abstraction – a need for conceptual models of how generative AI functions. However, as Alfred Korzybski famously put it, one should not confuse the map with the territory: the map is not how reality is, but a representation of reality.

Following on from this, mapping the objects of interest in autonomous AI image creation is not to be understood as a map of what it 'really is'. Rather, it is a map of encounters with objects; encounters that can be documented and catalogued, but also positioned in a spatial dimension – representing a 'guided tour', and an experience of what objects are called, how they look, how they connect to other objects, communities or underlying infrastructures (see also Objects of interest and necessity). Perhaps the map can even be used by others to navigate autonomous generative AI and create their own experiences. Importantly, what is particular about the map of this catalogue of objects of interest and necessity is that it purely maps autonomous and decentralised generative AI. It is therefore not to be considered a plane that necessarily reflects what happens in, for instance, OpenAI's or Google's generative AI systems. The map is, in other words, not a map of what is otherwise concealed in generative AI. In fact, we know very little of how to navigate proprietary systems, and one might speculate whether there even exists a complete map of their relations and dependencies.

Perhaps because of this lack of view, maps and cartographies are not just shaping the reception and interpretation of generative AI, but can also be regarded as objects of interest and necessity in themselves, and intrinsic parts of AI’s existence: generative AI depends on an abundance of cartography to model, shape, navigate, and also negotiate and criticise its being in the world. There seems to be an inbuilt need to 'map the territory', and the collection of cartographies and maps is therefore also what makes AI a reality – making AI real by externalising its abstraction in a map, so to speak.

A map of 'objects of interest and necessity' (autonomous AI image generation)

To enter the world of autonomous AI image generation, a map that separates the territories of ‘pixel space’ from ‘latent space’ can be useful as a starting point – that is, a map that separates the objects you see from those that cannot be seen.

In pixel space, you find a range of visible objects that a typical user would normally meet – both the images that are generated for the user from a prompt (a textual input), and the many images that are 'scraped' from the internet, social media platforms or other repositories (such as ImagiNet) to compile an annotated data set that can be used for training the image generation models (sometimes without the consent of users).

Latent space is more complicated to explain. It is, in a sense, a purely computational space that relies on models that can compress images into a latent representation (using a 'Variational Autoencoder', VAE), add noise to it, and learn how to de-code the result back into images. In this process it deeply depends on pixel space – not only on the prompt itself, but also on the dataset, and on the many descriptions and annotations of what an image contains. In other words, this 'model training' technique, called image diffusion, gives the model the ability to generate new images.
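The noising and de-noising described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Stable Diffusion's actual code: a short list of numbers stands in for an encoded image, the linear noise schedule and its values (`beta_start`, `beta_end`) are commonly used defaults chosen here for illustration, and the function names are hypothetical. In a real system a trained neural network estimates the noise; here the true noise is passed back in simply to show that the mixing step can be inverted.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return the share of signal remaining after each step.

    Assumes a linear schedule of noise amounts (betas); the values are
    illustrative defaults, not taken from the text above.
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(values, alpha_bar, rng):
    """Mix the values with Gaussian noise; return (noisy values, noise)."""
    noise = [rng.gauss(0.0, 1.0) for _ in values]
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    noisy = [signal_scale * v + noise_scale * n for v, n in zip(values, noise)]
    return noisy, noise


def remove_noise(noisy, estimated_noise, alpha_bar):
    """Invert add_noise, given an estimate of the noise that was added."""
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [(x - noise_scale * n) / signal_scale
            for x, n in zip(noisy, estimated_noise)]


rng = random.Random(0)                # fixed seed so the run is repeatable
image = [0.1, 0.5, 0.9, -0.3]         # stand-in for an encoded image
alpha_bars = make_alpha_bars(1000)
noisy, noise = add_noise(image, alpha_bars[500], rng)
recovered = remove_noise(noisy, noise, alpha_bars[500])
```

With a perfect noise estimate the original values come back exactly (up to rounding). Training a diffusion model amounts to teaching a network to make that estimate from the noisy input and the accompanying text description alone.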

A diagram of AI image generation separating 'pixel space' from 'latent space' - what you see, and what cannot be seen (by Nicolas Maleve)

Apart from pixel space and latent space, there is also a territory of objects that can be seen, but that a typical user usually does not see. For instance, in Stable Diffusion you find LAION, a non-profit organization that uses the software Clip to scrape the internet for textually annotated images to generate a free and open-source data set for training models in latent space. You would also find communities who contribute to LAION, or who refine the models of latent space using so-called LoRAs, and also models and datasets to, for instance, reconstruct missing facial or other bodily details (or correct errors such as too many fingers on one hand) – often with specialised knowledge both of the properties of the foundational diffusion models and of the visual culture they feed into (for instance manga or gaming). These communities are also organized on different platforms, such as CivitAI or Hugging Face, where they can exhibit their specialised image creations or share their LoRAs, often with the involvement of different tokens or virtual currencies. What is important to realise when navigating this territory is that behind every dataset, model, software, and platform there is also a specific community.

This map reflects the separation of pixel space from latent space, and adds a third layer of objects that are visible, but not seen by typical users. Underneath the three layers one finds a second plane of material infrastructures (such as processing power and electricity), and one can potentially also add more planes, such as for instance regulation or governance of AI (by Christian Ulrik Andersen, Nicolas Maleve, and Pablo Velasco) // NEEDS REDRAWING

A map of this territory would therefore necessarily contain technical objects (such as models, software, and platforms) that are deeply intertwined with communities that care for, negotiate, and maintain the means of their own existence (what the anthropologist Chris Kelty has labelled a 'recursive public'), but which may at the same time also be subject to value extraction. Hugging Face is a prime example of this - a community hub as well as a $4.5 billion company with investments from Amazon, IBM, Google, Intel, and many more, as well as collaborations with Meta and Amazon Web Services. This indicates dependencies not only on communities, but also on corporate collaboration and venture capital.

AI image generation is, not least, always also dependent on material infrastructures. First of all, it depends on hardware, and specifically on the GPUs that are needed both to generate images and to develop and refine the diffusion models (e.g. developing LoRAs). GPUs can be found in personal computers, but very often individuals and communities will invest in expensive GPUs with high processing capability. People who generate images or develop LoRAs with Stable Diffusion can thus have their own GPU (built into their computer or specifically acquired), but they can also benefit from a distributed network, the so-called Stable Horde, that allows them to access other people’s GPUs. This points to how autonomous AI image generation not only depends on, but often also chooses its dependencies: for example, to be dependent on the distributed resources of a community rather than a centralised resource (e.g., a platform in 'the cloud'). At this material plane, there are also other dependencies that one can choose, for instance energy. Both generating images and training models require massive energy consumption, and the question of how much and from which source (green or black) may be a key factor in choosing Stable Horde over one's own GPU, or a corporate platform. The labour and minerals that go into the production of the required hardware can be considered another dependency on this material plane.

Mapping the many planes and dependencies of AI

What is particular about the map of this catalogue of objects of interest and necessity is that it purely attempts to map autonomous and decentralised generative AI, serving as a map for a guided tour and experience of autonomous AI. However, both Hugging Face's dependency on venture capital and Stable Diffusion's dependency on hardware and infrastructure point to the fact that there are several planes that are not captured in the above map of this catalogue, but which are equally important. For instance, the EU AI Act or laws on copyright infringement, which Stable Diffusion (like any other AI ecology) will also depend on, point to a plane of governance and regulation. AI, including Stable Diffusion, also connects to and depends on the organisation of human labour and the extraction of resources. In describing the objects of interest and necessity, we attempt to describe how Stable Diffusion and autonomous AI image generation build on dependencies on these different planes, but an overview of the many planes of AI more generally can of course also be the centre of a map in itself.

There are many attempts to capture the planes of AI and how it 'stacks'. Kate Crawford's Atlas of AI is, for instance, a book that displays different maps (and also images) that link AI to 'Earth' and the exploitation of energy and minerals, or to 'Labour' and the workers who do micro tasks ('clicking' tasks) or the workers in Amazon's warehouses. Crawford's book also contains chapters on 'Data', 'Classification', 'Affect', 'State' and 'Power'.

Gertraud Koch has also drawn a map of all the different layers that she connects to "technological activity", and which would also pertain to AI. On top of a layer of technology (the 'data models and algorithms') one will find other layers that are interdependent, and which contribute to the political and technological qualities of AI. As such, the map is also meant for navigation – to identify starting points for rethinking its concepts or reimagining alternative futures (in their work, particularly in relation to a potential delinking from a colonial past, and reimagining a pluriversality of technology).

Five sets of technological activities. The layers wrap around technology as a material entity with its own agency, but the layers are permeable and interdependent. Map by Gertraud Koch (2024).

Maps of other planes of AI (the corporate landscape)

Within the many planes of AI one can find several more maps that attempt to build overviews and conceptual models of AI. For instance, entrepreneur, investor and podcast host Matt Turck has made the “ultimate annual market map of the data/AI industry”. Since 2012 he has documented the corporate landscape of AI, not just to identify key corporate actors, but also to trace developments and trends in business. Comparing the 2012 version with the most recent map from 2024, one can see how the corporate landscape moves from 'Big Data' to 'AI', and also how the division of companies dealing with infrastructure, data analytics, applications, data sources, and open source becomes more fine-grained over the years, forking out into, for instance, applications in health, finance and agriculture; or how privacy and security become of increased concern in the business of infrastructures.

The corporate landscape of Big Data in 2012, by Matt Turck and Shivon Zilis.

Critical cartography in the mapping of AI

In mapping AI there are also 'counter maps' or 'critical cartography'. Conventional world maps are built on set principles of, for instance, North facing up, and Europe at the centre. The map is therefore not just a map for navigation, but also a map of more abstract imaginaries and histories originating in colonial times, where maps took Europe as their point of departure and were an intrinsic part of the conquest of territories. In this sense, a map always also reflects hierarchies of power and control that can be inverted or exposed (for instance by turning the map upside down, letting the south be a point of departure). Counter-mapping technological territories would, following this logic, involve what the French research and design group Bureau d'Études has called "maps of contemporary political, social and economic systems that allow people to inform, reposition and empower themselves." They are maps that reveal underlying structures of social, political or economic dependencies to expose what ought to be of common interest, or the hidden grounds on which a commons rests. Félix Guattari and Gilles Deleuze's notion of 'deterritorialization' can be useful here as a way to conceptualise the practices that expose and mutate the social, material, financial, political, or other organisation of relations and dependencies. The aim is not only to destroy this 'territory' of relations and dependencies, but ultimately a 'reterritorialization' – a reconfiguration of the relations and dependencies.

Utilising the opportunities of info-graphics in mapping can be a powerful tool. At the plane of financial dependencies, one can map the corporate landscape of AI, as Matt Turck does, but one can also draw a different map that reveals how the territory of 'startups' does not compare to a geographical map of land and continents. Strikingly, the United States is double the size of Europe and Asia, whereas there are whole countries and continents that are missing (such as Russia and Africa). This map thereby not only reflects the number of startups, but also how venture capital is dependent on other planes, such as politics and the organisation of capital, or infrastructural gaps. In Africa, for instance, the AI divide is very much also a 'digital divide', as argued by AI researcher Jean-Louis Fendji.

Numbers of newly funded AI startups per country // MISSING SOURCE

Counter-mapping the organisation of relations and dependencies is also prevalent in the works of the Barcelona-based artist collective Estampa, which exposes how generative AI depends on different planes: venture capital, energy consumption, a supply chain of minerals, human labour, as well as other infrastructures, such as the internet, which is 'scraped' for images or other media (using e.g. software like Clip).

Taller Estampa, map of generative AI, 2024

Epistemic mapping of AI

Maps of AI often also address how AI functions as what Celia Lury has called an 'epistemic infrastructure'. That is, AI is an apparatus that builds on knowledge, creates knowledge, but also shapes what knowledge is and what we consider to be knowledge. To Lury, the question of 'methods' here becomes central - not as a neutral, 'objective' stance, as one typically regards good methodology in science, but as a cultural and social practice that helps articulate the questions we ask and what we consider to be a problem in the first place. When one, for instance, criticises the social, racial or other biases in generative AI (such as all doctors being white males in generative AI image creation), one is not just dealing with bias in the dataset that can be fixed with 'negative prompts' or other technical means. Rather, AI is fundamentally – in its very construction and infrastructure – based in a Eurocentric history of modernity and knowledge production. For instance, as pointed out by Rachel Adams, AI belongs to a genealogy of intelligence, and one also ought to ask whose intelligence and understanding of knowledge is modelled within the technology – and whose is left out?

There are several attempts to map this territory in the plane of knowledge production, and its many social, material, political or other relations and dependencies. Sharing many of the concerns of Lury and Adams, Vladan Joler and Matteo Pasquinelli's 'Nooscope' is a good example of this. In their understanding, AI belongs to a much longer history of knowledge instruments ('nooscopes', from the Greek skopein ‘to examine, look’ and noos ‘knowledge’) that would also include optical instruments, but which in AI is a form of knowledge magnification of patterns and statistical correlations in data. The nooscope map is an abstraction of how AI functions as "Instrument of Knowledge Extractivism". It is therefore not a map of 'intelligence' and logical reasoning, but rather of a "regime of visibility and intelligibility" whose aim is the automation of labour, and of how this aim rests on (as other capitalist extractions of value in modernity) a division of labour – between humans and technology, between for instance historical biases in the selection and labelling of data, and their formalisation in sensors, databases and metadata. The map also refers to how selection, labelling and other laborious tasks in the training of models are done by "ghost workers", thereby referring to a broader geo-politics and body-politics of AI where human labour is often done by subjects of the Global South (although they might oppose being referred to as 'ghosts').

A map of AI as an instrument of knowledge by Vladan Joler and Matteo Pasquinelli (2020)




++++

Not sure how to fit this map (a map of the process of producing a pony image) in this entry. Perhaps another entry?

Map of generating a pony image