LAION
If our tour has led us into well-funded companies such as Hugging Face or CivitAI and their attachments in the heart of venture capital, it also leads us, at the opposite end of the financial spectrum, to significant actors that operate within a minimal economy, such as Stable Horde. The Large-scale Artificial Intelligence Open Network (LAION) fits in this category. It is a non-profit organization whose ambition is to democratize AI by encouraging public education and the re-use of existing datasets and models. LAION operates on small donations, some in the form of money but most in the form of cloud compute. [1]
LAION's co-founder, Christoph Schuhmann, is the driving force behind one major object of necessity for the generative AI ecosystem: a series of datasets that outscaled the existing offer. The curatorial method for these datasets was entirely automated. It cleverly leveraged available resources such as Common Crawl and Google Colab to download text-image pairs en masse from the internet. This curatorial method differs radically from the practice of affective involvement discussed in the LoRA entry, where anime enthusiasts select images by hand from a visual domain they cherish. In the case of LAION-5B, which contains 5.85 billion images, the work of annotation is delegated to the then just-released CLIP model, tasked with verifying the relation between the downloaded images and the adjacent alt-text used as their description. The comparison is even more striking with a subsequent dataset, LAION-Aesthetics, a subset of the 5-billion-image dataset that contains images of higher aesthetic quality. This object of high interest for the burgeoning field of image generation, which was desperately looking for stylistically rich images to train algorithms, was assembled using an approach that again favoured full automation. This time the selection was handled by a custom-made model trained on CLIP embeddings to evaluate the quality of images by assigning them an aesthetic score.
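To make the two automated steps more concrete, here is a minimal sketch in Python. It is not LAION's actual code. It assumes that each image and its alt-text have already been turned into embedding vectors (toy three-dimensional vectors stand in for real CLIP embeddings), that a pair is kept when the cosine similarity of the two vectors passes a threshold, and that the aesthetic model can be approximated by a simple linear scorer. The thresholds, the weights and the names `filter_pairs` and `aesthetic_score` are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_pairs(pairs, threshold):
    """Keep only the pairs whose image and alt-text embeddings are similar enough."""
    return [
        p for p in pairs
        if cosine_similarity(p["image_emb"], p["text_emb"]) >= threshold
    ]


def aesthetic_score(image_emb, weights, bias):
    """Hypothetical linear scorer: weighted sum of the embedding plus a bias."""
    return sum(w * x for w, x in zip(weights, image_emb)) + bias


# Toy data: in practice the embeddings would come from a CLIP model.
pairs = [
    {"url": "a.jpg", "image_emb": [1.0, 0.0, 0.0], "text_emb": [0.9, 0.1, 0.0]},
    {"url": "b.jpg", "image_emb": [0.0, 1.0, 0.0], "text_emb": [1.0, 0.0, 0.0]},
    {"url": "c.jpg", "image_emb": [0.0, 0.0, 1.0], "text_emb": [0.0, 0.1, 1.0]},
]

# Step 1: drop pairs whose alt-text does not match the image (assumed threshold).
kept = filter_pairs(pairs, threshold=0.3)

# Step 2: keep only the images that the scorer rates highly (assumed weights and cutoff).
WEIGHTS = [2.0, 1.0, 0.0]
BIAS = 3.0
aesthetic_subset = [
    p for p in kept
    if aesthetic_score(p["image_emb"], WEIGHTS, BIAS) >= 4.5
]

print([p["url"] for p in kept])
print([p["url"] for p in aesthetic_subset])
```

In this toy run, the second pair is discarded because its caption points in a different direction from its image, and the third survives the first filter but falls below the aesthetic cutoff. At no point does a person look at a picture: every decision is a number compared to a threshold.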
This reliance on automation can be explained by the fact that LAION operates on a minimal budget and could not afford the cost of manually verifying and annotating a dataset of that scale. But in the case of LAION, the automation of curation did not preclude artisanal practice. It displaced it. An interview given by Schuhmann shows the ad hoc and low-tech nature of the bricolage that presided over the creation of an object that helped spark the development of image generation:
“Then in the spring of 2021, I sat down and just wrote down a huge spaghetti code in a Google Colab and then asked around on Discord who wanted to help me with it. Someone got in touch, who later turned out to be only 15 at the time. And he wrote a tracker, basically a server that manages lots of colabs, each of which gets a small job, extracts a gigabyte, and then uploads the results. At that time, the first version was still using Google Drive.” (Ibid)
“We then did a [blog post about our dataset](https://laion.ai/blog/laion-400-open-dataset/), and after less than an hour, I already had an email from the Hugging Face people wanting to support us. I had then posted on the Discord server that if we had $5,000, we could probably create a billion image-text pairs. Shortly after, someone already agreed to pay that: “If it’s so little, I’ll pay it.” At some point, it turned out that the person had his own startup in text-to-image generation, and later he became the chief engineer of Midjourney.” (Ibid)
These two fragments are worth quoting at length. In them, Schuhmann traces a line that runs from managing the limits of user accounts on Colab and Google Drive, through the informality of meeting on Discord a coder who turns out to be a teenager, to a donor who later becomes the chief engineer of a major company in the field. These anecdotes indicate how the dataset functions as an attractor for actors and projects of radically different scales and funding.
In the dataset entry, we characterized datasets as conduits through which visual culture enters a model. Examining the controversies surrounding LAION, we have to underline how these conduits problematically enable the reigning extractivism of the AI industry. Indeed, the curatorial method devised by LAION does not include seeking permission from the images' rights owners, and artists and image agencies are currently pursuing several court cases against Stability AI and others on the grounds that their use of the images contained in the LAION dataset is infringing. Here the non-profit status of the organization plays an ambiguous role. According to Schuhmann, his association benefits from an exception that the EU Data Mining directive grants to scientific research. Even if this holds for LAION itself, the same cannot be said for the other parties interested in the object. If the dataset is an object of necessity for Stability AI and Midjourney as much as for Stable Horde or the individual users generating images with their models, the images it contains are also objects of necessity for the artists who produced them. What the example of LAION reveals is that even if collections of images are sites of convergence for actors and projects of different scales and means, they are at the same time sites of divergence for those with radically different interests.