Dataset: Difference between revisions

Latest revision as of 12:11, 18 September 2025

Dataset

In the context of AI image generation, a dataset is a collection of image-text pairs (and sometimes other attributes such as provenance or an aesthetic score) used to train AI models. It is an object of necessity par excellence: without a dataset, no model could see the light of day. Iconic datasets include the LAION aesthetic dataset, Artemis, ImageNet, and Common Objects in Context (COCO). These collections of images, mostly sourced from the internet, reach dizzying scales. ImageNet became famous for its 14 million images in the first decade of the century.[1] Today LAION-5B consists of 5.85 billion CLIP-filtered image-text pairs.[2]
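As an illustration of what such a collection can look like in practice, here is a minimal Python sketch of an image-text pair carrying optional attributes. The class name `ImageTextPair`, its field names, and the helper `filter_by_aesthetic` are hypothetical and chosen for this example; they do not reproduce the schema of LAION or any other dataset named above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ImageTextPair:
    """One dataset entry: an image reference plus its caption (hypothetical schema)."""
    image_url: str
    caption: str
    provenance: Optional[str] = None         # e.g. the page the image was found on
    aesthetic_score: Optional[float] = None  # assumed to be produced by a separate scoring model


def filter_by_aesthetic(pairs: List[ImageTextPair], min_score: float) -> List[ImageTextPair]:
    """Keep only the pairs that have a score at or above min_score."""
    return [
        p for p in pairs
        if p.aesthetic_score is not None and p.aesthetic_score >= min_score
    ]
```

An "aesthetic" subset, in this simplified view, is just the result of applying such a filter to a larger collection; the threshold is a curatorial decision rather than a technical necessity.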

While large models such as Stable Diffusion require large-scale datasets, various components such as LoRAs, VAEs, refiners, or upscalers can be trained with a much smaller amount of data. In practice, this means that a custom dataset is created for each of these components. As each of these datasets reflects a particular aspect of visual culture, the components trained on them function as conduits for imaginaries and world views. Image generators are not simply produced through mathematics and statistics; they are programmed by images. Programming by images is a specific curatorial practice that involves a wide range of skills, including a deep knowledge of the relevant visual domain, the ability to find the best exemplars, practical skills such as scraping, image filtering, cleaning, and cropping, and mastery of coherent classification and annotation. In our tour, we discuss two examples of curatorial practices of different scales and purposes: the creation of the LAION dataset and the art of collecting the images that are necessary to "bake the LoRA cake."[3]
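The practical side of this curatorial work (cleaning and cropping) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: records are plain dictionaries with hypothetical `url` and `caption` keys, cleaning is reduced to dropping incomplete or duplicate entries, and cropping is reduced to computing a centered square box. Real LoRA or LAION workflows involve many more steps and judgment calls.

```python
from typing import Dict, List, Tuple


def clean_records(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop records without a URL or caption, and drop repeated URLs."""
    seen = set()
    cleaned = []
    for rec in records:
        url = rec.get("url")
        # Collapse runs of whitespace in the caption; treat a missing caption as empty.
        caption = " ".join((rec.get("caption") or "").split())
        if not url or not caption or url in seen:
            continue
        seen.add(url)
        cleaned.append({"url": url, "caption": caption})
    return cleaned


def center_crop_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)
```

The box returned by `center_crop_box` uses the (left, top, right, bottom) convention common in image libraries, so it could be handed to a cropping function of one's choice. Whether a centered square is the right crop at all is exactly the kind of decision that depends on knowledge of the visual domain.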

Further, behind each dataset there is an organisation, whether of individuals, companies, researchers, or others.[4] Even for individual users, collecting and sharing a dataset often means accepting and cultivating attachments to platforms. For instance, many datasets manually assembled by individuals are made freely available on platforms like Hugging Face, alongside the large-scale ones published by companies or universities, for others to build LoRAs with or otherwise experiment with.

[1] Deng, Jia, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. “ImageNet: A Large-Scale Hierarchical Image Database.” CVPR 1 (2009): 248–55. https://doi.org/10.1109/CVPR.2009.5206848.

[2] Beaumont, Romain. “LAION-5B: A New Era of Open Large-Scale Multi-Modal Datasets.” LAION, March 31, 2022. https://laion.ai/blog/laion-5b/.

[3] knxo. “Making a LoRA Is Like Baking a Cake.” Civitai, July 10, 2024. Accessed August 18, 2025. https://civitai.com/articles/138/making-a-lora-is-like-baking-a-cake.

[4] JinsNotes. “Vision Dataset.” JinsNotes, August 1, 2024. Accessed August 26, 2025. https://jinsnotes.com/2024-08-01-vision-dataset.

@@ Line 1: / Line 1: @@
-A dataset is a collection of data used to train AI models. In AI image generation, the dataset consists at a minimum of a collection of image - text pairs. Iconic datasets include LAION aesthetic dataset, ...
+== Dataset ==
+In the context of AI image generation, a dataset is a collection of image-text pairs (and sometimes other attributes such as provenance or an aesthetic score) used to train AI models. It is an object of necessity par excellence: without a dataset, no model could see the light of day. Iconic datasets include the [[LAION]] aesthetic dataset, Artemis, ImageNet, and Common Objects in Context (COCO). These collections of images, mostly sourced from the internet, reach dizzying scales. ImageNet became famous for its 14 million images in the first decade of the century.[1] Today LAION-5B consists of 5.85 billion CLIP-filtered image-text pairs.[2]
-== What is the network that sustains this object? ==
+While large models such as Stable Diffusion require large-scale datasets, various components such as LoRAs, VAEs, refiners, or upscalers can be trained with a much smaller amount of data. In practice, this means that a custom dataset is created for each of these components. As each of these datasets reflects a particular aspect of visual culture, the components trained on them function as conduits for imaginaries and world views. Image generators are not simply produced through mathematics and statistics; they are programmed by images. Programming by images is a specific curatorial practice that involves a wide range of skills, including a deep knowledge of the relevant visual domain, the ability to find the best exemplars, practical skills such as scraping, image filtering, cleaning, and cropping, and mastery of coherent classification and annotation. In our tour, we discuss two examples of curatorial practices of different scales and purposes: the creation of the [[LAION]] dataset and the art of collecting the images that are necessary to "bake the [[LoRA]] cake."[3]
-* How does it move from person to person, person to software, to platform, what things are attached to it (visual culture)
-* Networks of attachments
-* How does it relate / sustain a collective? (human + non-human)
-== How does it evolve through time? ==
+Further, behind each dataset there is an organisation, whether of individuals, companies, researchers, or others.[4] Even for individual users, collecting and sharing a dataset often means accepting and cultivating attachments to platforms. For instance, many datasets manually assembled by individuals are made freely available on platforms like [[Hugging Face]], alongside the large-scale ones published by companies or universities, for others to build LoRAs with or otherwise experiment with.
-Evolution of the interface for these objects. Early chatgpt offered two parameters through the API: prompt and temperature. Today extremely complex object with all kinds of components and parameters. Visually what is the difference? Richness of the interface in decentralization (the more options, the better...)
-== How does it create value? Or decrease / affect value? ==
-== What is its place/role in techno cultural strategies? ==
-== How does it relate to autonomous infrastructure? ==
+[1] Deng, Jia, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. “ImageNet: A Large-Scale Hierarchical Image Database.” ''CVPR'' 1 (2009): 248–55. https://doi.org/10.1109/CVPR.2009.5206848.
+[2] Beaumont, Romain. “LAION-5B: A New Era of Open Large-Scale Multi-Modal Datasets.” ''LAION'', March 31, 2022. https://laion.ai/blog/laion-5b/.
+[3] knxo. “Making a LoRA Is Like Baking a Cake.” ''Civitai'', July 10, 2024. Accessed August 18, 2025. <nowiki>https://civitai.com/articles/138/making-a-lora-is-like-baking-a-cake</nowiki>.
+[4] JinsNotes. “Vision Dataset.” ''JinsNotes'', August 1, 2024. Accessed August 26, 2025. <nowiki>https://jinsnotes.com/2024-08-01-vision-dataset</nowiki>.
+== Guestbook ==
+<eplite id="Objects_of_interest_and_necessity_Dataset" height="600px" width="1000px" />
 [[Category:Objects of Interest and Necessity]]