AI Compendium

Mapping artificial intelligence's reliance on human work

Artificial intelligence is the latest frontier of technological development. Many actors have joined the field: from text-to-image generation to generative AI more broadly (both text-based and image-based), machine learning and deep learning algorithms are used extensively to offer predictive capabilities to a wide range of people. However, humans are often still needed to deliver on the promises these technologies make. "Humans-in-the-loop" are figures who oversee what artificial systems produce in order to assess their correctness, reliability, and overall quality. They take many shapes and forms, from faceless annotators to experts and, finally, to users. The AI Compendium maps this landscape of human workers and focuses on how they are still required to achieve automation, one of the major promises of artificial intelligence.
Composition of the queries used to collect the articles and the relative number of articles for each pairing.

The research analyzes articles extracted from arXiv.org, a popular repository of scientific papers pertaining to various domains of engineering and science, such as Computer Science, Information Theory, and the natural sciences. The articles were collected using combinations of keywords identifying terms and technologies, as shown in Figure 1.
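As a rough illustration of how keyword pairings like these could be turned into queries, the sketch below pairs every term with every technology and builds one search string per pairing. The keyword lists are hypothetical placeholders, not the ones used in the research, and restricting the search to abstracts through the arXiv API's `abs:` field prefix is an assumption made for this example.

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical keyword lists: the actual terms used in the research are not listed here.
HUMAN_TERMS = ["human-in-the-loop", "annotator"]
TECHNOLOGY_TERMS = ["machine learning", "deep learning"]

# Public arXiv API endpoint.
ARXIV_API = "http://export.arxiv.org/api/query"


def build_queries(terms, technologies):
    """Pair every term with every technology and return one search string per pairing."""
    return {
        (term, tech): f'abs:"{term}" AND abs:"{tech}"'
        for term, tech in product(terms, technologies)
    }


def query_url(search_query, max_results=100):
    """Build the request URL for one search string (no network call is made here)."""
    params = urlencode({"search_query": search_query, "max_results": max_results})
    return f"{ARXIV_API}?{params}"


queries = build_queries(HUMAN_TERMS, TECHNOLOGY_TERMS)
for pairing, search_query in queries.items():
    print(pairing, query_url(search_query))
```

Counting the results returned for each pairing would then give the relative number of articles per pairing.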

Articles collected from arXiv refer to knowledge areas. These knowledge areas appeared in the various queries used to compose the initial dataset.

While a portion of the collection tried to capture more design-centric approaches to AI, the overwhelming majority of articles falls within the engineering-centric sphere of queries. This also reflects the nature of the platform analyzed, which primarily caters to the hard sciences rather than other disciplines.

The appellatives found in the abstracts of the papers already show how varied the actors in the loop are. Irrelevant appellatives are colored in red.

Finally, humans-in-the-loop already appear to be more varied than just "humans". This framing develops from a long literature in the domain of control systems (e.g., pilots and industrial processes), but in artificial intelligence various kinds of humans are invoked to participate in the development and maintenance of AI systems. From experts to non-experts, "humans-in-the-loop" is a collective term that mobilizes different pockets of society depending on the needs of those systems.

A recurring pattern that emerges from the corpus of articles is a clear structure: a problem, a promise of a solution to that problem, the data required to fulfil the promise, and a human who intervenes. We call this structure "invocation": humans are invoked to participate in the construction of AI to solve a plethora of problems that AI promises to solve, albeit not alone.

The analysis of these invocations provides an overview of 25 different categories of human actors that are invoked in the corpus of articles. These humans are framed as humans-in-the-loop, or controllers who provide the machines with some sort of input in the form of data. Unlike in other domains of control systems theory, humans-in-the-loop here are not necessarily experts in AI: they can be experts in a specific field (such as medical applications) or non-experts in any particular field.

Moreover, the input that these humans-in-the-loop give to the machines is highly varied, but it can be categorized into four main groups: audio, text, images, and interactions with the system. Audio primarily consists of recordings; text includes written articles and textual inputs from users; images include both photographs and medical imagery; and interactions with the system include both logs of previous interactions and data collected through specific sensors, such as brain waves.
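One way to picture an invocation and its input group is as a small record type. The sketch below is only an illustration under stated assumptions: the class and field names are hypothetical, and the example records are invented to show the shape of the data, not taken from the corpus.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class InputType(Enum):
    """The four input groups described in the text."""
    AUDIO = "audio"
    TEXT = "text"
    IMAGES = "images"
    INTERACTIONS = "interactions with the system"


@dataclass(frozen=True)
class Invocation:
    """One problem / promise / data / human structure (field names are hypothetical)."""
    problem: str
    promise: str
    data: InputType
    human: str


# Invented examples for illustration only; they do not come from the corpus.
invocations = [
    Invocation("slow diagnosis", "automated screening", InputType.IMAGES, "medical expert"),
    Invocation("noisy transcripts", "better speech recognition", InputType.AUDIO, "annotator"),
    Invocation("poor recommendations", "personalized ranking", InputType.INTERACTIONS, "user"),
    Invocation("unlabeled documents", "automatic classification", InputType.TEXT, "annotator"),
]


def count_by(records, field):
    """Tally invocations by one of their fields, e.g. 'human' or 'data'."""
    return Counter(getattr(record, field) for record in records)


print(count_by(invocations, "human"))
print(count_by(invocations, "data"))
```

Tallying by the `human` field is how one could, in principle, arrive at a list of actor categories like the 25 mentioned above.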

Finally, a relevant aspect that is explored in more depth on the website concerns automation and human-AI collaboration. The presence of "users" as humans-in-the-loop signals an interest in transferring knowledge generated in this academic context to everyday situations in which AI is deemed useful. A recurring framing in the corpus is how AI will impact labor by automating either portions or the entirety of complex processes. An alternative framing is collaboration between humans and AI, where the latter operates autonomously until input from the former is required to proceed.

Corpus of articles

Title | Authors | Links | Date