Paper Summaries
26_Winter_203

January 11, 2026 | 3 minute read

From Bias to Repair: Error as a Site of Collaboration and Negotiation in Applied Data Science Work

by Cindy Kaiying Lin and Steven J. Jackson

Text Exploration

In this paper, the authors describe two ethnographic case studies of how data errors are handled while machine-learning training datasets are built. They identify that errors enter data for a variety of socio-political reasons, and that the process and social context in which errors are introduced or managed is a valuable focus of study. Rather than working to eliminate errors, the authors argue that “artful living with errors” is a better strategy, one that produces material benefits for data scientists and their project collaborators.

The authors begin by contextualizing how machine-learning datasets are developed through labeling: data scientists generally believe that “correct” labeling produces models that better simulate the world, and labeling errors are commonly blamed for poor AI results.

The paper distinguishes two types of error: label error and generalization error. Label errors are commonly attributed to poor human annotation practices, while generalization errors occur when a model is “overfitted” to its training data and fails in contexts it was not trained on. The authors explain that label error introduces “noisy labels” into data; when annotators disagree, the conflict is typically resolved through a majority-wins style of voting, with the intent being to “construct a ground truth.” Both generalization and labeling errors are viewed as discrete problems to be solved.
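To make the majority-wins resolution concrete, here is a minimal sketch (not drawn from the paper) of how noisy labels from several annotators might be collapsed into a single “ground truth”; the tile name, labels, and tie-breaking rule are all hypothetical.

```python
# Minimal sketch: resolving "noisy labels" by majority vote across annotators.
from collections import Counter

def majority_label(annotations):
    """Return the most common label; ties fall back to the label seen first."""
    counts = Counter(annotations)
    label, _ = counts.most_common(1)[0]
    return label

# Hypothetical example: three annotators disagree about one image tile.
votes = {"tile_042": ["building", "building", "shadow"]}
ground_truth = {tile: majority_label(labels) for tile, labels in votes.items()}
print(ground_truth)  # {'tile_042': 'building'} -- the dissenting view is discarded
```

Even in this toy form, the vote shows what the authors highlight: the minority annotator’s judgment simply disappears from the resulting “ground truth.”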

Frequently, the authors observe, technologists blame laborers for poor labeling, characterizing them as lazy or careless. Social science approaches to error handling have begun to encourage “technologists to grapple with who is impacted by the failure of AI systems and how they navigate such breakdowns.”

The authors then describe the two case studies, both of which observe data labeling in the context of GIS data. Both identify the challenge of making “objective” decisions while labeling LiDAR imagery and maps, since the images are populated by messy real-world elements (such as shadows or non-idealized building profiles), and both trace how project teams responded to these challenges. Flowcharts and human policies are typically used to make the labeling process more regimented. Following these guidelines takes time, yet labelers are paid per label, so the incentive structure is at odds with working comprehensively through the assigned workflow.
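As an illustration of what such regimentation can look like, the sketch below encodes a flowchart-style labeling policy as explicit rules. It is not the teams’ actual flowchart; every field name, threshold, and category is hypothetical.

```python
# Hedged sketch of a flowchart-style labeling policy for one LiDAR/map patch.
# Field names and thresholds are invented for illustration only.

def label_lidar_patch(patch):
    """Walk a rule-based decision path and return a label for the patch."""
    if patch.get("occluded_by_shadow"):
        return "flag_for_review"          # ambiguity escalated to a human decision
    if patch.get("height_m", 0) >= 2.5 and patch.get("has_roof_edges"):
        return "building"
    if patch.get("vegetation_index", 0) > 0.6:
        return "vegetation"
    return "unclassified"                 # anything the rules do not cover

# A non-idealized case: a tall structure in shadow is escalated, not labeled.
print(label_lidar_patch({"occluded_by_shadow": True, "height_m": 3.1}))
```

Even a toy policy like this makes the paper’s point visible: the “flag_for_review” branch is where human judgment, and the time it costs a per-label-paid annotator, re-enters the supposedly objective process.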

This study shows “how errors are identified (or not identified) and whose estimations of errors are ultimately heeded and addressed.” The researchers conclude that defining and eradicating error should not be the only goal of machine-learning training; that view reinforces “epistemic, and workplace hierarchies between scientific professionals and ‘lowly’ paid technicians or annotators.” Instead, when error definition and management are treated as a collaborative form of stakeholder engagement, they become a central form of data-scientific practice. The authors “emphasize the artful ways that people collaborate and negotiate with error” and see this as a path toward “more effective, more creative, and more accountable” machine-learning work.