How Data Workers Shape Datasets: The Role of Positionality in Data Collection and Annotation for Computer Vision Journal Article

Overview

abstract

  • Data workers play a key role in the big data industry. Clients hire data workers to collect and annotate data with human identity concepts, like demographic categories or clothing items. Often, such workers are treated as computational: they are expected to quickly and objectively conduct their work, with the goal of having huge, unbiased datasets for training models. Computer vision is especially interested in fair and impartial data due to biases and unethical practices in the field. However, far from impartial, data workers imbue computer vision data with "biases" beyond correct versus incorrect answers. Data workers embed their own specific positional perspectives about identity concepts in both collection and annotation processes. Through interviews and ethnographic observations of data workers (freelance and business process outsourcing (BPO) employees), we show how worker positionality influences decisions during data work. We also show the unintended outcomes, like social biases, that occur when positionality is not explicitly attended to in client instructions. We discuss how employing a lens of positionality in data work reveals the gulfs between data worker perspectives and client expectations, which are colored by a web of positional actors beyond isolated data workers. We propose positional (il)legibility as an approach to data work that embraces the reality of positionality in classification practices and addresses the failures of positivist bias mitigation practices.

publication date

  • October 18, 2025

Date in CU Experts

  • October 29, 2025 8:38 AM

Full Author List

  • Scheuerman MK; Woodruff A; Brubaker JR

author count

  • 3

Other Profiles

Electronic International Standard Serial Number (EISSN)

  • 2573-0142

Additional Document Info

start page

  • 1

end page

  • 42

volume

  • 9

issue

  • 7