Datasheets for Datasets

From AcaWiki

Citation: Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, Kate Crawford. Datasheets for Datasets.
Internet Archive Scholar (search for fulltext): Datasheets for Datasets
Wikidata (metadata): Q60487752
Download: https://arxiv.org/abs/1803.09010

Summary

Proposes standardized metadata for datasets, motivated by concerns about how datasets are used in machine learning and drawing on documentation practices in electronics, automobiles, and medicine. The suggested metadata falls into the following categories:

  • Motivation for Dataset Creation
  • Dataset Composition
  • Data Collection Process
  • Data Preprocessing
  • Dataset Distribution
  • Dataset Maintenance
  • Legal & Ethical Considerations

The authors provide suggested metadata for two datasets that are well known in machine learning.
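
As a rough illustration of how the seven categories listed above could be recorded alongside a dataset, the sketch below models a datasheet as a small Python record. This is not the authors' format: the class name `Datasheet`, its field names, and the helpers `missing_sections` and `render` are all hypothetical, and each category is reduced to a single free-text answer purely for brevity.

```python
from dataclasses import dataclass, fields

# Hypothetical mapping from field names to the category titles in the summary.
SECTION_TITLES = {
    "motivation": "Motivation for Dataset Creation",
    "composition": "Dataset Composition",
    "collection_process": "Data Collection Process",
    "preprocessing": "Data Preprocessing",
    "distribution": "Dataset Distribution",
    "maintenance": "Dataset Maintenance",
    "legal_ethical": "Legal & Ethical Considerations",
}


@dataclass
class Datasheet:
    """One free-text answer per category (an assumption made for brevity)."""
    motivation: str = ""
    composition: str = ""
    collection_process: str = ""
    preprocessing: str = ""
    distribution: str = ""
    maintenance: str = ""
    legal_ethical: str = ""


def missing_sections(sheet: Datasheet) -> list[str]:
    """Return the titles of categories that have no answer yet."""
    return [
        SECTION_TITLES[f.name]
        for f in fields(sheet)
        if not getattr(sheet, f.name).strip()
    ]


def render(sheet: Datasheet) -> str:
    """Render the datasheet as plain text, one category per line."""
    lines = []
    for f in fields(sheet):
        answer = getattr(sheet, f.name).strip() or "(not documented)"
        lines.append(f"{SECTION_TITLES[f.name]}: {answer}")
    return "\n".join(lines)


# Placeholder content for an imaginary dataset, not taken from the paper.
sheet = Datasheet(
    motivation="Hypothetical: built to benchmark a sentiment classifier.",
    composition="Hypothetical: short product reviews with star ratings.",
)
print(render(sheet))
print("Still to document:", missing_sections(sheet))
```

A check such as `missing_sections` could be run before a dataset is released, so that undocumented categories are flagged rather than silently omitted.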