Abstract: Crowdsourcing services like Amazon Mechanical Turk allow researchers to pose questions to crowds of workers and quickly receive high-quality labeled responses. However, crowds drawn from the general public are not suitable for labeling sensitive and complex data sets, such as medical records, due to privacy and expertise concerns. Major challenges in building and deploying a crowdsourcing system for medical data include, but are not limited to: managing access rights to sensitive data and ensuring that data privacy controls are enforced; identifying workers with the necessary expertise to analyze complex information; and efficiently retrieving relevant information from massive data sets. In this paper, we introduce a crowdsourcing framework to support the annotation of medical data sets. We further demonstrate a workflow for crowdsourcing clinical chart reviews, including (1) the design and decomposition of research questions; (2) the architecture for storing and displaying sensitive data; and (3) the development of tools to support crowd workers in quickly analyzing information from complex data sets.
Learning Objective 1: Introduce a crowdsourcing framework for medical data sets.
Cheng Ye (Presenter)
Joseph Coco, Vanderbilt University
Bradley Malin, Vanderbilt University
Thomas Lasko, Vanderbilt University
Laurie Novak, Vanderbilt University
Joshua Denny, Vanderbilt University
Yevgeniy Vorobeychik, Vanderbilt University
Chen Hajaj, Vanderbilt University
Anna Epishova, Vanderbilt University
Henry Bogardus, Vanderbilt University
Daniel Fabbri, Vanderbilt University