Description

Abstract: In this study, we compare and evaluate the ability of forty-five machine learning models to distinguish stroke patients from patients with other forms of cerebrovascular disease. By combining multiple case definitions, control definitions, and classifier types, we determined that, for this phenotyping question, a manually curated set of stroke cases did not perform better than a set mined from billing codes, and that the type of control was the most important factor in classifier success.

Learning Objective 1: Understand that careful selection and comparison of cases, controls, and machine learning model types are essential for successful EHR phenotyping.

Authors:

Phyllis Thangaraj (Presenter), Columbia University
Joseph Romano, Columbia University
Fernanda Polubriaginof, Columbia University
Nicholas Giangreco, Columbia University
Mitchell Elkind, Columbia University
Nicholas Tatonetti, Columbia University
