Abstract: Introduction
Nonvalvular Atrial Fibrillation (NVAF), a type of Atrial Fibrillation (AF) and the most common type of irregular heartbeat, is estimated to affect about 5.8 million people in the United States. NVAF results in a five times greater risk of stroke; approximately 15% of strokes annually are due to NVAF. This retrospective study compared the accuracy of identifying NVAF using structured electronic health record (EHR) data (ICD-9) combined with unstructured EHR clinical notes processed through Natural Language Processing (NLP) techniques (Structured+NLP) against structured (ICD-9) data alone, with clinician review as the reference.
The retrospective study cohort included patients between the ages of 18 and 90 with a diagnosis of AF who did not have evidence of significant valvular abnormality. EHR (Allscripts) data from the UBMD faculty practice consisted of 96,681 patients who had both clinical notes and ICD-9 description fields. Following application of the exclusion criteria, electronic models for structured data, using ICD-9 criteria, and for all unstructured data, using a high-throughput phenotyping NLP system that rapidly assigns ontology terms (including SNOMED CT) to text in patient records, were applied to identify the NVAF population. Two cohorts were created: patients identified through Structured+NLP and patients identified through structured data alone. A random sample of 300 patients was selected. To create the gold standard, a chart viewer was developed for reviewing all pertinent records. Each case was reviewed independently by two clinicians and adjudicated by a third clinician when a disagreement occurred.
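The dual-review step described above, together with a chance-corrected agreement statistic for the two primary reviewers, can be sketched as follows. This is a minimal illustration, not the study's code: the function names, the toy ratings, and the choice of a percentile bootstrap (with 2000 resamples and a fixed seed) are all assumptions made for the example.

```python
import random

def adjudicate(review_a, review_b, tiebreak):
    """Gold-standard label: the shared label when the two reviewers agree,
    otherwise the third reviewer's label."""
    return [a if a == b else t for a, b, t in zip(review_a, review_b, tiebreak)]

def cohen_kappa(r1, r2):
    """Cohen's kappa for two lists of binary (0/1) ratings."""
    n = len(r1)
    p_obs = sum(a == b for a, b in zip(r1, r2)) / n
    p1, p2 = sum(r1) / n, sum(r2) / n
    p_exp = p1 * p2 + (1 - p1) * (1 - p2)
    if p_exp == 1.0:
        return 1.0  # convention chosen here for the degenerate single-label case
    return (p_obs - p_exp) / (1 - p_exp)

def bootstrap_ci(r1, r2, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for kappa, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(r1)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(cohen_kappa([r1[i] for i in idx], [r2[i] for i in idx]))
    stats.sort()
    return stats[int((alpha / 2) * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

# Hypothetical ratings for ten charts (1 = NVAF, 0 = not NVAF); not study data.
reviewer_a = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
reviewer_b = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]
reviewer_c = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]  # consulted only on disagreement

gold = adjudicate(reviewer_a, reviewer_b, reviewer_c)
kappa = cohen_kappa(reviewer_a, reviewer_b)
low, high = bootstrap_ci(reviewer_a, reviewer_b)
print(f"gold standard: {gold}")
print(f"kappa = {kappa:.3f} ({low:.3f}, {high:.3f})")
```

Agreement is computed on the two independent reviews before adjudication, so the third reviewer's labels affect only the gold standard, not kappa.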
Each NVAF outcome for the respective method was compared to the gold standard by calculating specificity, sensitivity, positive predictive value (PPV), and negative predictive value (NPV), each with a 95% exact binomial confidence interval. McNemar’s test or the exact binomial test assessed whether the two methods performed equally against the gold standard with respect to sensitivity and specificity. Gold standard inter-rater agreement was calculated by Cohen’s kappa with a 95% bootstrapped confidence interval.
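The statistics named above can be illustrated with a short, dependency-free sketch. It computes the four accuracy measures from a 2x2 table, a Clopper-Pearson (exact binomial) interval by bisection, and an exact binomial form of McNemar's test on discordant pairs. The function names and all counts are hypothetical, and whether this matches the authors' implementation is an assumption.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _solve_increasing(f, target, iters=60):
    """Bisection on [0, 1] for f(x) = target, where f is increasing in x."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes out of n trials."""
    # Lower bound: p such that P(X >= k | p) = alpha/2 (increasing in p).
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: p such that P(X <= k | p) = alpha/2; with q = 1 - p the
    # function becomes increasing in q, so the same solver applies.
    upper = 1.0 if k == n else 1 - _solve_increasing(
        lambda q: binom_cdf(k, n, 1 - q), alpha / 2)
    return lower, upper

def diagnostic_metrics(tp, fp, fn, tn):
    """Return {metric: (estimate, (ci_low, ci_high))} from a 2x2 table."""
    counts = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "PPV": (tp, tp + fp),
        "NPV": (tn, tn + fn),
    }
    return {name: (k / n, exact_ci(k, n)) for name, (k, n) in counts.items()}

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant-pair counts."""
    n = b + c
    if n == 0:
        return 1.0
    return min(1.0, 2 * binom_cdf(min(b, c), n, 0.5))

# Hypothetical counts for a 300-case sample; these are not the study's data.
for name, (est, (lo, hi)) in diagnostic_metrics(tp=40, fp=10, fn=20, tn=230).items():
    print(f"{name}: {est:.3f} ({lo:.3f}, {hi:.3f})")
print(f"McNemar exact p = {mcnemar_exact(b=3, c=18):.4f}")
```

For McNemar's test, `b` and `c` are the counts of cases on which exactly one of the two methods agrees with the gold standard; concordant cases do not enter the test.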
Table 1. Comparison of outcomes for Structured and Structured plus Unstructured data against the gold standard.


.773 (.68, .79)
.47 (.258, .65)
.91 (.87, .95)
.215 (.131, .322)
.156 (.041, .271)
1 (.979, 1)
.444 (.279, .619)
.93 (.893, .956)
1 (.713, 1)
.585 (.414, .733)

The inter-rater agreement for the gold standard was assessed for the 300 cases, each independently reviewed by two clinicians prior to adjudication, and moderate agreement was found (0.522 (0.233, 0.748), 0.58 (0.446, 0.706)). Sensitivity, specificity, PPV, NPV, and Cohen’s kappa can be found in Table 1 for each method. A McNemar test comparison of case identification showed that the Structured+NLP method’s sensitivity was superior to that of ICD-9 alone (p<0.001). An exact binomial test of whether the two methods performed equally in terms of specificity against the gold standard failed to reject the null hypothesis that the two specificities are equal (p=0.317). The positive and negative predictive values were also superior for the Structured+NLP method. Initially, of the 96,681 patients identified in the UBMD database, 2.8% (2722 cases) were identified with NVAF by the Structured+NLP method, as opposed to 1.9% (1849 cases) by Structured alone, a difference of 873 cases. This was a significant improvement in identification (p<0.001, McNemar test); the 873 additional cases account for 32.1% of the cases identified by Structured+NLP. Adjusting the case counts by the PPV of each method, this corresponds to a 36.3% improvement in the identification of true cases in this NVAF cohort when the sample results are extrapolated to the entire data set.
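The case-count arithmetic reported above can be checked directly from the stated totals. The short sketch below uses only figures given in the text; the variable names are illustrative. It shows that the reported 32.1% equals 873 of the 2722 Structured+NLP cases, whereas the same 873 cases expressed relative to the 1849 Structured-only cases would be about 47.2%.

```python
total_patients = 96_681
structured_nlp_cases = 2_722   # identified by Structured+NLP
structured_only_cases = 1_849  # identified by Structured (ICD-9) alone

extra_cases = structured_nlp_cases - structured_only_cases
rate_nlp = structured_nlp_cases / total_patients
rate_structured = structured_only_cases / total_patients

# Two ways of expressing the 873 additional cases as a percentage.
share_of_nlp_cases = extra_cases / structured_nlp_cases    # matches the reported 32.1%
relative_increase = extra_cases / structured_only_cases    # increase over Structured alone

print(f"additional cases: {extra_cases}")
print(f"identified by Structured+NLP: {rate_nlp:.1%}")
print(f"identified by Structured alone: {rate_structured:.1%}")
print(f"additional cases as share of Structured+NLP cases: {share_of_nlp_cases:.1%}")
print(f"additional cases relative to Structured alone: {relative_increase:.1%}")
```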
Discussion and Conclusion
The Structured+NLP data extraction method had a higher sensitivity than Structured data alone, allowing more true positive cases to be identified. Moreover, the extra cases identified by Structured+NLP made up 32.1% of all NVAF cases that method identified. True positive case detection improved by 36.1%, and the cost of false positive case detection was reduced by 63.7% on average, which will reduce the cost of implementation. Improved automated detection through Structured+NLP would identify NVAF patients more accurately and efficiently than structured data alone, and could also enhance the ability to recruit to clinical trials.

Learning Objective 1: To understand the incremental benefit of EHR data in clinical prediction rules when added to administrative data

Learning Objective 2 (Optional): To understand the benefit of using computers for case cohort identification


Peter Elkin (Presenter)
University at Buffalo

Sarah Mullin, University at Buffalo
Daniel Schlegel, Oswego
Christopher Crowner, University at Buffalo
Sylvester Sakilay, University at Buffalo
Shyamashree Sinha, University at Buffalo
Gary Brady, Pfizer, Inc.
Marcia Wright, Pfizer, Inc.
Kim Nolen, Pfizer, Inc.
Sashank Kaushik, University at Buffalo
Jane Zhou, University at Buffalo
Buer Song, University at Buffalo
Edwin Anand, University at Buffalo
