AUTHOR=Chapman Alec B. , Cordasco Kristina , Chassman Stephanie , Panadero Talia , Agans Dylan , Jackson Nicholas , Clair Kimberly , Nelson Richard , Montgomery Ann Elizabeth , Tsai Jack , Finley Erin , Gabrielian Sonya TITLE=Assessing longitudinal housing status using Electronic Health Record data: a comparison of natural language processing, structured data, and patient-reported history JOURNAL=Frontiers in Artificial Intelligence VOLUME=6 YEAR=2023 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2023.1187501 DOI=10.3389/frai.2023.1187501 ISSN=2624-8212 ABSTRACT=Introduction

Measuring long-term housing outcomes is important for evaluating the impacts of services for individuals with homeless experience. However, assessing long-term housing status using traditional methods is challenging. The Veterans Affairs (VA) Electronic Health Record (EHR) provides detailed data for a large population of patients with homeless experiences and contains several indicators of housing instability, including structured data elements (e.g., diagnosis codes) and free-text clinical narratives. However, the validity of each of these data elements for measuring housing stability over time is not well-studied.

Methods

We compared VA EHR indicators of housing instability, including information extracted from clinical notes using natural language processing (NLP), with patient-reported housing outcomes in a cohort of homeless-experienced Veterans.

Results

NLP achieved higher sensitivity and specificity than standard diagnosis codes for detecting episodes of unstable housing. Other structured data elements in the VA EHR showed promising performance, particularly when combined with NLP.

Discussion

Evaluation efforts and research studies assessing longitudinal housing outcomes should incorporate multiple data sources of documentation to achieve optimal performance.