AUTHOR=AbuHalimeh Ahmed 

TITLE=Improving Data Quality in Clinical Research Informatics Tools

JOURNAL=Frontiers in Big Data

VOLUME=Volume 5 - 2022

YEAR=2022

URL=https://www.frontiersin.org/journals/big-data/articles/10.3389/fdata.2022.871897

DOI=10.3389/fdata.2022.871897

ISSN=2624-909X

ABSTRACT=Maintaining and providing high-quality, reliable, and statistically sound data is a primary goal for clinical research informatics. In addition, effective data governance and management are essential to ensuring accurate data counts and validation. Establishing organization-wide standards for data quality management and governance is a crucial step of the clinical research process: it ensures consistency across all systems and tools designed primarily for cohort identification, which we refer to as de-identified data tools.
In clinical research informatics, better data quality translates into better research results and better patient care. However, achieving high-quality data standards is a major task because errors can be introduced into a system in many ways and are difficult to correct systematically.
In this paper, we describe a real-life case of assessing and improving data quality at a healthcare organization. We compare two de-identified data systems, i2b2 (i2b2tranSMART, 2021) and Epic SlicerDicer (AMIA, 2022). We discuss the data quality dimensions that are important in the clinical research informatics context, possible data quality issues between the de-identified systems, and proposed steps and rules for maintaining data quality across systems. These are intended to help data managers, information systems teams, and informaticists at a healthcare organization monitor and sustain data quality as part of their business intelligence, data governance, and data democratization processes. The quality improvement steps that we propose are generic and can be used to automate data governance to tackle various data quality problems.