AUTHOR=Luechtefeld Thomas , Bozada Thomas , Goel Rahul , Wang Lin , Paller Channing J. TITLE=Applications for open access normalized synthesis in metastatic prostate cancer trials JOURNAL=Frontiers in Artificial Intelligence VOLUME=5 YEAR=2022 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2022.984836 DOI=10.3389/frai.2022.984836 ISSN=2624-8212 ABSTRACT=

Recent metastatic castration-resistant prostate cancer (mCRPC) clinical trials have integrated homologous recombination and DNA repair deficiency (HRD/DRD) biomarkers into eligibility criteria and secondary objectives. These trials led to the approval of some PARP inhibitors for mCRPC with HRD/DRD indications. Unfortunately, biomarker-trial outcome data can currently be found only by reviewing publications, a process that is error-prone, time-consuming, and laborious. Although prostate cancer researchers have written systematic evidence reviews (SERs) on this topic, the delay between the last literature search and publication means that an SER is often outdated before it appears. The difficulty of reusing data from previous reviews has also resulted in multiple reviews of the same trials. It would therefore be useful to create a normalized evidence base from recently published or presented biomarker-trial outcome data that can be updated quickly. We present a new approach to semi-automating the creation of normalized, open-access data tables from published clinical trials of metastatic prostate cancer using a data curation and SER platform. ClinicalTrials.gov and PubMed.gov were used to collect mCRPC clinical trial publications with HRD/DRD biomarkers. We extracted data from 13 publications covering ten trials that started before 22 April 2021. This yielded 585 hazard ratios, response rates, and duration metrics, along with 543 adverse events. Across 334 patients, we also extracted 8,180 patient-level survival and biomarker values. Data tables were populated with survival metrics, raw patient data, eligibility criteria, adverse events, and timelines. A strong association between HRD and improved PARP inhibitor response was observed repeatedly. Several use cases for the extracted data are demonstrated through analyses of trial methods, comparison of treatment hazard ratios, and association of treatments with adverse events. 
Machine learning models were also built on the combined, normalized patient data to demonstrate automated discovery of therapy/biomarker relationships. Overall, we demonstrate the value of systematically extracted and normalized data. Our code is open source and comes with simple instructions for updating the analyses as new data become available, so that even users with limited programming knowledge can rerun them. Finally, although the novel SER method presented here targets mCRPC trials, such semi-automated methods can also be implemented in other clinical trial domains to advance precision medicine.