The final, formatted version of the article will be published soon.
METHODS article
Front. Artif. Intell.
Sec. Machine Learning and Artificial Intelligence
Volume 7 - 2024
doi: 10.3389/frai.2024.1428716
This article is part of the Research Topic Artificial Intelligence in Bioinformatics and Genomics.
Ocular Biometry OCR: A Machine Learning Algorithm Leveraging Optical Character Recognition to Extract Intra Ocular Lens Biometry Measurements
Provisionally accepted
- 1 School of Medicine, Stanford University, Stanford, United States
- 2 Rosalind Franklin University of Medicine and Science, North Chicago, Illinois, United States
- 3 Centro Universitário de Várzea Grande, Várzea Grande, Brazil
- 4 Byers Eye Institute, Stanford Healthcare, Palo Alto, California, United States
- 5 Department of Radiology, School of Medicine, Stanford University, Stanford, California, United States
Given close relationships between ocular structure and ophthalmic disease, ocular biometry measurements (including axial length, lens thickness, anterior chamber depth, and keratometry values) may be leveraged as features in the prediction of eye diseases. However, ocular biometry measurements are often stored as PDFs rather than as structured data in electronic health records, so time-consuming and laborious manual data entry is required to use biometry data as a disease predictor. Herein, we used two separate models, PaddleOCR and Gemini, to extract eye-specific biometric measurements from 2,965 Lenstar, 104 IOL Master 500, and 3,616 IOL Master 700 optical biometry reports. For each patient eye, our text extraction pipeline, referred to as Ocular Biometry OCR, involves 1) cropping the report to the biometric data, 2) extracting the text via the optical character recognition model, 3) post-processing the metrics and values into key-value pairs, 4) correcting erroneous angles within the pairs, 5) computing the number of errors or missing values, and 6) selecting the window-specific results with the fewest errors or missing values. To ensure the models’ predictions could be put into a machine learning-ready format, artifacts were removed from categorical text data through manual modification where necessary. Performance was evaluated by scoring the PaddleOCR and Gemini results. In the absence of ground truth, higher scores indicated greater inter-model reliability, under the assumption that an equal value between models indicated an accurate result. The detection scores, measuring the number of valid values (i.e., not missing or erroneous), were Lenstar: 0.990, IOLM 500: 1.000, and IOLM 700: 0.998. The similarity scores, measuring the number of equal values, were Lenstar: 0.995, IOLM 500: 0.999, and IOLM 700: 0.999. The agreement scores, combining the detection and similarity scores, were Lenstar: 0.985, IOLM 500: 0.999, and IOLM 700: 0.998. The IOLM 500 reports were annotated for ground truth; in this case, higher scores indicated greater model-to-annotator accuracy. PaddleOCR-to-Annotator achieved scores of detection: 1.000, similarity: 0.999, and agreement: 0.999. Gemini-to-Annotator achieved scores of detection: 1.000, similarity: 1.000, and agreement: 1.000. All scores range from 0 to 1. While PaddleOCR and Gemini demonstrated high agreement, PaddleOCR offered slightly better performance upon review of the quantitative and qualitative results.
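As a rough illustration of how the three reported scores relate, a minimal sketch follows. The abstract does not give exact formulas, so the definitions below are our reading of the text, not the authors' implementation: detection as the fraction of measurement slots where both sources yield a valid value, similarity as the fraction of jointly valid slots that match exactly, and agreement as the fraction of all slots that are both valid and equal. All function and variable names are hypothetical.

```python
from typing import Optional, Sequence

Value = Optional[str]  # None marks a missing or erroneous extraction

def detection_score(a: Sequence[Value], b: Sequence[Value]) -> float:
    """Fraction of measurement slots where both sources yield a valid value
    (assumed definition; the source reports only the resulting numbers)."""
    valid = sum(x is not None and y is not None for x, y in zip(a, b))
    return valid / len(a)

def similarity_score(a: Sequence[Value], b: Sequence[Value]) -> float:
    """Fraction of jointly valid slots where the two sources agree exactly."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)

def agreement_score(a: Sequence[Value], b: Sequence[Value]) -> float:
    """Fraction of all slots that are both valid and equal; under the
    definitions above this equals detection_score * similarity_score."""
    equal_and_valid = sum(x is not None and x == y for x, y in zip(a, b))
    return equal_and_valid / len(a)
```

These assumed definitions are at least consistent with the reported figures: for Lenstar, detection 0.990 × similarity 0.995 ≈ 0.985, which matches the reported agreement score.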
Keywords: Optical character recognition, PaddleOCR, Gemini, Text extraction, Lenstar, IOL Master 500, IOL Master 700
Received: 06 May 2024; Accepted: 09 Dec 2024.
Copyright: © 2024 Salvi, Arnal, Ly, Ferreira, Wang, Langlotz, Mahajan and Ludwig. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Cassie Ann Ludwig, School of Medicine, Stanford University, Stanford, United States
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.