ORIGINAL RESEARCH article

Front. Med.
Sec. Pathology
Volume 11 - 2024 | doi: 10.3389/fmed.2024.1507203
This article is part of the Research Topic Artificial Intelligence-Assisted Medical Imaging Solutions for Integrating Pathology and Radiology Automated Systems - Volume II

Evaluating ChatGPT's Diagnostic Potential for Pathology Images

Provisionally accepted
Liya Ding 1, Lei Fan 1,2*, Miao Shen 1,3*, Yawen Wang 4*, Kaiqin Sheng 1*, Zijuan Zou 1, Huimin An 1*, Zhinong Jiang 1*
  • 1 Department of Pathology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
  • 2 Department of Pathology, Ninghai County Traditional Chinese Medicine Hospital, Ningbo, China
  • 3 Department of Pathology, Deqing People's Hospital, Hangzhou, China
  • 4 College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, Zhejiang Province, China

The final, formatted version of the article will be published soon.

    Background: Chat Generative Pretrained Transformer (ChatGPT) is a large language model (LLM) developed by OpenAI, known for its extensive knowledge base and interactive capabilities. These attributes make it a valuable tool in the medical field, particularly for tasks such as answering medical questions, drafting clinical notes, and optimizing the generation of radiology reports. However, maintaining accuracy in medical contexts remains the biggest challenge to employing GPT-4 in a clinical setting. This study aims to investigate the accuracy of GPT-4, which can process both text and image inputs, in generating diagnoses from pathological images.

    Methods: This study analyzed 44 histopathological images from 16 organs and 100 colorectal biopsy photomicrographs. The initial evaluation was conducted using the standard GPT-4 model in January 2024, with a subsequent re-evaluation performed in July 2024. The diagnostic accuracy of GPT-4 was assessed by comparing its outputs to a reference standard using statistical measures. Additionally, four pathologists independently reviewed the same images to compare their diagnoses with the model's outputs. Both scanned and photographed images were tested to evaluate GPT-4's generalization ability across different image types.

    Results: GPT-4 achieved an overall accuracy of 0.64 in identifying tumor images and tissue origins. For colon polyp classification, accuracy varied from 0.57 to 0.75 across subtypes. The model achieved 0.88 accuracy in distinguishing low-grade from high-grade dysplasia and 0.75 in distinguishing high-grade dysplasia from adenocarcinoma, with high sensitivity in detecting adenocarcinoma. Consistency between the initial and follow-up evaluations showed slight to moderate agreement, with Kappa values ranging from 0.204 to 0.375.

    Conclusion: GPT-4 demonstrates the ability to diagnose pathological images, showing improved performance over earlier versions. Its diagnostic accuracy in cancer is comparable to that of pathology residents. These findings suggest that GPT-4 holds promise as a supportive tool in pathology diagnostics, offering the potential to assist pathologists in routine diagnostic workflows.
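
The abstract reports accuracy, sensitivity, and Kappa values but does not say how they were computed. The following is a minimal Python sketch of the standard definitions, assuming that "Kappa" refers to Cohen's kappa between the January and July outputs. The function names and the toy labels are hypothetical illustrations and are not taken from the study's data or code.

```python
from collections import Counter


def accuracy(reference, predicted):
    """Fraction of cases where the prediction matches the reference label."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(r == p for r, p in zip(reference, predicted)) / len(reference)


def sensitivity(reference, predicted, positive):
    """True positives divided by all reference-positive cases."""
    tp = sum(r == positive and p == positive for r, p in zip(reference, predicted))
    fn = sum(r == positive and p != positive for r, p in zip(reference, predicted))
    if tp + fn == 0:
        raise ValueError("no reference-positive cases")
    return tp / (tp + fn)


def cohen_kappa(first, second):
    """Cohen's kappa: agreement between two ratings, corrected for chance."""
    if len(first) != len(second) or not first:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    c1, c2 = Counter(first), Counter(second)
    # Chance agreement from the marginal label frequencies of each rating.
    expected = sum(c1[label] * c2[label] for label in set(c1) | set(c2)) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Toy labels (hypothetical, NOT study data): A = adenocarcinoma,
# H = high-grade dysplasia.
reference = ["A", "A", "H", "H", "H"]
run_january = ["A", "A", "A", "H", "H"]
run_july = ["A", "H", "A", "H", "H"]

acc = accuracy(reference, run_january)
sens = sensitivity(reference, run_january, positive="A")
kappa = cohen_kappa(run_january, run_july)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} kappa={kappa:.3f}")
```

Kappa discounts the agreement expected by chance, so two evaluation runs can agree on most images and still yield a modest Kappa value when one label dominates.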

    Keywords: Large Language Model, ChatGPT, pathology images, Colon polyp, diagnosis

    Received: 07 Oct 2024; Accepted: 27 Dec 2024.

    Copyright: © 2024 Ding, Fan, Shen, Wang, Sheng, Zou, An and Jiang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence:
    Lei Fan, Department of Pathology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
    Miao Shen, Department of Pathology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
    Yawen Wang, College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, 310027, Zhejiang Province, China
    Kaiqin Sheng, Department of Pathology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
    Huimin An, Department of Pathology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
    Zhinong Jiang, Department of Pathology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.