ORIGINAL RESEARCH article

Front. Oncol.
Sec. Surgical Oncology
Volume 14 - 2024 | doi: 10.3389/fonc.2024.1526288

Evaluating ChatGPT-4o as a Decision Support Tool in Multidisciplinary Sarcoma Tumor Boards: Heterogeneous Performance across various Specialties

Provisionally accepted
Tekoshin Ammo*, Vincent G J Guillaume, Ulf Krister Hofmann, Norma M Ulmer, Nina Buenting, Florian Laenger, Justus P Beier, Tim Leypold
  • University Hospital RWTH Aachen, Aachen, Germany

The final, formatted version of the article will be published soon.

    Since the launch of ChatGPT in 2023, large language models have attracted substantial interest for deployment in the health care sector. This study evaluates the performance of ChatGPT-4o as a support tool for decision-making in multidisciplinary sarcoma tumor boards. We created five sarcoma patient cases mimicking real-world scenarios and prompted ChatGPT-4o to issue tumor board decisions. These recommendations were independently assessed by a multidisciplinary panel consisting of an orthopedic surgeon, a plastic surgeon, a radiation oncologist, a radiologist, and a pathologist. Assessments were graded on a Likert scale from 1 (completely disagree) to 5 (completely agree) across five categories: understanding, therapy/diagnostic recommendation, aftercare recommendation, summarization, and support tool effectiveness. The mean score for ChatGPT-4o performance was 3.76, indicating moderate effectiveness. Surgical specialties received the highest ratings, with a mean score of 4.48, while diagnostic specialties (radiology/pathology) performed considerably better than the radiation oncology specialty, which performed poorly. This study provides initial insights into the use of prompt-engineered large language models as decision support tools in sarcoma tumor boards. ChatGPT-4o's recommendations for the surgical specialties performed best, while the model struggled to give valuable advice in the other tested specialties. Clinicians should understand both the advantages and limitations of this technology for effective integration into clinical practice.
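    The abstract reports mean Likert scores overall and per specialty. As a rough illustration of that kind of aggregation, the sketch below averages 1-to-5 ratings grouped by a chosen field. It is not the authors' analysis code: the record layout, the function names (`check_likert`, `mean_scores`, `overall_mean`) and every rating in `example_ratings` are invented placeholders and do not reproduce the study's data or results.

```python
from collections import defaultdict
from statistics import mean


def check_likert(score):
    """Return the score if it is an integer from 1 to 5, else raise ValueError."""
    if isinstance(score, bool) or not isinstance(score, int) or not 1 <= score <= 5:
        raise ValueError(f"Likert score must be an integer from 1 to 5, got {score!r}")
    return score


def mean_scores(ratings, field):
    """Group ratings by `field` (e.g. 'specialty') and return the mean score per group."""
    groups = defaultdict(list)
    for rating in ratings:
        groups[rating[field]].append(check_likert(rating["score"]))
    return {name: round(mean(scores), 2) for name, scores in groups.items()}


def overall_mean(ratings):
    """Mean over all individual ratings, rounded to two decimals."""
    return round(mean(check_likert(r["score"]) for r in ratings), 2)


# PLACEHOLDER ratings, invented purely for illustration (not study data).
example_ratings = [
    {"specialty": "orthopedic surgery", "case": 1, "category": "understanding", "score": 5},
    {"specialty": "orthopedic surgery", "case": 1, "category": "summarization", "score": 4},
    {"specialty": "radiation oncology", "case": 1, "category": "understanding", "score": 2},
    {"specialty": "radiation oncology", "case": 1, "category": "summarization", "score": 3},
]

if __name__ == "__main__":
    print(mean_scores(example_ratings, "specialty"))
    print(mean_scores(example_ratings, "category"))
    print(overall_mean(example_ratings))
```

    Grouping by `"category"` or `"case"` instead of `"specialty"` uses the same function. Note that averaging ordinal Likert ratings is a simplification that such summaries commonly accept.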

    Keywords: Sarcoma, Multidisciplinary Sarcoma Tumor board, artificial intelligence, Chat-GPT, Large language models, Cancer, LLM

    Received: 11 Nov 2024; Accepted: 24 Dec 2024.

    Copyright: © 2024 Ammo, Guillaume, Hofmann, Ulmer, Buenting, Laenger, Beier and Leypold. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Tekoshin Ammo, University Hospital RWTH Aachen, Aachen, Germany

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.