Validation of an automated contouring and treatment planning tool for pediatric craniospinal radiation therapy

Hernandez, Soleil; Burger, Hester; Nguyen, Callistus; Paulino, Arnold C.; Lucas, John T.; Faught, Austin M.; Duryea, Jack; Netherton, Tucker; Rhee, Dong Joo; Cardenas, Carlos; Howell, Rebecca; Fuentes, David; Pollard-Larkin, Julianne; Court, Laurence; Parkes, Jeannette

doi:10.3389/fonc.2023.1221792

ORIGINAL RESEARCH article

Front. Oncol., 22 September 2023

Sec. Radiation Oncology

Volume 13 - 2023 | https://doi.org/10.3389/fonc.2023.1221792

This article is part of the Research Topic Pediatric CNS Tumors in Low- and Middle-Income Countries: Expanding our Understanding

Soleil Hernandez^1,2*

Hester Burger³

Callistus Nguyen²

Jack Duryea²

Dong Joo Rhee²

Rebecca Howell^1,2

Julianne Pollard-Larkin^1,2

Laurence Court^1,2

Jeannette Parkes⁸

¹The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX, United States
²Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
³Department Medical Physics, Groote Schuur Hospital and University of Cape Town, Cape Town, South Africa
⁴Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
⁵Department of Radiation Oncology, St. Jude Children’s Research Hospital, Memphis, TN, United States
⁶Department of Radiation Oncology, University of Alabama at Birmingham, Birmingham, AL, United States
⁷Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
⁸Department of Radiation Oncology, Groote Schuur Hospital and University of Cape Town, Cape Town, South Africa

Purpose: Treatment planning for craniospinal irradiation (CSI) is complex and time-consuming, especially for resource-constrained centers. To alleviate demanding workflows, we successfully automated the pediatric CSI planning pipeline in previous work. In this work, we validated our CSI autosegmentation and autoplanning tool on a large dataset from St. Jude Children’s Research Hospital.

Methods: Sixty-three CSI patient CT scans were included in the study. Pre-planning scripts were used to automatically verify anatomical compatibility with the autoplanning tool. The autoplanning pipeline generated 15 contours and a composite CSI treatment plan for each of the compatible test patients (n=51). Plan quality was evaluated quantitatively with target coverage and dose to normal tissue metrics and qualitatively with physician review, using a 5-point Likert scale. Three pediatric radiation oncologists from 3 institutions reviewed and scored 15 contours and a corresponding composite CSI plan for the final 51 test patients. One patient was scored by all 3 physicians, resulting in 53 scored plans in total.

Results: The algorithm automatically detected 12 incompatible patients due to insufficient junction spacing or head tilt and removed them from the study. Of the 795 autosegmented contours reviewed, 97% were scored as clinically acceptable, with 92% requiring no edits. Of the 53 plans scored, all 51 brain dose distributions were scored as clinically acceptable. For the spine dose distributions, 92%, 100%, and 68% of single, extended, and multiple-field cases, respectively, were scored as clinically acceptable. In all cases (major or minor edits), the physicians noted that they would rather edit the autoplan than create a new plan.

Conclusions: We successfully validated an autoplanning pipeline on 51 patients from another institution, indicating that our algorithm is robust in its adjustment to differing patient populations. We automatically generated 15 contours and a comprehensive CSI treatment plan for each patient without physician intervention, indicating the potential for increased treatment planning efficiency and global access to high-quality radiation therapy.

Introduction

Each year, 300,000 children are diagnosed with cancer worldwide. Of these, 90% live in low- and middle-income countries (LMICs), where access to proper care may be limited by available resources (1). The 5-year survival rate for patients with pediatric cancer has increased to over 80% in high-income countries (HICs); however, this trend has not been mirrored in LMICs, where average survival rates remain as low as 20% in some countries (2). Recognizing this issue, the World Health Organization launched the Global Initiative for Childhood Cancer (GICC) program in 2018, aiming to increase global survival from pediatric cancer to 60% (3). Radiation therapy is complex and time-consuming to plan and deliver, yet it plays a critical role in managing cancer in more than 50% of pediatric patients in LMICs, and its use is expected to rise to 78% over the next 10 years (4).

Pediatric brain and CNS tumors constitute the leading cause of deaths associated with pediatric cancer worldwide (5), but even more so in LMICs, where access to diagnosis and treatment requires availability of technical and human resources (6). Medulloblastoma is the most common malignant brain tumor in children, accounting for 20–25% of pediatric malignancies in HICs, with large variations in incidence in LMICs. Patients with this diagnosis (as well as some other pediatric brain tumors) require craniospinal radiotherapy, one of the most technically demanding techniques in a radiotherapy center (7, 8).

Limited personnel create demanding workflows. For example, medical physicists dedicate up to 50% of their time to generating radiation therapy treatment plans (9). To alleviate demanding workflows and increase global access to high-quality radiation therapy, artificial intelligence has been introduced to automate various aspects of the radiation therapy treatment planning process. The Radiation Planning Assistant (RPA) planning team has developed algorithms to automate contouring, treatment planning, and quality assurance for adult disease sites, including the cervix, chest wall, spine, head and neck, and whole brain (10–15). Court et al. recently summarized how the RPA was designed alongside leaders in resource-constrained countries to address the global expertise gap in radiation oncology (16). In short, clinicians import a patient CT scan with a planning prescription into the RPA webpage. The web-based servers of the RPA then automatically generate contours and a corresponding treatment plan using internal algorithms. The contour and plan files are then sent back to the user for download. The RPA was developed with clinical acceptability and safety/risk in mind to ensure successful deployment, and increase global access to high-quality radiation therapy.

Recently, as part of the RPA project, Hernandez et al. introduced artificial intelligence into pediatric radiation oncology to facilitate autosegmentation and planning for craniospinal radiation therapy for pediatric patients with medulloblastoma (17). In addition, Hernandez et al. investigated automatically contouring postoperative GTV volumes using a pediatric dataset (18). Both studies were exclusively trained, validated, and tested on an internal pediatric dataset.

The performance of deep learning models has been shown to decrease when tested on patient populations from different hospitals, often due to heterogeneity in medical imaging techniques (19). In addition, models trained only on a single dataset may be susceptible to overfitting, which may further limit the generalizability of the model on different patient populations (20). Chen et al. reported that one of the biggest challenges of incorporating artificial intelligence–based tools into radiation oncology is the generalizability of deep learning models (21). In 2021, the FDA recognized that artificial intelligence may be biased towards the dataset it is trained on. In outlining strategies to mitigate bias in algorithm development, it was highlighted that algorithms should be tested on diverse patient cohorts to assess generalizability (22).

To evaluate the generalizability of our algorithms, we tested our CSI autocontouring and autoplanning tool, developed at our institution, on a large dataset from another institution. We recruited three pediatric radiation oncologists from three different institutions to comprehensively evaluate the performance of the autocontouring and autoplanning tool. Automating the contouring and planning workflow for pediatric CSI has the potential to increase access to high-quality radiation therapy, as time saved in treatment planning may be allocated to other clinically necessary tasks.

Methods

We tested the CSI autocontouring tool on a dataset from St. Jude Children’s Research Hospital comprising 63 full-body CSI CT scans. This study was approved by our institutional review board. The dataset was curated such that each patient had been previously treated with photons in the head-first-supine position. Of the 63 scans, 30 had been performed on Siemens machines and 33 had been performed on Philips machines. The median (range) number of slices, slice thicknesses, and tube voltage peaks were 495 (225–780), 1.5 (1–3) mm, and 120 (120–120) kVp, respectively. After evaluating the imaging parameters, all CT images were imported into the RayStation treatment planning system version 11B (RaySearch Laboratories, Stockholm, Sweden) (23).

Autocontouring

Two deep learning–based autosegmentation pipelines were employed to generate the normal tissue contours on the 63 CT scans outside of the treatment planning system. Deep learning uses a series of multi-layer neural networks to learn image features from large training datasets (image and contour pairs) and then automatically segment contours on independent test datasets (images only). To generate the contours in this study, a previously validated adult head and neck autocontouring model was first run to generate the brain, brainstem, eye, lens, and cochlea contours (24). Next, a previously validated, pediatric-specific autocontouring model was used to generate the cribriform plate, lacrimal gland, pituitary gland, thyroid, heart, lung, shoulder, mandible, spinal canal, vertebral column, and kidney contours (17). The input to both algorithms is a CT scan, and the output is a set of autocontours that may then be imported into the treatment planning system for planning.

Autoplanning

Hernandez et al. previously automated the treatment planning process for 3D-conformal pediatric craniospinal radiation therapy (17). The algorithm was written in RayStation using the Python-based API and did not use any auto-planning features native to the TPS. In summary (Figure 1), autocontours are first generated using previously trained deep learning models and are then imported into the treatment planning system. The autoplanning tool then generates 2 lateral brain fields (gantry at 90 and 270 degrees) matched to a single posterior-anterior (PA) spine field (gantry at 180 degrees), an extended spine field (120 cm SSD to couch top), or 2 matched spine fields, depending on the patient’s spinal canal length. The MLCs for the brain and spine field(s) conform to a 1 cm uniform expansion of the brain autocontour and a 1 cm lateral expansion of the spinal canal autocontour, respectively. A half-beam block is implemented on the brain field to avoid the need for couch rotations. Spine subfields are then added and iteratively weighted to optimize the spine dose distribution. Finally, feathering is implemented at each match line to yield a composite treatment plan. All beam energies are set to 6 MV. The prescription is set to deliver 23.4 Gy in 13 fractions, normalized to give 95% of the prescribed dose to 100% of the brain volume and 95% of the spinal canal volume using a 5, 5, 3 fractionation scheme. For additional details on the contouring and planning algorithms, we refer the reader to our previous work (17).
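The prescription arithmetic above can be checked with a short sketch. It is an illustration only, not the authors' RayStation script. It assumes that the 5, 5, 3 scheme refers to the number of fractions delivered at the original junction position and at each of the two junction shifts; the function and variable names are hypothetical. Doses are held in integer cGy to avoid floating-point rounding.

```python
# Illustrative sketch only; not the authors' planning code.
PRESCRIPTION_CGY = 2340        # 23.4 Gy, from the text
TOTAL_FRACTIONS = 13           # from the text
FEATHERING_SCHEME = (5, 5, 3)  # assumed: fractions per junction position


def dose_per_junction_position(prescription_cgy, total_fractions, scheme):
    """Return the dose in cGy delivered at each junction position."""
    if sum(scheme) != total_fractions:
        raise ValueError("feathering scheme must account for every fraction")
    if prescription_cgy % total_fractions != 0:
        raise ValueError("prescription does not divide evenly into fractions")
    dose_per_fraction = prescription_cgy // total_fractions  # 180 cGy here
    return [n * dose_per_fraction for n in scheme]


doses = dose_per_junction_position(
    PRESCRIPTION_CGY, TOTAL_FRACTIONS, FEATHERING_SCHEME
)
print(doses)  # [900, 900, 540] cGy, i.e. 9.0, 9.0, and 5.4 Gy
```

Under this reading, each fraction delivers 1.8 Gy and the three junction positions together account for the full 23.4 Gy.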

Figure 1. Outline of craniospinal irradiation auto-planning workflow. Normal structures and landmark structures are automatically contoured using deep learning methods. The autocontours then guide an autoplanning algorithm scripted in the treatment planning system. Auto-contours are used to automatically set isocenters and define target and prescription volumes. Fields are automatically generated and conformed to the specified targets. The dose is prescribed, and the dose to the spine field is optimized. The original plan is feathered with 2 junction shifts. Finally, a composite plan is generated. Figure reprinted from “Automating the treatment planning process for 3D-conformal pediatric craniospinal irradiation therapy,” by Hernandez et al., 2023, Pediatric Blood & Cancer, Volume 70(3), e30164. Copyright 2023 by John Wiley and Sons. Reprinted with permission.

Prior to generating a treatment plan, the CSI autoplanning algorithm automatically performs a series of checks to ensure that the patient’s anatomy is compatible with the algorithm design. First, the algorithm automatically measures the patient’s spinal canal and determines whether to implement a single, extended, or multiple spine field configuration. In addition, the algorithm quantifies the amount of space available for junction shifts and decides to implement either 1- or 0.5-cm junction spacing. The algorithm will flag the user if there is <1 cm of space between the mandible and shoulders available for feathering. These patients were omitted from final testing. Finally, the algorithm automatically checks that the patient’s anatomy will be compatible with a half-beam block on the brain field by measuring the distance between the most superior slice of the brain contour and the most inferior slice of the mandible contour. A patient with a head tilt would have a higher mandible contour, which decreases the distance between the mandible and the top of the brain relative to that of a patient who is looking straight ahead. Patients with a measured brain-to-mandible distance larger than 20 cm were removed from the final testing set.
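The two exclusion checks described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class, field, and function names are hypothetical, and the measurements are assumed to have already been derived from the autocontours. Only the two thresholds stated in the text are used. The spine field configuration and the choice between 1- and 0.5-cm junction spacing are omitted because the text does not give their decision rules.

```python
from dataclasses import dataclass

# Thresholds stated in the text.
MIN_FEATHERING_SPACE_CM = 1.0    # flag if less space between mandible and shoulders
MAX_BRAIN_TO_MANDIBLE_CM = 20.0  # patients above this distance were removed


@dataclass
class Measurements:
    """Hypothetical container for contour-derived distances, in cm."""
    feathering_space_cm: float
    brain_to_mandible_cm: float


def check_compatibility(m):
    """Return the reasons a patient is flagged; an empty list means compatible."""
    reasons = []
    if m.feathering_space_cm < MIN_FEATHERING_SPACE_CM:
        reasons.append("insufficient junction spacing")
    if m.brain_to_mandible_cm > MAX_BRAIN_TO_MANDIBLE_CM:
        reasons.append("incompatible with half-beam block")
    return reasons


# Made-up example measurements, for illustration only.
print(check_compatibility(Measurements(2.5, 17.0)))  # []
print(check_compatibility(Measurements(0.8, 21.5)))
```

Returning the list of reasons, rather than a single pass/fail flag, mirrors the way flagged cases in this study were subsequently reviewed manually.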

After removing the incompatible patients from the final testing set, we ran the autocontouring and autoplanning pipeline to generate CSI treatment plans. Plan quality was evaluated quantitatively with target coverage and dose to normal tissue metrics and qualitatively with physician review.

Quantitative plan evaluation

To quantitatively evaluate the quality of the plans, dose metrics were analyzed across the final test set of patients. Target coverage was quantified using V95% of the prescription dose (23.4 Gy) evaluated for the brain, spinal canal, and cribriform plate. Normal tissue dose was also quantified using the maximum dose to the brain, spinal canal, brainstem, cochlea, eye, lens, and optic nerve autocontours. In addition, the mean dose was reported for the cochlea, heart, kidney, lacrimal gland, lung, pituitary gland, and thyroid autocontours.
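As an illustration of the metrics named above, the sketch below computes V95%, maximum dose, and mean dose from a list of per-voxel doses for one structure. It assumes equal-volume voxels and hypothetical function names. A treatment planning system would typically report these values from its own dose-volume histogram, so this is not the method used in the study.

```python
PRESCRIPTION_GY = 23.4  # from the text


def v95_percent(voxel_doses_gy, prescription_gy=PRESCRIPTION_GY):
    """Percentage of voxels receiving at least 95% of the prescription dose."""
    if not voxel_doses_gy:
        raise ValueError("structure has no voxels")
    threshold = 0.95 * prescription_gy
    covered = sum(1 for d in voxel_doses_gy if d >= threshold)
    return 100.0 * covered / len(voxel_doses_gy)


def max_dose(voxel_doses_gy):
    """Maximum voxel dose in Gy."""
    return max(voxel_doses_gy)


def mean_dose(voxel_doses_gy):
    """Mean voxel dose in Gy, assuming equal-volume voxels."""
    return sum(voxel_doses_gy) / len(voxel_doses_gy)


# Made-up doses for a four-voxel structure, for illustration only.
example = [23.5, 23.0, 22.5, 21.0]
print(v95_percent(example))  # 75.0 (three of four voxels reach 22.23 Gy)
print(max_dose(example))     # 23.5
print(mean_dose(example))    # 22.5
```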

Qualitative plan evaluation

Physician review was used to evaluate the quality of the final autocontours and autoplans for each of the patients in the final testing cohort. Three pediatric radiation oncologists from 3 institutions (in the US and South Africa) reviewed the final test set. One patient was reviewed by all 3 physicians, resulting in a total of 53 plans for review. Each physician reviewed and scored each autocontour using a 5-point Likert scale detailed in Table 1 (25). Using the same scale, the physicians reviewed the autoplan of each patient and assigned a clinical acceptability score to the brain and spine dose distributions individually. Autocontours and autoplans scored ≥3 were considered clinically acceptable. For plans that were scored as a 2, we also asked the physician if they would prefer to create their own plan from scratch or edit the plan we presented, as the original Likert scale did not have a metric for plans that required major edits but were still clinically useful.
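The acceptability criterion above reduces to a simple threshold on the Likert score. The following sketch shows one way an acceptability rate could be tallied. The function name and the example scores are hypothetical and are not data from this study.

```python
ACCEPTABLE_MIN_SCORE = 3  # from the text: scores >= 3 are clinically acceptable


def acceptability_rate(scores):
    """Percentage of 5-point Likert scores that meet the acceptability threshold."""
    if not scores:
        raise ValueError("no scores provided")
    for s in scores:
        if s not in (1, 2, 3, 4, 5):
            raise ValueError(f"invalid Likert score: {s}")
    acceptable = sum(1 for s in scores if s >= ACCEPTABLE_MIN_SCORE)
    return 100.0 * acceptable / len(scores)


# Made-up scores, for illustration only.
print(acceptability_rate([5, 4, 3, 2, 5]))  # 80.0
```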

Table 1. 5-Point Likert scale used to evaluate autocontour and autoplan quality (25).

Results

Of the 63 patients, 51 (81%) met the autoplanning pre-processing requirements. Four patients were automatically removed for having less than 1 cm available to feather junctions, and 8 patients were removed for not being compatible with a half-beam block on the brain field. Each flagged case was manually reviewed to verify that it was not compatible with the planning algorithm. Figure 2 shows the variation in junction spacing and required spine field length measured across the dataset. A team of 3 pediatric radiation oncologists from different institutions reviewed and scored the autocontours and autoplans of the resulting 51 patients. One patient’s case was reviewed and scored by all 3 physicians (total of 53 plans scored). Physician 1 reviewed 16 plans, physician 2 reviewed 19, and physician 3 reviewed 18.
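The cohort counts reported here and in the abstract are mutually consistent, as the following short check shows. It uses only numbers stated in the text; the variable names are hypothetical.

```python
total_scans = 63
removed_junction = 4    # less than 1 cm available to feather junctions
removed_half_beam = 8   # not compatible with a half-beam block
contours_per_plan = 15  # from the Methods summary

compatible = total_scans - removed_junction - removed_half_beam
percent_compatible = round(100 * compatible / total_scans)

plans_per_physician = (16, 19, 18)
plans_scored = sum(plans_per_physician)

# One patient was reviewed by all three physicians, adding two extra reviews.
extra_reviews = plans_scored - compatible
contours_reviewed = plans_scored * contours_per_plan

print(compatible, percent_compatible)   # 51 81
print(plans_scored, extra_reviews)      # 53 2
print(contours_reviewed)                # 795
```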

Figure 2. Distribution of available junction spacing and required spine field configurations for 63 patients. The green and yellow lines correspond to having enough feathering space for a 1-cm junction. The dotted red line represents the cut-off for a 0.5-cm junction.