Objective: The Koos grading scale is a frequently used classification system for vestibular schwannoma (VS) that accounts for extrameatal tumor dimension and compression of the brain stem. We propose an artificial intelligence (AI) pipeline to fully automate the segmentation and Koos classification of VS from MRI to improve clinical workflow and facilitate patient management.
Methods: We propose a method for Koos classification that relies not only on the available images but also on automatically generated segmentations. Artificial neural networks were trained and tested on manual tumor segmentations and ground-truth Koos grades of contrast-enhanced T1-weighted (ceT1) and high-resolution T2-weighted (hrT2) MR images from subjects with a single sporadic VS, acquired on a single scanner with a standardized protocol. The first stage of the pipeline is a convolutional neural network (CNN) that segments the VS and 7 adjacent structures. For the second stage, we propose two complementary approaches that are combined in an ensemble: the first applies a second CNN to the segmentation output to predict the Koos grade, while the second extracts handcrafted features that are passed to a Random Forest classifier. The pipeline's results were compared to those achieved by two neurosurgeons.
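To make the second stage concrete, the sketch below illustrates one plausible way to fuse the two complementary classifiers. The abstract does not specify the fusion rule, so equal-weight probability averaging is assumed here, and all names are illustrative rather than taken from the authors' code.

```python
# Hypothetical sketch of the second-stage ensemble. The fusion rule
# (simple probability averaging) and all names are assumptions; the CNN's
# softmax output is taken as a 4-vector over Koos grades I-IV.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def ensemble_koos_grade(cnn_probs: np.ndarray,
                        rf: RandomForestClassifier,
                        handcrafted_features: np.ndarray) -> int:
    """Fuse CNN and Random Forest class probabilities and return a Koos grade."""
    # Random Forest probabilities over the same 4 grades (assumes the forest
    # was trained with all 4 classes present).
    rf_probs = rf.predict_proba(handcrafted_features.reshape(1, -1))[0]
    mean_probs = (cnn_probs + rf_probs) / 2.0  # equal-weight averaging
    return int(np.argmax(mean_probs)) + 1      # index 0..3 -> grade 1..4
```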
Results: Eligible patients (n = 308) were pseudo-randomly split into 5 groups to evaluate model performance with 5-fold cross-validation. The weighted macro-averaged mean absolute error (MA-MAE), weighted macro-averaged F1 score (F1), and accuracy of the ensemble model on the test sets were MA-MAE = 0.11 ± 0.05, F1 = 89.3 ± 3.0%, and accuracy = 89.3 ± 2.9%, comparable to the average performance of the two neurosurgeons: MA-MAE = 0.11 ± 0.08, F1 = 89.1 ± 5.2%, accuracy = 88.6 ± 5.8%. Inter-rater reliability was assessed with Fleiss' generalized kappa (κ = 0.68) on all 308 cases, and the intra-rater reliabilities of annotator 1 (κ = 0.95) and annotator 2 (κ = 0.82) were calculated with the weighted kappa metric using quadratic (Fleiss-Cohen) weights on 15 randomly selected cases.
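For reference, these metrics can be computed with standard scikit-learn utilities. The sketch below uses toy labels, and "weighted macro-averaged MAE" is assumed here to mean the class-support-weighted mean of per-class MAEs.

```python
# Sketch of the reported metrics with scikit-learn; labels are toy values.
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score

def weighted_macro_mae(y_true, y_pred):
    """Per-class mean absolute error, averaged with class-support weights."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    classes, counts = np.unique(y_true, return_counts=True)
    per_class = [np.mean(np.abs(y_pred[y_true == c] - c)) for c in classes]
    return np.average(per_class, weights=counts)

y_true = [1, 2, 2, 3, 4]  # ground-truth Koos grades (toy example)
y_pred = [1, 2, 3, 3, 4]  # model predictions
print(weighted_macro_mae(y_true, y_pred))            # 0.2
print(f1_score(y_true, y_pred, average="weighted"))  # weighted macro F1
print(accuracy_score(y_true, y_pred))

# Quadratically weighted (Fleiss-Cohen) kappa for intra-rater reliability;
# Fleiss' generalized kappa for multiple ratings per case is available in
# statsmodels (statsmodels.stats.inter_rater.fleiss_kappa).
print(cohen_kappa_score([1, 2, 2, 3], [1, 2, 3, 3], weights="quadratic"))
```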
Conclusions: We developed the first AI framework to automatically classify VS according to the Koos scale. The excellent results show that the accuracy of the framework is comparable to that of neurosurgeons and may therefore facilitate management of patients with VS. The models, code, and ground truth Koos grades for a subset of publicly available images (n = 188) will be released upon publication.
Glioma is a severe type of brain tumor, and its accurate segmentation is useful for surgery planning and progression assessment. Based on different biological properties, a glioma can be divided into three partially overlapping regions of interest: whole tumor (WT), tumor core (TC), and enhancing tumor (ET). Recently, UNet has demonstrated its effectiveness in automatically segmenting brain tumors from multi-modal magnetic resonance (MR) images. In this work, instead of the network architecture, we focus on exploiting prior knowledge (brain parcellation), the training and testing strategy (joint 3D+2D), ensembling, and post-processing to improve brain tumor segmentation performance. We explore the accuracy of three UNets with different inputs, ensemble the corresponding three outputs, and apply post-processing to obtain the final segmentation. As in most existing works, the first UNet uses 3D patches of multi-modal MR images as input. The second UNet uses brain parcellation as an additional input. The third UNet takes as input 2D slices of the multi-modal MR images, the brain parcellation, and the probability maps of WT, TC, and ET obtained from the second UNet. We then sequentially unify the WT segmentation from the third UNet with the fused TC and ET segmentations from the first and second UNets to form the complete tumor segmentation. Finally, we adopt a post-processing strategy that relabels small ET regions as non-enhancing tumor to correct false-positive ET segmentations. On a publicly available challenge validation dataset (BraTS2018), the proposed segmentation pipeline yielded average Dice scores of 91.03/86.44/80.58% and average 95% Hausdorff distances of 3.76/6.73/2.51 mm for WT/TC/ET, outperforming other state-of-the-art methods. We then evaluated the proposed method on the BraTS2020 training data through five-fold cross-validation and observed similar performance. Finally, the proposed method was evaluated on 10 in-house cases, and its effectiveness was confirmed qualitatively by professional radiologists.
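As an illustration of the described post-processing, the sketch below relabels small connected ET components as non-enhancing tumor (a per-component variant; the paper may instead threshold the total ET volume). The label values follow the common BraTS convention, and the voxel threshold is an assumed placeholder rather than the value used in the paper.

```python
# Sketch of the described post-processing step: small enhancing-tumor (ET)
# components are relabeled as non-enhancing tumor core.
import numpy as np
from scipy import ndimage

ET_LABEL = 4             # enhancing tumor (BraTS convention)
NON_ENHANCING_LABEL = 1  # necrotic / non-enhancing tumor core
MIN_ET_VOXELS = 500      # hypothetical size threshold

def relabel_small_et(seg: np.ndarray) -> np.ndarray:
    """Relabel small connected ET components as non-enhancing tumor."""
    out = seg.copy()
    components, n = ndimage.label(seg == ET_LABEL)  # 3D connected components
    for i in range(1, n + 1):
        mask = components == i
        if mask.sum() < MIN_ET_VOXELS:
            out[mask] = NON_ENHANCING_LABEL  # suppress likely false-positive ET
    return out
```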
Purpose: Meningiomas are the most common type of primary brain tumor, accounting for ~30% of all brain tumors. A substantial number of these tumors are never surgically removed but rather monitored over time. Automatic and precise meningioma segmentation is, therefore, beneficial to enable reliable growth estimation and patient-specific treatment planning.
Methods: In this study, we propose the inclusion of attention mechanisms on top of a U-Net backbone: (i) Attention-gated U-Net (AGUNet) and (ii) Dual Attention U-Net (DAUNet), both using a three-dimensional (3D) magnetic resonance imaging (MRI) volume as input. Attention has the potential to leverage global context and identify relationships between features across the entire volume. To limit the spatial resolution degradation and loss of detail inherent to encoder–decoder architectures, we studied the impact of multi-scale input and deep supervision components. The proposed architectures are trainable end-to-end, and each concept can be seamlessly disabled for ablation studies.
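For intuition, the following is a minimal PyTorch sketch of an additive attention gate of the kind used in attention-gated U-Nets (Oktay et al.). Channel sizes are illustrative, the gating signal is assumed to be already resampled to the skip connection's spatial size, and this is not the authors' exact implementation.

```python
# Minimal sketch of an additive attention gate for a 3D U-Net skip connection.
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, skip_channels: int, gating_channels: int, inter_channels: int):
        super().__init__()
        self.theta = nn.Conv3d(skip_channels, inter_channels, kernel_size=1)
        self.phi = nn.Conv3d(gating_channels, inter_channels, kernel_size=1)
        self.psi = nn.Conv3d(inter_channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
        # x: skip-connection features; g: coarser gating signal. Additive
        # attention yields one coefficient per voxel that rescales x.
        attn = torch.sigmoid(self.psi(torch.relu(self.theta(x) + self.phi(g))))
        return x * attn  # suppress irrelevant regions in the skip connection

# Usage: gate = AttentionGate(64, 128, 32); gated = gate(skip, gating)
```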
Results: The validation studies were performed using five-fold cross-validation over 600 T1-weighted MRI volumes from St. Olavs Hospital, Trondheim University Hospital, Norway. Models were evaluated on segmentation, detection, and speed, and results are reported patient-wise after averaging across all folds. The best-performing architecture reached an average Dice score of 81.6% at an F1-score of 95.6%. Precision was nearly perfect at 98%, but meningiomas smaller than 3 ml were occasionally missed, resulting in an overall recall of 93%.
Conclusion: Leveraging global context from a 3D MRI volume provided the best performance, even though the native volume resolution could not be processed directly due to current GPU memory limitations. Overall, near-perfect detection was achieved for meningiomas larger than 3 ml, which is relevant for clinical use. In the future, multi-scale designs and refinement networks should be investigated further. A larger number of cases with meningiomas below 3 ml might also be needed to improve performance for the smallest tumors.