Stratified assessment of an FDA-cleared deep learning algorithm for automated detection and contouring of metastatic brain tumors in stereotactic radiosurgery

Abstract

Purpose

Artificial intelligence-based tools can be leveraged to improve detection and segmentation of brain metastases for stereotactic radiosurgery (SRS). VBrain by Vysioneer Inc. is a deep learning algorithm with recent FDA clearance to assist in brain tumor contouring. We aimed to assess the performance of this tool by various demographic and clinical characteristics among patients with brain metastases treated with SRS.

Materials and methods

We randomly selected 100 patients with brain metastases who underwent initial SRS on the CyberKnife from 2017 to 2020 at a single institution. Cases with resection cavities were excluded from the analysis. Computed tomography (CT) and axial T1-weighted post-contrast magnetic resonance (MR) image data were extracted for each patient and uploaded to VBrain. A brain metastasis was considered “detected” when the VBrain-predicted contours overlapped with the corresponding physician contours (“ground-truth” contours). We evaluated the performance of VBrain against ground-truth contours using the following metrics: lesion-wise Dice similarity coefficient (DSC), lesion-wise average Hausdorff distance (AVD), false positive count (FP), and lesion-wise sensitivity (%). Kruskal–Wallis tests were performed to assess the relationships between patient characteristics (sex, race, primary histology, age, and size and number of brain metastases) and performance metrics (DSC, AVD, FP count, and sensitivity).

Results

We analyzed 100 patients with 435 intact brain metastases treated with SRS. Our cohort had a median of 2 brain metastases per patient (range: 1 to 52) and a median age of 69 (range: 19 to 91), with 50% male and 50% female patients. The primary site breakdown was 56% lung, 10% melanoma, 9% breast, 8% gynecological, 5% renal, 4% gastrointestinal, 2% sarcoma, and 6% other, while the race breakdown was 60% White, 18% Asian, 3% Black/African American, 2% Native Hawaiian or other Pacific Islander, and 17% other/unknown/not reported. The median tumor size was 0.112 c.c. (range: 0.010–26.475 c.c.). We found the mean lesion-wise DSC to be 0.723, the mean lesion-wise AVD to be 7.34% of lesion size (0.704 mm), the mean FP count to be 0.72 tumors per case, and lesion-wise sensitivity to be 89.30% across all lesions. Moreover, mean sensitivity was 99.07%, 97.59%, and 96.23% for lesions with effective diameters of at least 10 mm, 7.5 mm, and 5 mm, respectively. No other significant differences in performance metrics were observed across demographic or clinical characteristic groups.

Conclusion

In this study, a commercial deep learning algorithm showed promising results in segmenting brain metastases, with 96.23% sensitivity for metastases with diameters of 5 mm or greater. As the software is an assistive AI tool, future work integrating VBrain into the clinical workflow can provide further clinical and research insights.

Introduction

Brain metastases are the most common central nervous system malignancy and affect up to 30–40% of cancer patients [1]. Stereotactic radiosurgery (SRS) is an accepted standard of care for the treatment of limited brain metastases (Brown et al. 2016). Two critical steps in planning for SRS are the identification and localization of individual brain metastases on the patient scans and the delineation of the tumor boundaries by the radiation oncologist and/or neurosurgeon. The latter process can be time-consuming and subject to a high degree of inter-observer variability, especially for small brain metastases [2,3,4].

Artificial intelligence (AI) has demonstrated promise in addressing these issues. With the goal of improving efficiency and standardization, machine learning models have recently been developed for automated detection and segmentation of metastatic brain tumors [2, 5,6,7,8,9,10,11,12]. However, the published literature thus far consists largely of technical proofs of concept in which the model is tested on small, limited sample sizes and/or is not readily deployable to the clinic.

VBrain is a deep learning (DL) algorithm patented by Vysioneer Inc. that received medical device clearance from the Food and Drug Administration (FDA) in 2021 and has been shown to significantly improve inter-reader agreement, contouring accuracy, and efficiency [13, 14]. Here, we aim to validate this tool in a heterogeneous cohort of patients treated with SRS for brain metastases at a single institution, as well as to provide guidance on the scope of its use.

Methods

Retrospective patient cohort

We obtained approval from the Stanford University institutional research ethics board to conduct this study. Our institution has extensive experience with SRS for brain metastases, as previously described [15]. We included 100 randomly selected patients with unresected brain metastases treated with SRS at our institution from 2017 to 2020. Patients who had prior intracranial resection or intracranial radiation were excluded.

Deep learning-based algorithm

VBrain is a commercial, FDA-cleared DL-based algorithm that uses magnetic resonance imaging (MRI) and computed tomography (CT) to segment brain metastases. VBrain adopts an ensemble strategy to optimize the segmentation results: a 3D U-Net addresses overall tumor segmentation with high specificity, while the DeepMedic model focuses on smaller lesions with high sensitivity [14,15,16]. The network was trained with a novel volume-aware Dice loss function, which uses information about lesion size to enhance sensitivity for small lesions [17].
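The published description states only that the loss weights lesions by size to boost sensitivity for small lesions [17]; the exact formulation used by VBrain is not reproduced here. A minimal sketch of one plausible variant, a soft Dice loss with inverse-volume weighting of each ground-truth lesion, is shown below (the function name and weighting scheme are our own illustration, not the vendor's implementation):

```python
import numpy as np

def volume_aware_dice_loss(pred, target, lesion_labels, eps=1e-6):
    """Soft Dice loss in which each ground-truth lesion's voxels are weighted
    inversely to that lesion's volume, so small lesions contribute as much to
    the loss as large ones. Illustrative sketch only, not VBrain's code.

    pred          : float array in [0, 1], predicted tumor probability per voxel
    target        : binary array, ground-truth tumor mask
    lesion_labels : int array, connected-component label per voxel (0 = background)
    """
    weights = np.ones(target.shape, dtype=float)
    for lesion_id in np.unique(lesion_labels):
        if lesion_id == 0:
            continue                           # skip background
        voxels = lesion_labels == lesion_id
        weights[voxels] = 1.0 / voxels.sum()   # inverse-volume weighting
    intersection = np.sum(weights * pred * target)
    denominator = np.sum(weights * pred) + np.sum(weights * target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)
```

In practice such a loss would be written against the training framework's tensors and combined with the detection-oriented components of the ensemble; the NumPy version above is only meant to show how lesion-size information can enter the objective.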

Workflow for automatic detection and segmentation

For each patient, three sets of Digital Imaging and Communications in Medicine (DICOM) files used during SRS planning were exported from our institutional CyberKnife and/or Picture Archiving and Communication System: (1) the CT scan, (2) the axial T1-weighted post-contrast fast spoiled gradient echo MR scan, and (3) the Radiotherapy Structure Set (RTSS). The files were stripped of the protected health information contained in the DICOM headers using a custom script and relabeled using a unique study ID. The anonymized CT and MR scans for each patient were processed by the VBrain software to generate an RTSS with automatically identified and contoured brain metastases.
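The de-identification script itself is not part of the published work. As a rough sketch of the kind of header scrubbing described, assuming pydicom and a hypothetical list of identifying attributes, the step might look like this:

```python
import pydicom

# Hypothetical list of identifying DICOM attributes to blank out; the actual
# institutional script and its tag list are not published.
PHI_ATTRIBUTES = [
    "PatientName", "PatientID", "PatientBirthDate", "PatientAddress",
    "ReferringPhysicianName", "InstitutionName", "AccessionNumber",
]

def anonymize_dicom(path_in, path_out, study_id):
    ds = pydicom.dcmread(path_in)
    for attr in PHI_ATTRIBUTES:
        if attr in ds:
            setattr(ds, attr, "")          # blank the identifying value
    ds.PatientID = study_id                # relabel with the unique study ID
    ds.PatientName = study_id
    ds.remove_private_tags()               # drop vendor-specific private elements
    ds.save_as(path_out)
```

A complete de-identification pipeline would also need to handle dates, UIDs, and any burned-in annotations, which are omitted here for brevity.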

Evaluation

Subsequent analyses compared the two RTSSs: the output contours from VBrain against the physician-defined contours used for SRS. A brain metastasis was considered “detected” when the VBrain-predicted contours overlapped with the corresponding physician contours (“ground-truth” contours). We evaluated the performance of the predicted contours against the ground-truth contours using the following metrics: lesion-wise Dice similarity coefficient (DSC), lesion-wise average Hausdorff distance (AVD), false positive (FP) count, and lesion-wise sensitivity (%).

The lesion-wise DSC was evaluated only for detected lesions, defined as ground-truth lesions that contained within them the centroid of a predicted lesion. FPs were defined as predicted regions that did not overlap with any ground-truth lesion. Lesion-wise sensitivity was defined as the ratio of the number of lesions detected by VBrain to the total number of ground-truth lesions. Because of the small tumor sizes in the cohort, we also reported lesion-wise sensitivities for lesions with effective diameters of at least 10 mm, 7.5 mm, and 5 mm, where the effective diameter was defined as the diameter of a volume-equivalent sphere.
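To make the detection rule and the effective-diameter definition concrete, the following is a minimal sketch using SciPy's connected-component tools; it is our reconstruction of the stated definitions, not the evaluation code used in the study:

```python
import numpy as np
from scipy import ndimage

def lesion_wise_sensitivity(gt_mask, pred_mask):
    """A ground-truth lesion counts as detected when the centroid of some
    predicted lesion falls inside it (sketch of the rule described above)."""
    gt_labels, n_gt = ndimage.label(gt_mask)
    pred_labels, n_pred = ndimage.label(pred_mask)
    centroids = ndimage.center_of_mass(pred_mask, pred_labels,
                                       range(1, n_pred + 1))
    detected = set()
    for centroid in centroids:
        idx = tuple(int(round(c)) for c in centroid)
        gt_id = gt_labels[idx]
        if gt_id > 0:
            detected.add(gt_id)            # this ground-truth lesion is detected
    return len(detected) / n_gt if n_gt else float("nan")

def effective_diameter_mm(volume_cc):
    """Diameter of a sphere with the same volume (1 c.c. = 1000 mm^3)."""
    return 2.0 * (3.0 * volume_cc * 1000.0 / (4.0 * np.pi)) ** (1.0 / 3.0)
```

Under this definition, the cohort's median lesion volume of 0.112 c.c. corresponds to an effective diameter of roughly 6 mm.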

The patient cohort was stratified by demographics (age, sex, race) and clinical characteristics (histology type, lesion count, and lesion size) to identify whether significant differences in performance existed in certain groups. Kruskal–Wallis tests were performed to assess the relationships between patient characteristics (including sex, race, histology type, age, and size and number of brain metastases) and performance metrics (including mean lesion-wise DSC, mean lesion-wise AVD, mean FP count, and lesion-wise sensitivity). All tests used a significance threshold of p < 0.05 unless stated otherwise. All statistical analyses were conducted using the SciPy v1.5.2 package in Python 3.8.7.
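For example, a Kruskal–Wallis test comparing lesion-wise DSC across primary histology groups can be run with scipy.stats.kruskal as follows; the group names and values below are placeholders for illustration only:

```python
from scipy import stats

# Illustrative lesion-wise DSC values grouped by primary histology
# (placeholder numbers, not the study data).
dsc_by_histology = {
    "lung":     [0.71, 0.80, 0.65, 0.74],
    "melanoma": [0.69, 0.77, 0.72],
    "breast":   [0.75, 0.68, 0.79],
}

h_stat, p_value = stats.kruskal(*dsc_by_histology.values())
if p_value < 0.05:
    print(f"DSC differs across histologies (H = {h_stat:.2f}, p = {p_value:.3f})")
else:
    print(f"No significant difference detected (p = {p_value:.3f})")
```

The same call applies to each combination of characteristic and metric reported in Table 4, with the groups defined by the corresponding stratification variable.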

Results

Patient demographics

We analyzed 100 patients with 435 intact brain metastases treated with SRS at our institution. Demographic characteristics for our patient cohort are summarized in Table 1. The median number of brain metastases per patient was 2 (range: 1 to 52), and the median tumor size was 0.112 c.c. The most common primary histologies were lung (56%), melanoma (10%), and breast (9%).

Table 1 Demographic and clinical characteristics of the cohort of 100 patients with 435 brain metastases

Overall performance and stratified assessment

Comparison metrics evaluating the performance of VBrain against clinical ground-truth contours for all brain metastases are described in Table 2. We found the mean lesion-wise DSC to be 0.723, the mean lesion-wise AVD to be 7.34% of lesion size (0.704 mm), the mean FP count to be 0.72 tumors per case, and lesion-wise sensitivity to be 89.30% for all lesions and 96.23% for lesions with effective diameters of at least 5 mm. Furthermore, sensitivity was 85.37% for patient cases with one or two metastases and 90.23% for cases with three or more metastases.

Table 2 Performance metrics

As shown in Table 3, sensitivity was 99.07%, 94.83%, and 93.94% for lesions with effective diameters of at least 10 mm, between 7.5 and 10 mm, and between 5 and 7.5 mm, respectively. The size of the brain metastases was significantly associated with lesion-wise DSC (p < 0.001) and sensitivity (p < 0.001), and the number of brain metastases per patient was significantly associated with sensitivity (p < 0.05; Table 4).

Table 3 Lesion-wise sensitivity by effective diameter of brain metastases
Table 4 Kruskal–Wallis tests of the relationships between patient and lesion characteristics and performance metrics

Figures 1 and 2 illustrate cases in which VBrain effectively predicted brain metastases in patients with numerous lesions (52) and with very small lesions (2.5 mm and 4.2 mm in diameter). Figure 3 demonstrates challenging cases with tiny lesions, poor image quality, or insufficient contrast in the MR scan, for which diagnostic reports and/or longitudinal images might be required for additional reference. No other statistically significant differences in performance metrics were observed across demographic and clinical characteristic groups.

Fig. 1

Case with 52 Lesions. a Axial view. b 3D view. VBrain successfully predicted multiple brain metastases for this patient case with over 50 brain metastases; this case had a Dice similarity coefficient (DSC) of 0.813, average Hausdorff distance (AVD) of 3.81% (0.511 mm), false positive count (FP) of 0, and sensitivity of 90% and 100% for all tumors and tumors ≥ 5 mm, respectively

Fig. 2

Case with Tiny Lesions. As highlighted by the bounding box, VBrain successfully contoured brain metastases with diameters of 2.5 mm (a) and 4.2 mm (b). This case had a Dice similarity coefficient (DSC) of 0.944, average Hausdorff distance (AVD) of 1.78% (0.828 mm), false positive count (FP) of 1, and sensitivity of 100% for both all tumors and tumors ≥ 5 mm

Fig. 3

Challenging Cases: a Tiny Lesion (0.02 c.c.). b Image Artifacts. c Insufficient Contrast. In these cases, the diagnostic report and longitudinal images may be required for additional reference

Discussion

Our analysis included 435 brain metastases in 100 randomly selected patients who were treated with SRS at our institution. This analysis includes substantially more, and smaller, brain metastases than other published series evaluating brain metastasis segmentation algorithms [6]. The median tumor size in our study was 0.112 c.c., which is 5–10 times smaller than in other cohorts [9, 18]. Smaller lesions are more challenging both to detect and to segment [19]. However, with improvements in imaging and treatment capabilities, increasingly small lesions are now being treated with radiation. Thus, it is critical to evaluate the performance of available auto-segmentation software for these lesions. Further, many of the previous papers used their cohorts for both training and validation, whereas our study used the entire cohort to perform external validation of VBrain. The primary cancer site distribution of our study cohort is representative of the general population with brain metastases, which includes mostly lung (40–50%), breast (15–25%), and skin (5–20%) primaries [20].

Both DSC and sensitivity were found to be significantly associated with the size of the brain metastases. Lesion-wise sensitivity reached 99.07% for tumors with effective diameters of 10 mm or greater but decreased to 97.59% and 96.23% for lesions of at least 7.5 mm and at least 5 mm, respectively. Furthermore, sensitivity was significantly associated with the number of brain metastases per patient. There were no other significant associations between patient characteristics and VBrain performance metrics.

There are some limitations to this study. First, these patients were treated at a single academic institution with extensive radiosurgical experience that treats, on average, more numerous and smaller brain metastases, which may limit generalizability. Smaller intracranial lesions are difficult to identify and contour, a common challenge with any segmentation method, manual or automated [19]. Thus, VBrain’s performance in this study may underestimate its overall performance in a general patient population. Second, we excluded patients with prior intracranial radiation or surgical resection. Although these patients represent a minority of radiosurgical cases, further work will be needed to evaluate VBrain’s ability to differentiate between resection cavities, pre-treated lesions, and untreated lesions. Finally, thin-slice contrast-enhanced 3 T brain MRI scans should generally be used for SRS contouring [21]; such scans were available for 98% of the patient cases in this study.

VBrain is a clinic-ready, FDA-cleared AI software intended to assist trained medical professionals by providing initial brain metastasis contours. In a prior reader study evaluating five brain metastasis cases, VBrain assistance significantly improved inter-reader agreement, contouring accuracy, and efficiency, and clinicians detected 12% more lesions than they would have without the software [14]. Although VBrain has been shown to identify brain metastases missed by physicians and to reduce contouring time, its FDA-cleared intended use is assistive: the tool cannot replace the expertise of the treating physician, who must review and modify the final treatment contours.

A future avenue of exploration for VBrain and other tumor auto-segmentation tools is their potential for research applications. For example, these tools could enable automated tracking of brain metastases over serial MRIs to evaluate response to novel treatments and to inform real-time clinical decision making. As advances in imaging and treatment-delivery capabilities enable the detection and treatment of increasingly complex cases of brain metastases, work is ongoing to develop and improve AI tools to assist in SRS treatment planning.

Availability of data and materials

The data used in this study are not publicly available due to patient health privacy restrictions. However, anonymized data may be available from the authors upon reasonable request.

References

  1. Kotecha R, Gondi V, Ahluwalia MS, Brastianos PK, Mehta MP. Recent advances in managing brain metastasis. F1000Res. 2018;7:1772. https://doi.org/10.12688/f1000research.15903.1.

  2. Tong E, McCullagh KL, Iv M. Advanced imaging of brain metastases: from augmenting visualization and improving diagnosis to evaluating treatment response. Front Neurol. 2020;11:270. https://doi.org/10.3389/fneur.2020.00270.

  3. Growcott S, Dembrey T, Patel R, Eaton D, Cameron A. Inter-observer variability in target volume delineations of benign and metastatic brain tumours for stereotactic radiosurgery: results of a national quality assurance programme. Clin Oncol. 2020;32(1):13–25. https://doi.org/10.1016/j.clon.2019.06.015.

  4. Stanley J, Dunscombe P, Lau H, et al. The effect of contouring variability on dosimetric parameters for brain metastases treated with stereotactic radiosurgery. Int J Radiat Oncol Biol Phys. 2013;87(5):924–31. https://doi.org/10.1016/j.ijrobp.2013.09.013.

  5. Liu Y, Stojadinovic S, Hrycushko B, et al. A deep convolutional neural network-based automatic delineation strategy for multiple brain metastases stereotactic radiosurgery. PLoS One. 2017;12(10):e0185844. https://doi.org/10.1371/journal.pone.0185844.

  6. Charron O, Lallement A, Jarnet D, Noblet V, Clavier JB, Meyer P. Automatic detection and segmentation of brain metastases on multimodal MR images with a deep convolutional neural network. Comput Biol Med. 2018;95:43–54. https://doi.org/10.1016/j.compbiomed.2018.02.004.

  7. Cao Y, Vassantachart A, Jason CY, Yu C, Ruan D, Sheng K, Lao Y, Shen ZL, Balik S, Bian S, Zada G. Automatic detection and segmentation of multiple brain metastases on magnetic resonance image using asymmetric UNet architecture. Phys Med Biology. 2021;66(1):015003. https://doi.org/10.1088/1361-6560/abca53.

  8. Liu Y, Stojadinovic S, Hrycushko B, et al. Automatic metastatic brain tumor segmentation for stereotactic radiosurgery applications. Phys Med Biol. 2016;61(24):8440–61. https://doi.org/10.1088/0031-9155/61/24/8440.

  9. Bousabarah K, Ruge M, Brand JS, et al. Deep convolutional neural networks for automated segmentation of brain metastases trained on clinical data. Radiat Oncol. 2020;15(1):87. https://doi.org/10.1186/s13014-020-01514-6.

  10. Grøvik E, Yi D, Iv M, Tong E, Rubin D, Zaharchuk G. Deep learning enables automatic detection and segmentation of brain metastases on multisequence MRI. J Magn Reson Imaging. 2020;51(1):175–82. https://doi.org/10.1002/jmri.26766.

  11. Yang Z, Chen M, Kazemimoghadam M, et al. Deep-learning and radiomics ensemble classifier for false positive reduction in brain metastases segmentation. Phys Med Biol. 2022;67(2):025004. https://doi.org/10.1088/1361-6560/ac4667.

  12. Yi D, Grøvik E, Tong E, et al. MRI pulse sequence integration for deep-learning-based brain metastases segmentation. Med Phys. 2021;48(10):6020–35. https://doi.org/10.1002/mp.15136.

  13. Wang JY, Sandhu N, Mendoza M, et al. RADI-12. Deep learning for automatic detection and contouring of metastatic brain tumors in stereotactic radiosurgery: a retrospective analysis with an FDA-cleared software algorithm. Neuro-Oncol Adv. 2021;3(Supplement_3):iii20–iii20. https://doi.org/10.1093/noajnl/vdab071.082.

  14. Lu SL, Xiao FR, Cheng JCH, et al. Randomized multi-reader evaluation of automated detection and segmentation of brain tumors in stereotactic radiosurgery with deep neural networks. Neuro Oncol. 2021;23(9):1560–8. https://doi.org/10.1093/neuonc/noab071.

  15. Fatima N, Meola A, Ding VY, et al. The Stanford stereotactic radiosurgery experience on 7000 patients over 2 decades (1999–2018): looking far beyond the scalpel. J Neurosurg. 2021;135(6):1725–41. https://doi.org/10.3171/2020.9.JNS201484.

  16. Lu SL, Hu SY, Weng WH, et al. Automated detection and segmentation of brain metastases in stereotactic radiosurgery using three-dimensional deep neural networks. Int J Radiat Oncol Biol Phys. 2019;105(1):S69–70. https://doi.org/10.1016/j.ijrobp.2019.06.521.

  17. Hu SY, Weng WH, Lu SL, et al. Multimodal volume-aware detection and segmentation for brain metastases radiosurgery. In: Nguyen D, Xing L, Jiang S, editors., et al., Artificial Intelligence in Radiation Therapy, vol. 11850. Cham: Springer International Publishing; 2019. p. 61–9. https://doi.org/10.1007/978-3-030-32486-5_8.

  18. Stankiewicz M, Tomasik B, Blamek S. A new prognostic score for predicting survival in patients treated with robotic stereotactic radiotherapy for brain metastases. Sci Rep. 2021;11(1):20347. https://doi.org/10.1038/s41598-021-98847-3.

  19. Bauknecht HC, Romano VC, Rogalla P, Klingebiel R, Wolf C, Bornemann L, Hamm B, Hein PA. Intra-and interobserver variability of linear and volumetric measurements of brain metastases using contrast-enhanced magnetic resonance imaging. Investig Radiol. 2010;45(1):49–56. https://doi.org/10.1097/RLI.0b013e3181c02ed5.

  20. Khan M, Arooj S, Li R, et al. Tumor primary site and histology subtypes role in radiotherapeutic management of brain metastases. Front Oncol. 2020;10:781. https://doi.org/10.3389/fonc.2020.00781.

  21. Kaufmann TJ, Smits M, Boxerman J, et al. Consensus recommendations for a standardized brain tumor imaging protocol for clinical trials in brain metastases. Neuro Oncol. 2020;22(6):757–72. https://doi.org/10.1093/neuonc/noaa030.

Acknowledgements

We would like to acknowledge the radiation oncology staff at Stanford Cancer Center for their support of and/or participation in this study.

Funding

Not applicable.

Author information

Contributions

ELP conceived, designed, and directed the study. JYW and VQ co-wrote the manuscript, conducted data acquisition, and performed data analyses. CH contributed to manuscript writing and reviewed data quality. NS, MGM, NP, LW, NK, MFG, and SGS conducted data acquisition. YCC, CHL, and JTL performed data analyses. All authors reviewed the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Erqi L. Pollom.

Ethics declarations

Ethical approval and consent to participate

Approval to conduct this study was obtained from the Stanford University institutional research ethics board. A waiver of patient consent for use of the patient data in this study was also approved.

Consent for publication

Not applicable.

Competing interests

No financial support was provided for this study. YCC, CHL, and JTL have been granted a US patent for Vysioneer software and own shares of Vysioneer Inc. All other authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Wang, JY., Qu, V., Hui, C. et al. Stratified assessment of an FDA-cleared deep learning algorithm for automated detection and contouring of metastatic brain tumors in stereotactic radiosurgery. Radiat Oncol 18, 61 (2023). https://doi.org/10.1186/s13014-023-02246-z
