Introduction

NF2-related schwannomatosis (previously known as neurofibromatosis type 2; NF2-SWN) is a rare, autosomal dominant disorder caused by mutations in the NF2 gene on chromosome 22q12. It is characterized by tumors of the nervous system, such as vestibular schwannomas, spinal meningiomas, and peripheral nerve tumors1. Patients with vestibular schwannomas (VS) suffer from hearing loss, tinnitus, facial palsy, and a reduced life expectancy2.

Radiologic techniques are vital in the diagnosis of NF2-SWN, and regular screening is indicated1,3. Magnetic resonance imaging (MRI) has replaced computed tomography (CT) as the gold-standard imaging modality for diagnosing and monitoring vestibular schwannomas due to its high sensitivity and specificity4,5. Recent advancements, such as gadolinium-enhanced T1-weighted (GdT1W) and high-resolution T2-weighted (T2W) imaging, have further improved tumor visualization by enabling clear delineation of tumor contrast enhancement and peritumoral characteristics6.

T1-weighted MRI is particularly effective in delineating tumor shape, size, and location, and in reflecting mass effects1,7. VS tumors enhance significantly after intravenous gadolinium contrast8. Contrast-enhanced T1-weighted MRI combined with T2-weighted or fluid-attenuated inversion recovery (FLAIR) sequences can distinguish peritumoral cysts and edema, which enhance heterogeneously9.

Given the irregular growth patterns of VS tumors, higher-resolution MRIs with small voxel sizes are optimal for capturing morphologic detail and allowing accurate 3D volumetric segmentation and modeling. Dombi et al. found that slice thickness should be less than 1 mm10. Volumetric imaging using such parameters has been critical for detecting subtle changes in tumor growth, which traditional 2D imaging might overlook6.

Linear and volumetric analysis are the current methods of measuring tumor size. In linear analysis, unidimensional or bidimensional measurements of the largest tumor diameter are assessed on axial or coronal MRI views10,11. Numerous factors can decrease the sensitivity of linear measurements, including patient positioning, oblique tumor orientation, irregular tumor shape, and high levels of observer variation11,12. The limitations of linear measurements are exacerbated in cases of asymmetric tumor growth, which may not be accurately captured by single-plane assessments6.

Volumetric analysis integrates an additional dimension of measurement. By approximating the tumor as an ellipsoid from measurements on MR slices, volumetric analysis is more sensitive to tumor progression than 2D measurements. However, studies exploring volumetric analysis of VS tumors have found that these approximations overestimate volume: cross-sectional slices of VS tumors deviate from an ellipsoid shape as they develop extra-canalicular components extending into the cerebellopontine angle, adopting an "ice cream cone" shape13.

Considering the limitations of linear and ellipsoid-based volumetric analysis, 3D volumetric analysis has gained recognition as an accurate method of tracking growth: tumors are segmented on each MRI slice and the segmented area is multiplied by the slice thickness. This eliminates the error introduced by approximating the tumor shape or relying on the longest diameter. Despite its greater sensitivity14, 3D volumetric analysis is not currently practical for clinical use because it requires time-intensive manual segmentation13. Advanced machine learning algorithms, including convolutional neural networks (CNNs) such as U-Net, are being explored to automate this process. These models aim to improve segmentation efficiency while minimizing observer variability, despite challenges such as domain shift across different datasets6,15. The goal of this study is to show that an AI-led approach to automating the segmentation and 3D volumetric calculation of these tumors could shorten the time required for 3D volumetric analysis and improve image-processing accuracy.

Methods

Based on our initial trials with a dataset of 10 images, we determined that 150 MRI images were required in the ground truth dataset to achieve the desired AI accuracy. This number was derived from a statistical power analysis informed by previous research on DICE score modeling16,17.

This study was approved by the IRB committee at Yale University under reference number 2000032810. All methods were performed in accordance with the relevant guidelines and regulations, and patients gave informed consent prior to involvement in this study.

Patient recruitment

A total of 77 patients were identified through the Yale New Haven Hospital medical database (Fig. 1). Of these, 24 patients were eligible for inclusion, and their records produced 84 MRIs from which VS tumor models could successfully be made. To obtain a more diverse dataset and increase the number of patients, the researchers contacted patients through the NF2-SWN-specific non-governmental organization NF2 BioSolutions. An initial recruitment email was sent to all members on the mailing list (n = 1000), of whom 7.2% responded with interest. Thirteen patients were included in the final stage, providing 70 MRIs, from 59 of which tumor models could successfully be made.

Fig. 1

Flowchart outlining the patient recruitment process to collect MRI images for segmentation.

Inclusion criteria were: age over 18 years, a formal diagnosis of NF2-SWN (with either uni- or bilateral vestibular schwannomas), and available MRI scans that could be uploaded to the server (if recruited from outside the Yale database). Exclusion criteria were: age under 18 years, an undocumented history of NF2-SWN, scans without contrast, and scans with large voxel sizes or sequences that did not allow visualization of the internal auditory canal (fewer than 150 images). Inclusion criteria did not consider treatment status or the presence of other tumors such as meningiomas; the patient data therefore represented both pre- and post-treatment scans as well as patients with meningiomas.

From Yale New Haven Hospital and public patient recruitment, 143 MRIs were included in the ground truth dataset. All images used for the proof-of-concept (POC) were T1-weighted, post-contrast MRI scans.

Creation of proof-of-concept data set

Image quality was determined by the researchers and categorized by voxel size. Scans were categorized as high quality (voxel size less than 0.5 × 0.5 × 1.0 mm), medium quality (less than 1.0 × 1.0 × 1.0 mm), and low quality (greater than 1.0 × 1.0 × 1.0 mm). To create the tumor models, an image processing software package (Simpleware, Synopsys, Mountain View, CA) was used.
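For illustration, a minimal Python sketch of this quality categorization is shown below; the exact handling of boundary values is our assumption, as only the cut-offs above are specified.

```python
# Minimal sketch of the voxel-size-based quality categorization described above.
# Boundary handling (<= 1.0 mm slice thickness for high/medium) is an assumption.
def scan_quality(voxel_size_mm: tuple) -> str:
    x, y, z = voxel_size_mm
    if x < 0.5 and y < 0.5 and z <= 1.0:
        return "high"
    if x < 1.0 and y < 1.0 and z <= 1.0:
        return "medium"
    return "low"

print(scan_quality((0.375, 0.375, 1.0)))  # high
print(scan_quality((0.9, 0.9, 1.0)))      # medium
print(scan_quality((1.2, 1.2, 1.5)))      # low
```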

For the proof-of-concept, 110 3D models were used: 66 high-quality, 44 medium-quality, and 6 low-quality scans.

3D tumor masks were created for vestibular schwannomas (unilateral or bilateral) as follows (Fig. 2):

Fig. 2

A pathway describing the process used to manually segment the VS tumors from MRI images using the image processing software Simpleware ScanIP.

To highlight the vestibular schwannomas, a thresholding algorithm was applied to the selected slices containing the tumor mass. A ‘split regions’ algorithm was used to isolate the tumors and remove non-tumorous voxels. Particular attention was paid to voxels along the border of the mask; missing and surplus voxels were included or excluded as necessary using the paint function. After the initial mask creation, the tumors were re-examined in all three planes (coronal, axial, and sagittal views) to correct for any missing or extra voxels. All models were reviewed by a neuroradiologist, who made any necessary adjustments.
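Simpleware's thresholding and ‘split regions’ algorithms are proprietary. For illustration only, a conceptually similar open-source workflow (intensity thresholding followed by connected-component filtering with scikit-image) might look like the sketch below; the intensity window and minimum region size are assumed parameters, not values from the study.

```python
# Conceptual analogue of the thresholding and 'split regions' steps using
# open-source tools; this is an illustrative approximation, not the study's code.
import numpy as np
from skimage.measure import label, regionprops

def threshold_and_split(volume: np.ndarray, lower: float, upper: float,
                        min_voxels: int = 50) -> np.ndarray:
    """Return a binary mask of candidate tumor regions within an intensity window."""
    mask = (volume >= lower) & (volume <= upper)   # intensity thresholding
    labeled = label(mask, connectivity=1)          # split into connected regions
    keep = np.zeros_like(mask)
    for region in regionprops(labeled):
        if region.area >= min_voxels:              # drop small non-tumorous islands
            keep[labeled == region.label] = True
    return keep
```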

The ‘volume’ measurement tool was used to calculate the 3D volume (in mm³) of each tumor mask. To visualize the shape, size, and growth pattern of the masked tumors, a mask of the pons was also created at the level of the tumors.
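The computation performed by such a volume tool amounts to counting mask voxels and multiplying by the voxel volume. A minimal open-source analogue, assuming a binary NumPy mask and known voxel spacing, is sketched below; it is not Simpleware's implementation.

```python
# Minimal sketch of a 3D volume computation from a binary segmentation mask.
import numpy as np

def mask_volume_mm3(mask: np.ndarray, spacing_mm: tuple) -> float:
    """mask: (slices, rows, cols) binary array; spacing_mm: (z, y, x) in mm."""
    voxel_volume = spacing_mm[0] * spacing_mm[1] * spacing_mm[2]
    return float(np.count_nonzero(mask)) * voxel_volume

# Example with a hypothetical 1.0 x 0.5 x 0.5 mm acquisition:
# print(mask_volume_mm3(tumor_mask, (1.0, 0.5, 0.5)))
```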

Our segmentation process utilized the Simpleware platform by Synopsys, which integrates proprietary AI-powered auto-segmentation tools. These tools leverage machine learning algorithms trained on large, domain-specific datasets to accurately and efficiently segment complex anatomical structures, including vestibular schwannoma (VS) tumors. To ensure reliability, we validated Simpleware’s output against ground truth annotations using metrics such as the DICE coefficient.

Creation of prototype

The POC dataset used by the engineers at Synopsys consisted of 25 high-quality MRIs. The ground truth dataset of 143 MRIs was subdivided into training (80%), validation (10%), and test (10%) sets.
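A minimal sketch of such an 80/10/10 split is shown below; the file names and random seed are illustrative assumptions rather than details of the actual pipeline.

```python
# Sketch of an 80/10/10 train/validation/test split of the ground truth dataset.
import random

def split_dataset(items, seed=42):
    items = list(items)
    random.Random(seed).shuffle(items)          # reproducible shuffle
    n = len(items)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]

scans = [f"scan_{i:03d}.nii" for i in range(143)]   # hypothetical file names
train, val, test = split_dataset(scans)
print(len(train), len(val), len(test))              # roughly 114, 14, 15
```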

The helper (DPP V1.0) was trained using proprietary AI- and machine-learning-based algorithms. No tumors identified within the ground truth segmentations were missed by the helper.

A final testing stage was completed using 30 new segmentations of MRI scans obtained from NF2 BioSolutions. This stage was used to validate the segmentations produced by the tool and to verify its ability to identify and segment tumors in previously unseen patient data.

Following this testing stage, an additional tool was added to the modeler that corrects the orientation of images on import into the software, to accommodate different imaging protocols.

To compare the accuracy of the AI-generated 3D models to the radiologist-validated manual segmentation models, a DICE score was calculated as \(DICE = \frac{2\,|A \cap B|}{|A| + |B|}\), i.e., twice the area of overlap divided by the total number of voxels in the two segmentations.
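For clarity, the same calculation on a pair of binary masks can be expressed as the short sketch below; the function name is ours, and the masks are assumed to share the same shape and voxel grid.

```python
# DICE coefficient between an AI-generated mask and a ground truth mask.
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    pred, truth = pred.astype(bool), truth.astype(bool)
    overlap = np.logical_and(pred, truth).sum()     # |A ∩ B|
    total = pred.sum() + truth.sum()                # |A| + |B|
    return 2.0 * overlap / total if total > 0 else 1.0
```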

Development of visualization tool

To compare the chronological growth of a patient’s tumors, the segmented tumor masks of each patient with multiple scans were imported into a single Simpleware file. An image registration algorithm was used to reformat and align the brain across the DICOM images and the generated 3D tumor models.
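The registration algorithm used inside Simpleware is proprietary. For illustration, an analogous open-source rigid (6-degree-of-freedom) registration using SimpleITK, aligning a follow-up scan to a baseline scan so that tumor masks can be overlaid, might look like:

```python
# Illustrative rigid registration of a follow-up MRI to a baseline MRI; this is
# an open-source analogue, not the registration implemented in Simpleware.
import SimpleITK as sitk

def register_rigid(fixed: sitk.Image, moving: sitk.Image) -> sitk.Transform:
    fixed = sitk.Cast(fixed, sitk.sitkFloat32)
    moving = sitk.Cast(moving, sitk.sitkFloat32)
    initial = sitk.CenteredTransformInitializer(
        fixed, moving, sitk.Euler3DTransform(),
        sitk.CenteredTransformInitializerFilter.GEOMETRY)
    reg = sitk.ImageRegistrationMethod()
    reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)
    reg.SetOptimizerAsRegularStepGradientDescent(
        learningRate=1.0, minStep=1e-4, numberOfIterations=200)
    reg.SetInterpolator(sitk.sitkLinear)
    reg.SetInitialTransform(initial, inPlace=False)
    return reg.Execute(fixed, moving)

# A follow-up tumor mask can then be resampled into the baseline frame with
# nearest-neighbour interpolation, e.g.:
# aligned_mask = sitk.Resample(mask, fixed, transform, sitk.sitkNearestNeighbor, 0)
```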

Custom code was written in the Simpleware scripting interface to organize bilateral tumors in chronological order and sort them into left and right categories.
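A minimal sketch of this sorting step is shown below; the dictionary structure and field names are assumptions for illustration and do not reflect the actual Simpleware scripting API.

```python
# Group tumor masks by side and sort each group chronologically.
from collections import defaultdict

masks = [
    {"scan_date": "2021-03-01", "side": "left",  "name": "L_VS_2021"},
    {"scan_date": "2019-01-15", "side": "left",  "name": "L_VS_2019"},
    {"scan_date": "2020-06-30", "side": "right", "name": "R_VS_2020"},
]

by_side = defaultdict(list)
for m in masks:
    by_side[m["side"]].append(m)
for side, items in by_side.items():
    items.sort(key=lambda m: m["scan_date"])   # chronological order within each side
    print(side, [m["name"] for m in items])
```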

Plots illustrating the change in size of each tumor over time were generated with a script, along with plots indicating percentage change from baseline and volatility. A color scheme illustrating chronological tumor growth was developed: the overlapping 3D models of each tumor were displayed in a single color, with models from earlier scans shown in lighter shades and models from later scans in darker shades, as seen in Fig. 3.

Fig. 3

An example of a tumor visualization created with the tool, showing tumor growth. Recent tumor growth is shown by the darker shaded areas in the model.
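For illustration, a short matplotlib sketch of the growth plots described above (absolute volume and percentage change from baseline over time) is shown below; the dates and volumes are hypothetical example values, not patient data.

```python
# Hypothetical example of tumor-growth plots: absolute volume and percent change.
import matplotlib.pyplot as plt

dates = ["2019-01", "2020-02", "2021-03", "2022-04"]
volumes_mm3 = [820.0, 910.0, 1150.0, 1400.0]            # one tumor, chronological
pct_change = [100.0 * (v - volumes_mm3[0]) / volumes_mm3[0] for v in volumes_mm3]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 3.5))
ax1.plot(dates, volumes_mm3, marker="o")
ax1.set_ylabel("Tumor volume (mm³)")
ax2.plot(dates, pct_change, marker="o", color="tab:red")
ax2.set_ylabel("Change from baseline (%)")
for ax in (ax1, ax2):
    ax.set_xlabel("Scan date")
plt.tight_layout()
plt.show()
```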

Finally, volume measurements were obtained for ten masses using three approaches: manual segmentation, AI-based segmentation, and ellipsoid volume calculation. Ellipsoid volumes were estimated using the formula \(V=\frac{4}{3}\pi \cdot \frac{Length}{2}\cdot \frac{Height}{2}\cdot \frac{Width}{2}\).

The dimensions used for ellipsoid calculations were based on the maximum observed dimensions of each mass. These measurements were compared to the volumes derived from the manual and AI 3D volumetric segmentations. Percentage differences were calculated comparing the AI and ellipsoid volumes to the manual volumes, to evaluate discrepancies across methodologies. Findings are summarized in Table 1.

Table 1 Comparison of manual, AI and ellipsoid volume segmentations and percent change.
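For reference, the ellipsoid estimate and percentage comparison described above can be expressed as the short sketch below; the dimensions and manual volume are hypothetical examples, not values from Table 1.

```python
# Ellipsoid volume estimate and percentage difference relative to a manual volume.
import math

def ellipsoid_volume(length_mm, height_mm, width_mm):
    return (4.0 / 3.0) * math.pi * (length_mm / 2) * (height_mm / 2) * (width_mm / 2)

def percent_difference(candidate_mm3, manual_mm3):
    return 100.0 * (candidate_mm3 - manual_mm3) / manual_mm3

manual = 900.0                                   # hypothetical manual 3D volume, mm^3
ellipsoid = ellipsoid_volume(16.0, 12.0, 11.0)   # hypothetical maximum dimensions, mm
print(round(ellipsoid, 1), round(percent_difference(ellipsoid, manual), 1))
```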

Results

A mean DICE score of 0.76 (standard deviation 0.21) was achieved in the POC stage of the model. After the final testing stage, the mean DICE score was 0.88 (range 0.74–0.93, standard deviation 0.04).

Table 2 Improvement in DICE scores between the initial tool and the latest version of the AI modeler.

Table 2 shows the improvement in DICE scores between the initial proof-of-concept and the final DPP version of the AI modeler on the same set of 24 images. For example, marked improvement can be seen in Image 7, where the DICE score improved from 0.14 to 0.84.

Discussion

Our study demonstrates the feasibility of 3D volumetric tumor analysis through AI-driven, automated image processing. The tool automated the processing of VS tumors with a credible overall DICE score of 0.88, demonstrating its accuracy and reliability. Investigation and testing revealed the versatility of the AI tool across various T1-weighted MRI sequences, accommodating different voxel sizes without compromising efficacy. The key developments in this VS software are its ease of use for clinicians, allowing multiple data points such as tumor volumetrics and proximity to surrounding structures such as the pons to be collected at one time, and the ability to compare VS growth over time by overlaying sequential MRIs to identify specific areas of growth into the pons or the auditory canal (Fig. 4). These findings collectively underscore the potential of the AI-driven methodology to revolutionize VS tumor assessment and monitoring.

Fig. 4

An example of two overlaid tumor masks showing growth over time. The more recent tumor mask is shown in the purple shade; the opacity of the mask has been changed to allow visualization of the previous tumor mask in yellow.

This novel AI-driven volumetric approach emerges as a superior method for tumor size assessment because it rapidly provides clinically accessible three-dimensional information, particularly when compared to linear analysis. Volumetric analysis is superior to linear assessment, but it has limitations, including the labor-intensive process required to complete it18. Ellipsoid volume estimates derived from linear measurements take a volume based on the largest diameters of the tumor. However, as seen in Table 1, this often grossly overestimates tumor volume, does not account for the individual shape of tumors, and can miss the nodular growth characteristic of VS. 3D volumetric analysis considers the entire volume, providing a more holistic understanding of the tumor’s size and growth dynamics19. This approach accounts for irregular shapes and orientations of tumors and mitigates the impact of observer variation. By encompassing the entirety of the tumor, 3D volumetric analysis offers greater accuracy and reliability, crucial for informing clinical decisions6.

The ground truth dataset used in this study was heterogeneous, in that it included a wide range of patients, MRI scanners (1.5 T and 3 T), manufacturers, and voxel sizes ranging from 0.375 × 0.375 × 1.0 mm to 1.0 × 1.0 × 1.0 mm. This heterogeneity in the dataset, together with the overall strong performance of the AI demonstrated by the high DICE scores, suggests that the AI-driven tool is applicable to a wide range of clinical settings.

While 3D volumetric analysis has been found to yield the most accurate measures of tumor size, manual segmentation of tumors is time-intensive and requires trained clinicians and engineers to perform, introducing variability in measurements. Our tool uses AI algorithms to provide a fully automated workflow, reducing subjectivity, error, and the need for extensive clinical training. This streamlined process provides more accurate measures of tumor size than linear measurement while taking only a fraction of the time (less than 4 min). Additionally, AI-segmented models are consistent in their creation, and the repeatability of the process increases the clinical validity of these measurements.

Comparing sequential imaging to determine tumor growth is challenging, given that patients’ heads are rarely positioned identically in the MRI scanner between scans. To reduce the uncertainty this introduces, the software performs image registration to overlay all the 3D tumor models onto previous imaging regardless of patient position. In doing so, we can identify in multiple dimensions where tumor growth is occurring and evaluate the impact of the tumor on structures such as the pons and cochlear nerve. Clinical trials could also use this tool to determine the effectiveness of investigational drugs or interventional procedures. The 3D morphometric technique would standardize the measurement of VS tumors in clinical trials and treatment.

For individuals diagnosed with NF2-SWN, having access to 3D views of their tumor growth can reinforce their ability to self-advocate. The 3D modeling tools developed can help patients understand their disease and its progression. This tool offers patients the possibility of being actively involved in their treatment plan and understanding their tumors in more detail.

There is an unmet need for a reliable and easy-to-use tool to confidently evaluate tumor growth over time. A standardized process greatly reduces the potential for error in interpretation, increasing the efficiency and accuracy of VS volumetric analysis. This tool will aid experienced clinicians and help early-career radiologists better visualize and assess tumor size, morphology, and growth.

Limitations

The AI tool has not produced a DICE score of 1, indicating that some discrepancy remains between manual and AI segmentations. Oversight by a radiologist or trained researcher is therefore still required to verify segmentations.

Validation by a radiologist is also needed in cases where meningiomas are close to the vestibular schwannoma, to confirm correct identification. Researchers also identified a small number of tumors with a morphology not previously encountered by the AI, such as tumors that had undergone debulking procedures. These would also benefit from additional validation.

Manual segmentation was done by human operators, which could have introduced errors in their determination of the extent of the tumor.

The main current limitation is that the tool will need to be adapted and improved as further errors are encountered in unconventional cases or previously unseen tumor morphologies.

Conclusion

Our study has demonstrated an efficient, accurate AI for the 3D volumetric analysis of vestibular schwannomas. The use of this AI will enable faster 3D volumetric analysis compared to manual segmentation. The tool provides a method of assessing tumor growth through volume measurements, allowing clinicians to make more informed decisions. One key area of future research will be predicting tumor growth based on the previously analyzed growth rates of tumors.