The Yonsei University Institutional Review Board approved this retrospective study and waived the requirement for informed consent. All methods were performed in accordance with relevant guidelines and regulations. We identified 297 patients with pathologically confirmed meningioma who underwent baseline conventional MRI between February 2008 and September 2018 in the institutional dataset. Patients with 1) missing MRI sequences or inadequate image quality (n = 17), 2) a history of previous surgery (n = 15), 3) a history of tumor embolization or gamma knife surgery prior to MRI examination (n = 5), or 4) an error in image processing (n = 2) were excluded. A total of 257 patients (low-grade, 162; high-grade, 95) were enrolled in the institutional cohort.
Identical inclusion and exclusion criteria were applied to identify 62 patients (low-grade, 47; high-grade, 15) from Ewha Womans University Mokdong Hospital between January 2016 and December 2018 for external validation of the model. The patient flowchart is shown in Fig. S1.
The pathological diagnosis was made by neuropathologists according to the WHO criteria19. Criteria for atypical meningioma (WHO grade 2) included 4 to 19 mitoses per 10 high-power fields, the presence of brain invasion, or the presence of at least three of the following: “sheet-like” growth, hypercellularity, spontaneous necrosis, large and prominent nucleoli, and small cells. Criteria for anaplastic meningioma (WHO grade 3) included frank anaplasia (histology resembling carcinoma, sarcoma, or melanoma) or elevated mitotic activity (>20 mitoses per 10 high-power fields)19.
In the institutional training dataset, patients were scanned on 3.0-T MRI units (Achieva or Ingenia; Philips Medical Systems). Imaging protocols included T2-weighted (T2) and contrast-enhanced T1-weighted (T1C) images. T1C images were acquired after the administration of 0.1 ml/kg of a gadolinium-based contrast material (Gadovist; Bayer).
In the external validation set, patients were scanned on 1.5- or 3.0-T MRI units (Avanto, Siemens; or Achieva, Philips Medical Systems), with T2 and T1C images acquired. T1C images were acquired after the administration of 0.1 ml/kg of a gadolinium-based contrast material (Dotarem, Guerbet; or Gadovist, Bayer). Substantial variation in the acquisition parameters for T2 and T1C existed among the various MRI units and between the institutional and external validation sets, reflecting the heterogeneity of meningioma imaging data in clinical practice (Supplementary Table 1).
Image preprocessing and radiomics feature extraction
Resampling of images to 1-mm isovoxels, correction of low-frequency intensity nonuniformity using the N4 bias field correction algorithm, and coregistration of T2 images to T1C images were performed using Advanced Normalization Tools (ANTs)20. After skull stripping with Multi-cONtrast brain STRipping (MONSTR)21, signal intensities were normalized using z-scores. An affine registration was performed to transform the brain images to the MNI152 template22.
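As an illustration, the z-score intensity normalization step above can be sketched as follows. This is a minimal sketch under assumptions: the function name and the restriction to a brain mask are choices made for this example, and in the actual pipeline normalization follows N4 correction and registration performed with ANTs and MONSTR.

```python
import numpy as np

def zscore_normalize(volume, mask):
    """Normalize voxel intensities to zero mean and unit variance
    within the brain mask (simplified stand-in for the z-score
    normalization step; masking is an assumption of this sketch)."""
    voxels = volume[mask > 0]
    mu, sigma = voxels.mean(), voxels.std()
    out = volume.astype(float).copy()
    out[mask > 0] = (volume[mask > 0] - mu) / sigma
    return out

# Toy 3-D "volume" and all-ones brain mask for demonstration.
vol = np.random.default_rng(0).normal(100.0, 20.0, size=(4, 4, 4))
msk = np.ones_like(vol)
norm = zscore_normalize(vol, msk)
```

After normalization, the in-mask intensities have mean 0 and standard deviation 1, which makes radiomic features comparable across scanners with different intensity scales.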
A neuroradiologist (with 9 years of experience), blinded to clinical information, semi-automatically segmented the entire tumor (including cystic or necrotic changes) on T1C images using 3D Slicer software (v. 4.13.0; www.slicer.org) with threshold-based algorithms. Another neuroradiologist (with 16 years of experience) reassessed and confirmed the segmented lesions.
Radiomics features were calculated with a Python-based module (PyRadiomics, version 2.0)23 with a bin size of 32. They included (1) 14 shape features, (2) 18 first-order features, and (3) 75 second-order features (including the gray-level co-occurrence matrix, gray-level run length matrix, gray-level size zone matrix, gray-level dependence matrix, and neighboring gray tone difference matrix) (Supplementary Material S1 and Supplementary Table 2). The features adhered to the standards set by the Image Biomarker Standardization Initiative24. A total of 214 radiomic features (107 × 2 sequences) were extracted.
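The fixed-bin discretization that precedes first-order histogram and texture-matrix computation can be illustrated with a minimal first-order entropy calculation. This is a simplified, hypothetical re-implementation for illustration only; PyRadiomics performs the discretization and computes all 107 features per sequence internally.

```python
import numpy as np

def discretize(roi, n_bins=32):
    """Discretize ROI intensities into a fixed number of gray
    levels (the 'bin size of 32' used for feature extraction)."""
    edges = np.linspace(roi.min(), roi.max(), n_bins + 1)
    # interior edges -> integer levels 0 .. n_bins - 1
    return np.clip(np.digitize(roi, edges[1:-1]), 0, n_bins - 1)

def firstorder_entropy(roi, n_bins=32):
    """Shannon entropy of the discretized intensity histogram,
    one of the first-order features."""
    levels = discretize(roi, n_bins)
    p = np.bincount(levels.ravel(), minlength=n_bins) / levels.size
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(1)
roi = rng.normal(0.0, 1.0, size=(16, 16, 16))  # toy tumor ROI
H = firstorder_entropy(roi)
```

With 32 gray levels, the entropy is bounded above by log2(32) = 5 bits, reached only for a perfectly uniform histogram.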
Construction of radiomic models
The schematic of radiomics model construction and establishment of an application system based on CycleGAN is shown in Fig. 1a. The radiomic features were normalized with min-max scaling. Because the number of radiomic features was greater than the number of patients, mutual information was applied to select the significant features. The baseline radiomics classifiers were built using extreme gradient boosting with ten-fold cross-validation on the training set. The synthetic minority oversampling technique was applied to oversample the minority class25. To improve predictive performance and avoid possible overfitting, Bayesian optimization was applied to search the hyperparameter space for the optimal combination of hyperparameters. The area under the curve (AUC), precision, sensitivity, specificity, and F1 score (definitions shown in Supplementary Material S2) were obtained. Feature selection and machine learning were performed using Python 3 with the scikit-learn module (version 0.24.2).
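A minimal sketch of this modeling pipeline, assuming scikit-learn, is shown below. For brevity, GradientBoostingClassifier stands in for the paper's XGBoost classifier, SMOTE (available in the separate imbalanced-learn package) and Bayesian hyperparameter optimization are omitted, and the synthetic data merely mimic the shape of the 214-feature radiomics table.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the 214-feature radiomics table.
X, y = make_classification(n_samples=200, n_features=214,
                           n_informative=10, random_state=0)

pipe = Pipeline([
    ("scale", MinMaxScaler()),                            # min-max normalization
    ("select", SelectKBest(mutual_info_classif, k=20)),   # mutual-information selection
    ("clf", GradientBoostingClassifier(random_state=0)),  # stand-in for XGBoost
])

# Ten-fold cross-validated AUC, as in the paper's evaluation.
auc = cross_val_score(pipe, X, y, cv=10, scoring="roc_auc").mean()
```

Placing the scaler and selector inside the Pipeline ensures they are refit on each training fold, which avoids leaking validation-fold information into feature selection.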
Figure 1b shows the general network architecture of CycleGAN. A generative adversarial network (GAN) consists of two neural networks, a generator and a discriminator, with distinct purposes. CycleGAN uses two sets of GANs for style transfer to train unsupervised image translation models16. Unpaired institutional training and external validation datasets were used to train the CycleGAN discriminators and generators.
For input to CycleGAN16, the brain MR images were converted to two-dimensional images in the axial, sagittal, and coronal planes. Because image size differed between institutions and individuals, images were resized to 99 × 117 × 95 pixels after registration to the MNI152 template and to 116 × 116 pixels before being fed into CycleGAN.
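The slice resizing described above might be sketched as follows; the use of scipy.ndimage.zoom and linear interpolation are assumptions of this example, as the paper does not specify the resampling method.

```python
import numpy as np
from scipy.ndimage import zoom

def resize_slice(slice2d, target=(116, 116)):
    """Resample a 2-D slice to the fixed 116 x 116 CycleGAN input
    size (linear interpolation is an assumption, not stated in the
    paper)."""
    factors = (target[0] / slice2d.shape[0],
               target[1] / slice2d.shape[1])
    return zoom(slice2d, factors, order=1)

vol = np.zeros((99, 117, 95))   # MNI152-registered volume size from the text
axial = vol[:, :, 47]           # one axial slice, 99 x 117
resized = resize_slice(axial)   # 116 x 116, ready for the network
```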
In the first set of GANs, the first generator (G1) in CycleGAN converts images from the external validation dataset to the domain of the institutional training dataset, while the first discriminator (D1) checks whether the images generated by G1 are real or fake (generated). Through this process, the synthetic images from G1 are improved by feedback from their respective discriminators. In the second set of GANs, the second generator (G2) transfers the synthetic images generated by G1 back to the original domain of the external validation dataset, while the second discriminator (D2) checks whether the images generated by G2 are real or fake (generated). Through this process, the trained CycleGAN model transferred the external validation images into the style of the training set. The cycle consistency loss, which is the difference between the generated output and the input image, was calculated and used to update the generator models at each training iteration16. The L2 loss, which is known to speed up the training process and generate sharp, realistic images in GANs26,27, was used to estimate the cycle consistency loss. The inference results were randomly sampled and verified for plausibility by a neuroradiologist (with 9 years of experience). Images from the external validation set after CycleGAN were used to assess the performance of the radiomics model against the original external validation dataset. Because both the original and CycleGAN-transferred external validation images were independent of radiomics modeling in the training process, there was no potential data leakage28. Details of the CycleGAN architecture are shown in Supplementary Table 3.
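The L2 cycle-consistency loss can be illustrated with a toy example. Here g1 and g2 are placeholder callables standing in for the trained generator networks; in training, this loss is backpropagated to update both generators.

```python
import numpy as np

def l2_cycle_consistency(x, g1, g2):
    """L2 cycle-consistency loss: the squared distance between an
    input image x and its round-trip reconstruction G2(G1(x)).
    The paper uses L2 (rather than the original CycleGAN's L1)
    for this term."""
    x_cycled = g2(g1(x))
    return float(np.mean((x - x_cycled) ** 2))

# Toy "generators": a constant shift and its inverse, so the
# round trip reconstructs the input exactly.
x = np.zeros((116, 116))
g1 = lambda im: im + 0.1   # external -> training-domain style
g2 = lambda im: im - 0.1   # back to the external domain
loss = l2_cycle_consistency(x, g1, g2)  # 0.0 for this perfect round trip
```

A nonzero loss penalizes generators whose translations are not invertible, which is what constrains unpaired translation to preserve anatomical content.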
Evaluation of the effect of CycleGAN: Fréchet Inception Distance and t-Distributed Stochastic Neighbor Embedding
The Fréchet inception distance (FID) was calculated to measure the similarity between two image datasets, quantitatively assessing model quality from the generated data (Supplementary Material S3)29. The FID is an extension of the inception score30 and compares the distribution of generated images with the distribution of the real images used to train the generator. The FID has been shown to be consistent with human judgment and more robust to noise than the inception score29. Three FID scores were calculated: ‘training vs. original external validation’, ‘original external validation vs. transferred external validation’, and ‘training vs. transferred external validation’. To visualize the effect of CycleGAN on the extracted radiomic features, the high-dimensional feature space was projected into a lower-dimensional space and visualized using a two-dimensional t-distributed stochastic neighbor embedding (t-SNE) manifold31.
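The FID reduces to the Fréchet distance between two Gaussians fitted to image feature vectors. A minimal sketch, assuming NumPy/SciPy, is given below; in practice the feature vectors are Inception-v3 activations, whereas here any feature matrix is used for illustration.

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature
    sets: ||mu_a - mu_b||^2 + Tr(Ca + Cb - 2*(Ca Cb)^(1/2)).
    For FID proper, feats_* would be Inception-v3 activations."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):       # discard tiny imaginary parts
        covmean = covmean.real
    return float(((mu_a - mu_b) ** 2).sum()
                 + np.trace(cov_a + cov_b - 2.0 * covmean))

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=(500, 8))   # "real" features
fid_same = frechet_distance(a, a)          # near 0 for identical sets
b = rng.normal(3.0, 1.0, size=(500, 8))   # shifted "generated" features
fid_diff = frechet_distance(a, b)          # large for dissimilar sets
```

A lower FID between the training set and the transferred external validation set than between the training set and the original external set would indicate that CycleGAN moved the external images toward the training-domain distribution.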