Magnetic resonance imaging features of tumor and lymph node to predict clinical outcome in node-positive cervical cancer: a retrospective analysis.

BACKGROUND
Current chemoradiation regimens for locally advanced cervical cancer are fairly uniform despite a profound diversity of treatment response and recurrence patterns. The wide range of treatment responses and prognoses to standardized concurrent chemoradiation highlights the need for a reliable tool to predict treatment outcomes. We investigated pretreatment magnetic resonance (MR) imaging features of primary tumor and involved lymph node for predicting clinical outcome in cervical cancer patients.


METHODS
We included 93 node-positive cervical cancer patients treated with definitive chemoradiotherapy at our institution between 2006 and 2017. The median follow-up period was 38 months (range, 5-128). Primary tumor and involved lymph node were manually segmented on axial gadolinium-enhanced T1-weighted images as well as T2-weighted images and saved as 3-dimensional regions of interest (ROI). After the segmentation, imaging features related to histogram, shape, and texture were extracted from each ROI. Using these features, random survival forest (RSF) models were built to predict local control (LC), regional control (RC), distant metastasis-free survival (DMFS), and overall survival (OS) in the training dataset (n = 62). The generated models were then tested in the validation dataset (n = 31).


RESULTS
For predicting LC, models generated from primary tumor imaging features showed better predictive performance (C-index, 0.72) than those from lymph node features (C-index, 0.62). In contrast, models from lymph nodes showed superior performance for predicting RC, DMFS, and OS compared to models of the primary tumor. According to the 3-year time-dependent receiver operating characteristic analysis of LC, RC, DMFS, and OS prediction, the respective area under the curve values for the predicted risk of the models generated from the training dataset were 0.634, 0.796, 0.733, and 0.749 in the validation dataset.


CONCLUSIONS
Our results suggest that tumor and lymph node imaging features may play complementary roles for predicting clinical outcomes in node-positive cervical cancer.


Background
Medical imaging has profound clinical importance for diagnosis, staging, treatment, and predicting prognosis in cancer patients. In current practice, the majority of clinical decisions are based on a limited number of radiologic features that can be readily processed by the unaided radiologist's eye. However, tumor imaging may contain more information than can be visually assessed. For this reason, recent studies have suggested that quantitative imaging features of the tumor mass, including shape and texture, may also have prognostic importance for predicting patient outcomes in various cancer sites [1][2][3].
The standard treatment for locally advanced cervical cancer is cisplatin-based concurrent chemoradiation. Despite the use of combined-modality treatment with external beam radiotherapy, intracavitary brachytherapy, and chemotherapy, approximately 30% of these patients experience progression and recurrence [4][5][6]. In the updated results from the Radiation Therapy Oncology Group Trial 90-01 with a median follow-up of 6.6 years for 228 survivors, the 5-year cumulative disease progression rate of stage IIB-IVA patients who were treated with cisplatin-based concurrent chemoradiation was reported as 32% [4]. Nevertheless, current chemoradiation regimens for locally advanced cervical cancer remain fairly uniform despite a profound diversity of treatment response and recurrence patterns. The wide range of treatment responsiveness and prognoses, despite the administration of standard concurrent chemoradiation, highlights the need for a reliable tool to predict treatment outcomes.
Previous studies have shown that imaging features extracted from magnetic resonance (MR) imaging can provide information on the likelihood of tumor characteristics in cervical cancer [7][8][9]. However, these studies have primarily focused on investigating the predictive performance of imaging features in terms of classification of lymph node metastasis or molecular characteristics, rather than clinical outcomes such as recurrence, distant metastasis, and overall survival. Furthermore, to our knowledge, no study has assessed the predictive performance of imaging features obtained both from primary tumors and involved lymph nodes in cervical cancer. Therefore, we investigated the pretreatment MR imaging features of primary tumor and involved lymph node for predicting local control (LC), regional control (RC), distant metastasis-free survival (DMFS), and overall survival (OS) in cervical cancer patients.

Patient, tumor, and treatment characteristics
We retrospectively reviewed the medical records of 121 consecutive cervical cancer patients who were treated with definitive concurrent chemoradiotherapy at our institution between 2006 and 2017. Of these 121 patients, 28 patients were not evaluated with pretreatment MR imaging. Thus, a final total of 93 patients were included in the analysis. The institutional review board approved this study and provided a waiver of consent (2017-06-032). A positive lymph node was defined as having a maximum short axis diameter of ≥8 mm according to pretreatment MR imaging [10,11].
All patients were treated with a combination of external beam radiotherapy (EBRT) followed by high-doserate (HDR) intracavitary brachytherapy (ICR) with curative intent. EBRT was delivered to the whole pelvis using a 3-dimensional conformal radiation therapy (3D-CRT) 4-field box technique (1.8 Gy daily fractions, 5 times per week, for a total dose of 45 Gy). Extended-field radiotherapy, including pelvis and para-aortic nodal area, was

Segmentation
The key steps of the imaging feature analysis process are illustrated in Fig. 1. Patient-sensitive information was anonymized before image segmentation. Primary tumor tissue and involved lymph nodes were manually segmented on the axial gadolinium-enhanced T1weighted images (T1WI) and T2-weighted images (T2WI) by 2 radiation oncologists (S.H and B.B) using the annotation tool of the m:Studio Research Platform [12]. In case of multiple lymph node involvement, the largest lymph node was selected for segmentation. Segmented contours of tumor and lymph node were subsequently reviewed and revised by 1 radiologist (M.H). Each 3-dimensional region of interest (ROI) was saved as voxels.

Feature extraction, clustering, and selection
After segmentation, 86 imaging features were extracted from each ROI segmented on enhanced T1WI and T2WI. These included (i) 12 first-order features, (ii) 6 grey-level co-occurrence matrix (GLCM) features, (iii) 11 grey-level run-length matrix (GLRLM), (iv) 3 neighborhood grey-level difference matrix (NGLDM), and (v) 11 grey-level zone length matrix (GLZLM). All matrices were calculated in a 3-dimensional manner using LIFEx software (Additional file 1) [13]. The number of grey levels was set at 64 (ROIs were discretized using 64 levels before feature extraction). To normalize the image intensities from different MR units, resampling was conducted as a relative value (between the minimum and maximum values in the ROI) [14]. After extraction, features were clustered to reduce dimensionality and to avoid multicollinearity. Spearman's correlation analysis was performed, and highly correlated features (Spearman's coefficient (SC) > 0.90) were clustered using hierarchical clustering. The resultant clusters were represented by a new feature calculated by averaging all features within a cluster. Negatively correlated features were inverted before being averaged.
The patient cohort was randomly divided into 2 independent groups for the training (62 patients) and validation (31 patients) datasets. After clustering, the feature set of the training dataset was used to identify the most Fig. 1 Illustrations of the key steps in the imaging feature analysis process. First, primary cervical tumor and involved lymph node were segmented. Second, imaging features were extracted from each region of interest. Third, random survival forest models were built to predict survival outcomes, after the feature clustering and selection process in the training dataset. Fourth, the generated models were tested in the validation dataset relevant features using the Ridge, Lasso, and Elastic-net regularization algorithms. To improve the stability of the feature selecting process, feature selection was repeated n = 100 times using n bootstrap samples of the training dataset. The top 10 ranked features were selected from each bootstrap sample, and the selected features were aggregated over the bootstraps. We performed rank aggregation, the process of combining information from several ranked lists into a single, more stable list. The simple ensemble method was used to aggregate feature ranks [15].

Model building and validation
We used the random survival forest (RSF) method to generate prediction models. RSF is an ensemble tree method for the analysis of right-censored survival data [16] that extends upon Breiman's random forest approach [17]. For hyper-parameter optimization, grid search and cross validation was performed. Model performance was assessed in the validation cohort using the concordance index (C-index). Harrell's C-index is a generalization of the area under the curve (AUC) for continuous time-to-event survival data [18]. C-index = 0.5 describes a random prediction, whereas a perfectly predicting model would have a C-index of 1.0. The R packages "randomForestSRC" in version 2.9.0 and "caret" in version 6.0-83 were used. Because the Cindex approach may be problematic in situations with a fixed prediction time point [19], we also assessed t-year risk of an event using time-dependent area under the receiver operating curve analysis [20,21]. To evaluate the predictive performance of the built RSF models, we divided the patients into low-and high-risk groups according to predicted risk. The optimal cutoff value was calculated based on our dataset using the "cutp" function of the R package "survMisc" in version 0.5.5. In image processing and feature calculation, we followed guidelines of the Image Biomarkers Standardization Initiative [22].

Statistical analysis
Three-year actuarial rates of LC, RC, DMFS, and OS were calculated using the Kaplan-Meier method, and comparisons among groups were conducted using 2sided log-rank tests. These endpoints were reached at the first observation of a defined event, and all events were calculated from the start of definitive chemoradiotherapy. For LC, the first event could present as persistent disease, or recurrence in the cervix or an adjacent pelvic organ. For RC, the first event was defined as either a persistent node or recurrence in the pelvic or para-aortic area. For DMFS, the events included recurrence at any other site or death from any cause. For OS, the event was death from any cause. All statistical analyses were performed using R project (version 3.5.3).

Results
The SC values between features are summarized as a correlation matrix (Fig. 2-3). After feature clustering, no SC > 0.90 was observed between clustered features ( Fig. 4-5). After the feature clustering process, 18 and 17 imaging features of primary cervical tumors, and 17 and 20 lymph node imaging features from the T1WI and T2WI, respectively, were used for modeling. Using the training dataset, RSF models predicting LC, RC, DMFS, and OS were built, and predictive performance was tested in the validation dataset. For predicting LC, the models using primary cervical tumor imaging features showed better predictive performance (C-index = 0.72 ± 0.08) (mean ± standard deviation) than the models from involved lymph nodes (C-index = 0.62 ± 0.12). In contrast, for predicting RC, DMFS, and OS, the models using imaging features of involved lymph nodes showed superior predictive performance (RC, C-index = 0.69 ± 0.07;DMFS, C-index = 0.66 ± 0.08; OS, C-index = 0.72 ± 0.11) compared to the models from primary cervical tumor features (RC, C-index = 0.65 ± 0.06; DMFS, Cindex = 0.64 ± 0.07; OS, C-index = 0.69 ± 0.07).
For each clinical endpoint, patients were divided into a low-and high-risk group based on the predicted risk of the models from the training dataset. Statistically significant differences between the low-risk and high-risk groups were observed for LC in both training and validation datasets when using primary cervical tumor imaging features (P = .041 and .023, respectively) ( Table 2). In addition, statistically significant differences were demonstrated in both training and validation datasets between the low-risk and high-risk groups when using involved lymph node features for RC (P < .001 and .037, respectively) ( Table 3) and DMFS (P = 0.032 and 0.037, respectively) ( Table 4 and Figs. 6a, b, c, d, e and f). However, the models were not able to effectively stratify the patients in terms of OS into two groups with significantly different outcomes in the test dataset (Additional files 2 and 3). In the 3-year time-dependent receiver operating curve analysis of LC, RC, DMFS, and OS prediction, the predicted risk of the models showed AUC values of 0.634, 0.796, 0.733, and 0.749, respectively, in the validation dataset (Fig. 7).

Discussion
Using pretreatment MR imaging, we developed and validated RSF models for predicting LC, RC, DMFS, and OS in node-positive cervical cancer. The respective RSF models were built from imaging features both from primary tumors and involved lymph nodes. These models provided independent prognostic    information beyond that of known clinicopathologic prognostic factors including FIGO stage, pathologic type, and the extent of lymph node involvement. Of note, the primary tumor imaging features demonstrated superior performance for predicting LC, while lymph node imaging features were superior at predicting RC, DMFS, and OS.
To date, a limited number of studies have been published concerning the relevance of imaging features with regard to cervical cancer disease characteristics or clinical outcomes. In most of these, the clinical endpoints were dichotomous variables and did not take into consideration of the time when that event occurred. For example, Becker et al. analyzed MR images of 23 patients with cervical cancer and found that imaging features extracted from apparent diffusion coefficient (ADC) maps were associated with histological differentiation and nodal stage [8]. Similarly, in an analysis of 34 patients with advanced cervical cancer by Meng et al., [23] T2WI-and ADC-based features were associated with disease recurrence. A study by Wu et al. of 189 cervical cancer patients reported that tumor imaging features on T2WI and ADC were highly predictive of lymph node metastasis (AUC and sensitivity of 0.842 and 100%, respectively, in the validation cohort) [7]. Sun et al. built random forest models with features extracted from both T1WI and T2WI to predict responses after neoadjuvant chemotherapy in cervical cancer patients. Unlike the studies cited above, our models predicted time-to-event survival outcomes rather than a dichotomized treatment response [34]. In this context, it is important to note that treatment responses often do not translate into improvements in overall survival [24].
Only limited data exist regarding the specific site of cervical cancer failure; local and regional failure have been reported as components of locoregional failure or pelvic failure. The recognition of specific patterns of failure may provide important information to guide which treatment needs to be modified from the radiation oncologists' perspective. Our results suggest that imaging features extracted from the primary tumor may provide information for making decisions to escalate or deescalate the radiation dose to the primary cervical tumor. For example, patients at high risk of local failure could benefit from an escalating dose to the uterine cervix, whereas patients at low risk of local failure could be considered for a de-escalating radiation dose to reduce the risk of radiation-induced toxicity. A number of significant late complications are associated with chemoradiation, including gastrointestinal, urologic, and gynecologic toxicities, particularly if intracavitary brachytherapy is added [5,25]. For this reason, radiation dose need to be tailored according to the each patient's given site-specific failure risk, not only to enhance failure-free survival but also to minimize treatmentrelated toxicity.
Another important finding of our study was that lymph node features demonstrated superior performance over primary tumor features for predicting RC and DMFS. While most previous work investigating imaging features has primarily focused on the primary tumor, we separately assessed involved lymph nodes, which could facilitate a more complete evaluation of disease status. For node-positive cervical cancer patients, it may be necessary to tailor doses according to the involved lymph nodes for personalized radiotherapy. Some investigators have reported that escalating radiotherapy dose to involved lymph node can improve RC [26][27][28][29]. Increased RC prediction accuracy using lymph node features may help select patients who require dose escalation to the involved lymph nodes. Additionally, the superior predictive performance of lymph node features for DMFS suggests that these features may contain information on metastatic potential of disease. This result is encouraging because the intensification of systemic therapy Table 3 Univariate and multivariate analyses of potential prognostic factors for regional control in the validation dataset. Statistically significant differences were found according to the imaging feature-based risk scores in the univariate analysis  Several modeling approaches can be applied to predict the risk of future events in terms of survival. The most widely used of these methods is the Cox-proportional hazards model. This model is flexible and simple, but it is difficult to apply in situations where the restrictive proportional hazards assumption is violated [30]. Moreover, in high-dimensional settings where the number of covariates far exceeds the number of observations, as in our study, standard survival analyses such as Coxproportional hazard models might be inadequate. RSF is an ensemble method of building and splitting tree by maximizing the log-rank statistic in each node [16]. Ensemble predictions are given by averaging the cumulative hazard estimates in the terminal nodes of the trees. RSF has several advantages compared with regression-based methods. First, it is completely data-driven and thus independent of model-specific assumptions. Second, it seeks to generate a model that best explains the data and thus represents a suitable tool for exploratory analysis where prior information of the survival data is limited. Third, in cases of high-dimensional data, the limitations of univariate regression approaches, such as overfitting, unreliable estimation of regression coefficients, inflated standard errors or convergence problems, do not apply to RSF. Fourth, it is robust to outliers in the covariate space [31].
The common approach to evaluate the predictive performance of an RSF model is the Harrell's C-index [18]. However, Blanche et al. demonstrated that C-index may not be the proper approach when predicting the risk of b Fig. 6 Kaplan-Meier curves of local control (a, b), regional control (c, d), and distant metastasis-free survival (e, f) for patients from the training dataset and validation dataset, respectively. Patients were stratified into a low-risk (yellow line) and a high-risk group (blue line) according to predicted risk from the random survival forest model based on pretreatment MRI features an event at a certain time point [19]. They noted that Cindex assesses the order of event times rather than event status order at a given time point. Time-dependent ROC analysis does not have this problem [20,21]. For this reason, we also performed time-dependent ROC analysis in both training and validation datasets to compare 3year cumulative LC, RC, DMFS and OS between the low-and high-risk groups. Our results show that RSF models using imaging features could achieve an area under the 3-year time-dependent ROC of 0.634-0749 for predicting LC, RC, DMFS, and OS, indicating that imaging features could serve as biomarkers to discriminate low-and high-risk patients with moderate predictive accuracy. However, these levels of accuracy may not be sufficient to definitively predict prognosis for each patient's prognosis by itself. This may suggest that other clinical, genomic, proteomic, and metabolomic factors could contribute to clinical outcomes, along with imaging features. Therefore, the use of multi-omics approaches could enable more accurate outcome predictions for each patient, paving the way to 'personalized medicine'.
Our study has several limitations. First, its retrospective design and single-institution cohort may have a concealed selection bias. Although cross-validation and bootstrapping were used to compensate for such bias, there was a potential for overfitting. Second, we were not able to validate our models in external cohorts, although we did perform internal validation. Third, falsepositive lymph nodes may have been included in the analysis. Fourth, substantial variations in MR image acquisition may have affected predictive performance. However, the good predictive values, regardless of MRI protocol, suggest that these imaging features may be robust in a variety of MR scanners or imaging protocols. Lastly, the imaging features in our study were not evaluated for ADC values in diffusion-weighted images (DWI) for prognosis prediction, in contrast to previous studies that primarily investigated ADC values in cervical cancer [32,33]. This was because our patients were included to analyze long-term survival analysis; therefore, most pretreatment MR images were obtained before the role of DWI was established for patients with cervical cancer. Our future research will seek to augment these results using the imaging features extracted from DWI. Nevertheless, our current results provide valuable information about the predictive potential of pretreatment MR imaging and may provide baseline information useful for modifying the current uniform treatment for cervical cancer. We present the first quantitative analysis results separately evaluating the imaging features of primary tumors and lymph nodes in node-positive cervical cancer.

Conclusions
We successfully developed and validated RSF models for predicting clinical outcomes using pretreatment MR imaging features. The models using primary tumor features demonstrated superior performance for predicting LC, while those using lymph node features were superior at predicting RC, DMFS, and OS. Our results indicate that tumor and lymph node imaging features may play complementary roles for predicting clinical outcomes in node-positive cervical cancer.