Novel risk scores for survival and intracranial failure in patients treated with radiosurgery alone to melanoma brain metastases

Purpose: Stereotactic radiosurgery (SRS) alone is an increasingly common treatment strategy for brain metastases. However, existing prognostic tools for overall survival (OS) were developed using cohorts of patients treated predominantly with approaches other than SRS alone. Therefore, we devised novel risk scores for OS and distant brain failure (DF) for melanoma brain metastases (MBM) treated with SRS alone.
Methods and materials: We retrospectively reviewed 86 patients treated with SRS alone for MBM from 2009-2014. OS and DF were estimated using the Kaplan-Meier method. Cox proportional hazards modeling identified clinical risk factors. Risk scores were created based on weighted regression coefficients. OS scores range from 0-10 (0 representing best OS), and DF risk scores range from 0-5 (0 representing lowest risk of DF). Predictive power was evaluated using c-index statistics. Bootstrapping with 200 resamples tested model stability.
Results: The median OS was 8.1 months from SRS, and 54 (70.1 %) patients had DF at a median of 3.3 months. Risk scores for OS were predicated on performance status, extracranial disease (ED) status, number of lesions, and gender. Median OS for the low-risk group (0-3 points) was not reached. For the moderate-risk (4-6 points) and high-risk (6.5-10 points) groups, median OS was 7.6 months and 2.4 months, respectively (p < .0001). Scores for DF were predicated on performance status, ED status, and number of lesions. Median time to DF for the low-risk group (0 points) was not reached. For the moderate-risk (1-2 points) and high-risk (3-5 points) groups, time to DF was 4.8 and 2.0 months, respectively (p < .0001). The novel scores were more predictive (c-index = 0.72) than melanoma-specific graded prognostic assessment or RTOG recursive partitioning analysis tools (c-index = 0.66 and 0.57, respectively).
Conclusions: We devised novel risk scores for MBM treated with SRS alone. These scores have implications for prognosis and treatment strategy selection (SRS versus whole-brain radiotherapy).


Introduction
Melanoma brain metastases (MBM) are a common type of secondary intracranial neoplasm and will develop in nearly half of patients with advanced cutaneous melanoma [1][2][3]. The rate of MBM is likely to rise given the increasing incidence of melanoma and advances in systemic disease control with targeted therapies [4,5]. The overall survival (OS) of these patients is generally poor, and many suffer a neurologic death [2,6,7].
Radiotherapy treatment options for MBM include whole-brain radiation therapy (WBRT) and stereotactic radiosurgery (SRS) [8]. WBRT irradiates both the known metastases and potential microscopic disease, maximizing intracranial control but at the cost of neurotoxicity [7][9][10][11][12][13][14]. Focal SRS targets only the visible disease and spares the remaining brain; however, there is an increased risk of new distant brain metastases with SRS alone, which can independently impact cognition [15][16][17][18][19]. While the optimal strategy remains controversial, SRS alone is an increasingly common treatment approach, particularly for patients with a limited volume of metastatic disease [20].
To tailor treatment to individual patients, several important prognostic tools have been created for patients with brain metastases. In 1997, Gaspar et al. [21] analyzed 1200 patients from three Radiation Therapy Oncology Group (RTOG) trials. Using recursive partitioning analysis (RPA), three classes were devised that stratified patients based on survival. The RPA classes were further improved by Sperduto et al. in 2008 with the creation of the graded prognostic assessment (GPA) [22]. Neither tool, however, was specific for primary disease histology. Recognizing the prognostic differences between tumor types, a set of disease-specific GPAs was then devised [23]. These included a melanoma-GPA, which identified performance status and number of MBM as prognostic for survival. One limitation of the melanoma-GPA is the widely heterogeneous treatment approaches in the development cohort. Patients were managed with WBRT alone, SRS alone, WBRT plus SRS, surgery followed by WBRT, surgery followed by SRS, or a combination of all three modalities. Importantly, the majority of patients were treated with a strategy other than SRS alone. Even with these existing tools, the ability to predict survival in SRS patients remains poor [24].
Therefore, this study analyzed MBM patients treated solely with SRS and sought to create risk scores for survival that could improve upon the existing melanoma-GPA. Secondary aims included identifying predictors of distant brain failure, potentially identifying patients who may benefit from WBRT.

Data collection
With approval of the institutional review board, this study retrospectively identified 86 consecutive patients with intact MBM treated with SRS alone from 2009 to 2014 at the University of Pennsylvania. Patient, disease, and treatment characteristics were retrieved from electronic medical records and GammaPlan software treatment records. Primary cutaneous melanoma diagnosis was recorded at the first date of histologically confirmed melanoma. Mutation status was classified as BRAF wild-type (WT) or BRAF mutation, including V600E, K601E, or V600K. Brain metastasis diagnosis was defined as the date of first metastatic disease on brain magnetic resonance imaging (MRI) or computed tomography (CT).
Extracranial disease (ED) status was categorized as active, stable, or absent based on CT scans of the chest, abdomen, and pelvis or positron emission tomography/computed tomography (PET/CT) within two months of SRS. Active ED indicated patients with new or increasing burden of metastatic melanoma to solid organs outside the brain, including patients with newly diagnosed MBM with co-existing extracranial metastases. Stable ED denoted patients with previously treated extracranial metastases with either a partial response or stable size/metabolic activity. Absent ED indicated patients with no history of extracranial metastases or previously treated extracranial disease with complete radiographic response. RPA class and melanoma-GPA score were assigned according to Gaspar et al. [21] and Sperduto et al. [23], respectively.
GammaPlan software was used to retrospectively record MBM tumor volumes and SRS treatment volumes. Tumor volumes of individual lesions were obtained from the SRS planning T1-weighted, contrast-enhanced MRI. To avoid inter-planner variability, a single investigator (I.C.) with attending supervision (M.A.B.), both blinded to OS and distant failure (DF) data, retrospectively contoured each lesion. Treatment volumes of individual metastases were defined as the volume of brain tissue receiving at least the prescribed marginal dose for each MBM. Total tumor and treatment volumes were calculated by summing all respective individual volumes. Systemic therapy was classified as peri-SRS if administered at the time of SRS or completed within two months of SRS. Systemic therapy was alternatively designated as post-SRS if administered during the interval between SRS and DF, or if it was the first therapy given after SRS in patients without DF. Dates of death were determined from the Social Security Death Index, hospice records, and local newspaper obituaries.

Stereotactic radiosurgery
Radiosurgery was performed using the Model 4-C or Perfexion Gamma Knife (GK) (Elekta Inc., Stockholm, Sweden) with GammaPlan software. A Leksell stereotactic headframe was applied with local anesthesia, and high-resolution brain MR images were taken at 1-mm slices with gadolinium contrast. Additional new MBM discovered on the planning images were targeted with SRS in the same session. Per institutional standards, post-SRS follow-up brain MRI was obtained approximately every 2 months for 1 year and then every 3 months afterward.

Statistical analysis
All 86 patients were included in the OS analysis, with living patients or those lost to follow-up censored at the date of last clinical encounter. DF was analyzed in 77 (89.5 %) patients with follow-up imaging. The remaining nine (10.5 %) patients died or were lost to follow-up prior to first post-treatment intracranial imaging. DF was defined as leptomeningeal disease or new parenchymal MBM at sites other than previous treatment. Patients free of any failure were censored at the date of last imaging showing intracranial control.
The median follow-up time was computed using the inverse Kaplan-Meier method [25], while OS and time to DF were estimated using the Kaplan-Meier method. Variables with a p-value ≤0.10 on univariate analysis were considered for multivariable analysis. For clinical utility, continuous variables were categorized using previously described techniques [26]. Prior to modeling, correlations between variables were checked for multicollinearity. Missing data points (i.e., BRAF mutation status, N = 15) were addressed via multivariable imputation.
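As a minimal sketch of the estimators named above, the following standard-library Python computes a Kaplan-Meier curve, a median time (returning `None` when the median is "not reached"), and a median follow-up via the inverse Kaplan-Meier approach, in which the event and censoring indicators are swapped. The function names and the toy data are illustrative assumptions; the study itself used SPSS and SAS, and none of the values below come from the cohort.

```python
from typing import List, Optional, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct event time (product-limit estimate).

    events[i] is 1 for an observed event and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing this time point; subjects censored at t
        # are still counted as at risk for events occurring at t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_time(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which S(t) <= 0.5, or None if the median is not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


def median_follow_up(times: List[float], events: List[int]) -> Optional[float]:
    """Inverse Kaplan-Meier: treat censoring as the event of interest."""
    flipped = [1 - e for e in events]
    return median_time(kaplan_meier(times, flipped))


# Hypothetical toy data (months, event indicator); not taken from the study.
toy_times = [2, 3, 3, 5, 8, 12]
toy_events = [1, 1, 0, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
```

A `None` result from `median_time` corresponds to the "not reached" medians reported for the low-risk groups.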
Two different multivariable models were developed to identify predictors of OS and DF. The first used all variables that demonstrated significance during univariate Cox proportional hazards modeling, while the second used a backwards, stepwise elimination procedure (exit criteria: p > 0.05) to identify the most parsimonious model. Bootstrapping with 200 resamples was used to test model stability and control for over-optimism.
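The resampling step can be sketched as follows, assuming a generic nonparametric bootstrap in which patients are drawn with replacement and a statistic is recomputed on each resample. This is not the authors' SAS/SPSS procedure: `bootstrap_statistic`, `percentile_interval`, `event_rate`, and the toy rows are hypothetical names and data chosen for illustration. In practice the statistic would be a refitted model coefficient or a c-index.

```python
import random
from typing import Callable, List, Sequence, Tuple


def bootstrap_statistic(
    rows: Sequence,
    statistic: Callable[[Sequence], float],
    n_resamples: int = 200,
    seed: int = 0,
) -> List[float]:
    """Recompute `statistic` on n_resamples samples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(rows)
    estimates = []
    for _ in range(n_resamples):
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(sample))
    return estimates


def percentile_interval(
    estimates: List[float], lower: float = 0.025, upper: float = 0.975
) -> Tuple[float, float]:
    """Simple percentile interval over the bootstrap estimates."""
    ordered = sorted(estimates)
    last = len(ordered) - 1
    return ordered[int(lower * last)], ordered[int(upper * last)]


def event_rate(sample: Sequence) -> float:
    """Toy statistic: fraction of resampled patients with an event."""
    return sum(died for _, died in sample) / len(sample)


# Hypothetical rows of (risk_factor_present, died); not study data.
toy_rows = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 1), (0, 0)]
toy_estimates = bootstrap_statistic(toy_rows, event_rate, n_resamples=200)
toy_interval = percentile_interval(toy_estimates)
```

A narrow spread of estimates across resamples suggests a stable result. A variable whose effect changes sign or vanishes in many resamples would be a weaker candidate for a risk score.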
Next, significant factors relative to each outcome of interest were used to derive risk scores for OS and DF. Point values were assigned to each risk factor based on weighted regression coefficients from the Cox proportional hazards model. Reference categories for each variable were assigned zero points. Scores were tabulated for each patient based on the presence of weighted risk factors, and patients were then sub-classified into clinically useful risk groups. Patients were assigned to one of three risk groups for OS based on similarities in survival patterns. This process was repeated for DF risk; group assignments for DF risk were made independently of the patient's designation for OS risk.
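One way to turn regression coefficients into points is sketched below. It assumes that each coefficient is the natural log of the hazard ratio reported in the OS results, that the worst-case profile is scaled to 10 points, and that points are rounded to the nearest half point. These weighting and rounding choices, the dictionary keys, and the function names are illustrative assumptions; the published point values are those in Table 3 and may have been derived differently. The group cut-offs follow the ranges stated in the abstract (0-3, 4-6, and 6.5-10 points).

```python
import math
from typing import Dict

# Hazard ratios as reported in the OS results of this paper.
HAZARD_RATIOS = {
    "kps_le_80": 8.1,
    "any_ed": 5.4,
    "lesions_2_to_4": 2.6,
    "lesions_gt_4": 3.2,
    "male": 1.8,
}


def derive_points(
    hazard_ratios: Dict[str, float], max_score: float = 10.0, step: float = 0.5
) -> Dict[str, float]:
    """Scale log hazard ratios so the worst-case profile sums to max_score."""
    coefs = {name: math.log(hr) for name, hr in hazard_ratios.items()}
    # The two lesion categories are mutually exclusive, so only the larger
    # coefficient contributes to the worst-case total.
    worst_case = (
        coefs["kps_le_80"]
        + coefs["any_ed"]
        + max(coefs["lesions_2_to_4"], coefs["lesions_gt_4"])
        + coefs["male"]
    )
    scale = max_score / worst_case
    # Assumption: round each weight to the nearest half point.
    return {name: round(c * scale / step) * step for name, c in coefs.items()}


def os_score(patient: Dict[str, bool], points: Dict[str, float]) -> float:
    """Sum the points of every risk factor present; reference levels add 0."""
    return sum(points[name] for name, present in patient.items() if present)


def os_risk_group(score: float) -> str:
    """Cut-offs taken from the abstract: 0-3 low, 4-6 moderate, 6.5-10 high."""
    if score <= 3:
        return "low"
    if score <= 6:
        return "moderate"
    return "high"


points = derive_points(HAZARD_RATIOS)
# Hypothetical patient: KPS <= 80, no extracranial disease, one lesion, male.
example_patient = {"kps_le_80": True, "any_ed": False,
                   "lesions_2_to_4": False, "lesions_gt_4": False, "male": True}
example_group = os_risk_group(os_score(example_patient, points))
```

Because reference categories contribute zero points, a patient with none of the listed risk factors scores 0 under this sketch.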
Model discrimination was evaluated using Harrell's concordance index (c-index) [27] for both OS and DF. A c-index value of 0.5 indicates no predictive capacity, while 1.0 indicates perfect discrimination. Harrell's c-indices were also calculated for the melanoma-GPA and RPA for OS. These results were compared to the c-index derived from our SRS-specific MBM OS risk scores.
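A simplified sketch of Harrell's c-index is given below, assuming that a pair of patients is comparable only when the patient with the shorter time had an observed event, and that tied risk scores count as half-concordant. Pairs with tied times are skipped for brevity, which is a simplification of the full definition. The function name and the toy inputs are illustrative.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float], events: Sequence[int], risk_scores: Sequence[float]
) -> float:
    """Fraction of comparable pairs in which the higher risk score
    belongs to the patient with the shorter observed time."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # the earlier time in a pair must be an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores carry no ranking information
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: higher score should mean shorter survival.
toy_c = harrell_c_index([2, 4, 6], [1, 0, 1], [3, 1, 2])
```

Assigning every patient the same score yields 0.5 under this definition, matching the "no predictive capacity" reference value.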
Descriptive statistics were performed for all variables. P-values ≤0.05 were considered statistically significant. Statistical computations were performed utilizing IBM SPSS, version 22 (IBM Corp., Armonk, NY) and SAS 9.3 (SAS Institute Inc., Cary, NC).

Patient characteristics
Patient and disease characteristics are presented in Table 1. Based on established criteria [21,28], the majority of patients were classified as RPA class II (89.5 %), while melanoma-GPA categories were more evenly distributed.

Overall survival
With a median follow-up of 37.4 (IQR 13.8-47.8) months, median OS for the cohort was 8.1 (IQR 4.0-19.2) months from SRS. On univariate analysis, factors associated with worse OS included Karnofsky performance status (KPS) ≤80, presence of any ED (absent vs. stable/active), presence of active ED (absent/stable vs. active), 2-4 lesions, >4 lesions, and not receiving post-SRS systemic therapy. Presence of a BRAF mutation was not associated with worse OS compared to BRAF WT. On multivariable analysis, KPS ≤80 (HR 8.1, P < .0001), presence of any ED (absent vs. stable/active: HR 5.4, P = .05), presence of 2-4 lesions (HR 2.6, P = .04), and >4 lesions (HR 3.2, P = .002) remained significantly associated with worse OS (Table 2). Although male gender was not significant on univariate analysis, stepwise regression of all variables identified gender as significant (HR 1.8, P = .03), even when subjected to bootstrapping. Therefore, male gender was included as a component of the OS risk score. RPA class and melanoma-GPA scores were not included in the multivariable analysis in order to avoid collinearity.
Values for the scoring system were determined from the weighted proportions of hazard ratios (Table 3). The most heavily weighted risk factor was KPS ≤80, followed by presence of any ED, number of lesions (2-4 or >4), and gender. Total point values could range from 0-10, with 0 representing best OS and 10 representing worst OS. After tabulating individual scores, patients were classified into 3 risk groups (low, moderate, and high) based on similar survival patterns. OS differed significantly between the groups (P < .0001 using the log-rank test for all pairwise comparisons between groups). Median OS estimates for risk groups are presented in the bottom portion of Table 3. A visual representation of OS between the different risk groups is shown in Fig. 1, panel a. The novel risk scores had a higher Harrell's c-index (c-index = 0.72) than the melanoma-GPA (c-index = 0.66) and the RPA (c-index = 0.57).
A methodological approach similar to the one used to create the OS risk score was applied to calculate patients' risk of DF. DF risk scores ranged from 0-5, with 0 representing the lowest risk of DF and 5 representing the greatest risk of DF (Table 5). Risk groups for DF (low, moderate, and high) had significantly different patterns of recurrence (P < .0001 using the log-rank test for all pairwise comparisons between groups). The median times to DF for the risk groups are presented in Table 5. Similar to the OS risk model, the risk groups derived from the DF model demonstrated a high predictive capacity (c-index = 0.72). A visual representation of DF between the different risk groups is shown in Fig. 1, panel b.
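The pairwise log-rank comparisons between risk groups can be illustrated with the two-group sketch below, which accumulates observed minus expected events in the first group with a hypergeometric variance and refers the statistic to a chi-square distribution with one degree of freedom. The function name and toy data are illustrative assumptions; a three-group analysis would call it once per pair of groups, and no adjustment for multiple comparisons is shown.

```python
import math
from typing import Sequence, Tuple


def logrank_test(
    times_a: Sequence[float], events_a: Sequence[int],
    times_b: Sequence[float], events_b: Sequence[int],
) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi_square, p_value) with 1 df."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Numbers at risk just before t, and events occurring at t, per group.
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)
        n_b = sum(1 for u, _, g in pooled if u >= t and g == 1)
        d_a = sum(1 for u, e, g in pooled if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in pooled if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square with 1 df: P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical toy groups (months, all events observed); not study data.
toy_chi2, toy_p = logrank_test([1, 2, 3, 4, 5], [1] * 5,
                               [10, 11, 12, 13, 14], [1] * 5)
```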

Systemic therapy & adverse events
Systemic therapy analysis included the 77 patients with follow-up imaging, of whom 67 (87 %) received systemic therapy and 10 (13 %) did not. When stratifying systemic therapy among the OS risk groups, [...] ipilimumab. In the moderate-risk group, an equal percentage (20 %) received ipilimumab, temozolomide, or vemurafenib. An equal percentage (26 %) of high-risk patients received either temozolomide or no therapy, while 13 % of patients received ipilimumab. Among patients with follow-up, 7 (9.1 %) had symptomatic radiation necrosis and 9 (11.7 %) had hemorrhagic metastases.
Table abbreviations: BRAF, B-Raf proto-oncogene; CI, 95 % confidence interval; KPS, Karnofsky performance status; peri-SRS, at the time of stereotactic radiosurgery; post-SRS, after stereotactic radiosurgery; WT, wild-type. a: cut-off at the upper quartile of age.

Discussion
In this study, we devised novel risk scores for overall survival and distant brain failure for patients with MBM treated with SRS alone. In the survival risk score, performance status, presence of any ED (active or stable), number of lesions, and gender were clinical predictors of survival, in descending order of significance. Three risk zones were defined (Table 3), partitioning patients into groups with significantly different expected survival. The new survival score had a higher Harrell's c-index than either the RPA classes or the melanoma-GPA (c-index = 0.72, versus 0.57 and 0.66, respectively). Those existing tools were developed in patient cohorts treated predominantly with strategies other than SRS alone; however, patients selected for an SRS-only approach tend to have unique characteristics such as high performance status and low-to-moderate intracranial metastatic burden. These cases may reflect an inherently different disease biology compared to patients recommended for treatment with WBRT or multimodality combinations, as seen in the melanoma-GPA cohort.
In our distant brain failure score, performance status, presence of any ED (active or stable), and number of lesions were factors associated with DF on multivariable analysis. Three risk zones were defined (Table 5), partitioning patients into groups with significantly different expected intracranial failure rates. Prior investigations of MBM support the inclusion of ED status [29,30] and number of metastases [30][31][32] as predictors of DF. Recently, Huttenlocher et al. also developed a tool for estimating DF in SRS-treated MBM [33]. Their final model included ED status and number of lesions, but their patient cohort was limited to cases with 1-3 metastases. Recent evidence suggests that SRS may be appropriate for more than three metastases with low overall tumor burden [29][34][35][36], and our tool extends the ability of clinicians to estimate DF for this population.
Despite existing prognostic tools, physicians are not able to accurately judge survival for many patients treated with SRS for brain metastases [24]. The current study may improve this prognostic ability. Survival estimates are important anticipatory information for patients and may assist in weighing the relative risks and benefits of treatment. Furthermore, existing tools (i.e., melanoma-GPA and RPA) are only prognostic for OS. However, assessing the risk of DF may be important when considering whether to use SRS or WBRT [37]. In our novel DF model, patients in the low-risk group may benefit from SRS alone, which controls the index lesions while minimizing the volume of irradiated brain. This may prevent the neurocognitive side effects associated with WBRT [38][39][40]. On the other hand, patients at high risk of new metastases may benefit more from micrometastatic disease control with WBRT. Several studies have reported that intracranial progression causes more severe neurocognitive deficits than exposure to WBRT [15][16][17][18][19]. Therefore, WBRT may best preserve cognitive functioning in the specific group of patients with high risk of DF. Patients in the moderate-risk zone may require a more individualized treatment plan based on the larger clinical picture. Since the use of SRS alone for brain metastases is increasing [20], our OS and DF models will help clinicians accurately assess prognosis in a new era of treatment, while identifying a subset of patients that may benefit more from WBRT than from SRS alone.
This study has several limitations, including the biases inherent to a retrospective investigation. Additionally, although both models were internally validated using bootstrapping methodology, the risk scores were developed using a single-institution cohort; therefore, a subsequent external validation is necessary to confirm the generalizability of the findings. Furthermore, systemic therapy is also rapidly evolving for melanoma, and it is possible that newer agents (e.g., immune checkpoint and BRAF inhibitors) could become part of standard treatment in the future. Several phase II trials have shown the potential activity of these inhibitors for MBM [41,42]. Recent retrospective analyses suggest a benefit of combining SRS with ipilimumab; for example, Kiess et al. demonstrated that administering ipilimumab concurrently with SRS was associated with improved local control and survival [43,44]. These preliminary results are being tested in phase III trials of ipilimumab (clinicaltrials.gov NCT01703507 and NCT01950195) and dabrafenib (clinicaltrials.gov NCT01721603) in conjunction with SRS. Additionally, systemic therapy varied widely in our patients. Therefore, the model's accuracy may be different in cohorts with an alternative distribution of systemic therapy. However, systemic therapy was not significantly associated with OS or DF in this study, and our patient cohort represents contemporary treatment trends. Finally, neurocognitive data were not collected, which hindered incorporation of neurologic deaths into our analysis. Many of these limitations will be addressed by an open phase III randomized trial assessing MBM treated with SRS with or without WBRT [45,46].

Conclusion
In conclusion, this study developed novel risk scores for survival and distant brain failure in patients with MBM treated with SRS alone. The survival score demonstrated a higher predictive capacity than existing tools such as RPA class or melanoma-GPA, and the distant failure score identifies a subset of patients who may benefit from WBRT more than SRS.