Explore association of genes in PDL1/PD1 pathway to radiotherapy survival benefit based on interaction model strategy

Purpose To explore the association of genes in “PD-L1 expression and PD-1 check point pathway in cancer” to radiotherapy survival benefit. Methods and materials Gene expression data and clinical information of cancers were downloaded from TCGA. Radiotherapy survival benefit was defined based on interaction model. Fast backward multivariate Cox regression was performed using stacking multiple interpolation data to identify radio-sensitive (RS) genes. Results Among the 73 genes in PD-L1/PD-1 pathway, we identified 24 RS genes in BRCA data set, 25 RS genes in STAD data set and 20 RS genes in HNSC data set, with some crossover genes. Theoretically, there are two types of RS genes. The expression level of Type I RS genes did not affect patients' overall survival (OS), but when receiving radiotherapy, patients with different expression level of Type I RS genes had varied survival benefit. Oppositely, Type II RS genes affected patients' OS. And when receiving radiotherapy, those with lower OS could benefit a lot. Type II RS genes in BRCA had strong positive correlation and closely biological interactions. When performing cluster analysis using these related Type II RS genes, patients could be divided into RS group and non-RS group in BRCA and METABRIC data sets. Conclusions Our study explored potential radio-sensitive biomarkers of several main cancer types in an important tumor immune checkpoint pathway and revealed a strong association between this pathway and radiotherapy survival benefit. New types of RS genes could be identified based on expanded definition to RS genes. Supplementary Information The online version contains supplementary material available at 10.1186/s13014-021-01951-x.


Introduction
Radiation therapy remains the primary treatment for nearly two-thirds of cancers, including the primary curative or palliative treatment for breast cancer and adjuvant therapy for radical resection of gastric cancer [1][2][3]. Unfortunately, because of tumor heterogeneity, tumor response rates to radiotherapy vary conspicuously, even among patients who are diagnosed with the same tumor type [4]. Despite significant technological advances in radiation therapy for tumors in recent years, personalized radiotherapy regimens based on cancer biology have become increasingly difficult [5]. A major issue in radiation therapy is predicting cancer radio-sensitivity.
Biomarkers that provide information about tumor prognosis and predict tumor's inherent radiation sensitivity or its response to treatment may be valuable in helping to personalize radiation dose, allowing clinicians to make decisions about treatment regimens for different patients, while avoiding radiation-induced toxicity in patients who are unlikely to reap the benefits from the treatment [6,7]. Tumor molecular mapping has been used to develop radio-sensitive genetic signatures and has been used to identify prognostic or predictive biomarkers of radiation responses [8][9][10]. Given strong evidence of the pathway-based genetic nature of cancer, one of the main shortcomings of past studies is the failure to use prior biological information into identifying biomarkers [11]. The potential for carcinogenic mechanisms are grouped into pathways based on biological functions such as cell cycle, hypoxia, DNA damage, tumor microenvironment, immune checkpoints and others [12][13][14][15].
As a key and famous regulatory immune checkpoint, programmed death-1 (PD-1) and its ligand PD-L1 checkpoint pathway plays a crucial role in maintaining balance between immune tolerance and autoimmunity [16]. PD-L1 presented on the surface of the tumor cells activates the downstream of the PD-1 pathway to overinhibit T cells proliferation and differentiation [17] and promote immune escape and tumor growth [18]. In addition, the expression of PD-L1 has been found associated with tumor radio-sensitivity in a variety of solid tumor types also. When Bum-Sup Jang et al. [19][20][21] evaluated the predictive value of radio-sensitive gene signatures in invasive breast carcinoma and glioma, they found an interaction between radio-sensitive gene signatures and PD-L1. Xintong et al. [22] reported that in head and neck cancer, patients with high PD-L1 expression had better recurrence-free survival in receiving radiotherapy.
These evidences seem to indicate that PD-L1 expression with its regulation in solid tumors is affected by radiotherapy, thereby altering the outcome of patients' prognosis. In this case, there is requirement to acknowledge the association between regulatory mechanism of PD-L1/PD-1 in cancer and radiotherapy sensitivity. In solid tumors, up-regulation of PD-L1 is caused by activation of pro-survival pathways MAPK and PI3K/ Akt as well as transcriptional factors HIF-1, STAT3 and NF-kappa B [23]. It can be supposed that genes regulating PD-L1/PD-1 check point pathway may also associate with cancer radio-sensitivity and could be useful biomarkers for predicting radio-sensitive of cancer or as targets that promote radiation sensitivity. In fact, the relationship between these genes and radiotherapy sensitivity of gastric cancer has been investigated, and some conclusions have been obtained [24].
In this study, we have enhanced the evidence and supplemented the previous studies. Using TCGA data sets, we explored the association of genes in PD-L1/PD-1 check point pathway in several major cancers to radiotherapy survival benefit based on interaction model and validated in an external cohort. Conclusively, for precision medicine, our work offered more evidences and clues for using PD-L1/PD-1 related pathway genes as potential biomarkers to identify radio-sensitive for cancer patients or as targets that promote personalize radiation.
The gene expression data sets were collated to exclude normal tissues and retain tumor samples. At the same time, we examined clinical information on each type of tumor and found GBM had too few samples for non radiotherapy (n = 18) while LIHC had too few samples for radiotherapy (n = 14). These two data sets were abandoned. Next, we removed subjects with missing survival or radiotherapy information. Patients with survival time less than 5 days were also excluded [24]. Then univariate Cox analysis was performed on the remaining six tumor data sets, data sets (BRCA, HNSC, STAD) with radiotherapy being protective effect (hazard ratio, HR < 1, P < 0.05) were selected for subsequent analysis. In addition, external validation was performed using Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) cohort (https:// www. cbiop ortal. org/ datas ets). Figure 1 is the flow chart.

Radio-sensitive genes (RS genes)
We obtained a total of 73 genes (see Additional file 1: Figure S1, Additional file 3: Table S1) in "PD-L1 expression and PD-1 checkpoint pathway in cancer" from web of Kyoto Encyclopedia of Genes and Genomes (KEGG, https:// www. kegg. jp/). These genes are involved in the upstream regulation of PD-L1 expression or play a role in downstream of the PD-L1/PD-1 pathway to inhibit T cells proliferation and differentiation [25].
In this study, we defined radio-sensitivity as: participants with different gene expression levels obtained discrepant survival benefit from radiotherapy [24]. Based on the median value of a certain gene expression, the whole included patients were roughly divided into two groups as: the high expression group and the low expression group. And according to whether receiving radiotherapy or not, patients could be divided into radiotherapy (RT) group and non-radiotherapy (NRT) group. Thus, patients were divided into four groups: high expression RT (HRT) group, low expression RT (LRT) group, high expression NRT (HNRT) group, and low expression NRT (LNRT) group.
In a general way, if HRT group had better overall survival (OS) than HNRT group, while LRT group had no better OS than LNRT group for example, it was thought that the high expression group benefited from radiotherapy [24]. However, according to the definition of interaction effect [26,27], HRT group should have better OS than LRT group in the same time. In addition, there is another scenario that is usually overlooked: Different expression group might have different OS in NRT scenario. In summary, Fig. 2 displays two types of interaction relationship between gene levels and radiotherapy. In Fig. 2A (Type I), level A had no different effect compared to level B (say, P > 0.05) when in NRT scenario but had lower effect than level B (P < 0.05) when in RT scenario. Level B had significant improved effect in RT scenario compared to in NRT scenario (P < 0.05). It did not matter whether Level A had improved effect in RT scenario. In Fig. 2B (Type II), level A had significantly lower effect than level B (P < 0.05) when in NRT scenario but had no different effect (P > 0.05) in RT scenario. Level A had significant improved effect in RT scenario compared to in NRT scenario (P < 0.05). Whether Level B had improved effect in RT scenario did not matter.
Strictly speaking, the RS genes should be discussed in four scenarios. Scenario A: in high expression group, HRT group had better OS than HNRT group. Scenario B: in low expression group, LRT group had better OS than LNRT group. Scenario C: in RT group, HRT group had different OS compared to LRT group. Scenario D: in NRT group, HNRT group had different OS compared to LNRT group. If scenario A or/and B happened and meanwhile only one of scenario C (corresponding to Type I) or D (corresponding to Type II) happened, the gene could be considered sensitivity to radiotherapy and was deemed as a RS gene.

Analysis methods
The relationship between genes expression levels and radiotherapy survival benefit was analyzed by the multivariate Cox proportional hazards models, the fast backward [28] method based on Akaike information criterion (AIC) was used for variables selecting. In this study, we considered as many clinical variables as possible to screen out the best correction factors. Variables that remained in the multivariate Cox model were considered having an impact on OS. Specifically, for example, in high expression group, if radiotherapy had remained in multivariate Cox regression model (HR < 1), corresponding to scenario A happened; meanwhile in RT group, high expression group had HR < 1 compared to low expression group, corresponding to scenario C happened; and in NRT group, gene expression had no impact on OS (scenario D not happened). This gene was considered to a RS gene.
For missing variable data, R packet mice (multiple imputation by chained equations) was used for multiple interpolation [29]. Next we utilized the strategy of imputation stacking using R packet StackImpute, where multiple imputations of the missing data were stacked on top of each other to create a large data set [30]. We then estimated parameters in the analysis model by fitting a weighted model for Y|X on the stacked data set [31]. R packet pheatmap was used to perform cluster analysis based on gene expression. Kaplan-Meier (K-M) curves were used to show the survival curves. The log-rank test evaluated the statistically significant differences. Wilcoxon test was used to compare continuous variables that were non-normal. Correlation was calculated by Pearson correlation coefficient (r). |r| > 0.8: as strong correlation; 0.3 < |r| < 0.8: as moderate correlation; |r| < 0.3: as weak correlation [32]. The Search Tool for the Retrieval of Interacting Genes (STRING) [33] was applied to analyze protein-protein interaction (PPI) network (minimum required interaction score ≥ 0.7). All statistical analyses were performed using the R (4.0.2). A P value of 0.05 was considered significant. All statistical tests were two-sided.

Identification of RS genes
We take BRCA data set as an example to illustrate the identification of RS genes. Table 1 shows the  Tables S2 and Additional file 5: Table S3.
Additional file 6: Table S4 shows 24 RS genes identified in BRCA after clinical impact factors adjustment and part are shown in Fig. 3. Patients with high expression of MYD88, RASGRP1 and TRAF6 benefited from radiotherapy (scenario A and C had statistical significance, scenario D had no statistical significance, Type I RS genes). We called these genes as radio-sensitive genes within high expression (RGH). TIRAP and PTPN11 were also RGH genes (scenario A and D had statistical significance while scenario C had no statistical significance, Type II RS genes). Patients with low expression of HRAS, IKBKG, MAP2K2, TLR9 and CD3D, CD3G, IFNG, NFKBIA, PDCD1, CD274, PTPN6, STAT1, et cetera. benefited from radiotherapy (scenario B and C had statistical significance or scenario B and D had statistical significance). We called these genes as radio-sensitive genes within low expression (RGL). The unadjusted K-M curves of part of RS genes are shown in Fig. 4. RS genes of HNSC and STAD are also shown in Additional file 6: Table S4. We compared RS genes in the three tumor data sets and found some croassover genes (see Fig. 5A). CD3D and NFATC1 were the common genes in the three tumor data sets.

Relationship of RS genes in BRCA
We explored the correlation of expression level among these RS genes in BRCA patients (Fig. 6). There were weak to moderate correlation among Type I RS genes (Fig. 6A). However, a large number of Type II RS genes had strong positive correlation with each other (Fig. 6B/C). Further analysis of PPI network (Fig. 6D) shows that CD3D was at the hub position. The biological interactions between these Type II RS genes were closely related.
We performed cluster analysis using 7 Type I RS genes and 17 Type II RS genes respectively. Patients were divided into two clusters according to the outcome of cluster analysis. When using Type I RS genes, patients of the two clusters had no different radiotherapy survival benefit (see Additional file 2: Figure S2A/B). Nevertheless, when using Type II RS genes (STAT1 was not used due to extremely high expression value), patients of clus-ter2 (n = 350) had much improved survival benefit under radiotherapy (see Fig. 7A/B).

Distribution of RS genes in BRCA
We extracted BRCA patients receiving radiotherapy who survived more than 8 years (alive group, n = 49) and those who survived less than 3 years (dead group, n = 29) and compared the expression of 10 selected RS genes in the two groups (see Fig. 8A). From the box-plot, RAS-GRP1 and TRAF6 as RGH genes had a higher expression level in the alive group than in the dead group (P < 0.05), patients with higher expression level of RGH genes gained survival benefit from radiotherapy. By contrast, we also extracted patients not receiving radiotherapy (see Fig. 8B). Most RGL genes had a trend that their median expression values in the alive group (n = 32) would be higher than those in the dead group (n = 29). This suggested that the expression level of these RGL genes affected patients' OS.   Table S4). RASGRP1 and MAP2K2 were the common Type I RS genes in BRCA and META-BRIC (Fig. 5B) and LAT, PTPN11 and ZAP70 were the common Type II RS genes (Fig. 5C). When performed cluster analysis using RS genes from METABRIC, Type II RS genes could divided the patients into RS cluster (n = 637) and non-RS cluster (n = 1265) (Fig. 7C/D).

Discussion
Along with some chronic diseases such as cardiovascular disease, cancer remains one of the biggest killers of human health [34]. The World Health Organization (WHO, https:// www. who. int/) has recently announced on 5 March, 2021 that, the breast cancer has now overtaken lung cancer as the world's mostly commonly-diagnosed cancer and the new global breast cancer initiative highlights renewed commitment to improve survival. At the same day, new WHO publication provides guidance on radiotherapy equipment to fight cancer like colorectal and lung cancer. Radiotherapy is remain one of the most effective tools to mitigate pain and suffering associated with advanced cancers, also, improve the quality of life and survival [35,36]. Nevertheless, heterogeneity in terms of tumor characteristics, prognosis, and survival among cancer patients has been a persistent problem for many decades. Vast studies have shown that, the investigation of biomarkers related to radiation could provide another means by which radiotherapy could become personalized [2,37].
Understanding the mechanism of tumors is also a major issue in identifying effective biomarkers and potential drug targets of radio-sensitivity [38,39]. PD-1 and its ligand PD-L1 are important immune checkpoints as a potential therapeutic target in cancer [18]. PD-L1/ Fig. 4 continued PD-1 pathway plays a critical role in transmitting costimulatory molecules to activate T cells as the second signal and maintain the balance of the immune microenvironment [40]. Well, when the body is invaded by the tumors, the balance of the immune micro-environment is destroyed. PD-L1 on tumor cells may engage PD-1 receptors resulting in suppression of T-cell mediated immune response. Therapeutic antibodies blocking the PD-L1/ PD-1 pathway by targeting PD-L1 or PD-1 are highly effective in rescuing T cell anti-tumor effector functions [17,41]. In addition, the expression level of PD-L1 relate to the radiotherapy sensitivity of tumors [19,21]. As PD-L1 expression is regulated by the upstream signaling pathway, while PD-L1/PD-1 combination is transferred to the downstream T cell regulation as the second signal, the expression level of relevant genes in regulating PD-L1 expression and in PD-1 checkpoint pathway in cancer appears to be of vital importance, which may indicate the potential sensitivity of the tumor to radiotherapy. In this study, we explored the radio-sensitivity of genes in PD-L1 expression and PD-1 checkpoint pathway in cancer using several major TCGA data sets including BRCA, HNSC and STAD. When the initial univariate COX analysis was performed, radiotherapy had nonpositive effect (HR ≥ 1) to OS in lung cancer and LGG, we excluded these type of tumors for further exploration. In LGG data set, we performed chi-square test between survival status/radiotherapy status and main clinic factors. We found that older (≥ 60) and higher tumor grade patients commonly received a higher percentage of radiotherapy and meanwhile these people had a much higher percentage died of LGG. Such confounding factors also happened in LUAD and LUSC data sets.
In addition, we systematically considered influential clinical factors in the data sets according to literature [20, 22,24], clinical expertise knowledge and data missing condition. We performed ten of multiple interpolation to missing clinical variables that had a missing percentage lower than 20% and stacked them to perform weighted multivariate Cox regression. Using multiple imputation can better handle missing data by estimating and replacing missing values many times [42] and the result of using the stacked data set to perform weighted multivariate Cox regression was consistent with pooled data results by applying Rubin s combining rules [30]. Therefore, the influential clinical variables were well controlled to ensure the reliability of the results. In the BRCA data set, radiotherapy, chemotherapy, age, surgery type, margin status, PR status, N stage, M stage and pathological stage were the impact factors of OS, which were reasonable and validated [43].
Then, we developed a more comprehensive definition to radio-sensitive genes based on interaction theory. Theoretically, there are two types of RS genes. The expression level of Type I RS genes did not affect patients' overall survival (OS), but when receiving radiotherapy, patients with different expression level of Type I RS genes had varied survival benefit. Type II RS genes is the opposite. According to the updated definition, we identified 24 RS genes in BRCA data set, 25 RS genes in STAD data set, 20 RS genes in HNSC data set and 24 RS genes in METABRIC data set among genes in regulating PD-L1/ PD-1 pathway in cancer (93/292), with overlapping genes between each other. We performed the same strategy to search RS genes in the published radio-sensitivity "31gene signature" [44] as a positive contrast, and found 4 RS genes in BRCA data set, 11 RS genes in HNSC data set, 12 RS gene in STAD data set and 7 RS gene in METABRIC data set (34/124). In addition, we simulate a test to detect the relationship between survival and a random gene set as a negative contrast. Univariate Cox analysis shown that a proportion of 383/5000 genes (less than 10%) were related to survival benefit. Therefore, there was a strong relationship between PD-L1/PD-1 pathway in cancer and radiotherapy sensitivity.
Importantly, when we performed cluster analysis using the identified RS genes, Type II RS genes could divided the patients into RS group and non-RS group in different database (TCGA and METABRIC). These Type II RS genes had strong positive correlation and closely biological interactions with each other. In addition, RASGRP1 was common RGH & Type I RS gene in the two databases. Patients with higher expression level of RASGRP1 gained survival benefit from radiotherapy. RasGRP proteins play roles in such phenomena as: T cells maturation and functioning, B cells response, platelet aggregation, mast cells activity regulation, transformation and many other [45]. PTPN11 was   [46]. ZAP70 is related to the immunity of cancers [47,48]. This study has its merits. Firstly, we expanded the definition of radio-sensitive genes and explored the association of genes in important pathway of cancer to radiotherapy sensitivity using TCGA public data sets recognized as authoritative. Secondly, we took into account as much useful clinical information as possible to control impact factors by stacking multiple interpolation data, making the results more persuasively. Thirdly, we validated our strategy using a big external data set, METABRIC and proved that our conclusion was reliable. The limitation of this study is that we don't have performed experimental study, also no cohort to verify the findings. In addition, because we only explored a few major cancers, more tumor types should be brought into the discussion.

Conclusion
In conclusion, our study identified potential radiosensitive biomarkers of several main cancer types in an important tumor immune checkpoint pathway and revealed a strong association between this pathway and radiotherapy sensitivity. New types of RS genes could be identified based on expanded definition to RS genes. Different types of tumors may share common We hope that further studies will be performed to confirm our findings.