Generating synthetic CT from low-dose cone-beam CT by using generative adversarial networks for adaptive radiotherapy

Abstract

Objective

To develop a method for generating high-quality synthetic CT (sCT) images from low-dose cone-beam CT (CBCT) images by using attention-guided generative adversarial networks (AGGAN) and to apply these images to dose calculations in radiotherapy.

Methods

The CBCT/planning CT images of 170 patients undergoing thoracic radiotherapy were used for training and testing. The CBCT images were acquired under a fast protocol with 50% fewer projection frames than the standard chest M20 protocol. Training with aligned paired images was performed using conditional adversarial networks (so-called pix2pix), and training with unpaired images was carried out with cycle-consistent adversarial networks (cycleGAN) and AGGAN, through which sCT images were generated. The image quality and Hounsfield unit (HU) values of the sCT images generated by the three neural networks were compared. The treatment plan was designed on CT and copied to the sCT images to calculate the dose distribution.

Results

The image quality of the sCT images generated by all three methods was significantly improved compared with the original CBCT images. AGGAN achieved the best image quality in the testing patients, with the smallest mean absolute error (MAE, 43.5 ± 6.69), the largest structural similarity (SSIM, 93.7 ± 3.88), and the highest peak signal-to-noise ratio (PSNR, 29.5 ± 2.36). The sCT images generated by all three methods showed superior dose calculation accuracy, with higher gamma passing rates than the original CBCT images. AGGAN offered the highest gamma passing rate (91.4 ± 3.26) under the strictest criterion of 1 mm/1%. In the phantom study, the sCT images generated by AGGAN demonstrated the best image quality and the highest dose calculation accuracy.

Conclusions

High-quality sCT images were generated from low-dose thoracic CBCT images by using the proposed AGGAN through unpaired CBCT and CT images. The dose distribution could be calculated accurately based on sCT images in radiotherapy.

Introduction

Cone-beam CT (CBCT) images are widely used in image-guided radiotherapy (IGRT) [1,2,3], and they are important for decreasing the positioning error and increasing the accuracy of treatment for patients with cancer. Compared with images from traditional fan-beam CT, CBCT images suffer from low contrast and artifacts caused by X-ray scattering, limited mechanical accuracy, and patient movement during scanning [4, 5], resulting in serious distortion of the Hounsfield unit (HU) values. Hence, CBCT images are unsuitable for calculating dose distributions for replanning in adaptive radiotherapy. In addition, patients may undergo multiple CBCT scans during an IGRT treatment course, which raises concern about the dose delivered to the patients. A previous study indicated that daily CBCT scanning for IGRT could increase the risk of secondary cancer by 2% to 4% [6]. To reduce this additional imaging dose, researchers have proposed several low-dose CBCT imaging technologies [7, 8], and low-dose CBCT scanning protocols are now widely used in clinical practice.

Many methods of using CBCT images in adaptive radiotherapy have been proposed, including water–air–bone density assignment [9, 10], model-based improvement of the CBCT imaging process [11,12,13], and deformable image registration (DIR) of CT/CBCT images [14, 15]. Direct HU–electron density calibration of CBCT images has relatively low accuracy because no artifact reduction is applied. Electron density assignment is time consuming and depends on human experience. Arai [16] modified the HU values of CBCT images to match the planning CT images with a histogram matching algorithm and evaluated the method in a phantom and in head and neck cancer patients. Traditional model-based CBCT correction is often realized by building complex physical models to simulate scattering [17,18,19,20] or by modifying hardware; such methods are difficult to deploy widely because of hardware limitations or the computational cost of the physical models. Mainegra-Hing [17] calculated the scatter contribution of CBCT in a phantom with a Monte Carlo (MC) algorithm. Niu [19] proposed a prior CT-based scatter correction method, in which the corresponding planning CT projections are used to correct the CBCT projections, and evaluated it in two phantom studies. Park [20] applied the prior CT-based scatter correction technique to phantoms and a prostate patient for proton dose calculation. The prior CT-based scatter correction method assumes that the anatomical structure of the CBCT is completely consistent with that of the planning CT after registration, a premise that is difficult to satisfy in clinical practice, particularly in the thorax and abdomen. DIR transforms the planning CT to the CBCT through deformable registration to account for anatomical changes. This type of method achieves good results at sites that are stationary, such as the head and neck. However, its registration accuracy needs to be improved at sites with considerable anatomical changes, such as the chest and abdomen [15].

Another method of correcting the HU values of CBCT images is to generate synthetic CT (sCT) images from CBCT images through deep learning [21,22,23,24,25,26,27,28,29,30,31]. This approach establishes a complicated mapping between CBCT and CT by training neural networks, thus allowing sCT images to be generated from CBCT directly. sCT has the same anatomical structure as CBCT, and the HU values of its tissues are close to those of the planning CT. Chen [22] used Unet to generate sCT images from the CBCT of patients with head and neck cancer, with a loss function combining the mean absolute error (MAE) and the structural similarity index (SSIM); the MAE between sCT and CT in the testing results was 18.98 HU. Similarly, Li [21] added a residual unit to Unet to generate sCT from the CBCT of patients with head and neck cancer, and the MAE between sCT and CT ranged from 6 to 27 HU. Instead of generating sCT directly, Hansen [28] proposed ScatterNet, in which pairs of measured and corrected projections were trained using a Unet-like architecture; the corrected projections were obtained with the prior CT-based scatter correction method [19]. Lalonde [29] applied MC simulation to generate CBCT projections for head and neck patients and then trained a Unet to reproduce MC projection-based scatter correction from raw projections; the MAE of the scatter-corrected images was 13.4 HU, compared with 69.6 HU for the uncorrected images. Landry [30] compared Unet training with three different datasets to correct CBCT images for prostate patients: raw and corrected CBCT projections; raw CBCT images and DIR-synthetic CTs; and raw CBCT images and CBCT images reconstructed from corrected projections. Supervised learning methods such as Unet [32] require paired CBCT/CT images as the training dataset, and a voxel-wise loss is usually applied. However, these methods need highly accurate alignment of the paired images, which is difficult to achieve in the clinic, especially at sites with considerable anatomical changes, such as the chest and abdomen.

The development of generative adversarial networks (GAN) [33] has provided a new technology and framework for medical image applications. GANs have achieved state-of-the-art performance in many medical image tasks, including segmentation [34, 35], classification [36, 37], and medical image synthesis [38,39,40]. Isola [41] proposed conditional adversarial networks (cGAN) for image-to-image translation (so-called pix2pix), which have been widely used in medical image reconstruction [40] and cross-modality synthesis [38, 39]. Maspero [38] applied pix2pix to MR-to-sCT generation on 2D paired transverse image slices of 32 prostate cancer patients. Cusumano [39] used cGAN to generate sCT from low-field MR (0.35 T) images of the pelvis and abdomen for MR-guided adaptive radiotherapy. Quan [40] reconstructed MR images from under-sampled k-space using pix2pix. Zhu [42] proposed an unsupervised cycle-consistent adversarial network (cycleGAN) to solve image translation for unpaired datasets, and it has been applied extensively in unpaired medical image translation [43]. Liang [23] utilized cycleGAN to generate sCT from the CBCT of patients with head and neck cancer using an unpaired training dataset; a phantom experiment demonstrated that the method has better anatomical accuracy than the DIR method. Kida [26] trained cycleGAN on unpaired CBCT/CT images of 20 patients with prostate cancer and found that the image quality of sCT improves substantially compared with the original CBCT. Harms [24] fed 3D image patches to cycleGAN for CBCT-to-sCT generation in patients with brain and pelvic cancer; they used paired CBCT and CT images in training and found mean absolute errors (MAEs) of 13.0 and 16.1 HU for the brain and pelvis, respectively. Building on the study of Harms [24], Liu [25] added self-attention to the generator network of cycleGAN for CBCT-to-sCT generation in patients with pancreatic cancer and calculated the radiotherapy dose distribution. These studies on sCT generation from CBCT images concentrated on the head or abdomen; limited work has been conducted on CBCT images of the thorax, and low-dose CBCT-to-sCT generation has not been studied.

In this study, unpaired low-dose CBCT and CT images of the thorax were trained using GANs. The low-dose CBCT images were obtained under a fast protocol with 50% fewer projection frames than the standard protocol. The sCT images generated from CBCT were used to calculate the dose distribution for adaptive radiotherapy. Given that the anatomical structure changes considerably due to respiratory movement, acquiring perfect alignment of CT/CBCT images is difficult; hence, GANs were selected for unsupervised training. Furthermore, the low-dose CBCT images of the thorax include considerable artifacts, such as streaking, shading, and cupping caused by X-ray scatter and respiratory movement of patients, and these artifacts disturb image translation tasks. We used an attention-guided GAN (AGGAN) [44], which focuses on the important parts of the images, to eliminate these artifacts. Moreover, cycleGAN [42] and conditional GAN (so-called pix2pix) [41] were used for CBCT-to-sCT generation, and the quality of the sCT images generated by the different neural networks was compared. Then, a quantitative assessment of the generated sCT images was performed on a thoracic phantom, and the dose distribution of a radiotherapy plan was calculated.

Materials and methods

Image acquisition and processing

The low-dose CBCT and planning CT images of 170 patients who underwent free-breathing thoracic radiotherapy in our hospital were collected; 136 pairs served as the training dataset and 34 pairs as the testing dataset. The CBCT images were acquired through XVI scanning on an Infinity linear accelerator (Elekta, Stockholm, Sweden). In this study, a fast CBCT protocol was used to obtain low-dose CBCT images. Compared with the built-in standard protocol, the fast protocol accelerates the gantry rotation and decreases the number of projection frames, thus decreasing the scanning time and the radiation dose to patients, although image quality is reduced to some extent [8, 45]. The fast protocol was realized by increasing the gantry rotation speed of the standard chest M20 protocol from 180°/min to 360°/min while keeping the other parameters constant, which reduced the number of projection frames from 660 to 330 per scan. The gantry rotated 360° during each CBCT scan, and 330 projection frames were collected. The planning CT images were acquired on a Siemens CT scanner (SOMATOM Force, Germany). The scanning and reconstruction parameters of CBCT and CT are listed in Table 1. The CT images were resampled to match the resolution of the CBCT images. Then, the CBCT images of each patient were used as fixed images, and the corresponding CT images were aligned with them via 3D rigid registration. For the testing dataset, deformable registration with a multi-resolution B-spline algorithm was additionally performed to pair the CT with the corresponding CBCT. Afterward, the CT images were cropped to the same field of view (FOV) as the corresponding CBCT images.

Table 1 The scanning and reconstruction parameters of CBCT and planning CT image
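For readers who want to reproduce the preprocessing, the sketch below shows one way to perform the 3D rigid registration and resampling with SimpleITK. It is a minimal illustration under assumptions: the file names are hypothetical placeholders, and the mutual-information metric and optimizer settings are generic choices rather than the configuration used in this study.

```python
import SimpleITK as sitk

# Read the planning CT and CBCT volumes (hypothetical file names).
ct = sitk.ReadImage("planning_ct.nii.gz", sitk.sitkFloat32)
cbct = sitk.ReadImage("cbct.nii.gz", sitk.sitkFloat32)

# 3D rigid registration: the CBCT is the fixed image, the CT is moving.
reg = sitk.ImageRegistrationMethod()
reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)
reg.SetOptimizerAsRegularStepGradientDescent(learningRate=1.0,
                                             minStep=1e-4,
                                             numberOfIterations=200)
reg.SetInterpolator(sitk.sitkLinear)
initial = sitk.CenteredTransformInitializer(
    cbct, ct, sitk.Euler3DTransform(),
    sitk.CenteredTransformInitializerFilter.GEOMETRY)
reg.SetInitialTransform(initial, inPlace=False)
rigid = reg.Execute(cbct, ct)

# Resample the CT onto the CBCT grid so both share resolution and FOV.
ct_aligned = sitk.Resample(ct, cbct, rigid, sitk.sitkLinear, -1000.0)
sitk.WriteImage(ct_aligned, "ct_aligned_to_cbct.nii.gz")
```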

Image synthesis with AGGAN

AGGAN has a network structure similar to that of cycleGAN [44]. It involves two generators (GCBCT–CT generates CT from CBCT, and GCT–CBCT generates CBCT from CT) and two discriminators (DCT distinguishes sCT from real CT, and DCBCT distinguishes synthetic CBCT (sCBCT) from real CBCT). AGGAN is composed of two cycles. In the first cycle, CBCT images are input into GCBCT–CT to generate sCT images, which are then input into GCT–CBCT to generate recycled CBCT (rCBCT) images. The two discriminators distinguish the corresponding generated images. The cycle-consistency loss constrains the generation process to minimize the differences between the original CBCT images and the rCBCT images. In the second cycle, CT images are input into GCT–CBCT to generate sCBCT images, which are then fed into GCBCT–CT to generate recycled CT (rCT) images. Compared with the original cycleGAN, AGGAN modifies the generator network, which is equipped with a built-in attention module. One cycle of AGGAN is shown in Fig. 1. GCBCT–CT contains encoding and decoding parts. The encoding part is a downsampling process that shares weights. The decoding part contains two branches: one generates n − 1 content masks, and the other generates n attention masks. After applying the Softmax function, \(\mathrm{Softmax}(A_{i}) = e^{A_{i}} / \sum_{c=1}^{n} e^{A_{c}}\), where A denotes the attention masks and i ranges from 1 to n, the attention masks are divided into n − 1 foreground attention masks and one background attention mask. The background attention mask attends to the part of the image that is unchanged before and after generation, and this mask is multiplied with the input CBCT images to obtain one output image. The foreground attention masks attend to the parts of the image that change during generation; a total of n − 1 output images are obtained by element-wise multiplication of the n − 1 foreground attention masks with the n − 1 content masks. These n output images are summed to obtain the final sCT images.

Fig. 1

Framework of the proposed AGGAN, which contains two attention-guided generators GCBCT–CT and GCT–CBCT. We show one cycle in this figure, i.e., CBCT → sCT → rCBCT ≈ CBCT. Each generator such as GCBCT–CT consists of a parameter sharing encoder ECBCT–CT, a content mask generator \(G_{CBCT - CT}^{C}\) and an attention mask generator \(G_{CBCT - CT}^{A}\). The proposed model is constrained by the cycle-consistency loss. The symbols ⊕ ,  ⊗ and Ⓢ denote element-wise addition, element-wise multiplication and channel-wise Softmax respectively

The GCBCT–CT of AGGAN generates sCT images through Eq. (1).

$$S_{CT} = \sum_{f = 1}^{n - 1} \left( C_{CT} * A_{CT}^{f} \right) + I_{CBCT} * A_{CT}^{b}$$
(1)

where CCT is the content mask, \(A_{CT}^{f}\) is the foreground attention mask, \(A_{CT}^{b}\) is the background attention mask, ICBCT is the input CBCT images, and SCT is the generated sCT images. In this study, n was set to 10.
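The combination in Eq. (1) can be expressed compactly in PyTorch. The snippet below is a minimal sketch of how the channel-wise Softmax, the n − 1 foreground products, and the background branch could be fused; the function name and tensor shapes are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def fuse_attention_outputs(content_masks, attention_logits, input_cbct):
    """Combine content and attention masks as described in Eq. (1).

    content_masks:    (B, n-1, H, W) candidate foreground images
    attention_logits: (B, n,   H, W) raw attention maps before Softmax
    input_cbct:       (B, 1,   H, W) input CBCT slice
    """
    # Channel-wise Softmax turns the n maps into n attention masks.
    attention = F.softmax(attention_logits, dim=1)
    foreground = attention[:, :-1]          # n-1 foreground attention masks
    background = attention[:, -1:]          # 1 background attention mask

    # Foreground: element-wise product of each content mask with its attention mask.
    # Background: the unchanged part is copied from the input CBCT.
    sct = (foreground * content_masks).sum(dim=1, keepdim=True) \
          + background * input_cbct
    return sct

# Toy usage with n = 10 as in this study.
n = 10
content = torch.rand(1, n - 1, 256, 256)
logits = torch.randn(1, n, 256, 256)
cbct = torch.rand(1, 1, 256, 256)
sct = fuse_attention_outputs(content, logits, cbct)   # shape (1, 1, 256, 256)
```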

The generator GCT–CBCT generates rCBCT images through Eq. (2).

$$R_{CBCT} = \sum_{f = 1}^{n - 1} \left( C_{CBCT} * A_{CBCT}^{f} \right) + S_{CT} * A_{CBCT}^{b}$$
(2)

Corresponding to the variables in Eq. (1), CCBCT, \(A_{CBCT}^{f}\), and \(A_{CBCT}^{b}\) in Eq. (2) denote the content, foreground attention, and background attention masks in GCT–CBCT, respectively. SCT is the sCT image obtained from Eq. (1), and RCBCT is the generated rCBCT image.

The adversarial loss of the neural network uses the LSGAN (least squares GAN) model [46], as shown in Eqs. (3) and (4). DCT distinguishes sCT from CT and aims to classify sCT with label 0 and CT with label 1. In contrast, GCBCT–CT attempts to make sCT as close as possible to CT and aims for the discriminator to output 1 for sCT. The loss functions of the discriminator and the generator are \(L_{GAN-D_{CT}}\) and \(L_{GAN-G_{CBCT-CT}}\), respectively.

$$L_{GAN-D_{CT}} = \frac{1}{2m}\sum_{i = 1}^{m} \left[ \left( D_{CT}(I_{CT}^{i}) - 1 \right)^{2} + D_{CT}\left( G_{CBCT-CT}(I_{CBCT}^{i}) \right)^{2} \right]$$
(3)
$$L_{GAN-G_{CBCT-CT}} = \frac{1}{2m}\sum_{i = 1}^{m} \left( D_{CT}\left( G_{CBCT-CT}(I_{CBCT}^{i}) \right) - 1 \right)^{2}$$
(4)

where m is the number of training images, and \(I_{CT}^{i}\) and \(I_{CBCT}^{i}\) are the ith CT and ith CBCT images, respectively. The loss functions of DCBCT and GCT–CBCT are similar to those in Eqs. (3) and (4).

$$L_{GAN-D_{CBCT}} = \frac{1}{2m}\sum_{i = 1}^{m} \left[ \left( D_{CBCT}(I_{CBCT}^{i}) - 1 \right)^{2} + D_{CBCT}\left( G_{CT-CBCT}(I_{CT}^{i}) \right)^{2} \right]$$
(5)
$$L_{GAN-G_{CT-CBCT}} = \frac{1}{2m}\sum_{i = 1}^{m} \left( D_{CBCT}\left( G_{CT-CBCT}(I_{CT}^{i}) \right) - 1 \right)^{2}$$
(6)

The generative adversarial loss is

$$L_{GAN} = L_{{GAN - D_{CT} }} + L_{{GAN - G_{CBCT - CT} }} + L_{{GAN - D_{CBCT} }} + L_{{GAN - G_{CT - CBCT} }}$$
(7)
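As a minimal illustration of the least-squares adversarial losses in Eqs. (3)–(6), the PyTorch snippet below implements the discriminator and generator terms for PatchGAN-style outputs. It is a generic sketch of the LSGAN formulation, not the exact code used in this study.

```python
import torch

def lsgan_d_loss(d_real, d_fake):
    # Eqs. (3)/(5): push discriminator outputs toward 1 for real and 0 for fake.
    return 0.5 * ((d_real - 1.0) ** 2 + d_fake ** 2).mean()

def lsgan_g_loss(d_fake):
    # Eqs. (4)/(6): the generator pushes the discriminator output for fakes toward 1.
    return 0.5 * ((d_fake - 1.0) ** 2).mean()

# Toy usage with PatchGAN-style outputs (one score per image patch).
d_real = torch.rand(4, 1, 30, 30)
d_fake = torch.rand(4, 1, 30, 30)
loss_d = lsgan_d_loss(d_real, d_fake)
loss_g = lsgan_g_loss(d_fake)
```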

With the generative adversarial loss alone, the network can still map images from one domain to any of several domains that share the same distribution characteristics, which cannot guarantee that the learned generator maps the input CBCT images to the desired CT images. A cycle-consistency loss is therefore added to restrict the space of mapping functions; this loss minimizes the difference between the input CBCT and rCBCT images and the difference between the input CT and rCT images.

$$L_{cycle-CBCT} = \frac{1}{m}\sum_{i = 1}^{m} \left| G_{CT-CBCT}\left( G_{CBCT-CT}(I_{CBCT}^{i}) \right) - I_{CBCT}^{i} \right|$$
(8)
$$L_{cycle-CT} = \frac{1}{m}\sum_{i = 1}^{m} \left| G_{CBCT-CT}\left( G_{CT-CBCT}(I_{CT}^{i}) \right) - I_{CT}^{i} \right|$$
(9)
$$L_{cycle} = L_{cycle-CBCT} + L_{cycle-CT}$$
(10)

GCBCT–CT generates CT images from the input CBCT images. If CT images are input into GCBCT–CT, then the difference between the generated and input CT images should be as small as possible; this is constrained by the identity loss.

$$L_{idt-CT} = \frac{1}{m}\sum_{i = 1}^{m} \left| G_{CBCT-CT}(I_{CT}^{i}) - I_{CT}^{i} \right|$$
(11)
$$L_{idt-CBCT} = \frac{1}{m}\sum_{i = 1}^{m} \left| G_{CT-CBCT}(I_{CBCT}^{i}) - I_{CBCT}^{i} \right|$$
(12)
$$L_{idt} = L_{idt-CT} + L_{idt-CBCT}$$
(13)

The total loss function is the sum of the three loss functions.

$$L = L_{GAN} + \lambda_{cycle} * L_{cycle} + \lambda_{idt} * L_{idt}$$
(14)

In the experiment, λcycle was set to 10, and λidt was set to 5.
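Putting the pieces together, the sketch below shows one way the total objective of Eq. (14) could be assembled in PyTorch with the weights reported above. The generator and discriminator arguments are placeholders, and the attention-guided internals of AGGAN are omitted; this is an illustrative sketch, not the authors' implementation.

```python
import torch
import torch.nn as nn

l1 = nn.L1Loss()

def total_generator_loss(real_cbct, real_ct,
                         G_cbct2ct, G_ct2cbct,
                         D_ct, D_cbct,
                         lambda_cycle=10.0, lambda_idt=5.0):
    # Forward cycle: CBCT -> sCT -> rCBCT.
    sct = G_cbct2ct(real_cbct)
    rcbct = G_ct2cbct(sct)
    # Backward cycle: CT -> sCBCT -> rCT.
    scbct = G_ct2cbct(real_ct)
    rct = G_cbct2ct(scbct)

    # Adversarial terms for the generators, Eqs. (4) and (6) (LSGAN form).
    adv = 0.5 * ((D_ct(sct) - 1.0) ** 2).mean() \
        + 0.5 * ((D_cbct(scbct) - 1.0) ** 2).mean()
    # Cycle-consistency terms, Eqs. (8)-(10).
    cycle = l1(rcbct, real_cbct) + l1(rct, real_ct)
    # Identity terms, Eqs. (11)-(13).
    idt = l1(G_cbct2ct(real_ct), real_ct) + l1(G_ct2cbct(real_cbct), real_cbct)

    # Eq. (14): weighted sum with the values used in the experiment.
    return adv + lambda_cycle * cycle + lambda_idt * idt
```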

Neural network training

In conventional thoracic CT images, only a few pixels have HU > 1500. In this study, the HU values of the images were therefore clipped to [− 1000, 1500] HU, with values exceeding 1500 HU set to 1500 HU. Afterward, the pixel values were scaled to [− 1, 1] and input into the neural network. Considering the GPU memory requirements and the training efficiency of the neural network, 2D axial slices were used and resized to 256 × 256 for training. Kida [26] pointed out that 2D axial slices can yield good sCT images without causing structural discontinuity in the coronal and sagittal views. The training dataset contained thoracic CT and CBCT images of 136 patients, comprising 12,784 slices of CBCT and CT images. The testing dataset contained 3,196 CBCT slices from 34 patients. During the training of AGGAN and cycleGAN, the CBCT and CT images were shuffled in each epoch so that they were trained as unpaired data. The pix2pix network was trained using the paired CBCT and CT images.
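A minimal sketch of the intensity preprocessing described above (clipping to [− 1000, 1500] HU and linear scaling to [− 1, 1]) and its inverse is shown below; the exact implementation used in this study is not specified, so the function names are illustrative.

```python
import numpy as np

def preprocess_slice(hu_slice):
    """Clip HU values to [-1000, 1500] and scale linearly to [-1, 1]."""
    clipped = np.clip(hu_slice.astype(np.float32), -1000.0, 1500.0)
    return (clipped + 1000.0) / 2500.0 * 2.0 - 1.0

def postprocess_slice(scaled):
    """Map network output in [-1, 1] back to HU for evaluation and dose calculation."""
    return (scaled + 1.0) / 2.0 * 2500.0 - 1000.0
```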

In AGGAN, the downsampling part of the generator consisted of (a) one convolution layer with a 7 × 7 kernel, a stride of 1, and 64 channels; (b) two convolution layers with 3 × 3 kernels, a stride of 2, and 128 and 256 channels; and (c) nine residual blocks with 3 × 3 kernels, a stride of 1, and 64 channels. The upsampling part involved two independent branches for the content and attention masks. The first of the two branches had two deconvolution layers with 3 × 3 kernels, a stride of 2, and 128 and 64 channels, implemented with the ConvTranspose2d function. The last layer of the content-mask branch was a 7 × 7 convolution layer with a stride of 1 and 9 channels, and the last layer of the attention-mask branch was a 1 × 1 convolution layer with a stride of 1 and 10 channels. Instance normalization was performed after each convolution layer except the last, and ReLU was used as the activation function [47]. The discriminator used the PatchGAN from pix2pix [41]; the mean over all patches of an image was taken to judge whether the entire image was real or fake. The batch size was set to 1 during training, and the Adam optimizer was used with momentum parameters β1 = 0.5 and β2 = 0.999 for a total of 100 epochs. The initial learning rate of Adam was set to 0.0001, and after 50 epochs the learning rate decayed linearly to 0. Pix2pix [41] and cycleGAN [42] were trained as described in the original papers, with the number of epochs set to 100. The neural networks were implemented in the PyTorch framework with PyCharm, and training was performed on an NVIDIA 2080 Ti graphics processing unit (GPU). The training times for pix2pix, cycleGAN, and AGGAN were 428, 655, and 732 h, respectively. Once trained, the networks generated sCT from CBCT images at mean speeds of 141, 142, and 133 slices/min for pix2pix, cycleGAN, and AGGAN, respectively; that is, a trained network can generate the sCT for a patient (usually fewer than 100 slices) within one minute.
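The optimizer and learning-rate schedule described above can be set up as in the following PyTorch sketch. The stand-in modules and the empty loop body are placeholders; only the Adam settings (lr = 0.0001, β1 = 0.5, β2 = 0.999) and the linear decay after epoch 50 follow the text, and the exact scheduler used in the study is an assumption.

```python
import torch
import torch.nn as nn
from torch.optim.lr_scheduler import LambdaLR

# Stand-in modules; the real generators/discriminators follow the architecture above.
G = nn.Conv2d(1, 1, 3, padding=1)
D = nn.Conv2d(1, 1, 3, padding=1)

# Adam with the momentum terms and initial learning rate given in the text.
opt_G = torch.optim.Adam(G.parameters(), lr=1e-4, betas=(0.5, 0.999))
opt_D = torch.optim.Adam(D.parameters(), lr=1e-4, betas=(0.5, 0.999))

# Constant learning rate for the first 50 epochs, then linear decay toward 0 at epoch 100.
def lr_lambda(epoch):
    return 1.0 if epoch < 50 else max(0.0, (100 - epoch) / 50.0)

sched_G = LambdaLR(opt_G, lr_lambda)
sched_D = LambdaLR(opt_D, lr_lambda)

for epoch in range(100):
    # ... one pass over shuffled, unpaired CBCT/CT slices with batch size 1 ...
    sched_G.step()
    sched_D.step()
```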

Evaluation

A side-by-side comparison of the true CT images, CBCT, and the sCT generated by pix2pix, cycleGAN, and AGGAN was performed at a mediastinal window of [− 400, 400] HU and a lung window of [− 1200, 300] HU. The HU histogram distribution of one patient's 3D images was also compared. The sCT images generated by the neural networks for the testing dataset were quantitatively evaluated by computing the MAE, SSIM, and peak signal-to-noise ratio (PSNR), with the deformed CT images as the reference. An intensity-modulated radiation therapy phantom (002LFC, CIRS, USA) was scanned by CT and CBCT; the CBCT and CT images of the phantom were aligned through 3D rigid registration, and the image quality of the phantom sCT images was quantitatively evaluated using the CT images as reference. In addition, three regions of interest (ROIs) were identified on each image (lung, bone, and soft tissue). The mean HU values with standard deviations (SD) of the ROIs for the test patients and the phantom were calculated and compared. The image quality indices were compared with the paired Wilcoxon rank test, and the statistical significance level was set at P < 0.05.
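For the image quality metrics, a minimal sketch using NumPy and scikit-image is shown below. The data range and any masking used in the paper are not specified, so the choices here (a 2500 HU range matching the clipping window) are assumptions for illustration.

```python
import numpy as np
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def evaluate_sct(sct_hu, ref_ct_hu):
    """MAE, SSIM, and PSNR of an sCT volume against the reference CT (HU arrays)."""
    mae = np.mean(np.abs(sct_hu - ref_ct_hu))
    data_range = 2500.0          # assumed HU span after clipping to [-1000, 1500]
    ssim = structural_similarity(ref_ct_hu, sct_hu, data_range=data_range)
    psnr = peak_signal_noise_ratio(ref_ct_hu, sct_hu, data_range=data_range)
    return mae, ssim, psnr
```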

To verify the dose calculation accuracy, the treatment plans of the 34 test patients were copied to the deformed CT, CBCT, and sCT images in the treatment planning system (Monaco 5.1, Elekta). The dose distributions were calculated directly on these images without re-optimization. Using the dose distribution on the deformed CT images as the reference, the 3D gamma passing rates of the dose distributions on the CBCT and sCT images were calculated under different criteria (distance to agreement and relative dose difference). In addition, a treatment plan was designed on the phantom to simulate lung cancer radiotherapy. Volumetric-modulated arc therapy (VMAT) was adopted, the radiation field rotated 360°, and a prescription dose of 5000 cGy was applied to the target. Subsequently, the dose distributions were calculated and compared on the CT, CBCT, and sCT images of the phantom.
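As background on the gamma analysis, the simplified brute-force sketch below illustrates how a 2D gamma index and passing rate can be computed from a reference and an evaluated dose grid. The study itself used 3D gamma analysis in the treatment planning system, so this is only an illustration of the concept under stated assumptions (global normalization, a 10% low-dose threshold), not the evaluation code used here.

```python
import numpy as np

def gamma_2d(ref, evl, spacing_mm, dta_mm=2.0, dd_percent=2.0):
    """Brute-force global 2D gamma index (simplified sketch, not the TPS algorithm)."""
    dd = dd_percent / 100.0 * ref.max()              # global dose-difference criterion
    search = int(np.ceil(2 * dta_mm / spacing_mm))   # search radius in pixels
    gamma = np.full(ref.shape, np.inf)
    rows, cols = ref.shape
    for i in range(rows):
        for j in range(cols):
            best = np.inf
            for di in range(-search, search + 1):
                for dj in range(-search, search + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        dist2 = ((di ** 2 + dj ** 2) * spacing_mm ** 2) / dta_mm ** 2
                        diff2 = (evl[ii, jj] - ref[i, j]) ** 2 / dd ** 2
                        best = min(best, dist2 + diff2)
            gamma[i, j] = np.sqrt(best)
    return gamma

def passing_rate(gamma, ref, threshold=0.1):
    """Percentage of points above a 10% dose threshold with gamma <= 1."""
    mask = ref > threshold * ref.max()
    return 100.0 * np.mean(gamma[mask] <= 1.0)
```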

Results

Comparison of image quality and preservation of the anatomical structure

The sCT images generated by the different neural networks from the CBCT of the same test patient are shown in Fig. 2. Each row shows images of the same slice. From top to bottom, the rows show axial slices in the mediastinal window, axial slices in the lung window, and coronal and sagittal images; the first and second rows show the same slice. The columns from left to right show the CBCT, CT, and the sCT images generated by pix2pix, cycleGAN, and AGGAN, respectively. Serious streaking and shading artifacts were observed at the chest wall and other sites of the original CBCT images owing to the respiratory movement of patients during scanning. The lung window shows that the lung appeared relatively dark and that the HU values in CBCT were seriously distorted. Most of the artifacts in the sCT images generated by pix2pix were eliminated, but several anatomical structures, especially the bone, cavity, and lung marking regions (red arrows), were destroyed, and the coronal and sagittal images showed serious distortion. The sCT images generated by cycleGAN retained several streaking artifacts on the axial slices, particularly at the chest wall (green arrow), but the anatomical structure was well preserved and the coronal and sagittal images showed good tissue continuity. Most artifacts in the sCT images generated by AGGAN were eliminated, the anatomical structures were well preserved, and the quality of the coronal and sagittal images was also improved.

Fig. 2

Quality comparison of CBCT, CT, and sCT images generated by three neural networks for the same patient in axial, coronal, and sagittal images. The display window in second row is [− 1200 300] HU (lung window), and display window in other rows is [− 400 400] HU

The histograms of the 3D images of the patient shown in Fig. 2 were analyzed (Fig. 3). In Fig. 3, the x-axis denotes the HU value, and the y-axis denotes the number of occurrences of each HU value in the 3D images. The HU value distributions of the CBCT and CT images differed considerably, whereas the HU values of the sCT images generated by the neural networks showed distributions similar to that of the real CT images. The distribution curves of the CT and sCT images had an evident peak at about − 800 HU, corresponding to the HU distribution of the lung; such a peak was absent in the CBCT images. The HU value distribution of the sCT images generated by AGGAN was the closest to that of the CT images.

Fig. 3

Histogram distribution curves of the HU values of 3D CT, CBCT, and sCT images generated by three neural networks

The sCT images generated from the CBCT images of test patients in axial view are shown in Fig. 4. The first four rows from the top to the bottom are axial slice images of different patients, and the fifth row is the lung window display of the fourth row. The same row shows the same slice of images of patients. The generated sCT images in coronal and sagittal views are shown in Fig. 5. Similar to the results in Fig. 2, the CBCT images included many artifacts, and the HU value had serious distortion. Pix2pix eliminated many artifacts but destroyed the anatomical structures, which mainly included bone tissues, cavities, lung marking, liver, and heart edges (red arrows). The sCT images generated by cycleGAN had well-preserved anatomical structures but retained several streaking artifacts from the CBCT images (green arrow). The sCT images generated by AGGAN reduced more artifacts than cycleGAN, and preserved the anatomical structures well.

Fig. 4

Comparison of sCT images generated by the three neural networks from CBCT images of patients in the test. The display window in bottom row is [− 1200 300] HU (lung window), and display window in other rows is [− 400 400] HU

Fig. 5

Comparison of sCT images generated by the three neural networks from test CBCT images in coronal (the top three rows) and sagittal (the bottom two rows) views. The display window is [− 400 400] HU

The quantitative image quality results for the CBCT and sCT images of the testing patients are listed in Table 2. The image quality indices of all sCT images were significantly improved compared with the original CBCT images (P < 0.05). AGGAN achieved the best image quality, with the smallest MAE and the largest SSIM and PSNR. The image quality indices of the sCT generated by cycleGAN and AGGAN were both better than those of pix2pix. Compared with cycleGAN, the sCT images generated by AGGAN showed significant superiority in MAE and PSNR (P < 0.05), whereas the difference in SSIM between the two was not significant (P = 0.261). Overall, the sCT images generated by AGGAN showed the best image quality. The mean HU values of the ROIs on the CT, CBCT, and sCT images of the patients are listed in Table 3. The mean HU values of lung, bone, and soft tissue on the CBCT images were significantly lower than those on the CT images (P < 0.05). In addition, the mean HU values of the ROIs on the CBCT images fluctuated greatly, leading to large SDs. The mean HU values of the ROIs on the sCT images generated by the three networks were close to those on the CT images, and no significant differences were found among CT and the sCT generated by pix2pix, cycleGAN, and AGGAN.

Table 2 Image quality indices of CBCT and sCT images generated by the three neural networks
Table 3 The mean HU values of ROIs on CT, CBCT and sCT images for patients

Dose calculation

The relative dose distributions calculated from the treatment plans on the CT, CBCT, and sCT images of the test patients are shown in Fig. 6. Using the dose distribution calculated on the CT images as the reference, the absolute gamma analysis distributions of the corresponding images under the 2 mm/2% criterion are shown in Fig. 7. The dose distributions on the original CBCT images diverged considerably from the reference, with large regions where the gamma index exceeded 1. The dose distributions on the sCT images were close to the reference, and the areas with a gamma index greater than 1 were greatly reduced. The statistical analysis of the 3D gamma passing rates under different criteria for the 34 testing patients is shown in Table 4. The gamma passing rates of the sCT images generated by the three methods were significantly improved under all criteria compared with the original CBCT (P < 0.05). Under the 1 mm/1% and 2 mm/2% criteria, the gamma passing rates of cycleGAN and AGGAN were significantly higher than that of pix2pix (P < 0.05), but no significant differences were observed under the 3 mm/3% criterion (P = 0.165). There were no significant differences between cycleGAN and AGGAN under the 2 mm/2% or 3 mm/3% criteria (P = 0.214 and P = 0.345), but AGGAN achieved significantly higher passing rates than cycleGAN under the 1 mm/1% criterion (P < 0.05). In conclusion, the sCT images generated by AGGAN provided the best dose calculation accuracy for the testing patients.

Fig. 6

The relative dose distribution calculated on CT, original CBCT and generated sCT images

Fig. 7

The gamma analysis index distribution calculated on original CBCT and generated sCT images with dose on CT image as reference under the criteria 2 mm/2%

Table 4 The 3D gamma passing rates of dose distribution in CBCT and sCT images for patients

A phantom study

The CT, CBCT, and sCT images of the phantom are shown in Fig. 8. From left to right, the columns show the soft tissue window, the lung window, and the image difference. The image difference was obtained by subtracting the CT image from each image: dark regions represent HU values lower than those of the CT images, and bright regions represent HU values higher than those of the CT images. The original CBCT image of the phantom differed greatly from the CT images; the lung tissues appeared relatively dark, and the soft tissue regions showed irregular shading. The sCT image generated by pix2pix seriously destroyed the original structures, whereas the sCT images generated by cycleGAN and AGGAN retained the structures of the phantom well. The sCT images generated by cycleGAN showed a dark region at the right side of the lung, and the HU value of the cylinder inserted into the lung was larger than that in CT, whereas the sCT images generated by AGGAN showed no large differences. The HU profiles along one straight line (red line) through the images are shown in Fig. 9; considerable differences can be observed between the CBCT and CT images, and the HU value of the CBCT images in the lung was close to zero. The HU value distribution of the sCT images generated by AGGAN was the closest to that of the CT images.

Fig. 8

Quality comparison of CT, CBCT, and sCT images generated by the three neural networks. The display window in left column is [− 400 400] HU, and in middle column is [− 1200 300] HU

Fig. 9

HU value profiles of the images along the red line

The CT images of the phantom were used as the reference, and the MAE, SSIM, and PSNR of the different images were calculated; the results are listed in Table 5. The MAE of the sCT generated by AGGAN was the lowest (23.2 HU), and its SSIM and PSNR were the highest (0.944 and 30.2, respectively). The SSIM (0.938) of the sCT images generated by cycleGAN was close to that of AGGAN, but the MAE (32.6 HU) was higher. Pix2pix performed poorly on the phantom, with an SSIM and PSNR lower than those of the original CBCT images and an MAE that was hardly improved. In addition, the lung, bone, and soft tissue were contoured, and the mean HU values of these ROIs were calculated, as shown in Table 6. The mean HU values of the sCT generated by AGGAN were the closest to those of CT for all three ROIs. In the phantom experiment, the sCT images generated by AGGAN showed the best quality.

Table 5 Image quality indices of CBCT and sCT images in the phantom
Table 6 The mean HU values of ROIs on CT, CBCT and sCT images for phantom

The calculated dose distributions in the phantom are shown in Fig. 10. The upper left part presents diagrams of the irradiation field and the target area (green profile), and the upper right part presents the dose distribution calculated on the CT images. The lower left and lower right parts show the relative dose difference distributions calculated on the CBCT and on the sCT images generated by pix2pix, cycleGAN, and AGGAN, respectively. The relative dose difference distribution was obtained by subtracting the dose calculated on the CT images from the dose calculated on the CBCT or sCT images and dividing by the maximum dose on the CT images. Dark regions indicate that the calculated dose is lower than the reference dose, and bright regions indicate that it is higher. The high-dose regions (close to the target) on the CBCT images differed greatly from those on CT. The dose differences on the sCT images were reduced and were smallest for AGGAN. The 3D gamma passing rates of the dose distributions on the CBCT and sCT images under different criteria were calculated (Table 7). The passing rates on the sCT images were higher than those on the CBCT images under all criteria. Under the strictest criterion of 1 mm/1%, the passing rate of the sCT images generated by AGGAN reached 96.5%, whereas that of the CBCT images was only 79.8%. The sCT images generated by AGGAN are thus conducive to accurate radiotherapy dose calculation.

Fig. 10

Dose distribution in CT images and distributions of relative dose differences in CBCT and sCT images generated from pix2pix, cycleGAN and AGGAN

Table 7 3D gamma passing rates of dose distribution on CBCT and sCT images in the phantom

Discussions

In this study, sCT images were generated from low-dose CBCT images of thoracic patients by using pix2pix, cycleGAN, and AGGAN. Paired datasets were used for pix2pix training, whereas cycleGAN and AGGAN used unpaired training datasets. Pix2pix reduced most of the artifacts of the original CBCT images in the axial slices, but it destroyed the anatomical structures of normal tissues, resulting in image ambiguity and structural discontinuity in the sagittal and coronal images. In the phantom study, pix2pix caused severe structural damage and failed to improve the image quality. The poor test results of pix2pix may be attributed to the incomplete alignment between the CBCT and CT images in the training dataset. In this study, the training dataset was obtained through 3D rigid registration of CT and CBCT images, and the registered CT and CBCT images showed evident local mismatches resulting from anatomical changes and organ movement between the two scans; in particular, tissue structures such as the trachea, esophagus, and bones, as well as the water/air filling status of organs, did not correspond to one another. Li [21] and Chen [22] implemented paired training with a Unet structure and generated sCT images from the CBCT of patients with head and neck cancer; because organs in the head and neck are stationary, a good training dataset was obtained after registration of the CBCT and CT images. Liu [25] used a paired training dataset obtained through DIR of the CBCT and CT images of patients with pancreatic cancer; the images were collected from patients who received stereotactic body radiation therapy under breath hold, so the differences among images were relatively small. In conventional radiotherapy, thoracic CBCT images have serious artifacts due to respiratory movement, and accurate DIR with CT images remains a great challenge [15]. Liang [23] conducted a phantom study of the head and neck and showed that sCT images generated by neural networks have more accurate anatomical structure than CT images obtained by DIR. Supervised learning methods such as pix2pix can generate high-quality sCT images only under the premise of accurate alignment between the CBCT and CT images. Unsupervised learning methods such as cycleGAN do not depend on the registration results; thus, the generated sCT images maintain the anatomical structures well, and the sagittal and coronal images have continuous structures. This finding is similar to the result of Liang [23] for head and neck images. However, the sCT images generated by cycleGAN in our experiment retained several artifacts. Because the thoracic CBCT images contained more artifacts than head and neck CBCT owing to respiratory movement, cycleGAN failed to suppress several serious artifacts, especially at the chest wall and heart, which move considerably. AGGAN modifies the generator of cycleGAN with a background attention mask that focuses on unchanged areas and foreground attention masks that focus on changing areas, and combines them to generate the final sCT images. The quantitative evaluation of the sCT images for the testing patients and the phantom demonstrated that the sCT images generated by AGGAN achieved the best image quality, with the highest SSIM and PSNR and the lowest MAE. The accuracy of dose calculation in radiotherapy is closely related to the accuracy of the HU values in CT images.
The statistical analysis of the 3D gamma passing rates of the dose distributions demonstrated that the sCT images generated by all three methods significantly improved the accuracy of dose calculation compared with the original CBCT images. The sCT generated by AGGAN offered the highest gamma passing rates under the strictest criterion of 1 mm/1%. Overall, the sCT generated by AGGAN showed the best performance in correcting the HU values of the images, preserving the anatomical structures, and calculating the dose in radiotherapy.

Conclusions

Unpaired low-dose thoracic CBCT and CT images were trained by AGGAN. The generated high-quality sCT images reduced most artifacts and preserved the anatomical structures well. The sCT generated by AGGAN provided high-accuracy dose distribution calculation and can thus be applied to adaptive radiotherapy.

Availability of data and materials

The datasets used during the current study are available from the corresponding author on reasonable request.

References

  1. Sorcini B, Tilikidis A. Clinical application of image-guided radiotherapy, IGRT (on the Varian OBI platform). Cancer/Radiother. 2006;10:252–7.

  2. Wang X, Li J, Wang P, et al. Image guided radiation therapy boost in combination with high-dose-rate intracavitary brachytherapy for the treatment of cervical cancer. Brachytherapy. 2016;8:122–7.

  3. Boda-Heggemann J, Lohr F, Wenz F, et al. kV cone-beam CT-based IGRT. Strahlenther Onkol. 2011;187:284–91.

  4. Endo M, Tsunoo T, Nakamori N, et al. Effect of scattered radiation on image noise in cone beam CT. Med Phys. 2001;28:469–74.

  5. Xu Y, Bai T, Yan H, et al. A practical cone-beam CT scatter correction method with optimized Monte Carlo simulations for image-guided radiation therapy. Phys Med Biol. 2015;60:3567.

  6. Kan MW, Leung LH, Wong W, et al. Radiation dose from cone beam computed tomography for image-guided radiation therapy. Int J Radiat Oncol Biol Phys. 2008;70(1):272–9.

  7. Song Y, Zhang W, Zhang H, et al. Low-dose cone-beam CT (LD-CBCT) reconstruction for image-guided radiation therapy (IGRT) by three-dimensional dual-dictionary learning. Radiat Oncol. 2020;15(1):192.

  8. Rijcke BD, Geeteruyen RV, Rijcke ED, et al. Fast 3D CBCT imaging for lung SBRT: is image quality preserved? Radiother Oncol. 2017;123:S85–6.

  9. Dunlop A, McQuaid D, Nill S, et al. Comparison of CT number calibration techniques for CBCT-based dose calculation. Strahlenther Onkol. 2015;191(12):970–8.

  10. Giacometti V, King RB, Agnew CE, et al. An evaluation of techniques for dose calculation on cone beam computed tomography. Br J Radiol. 2019;92(1096):20180383.

  11. Stankovic U, Ploeger LS, van Herk M, et al. Optimal combination of anti-scatter grids and software correction for CBCT imaging. Med Phys. 2017;44:4437–51.

  12. Sisniega A, Zbijewski W, Badal A, et al. Monte Carlo study of the effects of system geometry and antiscatter grids on cone-beam CT scatter distribution. Med Phys. 2013;40:051915.

  13. Sun M, Star-Lack JM. Improved scatter correction using adaptive scatter kernel superposition. Phys Med Biol. 2010;55:6695–720.

  14. Kurz C, Kamp F, Park Y-K, et al. Investigating deformable image registration and scatter correction for CBCT-based dose calculation in adaptive IMPT: CBCT correction to enable IMPT dose calculation. Med Phys. 2016;43(10):5635–46.

  15. Duan L, Ni X, Liu Q, et al. Unsupervised learning for deformable registration of thoracic CT and cone-beam CT based on multiscale features matching with spatially adaptive weighting. Med Phys. 2020;47:5632–47.

  16. Arai K, Kadoya N, Kato T, et al. Feasibility of CBCT-based proton dose calculation using a histogram-matching algorithm in proton beam therapy. Phys Med. 2017;33:68.

  17. Mainegra-Hing E, Kawrakow I. Fast Monte Carlo calculation of scatter corrections for CBCT images. J Phys Conf Ser. 2008;102:012017.

  18. Li J, Yao W, Xiao Y, et al. Feasibility of improving cone-beam CT number consistency using a scatter correction algorithm. J Appl Clin Med Phys. 2013;14(6):167–76.

  19. Niu T, Sun M, Star-Lack J, et al. Shading correction for on-board cone-beam CT in radiation therapy using planning MDCT images. Med Phys. 2010;37(10):5395–406.

  20. Park YK, Sharp GC, Phillips J, et al. Proton dose calculation on scatter-corrected CBCT image: Feasibility study for adaptive proton therapy. Med Phys. 2015;42(8):4449–59.

  21. Li Y, Zhu J, Liu Z, et al. A preliminary study of using a deep convolution neural network to generate synthesized CT images based on CBCT for adaptive radiotherapy of nasopharyngeal carcinoma. Phys Med Biol. 2019;64(14):145010.

  22. Chen L, Liang X, Shen C, et al. Synthetic CT generation from CBCT images via deep learning. Med Phys. 2020;47(3):1115–25.

  23. Liang X, Chen L, Nguyen D, et al. Generating synthesized computed tomography (CT) from cone-beam computed tomography (CBCT) using CycleGAN for adaptive radiation therapy. Phys Med Biol. 2019;64(12):

  24. Harms J, Lei Y, Wang T, et al. Paired cycleGAN based image correction for quantitative cone-beam CT. Med Phys. 2019;46(9):3998–4009.

  25. Liu Y, Lei Y, Wang T, et al. CBCT-based synthetic CT generation using deep-attention cycleGAN for pancreatic adaptive radiotherapy. Med Phys. 2020;47(6):2472–83.

  26. Kida S, Kaji S, Nawa K, et al. Visual enhancement of cone-beam CT by use of CycleGAN. Med Phys. 2020;47(3):998–1010.

  27. Barateau A, De Crevoisier R, Largent A, et al. Comparison of CBCT-based dose calculation methods in head and neck cancer radiotherapy: from Hounsfield unit to density calibration curve to deep learning. Med Phys. 2020;47:4683–93.

  28. Hansen DC, Landry G, Kamp F, et al. ScatterNet: a convolutional neural network for cone-beam CT intensity correction. Med Phys. 2018;45(11):4916–26.

  29. Lalonde A, Winey B, Verburg J, et al. Evaluation of CBCT scatter correction using deep convolutional neural networks for head and neck adaptive proton therapy. Phys Med Biol. 2020;65(24):245022.

  30. Landry G, Hansen D, Kamp F, et al. Comparing Unet training with three different datasets to correct CBCT images for prostate radiotherapy dose calculations. Phys Med Biol. 2019;64(3):035011.

  31. Thummerer A, Zaffino P, Meijers A, et al. Comparison of CBCT based synthetic CT methods suitable for proton dose calculations in adaptive proton therapy. Phys Med Biol. 2020;65(9):095002.

  32. Ronneberger O, Fischer P, Brox T, et al. U-Net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention. 2015. p. 234–41.

  33. Goodfellow I, Pougetabadie J, Mirza M, et al. Generative adversarial nets. Neural Inf Process Syst. 2014;27:2672–80.

  34. Dong Y, Xu D, Zhou SK, et al. Automatic liver segmentation using an adversarial image-to-image network. In: International conference on medical image computing and computer assisted intervention (MICCAI). 2017. p. 507–15.

  35. Rezaei M, Harmuth K, Gierke W, et al. A conditional adversarial network for semantic segmentation of brain tumor. In: International conference on medical image computing and computer assisted intervention. 2017. p. 241–52.

  36. Madani A, Moradi M, Karargyris A, et al. Semi-supervised learning with generative adversarial networks for chest x-ray classification with ability of data domain adaptation. In: The IEEE international symposium on biomedical imaging. 2018. p. 1038–42.

  37. McCollough CH, Bartley AC, Carter RE, et al. Low-dose ct for the detection and classification of metastatic liver lesions. Med Phys. 2017;44(10):e339–52.

  38. Maspero M, Savenije MHF, Dinkla AM, et al. Dose evaluation of fast synthetic-CT generation using a generative adversarial network for general pelvis MR-only radiotherapy. Phys Med Biol. 2018;63:185001.

  39. Cusumano D, Lenkowicz J, Votta C, et al. A deep learning approach to generate synthetic CT in low field MR-guided adaptive radiotherapy for abdominal and pelvic cases. Radiother Oncol. 2020;153:205–12.

  40. Quan TM, Nguyen-Duc T, Jeong WK. Compressed sensing MRI reconstruction using a generative adversarial network with a cyclic loss. IEEE Trans Med Imaging. 2018;37(6):1488–97.

  41. Isola P, Zhu J, Zhou T, et al. Image-to-image translation with conditional adversarial networks. In: 2017 IEEE conference on computer vision and pattern recognition, CVPR 2017. p. 5967–76.

  42. Zhu J, Park T, Isola P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks. In: IEEE international conference on computer vision, ICCV 2017. p. 2242–51

  43. Nie D, Trullo R, Lian J, et al. Medical image Synthesis with context aware generative adversarial networks. In: International conference on medical image computing and computer assisted intervention (MICCAI), 2017. vol. 10435. p. 417–25.

  44. Tang H, Xu D, Sebe N, et al. Attention-guided generative adversarial networks for unsupervised image-to-image translation. In: International joint conference on neural networks (IJCNN). 2019.

  45. Bogaert E, Monten C, Wagter CD, et al. Investigation of a fast CBCT protocol for supine accelerated whole breast irradiation. Radiother Oncol. 2016;119:S434.

  46. Mao X, Li Q, Xie H, et al. Least squares generative adversarial networks. In: IEEE international conference on computer vision (ICCV), 2017.

  47. Ramachandran P, Zoph B, Le QV. Searching for activation functions. 2017. arXiv:1710.05941.

Acknowledgements

None.

Funding

This work is supported by General Program of Jiangsu Provincial Health Commission (No. M2020006), Changzhou Key Laboratory of Medical Physics (No. CM20193005), Changzhou Sci&Tech program (No. CJ20200099), Young Talent Development Plan of Changzhou Health Commission (Nos. CZQM2020075 and CZQM2020067), the Science and Technology Programs for Young Talents of Changzhou Health Commission (No. QN201932).

Author information

Contributions

LG and KX participated in the design of the study, carried out the study, performed the statistical analysis, and drafted the manuscript; XW, ZL, CL, JS, JS and TL helped to carry out the study; XN conceived and designed the study and edited and reviewed the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xinye Ni.

Ethics declarations

Ethics approval and consent to participate

The study was approved by the ethics committee of Changzhou No. 2 People's Hospital with exemption from patient consent because it is a retrospective study that would not bring extra risk to patient health or human rights (#2020KY154-01).

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Gao, L., Xie, K., Wu, X. et al. Generating synthetic CT from low-dose cone-beam CT by using generative adversarial networks for adaptive radiotherapy. Radiat Oncol 16, 202 (2021). https://doi.org/10.1186/s13014-021-01928-w
