Comparison of primary target volumes delineated on four-dimensional CT and 18 F-FDG PET/CT of non-small-cell lung cancer

Background To determine the optimal threshold of 18 F-fluorodexyglucose (18 F-FDG) positron emission tomography CT (PET/CT) images that generates the best volumetric match to internal gross target volume (IGTV) based on four-dimensional CT (4DCT) images. Methods Twenty patients with non-small cell lung cancer (NSCLC) underwent enhanced three-dimensional CT (3DCT) scan followed by enhanced 4DCT scan of the thorax under normal free breathing with the administration of intravenous contrast agents. A total of 100 ml of ioversol was injected intravenously, 2 ml/s for 3DCT and 1 ml/s for 4DCT. Then 18 F-FDG PET/CT scan was performed based on the same positioning parameters (the same immobilization devices and identical position verified by laser localizer as well as skin marks). Gross target volumes (GTVs) of the primary tumor were contoured on the ten phases images of 4DCT to generate IGTV10. GTVPET were determined with eight different threshold using an auto-contouring function. The differences in the position, volume, concordance index (CI) and degree of inclusion (DI) of the targets between GTVPET and IGTV10 were compared. Results The images from seventeen patients were suitable for further analysis. Significant differences between the centric coordinate positions of GTVPET (excluding GTVPET15%) and IGTV10 were observed only in z axes (P < 0.05). GTVPET15%, GTVPET25% and GTVPET2.0 were not statistically different from IGTV10 (P < 0.05). GTVPET15% approximated closely to IGTV10 with median percentage volume changes of 4.86%. The best CI was between IGTV10 and GTVPET15% (0.57). The best DI of IGTV10 in GTVPET was IGTV10 in GTVPET15% (0.80). Conclusion None of the PET-based contours had both close spatial and volumetric approximation to the 4DCT IGTV10. At present 3D-PET/CT should not be used for IGTV generation.


Background
Worldwide, lung cancer is the most common cause of cancer-related mortality [1]. About 80% of the cases of lung cancer are non-small-cell lung cancer (NSCLC) [2]. Radiotherapy plays a major role in the management of patients with NSCLC who cannot tolerate or refuse surgery. Unfortunately, the prognosis of patients with NSCLC remains poor because of high rates of local failure and distant metastases [3,4], with local control rates of approximately 50% after radical radiotherapy [5]. A geometric target miss induced by tumor motion during radiotherapy is considered as one of the main reasons for local failure [6]. In order to account for geometric uncertainties due to internal variations in tumor position, size, and shape, the International Commission on Radiation Units and Measurements (ICRU) report 62 introduced the concept of an internal target volume (ITV) [7]. For lung tumors, respiratory motion is the major consideration for the ITV.
Currently, four-dimensional CT (4DCT) is widely used for the simulation of lung cancer. It is a reliable and effective tool for assessing tumor and organ motion [8,9] and can provide patient-specific information about tumor position, shape, and size at different phases of the respiratory cycle. The internal gross target volume (IGTV) can make the determination of the ITV more efficient [6]. IGTV 10 is generated by combining all 10 individual gross tumor volumes (GTVs) contoured in each phase of the 4DCT dataset, which is thought to encompasses the motion information for the tumor in the whole respiratory cycle [10]. Therefore, it is expected to provide the most accurate IGTV based on a given 4DCT dataset [11].
As a functional imaging modality, 18 F-fluorodeoxyglucose ( 18 F-FDG) positron emission tomography/computer tomography (PET/CT) images have been shown to have greater specificity and sensitivity than CT alone for the diagnosis and staging of NSCLC patients [12]. Furthermore, the interobserver variability, as well as the intraobserver variability, could be significantly reduced when the 18 F-FDG PET image was used for tumor volume delineation [13,14]. In addition, since three-dimensional PET (3D-PET) images are acquired over several minutes and represent the accumulated traces of multiple respiratory cycles, they may be capable of accounting for movement by indicating the average location of a tumor over time [15]. A phantom study by Caldwell et al. [16] concluded that PET imaging could more accurately depict the 3D volume of a moving phantom compared with spiral CT. Therefore, 3D-PET/CT might represent the ITV of a tumor. However, the optimal threshold values for patients with NSCLC have never been reported.
Hence, We perform this study to investigate the appropriateness of the threshold method to determine the best volumetric match to 4DCT-based IGTV 10 when contouring the primary tumor volume of NSCLC. Additionally, the feasibility of 3D-PET/CT images was evaluated with respect to incorporating tumor motion into the radiation target for NSCLC.

Patients
This study was approved by the Shandong Cancer Hospital and Institute review board, and 20 patients provided written informed consent. Patients with histologically proven primary NSCLC who were scheduled to undergo radiotherapy were eligible for this study, excluding those with atelectasis and/or obstructive pneumonia. None of them had previously been treated with radiotherapy or chemotherapy for their lung tumor. Between December 2012 and December 2013, 20 patients were enrolled in this study. The maximal standardized uptake value (SUV max ) in the tumor of 3 patients was 2.89, 3.16 and 3.27, respectively. Therefore, they were not suitable for further analysis as several of the threshold-based contouring methods used would not discriminate between the tumor and background lung uptake. The other 17 patients included 13 men and 4 women, with a median age of 66 years (range, 45-84 years). Six patients had centrally located lesions, and eleven patients had peripherally located lesions. The median of the SUV max for the primary tumors was 11.34 (range from 6.07 to 25.51). Table 1 summarized the characteristics of the 17 patients and their primary tumors.

CT simulation and image acquisition
During the simulation, all patients were immobilized using thermoplastic mask for covering the head, neck and shoulders in the supine position. For each patient, an axial enhanced 3DCT scan of the thoracic region was performed followed by a enhanced 4DCT scan under uncoached free breathing conditions on a 16-slice CT scanner (Philips Brilliance Bores CT) with the administration of intravenous contrast agents. A total of 100 ml of ioversol was injected intravenously, 2 ml/s for 3DCT and 1 ml/s for 4DCT. Details of 3DCT and 4DCT scan as well as image acquisition were given in Li et al. [17]. Then, 3DCT and 4DCT images were transferred to MIM (MIM-6.0.4, MIM Software Inc, Cleveland, OH) imaging software.

PET/CT simulation and image acquisition
On the same day as the 4DCT scan, the FDG-PET/CT scans of the chest were performed with a integrated PET/ CT scanner (Philips Gemini TF Big Bore). Through the same immobilization devices, the patient's position was identical to that for the 4DCT scan. Two radiation therapists were present to ensure the accuracy of the set-up by laser localizer and skin marks. All patients fasted for at least 6 h before the PET/CT examination. All patients were injected with 7.4 MBq/kg body weight of 18 FDG and then rested for about 1 h in a quiet room before imaging. The 16-slice CT component was operated with an X-ray tube voltage peak of 120 kV, 90 mA, a slice thickness of 5 mm and an interval of 4 mm, and was used both for attenuation correction of PET data and for localization of FDG uptake in PET images. No CT contrast agent was administered. PET scanning was performed covering the same axial range for 2 min per bed position (total of 3-5 bed positions). Both PET and CT acquisition was performed during free breathing. Data were reconstructed using an ordered subset expectation maximization (OSEM) algorithm and attenuation correction derived from CT data. Then, the PET/CT images were transferred to MIM software.

Image registration
An initial automatic rigid registration was performed using MIM software. Due to the 3DCT and 4DCT images for the same person were produced during the same imaging session, MIM would consider the images as being registered with each other. After the 3DCT and PET image datasets were co-registered with the help of the transmission CT from PET/CT, the 4DCT images would be auto-registered with the CT component of PET/CT. The registration was then manually adjusted by one radiotherapist, experienced in registering PET/ CT images, by matching bony anatomy such as the vertebral bodies at the level of the visible lung lesion. Hence, each contour was transferred to the 3DCT to calculate their specific parameters.

Target volume delineation
Our investigation focuses on the primary tumors. If the positive lymph nodes could not be separated from the primary tumor visually, they were delineated together as if they were part of the primary tumor. Patients were treated according to the 4DCT-based volumes and PET/CT contours were only used as part of a virtual planning study. Using the lung window setting (W = 1,600, C = −600) and mediastinal window settings (W = 400, C = 40) for the interface if the tumor was close to the mediastinum or chest wall [18], GTVs were manually contoured on all 10 phases of the 4DCT scan by a single radiation oncologist and verified by another radiation oncologist. Both of them did not know the PET results in an effort to decrease bias. IGTV 10 were derived from the 10 phases of the GTVs. PET/CT-based GTV of the primary tumor (GTV PET ) was defined by the auto-contouring function of MIM. After identification the primary tumor as a region of interest (ROI), MIM automatically calculated the SUV max of ROI. Eight different threshold methods were used in this study: (1) SUV of 2.0 or greater (SUV2.0); (2) SUV of 2.5 or greater (SUV2.5); (3) 15% of SUV max within the ROI (SUV15%); (4) 20% of SUV max within the ROI (SUV20%); (5) 25% of SUV max within the ROI (SUV25%); (6) 30% of SUV max within the ROI (SUV30%); (7) 35% of SUV max within the ROI (SUV35%); (8) 40% of SUV max within the ROI (SUV40%). All the noncancerous regions within the GTV PET , including the areas overlaid by the heart, bone and great vessels, were corrected to exclude manually with the help of the CT of component of PET/CT.

Volumes comparison
The differences in the position, size, concordance index (CI) and degree of inclusion (DI) between the GTV PET and the IGTV 10 were compared.
Target volume positions were defined by center of target coordinates and expressed using the x (left-right, LR), y (anterior-posterior, AP) and z (cranial-caudal, CC) coordinates of the center of mass. Centroid shifts in the 3D directions were calculated according to the formula as follows: The concordance index of volume A and B [CI (A, B)] was defined as the ratio of the intersection of A with B to the union of A and B [19]. The maximum value of CI is 1 if the two volumes are identical, and the minimum value is 0 if the volumes are completely nonoverlapping. That is, The definition of DI of volume A in volume B [DI (A in B)] is the percentage of the overlap between volume A and B in volume A [20]. The formula is as follows: The DI can represent the percentage of one volume included by another volume, and 1-DI can represent the percentage of one volume not included by another volume.

Statistical analysis
Statistical analysis was performed using the SPSS software package (SPSS 17.0). The one-way ANOVA test was used to determine the variations in the DIs of GTV PET and IGTV 10 , and in the CIs of GTV PET and IGTV 10 . The Wilcoxon test was performed to estimate the differences of centroid coordinate positions between GTV PET and IGTV 10 , and also used to estimate the variabilities of target volumes between GTV PET and IGTV 10 . We used the Spearman correlation test to analyze for associations between centroid shifts in the 3D directions and CIs. Values of P < 0.05 were regarded as significant for all the tests. Descriptive statistics were used as appropriate.

Results
For lesion 15 and 16, the SUV max in the tumor was 6.09 and 6.07, we could not obtain GTV PET15% target volumes as the volume obtained from the SUV15% contours were indistinguishable from background lung activity. Table 2 showed the centroid shifts in 3D directions of the IGTV 10 volumes and the PET/CT volumes. The variations in the centroid coordinate positions in the CC direction of GTV PET20% and IGTV 10 , GTV PET25% and IGTV 10 , GTV PET30% and IGTV 10 , GTV PET35% and IGTV 10 , GTV PET40% and IGTV 10 , GTV PET2.0 and IGTV 10 , GTV PET2.5 and IGTV 10

Volume variation
The volumes of primary tumors measured by PET/CT and 4DCT were shown in Table 3. Compared to IGTV 10 , GTV PET15% , GTV PET20% and GTV PET2.0 showed no significant difference (P values were 0.281, 0.102 and 0.687, respectively). Figure 1 illustrated the median percentage of volume changes from GTV PET to IGTV 10 standardized to the IGTV 10 for each case. The SUV 15% contour approximated most closely to the IGTV 10  DI Figure 3 showed the median DIs of GTV PET in IGTV 10 , and IGTV 10 in GTV PET . The median DIs of IGTV 10 in GTV PET ranged from 0.31 to 0.80 (F = 7.814, P = 0.000). The best DI was IGTV 10 in GTV PET15% . The median DIs of GTV PET in IGTV 10 ranged from 0.60 to 0.85 (F = 1.017, P = 0.422). The best DIs were GTV PET35% and GTV PET40% in IGTV 10 .

Discussion
Accurate definition of the target volume, ideally incorporating metabolic information, becomes paramount importance in the current trend in NSCLC treatment planning [21]. A number of studies compared 3DCT volumes with 18 F-FDG PET/CT volumes of NSCLC [22][23][24]. However, the best methodology for applying 18 F-FDG PET/CT to IGTV definition is not currently well established. To the best of our knowledge, there is few study that has compared tumor sizes and CI values between GTV PET and IGTV 10 in contouring NSCLC.
In this study, we determined the IGTV 10 from 10 phases of the 4DCT dataset and used them as the reference to find the optimal threshold that yield the best match between the GTV PET and IGTV 10 in both the target size and the spatial conformity. Our study revealed that GTV PET using a threshold setting of SUV15% approximated most closely to the IGTV 10 with the lowest median percentage volume changes. When using the threshold level of ≥ SUV25% and/or ≥ SUV2.5, the PET-based tumor sizes were estimated to be smaller than the IGTV 10 . Therefore, on the basis of the results of our study, the SUV threshold setting of ≥25% and/or ≥2.5 is not suitable for IGTV contouring in NSCLC. Analogously, Hanna et al. [25] compared volumes from a manual method and five automated PET segmentation techniques to 4DCT-derived ITV and found that none of the PET target volumes approximated closely to the 4DCT target volumes. However, in their study, the patient's PET/CT and 4DCT scans were not acquired on the same day or in identical position. In this circumstance, it was possible that changes in tumor geometry or size occurred and potentially increased the likelihood of a mismatch between the PET-based contours and the 4DCT-based contours [25]. Caldwell et al. [16] reported that using a threshold as low as 15% of the maximum value could account for respiratory motion and more accurately depict the true extension of the moving target. Another phantom study by Okubo et al. [26] concluded that when a threshold value of 35% of the measured maximum FDG activity was adopted, the sizes of PET delineation were almost the same for static and moving phantom spheres of 22 mm or more in the axial plane. Our study was similar to the result of Caldwell et al., but smaller than the result of Okubo et al. This is possible because patients enrolled in our study had a range of tumor sizes and positions. Moreover, the 15% threshold method was not suitable for contouring some lung tumors that have low SUV, because it might fail to distinguish tumor from background lung activity. Nevertheless, Okubo et al. suggested that the threshold of 35% of measured maximum FDG activity was only a provisional criterion for tumors of 2-4 cm given that appropriate threshold values could be changed on the basis of the tumor size [26]. In addition, it should be acknowledged that any phantom studies versus clinical comparison is limited. For example, unlike in real tumors, the FDG distribution in the spheres of the phantoms and in the background was homogenous.
Similarity in absolute volume does not mean identity in the space location. Our results indicated that GTV PET15% , GTV PET20% and GTV PET2.0 showed no significant difference with IGTV 10 in target volume. However, the CIs of them were significantly lower than 1.0. The best CI was between IGTV 10 and GTV PET15% , which was only 0.57. It is not surprising as GTV PET15% is the biggest volume and hence has the greatest degree of potential overlap. Based on this consideration, this may not make it the most accurate. The poor CIs suggested great unconformity between what was indicated abnormal on PET image and on CT image. One of the reasons is that shape and/or positional alterations between IGTV 10 and GTV PET had occurred. Our study showed that the minimum variation in the centroid coordinate position in the 3D direction of GTV PET and IGTV 10 was 0.38 cm (median). Moreover, the CIs were inversely correlated with the centroid shifts in 3D directions. Although patients in our study were immobilized in the same position for both 4DCT and PET/CT, millimetric precision in set-up using immobilization devices may not be feasible. Furthermore, a rigid registration might not be sufficient for lung tumors. Hence, registration error may inevitable affect the spatial position between GTV PET and IGTV 10 . In addition, it is possible that some of this difference may be related to differences in the patient's breathing pattern between acquiring the PET/CT and 4DCT. Different breathing pattern can influence tumor size, shape and distribution of activity on the free-breathing PET images [27].  Hanna et al. [25] used the Dice similarity coefficient (DSC) to assess volumetric, shape and positional similarity in the PET-generated target volumes and 4DCT target volumes. Their study revealed that the highest DSCs (mean) were 0.64. Grills et al. [21] investigated the impact of PET/CT for GTV definition in NSCLC using the matching index (similar to CI in our study) to compare the GTV, as defined by CT, with the GTV defined by PET co-registered with CT. In their study, the mean matching index was 0.65. Gondi et al. [23] demonstrated that CI values of NSCLC with the incorporation of FDG-PET and CT were 0.44. The results of our study were similar to data published in prior studies [21,23,25]. In addition, although CI can provide the most information on both volume change and positional change [28], it cannot quantify the percentage of one volume included by another volume. Further analyzing the inclusion relation between GTV PET15% and IGTV 10 , there was 20% of (median) GTV PET15% not included in IGTV 10 and 40% of (median) IGTV 10 not included in GTV PET15% . It suggested that ITGV 10 did not encompass GTV PET15% completely or vice versa. Therefore, we concurred with Gondi [21] who concluded that although the quantitative absolute target volume could sometimes be similar between CT and PET, the qualitative target locations can be significantly different.
One limitation of our study is the small number of patients studied, so that subgroup analyses were limited. Therefore, a larger cohort of patients with many different tumor sizes and locations should be conducted to further investigate the relationship of using 3D-PET/CT and 4DCT for contouring IGTV of NSCLC. We are continuing our work to enroll more patients for further clinical investigations.

Conclusion
None of the PET based contours had both close spatial and volumetric approximation to the 4DCT IGTV 10 . At present 3D-PET/CT should not be used for IGTV generation.