All volumes of interest (VOIs) were outlined manually on the planning CT (pCT) by three oncologists, each specialized in the relevant field (e.g. H&N was contoured by an H&N oncologist). The pCT served as the atlas for automatic contouring (AC) by the three software solutions (SS) on the replanning CT (rCT).
We used three commercial software solutions to generate the automatic contours; all these VOIs were subsequently corrected manually (ACMC) by the same experienced physicians. The VOIs on the rCT were also contoured manually from scratch to provide our reference volumes (Vref). The times required for AC, for ACMC (when needed), and for Vref contouring were recorded. To evaluate the quality of the contours, AC and ACMC were compared, using several parameters, with the volumes manually delineated on rCT (Vref).
Patient data and manual contouring
A conventional helical CT scanner was used for image acquisition. A total of 15 patients (five with locally-advanced head and neck (H&N) tumors, five with malignant pleural mesotheliomas (MPM) and five with high-risk prostate cancer (HRPCa)) were enrolled in the study. In addition to the treatment pCT images, a further set of CT images (rCT) was acquired for each patient during the RT course, usually in the middle of the treatment. For H&N, prostate and mesothelioma patients the images had a slice thickness of 3, 2.5, and 5 mm, respectively. A commercial treatment planning system (Focal 4D by Elekta, Sweden) was used for the manual contouring from scratch of ROIs (Figure 1).
All H&N patients (two oropharynx, one oral cavity, one nasopharynx, and one larynx) had pathologically confirmed Stage III-IV disease. The target volumes were determined according to the ICRU definition of GTV and CTV. For each patient, neck levels were delineated according to international consensus guidelines[8]. In addition, 16 organs at risk (OARs) were contoured (parotid, cochlea, esophagus, brainstem, spinal cord, mandible, thyroid, pharynx, masticatory spaces, larynx, oral cavity, temporal lobes, eyes, lenses, optic nerves and chiasm).
Five MPM patients had been previously treated with extraperitoneal pleuro-pneumonectomy and received adjuvant thoracic irradiation. The CTV included the entire hemithorax, the thoracotomy incision and the site of the chest drains. Two patients had a lesion on the right side and three on the left. The contoured normal tissues were: contralateral lung, heart, esophagus, liver, bowel, spinal cord, spleen and kidneys.
Five HRPCa patients (PSA > 20 ng/mL; Gleason score 8–10 or c/pT3a/b)[9] had been previously treated, two with definitive radiotherapy and three with post-operative irradiation. The CTV encompassed the prostate and seminal vesicles (definitive irradiation) or the prostatic bed (post-operative irradiation), plus the pelvic lymph nodes. The defined OARs were: rectum, bladder, femoral heads and bowel.
Automatic contouring
Contours from the initial pCT were deformed to the replanning CT for each patient using three commercially available programs: a) ABAS 2.0 (CMS-Elekta, Stockholm, Sweden) (A), b) MIM 5.1.1 (MIMVista corp, Cleveland, Ohio) (M), and c) VelocityAI 2.6.2 (Velocity Medical Systems, Atlanta, Georgia) (V).
In ABAS, an atlas patient consists of a CT scan with pre-defined ROIs, both target volumes and OARs. A detailed description of the method has been published by Han et al.[10]. First, non-rigid registration is used to transform the CT scan of the atlas patient (pCT) to the replanning CT scan. Specific models (e.g. for H&N and prostate) are available in the software and take structure-specific information, such as elasticity, into account. Then, using the obtained transformation, auto-contours are generated by mapping the atlas contours onto the replanning CT scan.
In MIM we also used a single-atlas segmentation approach: the pCT of the patient was inserted into the atlas and the algorithm then extracted information from this single CT to generate the automated contours on the rCT. To do this, a rigid registration with rotations was applied first, followed by a deformable registration. The previously validated intensity-based free-form deformable registration algorithm uses regularization to minimize the likelihood of folds or tears in the deformation field when fitting one CT to another[11].
A single-patient-atlas segmentation approach was also used with VelocityAI. Between the planning CT and the replanning CT, we first applied a rigid registration with rotations and then a deformable multi-pass registration. Finally, we copied the contours from the planning CT to the replanning CT, and the software automatically applied the deformation matrix to them. VelocityAI uses the basis-spline (B-spline) method[12] for deformable registration.
Time/speed evaluation
Focal 4D, ABAS and MIM were installed on a 3 GHz HP xw 8600 workstation running Windows with 8 GB RAM, whereas VelocityAI was installed on a 2.66 GHz HP xw 8600 workstation running Windows with 4 GB RAM. We recorded the time required to define the ROIs ex novo on rCT, the time the three software solutions needed to generate the AC and, finally, the time for manual correction of the AC.
The time to manually define the volumes on rCT was measured from the opening of the latest CT to the completion of the last ROI. The time needed by the software solutions to generate automatic contours was measured from the import of the CT until the end of the entire generation process. The time needed to check the automatically obtained volumes was measured from the loading of the CT until the final volume correction was completed (Figure 1). The usefulness of the automatic contouring procedure was evaluated by comparing 1) the software time plus the manual correction time vs. manual contouring from scratch, and 2) the manual correction time alone vs. manual contouring from scratch (i.e. not considering the time needed by the computer to generate the deformed contours).
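The two comparisons described above can be sketched as follows. This is a minimal illustration rather than the analysis used in the study: the function name `time_saving` and the example durations (in minutes) are hypothetical.

```python
def time_saving(manual_min, correction_min, software_min=0.0):
    """Return the fraction of manual contouring time saved by AC.

    manual_min     -- time to contour Vref from scratch on the rCT
    correction_min -- time to manually correct the automatic contours
    software_min   -- time the software needs to generate the AC
                      (leave at 0 to ignore computer time, comparison 2)
    """
    if manual_min <= 0:
        raise ValueError("manual contouring time must be positive")
    return (manual_min - (software_min + correction_min)) / manual_min

# Hypothetical durations in minutes, for illustration only.
manual, software, correction = 120.0, 10.0, 50.0
saving_with_software = time_saving(manual, correction, software)  # comparison 1
saving_correction_only = time_saving(manual, correction)          # comparison 2
```

A positive value means the automatic workflow was faster than contouring from scratch; a negative value means it was slower.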
Quantitative evaluation of automated and manually corrected contours
The performance of the automatic segmentation software was assessed by quantitatively comparing manual Vref contours with AC or ACMC contours in terms of volume, position and shape. A sensitivity and specificity study was also conducted. Manual segmentation was used as the reference segmentation.
As an initial measure of the similarity between the automatic and manual contours, the volume of every structure was calculated, and the difference between the automatically generated volume (VAA, VMA, VVA for ABAS, MIM and VelocityAI, respectively) and the manually generated reference volume (Vref) was computed for each structure, as follows:
The difference between the manually corrected automatic volume (VAM, VMM, VVM for ABAS, MIM and VelocityAI, respectively) and Vref was also calculated.
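A minimal sketch of the volume comparison, assuming each structure is available as a set of voxel indices and that the voxel size is known. The helper names, the toy voxel sets and the 1 mm in-plane pixel size are assumptions for illustration; only the 3 mm slice thickness is taken from the H&N acquisitions described earlier. The text does not say whether absolute or relative differences were reported, so the sketch returns both.

```python
def volume_cm3(voxels, spacing_mm):
    """Volume of a structure given its voxel set and voxel size (mm)."""
    sx, sy, sz = spacing_mm
    return len(voxels) * sx * sy * sz / 1000.0  # mm^3 -> cm^3

def volume_difference(v_test, v_ref):
    """Absolute (cm^3) and relative (%) difference of a test volume vs Vref."""
    diff = v_test - v_ref
    return diff, 100.0 * diff / v_ref

spacing = (1.0, 1.0, 3.0)  # assumed in-plane size, 3 mm slices
ref_voxels = {(x, y, z) for x in range(10) for y in range(10) for z in range(10)}
auto_voxels = {(x, y, z) for x in range(10) for y in range(10) for z in range(11)}

v_ref = volume_cm3(ref_voxels, spacing)    # 1000 voxels of 3 mm^3 each
v_auto = volume_cm3(auto_voxels, spacing)  # 1100 voxels of 3 mm^3 each
abs_diff, rel_diff = volume_difference(v_auto, v_ref)
```

The same function applies unchanged to the manually corrected volumes (VAM, VMM, VVM).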
Since its introduction, the Dice similarity coefficient (DSC)[13] has been widely used in the evaluation of deformable image registration results. The DSC index is defined as
Vref was compared with the automatically contoured ROIs (or with the ROIs manually corrected after automatic generation). DSC values range from 0 to 1 and equal 1 only if the automatic and manual volumes are identical, i.e. intersect completely.
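For reference, the DSC in its standard form is given below. The symbol $V_{\mathrm{auto}}$ is an assumption of this sketch and stands for either an AC or an ACMC volume; $\lvert\cdot\rvert$ denotes the volume (voxel count) of a region.

```latex
\mathrm{DSC} = \frac{2\,\lvert V_{\mathrm{auto}} \cap V_{\mathrm{ref}} \rvert}
                    {\lvert V_{\mathrm{auto}} \rvert + \lvert V_{\mathrm{ref}} \rvert}
```

With this form, DSC is 0 when the two volumes do not intersect and 1 only when they coincide.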
For all the software solutions evaluated, the sensitivity index (Se) of contours was computed as:
The sensitivity reflects the probability that the automatic contours (before or after manual correction) match the reference contour; some authors refer to it as the overlapping index (OI)[7].
We defined, as a surrogate of the specificity, the inclusiveness index (IncI):
The inclusiveness index reflects the inclusion of the automatic volume within the reference volume, i.e. the probability that a voxel of the automatic volume is really a voxel of the reference volume.
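The three overlap measures can be sketched on voxel sets as follows. The formulas for Se and IncI are not reproduced in this text, so the code assumes the definitions implied by the descriptions: Se is taken as the fraction of Vref covered by the tested volume, and IncI as the fraction of the tested volume lying inside Vref. Function and variable names are hypothetical.

```python
def overlap_metrics(test, ref):
    """Return (DSC, Se, IncI) for two non-empty sets of voxel indices."""
    if not test or not ref:
        raise ValueError("both volumes must contain at least one voxel")
    inter = len(test & ref)
    dsc = 2.0 * inter / (len(test) + len(ref))
    se = inter / len(ref)     # assumed: share of Vref that is recovered
    inci = inter / len(test)  # assumed: share of the tested volume inside Vref
    return dsc, se, inci

# Toy 1D "volumes": the reference covers voxels 0-9, the automatic contour 2-13.
ref = {(i, 0, 0) for i in range(10)}
auto = {(i, 0, 0) for i in range(2, 14)}
dsc, se, inci = overlap_metrics(auto, ref)
```

In this toy case the automatic contour misses part of the reference (Se below 1) and also spills outside it (IncI below 1), which is why the two indices are reported separately alongside the DSC.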
To illustrate the trends of some parameters, a modified Receiver Operating Characteristic (mROC) analysis was performed by plotting the sensitivity vs. (1 − IncI) for some of the delineated structures. The best possible result would yield a point in the upper left corner, at coordinate (0, 1) of the ROC space, representing 100% sensitivity (all voxels are true positive) and 100% inclusion (the surrogate of specificity, i.e. no false positive voxel is present).
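A point in the mROC space follows directly from Se and IncI. The sketch below also computes the Euclidean distance to the ideal corner (0, 1) as one possible way to summarize how far a contour is from the best result; that distance is an illustrative addition and not a measure described in the study, and the input values are hypothetical.

```python
import math

def mroc_point(se, inci):
    """Map (Se, IncI) to mROC coordinates: x = 1 - IncI, y = Se."""
    return 1.0 - inci, se

def distance_to_ideal(point):
    """Euclidean distance from an mROC point to the ideal corner (0, 1)."""
    x, y = point
    return math.hypot(x, 1.0 - y)

# Hypothetical (Se, IncI) pairs before and after manual correction.
p_ac = mroc_point(0.80, 0.70)
p_acmc = mroc_point(0.95, 0.90)
d_ac = distance_to_ideal(p_ac)
d_acmc = distance_to_ideal(p_acmc)
```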
As a general measure of the location of the structures, the centre of mass was calculated for each patient and for each structure (manually defined from scratch, i.e. the reference structure; automatically generated; and manually corrected after automatic generation), and the distance along the three coordinates was evaluated:
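A minimal sketch of the centre-of-mass comparison, assuming uniform density inside each structure so that the centre of mass is the mean voxel position. The sign convention (tested volume minus reference), the helper names, the voxel spacing and the toy data are assumptions.

```python
def centre_of_mass(voxels, spacing_mm):
    """Mean voxel position in mm, assuming uniform density."""
    n = len(voxels)
    if n == 0:
        raise ValueError("empty structure")
    return tuple(
        spacing_mm[axis] * sum(v[axis] for v in voxels) / n
        for axis in range(3)
    )

def com_shift(test, ref, spacing_mm):
    """Per-axis displacement (dx, dy, dz) of the tested volume vs Vref, in mm."""
    c_test = centre_of_mass(test, spacing_mm)
    c_ref = centre_of_mass(ref, spacing_mm)
    return tuple(t - r for t, r in zip(c_test, c_ref))

spacing = (1.0, 1.0, 3.0)
ref = {(x, y, z) for x in range(4) for y in range(4) for z in range(4)}
# Same shape as the reference, shifted by 2 voxels in x and 1 voxel in z.
auto = {(x + 2, y, z + 1) for (x, y, z) in ref}
dx, dy, dz = com_shift(auto, ref, spacing)
```

Keeping the three components separate, rather than collapsing them into a single distance, shows the direction in which an automatic contour tends to be displaced.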
As reported in Figure 1, in order to evaluate these parameters in a systematic and consistent way, all DICOM images and structures were exported to VODCA4rt (MSS GmbH, Hagendorn, Switzerland) version 4.4.1. We then used the analysis toolbox of VODCA for the automatic calculation of the variables described above.
A non-parametric Wilcoxon signed rank test was used to determine whether or not the observed differences were statistically significant. The Holm-Bonferroni correction was considered as well.
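The Holm-Bonferroni step-down procedure can be sketched in a few lines. The p-values below are hypothetical; in practice each would come from a Wilcoxon signed rank test on paired per-patient values (for instance via `scipy.stats.wilcoxon`, which is not called here so that the sketch has no dependencies). The significance level of 0.05 is an assumption, since the text does not state one.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Compare the (rank + 1)-th smallest p-value with alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: stop at the first non-rejection
    return reject

# Hypothetical p-values for four comparisons.
p_vals = [0.010, 0.040, 0.030, 0.005]
decisions = holm_bonferroni(p_vals)
```

Here the two smallest p-values pass their thresholds (0.05/4 and 0.05/3), the third does not (0.030 > 0.05/2), and the procedure stops, so the fourth comparison is not rejected even though 0.040 is below the uncorrected 0.05 level.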