Decreased 3D observer variation with matched CT-MRI, for target delineation in Nasopharynx cancer
- Coen RN Rasch1,
- Roel JHM Steenbakkers2,
- Isabelle Fitton3,
- Joop C Duppen1,
- Peter JCM Nowak4,
- Frank A Pameijer5,
- Avraham Eisbruch6,
- Johannes HAM Kaanders7,
- Frank Paulsen8 and
- Marcel van Herk1
© Rasch et al. 2010
Received: 1 December 2009
Accepted: 15 March 2010
Published: 15 March 2010
To determine the variation in target delineation of nasopharyngeal carcinoma and the impact of measures to minimize this variation.
Materials and methods
For ten nasopharyngeal cancer patients, ten observers each delineated the Clinical Target Volume (CTV) and the CTV elective. After 3D analysis of the delineated volumes, a second delineation was performed. This second phase involved improved delineation instructions, combined delineation on CT and co-registered MRI, forced use of sagittal reconstructions, and an on-line anatomical atlas.
For both the CTV and the CTV elective delineations, the 3D SD decreased from Phase 1 to Phase 2, from 4.4 to 3.3 mm for the CTV and from 5.9 to 4.9 mm for the CTV elective. Agreement, expressed as the percentage of the surface where the observers intended to delineate the same structure, increased from 36 to 64% (p = 0.003) for the CTV and from 17 to 59% (p = 0.004) for the CTV elective. The largest variations were at the caudal border of the delineations, but these were smaller when an observer used the sagittal window. Hence, the use of sagittal side windows was enforced in the second phase, which decreased the standard deviation for this area from 7.7 to 3.3 mm (p = 0.001) for the CTV and from 7.9 to 5.6 mm (p = 0.03) for the CTV elective.
Attempts to decrease the variation need to be tailored to the specific causes of the variation. Delineation instructions, multimodality imaging, the use of sagittal windows, and an on-line atlas result in higher agreement on the intended target.
Delineation of the target is one of the main remaining error sources in conformal radiation therapy [1, 2]. By the nature of the procedure, delineation errors in external beam radiotherapy are systematic: any deviation remains the same throughout the radiation course, which results in reproducible dose differences. Earlier reports on ethmoidal and maxillary sinus and nasopharyngeal tumors demonstrated that the dose depends on both observer variation and irradiation technique. Despite improvements in the latter, delineation variation still has a large impact on the dose to the target and to the organs at risk [2, 3].
Guidelines for delineation
Early guidelines for delineation of the neck levels were published by Som et al. Shortly thereafter, Robbins, Nowak and Gregoire et al published guidelines on the same topic. Currently, more than five different guidelines for delineation of the neck have been published in the international literature [1, 4–9].
In an effort to reach consensus, Gregoire et al published consensus guidelines for neck delineation on behalf of the Radiation Therapy Oncology Group (RTOG) and European Organization for Research and Treatment of Cancer (EORTC). Although validation of these guidelines is still to be performed, they more than likely improve delineation agreement of the elective neck nodes. However, guidelines for the delineation of primary tumors of head and neck cancers are scarce.
Computed Tomography (CT) based delineation is the current standard of practice for conformal radiotherapy, although other imaging modalities such as Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET) have also proven their value in several tumor sites [6, 11–13]. Notably, a study comparing Gross Tumor Volume (GTV) determination based on CT, MRI, PET and the pathological specimen for larynx carcinomas demonstrated that MRI was more accurate than CT and PET and was even closer to the pathological specimen measurements that are regarded as the "gold standard". The addition of MRI to CT decreases observer variation and leads to smaller Gross Tumor Volumes, as seen in this study and others concerning this topic [2, 6, 11, 14]. Similarly, the addition of PET to CT for lung cancer considerably decreased the delineation variation [13, 15–17]. This was demonstrated in a multiobserver study performed by Steenbakkers et al, in which the addition of PET to CT-based delineation particularly decreased observer variation at the interfaces towards mediastinum and hilum and in the case of atelectasis. Furthermore, the use of sagittal or coronal reconstructions during the delineation led to more agreement.
18F-Fluorodeoxyglucose Positron Emission Tomography (FDG-PET) imaging of the head and neck provides functional information on the extent of the tumor [18–21]. However, its main strength lies in the detection of involved regions or (lymph node) metastases, with an overall sensitivity of 79% [18, 22]. For precise delineation of the tumor extent itself it is less suitable, owing to poorer spatial resolution, the lack of a universal threshold uptake value, and high uptake in brain tissue close to the primary tumor when invasion towards bone or parapharyngeal regions is suspected (i.e., when the delineation becomes difficult) [18, 23]. MRI was superior to FDG-PET for showing the extent of the primary tumor in 54 nasopharyngeal cases.
The addition of MRI to CT-based delineation has proven its value in the delineation of the Head and Neck region and has resulted in smaller target volumes [2, 6, 11, 14, 25, 26]. Especially when posterior invasion is suspected, MRI has proven to be superior to CT-based staging. However, the effect on observer variation is limited [6, 11].
The above-mentioned studies for the Head and Neck have defined the variation in various modalities, but have not attempted to determine measures for decreasing this variation within the study, or measure the impact of any measures taken. It is the aim of the present study to determine the extent of baseline variation, to analyze the results, and then to take measures including improved delineation guidelines, multi-modality imaging, and delineation tools targeted at the specific variations found to reduce this variation. The impact of these measures will then be assessed by re-delineation.
Materials and methods
For ten patients with nasopharynx cancer, delineation of the clinical target volume was performed by ten observers from multiple institutions, all considered experts in the field. The T stages of the patients were T2 (2 patients), T3 (3) and T4 (5); the N stages were N0 (3), N1 (5) and N2 (2). No lymph node delineation was performed. The aim of the study was to assess the observer variation in 3D in a standardized environment in two phases.
First, each observer was given the same personal computer with monitor, installed with in-house delineation software together with the patient data. In this phase, delineation was performed on contrast enhanced CT images with delineation instructions. The non-matched MRI was digitally available to the observer. Delineation of both the CTV (= visible tumor + suspected microscopic extension) and the CTV-elective (= CTV + 1 cm margin and the entire nasopharynx) was performed. An automatic 3D expansion of the CTV by 1 cm was supplied to the observer as a starting point for the CTV elective. The delineations, together with data on observer-computer interaction (Big Brother), were then submitted through the Internet to the Netherlands Cancer Institute. These data contained information on the delineations and all computer actions of the observer, such as mouse motion, window/level settings, and delineation corrections (i.e., moving, deleting or replacing any point of the delineation while delineating but before submission of the delineation). After volumetric and 3D analysis (see below), a meeting was organized with the observers, at which the results of the first delineation phase were discussed. Improvements in the delineation instructions and the way to implement the CT and MRI co-registration were then derived from the meeting.
To ensure that the observers would have forgotten the exact first delineation, one year thereafter the observers received a new CD with improved delineation software, improved instructions for delineations, and a co-registered CT-MRI for delineation. Furthermore, the observers were given an on-line CT atlas with the key anatomical boundaries pointed out and the normal boundaries of the nasopharynx highlighted. The observers were forced to use the sagittal and/or coronal side windows before the start of each delineation, by designating a point in the axial plane where a reconstruction was to be generated. Again, the CTV and CTV-elective were delineated by all observers for all patients.
First, the volume of each delineation and the volume of each median volume (the volume encompassed by the median surface, see next paragraph) were calculated. Then the common volume (volume common to all individual delineations in a patient) and the encompassing volume (volume encompassing all the individual delineations in a patient) were calculated. Ideally, the ratio between common and encompassing volume would be 1, indicating full agreement between observers.
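The common and encompassing volumes described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each delineation has already been converted to a set of voxel indices on a shared, uniform grid; the function name `volume_agreement` and the toy data are hypothetical and are not taken from the study's in-house software.

```python
def volume_agreement(delineations, voxel_volume_cm3):
    """Return (common_cm3, encompassing_cm3, ratio) for one patient.

    delineations: list of sets of (x, y, z) voxel indices, one set per observer
                  (assumed representation, not the study's data format).
    voxel_volume_cm3: volume of a single voxel in cm3 (uniform grid assumed).
    """
    if not delineations:
        raise ValueError("at least one delineation is required")
    # Voxels that lie inside every observer's delineation.
    common = set.intersection(*delineations)
    # Voxels that lie inside at least one observer's delineation.
    encompassing = set.union(*delineations)
    if not encompassing:
        raise ValueError("all delineations are empty")
    # A ratio of 1.0 would indicate full agreement between observers.
    ratio = len(common) / len(encompassing)
    return (len(common) * voxel_volume_cm3,
            len(encompassing) * voxel_volume_cm3,
            ratio)


# Toy example: three observers on a short strip of voxels.
observers = [
    {(0, 0, 0), (1, 0, 0), (2, 0, 0)},
    {(1, 0, 0), (2, 0, 0), (3, 0, 0)},
    {(1, 0, 0), (2, 0, 0)},
]
common_cm3, encompassing_cm3, ratio = volume_agreement(observers, voxel_volume_cm3=0.001)
print(ratio)
```

In this toy case two voxels are shared by all observers and four are covered by at least one, so the ratio is 0.5. Because the common volume can only shrink and the encompassing volume can only grow as observers are added, the ratio is sensitive to single outlying delineations.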
Complete agreement of all observers on what to delineate is rare. Therefore we chose an arbitrary cutoff of 80% agreement, as described in an earlier analysis in lung cancer. The median surface for each patient was manually divided into an agreement and a disagreement region. The median surface was labeled as an agreement region when a corresponding anatomic structure was delineated by at least 8 of the 10 radiation oncologists (i.e., 80%); otherwise, it was labeled as a disagreement region. The regions were judged by the author, and in cases of doubt the regions were judged and designated by two radiation oncologists (CR, RS). Ideally, all of the surface should be designated as agreement region, as disagreement regions indicate that observer variation stems from differing opinions on the anatomical extent of the target rather than from the visibility of a structure.
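The 80% cutoff can be sketched as a simple threshold. In the study the division into agreement and disagreement regions was made manually; the sketch below only illustrates the counting rule, assuming that for each patch of the median surface the number of observers who delineated the corresponding structure has already been judged. The function name and the `(area, n_agreeing)` representation are hypothetical.

```python
def agreement_surface_percent(patches, min_agreeing=8):
    """Percentage of the median surface labeled as agreement region.

    patches: list of (area, n_agreeing) tuples, where area is the surface
             area of a patch (any consistent unit) and n_agreeing is the
             number of observers judged to have delineated the same
             anatomic structure there (assumed input, judged beforehand).
    min_agreeing: cutoff; 8 of 10 observers corresponds to the 80% rule.
    """
    total_area = sum(area for area, _ in patches)
    if total_area <= 0:
        raise ValueError("total surface area must be positive")
    agreement_area = sum(area for area, n in patches if n >= min_agreeing)
    return 100.0 * agreement_area / total_area


# Toy example: four units of surface, two of which meet the 8-of-10 cutoff.
toy_patches = [(1.0, 10), (1.0, 8), (2.0, 7)]
percent = agreement_surface_percent(toy_patches)
print(percent)
```

Weighting by patch area rather than counting patches keeps the result independent of how finely the surface happens to be sampled.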
Mean Volume (cm3); Standard Deviation (cm3)
Table 2. Observer variation for the various CTV to normal tissue interfaces and the two delineation phases. (Phase 2 values; 3D SD in mm, followed by agreement in surface %.)
3.3 (p = 0.02); 64 (p = 0.003)
Anterior - Air
2.7 (p = 0.01); 79 (p = 0.02)
Dorsal - Bone
2.7 (p = 0.005); 84 (p = 0.005)
3.5 (p = 0.05); 66 (p = 0.004)
3.1 (p = 0.02); 61 (p = 0.03)
3.3 (p = 0.007); 59 (p = 0.005)
3.0 (p = 0.005); 67 (p = 0.01)
4.2 (p = 0.03); 48 (p = 0.01)
3.3 (p = 0.001); 56 (p < 0.001)
The second phase CTV delineation results are listed in Table 2, next to the results of the first delineation phase. The mean number of corrections per delineation was 33 (i.e., 7.3 corrections per cm2). The root mean square of the standard deviation of the distance between the delineations and the median surface decreased to 3.3 mm, with an agreement surface percentage of 64%. The largest root mean square SD was 4.2 mm at the sphenoid interface. In the first phase, the delineation variation was 3.7 mm (1 SD) for the observers using the side windows and 5.0 mm (1 SD) for those not using them. In the second phase, the forced use of the side windows (sagittal reconstructions) and the addition of co-registered MRI decreased the observer variation at the caudal side of the tumor from 7.7 to 3.3 mm (1 SD). The mean delineation time decreased from 15 to 11 minutes. The mean volume of the delineations decreased from 25 to 20 cm3, and its SD (including patient variation) decreased from 9 to 5 cm3. At the same time, the ratio between common and encompassing volume (i.e., the ratio between the largest volume common to all delineations and the smallest volume encompassing all delineations) rose from 0.15 to 0.22.
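The root mean square of the local standard deviations mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: it assumes the distances from each sampled point on the median surface to every observer's delineation are already available, it weights all surface points equally, and it uses the sample standard deviation (the source does not state which estimator was used). The function name `rms_local_sd` is hypothetical.

```python
import math
from statistics import variance


def rms_local_sd(distances_per_point):
    """Root mean square of the local SDs over the median surface.

    distances_per_point: list with one entry per sampled surface point; each
        entry is a list of distances in mm from that point to each observer's
        delineation (assumed input; at least two observers per point).
    Returns the overall SD in mm.
    """
    if not distances_per_point:
        raise ValueError("at least one surface point is required")
    # Local variance at each surface point (sample variance: an assumption).
    local_variances = [variance(d) for d in distances_per_point]
    # Averaging variances and then taking the root gives the RMS of the SDs.
    return math.sqrt(sum(local_variances) / len(local_variances))


# Toy example: two surface points, two observers each.
toy_distances = [[-1.0, 1.0], [-2.0, 2.0]]
overall_sd = rms_local_sd(toy_distances)
print(overall_sd)
```

Averaging variances rather than standard deviations means that regions with large local variation, such as the caudal border, contribute more to the overall figure than they would in a plain mean of SDs.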
The mean distance between the first and second CTV delineation of an observer and patient was 0.6 mm (SD 5.7 mm), but this was not evenly distributed. In all but one region, the second phase CTV delineations resulted in 2.6 to 0.1 mm smaller volumes. At the caudal side, the phase 2 delineation (MRI, improved delineation instructions, and the forced use of the side windows) resulted in a 1.4 mm larger mean delineation.
Table 3. Observer variation for the various CTV elective to normal tissue interfaces and the two delineation phases. (Phase 2 values; 3D SD in mm, followed by agreement in surface %.)
4.9 (p = 0.01); 59 (p = 0.004)
Dorsal - Bone
Dorsal - Invas.
4.5 (p = 0.02); 43 (p = 0.01)
4.4 (p = 0.03); 58 (p = 0.02)
4.9 (p = 0.04); 53 (p = 0.01)
5.7 (p = 0.03); 51 (p = 0.04)
Nasoph. - Lat.
66 (p = 0.02)
Nasoph. - Ant.
5.1 (p = 0.02); 70 (p = 0.01)
5.6 (p = 0.03); 47 (p = 0.005)
The delineations were split into two groups: those delineations where the side windows were used and those where the windows were not used by the observer. The first group had a smaller overall SD in delineation compared to the second group. The variation difference was primarily noted at the caudal and superior side of the delineations; i.e., perpendicular to the CT scan axis but in-plane for the sagittal reconstruction.
The Phase 2 delineations demonstrated a markedly different SD compared to the first phase. The mean number of corrections/cm2 was 10.1. The mean standard deviation of the distance between the median surface and the delineations was reduced from 5.9 to 4.9 mm, and the percentage of surface where at least 8/10 observers intended to delineate the same anatomical entity increased from 17 to 59% (Table 3). The caudal border of the delineation was improved, but there was still considerable variation, with an SD of 5.6 mm and agreement on 47% of the surface. The mean volume decreased from 103 to 91 cm3 with a root mean square SD of 33 and 21 respectively.
The delineation uncertainties in this article are larger than reported uncertainties for setup error in the head and neck region. Since an error in delineation affects the whole treatment and not just one fraction, it is clear that delineation is a large geometric uncertainty in radiation treatment for nasopharynx cancer [2, 3, 28]. This study measured observer variation in 3D as a baseline and aimed to reduce the target delineation variation. The results with improved consensus guidelines and matched MRI available show that the effort was successful. The mean SD of the distances decreased both for the CTV and for the CTV elective delineations. No ground truth of tumor extent was available for the patients in this study, thus no comparison to this ground truth could be made. Still, observer variation should be minimized as it has a large impact on tumor control and side effects. Furthermore, reproducible target delineations make evaluation of efficacy and side effects more precise.
The reasons for the difference between phase one and two are as follows. First, looking at the analysis of the first phase CTV delineation (baseline delineation), the largest variation was noted in the caudal direction (i.e., perpendicular to the transverse plane of the CT scan) (Tables 2 and 3). This was largest in those delineations where the observers did not use the (sagittal/coronal view) side windows. This is applicable to other tumor areas as well. A similar finding has been noted in delineation of lung tumors, where the observer variation between observers using the side windows was smaller than between the observers who did not use the side windows [13, 15]. Therefore, in the second phase, the use of the side windows was enforced by requiring the observer first to pinpoint a plane in the main window where the sagittal and/or coronal side window was to be reconstructed, thus ensuring that the side window was used.
Furthermore, in order to make more soft-tissue contrast available, without the need to view a separate MRI, a co-registered MRI was made available. Several earlier studies on nasopharynx and other head and neck regions demonstrated the superiority of MRI delineation in this respect [11, 14, 25, 26]. To make delineation on two modalities easier, double window delineation was introduced into the second phase delineation software. With this feature, delineation in the main window (CT or MRI at the preference of the observer) was directly linked to delineation in the same plane on the other modality, allowing real time double modality delineation [13, 15].
A CT-MRI atlas of the nasopharynx, with the TNM definition of the nasopharynx delineated, was available on-line for the observer. The atlas was generated from the TNM atlas and available to the observer in multiple planes.
The instructions for delineation of invaded bone were adapted (i.e., when the clivus was invaded, the whole clivus was to be regarded as part of the CTV elective).
The use of sagittal windows was forced: the observers first had to pinpoint a plane in the main window where the sagittal and/or coronal side window was to be reconstructed, and until they had done so no delineation could be submitted.
The sum of these measures resulted in a considerable reduction in the variation in tumor and CTV delineation. Being able to replay the delineations brought great insight into the causes of the delineation variation. One source of delineation variation (e.g., lack of soft-tissue contrast) needs an entirely different approach than do others (e.g., definition of the nasopharynx, use of sagittal windows). With clearer delineation instructions, together with the forced use of sagittal reconstructions and simultaneous delineation on CT and MRI, target delineation variation in the nasopharynx can be reduced. The largest impact on agreement was obtained by improved definitions of the CTV and CTV elective rather than by the use of multimodality imaging, as is most clearly demonstrated by the increase of agreement surface for the CTV elective.
Observer variation of target delineation in the nasopharynx is considerable but can be reduced with the use of dedicated delineation protocols, forced use of sagittal/coronal reconstructions, and double window delineation on CT and MRI. In the current study, instructing the observers to designate the invaded structure as a target reduced an important source of variation.
We wish to thank P. Levendag, F. Hoebers, G. Salverda and L. Pop for their contribution to this article. This work was sponsored in part by The Dutch Cancer Society grant NKI 2000-2247.
- Gregoire V, Daisne JF, Bauvois C, et al.: [Selection and delineation of lymph node target volumes in head and neck neoplasms]. Cancer Radiother 2001, 5:614–628.
- Rasch C, Steenbakkers R, van Herk M: Target definition in prostate, head, and neck. Semin Radiat Oncol 2005, 15:136–145.
- Pimentel Serra N, van Asselen B, Steenbakkers R, et al.: Impact of Observer Delineation Variation on Target Coverage and Dose to Organs at Risk in Nasopharyngeal Cancer Patients. Europ J Cancer 2005, (supplement 3):289.
- Nowak PJ, Wijers OB, Lagerwaard FJ, et al.: A three-dimensional CT-based target definition for elective irradiation of the neck. Int J Radiat Oncol Biol Phys 1999, 45:33–39.
- Nowak P, van Dieren E, Sornsen de Koste J, et al.: Treatment portals for elective radiotherapy of the neck: an inventory in The Netherlands. Radiother Oncol 1997, 43:81–86.
- Daisne JF, Duprez T, Weynand B, et al.: Tumor volume in pharyngolaryngeal squamous cell carcinoma: comparison at CT, MR imaging, and FDG PET and validation with surgical specimen. Radiology 2004, 233:93–100.
- Palazzi M, Jereczeck-Fossa BA, Soatti C: CT-based delineation of lymph node levels in the neck: can we optimize the Consensus? Radiother Oncol 2004, 73:383–384.
- Palazzi M, Soatti C, Bianchi E, et al.: Guidelines for the delineation of nodal regions of the head and neck on axial computed tomography images. Tumori 2002, 88:355–360.
- Wijers OB, Levendag PC, Tan T, et al.: A simplified CT-based definition of the lymph node levels in the node negative neck. Radiother Oncol 1999, 52:35–42.
- Gregoire V, Levendag P, Ang KK, et al.: CT-based delineation of lymph node levels and related CTVs in the node-negative neck: DAHANCA, EORTC, GORTEC, NCIC, RTOG consensus guidelines. Radiother Oncol 2003, 69:227–236.
- Rasch C, Keus R, Pameijer FA, et al.: The potential impact of CT-MRI matching on tumor volume delineation in advanced head and neck cancer. Int J Radiat Oncol Biol Phys 1997, 39:841–848.
- Vansteenkiste J, Fischer BM, Dooms C, et al.: Positron-emission tomography in prognostic and therapeutic assessment of lung cancer: systematic review. Lancet Oncol 2004, 5:531–540.
- Steenbakkers RJ, Duppen JC, Fitton I, et al.: Observer variation in target volume delineation of lung cancer related to radiation oncologist-computer interaction: a 'Big Brother' evaluation. Radiother Oncol 2005, 77:182–190.
- Emami B, Sethi A, Petruzzelli GJ: Influence of MRI on target volume delineation and IMRT planning in nasopharyngeal carcinoma. Int J Radiat Oncol Biol Phys 2003, 57:481–488.
- Steenbakkers RJ, Duppen JC, Fitton I, et al.: Reduction of observer variation using matched CT-PET for lung cancer delineation: a three-dimensional analysis. Int J Radiat Oncol Biol Phys 2006, 64:435–448.
- Caldwell CB, Mah K, Ung YC, et al.: Observer variation in contouring gross tumor volume in patients with poorly defined non-small-cell lung tumors on CT: the impact of 18FDG-hybrid PET fusion. Int J Radiat Oncol Biol Phys 2001, 51:923–931.
- Mah K, Caldwell CB, Ung YC, et al.: The impact of (18)FDG-PET on target and critical organs in CT-based treatment planning of patients with poorly defined non-small-cell lung carcinoma: a prospective study. Int J Radiat Oncol Biol Phys 2002, 52:339–350.
- Rusthoven KE, Koshy M, Paulino AC: The role of PET-CT fusion in head and neck cancer. Oncology (Williston Park) 2005, 19:241–246.
- Di Martino E, Nowak B, Hassan HA, et al.: Diagnosis and staging of head and neck cancer: a comparison of modern imaging modalities (positron emission tomography, computed tomography, color-coded duplex sonography) with panendoscopic and histopathologic findings. Arch Otolaryngol Head Neck Surg 2000, 126:1457–1461.
- Scarfone C, Lavely WC, Cmelak AJ, et al.: Prospective feasibility trial of radiotherapy target definition for head and neck cancer using 3-dimensional PET and CT imaging. J Nucl Med 2004, 45:543–552.
- Nowak B, Di Martino E, Janicke S, et al.: Diagnostic evaluation of malignant head and neck cancer by F-18-FDG PET compared to CT/MRI. Nuklearmedizin 1999, 38:312–318.
- Kyzas PA, Evangelou E, Denaxa-Kyza D, et al.: 18F-fluorodeoxyglucose positron emission tomography to evaluate cervical node metastases in patients with head and neck squamous cell carcinoma: a meta-analysis. J Natl Cancer Inst 2008, 100:712–720.
- Chao KS, Wippold FJ, Ozyigit G, et al.: Determination and delineation of nodal target volumes for head-and-neck cancer based on patterns of failure in patients receiving definitive and postoperative IMRT. Int J Radiat Oncol Biol Phys 2002, 53:1174–1184.
- King AD, Ma BB, Yau YY, et al.: The impact of 18F-FDG PET/CT on assessment of nasopharyngeal carcinoma at diagnosis. Br J Radiol 2008, 81:291–298.
- Jian JJ, Cheng SH, Prosnitz LR, et al.: T classification and clivus margin as risk factors for determining locoregional control by radiotherapy of nasopharyngeal carcinoma. Cancer 1998, 82:261–267.
- Chung NN, Ting LL, Hsu WC, et al.: Impact of magnetic resonance imaging versus CT on nasopharyngeal carcinoma: primary tumor target delineation for radiotherapy. Head Neck 2004, 26:241–246.
- Deurloo KE, Steenbakkers RJ, Zijp LJ, et al.: Quantification of shape variation of prostate and seminal vesicles during external beam radiotherapy. Int J Radiat Oncol Biol Phys 2005, 61:228–238.
- Rasch C, Eisbruch A, Remeijer P, et al.: Irradiation of paranasal sinus tumors, a delineation and dose comparison study. Int J Radiat Oncol Biol Phys 2002, 52:120–127.
- UICC: TNM Classification of Malignant Tumours. 6th edition. Wiley; 2002.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.