Heart atlas for retrospective cardiac dosimetry: a multi-institutional study on interobserver contouring variations and their dosimetric impact

Purpose
Cardiac effects after breast cancer radiation therapy potentially affect more patients as survival improves. The heart’s heterogeneous radiation exposure and composition of functional structures call for establishing individual relationships between structure dose and specific late effects. However, valid dosimetry requires reliable contouring which is challenging for small volumes based on older, lower-quality computed tomography imaging. We developed a heart atlas for robust heart contouring in retrospective epidemiologic studies.

Methods and materials
The atlas defined the complete heart and geometric surrogate volumes for six cardiac structures: aortic valve, pulmonary valve, all deeper structures combined, myocardium, left anterior myocardium, and right anterior myocardium. We collected treatment planning records from 16 patients from 4 hospitals including dose calculations for 3D conformal tangential field radiation therapy for left-sided breast cancer. Six observers each contoured all patients. We assessed spatial contouring agreement and corresponding dosimetric variability.

Results
Contouring agreement for the complete heart was high with a mean Jaccard similarity coefficient (JSC) of 89%, a volume coefficient of variation (CV) of 5.2%, and a mean dose CV of 4.2%. The left (right) anterior myocardium had acceptable agreement with 63% (58%) JSC, 9.8% (11.5%) volume CV, and 11.9% (8.0%) mean dose CV. Dosimetric agreement for the deep structures and aortic valve was good despite higher spatial variation. Low spatial agreement for the pulmonary valve translated to poor dosimetric agreement.

Conclusions
For the purpose of retrospective dosimetry based on older imaging, geometric surrogate volumes for cardiac organs at risk can yield better contouring agreement than anatomical definitions, but retain limitations for small structures like the pulmonary valve.
Supplementary Information The online version contains supplementary material available at 10.1186/s13014-021-01965-5.


Spatial agreement measures
Each observer's delineation of each heart atlas structure for each patient was exported from Varian Eclipse (version 15, Varian Medical Systems, Palo Alto, CA) as a three-dimensional surface triangle mesh using the Eclipse Scripting API. The mesh resolution was 1 mm. In addition, a cumulative dose-volume histogram (DVH) was exported for each of the delineated structures with dose bins of 10 cGy.
In the following description, we refer to patients $i = 1, \dots, N$ (with $N = 16$); heart atlas structures $j = 1, \dots, J$ (with $J = 7$); observers $k = 1, \dots, K$ (with $K = 6$); and observer pairs $l = 1, \dots, L$ (with $L = \binom{K}{2} = 15$). Since no observer had been designated as the gold standard, the spatial agreement between two delineations of the same structure was calculated using several measures for each of the 15 observer pairs and then averaged over observer pairs.
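The pairing scheme can be sketched in a few lines of Python. This is only an illustration of the bookkeeping: the helper name `average_pairwise` and the dictionary layout for the delineations are assumptions made for this sketch, not part of the study's software.

```python
from itertools import combinations
from statistics import mean

K = 6  # number of observers, as stated in the text

# All unordered observer pairs (k1, k2) with k1 < k2; there are C(6, 2) = 15.
pairs = list(combinations(range(1, K + 1), 2))

def average_pairwise(measure, contours):
    """Average a symmetric pairwise measure over all observer pairs.

    contours: dict mapping observer index -> that observer's delineation
              (hypothetical layout chosen for this sketch)
    measure:  function taking two delineations and returning a number
    """
    return mean(measure(contours[k1], contours[k2]) for k1, k2 in pairs)
```

Averaging over patients as well then amounts to taking the mean of these per-patient values for a given structure.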

Pairwise distance-based measures
Pairwise distance-based agreement measures were calculated using the libigl software (Jacobson & Panozzo, 2021) based on the exported surface triangle meshes.

Distance between centers of mass (DCOM)
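For a possibly non-convex mesh $A$, the center of mass (COM) vector $a = (x_a, y_a, z_a)$ was calculated using a surface integral (Nürnberg, 2013). For two meshes $A$ and $B$ with respective COM $a$ and $b$, DCOM is defined as the Euclidean distance $\lVert \cdot \rVert_2$ between $a$ and $b$:

$$\mathrm{DCOM}(A, B) = \lVert a - b \rVert_2 = \sqrt{(x_a - x_b)^2 + (y_a - y_b)^2 + (z_a - z_b)^2}$$

The average pairwise $\mathrm{DCOM}_j$ for one heart atlas structure $j$ was calculated by averaging $\mathrm{DCOM}_{ijl}$ over observer pairs $l$ and patients $i$:

$$\mathrm{DCOM}_j = \frac{1}{N \cdot L} \sum_{i=1}^{N} \sum_{l=1}^{L} \mathrm{DCOM}_{ijl}$$

Average surface distance (ASD)
Given mesh $A$ with vertices $a_r = (x_{a_r}, y_{a_r}, z_{a_r})$ ($r = 1, \dots, R$) and mesh $B$ with vertices $b_s = (x_{b_s}, y_{b_s}, z_{b_s})$ ($s = 1, \dots, S$), the Euclidean distance between one pair of vertices, one from each mesh, is:

$$d(a_r, b_s) = \lVert a_r - b_s \rVert_2 = \sqrt{(x_{a_r} - x_{b_s})^2 + (y_{a_r} - y_{b_s})^2 + (z_{a_r} - z_{b_s})^2}$$

For a single vertex $a_r$ from $A$, the Euclidean distance to mesh $B$ is defined as the distance from $a_r$ to the closest vertex of $B$:

$$d(a_r, B) = \min_{s = 1, \dots, S} d(a_r, b_s)$$

ASD between meshes $A$ and $B$ is defined as the average of all the distances of one vertex to the respective other mesh:

$$\mathrm{ASD}(A, B) = \frac{1}{R + S} \left( \sum_{r=1}^{R} d(a_r, B) + \sum_{s=1}^{S} d(b_s, A) \right)$$

Hausdorff distance (HD)
For meshes $A$ and $B$, HD is defined as the average of the longest distances from $A$ to $B$, and from $B$ to $A$:

$$\mathrm{HD}(A, B) = \frac{1}{2} \left( \max_{r = 1, \dots, R} d(a_r, B) + \max_{s = 1, \dots, S} d(b_s, A) \right)$$

The average pairwise $\mathrm{ASD}_j$ and $\mathrm{HD}_j$ for one heart atlas structure $j$ were calculated by averaging $\mathrm{ASD}_{ijl}$ and $\mathrm{HD}_{ijl}$, respectively, over observer pairs $l$ and patients $i$.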
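The vertex-based distance measures can be illustrated with a minimal pure-Python sketch. It follows the verbal definitions in this section (closest-vertex distance, ASD as the pooled average over the vertices of both meshes, HD as the average of the two directed maxima); the pooling in `asd` is one reading of the text and is an assumption, and the function names are hypothetical. The study itself used libigl, which this sketch does not reproduce.

```python
from math import dist  # Euclidean distance between two points (Python >= 3.8)

def dist_to_mesh(p, vertices):
    """Distance from point p to the closest vertex of the other mesh."""
    return min(dist(p, q) for q in vertices)

def asd(A, B):
    """Average of all vertex-to-other-mesh distances, pooled over both meshes."""
    d = [dist_to_mesh(a, B) for a in A] + [dist_to_mesh(b, A) for b in B]
    return sum(d) / len(d)

def hd(A, B):
    """Average of the longest A-to-B and B-to-A distances (as worded in the text)."""
    longest_ab = max(dist_to_mesh(a, B) for a in A)
    longest_ba = max(dist_to_mesh(b, A) for b in B)
    return 0.5 * (longest_ab + longest_ba)

def dcom(com_a, com_b):
    """Euclidean distance between two precomputed centers of mass."""
    return dist(com_a, com_b)

# Toy vertex sets in mm (made up for illustration, not study data).
A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
B = [(0.0, 0.0, 1.0), (3.0, 0.0, 1.0)]
toy_asd, toy_hd = asd(A, B), hd(A, B)
```

The brute-force nearest-vertex search is quadratic in the number of vertices; for meshes at 1 mm resolution a spatial index would be the practical choice.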

Pairwise volume-overlap measures
To calculate volume-overlap measures for each observer-pair, we used the libigl software (Jacobson & Panozzo, 2021) to perform Boolean operations on the exported surface triangle meshes, and to calculate mesh volumes via the divergence theorem (Nürnberg, 2013).
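As a hedged illustration of how the divergence theorem turns a closed triangle mesh into a volume, the sketch below sums signed tetrahedron volumes spanned by the origin and each outward-oriented triangle; the same sum, weighted by tetrahedron centroids, gives a center of mass. The function names and the list-based mesh layout are assumptions for this example and are not libigl's interface.

```python
def signed_tet_volume(p, q, r):
    """Signed volume of the tetrahedron (origin, p, q, r): p . (q x r) / 6."""
    cx = q[1] * r[2] - q[2] * r[1]
    cy = q[2] * r[0] - q[0] * r[2]
    cz = q[0] * r[1] - q[1] * r[0]
    return (p[0] * cx + p[1] * cy + p[2] * cz) / 6.0

def mesh_volume_and_com(vertices, faces):
    """Volume and center of mass of a closed mesh with outward-oriented faces.

    vertices: list of (x, y, z) tuples
    faces:    list of (i, j, k) vertex-index triples, counter-clockwise
              when viewed from outside the mesh
    """
    vol = 0.0
    moment = [0.0, 0.0, 0.0]
    for i, j, k in faces:
        p, q, r = vertices[i], vertices[j], vertices[k]
        v = signed_tet_volume(p, q, r)
        vol += v
        # Centroid of the tetrahedron (origin, p, q, r) is (p + q + r) / 4.
        for ax in range(3):
            moment[ax] += v * (p[ax] + q[ax] + r[ax]) / 4.0
    return vol, tuple(m / vol for m in moment)

# Toy mesh: a unit right tetrahedron shifted away from the origin.
verts = [(2.0, 3.0, 4.0), (3.0, 3.0, 4.0), (2.0, 4.0, 4.0), (2.0, 3.0, 5.0)]
tris = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
toy_volume, toy_com = mesh_volume_and_com(verts, tris)
```

Because the signed contributions cancel outside the mesh, the result does not depend on where the origin lies, which is why the approach also works for non-convex structures.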

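Jaccard similarity coefficient (JSC)
For structures $A$ and $B$, their intersection $A \cap B$ has volume $|A \cap B|$ and their union $A \cup B$ has volume $|A \cup B|$. JSC (Jaccard, 1912) is then defined as:

$$\mathrm{JSC}(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

Dice similarity coefficient (DSC)
DSC (Dice, 1945) for structures $A$ and $B$ with respective volumes $|A|$ and $|B|$ is defined as:

$$\mathrm{DSC}(A, B) = \frac{2\,|A \cap B|}{|A| + |B|}$$

The average pairwise $\mathrm{JSC}_j$ and $\mathrm{DSC}_j$ for one heart atlas structure $j$ were calculated by averaging $\mathrm{JSC}_{ijl}$ and $\mathrm{DSC}_{ijl}$, respectively, over observer pairs $l$ and patients $i$.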
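The two overlap measures can be illustrated on voxel sets, which stand in here for the mesh Boolean operations used in the study; that substitution, the toy boxes, and the function names are assumptions of this sketch.

```python
def jsc(a, b):
    """Jaccard similarity: intersection volume over union volume."""
    return len(a & b) / len(a | b)

def dsc(a, b):
    """Dice similarity: twice the intersection volume over the summed volumes."""
    return 2 * len(a & b) / (len(a) + len(b))

def box(x0, x1, y0, y1, z0, z1):
    """Set of unit voxels (by integer index) filling an axis-aligned box."""
    return {(x, y, z)
            for x in range(x0, x1)
            for y in range(y0, y1)
            for z in range(z0, z1)}

# Two 10 x 10 x 10 toy "structures", the second shifted by 2 voxels in x.
A = box(0, 10, 0, 10, 0, 10)
B = box(2, 12, 0, 10, 0, 10)
toy_jsc, toy_dsc = jsc(A, B), dsc(A, B)
```

The two coefficients are monotonically related by DSC = 2·JSC / (1 + JSC), so they rank observer pairs identically; DSC is always the larger of the two unless both are 0 or 1.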

Agreement for structure volume
The DVHmetrics package (Wollschlaeger & Karle, 2020) for the statistical environment R (R Core Team, 2021) was used to calculate overall agreement measures for the volume of each delineated structure based on the cumulative DVHs exported from Varian Eclipse.

Coefficient of variation (CV)
A log-normal distribution was assumed for structure volume. The CV, defined as the ratio of the standard deviation to the mean, is then given by $\sqrt{e^{\sigma^2} - 1}$, with $\sigma^2$ being the variance of the $\log_e$-transformed volume values.
The structure-specific $\mathrm{CV}_j$ based on data for all patients were derived by fitting a Bayesian log-normal regression model to the observed volume values using the brms package (Bürkner, 2017) for the statistical environment R. The covariate was a factor indexing all possible combinations of patient and structure, and the model allowed for structure-specific error variances $\sigma_j^2$. Based on the estimated $\hat{\sigma}_j^2$, the structure-specific $\mathrm{CV}_j$ were calculated as $\mathrm{CV}_j = 100 \cdot \sqrt{e^{\hat{\sigma}_j^2} - 1}$. The 95% credible intervals for $\mathrm{CV}_j$ were derived from the posterior distribution of the $\hat{\sigma}_j$.
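The log-normal CV formula can be tried out with a simple plug-in estimate. The sketch below is not the Bayesian brms fit described above: it assumes a frequentist stand-in in which the error variance for one structure is the pooled within-patient variance of the log volumes, and both function names are hypothetical.

```python
from math import exp, log, sqrt
from statistics import mean, variance

def lognormal_cv_percent(sigma2):
    """CV in percent of a log-normal variable with log-scale variance sigma2."""
    return 100.0 * sqrt(exp(sigma2) - 1.0)

def pooled_log_variance(volumes_by_patient):
    """Mean of the per-patient sample variances of log-transformed volumes.

    volumes_by_patient: list of lists, one inner list of observer volumes
    per patient (same number of observers for every patient).
    """
    return mean(variance([log(v) for v in vols]) for vols in volumes_by_patient)

# Toy volumes in cc for one structure: 3 patients x 3 observers (made up).
toy_volumes = [[600.0, 630.0, 615.0], [710.0, 690.0, 705.0], [550.0, 580.0, 560.0]]
toy_cv = lognormal_cv_percent(pooled_log_variance(toy_volumes))
```

For small variances the formula is close to 100·σ, so a log-scale standard deviation of 0.1 corresponds to a CV of roughly 10%.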
As a sensitivity analysis, we descriptively calculated the structure-specific $\mathrm{CV}_j$ as the ratio of the square root of the mean of the variances $s_{ij}^2$ of the volume values in structure $j$ and patient $i$ to the average of the volume means $\bar{V}_{ij}$:

$$\mathrm{CV}_j = 100 \cdot \frac{\sqrt{\frac{1}{N} \sum_{i=1}^{N} s_{ij}^2}}{\frac{1}{N} \sum_{i=1}^{N} \bar{V}_{ij}}$$

Intraclass correlation coefficient (ICC)
For a fixed heart atlas structure $j$, the variation in volume has two systematic variance components: $\sigma_b^2$ for the variance due to patients $i$ (between) and $\sigma_w^2$ for the variance due to observers $k$ (within). The population ICC as a measure for observer agreement is then defined as (Shrout & Fleiss, 1979):

$$\mathrm{ICC} = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_w^2}$$

To estimate ICC, we assume an analysis-of-variance (ANOVA) design with covariate factors patient and observer whose levels both represent random samples from a larger population. Adopting the naming scheme from Shrout and Fleiss (1979), this is a two-way random design to estimate ICC(2) as a measure for absolute agreement (as opposed to consistency) among observers.
Given a data sample, the variance components are estimated from two different ANOVAs. The first ANOVA is a one-way design with factor patient, from which we calculate $MS_{pat}$, the mean square for the patient effect. The second ANOVA is a two-way design with factors patient and observer, from which we calculate $MS_{obs}$, the mean square for the observer effect, and $MS_{err}$, the mean square error. ICC(2) can then be estimated as:

$$\widehat{\mathrm{ICC}}(2) = \frac{MS_{pat} - MS_{err}}{MS_{pat} + (K - 1) \cdot MS_{err} + \frac{K}{N} \cdot (MS_{obs} - MS_{err})}$$

For each structure $j$, we calculated $\mathrm{ICC}(2)_j$ and the corresponding 95% confidence interval using the psych package (Revelle, 2020) for the statistical environment R.
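The single-rating ICC(2) estimator of Shrout and Fleiss (1979) for a two-way random design can be sketched from the mean squares of a complete patient-by-observer table. This is a minimal pure-Python illustration, not the psych package's implementation, and it yields a point estimate only (no confidence interval). It uses one balanced two-way decomposition, in which the patient mean square equals that of the one-way ANOVA; the function name and the toy ratings are assumptions.

```python
def icc2(ratings):
    """Point estimate of ICC(2) (absolute agreement, single rating).

    ratings[i][k]: value for patient i measured by observer k;
    the table must be complete (N patients x K observers).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(r) / k for r in ratings]                        # per patient
    col_means = [sum(r[c] for r in ratings) / n for c in range(k)]   # per observer

    ss_pat = k * sum((m - grand) ** 2 for m in row_means)
    ss_obs = n * sum((m - grand) ** 2 for m in col_means)
    ss_tot = sum((x - grand) ** 2 for r in ratings for x in r)

    ms_pat = ss_pat / (n - 1)
    ms_obs = ss_obs / (k - 1)
    ms_err = (ss_tot - ss_pat - ss_obs) / ((n - 1) * (k - 1))

    return (ms_pat - ms_err) / (ms_pat + (k - 1) * ms_err + k * (ms_obs - ms_err) / n)

# Toy table: 6 subjects rated by 4 raters (illustrative values, not study data).
toy_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
toy_icc = icc2(toy_ratings)
```

In the toy table the raters differ strongly in their overall level, which enters the denominator through the observer mean square and pulls the absolute-agreement ICC down to about 0.29.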

Dose-based agreement measures
Overall agreement measures for dose metrics in each delineated heart atlas structure were calculated based on the cumulative DVHs exported from Varian Eclipse using the DVHmetrics package (Wollschlaeger & Karle, 2020) for the statistical environment R.

Coefficient of variation (CV)
The structure-specific CV j for dose metrics DMEAN and D2CC were calculated, respectively, as for structure volume (section 1.1.3).

Standard deviation (SD)
For V5GY as a natural proportion, a beta distribution with parameters $\mu$ (mean) and $\varphi$ (precision) was assumed. Structure-specific $\mathrm{SD}_j$ based on data for all patients were derived by fitting a Bayesian beta regression model to the observed values using the brms package. The covariate was a factor indexing all possible combinations of patient and structure, and the model allowed for structure-specific precision parameters $\varphi_j$. Based on the estimated $\hat{\mu}_j$ and $\hat{\varphi}_j$, the structure-specific $\mathrm{SD}_j$ were then calculated as $\mathrm{SD}_j = \sqrt{\frac{\hat{\mu}_j \cdot (1 - \hat{\mu}_j)}{1 + \hat{\varphi}_j}}$. The 95% credible intervals for $\varphi_j$ were derived from the posterior distribution of the parameter estimates.
As a sensitivity analysis, we descriptively calculated the structure-specific $\mathrm{SD}_j$ for V5GY as the square root of the arithmetic mean of the empirical variances $s_{ij}^2$ for one heart atlas structure $j$ from one patient $i$ as $\mathrm{SD}_j = \sqrt{\frac{1}{N} \sum_{i=1}^{N} s_{ij}^2}$.
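Both SD calculations for a proportion such as V5GY reduce to short formulas, sketched here under stated assumptions: `beta_sd` uses the mean-precision parameterization in which the beta variance is μ(1 − μ)/(1 + φ), and `pooled_sd` is the descriptive root-mean-variance across patients. The function names and toy values are hypothetical, and no model fitting is performed.

```python
from math import sqrt
from statistics import mean, variance

def beta_sd(mu, phi):
    """SD of a beta distribution with mean mu and precision phi."""
    return sqrt(mu * (1.0 - mu) / (1.0 + phi))

def pooled_sd(values_by_patient):
    """Square root of the mean of the per-patient sample variances.

    values_by_patient: list of lists, one inner list of observer values
    (e.g. V5GY as a proportion) per patient.
    """
    return sqrt(mean(variance(v) for v in values_by_patient))

# Toy V5GY proportions for one structure: 2 patients x 2 observers (made up).
toy_v5 = [[0.1, 0.3], [0.2, 0.6]]
toy_sd = pooled_sd(toy_v5)
```

A larger precision φ means a tighter distribution: with μ = 0.5, φ = 24 gives an SD of 0.1, while φ = 99 gives 0.05.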

Intraclass correlation coefficient (ICC)
The respective structure-specific ICC j for dose metrics DMEAN, D2CC, and V5GY were calculated as for structure volume (section 1.1.3).
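To make the dose metrics concrete, the sketch below reads them off a cumulative DVH given as ascending dose bins and the volume receiving at least each dose. The definitions used are assumptions based on common usage, not taken from this text: DMEAN as the mean dose, D2CC as the minimum dose to the hottest 2 cc, and V5GY as the proportion of the structure receiving at least 5 Gy. The helper names are hypothetical and this is not the DVHmetrics implementation.

```python
def volume_at_dose(dose_gy, vol_cc, d):
    """Linearly interpolated volume (cc) receiving at least dose d (Gy)."""
    for i in range(len(dose_gy) - 1):
        d0, d1 = dose_gy[i], dose_gy[i + 1]
        if d0 <= d <= d1:
            w = (d - d0) / (d1 - d0)
            return vol_cc[i] + w * (vol_cc[i + 1] - vol_cc[i])
    return 0.0  # d lies beyond the last dose bin

def dose_at_volume(dose_gy, vol_cc, v):
    """Linearly interpolated dose (Gy) at which the cumulative volume drops to v."""
    for i in range(len(dose_gy) - 1):
        v0, v1 = vol_cc[i], vol_cc[i + 1]
        if v0 >= v >= v1 and v0 > v1:
            w = (v0 - v) / (v0 - v1)
            return dose_gy[i] + w * (dose_gy[i + 1] - dose_gy[i])
    return None  # structure is smaller than v

def dvh_metrics(dose_gy, vol_cc):
    """Return (DMEAN in Gy, D2CC in Gy, V5GY as a proportion)."""
    total = vol_cc[0]  # assumes the first bin is at 0 Gy
    # Mean dose equals the area under the cumulative DVH over the total volume.
    area = sum(0.5 * (vol_cc[i] + vol_cc[i + 1]) * (dose_gy[i + 1] - dose_gy[i])
               for i in range(len(dose_gy) - 1))
    dmean = area / total
    d2cc = dose_at_volume(dose_gy, vol_cc, 2.0)
    v5gy = volume_at_dose(dose_gy, vol_cc, 5.0) / total
    return dmean, d2cc, v5gy

# Toy DVH: 10 cc structure, dose spread uniformly over 0-10 Gy, 10 cGy bins.
doses = [i * 0.1 for i in range(101)]
volumes = [10.0 * (1.0 - d / 10.0) for d in doses]
toy_dmean, toy_d2cc, toy_v5gy = dvh_metrics(doses, volumes)
```

For the toy DVH the three values come out at about 5 Gy, 8 Gy, and 0.5; applying the function to each observer's DVH for the same patient and structure yields the per-observer values that enter the CV, SD, and ICC calculations.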

Supplementary data

2.1 Interobserver variations in contour delineation
Figure S1: Contour delineations for the complete heart, axial plane.