Multi-institutional evaluation of a Pareto navigation guided automated radiotherapy planning solution for prostate cancer

Abstract

Background

Current automated planning solutions are calibrated using trial and error or machine learning on historical datasets. Neither method allows for the intuitive exploration of differing trade-off options during calibration, which may aid in ensuring automated solutions align with clinical preference. Pareto navigation provides this functionality and offers a potential calibration alternative. The purpose of this study was to validate an automated radiotherapy planning solution with a novel multi-dimensional Pareto navigation calibration interface across two external institutions for prostate cancer.

Methods

The implemented ‘Pareto Guided Automated Planning’ (PGAP) methodology was developed in RayStation using scripting and consisted of a Pareto navigation calibration interface built upon a ‘Protocol Based Automatic Iterative Optimisation’ planning framework. 30 previous patients were randomly selected by each institution (IA and IB), 10 for calibration and 20 for validation. Utilising the Pareto navigation interface, automated protocols were calibrated to the institutions’ clinical preferences. A single automated plan (VMATAuto) was generated for each validation patient, with plan quality compared against the previously treated clinical plan (VMATClinical) both quantitatively, using a range of DVH metrics, and qualitatively through blind review at the external institution.

Results

PGAP led to marked improvements across the majority of rectal dose metrics, with Dmean reduced by 3.7 Gy and 1.8 Gy for IA and IB respectively (p < 0.001). For bladder, results were mixed, with low and intermediate dose metrics reduced for IB but increased for IA. Differences, whilst statistically significant (p < 0.05), were small and not considered clinically relevant. The reduction in rectum dose was not at the expense of PTV coverage (D98% was generally improved with VMATAuto), but was somewhat detrimental to PTV conformality. The prioritisation of rectum over conformality was, however, aligned with preferences expressed during calibration and was a key driver in both institutions demonstrating a clear preference towards VMATAuto, with 31/40 considered superior to VMATClinical upon blind review.

Conclusions

PGAP enabled intuitive adaptation of automated protocols to an institution’s planning aims and yielded plans more congruent with the institution’s clinical preference than the locally produced manual clinical plans.

Background

Automated radiotherapy treatment planning (AP) is an innovation that improves the quality and efficiency of plan generation when compared to traditional manual trial-and-error techniques [1]. Within the literature AP solutions can be separated into 3 broad categories:

  1. Knowledge based planning (KBP): utilise algorithms trained on databases of historical treatment plans to predict parameters (e.g. dose volume histograms) that inform the optimisation of novel patients [2,3,4,5,6].

  2. Constrained hierarchical optimisation (CHO): minimise clinical objectives in strict sequential order according to a predefined clinical ‘wish list’ [7, 8].

  3. Protocol-based automatic iterative optimisation (PBAIO): automatically adapt parameters during the plan generation process, tailoring the optimisation to the individual patient [9,10,11,12,13].

Prior to automated plan generation all methods must be calibrated; a process that is critical in ensuring solutions are optimal and congruent with oncologists’ treatment wishes. At present two calibration methods are commonly employed: simple trial-and-error, where AP parameters are iteratively adjusted manually based on the AP output, and machine learning, where AP parameters/algorithms are trained on historical patient datasets. Trial-and-error is the predominant method used for PBAIO and CHO solutions, and machine learning for KBP solutions [1].

Whilst trial-and-error and machine learning yield clinically acceptable AP solutions, there are limitations of both approaches that can hinder the efficiency and optimality of the AP calibration. Machine learning generally requires large historical datasets (typically n = 100) [14], which may not be present for novel techniques or prescriptions, and calibrations are strongly dependent on the optimality and consistency of plans in the training dataset [15], which is not guaranteed. Additionally, KBP trained with machine learning may still require considerable ‘tuning’ to deliver suitable solutions [16]. For trial-and-error, a key issue is that, due to the high number of calibration variables and their possible permutations, efficient and intuitive exploration of different treatment options is not possible. Trial-and-error is analogous to traditional manual planning (albeit at the patient cohort level); an approach prone to inter-observer variability [17] and yielding plans that may not fully align with oncologists’ clinical aims [18]. The process is also inefficient, with any change in calibration parameter requiring the generation of a new plan to assess the impact on the dose distribution.

We propose an alternative method for AP calibration, which utilises Pareto navigation techniques in place of trial-and-error or machine learning. The concept of Pareto navigation is as follows: (i) a plan is considered Pareto optimal when improvement of one objective/trade-off can only be made at the detriment of another; (ii) for a given optimisation problem there is an infinite set of Pareto optimal plans, which define the ‘Pareto front’; and (iii) in Pareto navigation the Pareto front is sampled (for all or a selected number of trade-offs) by generating a set of discrete Pareto optimal plans, after which the decision maker (e.g. oncologist or dosimetrist) interactively explores the Pareto front using a navigation star [19] or sliders [20] to select the clinically optimum solution. When compared to traditional trial-and-error manual planning on an individual patient basis, Pareto navigation has been shown to improve planning efficiency by 70–90% [18, 21, 22] and yield solutions more congruent with the oncologists’ treatment aims [18]. It is therefore hypothesised that Pareto navigation presents an effective AP calibration alternative.
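The definition of Pareto optimality in (i) can be illustrated with a minimal sketch. It assumes every objective is expressed so that lower is better; the plan values and the choice of objectives (rectum Dmean and one minus a conformity index) are hypothetical and are not taken from the study data.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if plan a is no worse than plan b on every objective
    and strictly better on at least one (lower values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(plans: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


# Hypothetical plans: (rectum Dmean in Gy, 1 - conformity index).
plans = [(20.0, 0.30), (22.0, 0.25), (25.0, 0.20), (24.0, 0.28), (26.0, 0.22)]
front = pareto_front(plans)
print(front)  # the first three plans; the last two are dominated
```

In this toy set the plan (24.0, 0.28) is dominated by (22.0, 0.25), so it would never be the preferred choice whatever the relative priority of the two objectives.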

Recently the methodology of a fully automated PBAIO solution that was calibrated using Pareto navigation techniques (Pareto Guided Automated Planning (PGAP)) has been presented [23]. The solution was evaluated for prostate cancer patients with and without elective nodal irradiation at the local institution (Velindre Cancer Centre (VCC)), with results demonstrating superiority over manual planning [24]. However, in this initial implementation of PGAP, Pareto navigation was constrained to one trade-off (or dimension) at a time, which limited the effectiveness of the technique in exploring the Pareto surface.

The purpose of this work is to firstly present a new PGAP solution that implements a multi-dimensional Pareto navigation calibration interface and secondly to present results of a multi-centre validation of this solution in two external institutions.

Methods

Patient selection and planning protocol

For each institution (IA and IB) 30 patients (60 in total) treated with prostate only radiotherapy during the period of 1st April to 30th June 2017 were randomly selected, with 10 and 20 patients allocated to a calibration and validation dataset respectively. Patients with hip prostheses were excluded. Across both institutions patients were treated following the hypo-fractionated CHHIP trial protocol [25]; a simultaneous integrated boost technique delivering 60 Gy in 20 fractions. The clinical goals associated with this protocol are presented in Table 1.

Table 1 CHHIP trial based clinical planning goals for IA and IB

Patients were planned on a CT scan of 2 mm slice thickness, with prostate and up to 2 cm of proximal seminal vesicles (sv) delineated as targets; and rectum, bladder, femoral heads (IB only) and bowel (IB only) delineated as organs at risk (OARs). As per the CHHIP protocol the following planning target volumes (PTV) were generated, with the PTV’s nominal prescription in Gy defined by the nomenclature’s suffix: prostate expanded by 5 mm (0 mm posteriorly) and 10 mm (5 mm posteriorly) to form PTV60 and PTV57.5 respectively; and prostate + sv expanded by 10 mm to form PTV48.

The clinically delivered treatment plans (VMATClinical) were generated by the institutions using RayStation v5 (RaySearch Laboratories, Stockholm). Treatments were delivered on a Varian TrueBeam STx (Varian Medical Systems, Palo Alto) and an Elekta Agility (Elekta Ltd, Crawley) linac for IA and IB respectively. Automated plans (VMATAuto) were generated at VCC using RayStation v4.99, a research release equivalent to v5. VMATAuto plans were generated using identical RayStation treatment planning machine models and arc configurations to VMATClinical (single 6MV 360° VMAT arc). For IB, VMATAuto and VMATClinical were normalised such that PTV60’s median dose equalled 60.0 Gy.
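The normalisation described for IB amounts to a uniform rescaling of the dose distribution. The sketch below illustrates the arithmetic only; it assumes the dose is available as a flat list of voxel doses with a list of indices identifying PTV60 voxels, and the function name and data layout are hypothetical rather than part of RayStation.

```python
from statistics import median
from typing import List, Sequence


def normalise_to_ptv_median(dose_gy: Sequence[float],
                            ptv_indices: Sequence[int],
                            target_median_gy: float = 60.0) -> List[float]:
    """Scale the whole dose distribution so that the median dose
    within the PTV voxels equals target_median_gy."""
    ptv_median = median(dose_gy[i] for i in ptv_indices)
    scale = target_median_gy / ptv_median
    return [d * scale for d in dose_gy]


# Hypothetical four-voxel dose grid; voxels 1 to 3 belong to PTV60.
dose = [30.0, 58.0, 59.0, 61.0]
normalised = normalise_to_ptv_median(dose, ptv_indices=[1, 2, 3])
```

Because the same scale factor is applied to every voxel, OAR doses change in proportion to the target dose, which is why both plan types were normalised identically before comparison.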

Pareto guided automated planning

In this study PGAP was performed using EdgeVcc: a PBAIO automated planning solution developed at VCC and implemented in RayStation using Python scripting. Full details of this PGAP solution are presented by Wheeler et al. [23], with the following providing a summary of the key aspects.

Prior to automated planning a site specific ‘AutoPlan protocol’ is created and a set of planning goals defined (Table 2). Planning goals are split into 3 priority levels: critical normal tissue goals (P1), target goals (P2) and normal tissue goals (P3). P1 and P2 generally represent a clinical protocol’s mandatory dose constraints and P3 all other trade-offs which are to be minimised. This approach is analogous to using constraints and trade-offs in standard Pareto navigation applications. No weighting factors (WF) are specified by the user, instead they are generated through two processes. For P1 and P2, WF are defined by hard coded constants (1000 and 250 for P1 and P2 respectively). For P3, balancing competing trade-offs is complex and difficult to define a priori. In this case WF are derived through the Pareto navigation calibration process.
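The assignment of weighting factors by priority level can be sketched as a small lookup. The constants 1000 and 250 are those stated above; the class, function and goal names are hypothetical and do not reflect the actual EdgeVcc data model.

```python
from dataclasses import dataclass
from typing import Dict

# Hard coded weighting factors for P1 and P2, as stated in the text.
FIXED_WEIGHTS: Dict[str, float] = {"P1": 1000.0, "P2": 250.0}


@dataclass
class PlanningGoal:
    name: str
    priority: str  # "P1", "P2" or "P3"


def weighting_factor(goal: PlanningGoal,
                     calibrated_p3_weights: Dict[str, float]) -> float:
    """Return the fixed constant for P1/P2 goals, or the weight
    selected during Pareto navigation calibration for P3 goals."""
    if goal.priority in FIXED_WEIGHTS:
        return FIXED_WEIGHTS[goal.priority]
    if goal.priority == "P3":
        return calibrated_p3_weights[goal.name]
    raise ValueError(f"Unknown priority level: {goal.priority}")


# Hypothetical goals and a hypothetical calibrated P3 weight.
goals = [
    PlanningGoal("rectum high dose", "P1"),
    PlanningGoal("PTV60 min dose", "P2"),
    PlanningGoal("rectum Dmean", "P3"),
]
p3_weights = {"rectum Dmean": 12.5}
weights = [weighting_factor(g, p3_weights) for g in goals]
```

Keeping the P1 and P2 constants fixed means only the P3 entries need to be determined during calibration, which is the part handled by Pareto navigation.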

Table 2 Final planning goals and weighting factors for both institutions

Calibration is initially performed on a single patient. Firstly, a set of automated plans with differing P3 WF are generated using the PBAIO automated planning algorithms. These plans represent different AutoPlan calibration options, each with a different balancing of competing trade-offs that constitute a point on the Pareto front. The operator then navigates through these differently weighted P3 treatment options via a sliding interface. The clinically optimum position on the Pareto front, determined qualitatively by the operator, is selected and the WF associated with this navigated position stored in the AutoPlan Protocol. The result is a calibrated AutoPlan protocol, which is ready for testing or further refinement.

The PGAP solution is built on a PBAIO automated planning framework, where during optimisation the position and weight of P3 related optimisation objectives are iteratively updated. The position is adjusted to maintain a constant difference (δ) between the optimisation objective and its corresponding DVH parameter. For example, if a dose volume objective (DVO) of V23.4 Gy at 10.0% volume is defined and the resultant optimised dose yields a V23.4 Gy equalling 9.0%, the DVO volume target will be set to [9.0% - δ]. In terms of objective weight, this is dynamically updated such that the objective function’s value trends towards a target objective value. Utilising these two mechanisms within a PBAIO framework aims to both minimise OAR doses (via dynamic positioning) and ensure consistent trade-off balancing across all patients treated to the same clinical protocol (via dynamic weighting). This provides the potential for a Pareto navigation calibration on a single patient to yield a suitably calibrated AP solution for novel patients. In practice, especially for more complex sites with variable anatomy, it may be necessary to perform additional Pareto navigation on outlier patients (with weights typically averaged) to improve the solution’s robustness across the whole cohort.
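The two update mechanisms can be sketched as follows. The position update follows the worked example above; the value of δ, the floor at zero, and the multiplicative form of the weight update are assumptions made for illustration, since the exact update rules are given in Wheeler et al. [23] rather than here.

```python
def update_dvo_volume_target(achieved_volume_pct: float,
                             delta_pct: float) -> float:
    """Dynamic positioning: place the dose volume objective a constant
    difference (delta) below the currently achieved DVH value.
    The floor at 0% is an assumption to keep the target physical."""
    return max(achieved_volume_pct - delta_pct, 0.0)


def update_weight(weight: float,
                  objective_value: float,
                  target_value: float) -> float:
    """Dynamic weighting (hypothetical rule): rescale the weight so that,
    for an unchanged dose distribution, the objective function value
    would equal the target objective value."""
    if objective_value <= 0.0:
        return weight  # objective already met; leave the weight unchanged
    return weight * target_value / objective_value


# Example from the text: the achieved V23.4 Gy is 9.0%.
# delta = 1.0% is a hypothetical value chosen for illustration.
new_target = update_dvo_volume_target(achieved_volume_pct=9.0, delta_pct=1.0)
new_weight = update_weight(weight=2.0, objective_value=0.5, target_value=1.0)
```

With these inputs the objective moves from 10.0% to 8.0%, so the optimiser keeps being asked for slightly more sparing than it has already achieved, while the weight update keeps the relative pressure of the objective comparable between patients.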

In previous work, calibration via Pareto navigation was performed through sequential navigation of one trade-off (or Pareto dimension) at a time. In this regard a Pareto dataset (typically containing 5 plans) was generated with varying WF applied to the given trade-off and all other WF held constant (or set to zero if unnavigated). The process was repeated until all trade-offs were navigated. In this work we present a fully customisable interface (Fig. 1), where any number of dimensions can be navigated in parallel, thereby providing the opportunity for full Pareto navigation. Furthermore, dimensions are not limited to a planning goal’s WF, but rather any of its parameters, enabling navigation, for example, of individual P2 target values such as PTV min dose.

Fig. 1

Pareto navigation calibration interface. Navigation is performed using the slider bars (top left), with the dose distribution (top centre) and DVH (top right, solid line) updated in real time within RayStation’s evaluation module. During navigation the operator can set the navigated distribution as a reference distribution (bottom centre) and DVH (top right, dotted line) to aid in the decision making. In this example the navigated position represents a solution where the rectum is spared at the expense of homogeneity and conformality (Cal1) with the reference distribution representative of the final calibration for IA (Cal2). The corresponding Cal2 slider positions are provided for reference (bottom left) and isodose legends have been enhanced for clarity. ROIs: rectum (brown), bladder (yellow), external (blue), PTV60 (pink), PTV57.5 (red) and PTV48 (orange)

For a given navigation the operator defines (via a config file) the dimensions to be explored and for each dimension the trade-off parameter values to be sampled during creation of the Pareto surface. Typically 3–5 parameter values are specified for each dimension. To populate the Pareto navigation dataset, a fully segmented treatment plan is generated (using the PBAIO framework) for all possible parameter value permutations across the different dimensions. The dataset is navigated in ‘parameter space’ using a slider interface with the navigated dose distribution estimated through linear interpolation of the neighbouring discrete Pareto plans using the navigated parameter values as the interpolation coefficients (see Wheeler et al. [23]). Whilst the interface allows for any number of dimensions to be navigated in parallel, there are computational limitations as the number of plans in the navigation dataset increases to the power of the number of dimensions. Pareto navigation is therefore typically limited to < 5 dimensions, with additional navigations performed sequentially until all trade-offs have been navigated.
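The exhaustive sampling and the interpolation step can be sketched as below. This is an illustration under stated assumptions: dose distributions are represented as flat lists of voxel doses, the interpolation is taken to be multilinear between the grid points bracketing the slider position, and the dimension names and dose values are hypothetical. The scheme actually used is described in Wheeler et al. [23].

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Grid = Dict[str, Sequence[float]]


def build_grid(samples: Grid) -> List[Tuple[float, ...]]:
    """All parameter value permutations: one plan is generated per entry."""
    return list(product(*samples.values()))


def bracket(values: Sequence[float], x: float) -> Tuple[int, float]:
    """Index of the lower neighbouring sample and fractional position of x."""
    for i in range(len(values) - 1):
        if values[i] <= x <= values[i + 1]:
            return i, (x - values[i]) / (values[i + 1] - values[i])
    raise ValueError("Navigated position lies outside the sampled range")


def interpolate_dose(samples: Grid,
                     doses: Dict[Tuple[float, ...], List[float]],
                     position: Sequence[float]) -> List[float]:
    """Multilinear interpolation between the neighbouring discrete plans."""
    axes = list(samples.values())
    brackets = [bracket(axis, x) for axis, x in zip(axes, position)]
    n_voxels = len(next(iter(doses.values())))
    result = [0.0] * n_voxels
    for corner in product((0, 1), repeat=len(axes)):
        weight, key = 1.0, []
        for (i, t), side, axis in zip(brackets, corner, axes):
            weight *= t if side else (1.0 - t)
            key.append(axis[i + side])
        if weight == 0.0:
            continue
        plan_dose = doses[tuple(key)]
        for v in range(n_voxels):
            result[v] += weight * plan_dose[v]
    return result


# Hypothetical two-dimensional navigation with 3 x 2 sampled values.
samples = {"rectum_wf": [0.0, 1.0, 2.0], "falloff_wf": [0.0, 1.0]}
grid = build_grid(samples)
# Stand-in two-voxel "dose distributions" for each discrete plan.
doses = {(r, f): [60.0 - 5.0 * r, 30.0 + 2.0 * f] for r, f in grid}
navigated = interpolate_dose(samples, doses, position=(0.5, 0.5))
```

The size of the dataset is the product of the number of samples per dimension, so four dimensions with four values each would already require 4^4 = 256 fully optimised plans, which is the computational limitation noted above.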

AutoPlan protocol calibration

Separate calibrations for both IA and IB were performed by VCC using the institution’s calibration patient cohort. Planning goals (Table 2) were based on CHHIP clinical goals (Table 1) and during calibration the balancing of trade-offs was informed by the corresponding VMATClinical plan and collaborative discussions with the external institution.

Demonstrating the utility of PGAP

To demonstrate the potential utility of PGAP, using the calibrated IA protocol as a base, a multidimensional navigation consisting of the following four dimensions was generated for the first IA calibration patient: PTV60 Dmin (target parameter), PTV60 Dmax (target parameter), rectum Dmean (WF parameter) and external normal tissue fall off (WF parameter). Using the navigation interface two different calibrations were selected (Fig. 1): Cal1, where the rectum was spared at the expense of homogeneity and conformality, and Cal2, where parameter values were set to nominally equal the final calibrated IA protocol. For both Cal1 and Cal2 an automated plan was generated for all IA calibration patients. Pareto front representations of PTV60 homogeneity index (HIPTV60), PTV48 Paddick’s conformity index (CIPTV48) [26] and rectum DMean were generated to demonstrate the propagation of differing calibrations to novel patients. This evaluation was undertaken at VCC after the multi-institutional study proper using an upgraded version of RayStation (8b research).
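For reference, Paddick’s conformity index is the square of the target volume covered by the reference isodose, divided by the product of the target volume and the total volume enclosed by that isodose. A minimal sketch follows, assuming equally sized voxels identified by integer indices; the isodose level used to define coverage in the study is not stated here, so the voxel sets below are purely hypothetical.

```python
from typing import Set


def paddick_ci(target_voxels: Set[int], isodose_voxels: Set[int]) -> float:
    """Paddick conformity index: TV_PIV^2 / (TV * PIV), where TV is the
    target volume, PIV the volume enclosed by the reference isodose and
    TV_PIV their overlap. Assumes all voxels have equal volume."""
    tv = len(target_voxels)
    piv = len(isodose_voxels)
    if tv == 0 or piv == 0:
        raise ValueError("Target and isodose volumes must be non-empty")
    tv_piv = len(target_voxels & isodose_voxels)
    return tv_piv ** 2 / (tv * piv)


# Hypothetical example: 100 target voxels, 120 voxels inside the
# reference isodose, of which 90 lie within the target.
target = set(range(100))
isodose = set(range(10, 130))
ci = paddick_ci(target, isodose)  # 90^2 / (100 * 120) = 0.675
```

A value of 1 indicates that the isodose volume coincides exactly with the target, so lower CI values reflect either under-coverage or dose spill into surrounding tissue.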

Evaluative study design

For the evaluative study, VMATAuto plans were generated for all validation patients using the institution’s calibrated AutoPlan Protocol. Plan quality was quantitatively compared to VMATClinical using: CHHIP dose metrics; PTV D98%, D2%, HI and CI; and OAR mean doses. Higher prescription PTVs were subtracted from lower prescription PTVs when reporting D98%, D2% and HI. Differences were assessed for statistical significance using a two-sided Wilcoxon signed rank test. Statistical testing was not performed where, following omission of tied values (i.e. where metrics equalled zero for both VMATAuto and VMATClinical), sample size was < 10. In addition, a blind qualitative comparison of VMATAuto and VMATClinical was performed on-site at each external institution by a team consisting of a single oncologist and dosimetrist. During review the team would discuss the two plans under blind conditions and rank them in order of preference. Whilst the discussions were collaborative, it was permissible for the oncologist and dosimetrist to disagree on the final ranking.
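The statistical procedure, including the omission of tied pairs and the minimum sample size rule, can be sketched as follows. The sketch uses a normal approximation with a tie correction and only the standard library; whether the study used an exact or approximate p-value is not stated, so this choice is an assumption, and the function name is hypothetical.

```python
import math
from typing import Optional, Sequence

MIN_SAMPLE_SIZE = 10  # testing is skipped below this size, as in the text


def wilcoxon_signed_rank_p(auto: Sequence[float],
                           clinical: Sequence[float]) -> Optional[float]:
    """Two-sided Wilcoxon signed rank p-value (normal approximation).
    Returns None when fewer than MIN_SAMPLE_SIZE non-tied pairs remain."""
    diffs = [a - c for a, c in zip(auto, clinical) if a != c]  # drop tied pairs
    n = len(diffs)
    if n < MIN_SAMPLE_SIZE:
        return None
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # shared rank for equal magnitudes
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        group = j - i + 1
        tie_term += group ** 3 - group
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical metric values for 12 patients (not study data).
clinical = [10.0] * 12
auto_lower = [c - d for c, d in zip(clinical, range(1, 13))]
p_value = wilcoxon_signed_rank_p(auto_lower, clinical)
```

In practice a library routine such as `scipy.stats.wilcoxon` would normally be preferred; the explicit version is shown only to make the tie handling and the sample size rule visible.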

Results

AutoPlan protocol calibration

Details of the calibrated AutoPlan Protocols are provided in Table 2. The final IA protocol was used as a base for IB following simplification (low weighted and similar planning goals removed). Due to substantial similarities in clinical preference between the two institutions only two key changes were made for the final IB protocol: the addition of bowel goals and an increased intra-PTV dose fall-off WF.

Demonstrating the utility of PGAP

The Pareto front representations in Fig. 2 demonstrate how the two different calibrations propagated to novel patients. Across patients 2–10 there was a clear and consistent change in the balancing of automated plans between Cal1 and Cal2 with changes in rectum Dmean, CIPTV48 and HIPTV60 of 8.7 Gy, 0.068, and − 0.031 respectively. This compares with changes of 7.4 Gy, 0.073 and − 0.034 respectively for the calibration patient (patient 1).

Fig. 2

Pareto front representations of the three navigated trade-offs (rectum Dmean, HIPTV60 and CIPTV48) demonstrating the dosimetric impact of two differently balanced calibrations (Cal1 & Cal2) on novel patients in the IA calibration dataset. Data from the navigation patient (Patient 1) is presented for reference, with Cal1 and Cal2 data points encompassed by the red and blue boxes respectively

Evaluative study

Results of the evaluative study on the validation patient cohort are presented in Table 3, with Fig. 3 providing 1–1 plots comparing VMATAuto with VMATClinical across a range of key OAR and PTV dose metrics. Across both institutions VMATAuto led to a statistically significant (p < 0.05) improvement across all but two rectal dose metrics (V48.6 Gy, V52.7 Gy). For IA, several reductions were substantial, with Dmean and V24.3 Gy reduced by 3.7 Gy and 15.1% respectively. For IB improvements were more modest [ΔDmean = -1.8 Gy, ΔV24.3 Gy = -8.4%]. For bladder, VMATAuto led to a small but statistically significant detriment in low and intermediate dose level metrics for IA [ΔV40.5 Gy = + 1.3%, ΔV48.6 Gy = + 0.6%] with the situation reversed for IB [ΔV40.5 Gy = -1.0%, ΔV48.6 Gy = -0.7%]. VMATAuto led to a moderate reduction in bladder Dmean for IB [ΔDmean = -1.3 Gy].

Table 3 Dosimetric comparison of VMATAuto and VMATClinical for institution A and B (mean ± standard deviation)

Fig. 3

1–1 plots comparing VMATAuto and VMATClinical across a range of OAR and PTV dose metrics for both institutions. Unity line is presented for reference and represents equivalence between the two techniques

VMATAuto yielded moderate improvements in D98% for PTV57.5 [IB only, ΔD98% = +1.0 Gy] and PTV48 [IA ΔD98% = +0.7 Gy, IB ΔD98% = +1.0 Gy], which did not result in a detriment in rectal doses. Significant but small differences were also observed for PTV60 D98% [IA ΔD98% = -0.2 Gy, IB ΔD98% = +0.3 Gy]. D2% was significantly increased for PTV60 [IA only, ΔD2% = +0.4 Gy] and PTV48 [IB only, ΔD2% = +0.6 Gy], and decreased for PTV57.5 [IA only, ΔD2% = -0.3 Gy]. Worthy of note was the reduction in the variation of HI across all study patients when planning with VMATAuto, which was observed for all PTVs across both institutions (Fig. 3). In terms of conformality, VMATAuto led to moderate reductions in CI for IA [ΔCIPTV57.5 = -0.035, ΔCIPTV48 = -0.039] and IB [ΔCIPTV60 = -0.035, ΔCIPTV57.5 = -0.019]. This degradation was attributed to a higher prioritisation being placed on rectum dose reduction during calibration when compared with VMATClinical.

Upon blind review all plans were considered clinically acceptable. For IA there was a clear preference towards VMATAuto with 90% considered superior to VMATClinical. For IB this percentage dropped to 65% but the overall preference towards VMATAuto was maintained. Agreement between the oncologist and dosimetrist was very good with only one plan without a consensus decision. MU for VMATAuto was 12% and 15% higher than VMATClinical for IA and IB respectively. This increase was not of concern to either institution.

Discussion

In this study a PBAIO automated solution with a novel multi-dimensional Pareto navigation calibration methodology has been evaluated for prostate cancer in a multi-centre context. Results from the study demonstrated a clear clinical preference towards VMATAuto and provide supportive evidence for both the calibration method and the underlying PBAIO framework that together form the PGAP solution.

This work builds upon the previous single institution study (performed at VCC [24]) in three key ways. Firstly, the updated calibration interface enabled multi-dimensional Pareto navigation, whereas the initial study was limited to a single dimensional proof of principle approach. This new method was fully congruent with the principles of Pareto navigation; enabling intuitive exploration of multiple competing trade-offs simultaneously. Secondly, the previous study provided no demonstration of the utility of PGAP; only presenting comparison of a single calibrated automated solution against manual planning. In this work a clear presentation of how different calibration choices propagate to novel patients via the PBAIO framework is provided (Fig. 2). Finally, a key challenge of any automated solution is demonstrating adaptability to the clinical requirements, techniques, and delivery machines of differing institutions. This study provides clear evidence that PGAP is a versatile solution, which can be successfully translated to independent external centres. Furthermore, with the vast majority of published studies being single institutional [1], this work helps to strengthen the evidence base on multi-institutional validations of automated solutions.

Within the literature there are limited examples on the utilisation of Pareto navigation to calibrate AP solutions and to our knowledge this work presents the first example where Pareto navigation is incorporated natively into the calibration process. The most relevant example is for KBP, where Pareto navigation was utilised by Miguel-Chumacero et al. [27] and Wall et al. [28] to improve the quality of the training dataset for head and neck, and prostate cancer respectively. This led to substantial reductions in OAR doses compared to a KBP model trained on the original manual planning based dataset. It is unclear if this is due to a conscious change in trade-off prioritisation or improving the optimality of the original manual plans. This approach, whilst promising, requires all training patients to be replanned, which is time consuming and presents a key barrier for practical implementation in the clinic. This is especially true for state-of-the-art dose distribution prediction solutions where training datasets are of the order of 100 patients [5]. In contrast the PGAP approach we developed can be calibrated through Pareto navigation on more limited patient datasets and is therefore ideal for rapid implementation of novel protocols or changes to clinical priorities due to emerging evidence.

The process of effective calibration is non-trivial; it requires an assessment of not only the clinical acceptability of a given calibration, but also the rate of change of competing dose metrics as the balancing of parameters is adjusted. For example, a detriment in CI of 0.05 may be acceptable if rectum Dmean reduces by 0.5 Gy but unacceptable for a 0.05 Gy reduction. It is our view that Pareto navigation is currently the only method that provides the operator with live access to this key information when calibrating an automated solution (via both the DVH and whole 3D dose distribution) and offers a clear alternative to machine learning and trial-and-error. Figure 1 illustrates the benefits of this approach, demonstrating how different treatment options can be interactively explored to identify the solution which best aligns with clinical preferences of the institution.

Successful PGAP implementation requires trade-off balancing of novel patients to be consistent with that selected during calibration. In our implementation, this function was fulfilled through building the solution on a PBAIO framework. This study provides evidence supporting this approach, firstly by demonstrating how trade-off balancing during calibration propagates effectively to novel patients (Fig. 2) and secondly through results of the blind review, which showed that PGAP yielded plans of high congruence with the institutions’ clinical preferences. Importantly, it is our view that a broad spectrum of PBAIO and CHO solutions presented in the literature also fulfil this requirement and therefore could benefit from integration of Pareto navigation into their calibration process.

The implemented approach does have limitations. Firstly, sampling the Pareto front using a simple exhaustive approach (plans generated for all parameter permutations) was computationally expensive and limited the practical number of Pareto dimensions per navigation to ≈ 4. Whilst in this study this was not considered a significant constraint, as many trade-offs were observed to be uncorrelated (e.g. CIPTV48 and rectum Dmax), it reduced the efficiency and elegance of the calibration process. Utilisation of more sophisticated sampling strategies [29] to reduce the computational burden would help increase the number of dimensions possible per navigation. Secondly, as is the case with all CHO and PBAIO solutions presented in the literature, a single AutoPlan Protocol was used across all study patients. Whilst resultant plans were on average superior to VMATClinical, utilisation of a single AutoPlan protocol assumes the clinically optimum balancing of competing trade-offs is consistent across individual patients, which may not be the case. It is recommended that further work evaluating per patient Pareto navigation vs. AP should be performed to explore the validity of this assumption.

In terms of the multi-centre evaluation a key observation during calibration was that, whilst the Pareto navigation interface enabled navigation of a wide range of differing trade-off options (Fig. 1), a solution which aligned reasonably closely to local clinical practice in terms of HIPTV60, CIPTV48 and modulation was selected by each institution. This was at the expense of further potential reductions in rectum DMean and reflected the institutions’ measured and proportional caution in selecting a solution, which if implemented would substantially change not only the planning method (automated from manual) but also the plan distribution and modulation for the whole treatment site. This trade-off prioritisation differed to VCC (where rectum DMean is prioritised over HIPTV60 & CIPTV48) and highlighted the importance of AP solutions having the functionality to allow full customisation of protocols to suit local requirements such that potential implementation barriers can be reduced.

As with the previously reported single institutional study of PGAP, this multicentre evaluation demonstrates superiority of automated planning over manual planning, both in terms of reduced rectum doses and clinical preference. This superiority was attributed to the improved alignment of trade-off balancing with clinical preference (particularly for CI vs. rectum Dmean), and the PBAIO framework dynamically adjusting objectives to drive plans towards Pareto optimality. For IA, reductions in rectum Dmean were more substantial than IB (3.7 Gy vs. 1.8 Gy respectively) due to their increased prioritisation of CIPTV48 for VMATClinical. This prioritisation was not congruent with the institution’s clinical preferences and was reflected in 90% of VMATAuto plans being preferred to VMATClinical (compared to 65% for IB). Results (Fig. 3; Table 3) also highlighted a wide variation in the differences between VMATAuto and VMATClinical both at an inter-patient and inter-institutional level. This was attributed to the inconsistencies associated with manual planning that have been widely reported in the literature [17, 30]. In comparison to a similar study [31] that evaluated a CHO approach across 4 institutions for prostate cancer our results are aligned, with that work also demonstrating overall superiority of VMATAuto, with a median reduction in rectum DMean of 3.4 Gy (range [-4,12] Gy) as compared to 2.8 Gy (range [-1,7] Gy) in this study. Whilst direct comparison of the two approaches (PGAP/PBAIO vs. CHO) is not appropriate due to confounding factors such as differing planning systems, clinical protocols and the underlying quality of the manual comparators, this alignment adds strength to the findings by both authors that: (1) wide variations in the differences between VMATAuto and VMATClinical are suggestive of inconsistencies in manual planning; and (2) AP solutions that seek Pareto optimality can yield substantial improvements in plan quality.

Finally, an interesting and unexpected outcome from this study was that once presented with results from both institutions, IA adapted their manual planning practice to align closer with clinical preferences (i.e. prioritise rectum at the expense of CIPTV48). This led to a sustained reduction in rectum doses for clinical patients and highlighted the potential in utilising AP for cross-institutional audits to improve practice.

Conclusions

A novel PGAP solution has been successfully validated against clinical practice for two external institutions. The multi-dimensional Pareto navigation calibration methodology enabled intuitive adaptation of automated protocols to an institution’s individual planning aims without the requirement of large training datasets. Automated plans were more congruent with the institutions’ clinical preferences than manual plans and considered to represent a higher quality, more consistent and more efficient plan generation method.

Data availability

Evaluative study data is provided as a supplementary file. In addition, the datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

References

  1. Hussein M, Heijmen BJM, Verellen D, Nisbet A. Automation in intensity modulated radiotherapy treatment planning—a review of recent innovations. Br J Radiol. 2018;91:20180270. https://doi.org/10.1259/bjr.20180270.

  2. Yuan L, Ge Y, Lee WR, Yin FF, Kirkpatrick JP, Wu QJ. Quantitative analysis of the factors which affect the interpatient organ-at-risk dose sparing variation in IMRT plans. Med Phys. 2012;39:6868. https://doi.org/10.1118/1.4757927.

  3. Appenzoller LM, Michalski JM, Thorstad WL, Mutic S, Moore KL. Predicting dose-volume histograms for organs-at-risk in IMRT planning. Med Phys. 2012;39:7446–61. https://doi.org/10.1118/1.4761864.

  4. Babier A, Boutilier JJ, McNiven AL, Chan TCY. Knowledge-based automated planning for oropharyngeal cancer. Med Phys. 2018;45:2875–83. https://doi.org/10.1002/mp.12930.

  5. McIntosh C, Purdie TG. Voxel-based dose prediction with multi-patient atlas selection for automated radiotherapy treatment planning. Phys Med Biol. 2017;62:415–31. https://doi.org/10.1088/1361-6560/62/2/415.

  6. Fan J, Wang J, Chen Z, Hu C, Zhang Z, Hu W. Automatic treatment planning based on three-dimensional dose distribution predicted from deep learning technique. Med Phys. 2018. https://doi.org/10.1002/mp.13271.

  7. Breedveld S, Storchi PRM, Voet PWJ, Heijmen BJM. iCycle: integrated, multicriterial beam angle, and profile optimization for generation of coplanar and noncoplanar IMRT plans. Med Phys. 2012;39:951–63. https://doi.org/10.1118/1.3676689.

  8. Zarepisheh M, Hong L, Zhou Y, Hun Oh J, Mechalakos JG, Hunt MA, et al. Automated intensity modulated treatment planning: the expedited constrained hierarchical optimization (ECHO) system. Med Phys. 2019;46:2944–54. https://doi.org/10.1002/mp.13572.

  9. Xhaferllari I, Wong E, Bzdusek K, Lock M, Chen J. Automated IMRT planning with regional optimization using planning scripts. J Appl Clin Med Phys. 2013;14:4052. https://doi.org/10.1120/jacmp.v14i1.4052.

  10. Winkel D, Bol GH, van Asselen B, Hes J, Scholten V, Kerkmeijer LGW, et al. Development and clinical introduction of automated radiotherapy treatment planning for prostate cancer. Phys Med Biol. 2016;61:8587–95. https://doi.org/10.1088/1361-6560/61/24/8587.

  11. Guo C, Zhang P, Gui Z, Shu H, Zhai L, Xu J. Prescription value-based automatic optimization of importance factors in inverse planning. Technol Cancer Res Treat. 2019;18:1533033819892259. https://doi.org/10.1177/1533033819892259.

  12. Tol JP, Dahele M, Peltola J, Nord J, Slotman BJ, Verbakel WFAR. Automatic interactive optimization for volumetric modulated arc therapy planning. Radiat Oncol. 2015;10:75. https://doi.org/10.1186/s13014-015-0388-6.

  13. Zhang X, Li X, Quan EM, Pan X, Li Y. A methodology for automatic intensity-modulated radiation treatment planning for lung cancer. Phys Med Biol. 2011;56:3873–93. https://doi.org/10.1088/0031-9155/56/13/009.

  14. Ge Y, Wu QJ. Knowledge-based planning for intensity‐modulated radiation therapy: a review of data‐driven approaches. Med Phys. 2019;46:2760–75. https://doi.org/10.1002/mp.13526.

  15. Wang Y, Heijmen BJM, Petit SF. Knowledge-based dose prediction models for head and neck cancer are strongly affected by interorgan dependency and dataset inconsistency. Med Phys. 2019;46:934–43. https://doi.org/10.1002/mp.13316.

  16. Hussein M, South CP, Barry MA, Adams EJ, Jordan TJ, Stewart AJ, et al. Clinical validation and benchmarking of knowledge-based IMRT and VMAT treatment planning in pelvic anatomy. Radiother Oncol. 2016;120:473–9. https://doi.org/10.1016/j.radonc.2016.06.022.

  17. Nelms BE, Robinson G, Markham J, Velasco K, Boyd S, Narayan S, et al. Variation in external beam treatment plan quality: an inter-institutional study of planners and planning systems. Pract Radiat Oncol. 2012;2:296–305. https://doi.org/10.1016/j.prro.2011.11.012.

  18. Craft DL, Hong TS, Shih HA, Bortfeld TR. Improved planning time and plan quality through multicriteria optimization for intensity-modulated radiotherapy. Int J Radiat Oncol Biol Phys. 2012;82:83–90. https://doi.org/10.1016/j.ijrobp.2010.12.007.

  19. Thieke C, Küfer KH, Monz M, Scherrer A, Alonso F, Oelfke U, et al. A new concept for interactive radiotherapy planning with multicriteria optimization: first clinical evaluation. Radiother Oncol. 2007;85:292–8. https://doi.org/10.1016/j.radonc.2007.06.020.

  20. Craft D, Halabi T, Shih HA, Bortfeld T. An approach for practical multiobjective IMRT treatment planning. Int J Radiat Oncol Biol Phys. 2007;69:1600–7. https://doi.org/10.1016/j.ijrobp.2007.08.019.

  21. Kierkels RG, Visser R, Bijl HP, Langendijk JA, van ‘t Veld AA, Steenbakkers RJ, et al. Multicriteria optimization enables less experienced planners to efficiently produce high quality treatment plans in head and neck cancer radiotherapy. Radiat Oncol. 2015;10. https://doi.org/10.1186/s13014-015-0385-9.

  22. Xiao J, Li Y, Shi H, Chang T, Luo Y, Wang X, et al. Multi-criteria optimization achieves superior normal tissue sparing in intensity-modulated radiation therapy for oropharyngeal cancer patients. Oral Oncol. 2018;80:74–81. https://doi.org/10.1016/j.oraloncology.2018.03.020.

  23. Wheeler PA, Chu M, Holmes R, Smyth M, Maggs R, Spezi E, et al. Utilisation of Pareto navigation techniques to calibrate a fully automated radiotherapy treatment planning solution. Phys Imaging Radiat Oncol. 2019;10:41–8. https://doi.org/10.1016/j.phro.2019.04.005.

  24. Wheeler PA, Chu M, Holmes R, Woodley OW, Jones CS, Maggs R, et al. Evaluating the application of Pareto navigation guided automated radiotherapy treatment planning to prostate cancer. Radiother Oncol. 2019;141:220–6. https://doi.org/10.1016/j.radonc.2019.08.001.

  25. Dearnaley D, Syndikus I, Mossop H, Khoo V, Birtle A, Bloomfield D, et al. Conventional versus hypofractionated high-dose intensity-modulated radiotherapy for prostate cancer: 5-year outcomes of the randomised, non-inferiority, phase 3 CHHiP trial. Lancet Oncol. 2016;17:1047–60. https://doi.org/10.1016/S1470-2045(16)30102-4.

  26. Paddick I. A simple scoring ratio to index the conformity of radiosurgical treatment plans. J Neurosurg. 2000;93:219–22.

  27. Miguel-Chumacero E, Currie G, Johnston A, Currie S. Effectiveness of multi-criteria optimization-based trade-off exploration in combination with RapidPlan for head & neck radiotherapy planning. Radiat Oncol. 2018;13:1–13. https://doi.org/10.1186/s13014-018-1175-y.

  28. Wall PDH, Carver RL, Fontenot JD. An improved distance-to-dose correlation for predicting bladder and rectum dose-volumes in knowledge-based VMAT planning for prostate cancer. Phys Med Biol. 2018;63:15035. https://doi.org/10.1088/1361-6560/aa9a30.

  29. Craft DL, Halabi TF, Shih HA, Bortfeld TR. Approximating convex Pareto surfaces in multiobjective radiotherapy planning. Med Phys. 2006;33:3399–407. https://doi.org/10.1118/1.2335486.

  30. Moore KL, Schmidt R, Moiseenko V, Olsen LA, Tan J, Xiao Y, et al. Quantifying unnecessary normal tissue complication risks due to suboptimal planning: a secondary study of RTOG 0126. Int J Radiat Oncol Biol Phys. 2015;92:228–35. https://doi.org/10.1016/j.ijrobp.2015.01.046.

  31. Heijmen B, Voet P, Fransen D, Penninkhof J, Milder M, Akhiat H, et al. Fully automated, multi-criterial planning for Volumetric Modulated Arc Therapy – an international multi-center validation for prostate cancer. Radiother Oncol. 2018;128:343–8. https://doi.org/10.1016/j.radonc.2018.06.023.

Acknowledgements

Not applicable.

Funding

This research was funded by Velindre Cancer Centre’s Advancing Radiotherapy Fund.

Author information

Contributions

Conceptualisation, PW, RM, MC, DL, JS, ES, AM; Methodology PW, RM, MC, DL, JS, ES, AM; Software development PW, MC; Data curation, NSW, RP; PGAP calibration, NSW, RP, RAP, NW, BK, KR, PW; Blind Review, RAP, NW, BK, KR; Data analysis, PW; Writing manuscript - draft, PW; Writing manuscript - review and editing, All Authors; Supervision, DL, JS, ES, AM. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Philip A Wheeler.

Ethics declarations

Ethics approval and consent to participate

All work was performed on fully anonymised datasets in accordance with institutional good practice.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

13014_2024_2404_MOESM1_ESM.xlsx

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Wheeler, P.A., West, N.S., Powis, R. et al. Multi-institutional evaluation of a Pareto navigation guided automated radiotherapy planning solution for prostate cancer. Radiat Oncol 19, 45 (2024). https://doi.org/10.1186/s13014-024-02404-x

  • DOI: https://doi.org/10.1186/s13014-024-02404-x

Keywords