Clinical implementation of a knowledge based planning tool for prostate VMAT

Background: A knowledge based planning tool has been developed and implemented for prostate VMAT radiotherapy plans, providing a target average rectum dose value based on values previously achieved for similar rectum/PTV overlap. The purpose of this planning tool is to highlight sub-optimal clinical plans and to improve plan quality and consistency.
Methods: A historical cohort of 97 VMAT prostate plans was interrogated using a RayStation script and used to develop a local model for predicting optimum average rectum dose based on individual anatomy. A preliminary validation study was performed whereby historical plans identified as “optimal” and “sub-optimal” by the local model were replanned in a blinded study by four experienced planners and compared to the original clinical plan to assess whether any improvement in rectum dose was observed. The predictive model was then incorporated into a RayStation script and used as part of the clinical planning process. Planners were asked to use the script during planning to provide a patient specific prediction for optimum average rectum dose and to optimise the plan accordingly.
Results: Plans identified as “sub-optimal” in the validation study showed a statistically significant improvement in average rectum dose compared to the clinical plan when replanned, whereas plans identified as “optimal” showed no improvement when replanned. This provided confidence that the local model can identify plans that are sub-optimal in terms of rectal sparing. Clinical implementation of the knowledge based planning tool reduced the population-averaged mean rectum dose by 5.6Gy. There was a small but statistically significant increase in total MU and femoral head dose and a reduction in conformity index. These did not affect the clinical acceptability of the plans and no significant changes to other plan quality metrics were observed.
Conclusions: The knowledge based planning tool has enabled substantial reductions in population-averaged mean rectum dose for prostate VMAT patients. This suggests plans are improved when planners receive quantitative feedback on plan quality against historical data.


Background
Volumetric Modulated Arc Therapy (VMAT) is a popular method of radiotherapy treatment delivery that enables high doses of radiation to be shaped to the planning target volume (PTV) more closely than conventional conformal radiotherapy techniques. VMAT plans are inherently complex and are typically produced in commercially available treatment planning systems via a user-informed inverse-planning process.
An optimum solution would deliver the full prescription dose to the PTV whilst delivering the lowest possible dose to surrounding organ at risk (OAR) structures. However, the quality of a VMAT plan is ultimately influenced by numerous factors including the number of target structures and nearby OARs, the patient anatomy, planner experience and skill, and the time available to produce the plan.
The RayStation clinical database at Worcester contains a record of the plan solutions produced locally for all historical patients. For a given patient encountered in the clinic it is likely that there already exists a patient with a similar geometric distribution of target and OAR structures. Furthermore, clinically acceptable plans exist for these patients creating a knowledge base which could be harnessed to inform users of achievable plan quality during future plan optimisations and to potentially drive improvements in plan quality over time.
A variety of knowledge based planning tools are described in the literature with varying scope and complexity [1][2][3][4][5][6][7][8]. Moore et al. and Wu et al. independently reported that the level of achievable OAR dose sparing is related to the geometric arrangement of the OAR relative to the PTV [1,2], and specifically to the extent of overlap between the OAR and PTV. Moore et al. [1] assessed the relationship between mean OAR dose (D_mean) and the volume of OAR overlapping a PTV (V_ovr) for a knowledge base of historical patients. They constructed a mathematical model (Equation 1) for predicting D_mean from the fractional OAR-PTV overlap (V_ovr/V_OAR), in which A, B and C are coefficients selected to represent the optimal plans (in terms of OAR dose) in the knowledge base. The mathematical model was incorporated into a script in the planning system which presented the planner with a predicted value of D_mean for the chosen patient. The planner was then expected to optimise the plan in order to achieve an OAR dose less than or equal to the predicted D_mean. This approach was found to lower the risk of plans being produced with sub-optimal OAR average dose and to reduce variation between planners.
Wu et al. investigated OAR and PTV overlap via the more sophisticated concept of the overlap volume histogram (OVH) [2]. The OVH was used to identify patients in a historical database with similar geometric relationships between the PTV and surrounding OAR, whose dose volume histograms (DVHs) were then used to guide the plan optimisation process. Appenzoller et al. [3] built upon the work of Moore et al. to develop a mathematical model for predicting the achievable DVH for an OAR using the correlation between the expected dose at a point and the proximity of that point to the PTV. The predicted DVH curves were successfully used to guide the plan optimisation process and to identify outlier sub-optimal plans. Schreibmann et al. [4] proposed a "feature-selection" search method that identified cases in a historical database of VMAT plans with similar anatomy to the current patient. Once identified, the plan configuration and DVH statistics of the similar plan were utilised as a starting point for the new plan to aid and speed up the plan optimisation process. Nwankwo et al. [5] developed a knowledge based planning algorithm that predicts the 3D dose distribution of an OAR based upon its proximity to the PTV.
This paper describes the implementation of the method developed by Moore et al. at the Worcestershire Oncology Centre using the RayStation treatment planning system. It establishes a locally relevant predictive model, validates its use in removing sub-optimal plans and describes the impact of a controlled implementation. The Moore et al. model was chosen from the aforementioned knowledge based planning solutions in the literature due to its demonstrated potential for significant clinical benefit whilst being relatively simple to implement locally within the RayStation scripting interface.
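Equation 1 is not reproduced in the text above. As a hedged reconstruction only, assuming the saturating-exponential form attributed to Moore et al. [1] (this form is an assumption here and should be checked against the original reference), the model can be written as:

```latex
\begin{equation}
D_{\mathrm{mean}} = D_{\mathrm{Px}}
  \left[ A + B \left( 1 - e^{\, C \, V_{\mathrm{ovr}} / V_{\mathrm{OAR}}} \right) \right]
\end{equation}
```

Under this assumed form, a negative C gives a predicted dose that rises from A·D_Px at zero overlap and saturates towards (A + B)·D_Px as the fractional overlap increases.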

Methods
This work has focussed upon external beam radiotherapy of prostate patients at the Worcestershire Oncology Centre (WOC) which is delivered according to the CHHiP trial (CRUK/06/016) protocol. Patients are planned with three concentric target volumes PTV3, PTV2 and PTV1 prescribed 74Gy, 71Gy and 59.2Gy in 37 fractions respectively. Patients are separated into two categories depending upon whether they present a high or low risk of seminal-vesicle involvement. For high risk (HR) patients PTV3 is equal to the prostate outline plus a 5 mm isotropic margin (0 mm posteriorly), PTV2 is the prostate and seminal vesicles combined plus a 10 mm isotropic margin (5 mm posteriorly) and PTV1 is the prostate and seminal vesicles combined plus a 10 mm isotropic margin. For low risk (LR) patients the PTV structures are grown in the same manner as described for HR however PTV2 is grown from the prostate outline only.
Method A: forming a local knowledge base for prostate planning
A script (henceforth referred to as the data-mining script) was produced within the RayStation scripting interface to interrogate a historic cohort of 97 prostate patients planned and treated between February 2015 and February 2016 at WOC. All patients were treated with single arc 6MV photon VMAT plans, received no nodal irradiation and had no artificial hip implants. The patients were distributed evenly between the disease-risk sub-cohorts with 49 patients designated as HR and 48 as LR.
For each patient the script extracted or calculated a variety of data for analysis including: ROI volumes, PTV-OAR overlap volumes, OAR mean doses (including average rectum dose), DVH data for PTV and OAR structures and plan quality/complexity metrics such as conformity index and total plan MU. In this work the conformity index (CI) is defined as the ratio between the volume of a target covered by a user specified isodose and the total volume covered by the isodose.
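The conformity index definition above can be illustrated with a minimal sketch. It assumes the target and the isodose region are each represented as a collection of voxel indices on a common grid of equal-sized voxels; the function name and this representation are illustrative and are not taken from the data-mining script or the RayStation scripting interface.

```python
def conformity_index(target_voxels, isodose_voxels):
    """CI = (target volume covered by the isodose) / (total volume covered by the isodose).

    Both arguments are iterables of voxel indices; equal voxel volumes are
    assumed, so a ratio of voxel counts equals a ratio of volumes.
    """
    target = set(target_voxels)
    isodose = set(isodose_voxels)
    if not isodose:
        raise ValueError("isodose volume is empty")
    return len(target & isodose) / len(isodose)


# Hypothetical example: half of the isodose volume lies inside the target.
ci = conformity_index(range(0, 10), range(5, 15))
print(f"CI = {ci:.2f}")
```

With this definition a CI of 1 means the isodose lies entirely within the target, and lower values indicate dose spilling outside it.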
Using the data extracted by the script, the ratio of the average rectum dose (D_mean,rectum) to the primary prescription dose (D_Px) of 74Gy to PTV3, and the ratio of the volume of rectum overlapping PTV1 (V_ovr) to the overall volume of the rectum (V_rec), were calculated for each patient. To assess whether the trend between average rectum dose and geometric OAR-PTV overlap reported by Moore et al. was evident in the historic cohort, D_mean,rectum/D_Px was plotted against V_ovr/V_rec. The coefficients in Equation 1 were then adjusted to fit the Moore et al. mathematical model approximately to the local plot of D_mean,rectum/D_Px versus V_ovr/V_rec. This was done twice: firstly so that the model followed the lower bound of the local data, representing the "optimal average rectum dose" (OARD) achieved in the historic cohort, and secondly so that the model passed through the middle of the historic cohort, representing the "median average rectum dose" (MARD).

Method B: validation of local model
If the OARD is a valid metric for identifying plans with "sub-optimal" rectal sparing (i.e. D_mean,rectum ≫ OARD), it is hypothesised that re-planning those patients would yield an improvement in rectal sparing. Similarly, re-planning those patients with "optimal" original clinical plans (i.e. D_mean,rectum ≈ OARD) would yield little or no change in rectal sparing.
Moore et al. defined the relative model excess, δ, to quantify the difference between the achieved mean rectal dose and the predicted value (D_pred), expressed relative to the prediction (Equation 2). This metric was chosen as it is insensitive to absolute values in dose or overlap volume, allowing comparison of plans across different sites and prescriptions [1]. A planning study was devised to test whether the OARD and MARD models could be used to assess if a plan was "optimal" (compared to the historical knowledge base) in terms of rectal sparing utilising the metric δ.
For each patient in the historic cohort δ was calculated using Equation 2, where D_pred was calculated using the local OARD model introduced in Method A: Forming a Local Knowledge Base for Prostate Planning. Ten patients were selected from the cohort: five patients with a high δ-value (where D_mean,rectum ≫ OARD, implying that the original clinical plans were "sub-optimal" in terms of rectum dose) and five patients with δ ≈ 0 (where D_mean,rectum ≈ OARD, implying that the original clinical plans were close to the best achieved in the historical cohort). Prior to the study the ten plans were reviewed by an independent experienced planner to check that there were no mitigating circumstances explaining why each plan might exhibit a high rectum dose.
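Equation 2 is not reproduced in the text above. The sketch below assumes the definition δ = (D_mean − D_pred)/D_pred, which matches the description of a relative excess over the prediction but should be checked against Moore et al. [1]. The selection helper is one possible way of picking the two groups; the actual selection procedure is not specified in the text, and all names and patient identifiers are hypothetical.

```python
def relative_model_excess(d_mean_gy, d_pred_gy):
    """Assumed form of Equation 2: (achieved - predicted) / predicted."""
    if d_pred_gy <= 0:
        raise ValueError("predicted dose must be positive")
    return (d_mean_gy - d_pred_gy) / d_pred_gy


def select_for_replanning(deltas, n=5):
    """Return (ids with the n highest deltas, ids of the n remaining closest to zero).

    `deltas` maps a patient identifier to its delta value.
    """
    high = sorted(deltas, key=deltas.get, reverse=True)[:n]
    remaining = [pid for pid in deltas if pid not in high]
    near_zero = sorted(remaining, key=lambda pid: abs(deltas[pid]))[:n]
    return high, near_zero


# Hypothetical values: achieved 35.2 Gy against a predicted 32.0 Gy.
delta = relative_model_excess(35.2, 32.0)
print(f"delta = {delta:.2f}")
```

Because δ is a ratio, the same threshold can in principle be applied to plans with different prescriptions, which is the property cited as the reason for choosing the metric.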
The patients were anonymised and four experienced planners were each asked to produce a new plan for each patient. The planners were provided with additional planning goals to ensure that the new plans maintained similar levels of PTV coverage and non-rectum OAR sparing compared to the original clinical plans. These additional goals included D99 targets for PTV1, PTV2, and PTV3, mean dose and D10 targets for the bladder and low dose conformity indices. Additionally, planners were asked to achieve rectal sparing where they felt it was achievable but were not informed as to which plans were predicted to have scope for rectal dose reduction.
The relative model excess, δ, was calculated for each replan. The average δ across the four planners was then calculated for each patient and compared to δ for the original clinical plan to assess whether the plan average rectal dose had improved.

Method C: implementation of local knowledge based planning tool
A RayStation script, henceforth referred to as the knowledge based planning (KBP) script, was developed for clinical implementation utilising the mathematical models for predicting OARD and MARD established in Method A: Forming a Local Knowledge Base for Prostate Planning. Upon execution the KBP script performs the following tasks:
1. Determines the volume of the rectum ROI that overlaps the PTV1 target and calculates V_ovr/V_rec.
2. Calculates the corresponding MARD and OARD using the mathematical models established in Method A: Forming a Local Knowledge Base for Prostate Planning.
3. Displays the prescription dose, fractionation, V_ovr/V_rec, current plan average rectum dose, MARD and OARD in a graphical user interface.
Once the user has acknowledged the results graphical user interface (GUI), a message box is created advising the user that the current average rectum dose is "not acceptable", "acceptable" or "optimal" depending upon whether D_mean,rectum > MARD, MARD > D_mean,rectum > OARD or D_mean,rectum ≤ OARD respectively. Preliminary testing suggested that using the KBP script could lead to greater rectum sparing at the cost of an increase in plan MU, a slight reduction in PTV1 coverage (but still within the clinical goal) and a reduction in low dose conformity with an associated increase in femoral head dose (but again still within the clinical goal).
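The three-way message logic described above can be sketched as follows. This is an illustration only and not the KBP script itself: the function name is hypothetical, no RayStation scripting calls are used, and the case D_mean,rectum = MARD (which the text leaves undefined) is assumed here to be "not acceptable" because planners were asked for a dose lower than the MARD.

```python
def classify_rectum_dose(d_mean_gy, mard_gy, oard_gy):
    """Classify the current average rectum dose against the MARD and OARD predictions."""
    if oard_gy > mard_gy:
        raise ValueError("OARD is expected to be no greater than MARD")
    if d_mean_gy <= oard_gy:
        return "optimal"
    if d_mean_gy < mard_gy:
        return "acceptable"
    # Assumption: a dose exactly equal to the MARD is treated as not acceptable.
    return "not acceptable"


# Hypothetical predictions of 35.7 Gy (MARD) and 32.0 Gy (OARD).
for dose in (31.0, 34.0, 38.0):
    print(dose, classify_rectum_dose(dose, mard_gy=35.7, oard_gy=32.0))
```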
Prior to implementing the KBP script clinically the benefits and potential costs were raised with local clinicians. The clinicians were willing to accept the aforementioned costs in order to obtain reductions in average rectum dose.
The KBP script was implemented clinically, with planners asked to ensure that the current plan average rectum dose was at least lower than the MARD and preferably equal to or lower than the OARD presented in the KBP script GUI. The data extracted from the data-mining script (see Method A: Forming a Local Knowledge Base for Prostate Planning) were used to determine values for PTV3 coverage, low dose conformity index and total plan MU that were routinely achieved in the historical cohort and were desirable in all future plans. As a precaution, these values were added as additional clinical goals to be met during the planning process in order to reduce the risk of the KBP script generating unforeseen changes in practice or plan quality.

Results
Results A: forming a local knowledge base for prostate planning
Figure 1 displays D_mean,rectum/D_Px plotted against V_ovr/V_rec for the WOC prostate historical cohort. The data present a clear trend, with average rectum dose increasing as the fractional overlap of the rectum and PTV1 increases, which is consistent with the trend observed by Moore et al. The data display no significant variation between the HR (solid data points) and LR (hollow data points) patient cohorts, and throughout the remainder of this report the stratification according to disease risk is ignored (i.e. the data are treated as a single cohort).
The dotted curve is the mathematical fit reported by Moore et al. The solid curve represents the optimal (OARD) fit to the entire historical cohort described by equation 1 using coefficients A = 0.33, B = 0.5 and C = −2.3. The dashed curve represents the median (MARD) fit to the entire historical cohort described by equation 1 using coefficients A = 0.38, B = 0.5 and C = −2.3.
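To show how the reported coefficients translate into predicted doses, the sketch below evaluates the OARD and MARD curves for a given fractional overlap. Equation 1 is not reproduced in this text, so the functional form D_Px·(A + B·(1 − exp(C·V_ovr/V_rec))) is an assumption based on the model attributed to Moore et al.; the function names and the example volumes are hypothetical.

```python
import math

# Coefficients (A, B, C) as reported above for the local fits.
OARD_COEFFS = (0.33, 0.5, -2.3)
MARD_COEFFS = (0.38, 0.5, -2.3)


def fractional_overlap(v_ovr_cc, v_rec_cc):
    """Volume of rectum overlapping PTV1 as a fraction of total rectum volume."""
    if v_rec_cc <= 0:
        raise ValueError("rectum volume must be positive")
    return v_ovr_cc / v_rec_cc


def predicted_mean_dose(overlap, prescription_gy, coeffs):
    """Assumed model form: D_Px * (A + B * (1 - exp(C * overlap)))."""
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("fractional overlap must lie in [0, 1]")
    a, b, c = coeffs
    return prescription_gy * (a + b * (1.0 - math.exp(c * overlap)))


# Hypothetical patient: 6 cc of a 60 cc rectum overlaps PTV1, prescription 74 Gy.
x = fractional_overlap(6.0, 60.0)
oard = predicted_mean_dose(x, 74.0, OARD_COEFFS)
mard = predicted_mean_dose(x, 74.0, MARD_COEFFS)
print(f"overlap = {x:.2f}, OARD = {oard:.1f} Gy, MARD = {mard:.1f} Gy")
```

Because the two fits differ only in A, the assumed form places the MARD curve a constant 0.05·D_Px (3.7Gy for a 74Gy prescription) above the OARD curve at every overlap value.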

Results B: preliminary testing of local model
The average change in δ was −0.08 (range −0.12 to −0.01) for the high-δ_clin cohort and −0.02 (range −0.06 to 0.00) for the low-δ_clin cohort. Applying a single-tailed paired t-test, the high-δ_clin cohort exhibits a statistically significant reduction in δ between the original clinical plans and the average of the replans (p = 0.009). This indicates that when re-planned these patients showed an overall reduction in mean rectum dose relative to the original clinical plan, so that the planned average dose was closer to the OARD prediction. In contrast, the low-δ_clin cohort does not exhibit a statistically significant change in δ (p = 0.201). This suggests that when replanned these patients showed only a very small change in mean rectum dose compared to the original clinical plan.
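For readers unfamiliar with the paired test used above, the following minimal sketch computes the paired t statistic using only the Python standard library. The δ values are invented for illustration and are not the study data; the function name is hypothetical.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for paired samples, using after - before."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for b, a in zip(before, after)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation / sqrt(n)).
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / std_err, len(diffs) - 1


# Hypothetical delta values for five patients (original plan vs. mean of replans).
original = [0.20, 0.18, 0.25, 0.22, 0.19]
replanned = [0.10, 0.12, 0.13, 0.15, 0.18]
t_stat, dof = paired_t_statistic(original, replanned)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

A single-tailed p-value then follows from the Student t distribution with n − 1 degrees of freedom; for five pairs the one-tailed 5% critical value is |t| ≈ 2.13.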
Whilst the preliminary study only contained five patients in each cohort (limiting its statistical power), overall the results support the hypothesis that the OARD can be used to identify plans which are "sub-optimal" in terms of average rectum dose. Despite the limited statistical power of this test, it provided sufficient confidence to proceed with the clinical implementation described in Method C. The dose to the femoral heads was not actively controlled during the study; however, there was no systematic increase in femoral head D50 observed when comparing the replans to the original clinical plans.
Fig. 1 Average dose to the rectum normalised to the prescription dose (D_mean,rectum/D_Px) plotted against the volume of the rectum overlapping PTV1 expressed as a fraction of total rectum volume (V_ovr/V_rec). Data is plotted for high risk (solid) and low risk (open) patients. The dashed and solid curves represent the local MARD and OARD models respectively whereas the dotted curve represents the model of Moore et al. [1]
Results C: implementation of local knowledge based planning tool
For the historical cohort (pre-KBP script implementation) the patients are distributed evenly around the MARD curve and the mean δ was 0.11 ± 0.08. Following implementation of the KBP script the average rectum dose is reduced substantially, so that patients are distributed around the OARD curve and every plan exhibits D_mean,rectum < MARD. Post-script implementation the mean δ was −0.03 ± 0.06. Table 1 displays mean values for a variety of plan statistics for the patient plans produced pre- and post-KBP script implementation, and p-values (calculated using a two-sample, two-tailed t-test assuming unequal variances) indicating whether the differences in the distributions of plan statistics pre- and post-script implementation are statistically significant (assuming a significance level of p < 0.05).
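The two-sample t-test assuming unequal variances (often called Welch's t-test) mentioned above can be sketched with the standard library as follows. The sample values are placeholders and not data from Table 1; the function name is hypothetical.

```python
import math
import statistics


def welch_t_test(sample_a, sample_b):
    """Return (t, degrees of freedom) for a two-sample t-test with unequal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least 2 values")
    # Variance of each sample mean.
    var_a = statistics.variance(sample_a) / n_a
    var_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, dof


# Placeholder samples of unequal size and spread.
t_stat, dof = welch_t_test([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0])
print(f"t = {t_stat:.3f}, dof = {dof:.2f}")
```

The two-tailed p-value is then obtained from the Student t distribution with the (generally non-integer) degrees of freedom returned here.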

Discussion
All plans produced post-script implementation met all clinical goals and were considered clinically acceptable following the local plan checking and approval process. Introduction of the KBP script reduced the population-averaged mean rectum dose from 41.6Gy in the historic cohort to 36.0Gy in the post-KBP script implementation cohort (see Table 1). A more robust method for comparison that considers variation in dose with PTV-rectum overlap is to examine the change in the curve in Fig. 2 following implementation of the KBP tool. A new MARD curve fitted to the post-KBP data was found to have D_mean,rectum/D_Px on average 0.05 lower, corresponding to a reduction in D_rec of 3.7Gy for D_Px = 74Gy. This change corresponds to the data being approximately equivalent to the OARD curve established pre-KBP implementation, suggesting that the post-implementation median level of rectal sparing has now converged on the previously optimal practice.
Similarly, individual cases in the historical cohort are positioned well above the original MARD curve, implying that rectum dose was sub-optimal for these patients. Comparison of the curves describing the outlying plans before and after implementation of the script showed that D_mean,rectum/D_Px reduced by an average of 0.07, corresponding to a change in D_rec of 5.2Gy for D_Px = 74Gy. This means that post-implementation the least optimal plans are now of similar quality to the median levels achieved prior to implementation, which might be expected as the planners were always asked to obtain the median expected value or better.
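The dose equivalents quoted in the two paragraphs above follow directly from scaling the change in the normalised dose by the 74Gy prescription:

```latex
\begin{align*}
\Delta D_{\mathrm{rec}} &= 0.05 \times 74\,\mathrm{Gy} = 3.7\,\mathrm{Gy}
  && \text{(shift in the median curve)} \\
\Delta D_{\mathrm{rec}} &= 0.07 \times 74\,\mathrm{Gy} \approx 5.2\,\mathrm{Gy}
  && \text{(shift in the outlying plans)}
\end{align*}
```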
From Fig. 1 it can be seen that whilst the general trend reported by Moore et al. (i.e. increased overlap results in increased OAR dose) is observed in the historical cohort, the local trend is much shallower than the Moore et al. model. This suggests that historically local patients exhibiting a small overlap between rectum and PTV1 were being planned with doses only moderately lower than patients with much larger overlap volumes, which in turn suggests that the planner stopped optimising the plan too soon in these instances. Post-script implementation (see Fig. 2) average rectum doses are reduced overall, but for a few patients exhibiting lower overlap volumes (V_ovr/V_rec ≤ 0.2) the reduction in average rectum dose was more pronounced. If a new curve were drawn to represent the optimum average rectum dose post-script implementation, it would be much steeper and closer to that proposed by Moore et al. This indicates that introducing the script has increased the steepness of the local trend by encouraging planners to continue optimising plans where further gains are most achievable. This represents a potentially significant gain in plan quality from this simple application; however, further data are required to confirm the steeper trend post-script implementation.
In order to assess the impact of this tool on clinical practice the data-mining script was used to extract plan statistics from the post-KBP script implementation cohort, which were compared to the same results from the historical cohort (see Table 1). The changes in D99 and D1 for PTV3, D99 for PTV2 and D99 for PTV1 following implementation of the script were all less than 0.5Gy. These changes are not statistically significant (applying a t-test with 0.05 significance level), implying that target coverage has been unaffected by the KBP script. The average plan MU increased post-script implementation from 427MU to 466MU, indicating that the script instigated a statistically significant increase in plan complexity. This result was predicted, as improving rectal sparing whilst maintaining other aspects of plan quality can only be achieved through greater plan modulation. Whilst statistically significant, this increase in plan MU is modest and well below the upper threshold of 600MU applied as part of this study. The average mean bladder dose reduced slightly, by 0.9Gy, following script implementation. This small reduction in dose is not statistically significant but confirms that the script-induced increase in rectal sparing has not generated an unforeseen increase in bladder dose. Average bowel dose has decreased by 3.8Gy post-script implementation. This decrease is not statistically significant but indicates that the script has introduced a desirable trend for lower bowel doses. This is likely a consequence of increased dose conformity at the posterior surface of the PTV structures created whilst attempting to spare the rectum. Whilst the dose to the aforementioned OAR structures has remained stable or reduced post-script implementation, the dose to both femoral heads has increased. This is a direct result of the script encouraging planners to push dose away from the rectum, resulting in lateral low dose spread towards the femoral heads. The increase in lateral dose spread is also observed as a statistically significant reduction in CI (see Table 1). Both the increase in femoral head dose and lateral dose spread were foreseen prior to script implementation, and the risk was accepted by local clinicians and controlled via plan quality constraints. Furthermore, the average femoral head dose and CI values are well within accepted clinical tolerances and are therefore of low concern.
Whilst the change in femoral head dose is not of clinical concern, since it tends to fall well within clinical tolerance, it does raise a question about the nature of the plan improvements induced by the KBP tool. The pre-implementation study revealed no apparent change in femoral head dose compared to the original clinical plan when re-planned for both the "sub-optimal" and "optimal" cohorts; however, the sub-optimal cohort demonstrated a benefit in rectum dose. This implies that the KBP metric was able to identify plans that were legitimately sub-optimal (i.e. not on the Pareto optimal front). However, when actively employing the tool to guide planners an increase in femoral head dose is observed relative to the pre-KBP population. This perhaps indicates that, for some plans at least, the KBP tool has induced a shift along the Pareto front rather than shifting a sub-optimal plan onto the Pareto front. This may especially be true for those patients where the mean rectum dose is significantly below OARD. The RayStation multi-criteria optimisation (MCO) module allows users to investigate the trade-offs whilst moving across the Pareto front and should enable the production of Pareto optimal plans [9,10]. This approach is different to KBP in that it does not directly rely upon previous local knowledge to produce optimal plans. MCO planning, however, does require user-specified trade-off optimisation functions and constraints in order to produce an initial collection of Pareto plans. These plans are then examined and explored by the user in order to produce the final clinical plan. The quality of the final optimal plan is therefore heavily reliant upon the initial user-specified trade-off functions. It is therefore important that thorough commissioning work is performed to ascertain the most suitable initial set of trade-off functions for a given clinical site in order to ensure that a truly Pareto optimal final plan is produced.
The MCO planning module has not been commissioned for clinical use locally; however, the KBP tool presented in this work could be used to assess whether plans generated by MCO are, in terms of average rectum dose, comparable to (or better than) those produced manually, acting as a valuable aid during the commissioning process. The KBP tool presented here can therefore be used as an alternative to MCO, or to complement MCO as a quality assurance (QA) check of final plan quality.
The reduction in population-averaged mean rectum dose induced by implementation of the KBP script is primarily achieved via a reduction in the amount of moderate to low dose (i.e. ≤50Gy) delivered to the rectum. This is illustrated in Table 1, where the average V30, V40 and V50 (expressed as a fraction of total rectum volume) are statistically significantly lower for the population of plans produced post-script implementation compared to the historical cohort. In contrast, the average V60, V65 and V70 are approximately the same for the post-script and historical cohorts. This result is to be expected, as the volume of rectum receiving the highest doses will be the region overlapping the PTV and therefore achievable dose reduction to this region is more limited without compromising target coverage. One possible concern with driving the optimisation of a plan using the KBP tool is that dose may increase elsewhere, particularly around the rectum. However, the dose within 1 cm of the rectum did not increase as a function of δ-values, suggesting that plans optimised using the KBP tool remain robust to intrafractional rectum changes.
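As an illustration of the Vx metrics discussed above, the sketch below computes the fraction of a structure receiving at least a given dose from a list of per-voxel doses. It assumes equal voxel volumes; the function name and dose values are hypothetical and the code does not reflect how the data-mining script obtains DVH data from RayStation.

```python
def volume_fraction_receiving(voxel_doses_gy, threshold_gy):
    """Fraction of voxels (equal volumes assumed) with dose >= threshold_gy."""
    doses = list(voxel_doses_gy)
    if not doses:
        raise ValueError("no voxel doses supplied")
    return sum(1 for dose in doses if dose >= threshold_gy) / len(doses)


# Hypothetical rectum voxel doses in Gy.
rectum_doses = [10, 25, 35, 45, 55, 65, 72, 20, 30, 40]
for level in (30, 40, 50, 60, 70):
    print(f"V{level} = {volume_fraction_receiving(rectum_doses, level):.1f}")
```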
Reports in the literature are mixed about the clinical benefits of reducing moderate-low dose exposure to the rectum. The QUANTEC review concluded that high dose limits (≥60Gy) were of greater significance in terms of rectal toxicity than lower doses (<60Gy) [11]. This conclusion is supported by Fiorino et al. [12], Tucker et al. [13] and Michalski et al. [14], whose studies found correlations between grade 2 rectal toxicity and high dose volumes only. However, there also exist a significant number of studies reporting that the extent of low/intermediate dose rectum exposure correlates with rectal toxicity. For example, Buettner et al. [15] investigated the shape of the dose distribution across the rectum and its correlation with toxicity, reporting a correlation between rectal bleeding and doses between 40 and 60Gy. Gulliford et al. [16] reported a reduction in the incidence of moderate/severe rectal toxicity with dose reduction across the whole DVH, concluding that lower doses are of clinical importance. Although investigating hypofractionated treatments, Kim et al. [17] reported a strong correlation between grade 2+ delayed rectal toxicity and the percentage of rectal wall circumference receiving low dose. There is therefore evidence to suggest that the reduction in low/moderate dose exposure to the rectum, driven by the introduction of the KBP script, has a real clinical benefit to the patient in terms of reduced risk of rectal toxicity. Follow-up work from this study will investigate the feasibility of making more limited gains by optimising the dose in the rectal-PTV overlap region.

Conclusions
RayStation has been used successfully to guide planners on the expected value of a key plan quality metric, leading to a significant reduction in population-averaged mean rectal dose, which may translate into reduced toxicity risk for patients. A shared knowledge base with other RayStation users would be desirable to allow centres to benchmark local plan quality against that achieved elsewhere and drive overall improvement.