[18]F-fluoroethyl-L-tyrosine positron emission tomography for radiotherapy target delineation: Results from a Radiation Oncology credentialing program

Highlights
• 19 Radiation Oncologists (ROs) were credentialed for [18]F-fluoroethyl-L-tyrosine (FET) positron emission tomography (PET) in Glioblastoma (FIG) study.
• ROs integrated FET PET and magnetic resonance imaging (MRI) data to derive hybrid target volumes on three cases.
• Across 10 FIG trial sites, the initial pass rate was 77.8 %. All resubmissions passed.
• Hybrid gross tumour volume had greater volume, spatial and boundary agreement than MRI-based gross tumour volume.

Background and purpose: The [18]F-fluoroethyl-L-tyrosine (FET) PET in Glioblastoma (FIG) study is an Australian prospective, multi-centre trial evaluating FET PET for newly diagnosed glioblastoma management. The Radiation Oncology credentialing program aimed to assess the feasibility of Radiation Oncologist (RO) derivation of standard-of-care target volumes (TV MR) and hybrid target volumes (TV MR+FET) incorporating pre-defined FET PET biological tumour volumes (BTVs).
Materials and methods: Central review and analysis of TV MR and TV MR+FET was undertaken across three benchmarking cases. BTVs were pre-defined by a sole nuclear medicine expert. Intraclass correlation coefficient (ICC) confidence intervals (CIs) were used to evaluate volume agreement. RO contour spatial and boundary agreement were evaluated (Dice similarity coefficient [DSC], Jaccard index [JAC], overlap volume [OV], Hausdorff distance [HD] and mean absolute surface distance [MASD]). Dose plan generation (one case per site) was assessed.
Results: Data from 19 ROs across 10 trial sites (54 initial submissions, 8 resubmissions requested, 4 conditional passes) were assessed, with an initial pass rate of 77.8 %; all resubmissions passed. TV MR+FET were significantly larger than TV MR (p < 0.001) for all cases. RO gross tumour volume (GTV) agreement was moderate-to-excellent for GTV MR (ICC = 0.910; 95 % CI, 0.708-0.997) and good-to-excellent for GTV MR+FET (ICC = 0.965; 95 % CI,

Introduction
Glioblastoma is the most common adult primary brain malignancy and carries a poor prognosis. Conventional treatment is maximal safe resection followed by adjuvant radiotherapy (RT) with concurrent and adjuvant Temozolomide chemotherapy [1][2][3]. MRI utilising T1-weighted pre- and post-contrast (T1c), T2-weighted, and fluid attenuated inversion recovery (FLAIR) sequences is routinely used to delineate RT target volumes (TVs). Recent ESTRO-EANO guidelines define gross tumour volume (GTV) as the post-operative resection cavity plus any residual enhancing tumour on T1c, with expansion to clinical target volume (CTV) using a margin of 15 mm. Furthermore, changes on T2/FLAIR that are felt to reflect non-enhancing tumour should be incorporated into the CTV [4]. The role of perfusion, diffusion and MR spectroscopy in target volume definition is currently not well defined and these techniques are therefore not routine [4][5][6][7].
[18]F-fluorodeoxyglucose (FDG) PET is an established approach for imaging tumours; however, low FDG tumour-to-background ratios (TBRs) can lead to challenges in delineating brain tumour extent due to high background uptake in grey matter. By comparison, amino acid tracers exhibit higher TBRs and do not share this limitation [8]. Clinical studies have focused on the use of [18]F-fluoroethyl-L-tyrosine (FET) PET post-RT for differentiation of treatment-related changes from tumour recurrence [9][10][11]. FET PET may also have a role in delineating tumour extent in combination with MRI in the immediate post-surgical and pre-RT setting [12][13][14][15][16][17]. To date, evidence for the utility of FET PET has come predominantly from small, single-centre studies.
The [18]F-fluoroethyl-L-tyrosine (FET) PET in Glioblastoma (FIG) study is an Australian prospective, multi-centre trial evaluating the impact of FET PET on the management of adult patients with newly diagnosed glioblastoma [18]. Participants undergo FET PET imaging pre-chemoradiotherapy (FET1), one month post-chemoradiotherapy (FET2) and at suspected progression (FET3). Successful completion of Radiation Oncologist (RO) and Nuclear Medicine Physician (NMP) credentialing was required of all participating centres before participant enrolment, with the results of NMP credentialing published recently [19]. In the FIG study, radiotherapy TVs for participants are delineated as per standard-of-care, with hybrid volumes derived post-hoc using both FET PET and MRI. The RO credentialing program focused on incorporating a FET PET biological tumour volume (BTV), along with standard MRI information, to delineate hybrid TVs for comparison with TVs derived from MRI alone. The feasibility of this process was evaluated across multiple ROs and 10 trial sites across Australia. Analysis of the resulting data includes a summary of expert central reviews, quantitative pairwise analysis of RO contours and comparison of standard and hybrid TVs. Additionally, for each study site, central review of standard-of-care dosimetry plans and constraints was conducted, with analysis of any dose variability presented.

Material and methods
Three benchmarking cases with de-identified glioblastoma patient imaging, taken prior to RT, were chosen for credentialing (FET1CASE1, FET1CASE2 and FET1CASE3). Local ethics approval was obtained for use of these three cases. Each patient dataset contained a planning CT (pCT), T1c, T2-weighted or T2 FLAIR as well as FET PET dynamic and static images. These cases represented distinct clinical scenarios with gadolinium-enhancing disease (Table 1). Further detail on MRI acquisition parameters can be found in Supplementary Table 1.

FET PET acquisition and contouring
Benchmarking cases were obtained from three different glioblastoma patients treated at Sir Charles Gairdner Hospital, Nedlands, Western Australia, from a previous study (Human Research Ethics Committee approved study 2014-004). Patients fasted for a minimum of 4 h prior to imaging. FET PET scans were acquired following intravenous administration of 200 MBq of FET, on a Biograph 16 PET/CT (Siemens CTI Inc, Knoxville, TN). A low dose CT was performed for attenuation correction. A 30-minute dynamic acquisition followed with the final static image consisting of summed PET data 20-30 min post-injection of tracer. Dead time, attenuation, scatter, decay, and random corrections were applied, along with detector normalisation. Iterative reconstruction for the FET1 cases was performed with a point-spread function applied (TrueX, 3i24s, matrix = 168 × 168, zoom factor 2, 4 mm Gaussian post-filter). NMP delineation of FET PET was performed using a MiM workflow developed for the FIG study (MiM Encore version 7.0, MiM Software Inc, Cleveland OH). BTV delineation was performed by a sole expert NMP (RJF), using the following semi-automatic procedure: a crescent-shaped volume of interest (VOI), including grey and white matter, was placed in the hemisphere contralateral to the suspected lesion to assess mean background uptake, using T1c for anatomic reference [20]. The BTV was defined using a 1.6 TBR threshold on a spherical VOI placed around the suspected tumour [21]. The BTV was manually adjusted to remove any obvious non-tumour structures.
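The thresholding step of the semi-automatic BTV procedure can be illustrated with a minimal sketch. It is not the MiM workflow used in the study: the function name, the flat lists of uptake values and the choice of an inclusive (>=) comparison are assumptions made for illustration, and the manual adjustment step is not modelled.

```python
from statistics import mean

TBR_THRESHOLD = 1.6  # tumour-to-background ratio threshold stated in the text


def biological_tumour_voxels(tumour_voi, background_voi, threshold=TBR_THRESHOLD):
    """Return indices of tumour-VOI voxels whose uptake is at least
    `threshold` times the mean background uptake.

    ASSUMPTION: an inclusive (>=) comparison; the source does not say
    whether the threshold itself is included.
    """
    cutoff = threshold * mean(background_voi)
    return [i for i, uptake in enumerate(tumour_voi) if uptake >= cutoff]


# Hypothetical uptake values (arbitrary units), for illustration only.
background = [1.0, 1.2, 0.8, 1.0]        # contralateral crescent-shaped VOI
tumour = [1.1, 1.7, 2.4, 1.5, 3.0, 0.9]  # spherical VOI around the lesion

btv = biological_tumour_voxels(tumour, background)
print(btv)  # indices of voxels retained in the BTV
```

With a mean background of 1.0, only the voxels with uptake 1.7, 2.4 and 3.0 exceed the cutoff of 1.6 and are retained.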

RO credentialing workflow
Each RO downloaded the credentialing cases into the treatment planning system (TPS) routinely used at their study site and co-registered images as per standard contouring practice. Target volume and organ at risk (OAR) contouring guidelines were provided in the FIG trial radiotherapy and quality assurance (RTQA) manual (Supplementary Tables 2-4). Critical OARs were the brainstem and optic chiasm. Contours of the optic nerves, retinas, eyes, and lenses were requested for FET1CASE1 only [4,22]. ROs were instructed to not review or use the FET PET data for delineation of standard-of-care TVs: GTV MR, CTV MR with 15 mm margin clipped at anatomical barriers (e.g., tentorium, meninges), and planning target volume (PTV MR) with 3 mm margin [4]. These TVs were then "turned off" prior to introducing and registering FET PET data, which included the BTV, to CT/MRI, thus mimicking the "blinding" component built into the FIG trial. ROs received the same BTV for each case and were instructed to incorporate it into hybrid MR+FET-derived TVs, as stated in the RTQA manual: GTV MR+FET, CTV MR+FET with 10-15 mm margin, and PTV MR+FET with 3 mm margin, without reference to the original standard-of-care TVs [4,23]. All structures were registered to pCT. Additionally, one standard-of-care pCT, with contoured TVs and OARs, was derived from FET1CASE2 for dose plan generation. A review per study site was conducted to evaluate technique and dose goals (Supplementary Table 5). Completed credentialing cases were reviewed via the TROG (Trans-Tasman Radiation Oncology Group) server by one of three experts (ESK, BC, MB) to assess protocol compliance. Delineation of each structure was assessed as acceptable, a minor or major violation, or missing. Violation reasons were documented, and the incidence of and reasons for resubmission were recorded.

Statistical analysis
Descriptive statistics are reported as mean/standard deviation and median/range. All contours in DICOM RTStruct format were converted to binary masks using Plastimatch. Spatial and boundary agreement between two segmentations was assessed using the Dice similarity coefficient (DSC), Jaccard index (JAC), overlap volume (OV), Hausdorff distance (HD) and mean absolute surface distance (MASD) [24]. All metrics were calculated using PlatiPy [25]. For each case, pairwise comparison of RO contours was undertaken. Each RO's pair of standard and hybrid TVs was also directly compared (GTV paired, CTV paired, PTV paired). The intraclass correlation coefficient (ICC) using a two-way mixed model (absolute agreement, single rater/measurement) assessed volume agreement [26,27]. The ICC was interpreted based on recommendations by Koo and Li [28] and calculated using the 'psych' package from R combined with the rpy2 interface. Dose plan variability was assessed using D2%, D50%, D98%, V95%, conformity index and homogeneity index. Definitions of all metrics included can be found in the Supplementary material. The Wilcoxon signed-rank test assessed differences between metrics evaluated on standard and hybrid TVs. Bonferroni adjustment for multiple testing was applied, with a p-value < 0.05 classed as significant.
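As an illustration of the pairwise overlap analysis, the sketch below computes DSC, JAC and OV for every pair of observers, with each contour represented as a set of voxel coordinates. It does not use PlatiPy and is not the study code. The DSC and JAC formulas are the standard ones; the OV formula (intersection divided by the smaller volume) is an assumption, since the study's own definition is given only in the Supplementary material. The observer names and cuboid "contours" are hypothetical.

```python
from itertools import combinations


def dice(a, b):
    """Dice similarity coefficient: 2|A∩B| / (|A| + |B|)."""
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Jaccard index (intersection over union): |A∩B| / |A∪B|."""
    return len(a & b) / len(a | b)


def overlap_volume(a, b):
    """ASSUMED definition: |A∩B| / min(|A|, |B|)."""
    return len(a & b) / min(len(a), len(b))


def cuboid(x_start, x_stop):
    """Hypothetical contour: voxels with x in [x_start, x_stop), y and z in [0, 4)."""
    return {(x, y, z) for x in range(x_start, x_stop)
            for y in range(4) for z in range(4)}


# Three hypothetical observers contouring the same structure.
contours = {"RO_A": cuboid(0, 4), "RO_B": cuboid(1, 5), "RO_C": cuboid(0, 5)}

# Every unordered pair of observers is compared once: n(n-1)/2 pairs.
results = {}
for ro1, ro2 in combinations(sorted(contours), 2):
    a, b = contours[ro1], contours[ro2]
    results[(ro1, ro2)] = (dice(a, b), jaccard(a, b), overlap_volume(a, b))

for pair, (dsc, jac, ov) in results.items():
    print(pair, f"DSC={dsc:.2f} JAC={jac:.2f} OV={ov:.2f}")
```

Under this assumed OV definition, a contour fully contained in a larger one scores OV = 1 even when DSC and JAC are below 1, which is one reason the three overlap metrics are reported together.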

Results
Ten FIG study sites participated in the credentialing program with a total of 19 ROs submitting data for review. ROs had a range of neuro-oncology expertise (n = 3, <5 years; n = 7, 5-9 years; n = 9, 10+ years) and prior familiarity with FET PET (n = 9, none; n = 7, minimal; n = 0, moderate; n = 3, significant). At all but one centre, two ROs per site underwent FIG trial credentialing. There were 19 sets of contours received for each of FET1CASE1 and FET1CASE2, and 16 for FET1CASE3 (three sets of contours not received).

Credentialing case reviews
Case reviews were conducted on 54 initial submissions, with eight additional requested resubmissions (n = 6, missing contour(s); n = 2, contour data not registered to pCT) and four conditional passes (n = 3, image registration misalignment; n = 1, missing contour(s)), where the observer was provided feedback to be followed for the prospective phase. This resulted in an initial 77.8 % pass rate. The initial reports are summarised in Fig. 1. All resubmissions were subsequently passed, and all missing TVs were received for quantitative analysis. All ten dosimetry plans (one per site), utilising four different TPSs (Supplementary Table 6), were within dose constraints, with no resubmissions required. Six reports contained seven minor violations only (technique, n = 3; dose, n = 4).
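The reported pass rate is consistent with the following arithmetic, shown as a short sketch. It assumes that only unconditional passes count towards the initial pass rate, i.e. that both requested resubmissions and conditional passes are excluded; the text does not state this explicitly.

```python
initial_submissions = 54
resubmissions_requested = 8
conditional_passes = 4

# ASSUMPTION: resubmission requests and conditional passes are both
# excluded from the count of initial (unconditional) passes.
initial_passes = initial_submissions - resubmissions_requested - conditional_passes
pass_rate = 100 * initial_passes / initial_submissions

print(f"{initial_passes}/{initial_submissions} = {pass_rate:.1f} %")
```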

Comparison of standard versus hybrid target volumes
Hybrid TVs were significantly larger than standard TVs (p < 0.001) for all cases (Table 2). Observer TVs are illustrated in Fig. 2. The ICC was calculated from ROs that had completed contouring on all three cases (16/19 observers). Agreement was moderate to excellent (ICC = 0.910; 95 % CI, 0.708-0.997) for GTV MR and good to excellent (ICC = 0.965; 95 % CI, 0.871-0.999) for GTV MR+FET. Further results can be found in Supplementary Table 7.
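The verbal labels attached to each ICC follow from applying the Koo and Li categories to the lower and upper bounds of the 95 % CI. A minimal sketch of that mapping is given below. The function names are hypothetical, and the treatment of values falling exactly on a cut-point (0.5, 0.75, 0.9) is an assumption.

```python
def icc_category(value):
    """Koo and Li categories: <0.5 poor, 0.5-0.75 moderate,
    0.75-0.9 good, >0.9 excellent.

    ASSUMPTION: 0.5 is treated as moderate, 0.75 and 0.9 as good.
    """
    if value < 0.5:
        return "poor"
    if value < 0.75:
        return "moderate"
    if value <= 0.9:
        return "good"
    return "excellent"


def interpret_ci(lower, upper):
    """Describe agreement from the bounds of the 95 % confidence interval."""
    low_label, high_label = icc_category(lower), icc_category(upper)
    if low_label == high_label:
        return low_label
    return f"{low_label} to {high_label}"


print(interpret_ci(0.708, 0.997))  # GTV MR interval from the text
print(interpret_ci(0.871, 0.999))  # GTV MR+FET interval from the text
```

Interpreting the interval rather than the point estimate is the more conservative choice: both point estimates exceed 0.9, but the lower bound for GTV MR falls in the moderate range.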

Pairwise analysis of contour agreement
The brainstem and eyes exhibited the highest spatial agreement whereas the optic chiasm exhibited the least, with some contours not overlapping at all. Spatial and boundary metrics for all TVs are reported in Table 2. Overall, GTV MR exhibited lower spatial overlap compared to GTV MR+FET (DSC, 0.83 vs. 0.85, p < 0.001; JAC, 0.72 vs. 0.75, p < 0.001) and lower boundary agreement (HD, 11.41 mm vs. 9.76 mm, p < 0.001; MASD, 1.40 mm vs. 1.31 mm, p = 0.066), with equal OV (0.91 vs. 0.91, p = 0.887). The distribution of these metrics for each case is shown in Fig. 3. The difference in MASD between GTV MR and GTV MR+FET was significant only in FET1CASE2 (2.07 mm vs. 1.69 mm, p < 0.001), which had the largest BTV to incorporate. CTV MR and CTV MR+FET spatial overlap was similar (DSC, 0.90 vs. 0.89, p = 0.426; JAC, 0.81 vs. 0.81, p = 0.422; OV, 0.96 vs. 0.96, p = 0.174). HD was larger for CTV MR compared to CTV MR+FET (11.85 mm vs. 10.78 mm, p < 0.001). However, MASD was lower on average for CTV MR compared to CTV MR+FET (1.61 mm vs. 1.73 mm, p = 0.042). This trend was also observed for PTV MR and PTV MR+FET. Further information can be found in Supplementary Tables 8-12 and Supplementary Figs. 1-5.
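HD and MASD can move in different directions because they summarise boundary disagreement differently: HD reports the single worst surface-to-surface distance, whereas MASD averages over the whole surface. The sketch below illustrates this on two small point sets standing in for contour surfaces. It is not the PlatiPy implementation; the MASD formula (mean of the two directed mean distances) is an assumption, and the coordinates are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def directed_distances(source, target):
    """Distance from each source point to its nearest target point."""
    return [min(dist(p, q) for q in target) for p in source]


def hausdorff_distance(a, b):
    """Largest nearest-neighbour distance, taken over both directions."""
    return max(max(directed_distances(a, b)), max(directed_distances(b, a)))


def mean_absolute_surface_distance(a, b):
    """ASSUMED definition: average of the two directed mean distances."""
    d_ab = directed_distances(a, b)
    d_ba = directed_distances(b, a)
    return 0.5 * (sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba))


# Hypothetical surface points in mm. Surface B runs 1 mm from surface A
# except for one outlying point.
surface_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
surface_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (6.0, 1.0)]

hd = hausdorff_distance(surface_a, surface_b)
masd = mean_absolute_surface_distance(surface_a, surface_b)
print(f"HD = {hd:.2f} mm, MASD = {masd:.2f} mm")
```

In this toy example a single outlying point sets the HD (about 4.1 mm) while the MASD remains close to the typical 1 mm separation, which is why an outlier contour affects HD far more than MASD.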

Discussion
The incorporation of FET PET with standard-of-care MRI for newly diagnosed glioblastoma adjuvant RT planning may substantially inform both GTV and CTV derivation by identifying disease otherwise occult on T1c. This could lead to potential improvements in both local tumour control and sparing of healthy brain tissue. To our knowledge, the FIG trial represents the largest prospective multi-centre study in adults with newly diagnosed glioblastoma. In the FIG study, participants receive RT as per standard-of-care, with TV MR+FET delineation performed post-RT. This will be a two-step process requiring BTV interpretation and delineation by a credentialed site NMP followed by incorporation of that BTV by the credentialed site RO to create TV MR+FET. The FIG trial's nuclear medicine credentialing identified and addressed potential sources of error in BTV delineation by participating NMPs [19]. Similarly, the feasibility of TV MR+FET delineation had to be assessed. Therefore, the BTV for each of the three cases was fixed, utilising a sole NMP expert, so that any subsequent variability could be attributed to the ROs alone.
After central review and resubmission where indicated, all ROs and sites successfully passed the credentialing components, demonstrating the feasibility of TV MR+FET delineation. Further quantitative assessment showed TV MR+FET to be significantly larger than TV MR, with greater volume, spatial and boundary agreement for GTV MR+FET compared to GTV MR. As the NMP-derived BTV supplied to ROs was identical for each case, higher agreement may have been expected for GTV MR+FET. This was not observed for FET1CASE3, however, where boundary agreement decreased, driven primarily by an outlier contour. Additionally, importing and registration of PET data to CT/MRI within each site's TPS and conjoining MR-derived GTV with the supplied BTV may have constituted minor sources of variability. Furthermore, MASD was higher for CTV MR+FET, possibly attributable to RO preferences in margin size, as the instructed GTV MR+FET-to-CTV MR+FET expansion was 10-15 mm. The TV paired analysis highlighted individual RO spatial overlap and boundary differences between respectively derived TV MR and TV MR+FET contours. Pleasingly, the average OV for GTV paired was close to one (OV = 0.97), demonstrating that ROs could reproducibly incorporate the tumour bed plus residual enhancement into GTV MR+FET. Low agreement in delineation of the optic chiasm was noted as part of the credentialing program, reflecting, in part, the small size of this structure [29,30]. Given the chiasm is a critical organ at risk, the importance of its accurate delineation has been reinforced during the prospective phase of the study.
Dissaux et al. (2022) separately assessed FET PET and multiparametric MRI for TV delineation in 30 patients with newly diagnosed GBM. Three NMPs and three radiologists respectively reported a mean DSC of 0.841 and 0.922 for T1c and FET PET [31]. However, interobserver agreement for combined (MR+FET) target delineation was not assessed. In the present study, a mean DSC of 0.83 was found for T1c, increasing to 0.85 when incorporating both MRI and FET PET. As stated previously, the spatial overlap of GTV MR+FET was expected to be higher than that of GTV MR; however, most of the ROs involved (84.2 %) had zero or minimal prior experience with FET PET data, which may have reduced agreement.
The impact of FET PET on adjuvant RT planning has been investigated in a series of studies. Niyazi et al. (2011) reported consistently larger FET PET-derived BTVs compared to MRI-derived GTVs [16]. Furthermore, JAC (or intersection over union) of MRI-derived CTV and FET PET-derived CTV (BTV + 20 mm) was significantly different from unity, and combined MRI+FET CTV was larger compared to MRI-derived CTV. Harat et al. (2016) further found FET PET-derived BTV to be correspondingly larger than MRI-derived GTV [13]. Spatial overlap between these volumes also showed poor concordance prior to treatment, and at baseline and recurrence [32,33].
Post-RT sites of progressive disease are more likely to be encompassed by FET PET volumes than by MRI T1c volumes, as FET PET offers an alternative means of visualising tumour physiology [34]. Niyazi et al. (2012) assessed the combined use of FET PET and MRI for detection of tumour recurrence with 49.4 % of recurrences found to be in-field, 12.6 % out-of-field, and 3.8 % marginal (34.2 % no relapse during follow up) [16]. Lundemann et al. (2017) analysed patterns of recurrence as 82 % central, 10 % in-field, 2 % marginal and 6 % distant. Expansion of the combined MRI and FET PET-derived GTV by 12 mm would reclassify recurrences as central in 82 % of patients [17]. The accuracy of correlating pre- and post-treatment volumes may be somewhat limited by post-operative anatomical changes in the resection cavity. Furthermore, the pattern and timing of recurrences are influenced by isocitrate dehydrogenase mutation status and O(6)-methylguanine-DNA methyltransferase methylation, with methylated patients often having longer progression-free survival and exhibiting more remote recurrences [16,35,36].
FET PET may inform RT margin optimisation, allowing for reduced dose to normal brain tissue with potentially comparable treatment outcomes. Allard et al. (2023) reported post-operative FET PET to have better spatial overlap with MRI-determined areas of progressive tumour in patients with biopsy/partial resection compared to those with total/subtotal resection [37]. Fleischmann et al. (2020) found that the minimal margin to encompass recurrent contrast-enhancing tumour was smaller for MR+FET-derived GTVs compared to MR-derived GTVs [38]. For FET PET-guided boost irradiation, Piroth et al. (2016) reported that FET PET at radiation treatment planning (plus 7 mm margin) showed better consistency in encapsulating recurrent FET PET-defined tumour compared to baseline T1c MRI with the same margin [33]. FET PET information may result in a volumetrically larger GTV; however, the resulting CTVs may be equal to, or even smaller than, MR-derived CTVs depending on the CTV margin applied.
This study was limited to three credentialing cases, although these cases were chosen to reflect diverse clinical scenarios. That said, 19 ROs across 10 study sites took part, representing a significant undertaking. We acknowledge that the credentialing program did permit a differential, tighter margin (10-15 mm) in the hybrid volumes compared to standard-of-care [4,23]. Despite this, the hybrid volumes were consistently volumetrically larger than standard TVs. Importantly, T2/FLAIR changes on MRI were included in the CTV to account for microscopic disease; however, FET PET highlighting non-enhancing tumour may be another cause of observer discrepancies [39]. Using multiple central NMPs to contour the BTV would have better reflected the real-world setting across multiple sites; however, this would have introduced an additional source of variability and confounded the analysis of variability amongst the RO cohort.
All participating FIG study sites successfully passed credentialing requirements, resulting in increased familiarity with FET PET imaging and experience with incorporating NMP-derived BTV into adjuvant TVs for glioblastoma. The resulting central review and resubmission process has shown this collaborative delineation approach to be feasible. Important learnings from the RO credentialing program have been incorporated into the prospective phase of the FIG study.

Fig. 1. Summary of the initial reports of organ at risk (OAR) and target volume (TV) contours generated as part of credentialing of each Radiation Oncologist. Stacked bar charts illustrate delineation violations of OARs (a), MR-derived and MR+FET-derived TVs (b) with respective OARs shown on the FET1CASE1 axial and sagittal views of the contrast-enhanced MRI (c).

Fig. 2. An overview of all Radiation Oncology target volume (TV) delineations on axial imaging. Each row represents one of the benchmarking cases: FET1CASE1 (a), FET1CASE2 (b), and FET1CASE3 (c). Standard GTV MR (red), CTV MR (green), PTV MR (orange), and hybrid GTV MR+FET (blue), CTV MR+FET (purple), PTV MR+FET (yellow) are shown. For comparison, GTV MR and GTV MR+FET are shown together with T1c, T2/FLAIR, and FET PET, with all images co-registered to each case's respective planning CT, where all TV MR and TV MR+FET are displayed separately. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Table 1
Patient characteristics and biological tumour volume size.

Table 2
Mean and standard deviation of target volumes and pairwise analysis metrics for each case and across all cases.
Significant comparisons adjusted for multiple testing are in bold (p < 0.001). DSC, Dice Similarity Coefficient; JAC, Jaccard Index; OV, Overlap Volume; HD, Hausdorff Distance; MASD, Mean Absolute Surface Distance.

Table 3
Intra-observer clinical-to-hybrid comparison. The metrics below were calculated between each Radiation Oncologist's paired MR-derived target volumes and their respective MR+FET-derived target volumes.
DSC, Dice Similarity Coefficient; JAC, Jaccard Index; OV, Overlap Volume; HD, Hausdorff Distance; MASD, Mean Absolute Surface Distance.
N. Barry et al.