Main

Levels of prostate-specific antigen (PSA) in the blood are very strongly associated with clinically significant prostate cancer, whether in terms of high grade at diagnosis (Thompson et al, 2006) or subsequent clinical diagnosis of advanced cancer (Ulmert et al, 2008). However, the PSA test has only modest diagnostic specificity and positive predictive value for prostate cancer detection at commonly used cut-points. This leads to significant numbers of unnecessary biopsies. In the European Randomized Study of Prostate Cancer screening (ERSPC), for example, the positive biopsy rate among men with elevated PSA was 24.1% (Schroder et al, 2009a), suggesting that three out of every four biopsies were unnecessary. Similar findings have been reported from US studies, such as the Prostate Cancer Prevention Trial (PCPT) (Thompson et al, 2005). There is thus an urgent need to supplement PSA with novel biomarkers that enhance its specificity so that unnecessary biopsies can be avoided.

We have previously published a statistical model to predict the results of prostate biopsy. The model includes age, DRE and a panel of four kallikrein markers: total PSA, free PSA, intact PSA and human kallikrein-related peptidase-2 (hK2). Using data from the randomised prostate cancer screening trial in Göteborg, Sweden (one centre of the ERSPC), we estimated that for every 1000 previously unscreened men with elevated PSA, use of the model to determine biopsy would reduce the number of biopsies by 573, while missing only a small number of cancers (31 out of 152 low-grade cancers and 3 out of 40 high-grade cancers) (Vickers et al, 2008). These findings were subsequently replicated in an independent cohort (reduction in biopsy by 513 per 1000 men with elevated PSA, missing 54 out of 177 low-grade cancers and 12 out of 100 high-grade cancers) (Vickers et al, 2010c). They have also been replicated in men who had recently undergone screening, again with improvements in predictive accuracy (Vickers et al, 2010a, 2010b).

The performance characteristics of PSA are known to be influenced by previous screening, with predictive accuracy lower in men who have previously been screened compared with an unscreened cohort of men (Vickers et al, 2010a). Similarly, the predictive accuracy of PSA to predict the outcome of repeat biopsy after initial negative biopsy is also lower (Roobol et al, 2006; Walz et al, 2006; Chun et al, 2007). In this study, we have evaluated the performance characteristics of PSA in men who underwent a second biopsy after an initial negative biopsy in the Rotterdam section of the ERSPC. We then evaluated the performance characteristics of the kallikrein panel in this cohort to see whether it also could provide a more accurate prediction of repeat biopsy outcome than currently established standard predictors. This also serves as an independent validation of our previously published model to predict the results of prostate biopsy.

Materials and methods

Patient methods

The study cohort comprised participants in the Rotterdam section of the ERSPC. The study design has been described previously (Roobol et al, 2003). In brief, men aged 55–75 years were invited for an initial PSA test during 1993–1999; men not diagnosed with prostate cancer were invited for up to two subsequent screens every 4 years until they reached age 75 years. All biopsies were prompted by an elevated PSA (3 ng ml−1 or higher). A total of 3394 men had an initial negative biopsy in rounds 1 through 3. Of these, 1000 men underwent a second biopsy in subsequent rounds and 2394 men did not undergo a second biopsy for various reasons (Figure 1). Biomarker data were not available for 75 of the 1000 men who underwent a second biopsy. Our analysis therefore focuses on the 925 men who underwent a second prostate biopsy after an initial negative biopsy; markers were measured in an aliquot of the blood collected before the second biopsy.

Figure 1

Flow chart of participants in the study.

Laboratory methods

Laboratory methods were as described in our previous publication (Vickers et al, 2008). Serum samples were retrieved from the archival serum bank in Rotterdam (where they had been stored frozen at −80 °C after their initial processing within 3 h of venipuncture) and shipped frozen on dry ice to Malmö, Sweden, in 2005–2007. Analyses of free, total, and intact PSA and hK2 were performed in Dr Lilja's laboratory at the Wallenberg Research Laboratories, Department of Laboratory Medicine, Lund University, University Hospital UMAS in Malmö, Sweden, during 2005 and 2007. Free and total PSA were measured using the dual-label DELFIA Prostatus total/free PSA-Assay (Perkin-Elmer, Turku, Finland), in accordance with WHO calibration standards. The measurements of intact PSA and hK2 used F(ab′)2 fragments of the monoclonal capture antibodies in order to reduce the frequency of nonspecific assay interference (Vaisanen et al, 2006). The intact PSA assay measures only free, uncomplexed intact PSA (i.e., not cleaved at Lys145–Lys146). All analyses were conducted blind to biopsy result.

Statistical methods

Our aim was to determine the predictive value of our previously developed statistical model, which included age, DRE result and a panel of four kallikrein markers, above that of a base model that included age, DRE result and total PSA. To do so, we compared the area under the receiver operating characteristic curve (AUC) of the two models. High-grade cancer was defined as biopsy Gleason score 7 or higher. Separate models were not built for high-grade cancer, and the AUC for high-grade cancer was calculated from the predicted probabilities of any cancer. The predicted risk from the model was recalibrated with a Bayes factor, using the observed event rate in this cohort (12%) to reflect the lower incidence of cancer in men with previous negative biopsy. Previous studies have generally reported higher rates of prostate cancer on repeat biopsy, ranging from 11 to 34% (Keetch et al, 1994; Roehrborn et al, 1996; Catalona et al, 1997; Fleshner et al, 1997; Letran et al, 1998; Rietbergen et al, 1998; Durkan and Greene, 1999; Djavan et al, 2000; O’Dowd et al, 2000; Stewart et al, 2001; Lopez-Corona et al, 2003; Park et al, 2003; Lujan et al, 2004; Walz et al, 2006; Benecchi et al, 2008; Lane et al, 2008; Rochester et al, 2009; Schroder et al, 2009b). Therefore, to examine the impact this adjustment would have on calibration (discrimination would not be affected) in cohorts with a different rate of cancer detection on repeat biopsy, we simulated higher event rates by randomly sampling with replacement to create data sets with observed event rates of 20 or 30%.
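The recalibration step can be illustrated with a minimal sketch. The text does not give the exact computation, so the code below assumes one common reading of a Bayes-factor adjustment: the predicted odds are multiplied by the ratio of the target cohort's prior odds to the development cohort's prior odds. The function name `recalibrate_risk` and the development-cohort event rate of 25% are hypothetical and chosen only for illustration; the 12% target is the event rate stated above.

```python
def recalibrate_risk(risk, old_rate, new_rate):
    """Shift a predicted risk from a cohort with event rate old_rate
    to a cohort with event rate new_rate by rescaling the odds."""
    if not (0 < risk < 1 and 0 < old_rate < 1 and 0 < new_rate < 1):
        raise ValueError("all inputs must lie strictly between 0 and 1")
    # Multiplicative factor: new prior odds divided by old prior odds.
    factor = (new_rate / (1 - new_rate)) / (old_rate / (1 - old_rate))
    new_odds = factor * risk / (1 - risk)
    return new_odds / (1 + new_odds)


DEVELOPMENT_RATE = 0.25  # hypothetical value, not reported in the text
COHORT_RATE = 0.12       # observed event rate in this cohort

original = [0.10, 0.25, 0.40, 0.60]
recalibrated = [recalibrate_risk(p, DEVELOPMENT_RATE, COHORT_RATE)
                for p in original]
```

Because the adjustment multiplies every man's odds by the same constant, the ranking of predicted risks is preserved, which is consistent with the statement that discrimination is unaffected and only calibration changes.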

To illustrate the clinical effects of using the four-kallikrein model in men with repeat biopsy, we used decision curve analysis. Decision curve analysis estimates a ‘net benefit’ for prediction models by summing the benefits (true positives) and subtracting the harms (false positives), where the latter is weighted by a factor so as to reflect the relative harm of a missed cancer compared with an unnecessary biopsy. The weighting is derived from the threshold probability of prostate cancer, defined as the point at which a patient would choose to be biopsied. As this threshold probability can vary from patient to patient, net benefit is calculated across a range of probabilities; we chose 10–40% as a reasonable range. The interpretation of a decision curve is straightforward: the model with the highest net benefit at a particular threshold probability should be used. All statistical analyses were conducted using Stata 10.1 (StataCorp LP, College Station, TX, USA).
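The net benefit calculation described above can be sketched as follows. This assumes the standard decision curve formulation, in which false positives are weighted by the odds of the threshold probability; the function names and the five-patient data set are hypothetical and serve only to make the arithmetic concrete.

```python
def net_benefit(risks, outcomes, threshold):
    """Net benefit of biopsying men whose predicted risk is at or above
    the threshold probability (outcome 1 = cancer, 0 = no cancer)."""
    n = len(risks)
    # Weight applied to false positives: odds of the threshold probability.
    weight = threshold / (1 - threshold)
    true_pos = sum(1 for r, y in zip(risks, outcomes)
                   if r >= threshold and y == 1)
    false_pos = sum(1 for r, y in zip(risks, outcomes)
                    if r >= threshold and y == 0)
    return true_pos / n - weight * false_pos / n


def net_benefit_biopsy_all(outcomes, threshold):
    """Net benefit of the reference strategy of biopsying every man."""
    return net_benefit([1.0] * len(outcomes), outcomes, threshold)


# Made-up predicted risks and biopsy results for five men.
risks = [0.05, 0.15, 0.25, 0.35, 0.60]
outcomes = [0, 0, 1, 0, 1]

nb_model = net_benefit(risks, outcomes, 0.20)        # 2/5 - 0.25 * 1/5
nb_all = net_benefit_biopsy_all(outcomes, 0.20)      # 2/5 - 0.25 * 3/5
```

In this toy example the model-based strategy has the higher net benefit at a 20% threshold; the strategy of biopsying no one has a net benefit of zero by definition. Repeating the calculation over a range of thresholds traces out the decision curve.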

Results

Characteristics of the participants who were biopsied after a previous negative biopsy are given in Table 1. In total, 110 (12%) of 925 men had a positive biopsy after a previous negative biopsy. The total PSA levels and age were only slightly higher for the men with positive second biopsies compared with those with negative biopsies (median total PSA: 5.61 vs 5.22 ng ml−1 and age: 65 vs 64 years). As is typical for recently screened men, the majority of cancers diagnosed were Gleason 6 or less (n=92; 84%), whereas 13 (12%) were Gleason 7 and only 5 (5%) were Gleason 8 or higher.

Table 1 Participant characteristics

The AUC of our previously developed model and base model when applied to men with a previous negative biopsy are given in Table 2. The AUC of the four-kallikrein panel was significantly higher than the base model for the prediction of any prostate cancer (AUC 0.681 (95% confidence interval (CI): 0.623, 0.739) vs 0.584 (95% CI: 0.523, 0.644), P<0.001) and high-grade cancer at biopsy (AUC 0.873 (95% CI: 0.807, 0.939) vs 0.764 (95% CI: 0.639, 0.888), P=0.003).
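As an aid to interpreting these figures, the AUC can be read as the probability that a randomly chosen man with cancer has a higher predicted risk than a randomly chosen man without. The sketch below computes it by direct pairwise comparison and, following the approach described in the statistical methods, scores the same predicted risks of any cancer against two outcome definitions. The data are made up and the function name `auc` is our own; this is not the Stata code used for the analysis.

```python
def auc(risks, outcomes):
    """Area under the ROC curve by pairwise comparison; ties count half."""
    pos = [r for r, y in zip(risks, outcomes) if y == 1]
    neg = [r for r, y in zip(risks, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Made-up predicted risks of ANY cancer and biopsy findings for six men.
risks = [0.05, 0.10, 0.20, 0.15, 0.30, 0.45]
grades = ["none", "none", "none", "low", "low", "high"]

any_cancer = [1 if g != "none" else 0 for g in grades]
high_grade = [1 if g == "high" else 0 for g in grades]

auc_any = auc(risks, any_cancer)    # same risks, outcome = any cancer
auc_high = auc(risks, high_grade)   # same risks, outcome = high-grade only
```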

Table 2 Area under the curve of models built on first round Rotterdam participants when applied to participants in subsequent rounds with a previous negative biopsy and elevated PSA (3 ng ml−1 or higher)

To put these results in a clinical context, we considered the scenario where a clinician would recommend biopsy to men with a predicted probability of 20% or higher (Table 3). Applying this rule with the kallikrein model would reduce the number of biopsies by 82%, while delaying the diagnosis of 64 low-grade (Gleason sum 6 at biopsy) and 3 high-grade cancers per 1000 men with persistently elevated total PSA (3 ng ml−1 or higher). In other words, for every 272 men who avoid biopsy, we miss only one high-grade cancer. In addition, all three high-grade cancers that were missed were Gleason 7 and none of the cancers with Gleason score 8 were missed. The very large reduction in biopsies conducted and cancers found is due to the low event rate in this cohort. It seems reasonable to assume that men with a previous biopsy might be more risk averse than the population as a whole. Hence, we repeated this analysis using 15 and 10% risk as thresholds for biopsy. Even at these lower risk thresholds, use of the model would avoid 71 and 54% of the biopsies, while missing only 3 and 1 high-grade cancers, respectively, all of which were Gleason 7 (Table 3).
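The way such per-1000 figures are derived from predicted risks and biopsy results can be sketched as follows. The function `outcomes_per_1000` and the ten-man data set are hypothetical and purely illustrative; they do not reproduce the values in Table 3.

```python
def outcomes_per_1000(risks, grades, threshold):
    """Clinical consequences, per 1000 men, of biopsying only men whose
    predicted risk is at or above the threshold.
    grades holds 'none', 'low' or 'high' for each man's biopsy result."""
    n = len(risks)
    # Biopsy findings of the men who would NOT be biopsied under the rule.
    not_biopsied = [g for r, g in zip(risks, grades) if r < threshold]
    scale = 1000 / n
    return {
        "biopsies_avoided": scale * len(not_biopsied),
        "low_grade_missed": scale * not_biopsied.count("low"),
        "high_grade_missed": scale * not_biopsied.count("high"),
    }


# Made-up predicted risks and biopsy findings for ten men.
risks = [0.05, 0.08, 0.12, 0.18, 0.22, 0.30, 0.09, 0.16, 0.40, 0.11]
grades = ["none", "none", "none", "low", "high",
          "none", "none", "none", "low", "high"]

at_20 = outcomes_per_1000(risks, grades, 0.20)
at_10 = outcomes_per_1000(risks, grades, 0.10)
```

Lowering the threshold trades fewer avoided biopsies for fewer missed cancers, which is the pattern seen when moving from the 20% to the 15 and 10% thresholds.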

Table 3 Clinical outcome of basing repeat biopsy on the kallikrein model

Figure 2 shows the decision curve for our full kallikrein model. The net benefit of the four-kallikrein panel is superior to the alternative strategies of either biopsying all men, no men or biopsying on the basis of age, total PSA and DRE alone for all men, except those requiring a very high (>30%) risk of cancer before accepting biopsy. Such men should not be biopsied after negative biopsy, as their risk of subsequent positive biopsy is low.

Figure 2

Decision curve for outcome of any cancer using the four-kallikrein model (dashed line) and base model (solid line), after recalibration. The solid grey line is for the strategy of biopsying all men and the horizontal black line for not biopsying anyone. The line with the highest net benefit at a particular threshold probability will lead to the best clinical results.

As the event rate in this data set is lower than that previously reported in men undergoing a repeat biopsy, we conducted a sensitivity analysis to determine the effect of applying our recalibrated model to a cohort of men in which the true event rate was higher (20 or 30%) than the event rate used for recalibration (12%). When the true event rate was 20%, the kallikrein model was superior to the alternative strategies for all threshold probabilities above 12% (Figure 3A). However, at an event rate of 30%, the kallikrein model was superior to a strategy of biopsying all men only at threshold probabilities above 26% (Figure 3B). This may be because the model was calibrated to a lower event rate of 12%; it would likely need to be recalibrated for use in cohorts in which positive rebiopsy rates are 30% or higher.
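One way to construct such higher-event-rate data sets by sampling with replacement is sketched below. The exact resampling procedure is not specified in the text, so the stratified approach here (drawing cases and non-cases separately to hit the target rate) is an assumption, as are the function name `resample_to_event_rate`, the sample size and the toy cohort.

```python
import random


def resample_to_event_rate(records, target_rate, size, seed=0):
    """Draw (risk, outcome) records with replacement so that the sample
    has the requested event rate; outcome 1 = cancer, 0 = no cancer."""
    rng = random.Random(seed)
    cases = [rec for rec in records if rec[1] == 1]
    controls = [rec for rec in records if rec[1] == 0]
    n_cases = round(size * target_rate)
    # Sample each stratum with replacement, then mix the two together.
    sample = (rng.choices(cases, k=n_cases)
              + rng.choices(controls, k=size - n_cases))
    rng.shuffle(sample)
    return sample


# Toy cohort with a 12% event rate (12 cancers among 100 men).
cohort = [(0.30, 1)] * 12 + [(0.10, 0)] * 88

simulated = resample_to_event_rate(cohort, target_rate=0.30, size=200)
event_rate = sum(y for _, y in simulated) / len(simulated)
```

The predicted risks carried along in each record remain those recalibrated to 12%, so a decision curve computed on the simulated data shows how the model behaves when applied, without further recalibration, to a cohort with a higher event rate.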

Figure 3

Decision curves for outcome of any cancer using the four-kallikrein model (dashed line), compared with a strategy of biopsying all men (solid grey line) or biopsying no men (solid black line) in cohorts in which the event rate was imputed to be 20% (A) or 30% (B).

Several investigators have included prostate volume in predictive models for prostate cancer, for example, by calculating ‘PSA density’ (Radwan et al, 2007; Kranse et al, 2008). We have previously avoided this approach on the grounds that data on volume are only available when a transrectal ultrasound (TRUS) probe is inserted during a prostate biopsy session. In the case of repeat biopsy, however, TRUS-based prostate gland volume is already available from the original negative biopsy session. As a sensitivity analysis, we added prostate volume to the base and full models that were built on the first round of Rotterdam participants and independently evaluated on men with a previous negative biopsy. As shown in Table 2, TRUS-based prostate gland volume increased the AUC of the base model, but not the full kallikrein model, and AUC remained higher for the full model than for base model (total PSA, age, and DRE) plus TRUS volume. The decision curve was not substantively affected (data not shown). Our conclusion that the kallikrein panel adds information to established predictors is therefore unchanged.

Discussion

In this study, we found that a panel of four kallikreins can predict the outcome of prostate biopsy in men who had undergone prostate biopsy during previous screening. The full model, comprising the four kallikrein markers (i.e., total, free and intact PSA, and hK2), age and DRE, substantially improved on the predictive accuracy of a base model (comprising total PSA, age and DRE), for both low- and high-grade cancers.

We have previously reported similar improvements in AUC by using the four-kallikrein panel in previously unscreened men and in men who had undergone previous screening but had not been biopsied; and have validated these findings in independent cohorts (Vickers et al, 2008, 2010a, 2010b, 2010c). Among men who have not been previously screened, use of the four-kallikrein panel increased the AUC for detecting prostate cancer from 0.685 to 0.776 (Vickers et al, 2010c). For men who have previously been screened but not biopsied, the model increased AUC from 0.585 to 0.711 (Vickers et al, 2010b).

These results also support our previous finding that the predictive accuracy of PSA decreases in men who have previously been screened. The AUC of a base model comprising total PSA, age and DRE was 0.584 to predict any cancer in this cohort as compared with 0.724 for a cohort of men who had never been screened (Vickers et al, 2008). The AUC of 0.584 in this study is similar to the AUC of 0.569 and 0.585 in two cohorts of men who had recently been screened in the ERSPC trial and undergone initial prostate biopsy due to elevated PSA, but had no previous biopsy (Vickers et al, 2010a, 2010b). Similarly, for men with previous negative biopsies, two large studies using clinical cohorts of men have found the AUC of total PSA alone to be modest at 0.528 and 0.601, respectively (Walz et al, 2006; Chun et al, 2007).

Men diagnosed with prostate cancer after a previous negative biopsy tend to have favourable disease, and their outcomes are better than those of men diagnosed at the first screen (Schroder et al, 2009b). Most of these cancers are low risk and are likely to constitute overdiagnosis. Schroder et al (2009b) studied 3056 men in the Rotterdam arm of the ERSPC who had initial negative biopsies at the first screen. During follow-up of up to 11 years, 287 prostate cancer cases were detected, 26 developed progressive disease and 7 died of prostate cancer (Schroder et al, 2009b). Of these seven deaths, only one was in a man diagnosed on repeat screening; the other six were interval cancers. Only 0.6% of the prostate cancers diagnosed on repeat screening after initial negative biopsy led to death, as compared with a 4.2% death rate for cancers found at initial biopsy. Thus, very few cancers found after initial negative biopsy are clinically relevant. Biopsying all men with previous negative biopsies will lead to a large number of unnecessary biopsies and detection of a large number of potentially clinically insignificant cancers. Hence, a strategy of identifying and biopsying men at the greatest risk of death due to cancer is needed.

In this cohort, of the 110 cancers that were diagnosed, 84% had Gleason sum 6 or less on biopsy and only 5% were Gleason 8 or more. Similarly, in a series of 99 men enrolled in a large screening programme with previous negative biopsies, 20 cancers were found. Of these, only one was clinically advanced, only one had biopsy Gleason score 7 and three had Gleason score 7 on radical prostatectomy. No patient had Gleason 8 or higher (Catalona et al, 1997). Hence, due to the relatively large number of clinically insignificant cancers that exist in men with previous negative biopsies, the goal in these men is to reduce the number of biopsies and the detection of clinically insignificant or indolent disease, while still detecting the clinically significant cancers.

Use of the four-kallikrein model to predict the result of repeat biopsy would lead to a very large reduction in the rate of unnecessary second biopsy. Of the men with cancer who would be classified as low risk by the model, the majority would have the sort of low-stage, low-grade cancers typically thought to constitute overdiagnosis. For example, if biopsy was based on patients having a risk of cancer of 20% or more, use of the model would reduce the number of biopsies by 82%. Of the 67 cancers that would be missed, 64 would be Gleason 6 or less. The diagnosis would be delayed for only 16% (3 of 19) of the high-grade cancers (Gleason score=7). For men who are more risk averse and would prefer to undergo a biopsy at a 15% risk threshold, use of the model would still reduce the number of biopsies by 712 per 1000 men, while missing 53 cancers, only 3 of which would be Gleason 7. Thus, use of such models may decrease the overdiagnosis and overtreatment associated with repeated screening and repeat biopsy in men with previous negative biopsies.

For men with previous negative biopsies and elevated PSA, use of other markers such as PCA3 has been suggested. In men with previous negative biopsies and persistently elevated PSA, a higher PCA3 score has been found to be associated with a higher probability of a positive biopsy (Marks et al, 2007; Haese et al, 2008). In a study of 463 men, use of a PCA3 score threshold of 20 would avoid 44% of the biopsies, while missing 9% of high-grade cancers (Haese et al, 2008). A higher threshold, such as 35, would avoid more biopsies (67%), but would also miss more high-grade cancers (21%). Direct comparison of these results with ours is not possible because ours is a screening cohort, whereas the PCA3 studies were based on clinical cohorts. The PCA3 test, while promising, needs to be validated in independent cohorts, and the PCA3 score needs to be incorporated into a model with other predictors of a positive biopsy such as age, DRE, PSA and other kallikreins. The need for an attentive DRE and collection of a urine sample for PCA3 measurement may also limit its utility in a screening scenario.

Several nomograms have also been developed to predict outcomes of biopsy in men after previous negative biopsies (Shariat et al, 2008). However, only a few of these have been externally validated (Lopez-Corona et al, 2003; Yanke et al, 2005; Chun et al, 2007). These nomograms have incorporated variables such as family history, PSA, free PSA, PSA kinetics, PSA density, biopsy findings, number of previous negative cores and so on. These nomograms were developed in clinical cohorts and need to be studied in screening cohorts before they can be used in a screening scenario.

Our study may be limited by the use of sextant biopsy, which misses 19–23% of the cancers compared with the currently practiced extended biopsy schemes (Schroder et al, 2009b). This may partially explain why the cancer detection rate on rebiopsy was 12% in this cohort compared with 11–19% reported in other screening cohorts and 16–34% reported in clinical cohorts (Keetch et al, 1994; Roehrborn et al, 1996; Catalona et al, 1997; Fleshner et al, 1997; Letran et al, 1998; Rietbergen et al, 1998; Durkan and Greene, 1999; Djavan et al, 2000; O’Dowd et al, 2000; Stewart et al, 2001; Lopez-Corona et al, 2003; Park et al, 2003; Lujan et al, 2004; Benecchi et al, 2008; Lane et al, 2008; Rochester et al, 2009; Schroder et al, 2009b). We attempted to evaluate our model in cohorts with higher cancer rates by varying the event rate using statistical imputation. We found that the model performed well when the event rate was as high as 20%, but without recalibration would not be clinically useful in cohorts with 30% or higher detection rates. However, most repeat biopsy studies in screening and clinical cohorts have found the cancer rate to be less than 30% (Keetch et al, 1994; Roehrborn et al, 1996; Catalona et al, 1997; Rietbergen et al, 1998; Djavan et al, 2000; O’Dowd et al, 2000; Lopez-Corona et al, 2003; Park et al, 2003; Lujan et al, 2004; Lane et al, 2008). The few studies that have found higher rates tended to be older series (Fleshner et al, 1997; Letran et al, 1998; Durkan and Greene, 1999); or involved saturation biopsies, usually after previous sextant biopsies (Stewart et al, 2001; Walz et al, 2006); or may have been enriched by higher risk patients due to features such as atypia or HGPIN on previous biopsies or other factors (Yanke et al, 2005; Benecchi et al, 2008).

The study is strengthened by the use of a well-characterised, prospective cohort that was part of a rigorously conducted clinical trial. The use of decision analytical techniques and sensitivity analysis demonstrates the utility of the model in clinical decision making across different threshold probabilities and cancer detection rates. An additional advantage of using a panel of four kallikrein markers is that it does not require additional procedures, such as urine collection or prostatic massage, and can be performed by laboratories using an aliquot from the same blood sample used to run the initial PSA. The model can also be incorporated into laboratory reports, which can then report the risk of prostate cancer for the patient along with the measure of each kallikrein.

Conclusions

A statistical model based on a panel of four kallikreins has been previously shown to predict the outcome of prostate biopsy in previously unscreened men (Vickers et al, 2008, 2010c) and in men who had a PSA test but no previous biopsy (Vickers et al, 2010a, 2010b). In this study, we show that the model is also highly predictive of biopsy outcome in men with a previous negative biopsy. Use of the model to determine repeat biopsy in men with elevated PSA would dramatically reduce rebiopsy rates, while delaying the diagnosis of only a small number of cancers, almost all of which are low grade.