The Lancet

Volume 359, Issue 9306, 16 February 2002, Pages 572-577

Fast track — Mechanisms of Disease
Use of proteomic patterns in serum to identify ovarian cancer

https://doi.org/10.1016/S0140-6736(02)07746-2

Summary

Background

New technologies for the detection of early-stage ovarian cancer are urgently needed. Pathological changes within an organ might be reflected in proteomic patterns in serum. We developed a bioinformatics tool and used it to identify proteomic patterns in serum that distinguish neoplastic from non-neoplastic disease within the ovary.

Methods

Proteomic spectra were generated by mass spectroscopy (surface-enhanced laser desorption and ionisation). A preliminary “training” set of spectra, derived from analysis of serum from 50 unaffected women and 50 patients with ovarian cancer, was analysed by an iterative searching algorithm that identified a proteomic pattern that completely discriminated cancer from non-cancer. The discovered pattern was then used to classify an independent set of 116 masked serum samples: 50 from women with ovarian cancer, and 66 from unaffected women or those with non-malignant disorders.

Findings

The algorithm identified a cluster pattern that, in the training set, completely segregated cancer from non-cancer. The discriminatory pattern correctly identified all 50 ovarian cancer cases in the masked set, including all 18 stage I cases. Of the 66 cases of non-malignant disease, 63 were recognised as not cancer. This result yielded a sensitivity of 100% (95% CI 93–100), specificity of 95% (87–99), and positive predictive value of 94% (84–99).
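The reported accuracy figures follow directly from the masked-set counts above (50 of 50 cancers detected; 63 of 66 non-cancer samples recognised). The sketch below recomputes them in Python. The summary does not say how the confidence intervals were derived, so the exact binomial (Clopper-Pearson) method used here is an assumption, and the helper names `binom_cdf`, `_bisect`, and `exact_ci` are illustrative only.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, iters=60):
    """Root of an increasing function f on (0, 1), found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) interval for k successes in n trials.

    Assumption: the article's summary does not name its interval method.
    """
    lower = 0.0 if k == 0 else _bisect(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2
    )
    upper = 1.0 if k == n else _bisect(
        lambda p: alpha / 2 - binom_cdf(k, n, p)
    )
    return lower, upper


# Counts taken from the masked-set results in the Findings paragraph.
tp, fn = 50, 0   # cancers correctly identified / missed
tn, fp = 63, 3   # non-cancers correctly identified / misclassified

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)

for name, k, n in [
    ("sensitivity", tp, tp + fn),
    ("specificity", tn, tn + fp),
    ("PPV", tp, tp + fp),
]:
    low, high = exact_ci(k, n)
    print(f"{name}: {k / n:.1%} (95% CI {low:.1%} to {high:.1%})")
```

The point estimates are 50/50 = 100%, 63/66 ≈ 95.5%, and 50/53 ≈ 94.3%. Under the assumed exact method, the lower bound for 50 of 50 is 0.025^(1/50) ≈ 93%, in line with the interval quoted for sensitivity.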

Interpretation

These findings justify a prospective population-based assessment of proteomic pattern technology as a screening tool for all stages of ovarian cancer in high-risk and general populations.

Introduction

Application of new technologies for detection of ovarian cancer could have an important effect on public health,1 but to achieve this goal, specific and sensitive molecular markers are essential.1, 2, 3, 4, 5 This need is especially urgent in women who have a high risk of ovarian cancer due to family or personal history of cancer, and for women with a genetic predisposition to cancer due to abnormalities in predisposition genes such as BRCA1 and BRCA2. There are no effective screening options for this population.

Ovarian cancer presents at a late clinical stage in more than 80% of patients,1 and is associated with a 5-year survival of 35% in this population. By contrast, the 5-year survival for patients with stage I ovarian cancer exceeds 90%, and most patients are cured of their disease by surgery alone.1, 2, 3, 4, 5, 6 Therefore, increasing the number of women diagnosed with stage I disease should have a direct effect on the mortality and economics of this cancer without the need to change surgical or chemotherapeutic approaches.

Cancer antigen 125 (CA125) is the most widely used biomarker for ovarian cancer.1, 2, 3, 4, 5, 6 Although concentrations of CA125 are abnormal in about 80% of patients with advanced-stage disease, they are increased in only 50–60% of patients with stage I ovarian cancer.1, 2, 3, 4, 5, 6 CA125 has a positive predictive value of less than 10% as a single marker, but the addition of ultrasound screening to CA125 measurement has improved the positive predictive value to about 20%.6

Low-molecular-weight serum protein profiling might reflect the pathological state of organs and aid in the early detection of cancer. Matrix-assisted laser desorption and ionisation time-of-flight (MALDI-TOF) and surface-enhanced laser desorption and ionisation time-of-flight (SELDI-TOF) mass spectroscopy can profile proteins in this range.6, 7, 8, 9 These profiles can contain thousands of data points, necessitating sophisticated analytical tools. Bioinformatics has been used to study physiological outcomes and cluster gene microarrays,10, 11, 12, 13 but to uncover changes in complex mass spectrum patterns of serum proteins, higher order analysis is required. We aimed to link SELDI-TOF spectral analysis with a high-order analytical approach using samples from women with a known diagnosis to define an optimum discriminatory proteomic pattern. We then aimed to use this pattern to predict the identity of masked samples from unaffected women, women with early-stage and late-stage ovarian cancer, and women with benign disorders.

Study population

100 control samples (50 for the preliminary analysis and 50 for the masked analysis) were provided from the National Ovarian Cancer Early Detection Program (NOCEDP) clinic at Northwestern University Hospital (Chicago, IL, USA). 17 other control samples from anonymous women unaffected by cancer were provided by the Simone Protective Cancer Institute (Lawrenceville, NJ, USA). These 17 women had endometriosis (seven), uterine fibroids (three), sinusitis (four), rheumatoid arthritis (two), and

Reproducibility and precision

Figure 2 shows an example of nine independently obtained spectra from the between-run analysis of the serum from the unaffected woman used to determine reproducibility of the mass spectra. The coefficient of variation (CV) for eight selected m/z peaks with the highest amplitude was less than 10%. There was little variation with day-to-day sampling and instrumentation or chip variations. We calculated that mass spectrum patterns remained consistent (CV <10%) if serum samples were not frozen

Discussion

Complex serum proteomic patterns might reflect the underlying pathological state of an organ such as the ovary. This hypothesis is supported by the results of our masked analysis (table 2). Non-cancer control samples representing benign disease, gynaecological disorders, and inflammatory conditions were derived from patients in a high-risk clinic and from the general population (table 1). 63 of 66 samples were accurately classified as non-cancer, including all those from the general population.

GLOSSARY

cluster analysis
A means of plotting and analysing the protein patterns as clusters or groupings that are similar or not similar. This report uses a set of known “training” samples to segregate the data into two types of clusters: those containing samples from affected patients and those containing samples from unaffected patients. After training, the pattern of an unknown sample is diagnostically classified by its similarity to the diseased or unaffected clusters found in the training set.
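The idea of classifying an unknown sample by its similarity to clusters learned from training data can be illustrated with a nearest-centroid classifier. This is a deliberately simplified stand-in, not the iterative searching algorithm used in the study: the feature vectors are hypothetical toy intensities at three unnamed m/z positions, and the function names `centroid`, `train`, and `classify` are illustrative only.

```python
from math import dist


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return tuple(sum(col) / n for col in zip(*rows))


def train(samples):
    """Map each label to the centroid of its training vectors."""
    return {label: centroid(rows) for label, rows in samples.items()}


def classify(model, x):
    """Return the label whose centroid is closest (Euclidean) to x."""
    return min(model, key=lambda label: dist(model[label], x))


# Hypothetical toy data: normalised intensities at three m/z positions.
training = {
    "cancer": [(0.9, 0.2, 0.7), (0.8, 0.3, 0.6)],
    "unaffected": [(0.2, 0.8, 0.1), (0.3, 0.7, 0.2)],
}

model = train(training)
print(classify(model, (0.85, 0.30, 0.60)))
print(classify(model, (0.25, 0.75, 0.20)))
```

A real spectrum has thousands of points, so in practice the hard part is choosing which few positions to compare. That selection step is what the study's search algorithm addresses, and this sketch leaves it out.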
