Review
Global and targeted quantitative proteomics for biomarker discovery

https://doi.org/10.1016/j.jchromb.2006.09.004

Abstract

The extraordinary developments made in proteomic technologies in the past decade have enabled investigators to consider designing studies to search for diagnostic and therapeutic biomarkers by scanning complex proteome samples using unbiased methods. The major technology driving these studies is mass spectrometry (MS). The basic premise of most biomarker discovery studies is to use the high data-gathering capabilities of MS to compare biological samples obtained from healthy and disease-afflicted patients and identify proteins that are differentially abundant between the two sets of specimens. To meet the need to compare the abundance of proteins in different samples, a number of quantitative approaches have been developed. In this article, many of these will be described with an emphasis on their advantages and disadvantages for the discovery of clinically useful biomarkers.

Introduction

While proteomics is contributing to a wide range of scientific disciplines, probably no area is more critical than the discovery of novel diagnostic and therapeutic biomarkers. While discoveries in molecular biology help to unlock mysteries of cell function and behaviour, the discovery of clinically useful biomarkers would have a direct impact on the survival of thousands of patients and could mean the difference between choosing the correct or incorrect therapy in cases where immediate treatment is critical. One indisputable truth is that a protein must meet high standards to be useful as a biomarker. If a biomarker is defined as a feature that can be used to measure the presence and progress of a disease or the effects of treatment, it must be reproducibly measurable and also be specific to a disease or treatment. For example, while an increase in the levels of certain acute phase response proteins is used to indicate inflammation, these proteins do not specify the exact cause of the inflammation. Even the well-known biomarker prostate-specific antigen (PSA) is not absolutely specific for prostate cancer, as other disorders, such as benign prostatic hyperplasia, can result in an elevated PSA level [1]. Since a biomarker needs to be quantitated with high precision and accuracy, it should be sufficiently abundant that it does not strain the limits of detection and quantitation available with today's assays or instrumentation. Finally, the test designed to detect the biomarker must possess high sensitivity (i.e. indicate a positive test for patients who have the disease) and specificity (i.e. indicate a negative test for patients without disease).
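The two test-performance measures defined above can be made concrete with a short calculation. The following is a minimal sketch, not a method taken from this review: the function name and the ten-patient example data are hypothetical, and it assumes each patient has a known disease status and a binary test result.

```python
def sensitivity_specificity(has_disease, test_positive):
    """Return (sensitivity, specificity) for paired per-patient booleans.

    sensitivity = true positives / all patients with the disease
    specificity = true negatives / all patients without the disease
    """
    tp = fn = tn = fp = 0
    for diseased, positive in zip(has_disease, test_positive):
        if diseased and positive:
            tp += 1      # diseased patient, test positive
        elif diseased:
            fn += 1      # diseased patient, test negative (missed case)
        elif positive:
            fp += 1      # healthy patient, test positive (false alarm)
        else:
            tn += 1      # healthy patient, test negative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy patient")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cohort: 4 patients with the disease, 6 without.
has_disease = [True] * 4 + [False] * 6
test_positive = [True, True, True, False, True, False, False, False, False, False]

sens, spec = sensitivity_specificity(has_disease, test_positive)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this invented cohort the test detects three of the four diseased patients and correctly clears five of the six healthy ones. A marker such as PSA, which is also elevated in benign conditions, would show up here as additional false positives and therefore a lower specificity.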

Probably no technology has spurred the fervor in the discovery of new biomarkers more than mass spectrometry (MS). The developments made in coupling protein and peptide fractionation techniques directly with state-of-the-art MS instrumentation have made it possible to identify thousands of proteins in complex biological samples [2]. This ability to obtain wide proteome coverage, however, has brought with it challenges in how to integrate this type of discovery science with basic research. The first challenge deals with the percentage of the proteome that we are presently able to characterize. Based on results from the human genome project, the human genome is anticipated to contain on the order of 20,000–25,000 open reading frames (Fig. 1A) [3]. Unfortunately, the number of proteins within a complex proteome, from a biofluid for example, is unpredictable. Considering all of the possible post-transcriptional and post-translational events that may occur, any human proteome sample could easily contain upwards of 100,000 different protein species. The second challenge is that while discovery proteomics has focused considerable effort on developing methods to characterize thousands of proteins in biological samples, basic research continues to be dominated by scientists who focus on a single protein, or a very small number of proteins (i.e. 2–5), in any study. This disconnect is present in many aspects of biological research such as phosphorylation mapping, protein quantitation, and simple protein identification. It is very apparent in the field of biomarker discovery and validation. In the course of using MS, in particular, for the discovery of novel biomarkers, hundreds of differences in the abundance of proteins between biofluids obtained from diseased and control patients can be observed; however, it is currently only possible to graduate a small number of these “potential” markers into a validation phase.

The challenge in the next few years will be to find ways to bridge this divide between discovery-driven science and basic research. While improvements in technology will continue to benefit this progress, there are other study design and physiological barriers that may be more difficult to overcome. At a very fundamental level, reliable cohorts of samples that are indicative of the disease being studied can be difficult to obtain. Unless a well-thought-out research study is designed in collaboration with a clinical center, very few groups are likely to hand over their “precious” clinical samples to a proteomics discovery laboratory. When dealing with tissue samples, biopsies require invasive procedures to obtain and are generally not collected in a retrospective manner. There is no standardization in the collection of biofluid samples, and the effects of processing and preparing serum and plasma are not well understood. With the ability of state-of-the-art mass spectrometers to identify low-abundance proteins in blood [4], we are only beginning to understand the overall effect of long-term storage and freeze/thaw cycles.

While many of these issues can be resolved by establishing standard operating procedures (SOPs), there are more ominous challenges. Let's consider a liver tumor that is secreting a highly specific biomarker into the circulatory system. The concentration of this marker is very high in the immediate vicinity of the tumor. Most biomarker discovery efforts that analyze biofluids, however, scrutinize samples (such as serum and plasma) that are collected at the patient's elbow. This distance allows the biomarker to travel through thousands of miles of veins, arteries, and capillaries in which it may be diluted to a vanishingly small concentration. Another physiological challenge involves the non-biased approach taken for biomarker discovery. On the surface it appears that most studies are trying to find the proverbial “needle-in-a-haystack”. Unfortunately, the situation is even more dire than this analogy suggests. In a typical study design in which vast numbers of proteins identified in biofluids collected from disease-affected patients are compared to matched controls, tens to hundreds of differences in protein abundances can be detected. The fundamental problem is that we lack insight into which of these differences are related specifically to the condition being studied. Our inability to immediately recognize potential biomarkers that could be successfully validated essentially relegates these studies to finding a “needle-in-a-needlestack”.

Section snippets

Quantitative strategies for biomarker detection

To identify novel diagnostic and therapeutic biomarkers, investigators focus on the discovery of proteins that are more or less abundant in samples obtained from patients with a specific disease compared to those acquired from healthy matched control patients. There are a number of different MS-based methods for conducting such studies, and each has its particular advantages and disadvantages.
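Whatever the measurement platform, the comparison described here reduces to asking how much each protein's abundance differs between the disease and control groups. The sketch below illustrates one common way of expressing that difference, a log2 fold change of group means. It is an illustrative assumption, not a procedure given in this review: the function names, the protein labels, the abundance values, and the twofold cutoff are all hypothetical, and no statistical testing is performed.

```python
import math
from statistics import mean


def log2_fold_changes(disease, control):
    """Map each protein found in both groups to log2(mean disease / mean control).

    `disease` and `control` are dicts of protein name -> list of positive
    abundance measurements (arbitrary but consistent units).
    """
    return {
        protein: math.log2(mean(disease[protein]) / mean(control[protein]))
        for protein in disease.keys() & control.keys()
    }


def flag_candidates(fold_changes, threshold=1.0):
    """Return proteins whose |log2 fold change| meets the (assumed) threshold.

    A threshold of 1.0 corresponds to a twofold increase or decrease.
    """
    return sorted(p for p, fc in fold_changes.items() if abs(fc) >= threshold)


# Hypothetical abundances for three proteins, two samples per group.
disease = {"protein_A": [400, 400], "protein_B": [100, 110], "protein_C": [25, 25]}
control = {"protein_A": [100, 100], "protein_B": [100, 100], "protein_C": [100, 100]}

changes = log2_fold_changes(disease, control)
candidates = flag_candidates(changes)
print(candidates)
```

With these invented numbers, protein_A (fourfold higher) and protein_C (fourfold lower) are flagged, while protein_B is not. A list produced this way is only a set of candidates: as noted in the Introduction, many abundance differences are not specific to the condition being studied.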

Protein and peak identification

While a large number of peaks or identified proteins can be quantitatively measured using the approaches described above, the difficulty is how to turn this data into potentially validatable biomarkers. For those quantitative approaches that deal with identified proteins or peptides (i.e. 2D-PAGE, stable-isotope labeling, subtractive proteomics) there are a number of informatics solutions available for performing mass mapping and database searches using MS/MS spectra. A variety of algorithms

Targeted approaches to quantitate biomarkers

Most of this review has focused on unbiased methods for finding novel biomarkers that can be used for diagnosis or therapeutic monitoring. The general thought is that MS will play a major role in discovery; however, the validation and routine monitoring of biomarkers will be accomplished through the development of affinity reagents such as antibodies. The production of a highly specific, high affinity antibody to a newly discovered protein biomarker, however, is never a certainty

Conclusions

The advances made in proteomic technology, primarily in the field of MS, have equipped us with the ability to scrutinize proteome samples to a far greater extent than ever before possible. As described in this article, there are many options available for measuring the relative abundances of proteins in clinical samples. Unfortunately, the number of biomarkers that have ultimately been successfully validated using these discovery approaches is discouraging. The fault for this fact, however, does not

Acknowledgements

This project has been funded in whole or in part with federal funds from the National Cancer Institute, National Institutes of Health, under Contract NO1-CO-12400. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the United States Government.

References (28)

  • T. Kislinger et al., J. Am. Soc. Mass Spectrom. (2005)
  • M.C. Pietrogrande et al., J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. (2006)
  • E.B. Altintas et al., J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. (2006)
  • E.F. Petricoin et al., Lancet (2002)
  • S.E. Ong et al., Methods (2003)
  • K.A. Conrads et al., Mol. Cell. Proteomics (2005)
  • E. Kolker et al., Trends Microbiol. (2006)
  • W. Sun et al., Mol. Cell. Proteomics (2004)
  • Y. Liu, Technol. Cancer Res. Treat. (2006)
  • N. Escher et al., Eur. J. Cancer (2006)
  • D.H. Chace et al., Clin. Biochem. (2005)
  • L. Anderson et al., Mol. Cell. Proteomics (2006)
  • F.H. Schroder, Can. J. Urol. (2005)

This paper was presented at Biomarker Discovery by Mass Spectrometry, Amsterdam, The Netherlands, 18–19 May 2006.