Abstract
Discovery of “signature” protein profiles that distinguish
disease states (e.g., malignant, benign, and normal) is a key step
toward translating recent advances in proteomic technologies
into clinical utility. Protein data generated by mass
spectrometers are, however, large and have complex
features, owing both to the complexity of biological specimens and to
interfering biochemical and physical processes in the measurement
procedure. Making sense of such high-dimensional, complex
data is challenging and requires a systematic data-analytic
strategy. We propose here a data processing strategy that addresses
two major issues in the analysis of such
mass-spectrometry-generated proteomic data: (1) separation of
protein “signals” from background “noise” in protein
intensity measurements and (2) calibration of protein mass/charge
measurements across samples. We illustrate both issues and
the utility of the proposed strategy using data from a prostate
cancer biomarker discovery project.