Abstract
It is well accepted that sets of genes must act in concert to drive cellular processes. However, under different biological phenotypes, not all members of a gene set participate in a given biological process. Hence, it is useful to construct a discriminative classifier by focusing on the core members (a subset) of a highly informative gene set. Such analyses can reveal which subsets of the same gene set correspond to different biological phenotypes. In this study, we propose the Gene Set Top Scoring Pairs (GSTSP) approach, which exploits the simple yet powerful relative expression reversal concept at the gene set level to achieve these goals. To illustrate the usefulness of GSTSP, we applied the method to five different human heart failure gene expression data sets. We take advantage of the direct data integration feature of the GSTSP approach to combine two data sets, identify a discriminative gene set from >190 predefined gene sets, and evaluate the predictive power of the GSTSP classifier derived from this informative gene set on three independent test sets (79.31% test accuracy). The discriminative gene pairs identified in this study may provide new biological understanding of the disturbed pathways involved in the development of heart failure. The GSTSP methodology is general purpose and applicable to a variety of phenotypic classification problems using gene expression data.
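The relative expression reversal idea mentioned in the abstract can be illustrated with a small sketch. This is not the chapter's implementation: it assumes the standard top scoring pair score, |P(X_a < X_b | class 0) - P(X_a < X_b | class 1)|, and assumes that the gene-set variant simply restricts candidate pairs to members of one predefined gene set. The function names, gene identifiers, and toy expression values are all hypothetical.

```python
from itertools import combinations


def pair_score(expr, labels, gene_a, gene_b):
    """Return |P(a < b | class 0) - P(a < b | class 1)| for one gene pair.

    expr maps a gene name to a list of expression values (one per sample);
    labels holds the class (0 or 1) of each sample. Both classes are
    assumed to contain at least one sample.
    """
    freqs = []
    for cls in (0, 1):
        samples = [i for i, y in enumerate(labels) if y == cls]
        hits = sum(expr[gene_a][i] < expr[gene_b][i] for i in samples)
        freqs.append(hits / len(samples))
    return abs(freqs[0] - freqs[1])


def top_scoring_pair(expr, labels, gene_set):
    """Score every pair inside gene_set; return (score, gene_a, gene_b)."""
    genes = sorted(g for g in gene_set if g in expr)
    return max(
        (pair_score(expr, labels, a, b), a, b)
        for a, b in combinations(genes, 2)
    )


# Hypothetical toy data: three genes, six samples, two phenotypes.
expr = {
    "G1": [1.0, 2.0, 1.5, 5.0, 6.0, 5.5],
    "G2": [4.0, 5.0, 4.5, 2.0, 1.0, 2.5],
    "G3": [0.5, 7.0, 3.0, 3.0, 0.5, 7.0],
}
labels = [0, 0, 0, 1, 1, 1]

best = top_scoring_pair(expr, labels, {"G1", "G2", "G3"})
print(best)
```

Because the score depends only on the within-sample ordering of two genes, it is unaffected by monotone per-sample normalization, which is one plausible reason rank-based pairs lend themselves to combining data sets from different studies.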
Copyright information
© 2012 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Tan, A.C. (2012). Employing Gene Set Top Scoring Pairs to Identify Deregulated Pathway-Signatures in Dilated Cardiomyopathy from Integrated Microarray Gene Expression Data. In: Wang, J., Tan, A., Tian, T. (eds) Next Generation Microarray Bioinformatics. Methods in Molecular Biology, vol 802. Humana Press. https://doi.org/10.1007/978-1-61779-400-1_23
DOI: https://doi.org/10.1007/978-1-61779-400-1_23
Publisher Name: Humana Press
Print ISBN: 978-1-61779-399-8
Online ISBN: 978-1-61779-400-1
eBook Packages: Springer Protocols