Data Warehousing with TargetMine for Omics Data Analysis

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1986))

Abstract

Most biological processes, including diseases, are multifactorial and determined by a complex interplay of genetic and environmental factors. This chapter provides a user guide to data querying, analysis, and visualization with TargetMine and its auxiliary toolkit. We also discuss some commonly used data queries for researchers interested in gene set analysis within a data warehouse framework. Overall, TargetMine provides a convenient web browser-based interface for discovering new hypotheses interactively: users can analyze omics data with complex searches without any scripting or programming, and the results are presented in an easy-to-comprehend output format.

Author information

Correspondence to Lokesh P. Tripathi or Kenji Mizuguchi.

Copyright information

© 2019 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Chen, YA., Tripathi, L.P., Mizuguchi, K. (2019). Data Warehousing with TargetMine for Omics Data Analysis. In: Bolón-Canedo, V., Alonso-Betanzos, A. (eds) Microarray Bioinformatics. Methods in Molecular Biology, vol 1986. Humana, New York, NY. https://doi.org/10.1007/978-1-4939-9442-7_3

  • DOI: https://doi.org/10.1007/978-1-4939-9442-7_3

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-4939-9441-0

  • Online ISBN: 978-1-4939-9442-7

  • eBook Packages: Springer Protocols
