Gremlin Language for Querying the BiographDB Integrated Biological Database

  • Conference paper
Bioinformatics and Biomedical Engineering (IWBBIO 2017)

Abstract

In the last decade, biological tasks have become much more complex, and solving them often requires the simultaneous use of several different resources. Typical bioinformatics scenarios include gene functional studies, the study of microRNA single nucleotide polymorphisms in cancer, and the study of protein motifs linked to specific cellular pathways. Available bioinformatics tools contribute substantially to solving such problems, but they still demand considerable, time-consuming effort. In this work, we highlight how the Gremlin graph traversal language can also serve as a lingua franca and as a valid middleware tool in the implementation of, for instance, high-level web-based tools for solving complex biological tasks. Gremlin queries are tested via a web interface to our previously developed integrated graph database, BioGraphDB.

Notes

  1. The Gremlin version highlighted in this work is from TinkerPop 2.x. TinkerPop is now a part of the Apache Software Foundation and the current release is TinkerPop 3.x, available at http://tinkerpop.incubator.apache.org.

Author information

Corresponding author

Correspondence to Massimo La Rosa.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Fiannaca, A. et al. (2017). Gremlin Language for Querying the BiographDB Integrated Biological Database. In: Rojas, I., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2017. Lecture Notes in Computer Science, vol 10208. Springer, Cham. https://doi.org/10.1007/978-3-319-56148-6_26

  • DOI: https://doi.org/10.1007/978-3-319-56148-6_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-56147-9

  • Online ISBN: 978-3-319-56148-6

  • eBook Packages: Computer Science, Computer Science (R0)
