Abstract
Over the last decade, biological tasks have become much more complex, and solving them often requires the simultaneous use of several different resources. Typical bioinformatics scenarios include gene functional studies, the study of microRNA single nucleotide polymorphisms in cancer, and the study of protein motifs linked to specific cellular pathways. Available bioinformatics tools contribute greatly to solving such problems, but they still demand considerable, time-consuming effort. In this work, we highlight how the Gremlin graph traversal language can also serve as a lingua franca and as a valid middleware tool in the implementation of, for instance, high-level web-based tools for solving complex biological tasks. Gremlin queries are tested via a web interface to our previously developed integrated graph database, BioGraphDB.
Notes
1. The Gremlin version highlighted in this work is from TinkerPop 2.x. TinkerPop is now a part of the Apache Software Foundation and the current release is TinkerPop 3.x, available at http://tinkerpop.incubator.apache.org.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Fiannaca, A. et al. (2017). Gremlin Language for Querying the BiographDB Integrated Biological Database. In: Rojas, I., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2017. Lecture Notes in Computer Science, vol 10208. Springer, Cham. https://doi.org/10.1007/978-3-319-56148-6_26
DOI: https://doi.org/10.1007/978-3-319-56148-6_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56147-9
Online ISBN: 978-3-319-56148-6
eBook Packages: Computer Science, Computer Science (R0)