Abstract
The biomedical research community provides large-scale data sources to enable knowledge discovery from the data alone, or from novel scientific experiments in combination with existing knowledge. Semantic Web technologies, including ontologies, triple stores and combinations thereof, are increasingly being developed and used. Both the amount and the complexity of the data are constantly increasing. Since the data sources are publicly available, the amount of content can be measured, giving an overview of the accessible content and also of the state of the data representation in comparison to the existing content. To better understand the existing data resources, i.e. to judge the distribution of data triples across concepts, data types and primary providers, we have performed a comprehensive analysis that delivers an overview of the content accessible to semantic Web solutions from publicly accessible data servers. The analysis shows that information related to genes, proteins and chemical entities forms the core, whereas content related to diseases and pathways forms a smaller portion. As a result, any approach to drug discovery would profit from the data on molecular entities, but would lack content from data resources that represent disease pathomechanisms.
Notes
1. http://www.who.int/classifications/icd/en/HistoryOfICD.pdf (retr. 10/02/2017).
2. https://bioportal.bioontology.org/ (retr. 10/04/2017).
3. http://www.obofoundry.org/ (retr. 10/04/2017).
4. http://bioportal.bioontology.org/ontologies/ACGT-MO (retr. 10/02/2017).
5. http://www.biopax.org/ (retr. 10/02/2017).
6. http://www.ebi.ac.uk/efo/ (retr. 10/02/2017).
7. http://www.geneontology.org/ (retr. 10/02/2017).
8. http://www.nlm.nih.gov/mesh/ (retr. 10/02/2017).
9. http://bioportal.bioontology.org/ontologies/MO (retr. 10/02/2017).
10. http://ncit.nci.nih.gov (retr. 10/02/2017).
11. http://obi-ontology.org/page/Main_Page (retr. 31/01/2017).
12. http://www.nlm.nih.gov/research/umls/about_umls.html (retr. 10/02/2017).
13. http://www.nlm.nih.gov/research/umls/rxnorm (retr. 22/02/2017).
14. http://ontology.buffalo.edu/bfo/ (retr. 10/03/2017).
15. http://obo.sourceforge.net/relationship/ (retr. 10/03/2017).
16. http://bioportal.bioontology.org/ontologies/PROVO/ (retr. 25/01/2017).
17. http://www.ncbi.nlm.nih.gov/genbank/ (retr. 10/01/2017).
18. http://www.ebi.ac.uk/arrayexpress/ (retr. 12/01/2017).
19. http://www.ncbi.nlm.nih.gov/geo/ (retr. 12/01/2017).
20. http://lifesciencedb.jp/cged/ (retr. 12/01/2017).
21. http://www.uniprot.org/.
22. http://www.hprd.org/ (retr. 20/08/2015).
23. http://www.pdb.org (retr. 20/08/2015).
24. http://www.genome.jp/kegg/ (retr. 12/01/2017).
25. http://www.reactome.org (retr. 12/01/2017).
26. http://cbio.mskcc.org/software/cpath/ (retr. 12/01/2017).
27. http://urlm.co/www.chembase.com#web (retr. 12/07/2017).
28. https://www.sigmaaldrich.com/catalog/ (retr. 18/04/2017).
29. http://cdb.ics.uci.edu/ (retr. 12/05/2017).
30. http://www.ebi.ac.uk/chebi/ (retr. 12/01/2017).
31. http://pubchem.ncbi.nlm.nih.gov/ (retr. 12/01/2017).
32. http://actor.epa.gov/actor/faces/ACToRHome.jsp (retr. 12/01/2017).
33. http://clinicaltrials.gov/ (retr. 10/01/2017).
34. http://toxnet.nlm.nih.gov/ (retr. 12/01/2017).
35. http://www.dsld.nlm.nih.gov/dsld/ (retr. 20/03/2017).
36. http://repairtoire.genesilico.pl/ (retr. 14/01/2017).
37. http://www.ncbi.nlm.nih.gov/pubmed (retr. 22/02/2017).
38. http://ods.od.nih.gov/research/PubMed_Dietary_Supplement_Subset.aspx (retr. 12/03/2017).
39. http://bioportal.bioontology.org/ (retr. 20/02/2016).
40. http://www.obofoundry.org/ (retr. 22/02/2016).
41.
42. http://www.ebi.ac.uk/ontology-lookup/ (retr. 22/02/2016).
43. http://amigo.geneontology.org/cgi-bin/amigo/go.cgi (retr. 22/02/2016).
44. http://www.ncbi.nlm.nih.gov/sites/gquery (retr. 18/02/2016).
45. http://www.e-meducation.org (retr. 18/02/2016).
46. http://www.w3.org/blog/SWEO/page-2 (retr. 05/02/2017).
47. http://www.w3.org/wiki/HCLSIG/LODD (retr. 05/02/2017).
48. http://swse.deri.org/ (retr. 27/04/2016).
49. http://bio2rdf.org (retr. 05/02/2017).
50. https://github.com/bio2rdf/bio2rdf-scripts/wiki (retr. 05/02/2017).
51. http://linkedlifedata.com (retr. 05/02/2017).
Acknowledgements
The work presented in this paper has been partly funded by the EU FP7 GRANATUM project (project number 270139) and by Science Foundation Ireland under Grant No. SFI/12/RC/2289.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Hasnain, A., Rebholz-Schuhmann, D. (2017). Biomedical Semantic Resources for Drug Discovery Platforms. In: Blomqvist, E., Hose, K., Paulheim, H., Ławrynowicz, A., Ciravegna, F., Hartig, O. (eds) The Semantic Web: ESWC 2017 Satellite Events. ESWC 2017. Lecture Notes in Computer Science, vol 10577. Springer, Cham. https://doi.org/10.1007/978-3-319-70407-4_34
DOI: https://doi.org/10.1007/978-3-319-70407-4_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70406-7
Online ISBN: 978-3-319-70407-4
eBook Packages: Computer Science, Computer Science (R0)