Many InChIs and quite some feat

  • WARR’S PIECE
  • Published:
Journal of Computer-Aided Molecular Design

References

  1. Annies M (2009) Full-text prior art and chemical structure searching in e-journals and on the internet—a patent information professional’s perspective. World Pat Inf 31(4):278–284

    CAS  Google Scholar 

  2. Frey J (2006) Using InChI. Chem Int 28(6):14–15

    Google Scholar 

  3. Heller SR, McNaught AD (2009) The IUPAC international chemical identifier (InChI). Chem Int 31(1):7–9

    CAS  Google Scholar 

  4. Heller S, McNaught A, Stein S, Tchekhovskoi D, Pletnev I (2013) InChI—the worldwide chemical structure identifier standard. J Cheminformatics 5:7

    CAS  Google Scholar 

  5. Rossler U (2012) Storage of structural formulas as text. Nachr Chem 60(2):140–142

    CAS  Google Scholar 

  6. Williams AJ (2012) InChI: connecting and navigating chemistry. J Cheminformatics 4:33

    CAS  Google Scholar 

  7. Yerin A, McNaught A, Heller S (2013) Current status and future development in relation to IUPAC activities. Chem Int 35(6):12–15

    CAS  Google Scholar 

  8. McNaught A (2006) The IUPAC chemical identifier. Chem Int 28(6):12–14

    CAS  Google Scholar 

  9. Heller S, McNaught A, Pletnev I, Stein S, Tchekhovskoi D (2015) InChI, the IUPAC international chemical identifier. J Cheminformatics 7(1):23

  10. Bachrach SM (2012) InChI: a user’s perspective. J Cheminformatics 4:34

    CAS  Google Scholar 

  11. Warr WA (2011) Representation of chemical structures. Wiley Interdiscip Rev Comput Mol Sci 1(4):557–579

    CAS  Google Scholar 

  12. McKay BD (1981) Practical graph isomorphism. Congr Numeratium 30:45–87

    Google Scholar 

  13. Morgan HL (1965) The generation of a unique machine description for chemical structures—a technique developed at chemical abstracts service. J Chem Doc 5(2):107–113

    CAS  Google Scholar 

  14. Southan C (2013) InChI in the wild: an assessment of InChIKey searching in Google. J Cheminformatics 5:10

    CAS  Google Scholar 

  15. Pletnev I, Erin A, McNaught A, Blinov K, Tchekhovskoi D, Heller S (2012) InChIKey collision resistance: an experimental testing. J Cheminformatics 4:39

    CAS  Google Scholar 

  16. Grethe G, Goodman J, Allen C (2013) International chemical identifier for chemical reactions. J Cheminformatics 5(Suppl 1):O16

    Google Scholar 

  17. Dalby A, Nourse JG, Hounshell WD, Gushurst AKI, Grier DL, Leland BA, Laufer J (1992) Description of several chemical structure file formats used by computer programs developed at Molecular Design Limited. J Chem Inf Comput Sci 32(3):244–255

    CAS  Google Scholar 

  18. Gobbi A, Lee M-L (2012) Handling of tautomerism and stereochemistry in compound registration. J Chem Inf Model 52(2):285–292

    CAS  Google Scholar 

  19. Murray-Rust P, Adams S, Downing J, Townsend J, Zhang Y (2011) The semantic architecture of the World-Wide Molecular Matrix (WWMM). J Cheminformatics 3(1):42

    CAS  Google Scholar 

  20. Tallapragada K, Chewning J, Kombo D, Ludwick B (2012) Making SharePoint chemically aware. J Cheminformatics 4(1):1

    CAS  Google Scholar 

  21. Townsend J, Murray-Rust P (2011) CMLLite: a design philosophy for CML. J Cheminformatics 3(1):39

    CAS  Google Scholar 

  22. Cannon EO (2012) New benchmark for chemical nomenclature software. J Chem Inf Model 52(5):1124–1131

    CAS  Google Scholar 

  23. Drefahl A (2011) CurlySMILES: a chemical language to customize and annotate encodings of molecular and nanodevice structures. J Cheminformatics 3(1):1

    CAS  Google Scholar 

  24. Gilson MK, Georg G, Wang S (2014) Digital chemistry in the journal of medicinal chemistry. J Med Chem 57(4):1137

    CAS  Google Scholar 

  25. Weininger D (1988) SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J Chem Inf Comput Sci 28(1):31–36

    CAS  Google Scholar 

  26. Weininger D, Weininger A, Weininger JL (1989) SMILES. 2. Algorithm for generation of unique SMILES notation. J Chem Inf Comput Sci 29(2):97–101

    CAS  Google Scholar 

  27. Ash S, Cline MA, Homer RW, Hurst T, Smith GB (1997) SYBYL Line Notation (SLN): a versatile language for chemical structure representation. J Chem Inf Comput Sci 37(1):71–79

    CAS  Google Scholar 

  28. Homer RW, Swanson J, Jilek RJ, Hurst T, Clark RD (2008) SYBYL line notation (SLN): a single notation to represent chemical structures, queries, reactions, and virtual libraries. J Chem Inf Model 48(12):2294–2307

    CAS  Google Scholar 

  29. Warr WA (2010) Tautomerism in chemical information management systems. J Comput Aided Mol Des 24(6–7):497–520

    CAS  Google Scholar 

  30. Downing J, Murray-Rust P, Tonge AP, Morgan P, Rzepa HS, Cotterill F, Day N, Harvey MJ (2008) SPECTRa: the deposition and validation of primary chemistry research data in digital repositories. J Chem Inf Model 48(8):1571–1581

    CAS  Google Scholar 

  31. Murray-Rust P, Rzepa H (2011) CML: evolution and design. J Cheminformatics 3(1):44

    Google Scholar 

  32. Fanton M, Floris M, Cristiani A, Olla S, Medda R, Sabbadin D, Bulfone A, Moro S (2013) MMsDusty: an alternative InChI-based tool to minimize chemical redundancy. Mol Inf 32(8):681–684

    CAS  Google Scholar 

  33. Gregori-Puigjané E, Garriga-Sust R, Mestres J (2011) Indexing molecules with chemical graph identifiers. J Comput Chem 32(12):2638–2646

    Google Scholar 

  34. Ihlenfeldt W-D (2012) Comment on “Indexing molecules with chemical graph Identifiers”. J Comput Chem 33(2):237

    CAS  Google Scholar 

  35. Carbonell P, Carlsson L, Faulon J-L (2013) Stereo signature molecular descriptor. J Chem Inf Model 53(4):887–897

    CAS  Google Scholar 

  36. Cho YS, No KT, Cho KH (2012) yaInChI: modified InChI string scheme for line notation of chemical structures. SAR QSAR Environ Res 23(3–4):237–255

    CAS  Google Scholar 

  37. Brown ID, Abrahams SC, Berndt M, Faber J, Karen VL, Motherwell WDS, Villars P, Westbrook JD, McMahon B (2005) Report of the working group on crystal phase identifiers. Acta Crystallogr Sect A: Found Crystallogr A61(6):575–580

    CAS  Google Scholar 

  38. Coles SJ, Frey JG, Hursthouse MB, Light ME, Milsted AJ, Carr LA, DeRoure D, Gutteridge CJ, Mills HR, Meacham KE, Surridge M, Lyon E, Heery R, Duke M, Day M (2006) An e-science environment for service crystallography from submission to dissemination. J Chem Inf Model 46(3):1006–1016

    CAS  Google Scholar 

  39. Burgess DR, Manion JA, Hayes CJ (2014) Data formats for elementary gas phase kinetics, Part 1: unique representations of species at the molecular level. Int J Chem Kinet 46(10):640–650

    CAS  Google Scholar 

  40. Burgess DR, Manion JA, Hayes CJ (2015) Data formats for elementary gas-phase kinetics: Part 2. unique representations of reactions. Int J Chem Kinet 47(5):334–350

    CAS  Google Scholar 

  41. Chambers J, Davies M, Gaulton A, Papadatos G, Hersey A, Overington J (2014) UniChem: extension of InChI-based compound mapping to salt, connectivity and stereochemistry layers. J Cheminformatics 6(1):43

    Google Scholar 

  42. Tropsha A, Williams A (2012) How many miles have we gone, InChI by InChI? Chem Int 34(5):33

    Google Scholar 

  43. Ihlenfeldt W, Bolton E, Bryant S (2009) The PubChem chemical structure sketcher. J Cheminformatics 1(1):20

    Google Scholar 

  44. Trepalin SV, Yarkov AV, Pletnev IV, Gakh AA (2006) A Java chemical structure editor supporting the modular chemical descriptor language (MCDL). Molecules 11(4):129–141

    CAS  Google Scholar 

  45. Gakh A, Burnett M, Trepalin S, Yarkov A (2011) Modular chemical descriptor language (MCDL): stereochemical modules. J Cheminformatics 3(1):5

    CAS  Google Scholar 

  46. BKChem. http://bkchem.zirael.org/index.html. Accessed 17 Apr 2015

  47. Kochev NT, Paskaleva VH, Jeliazkova N (2013) Ambit-Tautomer: an open source tool for tautomer generation. Mol Inf 32(5–6):481–504

    CAS  Google Scholar 

  48. Sitzmann M, Filippov IV, Nicklaus MC (2008) Internet resources integrating many small-molecules databases. SAR QSAR Environ Res 19(1–2):1–9

    CAS  Google Scholar 

  49. Kos A, Himmler H-J (2010) CWM global search—the internet search engine for chemists and biologists. Future Internet 2(4):635–644

    Google Scholar 

  50. Monge A, Arrault A, Marot C, Morin-Allory L (2006) Managing, profiling and analyzing a library of 2.6 million compounds gathered from 32 chemical providers. Mol Divers 10(3):389–403

    CAS  Google Scholar 

  51. Chepelev L, Dumontier M (2011) Semantic Web integration of cheminformatics resources with the SADI framework. J Cheminformatics 3(1):16

    CAS  Google Scholar 

  52. Spanton SG, Whittern D (2009) The development of an NMR chemical shift prediction application with the accuracy necessary to grade proton NMR spectra for identity. Magn Reson Chem 47(12):1055–1061

    CAS  Google Scholar 

  53. Spjuth O, Berg A, Adams S, Willighagen EL (2013) Applications of the InChI in cheminformatics with the CDK and bioclipse. J Cheminformatics 5:14

    CAS  Google Scholar 

  54. Spjuth O, Eklund M, Ahlberg Helgee E, Boyer S, Carlsson L (2011) Integrated decision support for assessing chemical liabilities. J Chem Inf Model 51(8):1840–1847

    CAS  Google Scholar 

  55. Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, Light Y, McGlinchey S, Michalovich D, Al-Lazikani B, Overington JP (2012) ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res 40(D1):D1100–D1107

    CAS  Google Scholar 

  56. Hersey A, Chambers J, Bellis L, Patrícia Bento A, Gaulton A, Overington JP (2015) Chemical databases: curation or integration by user-defined equivalence? Drug Discov Today Technol. Online 11 March 2015

  57. Muresan S, Petrov P, Southan C, Kjellberg MJ, Kogej T, Tyrchan C, Varkonyi P, Xie PH (2011) Making every SAR point count: the development of chemistry connect for the large-scale integration of structure and bioactivity data. Drug Discov Today 16(23–24):1019–1030

    CAS  Google Scholar 

  58. Muresan S, Sitzmann M, Southan C (2012) Mapping between databases of compounds and protein targets. In: Larson RS (ed) Bioinformatics and drug discovery, vol 910. Humana Press, New York, pp 145–164

    Google Scholar 

  59. Pawson AJ, Sharman JL, Benson HE, Faccenda E, Alexander SPH, Buneman PO, Davenport AP, McGrath JC, Peters JA, Southan C, Spedding M, Yu W, Harmar AJ, NC-IUPHAR (2014) The IUPHAR/BPS guide to pharmacology: an expert-driven knowledgebase of drug targets and their ligands. Nucleic Acids Res 42(D1):D1098–D1106

    CAS  Google Scholar 

  60. Southan C, Sitzmann M, Muresan S (2013) Comparing the chemical structure and protein content of ChEMBL, DrugBank, human metabolome database and the therapeutic target database. Mol Inf 32(11–12):881–897

    CAS  Google Scholar 

  61. Wassermann AM, Bajorath J (2011) BindingDB and ChEMBL: online compound databases for drug discovery. Expert Opin Drug Discov 6(7):683–687

    CAS  Google Scholar 

  62. Willighagen E, Waagmeester A, Spjuth O, Ansell P, Williams A, Tkachenko V, Hastings J, Chen B, Wild D (2013) The ChEMBL database as linked open data. J Cheminformatics 5(1):23

    CAS  Google Scholar 

  63. Nowotka M, Davies M, Papadatos G, Overington JP (2014) ChEMBL Beaker: a lightweight web framework providing robust and extensible cheminformatics services. Challenges 5(2):444–449

    Google Scholar 

  64. Rose PW, Beran B, Bi C, Bluhm WF, Dimitropoulos D, Goodsell DS, Prlić A, Quesada M, Quinn GB, Westbrook JD, Young J, Yukich B, Zardecki C, Berman HM, Bourne PE (2011) The RCSB Protein Data Bank: redesigned web site and web services. Nucleic Acids Res 39(Suppl 1):D392–D401

    CAS  Google Scholar 

  65. Java Native Interface InChI Wrapper http://sourceforge.net/projects/jni-inchi. Accessed 17 Apr 2015

  66. Ninja, an InChI toolkit for Java. http://sourceforge.net/projects/ninja. Accessed 17 Apr 2015

  67. O’Boyle N, Banck M, James C, Morley C, Vandermeersch T, Hutchison G (2011) Open Babel: an open chemical toolbox. J Cheminformatics 3(1):33

    Google Scholar 

  68. O’Boyle NM, Morley C, Hutchison GR (2008) Pybel: a Python wrapper for the OpenBabel cheminformatics toolkit. Chem Cent J 2:5

    Google Scholar 

  69. Smith R, Williamson R, Ventura D, Prince J (2013) Rubabel: wrapping open Babel with Ruby. J Cheminformatics 5(1):35

    CAS  Google Scholar 

  70. Will T, Hutter MC, Jauch J, Helms V (2013) Batch tautomer generation with MolTPC. J Comput Chem 34(28):2485–2492

    CAS  Google Scholar 

  71. Day AE, Coles SJ, Bird CL, Frey JG, Whitby RJ, Tkachenko VE, Williams AJ (2015) ChemTrove: enabling a generic ELN to support chemistry through the use of transferable plug-ins and online data sources. J Chem Inf Model 55(3):501–509

    CAS  Google Scholar 

  72. Hettne K, Williams A, van Mulligen E, Kleinjans J, Tkachenko V, Kors J (2010) Automatic versus manual curation of a multi-source chemical dictionary: the impact on text mining. J Cheminformatics 2(1):3

    Google Scholar 

  73. Williams A, Tkachenko V (2014) The Royal Society of Chemistry and the delivery of chemistry data repositories for the community. J Comput-Aided Mol Des 28(10):1023–1030

    CAS  Google Scholar 

  74. Haraldsdottir H, Thiele I, Fleming R (2014) Comparative evaluation of open source software for mapping between metabolite identifiers in metabolic network reconstructions: application to Recon 2. J Cheminformatics 6(1):2

    Google Scholar 

  75. Wohlgemuth G, Haldiya PK, Willighagen E, Kind T, Fiehn O (2010) The Chemical Translation Service-a web-based tool to improve standardization of metabolomic reports. Bioinformatics 26(20):2647–2648

    CAS  Google Scholar 

  76. O’Boyle NM (2012) Towards a universal SMILES representation—a standard method to generate canonical SMILES based on the InChI. J Cheminformatics 4:22

    Google Scholar 

  77. Banville DL (ed) (2008) Chemical information mining: facilitating literature-based discovery. CRC Press, Boca Raton

    Google Scholar 

  78. Jessop D, Adams S, Murray-Rust P (2011) Mining chemical information from open patents. J Cheminformatics 3(1):40

    CAS  Google Scholar 

  79. Jessop D, Adams S, Willighagen E, Hawizy L, Murray-Rust P (2011) OSCAR4: a flexible architecture for chemical text-mining. J Cheminformatics 3(1):41

    CAS  Google Scholar 

  80. Klinger R, Kolarik C, Fluck J, Hofmann-Apitius M, Friedrich CM (2008) Detection of IUPAC and IUPAC-like chemical names. Bioinformatics 24(13):i268–i276

    CAS  Google Scholar 

  81. Kuhn M, Szklarczyk D, Pletscher-Frankild S, Blicher TH, von Mering C, Jensen LJ, Bork P (2014) STITCH 4: integration of protein–chemical interactions with user data. Nucleic Acids Res 42(Database issue):D401–D407

    CAS  Google Scholar 

  82. Rhodes J, Boyer S, Kreulen J, Chen Y, Ordonez P (2007) Mining patents using molecular similarity search. In: Altman R, Murray T, Klein T, Dunker A, Hunter L (eds) Pacific symposium on biocomputing 2007, Maui, HI, United States, Jan 3–7, 2007. World Scientific Publishing Company, Singapore, pp 304–315

    Google Scholar 

  83. Southan C, Stracz A (2013) Extracting and connecting chemical structures from text sources using chemicalize.org. J Cheminformatics 5:20

    CAS  Google Scholar 

  84. Williams AJ, Yerin A (2008) Automated identification and conversion of chemical names to structure-searchable information. In: Banville DL (ed) Chemical information mining. CRC Press, Boca Raton, pp 21–44

    Google Scholar 

  85. Zimmermann M, Fluck J, Thi LT, Kolarik C, Kumpf K, Hofmann M (2005) Information extraction in the life sciences: perspective for medicinal chemistry, pharmacology and toxicology. Curr Top Med Chem 5(8):785–796

    CAS  Google Scholar 

  86. Hettne KM, Stierum RH, Schuemie MJ, Hendriksen PJM, Schijvenaars BJA, Mulligen EMv, Kleinjans J, Kors JA (2009) A dictionary to identify small molecules and drugs in free text. Bioinformatics 25(22):2983–2991

    CAS  Google Scholar 

  87. McDaniel JR, Balmuth JR (1992) Kekule: OCR-optical chemical (structure) recognition. J Chem Inf Comput Sci 32(4):373–378

    CAS  Google Scholar 

  88. Park J, Rosania G, Shedden K, Nguyen M, Lyu N, Saitou K (2009) Automated extraction of chemical structure information from digital raster images. Chem Cent J 3(1):1–16

    CAS  Google Scholar 

  89. Simon A, Johnson AP (1997) Recent advances in the CLiDE project: logical layout analysis of chemical documents. J Chem Inf Comput Sci 37(1):109–116

    CAS  Google Scholar 

  90. Valko AT, Johnson AP (2009) CLiDE Pro: the latest generation of CLiDE, a tool for optical chemical structure recognition. J Chem Inf Model 49(4):780–787

    CAS  Google Scholar 

  91. Zimmermann M (2007) Über die Kunst, dem Rechner das Lesen beizubringen. (The art of teaching the computer to read). Nachr Chem 55(10):997–999

    CAS  Google Scholar 

  92. Filippov IV, Nicklaus MC (2009) Optical structure recognition software to recover chemical information: OSRA, an open source solution. J Chem Inf Model 49(3):740–743

    CAS  Google Scholar 

  93. Williams AJ, Yerin A (2013) Automated systematic nomenclature generation for organic compounds. Wiley Interdiscip Rev Comput Mol Sci 3(2):150–160

    CAS  Google Scholar 

  94. Bachrach S (2009) Chemistry publication—making the revolution. J Cheminformatics 1(1):2

    Google Scholar 

  95. Borkum M, Frey J (2014) Usage and applications of Semantic Web techniques and technologies to support chemistry research. J Cheminformatics 6(1):18

    Google Scholar 

  96. Casher O, Rzepa HS (2006) Semanticeye: a Semantic Web application to rationalize and enhance chemical electronic publishing. J Chem Inf Model 46(6):2396–2411

    CAS  Google Scholar 

  97. Casher O, Rzepa HS (2010) Using semantically-enabled components for social web-based scientific collaborations. In: Belford RE, Moore JW, Pence HE (eds) Enhancing learning with online resources, social networking, and digital libraries, ACS symposium series, vol 1060. American Chemical Society, Washington, pp 41–63

    Google Scholar 

  98. Chen B, Ding Y, Wild D (2012) Improving integrative searching of systems chemical biology data using semantic annotation. J Cheminformatics 4(1):6

    CAS  Google Scholar 

  99. Chepelev L, Dumontier M (2011) Chemical Entity Semantic Specification: knowledge representation for efficient semantic cheminformatics and facile data integration. J Cheminformatics 3(1):20

    CAS  Google Scholar 

  100. Choi J, Davis MJ, Newman AF, Ragan MA (2010) A Semantic Web ontology for small molecules and their biological targets. J Chem Inf Model 50(5):732–741

    CAS  Google Scholar 

  101. Coles SJ, Day NE, Murray-Rust P, Rzepa HS, Zhang Y (2005) Enhancement of the chemical semantic web through the use of InChI identifiers. Org Biomol Chem 3(10):1832–1834

    CAS  Google Scholar 

  102. Frey J, De Roure D, Taylor K, Essex J, Mills H, Zaluska E (2006) CombeChem: a case study in provenance and annotation using the Semantic Web. In: Moreau L, Foster I (eds) Provenance and annotation of data, vol 4145. Springer, Berlin, pp 270–277

    Google Scholar 

  103. Frey JG (2009) The value of the Semantic Web in the laboratory. Drug Discov Today 14(11–12):552–561

    Google Scholar 

  104. Frey JG, Bird CL (2013) Cheminformatics and the Semantic Web: adding value with linked data and enhanced provenance. Wiley Interdiscip Rev Comput Mol Sci 3(5):465–481

    CAS  Google Scholar 

  105. Murray-Rust P, Mitchell JBO, Rzepa HS (2005) Communication and re-use of chemical information in bioscience. BMC Bioinf 6:180

    Google Scholar 

  106. Murray-Rust P, Rzepa HS, Tyrrell SM, Zhang Y (2004) Representation and use of chemistry in the global electronic age. Org Biomol Chem 2(22):3192–3203

    CAS  Google Scholar 

  107. O’Boyle N, Guha R, Willighagen E, Adams S, Alvarsson J, Bradley J-C, Filippov I, Hanson R, Hanwell M, Hutchison G, James C, Jeliazkova N, Lang A, Langner K, Lonie D, Lowe D, Pansanel J, Pavlov D, Spjuth O, Steinbeck C, Tenderholt A, Theisen K, Murray-Rust P (2011) Open data, open source and open standards in chemistry: the Blue Obelisk 5 years on. J Cheminformatics 3(1):37

    Google Scholar 

  108. Prasanna MD, Vondrasek J, Wlodawer A, Rodriguez H, Bhat TN (2006) Chemical compound navigator: a web-based chem-BLAST, chemical taxonomy-based search engine for browsing compounds. Proteins Struct Funct Bioinf 63(4):907–917

    CAS  Google Scholar 

  109. Samwald M, Jentzsch A, Bouton C, Kallesoe C, Willighagen E, Hajagos J, Marshall M, Prud’hommeaux E, Hassanzadeh O, Pichler E, Stephens S (2011) Linked open drug data for pharmaceutical research and development. J Cheminformatics 3(1):19

    Google Scholar 

  110. Tanaka K, Aoki-Kinoshita KF, Kotera M, Sawaki H, Tsuchiya S, Fujita N, Shikanai T, Kato M, Kawano S, Yamada I, Narimatsu H (2014) WURCS: the Web3 unique representation of carbohydrate structures. J Chem Inf Model 54(6):1558–1566

    CAS  Google Scholar 

  111. Taylor KR, Gledhill RJ, Essex JW, Frey JG, Harris SW, De Roure DC (2006) Bringing chemical data onto the Semantic Web. J Chem Inf Model 46(3):939–952

    CAS  Google Scholar 

  112. Teixeira AL, Falcao AO (2013) Noncontiguous atom matching structural similarity function. J Chem Inf Model 53(10):2511–2524

    CAS  Google Scholar 

  113. Velden T, Lagoze C (2009) Communicating chemistry. Nat Chem 1(9):673–678

    CAS  Google Scholar 

  114. Williams AJ (2008) Internet-based tools for communication and collaboration in chemistry. Drug Discov Today 13(11–12):502–506

    CAS  Google Scholar 

  115. Williams AJ (2008) Public chemical compound databases. Curr Opin Drug Discov Dev 11(3):393–404

    CAS  Google Scholar 

  116. Willighagen EL, Alvarsson J, Andersson A, Eklund M, Lampa S, Lapins M, Spjuth O, Wikberg JES (2011) Linking the resource description framework to cheminformatics and proteochemometrics. J Biomed Semant 2(Suppl 1):S6

    Google Scholar 

  117. Goldmann D, Montanari F, Richter L, Zdrazil B, Ecker GF (2014) Exploiting open data: a new era in pharmacoinformatics. Future Med Chem 6(5):503–514

    CAS  Google Scholar 

  118. Williams AJ, Harland L, Groth P, Pettifer S, Chichester C, Willighagen EL, Evelo CT, Blomberg N, Ecker G, Goble C, Mons B (2012) Open PHACTS: semantic interoperability for drug discovery. Drug Discov Today 17(21–22):1188–1198

    Google Scholar 

  119. Sharman JL, Mpamhanga CP, Spedding M, Germain P, Staels B, Dacquet C, Laudet V, Harmar AJ (2011) IUPHAR-DB: new receptors and tools for easy searching and visualization of pharmacological data. Nucleic Acids Res 39(Suppl 1):D534–D538

    CAS  Google Scholar 

  120. Southan C, Boppana K, Jagarlapudi S, Muresan S (2011) Analysis of in vitro bioactivity data extracted from drug discovery literature and patents: ranking 1654 human protein targets by assayed compounds and molecular scaffolds. J Cheminformatics 3(1):14

    Google Scholar 

  121. Tiikkainen P, Franke L (2012) Analysis of commercial and public bioactivity databases. J Chem Inf Model 52(2):319–326

    CAS  Google Scholar 

  122. Southan C (2015) Expanding opportunities for mining bioactive chemistry from patents. Drug Discov Today Technol (in press)

  123. Bobach C, Bohme T, Laube U, Puschel A, Weber L (2012) Automated compound classification using a chemical ontology. J Cheminformatics 4(1):40

    CAS  Google Scholar 

  124. de Matos P, Alcántara R, Dekker A, Ennis M, Hastings J, Haug K, Spiteri I, Turner S, Steinbeck C (2010) Chemical entities of biological interest: an update. Nucleic Acids Res 38(Suppl 1):D249–D254

    Google Scholar 

  125. Degtyarenko K, de Matos P, Ennis M, Hastings J, Zbinden M, McNaught A, Alcántara R, Darsow M, Guedj M, Ashburner M (2008) ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Res 36(Suppl 1):D344–D350

    CAS  Google Scholar 

  126. Degtyarenko K, Ennis M, Garavelli JS (2007) “Good annotation practice” for chemical data in biology. Silico Biol 7(Suppl 2):45–56

    Google Scholar 

  127. Degtyarenko K, Hastings J, de Matos P, Ennis M (2009) ChEBI: an open bioinformatics and cheminformatics resource. In: Bateman A, Draghici S, Pearson WR, Stein LD, Yates JR (eds) Current protocols in bioinformatics, vol 26. Wiley, Oxford, pp 14.19.11–14.19.20

  128. Hardy B, Douglas N, Helma C, Rautenberg M, Jeliazkova N, Jeliazkov V, Nikolova I, Benigni R, Tcheremenskaia O, Kramer S, Girschick T, Buchwald F, Wicker J, Karwath A, Gutlein M, Maunz A, Sarimveis H, Melagraki G, Afantitis A, Sopasakis P, Gallagher D, Poroikov V, Filimonov D, Zakharov A, Lagunin A, Gloriozova T, Novikov S, Skvortsova N, Druzhilovsky D, Chawla S, Ghosh I, Ray S, Patel H, Escher S (2010) Collaborative development of predictive toxicology applications. J Cheminformatics 2(1):7

    Google Scholar 

  129. Hastings J, Josephs Z, Steinbeck C (2012) Accessing and using chemical property databases. In: Reisfeld B, Mayeno AN (eds) Computational toxicology, vol 929. Humana Press, New York, pp 193–219

    Google Scholar 

  130. Hastings J, Magka D, Batchelor C, Duan L, Stevens R, Ennis M, Steinbeck C (2012) Structure-based classification and ontology in chemistry. J Cheminformatics 4(1):8

    CAS  Google Scholar 

  131. Haug K, Salek RM, Conesa P, Hastings J, de Matos P, Rijnbeek M, Mahendraker T, Williams M, Neumann S, Rocca-Serra P, Maguire E, González-Beltrán A, Sansone S-A, Griffin JL, Steinbeck C (2013) MetaboLights—an open-access general-purpose repository for metabolomics studies and associated meta-data. Nucleic Acids Res 41(D1):D781–D786

    CAS  Google Scholar 

  132. Brown M, Dunn WB, Dobson P, Patel Y, Winder CL, Francis-McIntyre S, Begley P, Carroll K, Broadhurst D, Tseng A, Swainston N, Spasic I, Goodacre R, Kell DB (2009) Mass spectrometry tools and metabolite-specific databases for molecular identification in metabolomics. Analyst 134(7):1322–1332

    CAS  Google Scholar 

  133. Carroll AJ (2012) Online metabolomics databases and pipelines. In: Roessner U (ed) metabolomics. InTech, Rijeka, pp 47–72

    Google Scholar 

  134. Carroll AJ, Badger MR, Millar AH (2010) The MetabolomeExpress Project: enabling web-based processing, analysis and transparent dissemination of GC/MS metabolomics datasets. BMC Bioinf 11:376

    Google Scholar 

  135. Fiehn O, Kind T, Barupal DK (2011) Data processing, metabolomic databases and pathway analysis. In: Hall RD (ed) Biology of plant metabolomics annual plant review, vol 43. Wiley, Oxford, pp 367–406

    Google Scholar 

  136. Hummel J, Selbig J, Walther D, Kopka J (2007) The Golm metabolome database: a database for GC–MS based metabolite profiling. In: Nielsen J, Jewett M (eds) Metabolomics, vol 18. Springer, Berlin, pp 75–95

    Google Scholar 

  137. Jenkins H, Hardy N, Beckmann M, Draper J, Smith AR, Taylor J, Fiehn O, Goodacre R, Bino RJ, Hall R, Kopka J, Lane GA, Lange BM, Liu JR, Mendes P, Nikolau BJ, Oliver SG, Paton NW, Rhee S, Roessner-Tunali U, Saito K, Smedsgaard J, Sumner LW, Wang T, Walsh S, Wurtele ES, Kell DB (2004) A proposed framework for the description of plant metabolomics experiments and their results. Nat Biotech 22(12):1601–1606

    CAS  Google Scholar 

  138. Johnson SR, Lange BM (2015) Open-access metabolomics databases for natural product research: present capabilities and future potential. Front Bioeng Biotechnol 3:22

    Google Scholar 

  139. Kind T, Scholz M, Fiehn O (2009) How large is the metabolome? A critical analysis of data exchange practices in chemistry. PLoS One 4(5):e5440

    Google Scholar 

  140. Ludwig C, Easton J, Lodi A, Tiziani S, Manzoor S, Southam A, Byrne J, Bishop L, He S, Arvanitis T, Günther U, Viant M (2012) Birmingham Metabolite Library: a publicly accessible database of 1-D 1H and 2-D 1H J-resolved NMR spectra of authentic metabolite standards (BML-NMR). Metabolomics 8(1):8–18

    CAS  Google Scholar 

  141. May JW, James AG, Steinbeck C (2013) Metingear: a development environment for annotating genome-scale metabolic models. Bioinformatics 29(17):2213–2215

    CAS  Google Scholar 

  142. Moco S, Vervoort J, Moco S, Bino RJ, De Vos RCH, Bino R (2007) Metabolomics technologies and metabolite identification. TrAC Trends Anal Chem 26(9):855–866

    CAS  Google Scholar 

  143. Peironcely J, Rojas-Cherto M, Fichera D, Reijmers T, Coulier L, Faulon J-L, Hankemeier T (2012) OMG: open molecule generator. J Cheminformatics 4(1):21

    CAS  Google Scholar 

  144. Redestig H, Kusano M, Fukushima A, Matsuda F, Saito K, Arita M (2010) Consolidating metabolite identifiers to enable contextual and multi-platform metabolomics data analysis. BMC Bioinf 11:214

    Google Scholar 

  145. Rojas-Chertó M, van Vliet M, Peironcely JE, van Doorn R, Kooyman M, te Beek T, van Driel MA, Hankemeier T, Reijmers T (2012) MetiTree: a web application to organize and process high-resolution multi-stage mass spectrometry metabolomics data. Bioinformatics 28(20):2707–2709

    Google Scholar 

  146. Schymanski EL, Neumann S (2013) CASMI: and the winner is. Metabolites 3(2):412–439

    CAS  Google Scholar 

  147. Steinbeck C, Conesa P, Haug K, Mahendraker T, Williams M, Maguire E, Rocca-Serra P, Sansone S-A, Salek R, Griffin J (2012) MetaboLights: towards a new COSMOS of metabolomics data management. Metabolomics 8(5):757–760

    CAS  Google Scholar 

  148. Sumner L, Amberg A, Barrett D, Beale M, Beger R, Daykin C, Fan TM, Fiehn O, Goodacre R, Griffin J, Hankemeier T, Hardy N, Harnly J, Higashi R, Kopka J, Lane A, Lindon J, Marriott P, Nicholls A, Reily M, Thaden J, Viant M (2007) Proposed minimum reporting standards for chemical analysis. Metabolomics 3(3):211–221

    CAS  Google Scholar 

  149. Wishart DS (2009) Computational strategies for metabolite identification in metabolomics. Bioanalysis 1(9):1579–1596

    CAS  Google Scholar 

  150. Wishart DS (2011) Advances in metabolite identification. Bioanalysis 3(15):1769–1782

    CAS  Google Scholar 

  151. Mu F, Williams RF, Unkefer CJ, Unkefer PJ, Faeder JR, Hlavacek WS (2007) Carbon-fate maps for metabolic reactions. Bioinformatics 23(23):3193–3199

    CAS  Google Scholar 

  152. Zhou B, Wang J, Ressom HW (2012) MetaboSearch: tool for mass-based metabolite identification using multiple databases. PLoS One 7(6):e40096

    CAS  Google Scholar 

  153. Zhou B, Xiao JF, Ressom HW (2013) Prioritization of putative metabolite identifications in LC-MS/MS experiments using a computational pipeline. Proteomics 13(2):248–260

    CAS  Google Scholar 

  154. Nöh K, Droste P, Wiechert W (2015) visual workflows for 13C-metabolic flux analysis. Bioinformatics 31(3):346–354

    Google Scholar 

  155. Steinbeck C, Krause S, Kuhn S (2003) NMRShiftDB—constructing a free chemical information system with open-source components. J Chem Inf Comput Sci 43(6):1733–1739

    CAS  Google Scholar 

  156. The CSEARCH NMRpredict server. http://nmrpredict.orc.univie.ac.at/. Accessed 19 Apr 2015

  157. Kalchhauser H, Robien W (1985) CSEARCH: a computer program for identification of organic compounds and fully automated assignment of carbon-13 nuclear magnetic resonance spectra. J Chem Inf Comput Sci 25(2):103–108

    CAS  Google Scholar 

  158. Kuhn S, Schlörer Nils E (2012) Strukturaufklärung mit NMR in der Synthesechemie. Nachr Chem 60(11):1106–1107

    CAS  Google Scholar 

  159. Plainchont B, de Emerenciano Paulo V, Nuzillard J-M (2013) Recent advances in the structure elucidation of small organic molecules by the LSD software. Magn Reson Chem 51(8):447–453

  160. Steinbeck C, Kuhn S (2004) NMRShiftDB – compound identification and structure elucidation support through a free community-built web database. Phytochemistry 65(19):2711–2717

  161. Ahmed L, Rasulev B, Turabekova M, Leszczynska D, Leszczynski J (2013) Receptor- and ligand-based study of fullerene analogues: comprehensive computational approach including quantum-chemical, QSAR and molecular docking simulations. Org Biomol Chem 11(35):5798–5808

  162. Benz RD (2007) Toxicological and clinical computational analysis and the US FDA/CDER. Expert Opin Drug Metab Toxicol 3(1):109–124

  163. Bertinetto C, Duce C, Micheli A, Solaro R, Starita A, Tiné MR (2007) Prediction of the glass transition temperature of (meth)acrylic polymers containing phenyl groups by recursive neural network. Polymer 48(24):7121–7129

  164. Bertinetto C, Duce C, Micheli A, Solaro R, Starita A, Tiné MR (2009) Evaluation of hierarchical structured representations for QSPR studies of small molecules and polymers by recursive neural networks. J Mol Graph Model 27(7):797–802

  165. Chavan S, Nicholls IA, Karlsson BCG, Rosengren AM, Ballabio D, Consonni V, Todeschini R (2014) Towards global QSAR model building for acute toxicity: Munro database case study. Int J Mol Sci 15(10):18162–18174

  166. Richard AM (2006) Future of toxicology—predictive toxicology: an expanded view of “chemical toxicity”. Chem Res Toxicol 19(10):1257–1262

  167. Richard AM, Gold LS, Nicklaus MC (2006) Chemical structure indexing of toxicity data on the Internet: moving toward a flat world. Curr Opin Drug Discov Dev 9(3):314–325

  168. Ruusmann V, Sild S, Maran U (2014) QSAR DataBank—an approach for the digital organization and archiving of QSAR model information. J Cheminformatics 6(1):25

  169. Spjuth O, Willighagen E, Guha R, Eklund M, Wikberg J (2010) Towards interoperable and reproducible QSAR analyses: exchange of datasets. J Cheminformatics 2(1):5

  170. Sushko Y, Novotarskyi S, Korner R, Vogt J, Abdelaziz A, Tetko I (2014) Prediction-driven matched molecular pairs to interpret QSARs and aid the molecular optimization process. J Cheminformatics 6(1):48

  171. Toropov A, Toropova A, Benfenati E, Leszczynska D, Leszczynski J (2010) Use of the international chemical identifier for constructing QSPR-model of normal boiling points of acyclic carbonyl substances. J Math Chem 47(1):355–369

  172. Toropov AA, Toropova AP, Benfenati E (2009) QSPR modeling of octanol water partition coefficient of platinum complexes by InChI-based optimal descriptors. J Math Chem 46(4):1060–1073

  173. Toropov AA, Toropova AP, Benfenati E (2010) QSAR-modeling of toxicity of organometallic compounds by means of the balance of correlations for InChI-based optimal descriptors. Mol Diversity 14(1):183–192

  174. Toropov AA, Toropova AP, Benfenati E, Leszczynska D, Leszczynski J (2009) Additive InChI-based optimal descriptors: QSPR modeling of fullerene C60 solubility in organic solvents. J Math Chem 46(4):1232–1251

  175. Toropov AA, Toropova AP, Benfenati E, Leszczynska D, Leszczynski J (2010) InChI-based optimal descriptors: QSAR analysis of fullerene[C60]-based HIV-1 PR inhibitors by correlation balance. Eur J Med Chem 45(4):1387–1394

  176. Toropova AP, Toropov AA, Benfenati E, Gini G (2011) Simplified molecular input-line entry system and international chemical identifier in the QSAR analysis of styrylquinoline derivatives as HIV-1 integrase inhibitors. Chem Biol Drug Des 77(5):343–360

  177. Zakharov AV, Peach ML, Sitzmann M, Nicklaus MC (2014) A new approach to radial basis function approximation and its application to QSAR. J Chem Inf Model 54(3):713–719

  178. Langham JJ, Jain AN (2008) Accurate and interpretable computational modeling of chemical mutagenicity. J Chem Inf Model 48(9):1833–1839

  179. Arvidson KB (2008) FDA toxicity databases and real-time data entry. Toxicol Appl Pharmacol 233(1):17–19

  180. Fostel JM (2008) Towards standards for data exchange and integration and their impact on a public database such as CEBS (chemical effects in biological systems). Toxicol Appl Pharmacol 233(1):54–62

  181. Jeliazkova N, Jeliazkov V (2011) AMBIT RESTful web services: an implementation of the OpenTox application programming interface. J Cheminformatics 3(1):18

  182. Kinjo AR, Nakamura H (2009) Comprehensive structural classification of ligand-binding motifs in proteins. Structure 17(2):234–246

  183. Kiss R, Sándor M, Gere A, Schmidt É, Balogh GT, Kiss B, Molnár L, Lemmen C, Keserű GM (2012) Discovery of novel histamine H4 and serotonin transporter ligands using the topological feature tree descriptor. J Chem Inf Model 52(1):233–242

  184. Liu Y, Li F, Sun H (2014) Thermal decomposition of FOX-7 studied by ab initio molecular dynamics simulations. Theor Chem Acc 133(10):1–11

  185. Murray-Rust P, Rzepa HS, Stewart JJP, Zhang Y (2005) A global resource for computational chemistry. J Mol Model 11(6):532–541

  186. Nashev LG, Schuster D, Laggner C, Sodha S, Langer T, Wolber G, Odermatt A (2010) The UV-filter benzophenone-1 inhibits 17β-hydroxysteroid dehydrogenase type 3: virtual screening as a strategy to identify potential endocrine disrupting chemicals. Biochem Pharmacol 79(8):1189–1199

  187. Phadungsukanan W, Shekar S, Shirley R, Sander M, West RH, Kraft M (2009) First-principles thermochemistry for silicon species in the decomposition of tetraethoxysilane. J Phys Chem A 113(31):9041–9049

  188. Qu X, Jain A, Rajput NN, Cheng L, Zhang Y, Ong SP, Brafman M, Maginn E, Curtiss LA, Persson KA (2015) The Electrolyte Genome project: a big data approach in battery materials discovery. Comput Mater Sci 103:56–67

  189. Shirley R, Phadungsukanan W, Kraft M, Downing J, Day NE, Murray-Rust P (2010) First-principles thermochemistry for gas phase species in an industrial rutile chlorinator. J Phys Chem A 114(43):11825–11832

  190. Totton TS, Shirley R, Kraft M (2011) First-principles thermochemistry for the combustion of TiCl4 in a methane flame. Proc Combust Inst 33(1):493–500

  191. Martin E, Monge A, Duret J-A, Gualandi F, Peitsch M, Pospisil P (2012) Building an R&D chemical registration system. J Cheminformatics 4(1):11

  192. Cass ME, Rzepa HS, Rzepa DR, Williams CK (2005) The use of the free, open-source program Jmol to generate an interactive web site to teach molecular symmetry. J Chem Educ 82(11):1736

  193. Gledhill R, Kent S, Hudson B, Richards WG, Essex JW, Frey JG (2006) A computer-aided drug discovery system for chemistry teaching. J Chem Inf Model 46(3):960–970

  194. Kraut H, Eiblmaier J, Grethe G, Loew P, Matuszczyk H, Saller H (2013) Algorithm for reaction classification. J Chem Inf Model 53(11):2884–2895

  195. Currano JN (2014) Reaction searching. In: Currano JN, Roth DL (eds) Chemical information for chemists: a primer. The Royal Society of Chemistry, Cambridge, pp 224–254

  196. Lawson AJ, Swienty-Busch J, Géoui T, Evans D (2014) The making of Reaxys: towards unobstructed access to relevant chemistry information. In: McEwen LR, Buntrock RE (eds) The future of the history of chemical information, ACS symposium series, vol 1164. American Chemical Society, Washington, pp 127–148

  197. McEwen LR, Buntrock RE (eds) (2014) The future of the history of chemical information, ACS symposium series, vol 1164. American Chemical Society, Washington

  198. Bolton EE, Wang Y, Thiessen PA, Bryant SH (2008) PubChem: integrated platform of small molecules and biological activities. In: Wheeler RA, Spellmeyer DC (eds) Annual Reports in Computational Chemistry, vol 4. Elsevier, Amsterdam, pp 217–241

  199. Huang R, Southall N, Wang Y, Yasgar A, Shinn P, Jadhav A, Nguyen D-T, Austin CP (2011) The NCGC Pharmaceutical Collection: a comprehensive resource of clinically approved drugs enabling repurposing and chemical genomics. Sci Transl Med 3(80):80ps16

  200. Liu T, Lin Y, Wen X, Jorissen RN, Gilson MK (2007) BindingDB: a web-accessible database of experimentally determined protein–ligand binding affinities. Nucleic Acids Res 35(Suppl 1):D198–D201

  201. Yadav IS, Singh H, Khan MI, Chaudhury A, Raghava GPS, Agarwal SM (2014) EGFRIndb: epidermal growth factor receptor inhibitor database. Anti-Cancer Agents Med Chem 14(7):928–935

  202. Law V, Knox C, Djoumbou Y, Jewison T, Guo AC, Liu Y, Maciejewski A, Arndt D, Wilson M, Neveu V, Tang A, Gabriel G, Ly C, Adamjee S, Dame ZT, Han B, Zhou Y, Wishart DS (2014) DrugBank 4.0: shedding new light on drug metabolism. Nucleic Acids Res 42(D1):D1091–D1097

  203. Wishart DS (2010) DrugBank: a general resource for pharmaceutical and pharmacological research. Mol Cell Pharmacol 2(1):25–38

  204. Seiler KP, George GA, Happ MP, Bodycombe NE, Carrinski HA, Norton S, Brudz S, Sullivan JP, Muhlich J, Serrano M, Ferraiolo P, Tolliday NJ, Schreiber SL, Clemons PA (2008) ChemBank: a small-molecule screening and cheminformatics resource database. Nucleic Acids Res 36(Suppl 1):D351–D359

  205. Zhang C, Tao L, Qin C, Zhang P, Chen S, Zeng X, Xu F, Chen Z, Yang S, Chen Y (2015) CFam: a chemical families database based on iterative selection of functional seeds and seed-directed compound clustering. Nucleic Acids Res 43(D1):D558–D565

  206. Finn RD, Miller BL, Clements J, Bateman A (2014) iPfam: a database of protein family and domain interactions found in the Protein Data Bank. Nucleic Acids Res 42(D1):D364–D373

  207. Henrick K, Feng Z, Bluhm WF, Dimitropoulos D, Doreleijers JF, Dutta S, Flippen-Anderson JL, Ionides J, Kamada C, Krissinel E, Lawson CL, Markley JL, Nakamura H, Newman R, Shimizu Y, Swaminathan J, Velankar S, Ory J, Ulrich EL, Vranken W, Westbrook J, Yamashita R, Yang H, Young J, Yousufuddin M, Berman HM (2008) Remediation of the Protein Data Bank archive. Nucleic Acids Res 36(Suppl 1):D426–D433

  208. Iván G, Szabadka Z, Grolmusz V (2009) On the asymmetry of the residue compositions of the binding sites on protein surfaces. J Bioinf Comput Biol 7(6):931–938

  209. Iván G, Szabadka Z, Grolmusz V (2010) Cysteine and tryptophan anomalies found when scanning all the binding sites in the Protein Data Bank. Int J Bioinf Res Appl 6(6):594–608

  210. Iván G, Szabadka Z, Grolmusz V (2007) Being a binding site: characterizing residue composition of binding sites on proteins. Bioinformation 2(5):216–221

  211. Sen S, Young J, Berrisford JM, Chen M, Conroy MJ, Dutta S, Di Costanzo L, Gao G, Ghosh S, Hudson BP, Igarashi R, Kengaku Y, Liang Y, Peisach E, Persikova I, Mukhopadhyay A, Narayanan BC, Sahni G, Sato J, Sekharan M, Shao C, Tan L, Zhuravleva MA (2014) Small molecule annotation for the Protein Data Bank. Database 2014:bau116

  212. Westbrook JD, Shao C, Feng Z, Zhuravleva M, Velankar S, Young J (2015) The chemical component dictionary: complete descriptions of constituent molecules in experimentally determined 3D macromolecules in the Protein Data Bank. Bioinformatics 31:1274–1278

  213. Ordog R, Szabadka Z, Grolmusz V (2008) Analyzing the simplicial decomposition of spatial protein structures. BMC Bioinf 9(Suppl 1):S11

  214. Szabadka Z, Grolmusz V (2006) Building a structured PDB: the RS-PDB database. Conf Proc IEEE Eng Med Biol Soc 1:5755–5758

  215. Szabadka Z, Grolmusz V (2007) High throughput processing of the structural information in the Protein Data Bank. J Mol Graph Model 25(6):831–836

  216. Prasanna MD, Vondrasek J, Wlodawer A, Bhat TN (2005) Application of InChI to curate, index, and query 3-D structures. Proteins Struct Funct Bioinf 60(1):1–4

  217. Barthelmes J, Ebeling C, Chang A, Schomburg I, Schomburg D (2007) BRENDA, AMENDA and FRENDA: the enzyme information system in 2007. Nucleic Acids Res 35(Suppl 1):D511–D514

  218. Schomburg I, Chang A, Placzek S, Söhngen C, Rother M, Lang M, Munaretto C, Ulas S, Stelzer M, Grote A, Scheer M, Schomburg D (2013) BRENDA in 2013: integrated reactions, kinetic data, enzyme function data, improved disease classification: new options and contents in BRENDA. Nucleic Acids Res 41(D1):D764–D772

  219. Carugo O, Eisenhaber F (eds) (2010) Data mining techniques for the life sciences. Humana Press, New York

  220. Bernard T, Bridge A, Morgat A, Moretti S, Xenarios I, Pagni M (2014) Reconciliation of metabolites and biochemical reactions for metabolic networks. Briefings Bioinf 15(1):123–135

  221. Lang M, Stelzer M, Schomburg D (2011) BKM-react, an integrated biochemical reaction database. BMC Biochem 12:42

  222. Wishart DS, Jewison T, Guo AC, Wilson M, Knox C, Liu Y, Djoumbou Y, Mandal R, Aziat F, Dong E, Bouatra S, Sinelnikov I, Arndt D, Xia J, Liu P, Yallou F, Bjorndahl T, Perez-Pineiro R, Eisner R, Allen F, Neveu V, Greiner R, Scalbert A (2013) HMDB 3.0—the human metabolome database in 2013. Nucleic Acids Res 41(D1):D801–D807

  223. Wishart DS, Knox C, Guo AC, Eisner R, Young N, Gautam B, Hau DD, Psychogios N, Dong E, Bouatra S, Mandal R, Sinelnikov I, Xia J, Jia L, Cruz JA, Lim E, Sobsey CA, Shrivastava S, Huang P, Liu P, Fang L, Peng J, Fradette R, Cheng D, Tzur D, Clements M, Lewis A, De Souza A, Zuniga A, Dawe M, Xiong Y, Clive D, Greiner R, Nazyrova A, Shaykhutdinov R, Li L, Vogel HJ, Forsythe I (2009) HMDB: a knowledgebase for the human metabolome. Nucleic Acids Res 37(Suppl 1):D603–D610

  224. Wishart DS, Tzur D, Knox C, Eisner R, Guo AC, Young N, Cheng D, Jewell K, Arndt D, Sawhney S, Fung C, Nikolai L, Lewis M, Coutouly M-A, Forsythe I, Tang P, Shrivastava S, Jeroncic K, Stothard P, Amegbey G, Block D, Hau DD, Wagner J, Miniaci J, Clements M, Gebremedhin M, Guo N, Zhang Y, Duggan GE, MacInnis GD, Weljie AM, Dowlatabadi R, Bamforth F, Clive D, Greiner R, Li L, Marrie T, Sykes BD, Vogel HJ, Querengesser L (2007) HMDB: the Human Metabolome Database. Nucleic Acids Res 35(Suppl 1):D521–D526

  225. Maeda MH, Kondo K (2013) Three-dimensional structure database of natural metabolites (3DMET): a novel database of curated 3D structures. J Chem Inf Model 53(3):527–533

  226. Altman T, Travers M, Kothari A, Caspi R, Karp PD (2013) A systematic comparison of the MetaCyc and KEGG pathway databases. BMC Bioinf 14:112

  227. Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M (2011) KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res 40(D1):D109–D114

  228. Fahy E, Cotter D, Sud M, Subramaniam S (2011) Lipid classification, structures and tools. Biochim Biophys Acta Mol Cell Biol Lipids 1811(11):637–647

  229. Murphy RC, Fahy E (2010) Isoprostane nomenclature: more suggestions. Prostaglandins Leukot Essent Fatty Acids 82(2):69–70

  230. Nielsen J (2009) Systems biology of lipid metabolism: from yeast to human. FEBS Lett 583(24):3905–3913

  231. Davis GDJ, Vasanthi AHR (2011) Seaweed metabolite database (SWMD): a database of natural compounds from marine algae. Bioinformation 5(8):361–364

  232. Herrgard MJ, Swainston N, Dobson P, Dunn WB, Arga KY, Arvas M, Bluethgen N, Borger S, Costenoble R, Heinemann M, Hucka M, Le Novere N, Li P, Liebermeister W, Mo ML, Oliveira AP, Petranovic D, Pettifer S, Simeonidis E, Smallbone K, Spasic I, Weichart D, Brent R, Broomhead DS, Westerhoff HV, Kirdar B, Penttilae M, Klipp E, Palsson BO, Sauer U, Oliver SG, Mendes P, Nielsen J, Kell DB (2008) A consensus yeast metabolic network reconstruction obtained from a community approach to systems biology. Nat Biotechnol 26(10):1155–1160

  233. Stobbe MD, Houten SM, Jansen GA, van Kampen AHC, Moerland PD (2011) Critical assessment of human metabolic pathway databases: a stepping stone for future integration. BMC Syst Biol 5:165

  234. Stobbe MD, Swertz MA, Thiele I, Rengaw T, van Kampen AHC, Moerland PD (2013) Consensus and conflict cards for metabolic pathway databases. BMC Syst Biol 7:50

  235. Barth A (1993) SpecInfo: an integrated spectroscopic information system. J Chem Inf Comput Sci 33(1):52–58

  236. Bremser W, Grzonka M (1991) SpecInfo—a multidimensional spectroscopic interpretation system. Microchim Acta 104(1–6):483–491

  237. Ba YA, Wenger C, Surleau R, Boudon V, Rotger M, Daumont L, Bonhommeau DA, Tyuterev VG, Dubernet M-L (2013) MeCaSDa and ECaSDa: methane and ethene calculated spectroscopic databases for the virtual atomic and molecular data centre. J Quant Spectrosc Radiat Transf 130:62–68

  238. Dunkel R, Wu X (2007) Identification of organic molecules from a structure database using proton and carbon NMR analysis results. J Magn Reson 188(1):97–110

  239. Hill C, Gordon IE, Rothman LS, Tennyson J (2013) A new relational database structure and online interface for the HITRAN database. J Quant Spectrosc Radiat Transf 130:51–61

  240. Wiley’s Compound Search. http://www.compoundsearch.com/. Accessed 21 Apr 2015

  241. Linstrom PJ, Mallard WG (eds) NIST chemistry webbook, NIST standard reference database number 69. National Institute of Standards and Technology, Gaithersburg. http://webbook.nist.gov. Accessed 15 Apr 2015

  242. Kazakov A, Muzny CD, Kroenlein K, Diky V, Chirico RD, Magee JW, Abdulagatov IM, Frenkel M (2012) NIST/TRC SOURCE data archival system: the next-generation data model for storage of thermophysical properties. Int J Thermophys 33(1):22–33

  243. Specs. http://www.specs.net. Accessed 19 Apr 2015

  244. AKos Samples. http://www.akosgmbh.de/AKosSamples. Accessed 19 Apr 2015

  245. ChemExper. http://www.chemexper.com. Accessed 19 Apr 2015

  246. Guilloux V, Arrault A, Colliandre L, Bourg S, Vayer P, Morin-Allory L (2012) Mining collections of compounds with screening assistant 2. J Cheminformatics 4(1):20

  247. Masciocchi J, Frau G, Fanton M, Sturlese M, Floris M, Pireddu L, Palla P, Cedrati F, Rodriguez-Tomé P, Moro S (2009) MMsINC: a large-scale chemoinformatics database. Nucleic Acids Res 37(Suppl 1):D284–D290

  248. ChemSynthesis. http://www.chemsynthesis.com/. Accessed 19 Apr 2015

  249. Compendium of Pesticide Common Names. http://www.alanwood.net/pesticides/. Accessed 19 Apr 2015

  250. Mol-Instincts Database based on Quantum Mechanics and QSPR. http://molinstincts.com/home/index/. Accessed 9 Apr 2015

  251. Magoon GR, Green WH (2013) Design and implementation of a next-generation software interface for on-the-fly quantum and force field calculations in automated reaction mechanism generation. Comput Chem Eng 52:35–45

  252. Ramakrishnan R, Dral PO, Rupp M, von Lilienfeld OA (2014) Quantum chemistry structures and properties of 134 kilo molecules. Sci Data 1:140022

  253. Weber RJM, Li E, Bruty J, He S, Viant MR (2012) MaConDa: a publicly accessible mass spectrometry contaminants database. Bioinformatics 28(21):2856–2857

  254. Bruno TJ, Wolk A, Naydich A, Huber ML (2009) Composition-explicit distillation curves for mixtures of diesel fuel with dimethyl carbonate and diethyl carbonate. Energy Fuels 23(8):3989–3997

  255. Ginex T, Spyrakis F, Cozzini P (2014) FADB: a food additive molecular database for in silico screening in food toxicology. Food Addit Contam Part A 31(5):792–798

  256. Gu J, Gui Y, Chen L, Yuan G, Xu X (2013) CVDHD: a cardiovascular disease herbal database for drug discovery and network pharmacology. J Cheminformatics 5:51

  257. Kelley SP, Fabian L, Brock CP (2011) Failures of fractional crystallization: ordered co-crystals of isomers and near isomers. Acta Crystallogr B 67(1):79–93

  258. Laurence C, Brameld KA, Graton J, Le Questel J-Y, Renault E (2009) The pKBHX database: toward a better understanding of hydrogen-bond basicity for medicinal chemists. J Med Chem 52(14):4073–4086

  259. Wakelam V, Herbst E, Loison J-C, Smith IWM, Chandrasekaran V, Pavone B, Adams NG, Bacchus-Montabonel M-C, Bergeat A, Béroff K, Bierbaum VM, Chabot M, Dalgarno A, van Dishoeck EF, Faure A, Geppert WD, Gerlich D, Galli D, Hébrard E, Hersant F, Hickson KM, Honvault P, Klippenstein SJ, Le Picard S, Nyman G, Pernot P, Schlemmer S, Selsis F, Sims IR, Talbi D, Tennyson J, Troe J, Wester R, Wiesenfeld L (2012) A KInetic database for astrochemistry (KIDA). Astrophys J Suppl Ser 199(1):21

  260. Fabian L, Brock CP (2010) A list of organic kryptoracemates. Acta Crystallogr B 66(1):94–103

  261. Schenck RJ, Zapiecki KR (2014) Back to the future: CAS and the shape of chemical information to come. In: McEwen LR, Buntrock RE (eds) The future of the history of chemical information, ACS symposium series, vol 1164. American Chemical Society, Washington, pp 149–158

  262. Schmidt U, Struck S, Gruening B, Hossbach J, Jaeger IS, Parol R, Lindequist U, Teuscher E, Preissner R (2009) SuperToxic: a comprehensive database of toxic compounds. Nucleic Acids Res 37(Suppl 1):D295–D299

  263. Zass E (2010) Chemical information retrieval—a short discussion about the state of the art, progress, and pitfalls. Heterocycles 82(1):63–86

  264. Zass E (2014) Looking back, but not in anger. In: McEwen LR, Buntrock RE (eds) The future of the history of chemical information, ACS symposium series, vol 1164. American Chemical Society, Washington, pp 57–80

  265. Akhondi SA, Kors JA, Muresan S (2012) Consistency of systematic chemical identifiers within and between small-molecule databases. J Cheminformatics 4:35

  266. Chambers J, Davies M, Gaulton A, Hersey A, Velankar S, Petryszak R, Hastings J, Bellis L, McGlinchey S, Overington JP (2013) UniChem: a unified chemical structure cross-referencing and identifier tracking system. J Cheminformatics 5:3

  267. Galgonek J, Vondrasek J (2014) On InChI and evaluating the quality of cross-reference links. J Cheminformatics 6:15

  268. Hilbig M, Urbaczek S, Groth I, Heuser S, Rarey M (2013) MONA—interactive manipulation of molecule collections. J Cheminformatics 5(1):38

  269. Kuhn M, Szklarczyk D, Franceschini A, Campillos M, von Mering C, Jensen LJ, Beyer A, Bork P (2010) STITCH 2: an interaction network database for small molecules and proteins. Nucleic Acids Res 38(Database issue):D552–D556

  270. Kuhn M, Szklarczyk D, Franceschini A, von Mering C, Jensen LJ, Bork P (2012) STITCH 3: zooming in on protein–chemical interactions. Nucleic Acids Res 40(D1):D876–D880

  271. Kuhn M, von Mering C, Campillos M, Jensen LJ, Bork P (2008) STITCH: interaction networks of chemicals and proteins. Nucleic Acids Res 36(Suppl 1):D684–D688

  272. Qiao Y, Wu X, Yang L, Zhang M (2007) Chemoinformatics and open source software integration and reuse. Jisuanji Yu Yingyong Huaxue 24(1):133–136

  273. Williams AJ, Ekins S, Tkachenko V (2012) Towards a gold standard: regarding quality in public domain chemistry databases and approaches to improving the situation. Drug Discov Today 17(13–14):685–701

  274. Orchard S, Al-Lazikani B, Bryant S, Clark D, Calder E, Dix I, Engkvist O, Forster M, Gaulton A, Gilson M, Glen R, Grigorov M, Hammond-Kosack K, Harland L, Hopkins A, Larminie C, Lynch N, Mann RK, Murray-Rust P, Lo PE, Southan C, Steinbeck C, Wishart D, Hermjakob H, Overington J, Thornton J (2011) Minimum information about a bioactive entity (MIABE). Nat Rev Drug Discov 10(9):661–669

  275. Thibault J, Roe D, Facelli J, Cheatham T (2014) Data model, dictionaries, and desiderata for biomolecular simulation data indexing and sharing. J Cheminformatics 6(1):4

  276. Thalheim T (2010) Tautomer production based on the InChI string. Nachr Chem 58(12):1253–1255

  277. Thalheim T, Vollmer A, Ebert R-U, Kuhne R, Schuurmann G (2010) Tautomer identification and tautomer structure generation based on the InChI code. J Chem Inf Model 50(7):1223–1232

Acknowledgments

I would like to thank Bernd Berger, David Evans, Steve Heller, Richard Kidd, Alan McNaught, Wolfgang Robien, Chris Steinbeck and Dmitrii Tchekhovskoi for helping me to check facts in this article. Any remaining errors and omissions in this complex compilation are my own responsibility.

Author information

Corresponding author

Correspondence to Wendy A. Warr.

About this article

Cite this article

Warr, W.A. Many InChIs and quite some feat. J Comput Aided Mol Des 29, 681–694 (2015). https://doi.org/10.1007/s10822-015-9854-3

  • DOI: https://doi.org/10.1007/s10822-015-9854-3
