J. Chem. Inf. Model., 48 (4), 691–703, 2008. DOI: 10.1021/ci700334f
Web Release Date: April 11, 2008

Copyright © 2008 American Chemical Society

Distributed Chemical Computing Using ChemStar: An Open Source Java Remote Method Invocation Architecture Applied to Large Scale Molecular Data from PubChem

M. Karthikeyan* and S. Krishnan

Digital Information Resource Center, Information Division, National Chemical Laboratory, Pune 411008, India

Anil Kumar Pandey

Munich Information Center for Protein Sequences (MIPS), Institute for Bioinformatics, Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH), Ingolstaedter Landstrasse 1, 85764 Neuherberg (bei Munich), Germany

Andreas Bender

Leiden/Amsterdam Center for Drug Research, Division of Medicinal Chemistry, Leiden University, Gorlaeus Laboratories, Einsteinweg 55, 2333 CC Leiden, The Netherlands

Alexander Tropsha*

Laboratory for Molecular Modeling, School of Pharmacy and Department of Pharmacology, School of Medicine, University of North Carolina, Chapel Hill, North Carolina 27599

Received September 9, 2007

Abstract:

We present the application of an open source architecture based on Java remote method invocation (RMI) to distributed chemical computing. This architecture was previously employed for distributed harvesting of chemical information from the Internet via the Google application programming interface (API; ChemXtreme). Because it is open source and flexible, the underlying server/client framework can be quickly adapted to virtually any computational task that can be parallelized. Here, we present the server/client communication framework as well as an application to the distributed computation of chemical properties on a large scale (currently the size of PubChem; about 18 million compounds), using both the Marvin toolkit and the open source JOELib package. For this set of compounds, we compared the agreement between the two packages for log P and topological polar surface area (TPSA). Outliers were found to be mostly non-druglike compounds, and discrepancies could usually be explained by differences in the underlying algorithms. ChemStar is the first open source distributed chemical computing environment built on Java RMI, and its “plug-in architecture” makes it easily adaptable to user demands. The complete source code as well as the calculated properties, along with links to PubChem resources, are available on the Internet via a graphical user interface at http://moltable.ncl.res.in/chemstar/.
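The server/client pattern described in the abstract can be illustrated with a minimal Java RMI sketch. This is not ChemStar's actual code: the interface `TaskServer`, the class `InMemoryTaskServer`, the methods `nextMolecule`, `submitResult`, and `runClient`, and the placeholder `toyProperty` are all hypothetical names introduced here for illustration. The sketch assumes a pull model in which a client repeatedly requests a molecule from the server, computes a property locally, and sends the result back; a real deployment would call a toolkit such as Marvin or JOELib in place of the placeholder calculation.

```java
import java.io.UncheckedIOException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

public class RmiTaskSketch {

    /** Hypothetical remote interface: clients pull work and push results. */
    public interface TaskServer extends Remote {
        /** Returns the next molecule (as a SMILES string), or null when no work is left. */
        String nextMolecule() throws RemoteException;

        /** Stores the value a client computed for the given molecule. */
        void submitResult(String molecule, double value) throws RemoteException;
    }

    /** Hypothetical in-memory server holding a work queue and a result table. */
    public static class InMemoryTaskServer implements TaskServer {
        private final Queue<String> pending = new ArrayDeque<>();
        private final Map<String, Double> results = new LinkedHashMap<>();

        public InMemoryTaskServer(List<String> molecules) {
            pending.addAll(molecules);
        }

        @Override
        public synchronized String nextMolecule() {
            return pending.poll();
        }

        @Override
        public synchronized void submitResult(String molecule, double value) {
            results.put(molecule, value);
        }

        /** Returns a copy of the results collected so far. */
        public synchronized Map<String, Double> results() {
            return new LinkedHashMap<>(results);
        }
    }

    /** Placeholder calculation (assumption): counts upper-case 'C' characters in the string. */
    static double toyProperty(String smiles) {
        return smiles.chars().filter(c -> c == 'C').count();
    }

    /** Client loop: process molecules until the server has none left; returns the count. */
    public static int runClient(TaskServer server) {
        int processed = 0;
        try {
            String molecule;
            while ((molecule = server.nextMolecule()) != null) {
                server.submitResult(molecule, toyProperty(molecule));
                processed++;
            }
        } catch (RemoteException e) {
            throw new UncheckedIOException(e);
        }
        return processed;
    }

    public static void main(String[] args) throws Exception {
        InMemoryTaskServer impl =
                new InMemoryTaskServer(List.of("CCO", "c1ccccc1", "CC(=O)O"));
        // Export on an anonymous port; calls made on the stub travel through RMI.
        TaskServer stub = (TaskServer) UnicastRemoteObject.exportObject(impl, 0);
        try {
            int processed = runClient(stub);
            System.out.println(processed + " molecules processed: " + impl.results());
        } finally {
            // Unexport so the JVM can exit once the demo is finished.
            UnicastRemoteObject.unexportObject(impl, true);
        }
    }
}
```

For brevity, the server and client run in one JVM and the client uses the exported stub directly. In a distributed setting the server would typically bind the stub in an RMI registry, and clients on other machines would look it up by name.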
