Elsevier

Computers & Chemistry

Volume 26, Issue 4, June 2002, Pages 357-369
Application of novel atom-type AI topological indices to QSPR studies of alkanes

https://doi.org/10.1016/S0097-8485(01)00128-0

Abstract

Atom-type AI topological indices, derived from topological distance sums and vertex degrees, are used to describe the different structural environments of each atom type in a molecule. Multiple linear regression combining the recently proposed Xu index with the AI indices is performed to develop high-quality QSPR models for six physical properties (normal boiling points, heats of vaporization, molar volumes, molar refractions, van der Waals’ constants, and Pitzer's acentric factors) of alkanes with up to nine carbon atoms. For each of the six properties, the correlation coefficient r of the final model is larger than 0.995, and the standard error (s) decreases by 45–86% compared with the simple linear model using the Xu index alone. The agreement between calculated and experimental data is quite good. The results indicate the potential of these indices for application to a wide range of physical properties. The roles of molecular size and of individual groups in the molecules are illustrated by analyzing the relative or fractional contributions of the individual indices. The results indicate that the six physical properties of alkanes are dominated by molecular size, while the AI indices have a smaller influence that depends on the property studied. Moreover, the studies demonstrate that an atomic group does not contribute a fixed value to a property; its contribution depends on its structural environment in the molecule and on the other groups present. Cross-validation using the more general leave-n-out method demonstrates that the final models are highly statistically reliable.

Introduction

Important progress has been made over the last 10–25 years in understanding the relationships between various properties of organic compounds and their chemical structures. Among the most significant achievements is the Kamlet–Taft linear solvation energy relationship (LSER) based on multiple solvatochromic parameters (Kamlet et al., 1981), which has been successfully used in a variety of quantitative structure–property/activity relationship (QSPR/QSAR) studies (Ren et al., 1999d). However, the LSER method is limited by the lack of available parameter values (Hickey and Passino-Reader, 1991).

The graph-theoretical topological index approach to QSPR/QSAR is a simple and straightforward means of describing and/or predicting the physical properties and biological activities of compounds, as well as of supporting molecular design (Balaban, 1995). To date, more than 100 indices have been proposed, such as the well-known molecular connectivity χ index (Kier and Hall, 1976), Hosoya's Z index (Hosoya, 1971), Balaban's J index (Balaban, 1982), Bonchev's ID index (Bonchev and Trinajstic, 1977), Schultz's MTI index (Schultz, 1989), Wiener's W index (Wiener, 1947) and the recently proposed Xu index (Ren, 1999, Ren et al., 1999a, Ren et al., 1999b, Ren et al., 1999c). Although each of these conventional indices can give a satisfactory correlation for at least one property, in some cases they correlate poorly with other properties of the same molecules. This fact indicates that different physical properties depend in different ways on the inherent structural features of a molecule. As is well known, many factors influence the physical properties or biological activity of a molecule. Among the most obvious are molecular size, shape, polarity, and especially the ability of the molecule to participate in hydrogen bonding. These factors are related to various aspects of intermolecular interactions, such as van der Waals forces and hydrogen-bonding interactions. It is now recognized that not only the molecular size but also the molecular fragments and/or atomic groups related to different fundamental interactions may be important to the physical properties or biological activities of a molecule, because intermolecular interactions, as pointed out by Pitzer (1955) and Pitzer et al. (1955), are not only interactions between molecular centers but also a sum of interactions between the various parts of the molecules.
According to Pitzer's statement, both the molecular mass and the various atomic groups make separate contributions to the physical properties of a molecule. It is worth noting that the conventional indices characterize a molecule as a whole, i.e. its size or shape, and do not take into account the separate contributions of individual molecular fragments and atomic groups to its properties. This causes difficulties in developing high-quality QSPR/QSAR models and means that the structural description of a molecule needs to be improved at the atomic level.

The atom-type-based topological index, which describes the structural information of a molecule at the atomic level, is expected to make a breakthrough in illustrating the role of each atom or group in a molecule. Kier et al. (1991) first introduced the concept of atom-type-based topological indices, i.e. the so-called electrotopological state index (E-state). Subsequent studies demonstrated that E-state topological indices were especially effective in developing a variety of QSAR/QSPR models for a given system consisting of only a few atom types (Hall et al., 1991a, Hall et al., 1991b, Hall and Kier, 1992, Hall et al., 1995, Hall and Kier, 1995). The development of atom-type topological indices therefore opens new possibilities for various applications, such as database characterization, clustering, molecular similarity analysis and related fields of study. To date, however, the development of atomic-level topological indices is not very advanced, although progress can be anticipated. This is the primary motivation for seeking novel atomic-level topological indices for describing different physico-chemical properties or biological activities. For this purpose, in a previous paper (Ren, 2002), we proposed a novel type of atom-based AI topological indices different from the E-state indices. The novel vertex degree (vm), based on the valence connectivity δv of Kier–Hall, was successfully used for heteroatom differentiation. The AI indices, along with the recently proposed Xu index, were then extended to compounds with heteroatoms. Multiple linear regression using the modified Xu and AI indices was run to develop high-quality QSPR models for four physical properties of alcohols: the normal boiling points (BP), molar volumes (MV), molar refractions (MR), and molecular total surface areas (TSA).
For each of the four properties, the correlation coefficient r is larger than 0.996, and the standard error (s) decreases by 61–83%. The results indicate the high potential of these indices for application in QSPR studies of complex compounds. The open question is whether these indices are suitable for a wide range of physical properties that are sensitive in different ways to different structural features.

Non-polar alkanes represent an especially attractive class of compounds, since specific interactions such as hydrogen bonding and the complexities caused by heteroatoms or higher-order chemical bonds in complex compounds are avoided. In particular, reliable experimental data are readily available in the literature. Therefore, in order to illustrate the potential of the AI indices for application to various physical properties, we select six physical properties of alkanes for this study. First, multiple linear regression using the Xu and AI indices is used to develop QSPR models for the six properties: the normal boiling points (BP), heats of vaporization (HV), molar volumes (MV), molar refractions (MR), van der Waals’ constants (b), and Pitzer's acentric factors (ω). Furthermore, we wish to demonstrate which structural features or groups are likely to be important to the different properties.

Method

Consider a graph G={V, E} with n vertices, where V and E are the vertex set and edge set, respectively. The vertex-adjacency matrix, A=[aij]n×n, is a square symmetric matrix whose elements aij are 1 if vertices i and j are adjacent and 0 otherwise. The distance matrix, D=[dij]n×n, is also a square symmetric matrix. The entry dij of matrix D is the length of the shortest path between vertices i and j in G. For alkanes, dij is the number of CC
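
As a minimal sketch of the definitions above, the following Python code builds the adjacency matrix A and the distance matrix D for a hydrogen-suppressed carbon skeleton, then derives the vertex degrees and topological distance sums (the row sums of A and D). It assumes every edge has unit length, and the helper names and the 2-methylbutane atom numbering are illustrative choices, not taken from the paper. The AI and Xu index formulas are not given in this excerpt and are not reproduced here; the only index computed is the well-known Wiener W index, which equals half the sum of all entries of D.

```python
from collections import deque


def adjacency_matrix(n, bonds):
    """Return the symmetric n x n adjacency matrix for a list of (i, j) bonds."""
    a = [[0] * n for _ in range(n)]
    for i, j in bonds:
        a[i][j] = a[j][i] = 1
    return a


def distance_matrix(a):
    """Return shortest-path lengths between all vertex pairs (unit edge length).

    Runs one breadth-first search per vertex; assumes the graph is connected.
    """
    n = len(a)
    d = []
    for src in range(n):
        dist = [-1] * n
        dist[src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if a[u][v] and dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        d.append(dist)
    return d


# Hypothetical numbering of the 2-methylbutane skeleton:
# chain C0-C1-C2-C3 with the methyl carbon C4 attached to C1.
bonds = [(0, 1), (1, 2), (2, 3), (1, 4)]
A = adjacency_matrix(5, bonds)
D = distance_matrix(A)

degrees = [sum(row) for row in A]        # vertex degree of each carbon
distance_sums = [sum(row) for row in D]  # topological distance sum of each carbon
wiener = sum(distance_sums) // 2         # Wiener index W

print(degrees, distance_sums, wiener)
```

Breadth-first search is used because all edges have the same weight; for a molecule-sized graph any all-pairs shortest-path method would serve equally well.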

Data set

For alkanes with higher molecular weight, such as decane isomers, the data taken from different sources are not internally consistent, which disturbs pure topological investigations and hinders the establishment of reliable QSPR models. Therefore, we only consider a data set of alkanes with up to nine carbon atoms for this study. The normal boiling points (BP), molar volumes (MV) at 20 °C, molar refractions (MR) at 20 °C and heats of vaporization (HV) at 25 °C are directly taken from Needham et

Results and discussion

The normal boiling point (BP), a physical property that is universally and precisely measured for low-molecular-weight compounds, is commonly used to test the performance of topological indices. Therefore, we first consider the boiling points of alkanes and then extend the study to other physical properties of the same series of alkanes.
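
The abstract describes fitting multiple linear regression models and checking them with leave-n-out cross-validation. The sketch below shows one common way to do this with NumPy, under stated assumptions: the descriptor matrix and property values are synthetic random numbers (not alkane data), the function names are hypothetical, and the cross-validated statistic reported (1 - PRESS/SS_total) is a conventional choice that may differ from the one used in the paper.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict(coef, X):
    """Evaluate the fitted linear model on descriptor rows X."""
    return coef[0] + X @ coef[1:]


def leave_n_out_q2(X, y, n_out, seed=0):
    """Cross-validated 1 - PRESS/SS_total, leaving out n_out samples per fold.

    Samples are shuffled once, then each consecutive group of n_out samples
    is held out in turn and predicted by a model fitted to the remainder.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    press = 0.0
    for start in range(0, len(y), n_out):
        test = order[start:start + n_out]
        train = np.setdiff1d(order, test)
        coef = fit_mlr(X[train], y[train])
        press += np.sum((y[test] - predict(coef, X[test])) ** 2)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


# Synthetic stand-in data: 40 "molecules", 3 descriptors, nearly linear response.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
true_coef = np.array([2.0, 1.5, -0.7, 0.3])  # intercept followed by slopes
y = true_coef[0] + X @ true_coef[1:] + 0.01 * rng.normal(size=40)

coef = fit_mlr(X, y)
q2 = leave_n_out_q2(X, y, n_out=5)
print(coef, q2)
```

With real data, X would hold one row per alkane and one column per index, and y the measured property; setting n_out=1 reduces the procedure to leave-one-out.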

Conclusion

Atom-type AI topological indices based on topological distance sums and vertex degrees can describe the different structural environments of each atom type in a molecule at the atomic level. Multiple linear regression using the Xu index jointly with the AI indices can provide high-quality QSPR models for six physical properties (normal boiling points, heats of vaporization, molar volumes, molar refractions, van der Waals’ constants and Pitzer's acentric factors) of alkanes with up to nine carbon atoms.

References (38)

  • A.T. Balaban, Chem. Phys. Lett. (1982)
  • B. Ren et al., Chem. Phys. Lett. (1999)
  • A.T. Balaban, J. Chem. Inf. Comput. Sci. (1995)
  • D. Bonchev et al., J. Chem. Phys. (1977)
  • J.A. Dean
  • L.H. Hall et al., Med. Res. Rev. (1992)
  • L.H. Hall et al., J. Chem. Inf. Comput. Sci. (1995)
  • L.H. Hall et al., J. Chem. Inf. Comput. Sci. (1991)
  • L.H. Hall et al., Quant. Struct.-Act. Relat. (1991)
  • L.H. Hall et al., J. Chem. Inf. Comput. Sci. (1995)
  • J.P. Hickey et al., Environ. Sci. Technol. (1991)
  • H. Hosoya, Bull. Chem. Soc. Jpn. (1971)
  • M. Kamlet et al., Prog. Phys. Org. Chem. (1981)
  • L.B. Kier et al., Molecular Connectivity in Chemistry and Drug Research (1976)
  • L.B. Kier et al., J. Math. Chem. (1991)
  • B. Lucic et al., J. Chem. Inf. Comput. Sci. (1999)
  • B. Lucic et al., J. Chem. Inf. Comput. Sci. (1999)
  • B. Lucic et al., J. Chem. Inf. Comput. Sci. (2000)
  • B. Lucic et al., J. Chem. Inf. Comput. Sci. (2001)