QSAR studies of HIV-1 integrase inhibition

https://doi.org/10.1016/S0968-0896(02)00332-2

Abstract

Compounds from a wide variety of structural classes inhibit HIV-1 integrase. However, a single unified understanding of the relationship between the structures and activities of these compounds still eludes researchers. We report herein the development of QSAR models for integrase inhibition. The genetic function approximation (GFA) was utilized to select descriptors for the development of the QSAR models. The best QSAR model derived for the complete set of 11 structural classes had a correlation coefficient (r2) of only 0.54 and a cross-validated correlation coefficient (q2) of only 0.42. This indicated that the compounds studied may differ in the exact relationship between structure and inhibition, perhaps through interactions with different subsets of amino acids in the binding pocket, or through the presence of non-overlapping binding pockets. Descriptor-based cluster analysis indicated that the 11 structural classes of integrase inhibitors studied belonged to two clusters, one consisting of five structural classes and the other of six. QSAR models for these two clusters had r2 values of 0.79 and 0.82 and q2 values of 0.71 and 0.74, a significant improvement over the models obtained for the complete set of compounds. The two models were applied to predict the activities of compounds from the same structural classes as those used to build the models, giving r2 values of 0.65 and 0.78. The models were also used to predict the activities of five compounds shown in crystallographic or docking studies to interact near the active site metal ion. The model describing the larger cluster of structural classes was better able to reproduce the biological activities of these five structures, with an average percent residual error of 7.9%, compared with 19.3% for predictions from the other model.
This indicated that the six structural classes comprising the larger cluster may bind near the metal ion in a fashion similar to that observed in one publicly available co-crystal structure of an inhibitor bound to HIV-1 integrase. Flexible alignment of inhibitors in the two clusters found different pharmacophores that are consistent with previously published pharmacophores developed on the basis of individual structural classes that have produced novel inhibitory compounds. Thus we expect that these two QSAR models can be used in the search for novel HIV-1 integrase inhibitors as well as to provide insight into the binding modes of such diverse chemical compounds.

QSAR models have been developed for HIV-1 integrase inhibitors that were assigned to two clusters on the basis of structural descriptors. Different pharmacophores were recognized for each cluster of inhibitors. The results showed that the inhibitors in the two clusters have different modes of binding to the enzyme.


Introduction

The acquired immunodeficiency syndrome (AIDS), which is the final and most serious stage of human immunodeficiency virus (HIV) infection, renders the body susceptible to a variety of normally manageable infections, cancers, and other diseases.1, 2, 3, 4 Reverse transcriptase, protease and integrase are three enzymes required in the HIV replication cycle.1 HIV integrase (IN) is currently recognized as an attractive target against AIDS.2, 5 It catalyzes the integration of viral DNA into host DNA in two steps: 3′-processing and strand transfer. First, integrase cleaves the last two nucleotides from each 3′-end of the linear viral DNA. The subsequent DNA strand transfer reaction involves the nucleophilic attack of these 3′-ends on host chromosomal DNA.6

A number of compounds have been reported recently to inhibit HIV-1 integrase in biochemical assays.7, 8, 9, 10, 11, 12, 13, 14, 15, 16 The most potent compounds tend to contain multiple aromatic rings and aryl ortho-hydroxylation. It has been proposed that these inhibitors could block the reaction through inhibiting the glycerolysis, hydrolysis, and circular nucleotide formation that are involved in the 3′-processing step.7, 11 Most compounds reported to date are not selective for IN and the practical utility of those catechol-containing inhibitors is severely reduced by cytotoxicity even though they have been found to inhibit HIV-1 integrase in vitro.5, 15, 18 Thus predictive models describing the relationship between structure and inhibition applicable to diverse sets of structures could be valuable in the search for novel HIV IN inhibitors.

Divalent cations in the IN active site are important in both catalysis and inhibition.5, 18, 19, 20 However, the mechanism of their effect on inhibition is not very clear. A previous study has implied that salicylhydrazines inhibit HIV-1 integrase by chelating to the metal at the active site as they are active only when Mn2+ is used as a cofactor.15 However, thiazolothiazepines showed equal activities in the presence of Mg2+ or Mn2+, thus indicating that they differ from salicylhydrazines and perhaps act at a different site on HIV-1 integrase.13 For those inhibitors that may interact with both the IN molecule and Mg2+ or Mn2+, several types of metal–inhibitor interactions are possible. The aromatic moiety common to many inhibitors has been proposed to interact with the divalent cation in a ‘cation-π’ type interaction.9 There is also a possibility of a typical charge–charge interaction between the metal ions and ionic or partial charges of the ligands.9, 15 It has been shown that both types of interactions can co-exist in a binding site.21

A recent crystallographic study has shown that the inhibitor 1-(5-chloroindol-3-yl)-3-(tetrazolyl)-1,3-propanedione enol (5ClTEP) binds in the middle of the active site of the enzyme, lying between the three catalytic acidic residues, Asp64, Asp116 and Glu152, in the vicinity of the active site metal ion.22 This structure supports the speculation that the interactions between the inhibitors and integrase mimic the normal interactions with the viral DNA substrate during the 3′-processing reaction. Additionally, a structure of the avian sarcoma virus (ASV) integrase core domain in complex with 4-acetylamino-5-hydroxynaphthalene-2,7-disulfonic acid (Y-3), an inhibitor found to be active against the structurally homologous ASV IN and HIV-1 IN enzymes, has been studied.23 Y-3 binds farther from the active site metal ion than 5ClTEP, on the other side of the catalytic loop. In another study, a small-molecule family consisting of a core of arsenic or phosphorus surrounded by four aromatic groups was found to bind at the dimer interface of the HIV integrase catalytic domain, a site different from the previous two.24 These results support the possibility that structurally different inhibitors interact at different sites.

QSAR modeling is a mathematical analysis, first developed by Hansch,25 to elucidate a quantitative correlation between chemical structure and biological activity. The fundamental hypothesis of QSAR is that biological properties are functions of molecular structure. Molecules with similar structures can reasonably be expected to show similar biological activity and their structure–activity relationships can be explored using descriptors, numerical representations that characterize structures. A descriptor can be any quantitative property that depends on the molecular structure such as molecular weight, van der Waals surface area, dipole moment or number of hydrogen atoms.
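The link between descriptors and activity, and the r2 and q2 statistics quoted in the abstract, can be illustrated with a minimal sketch. It assumes a plain multiple linear regression and leave-one-out cross-validation, with q2 defined as 1 - PRESS/SS_tot; the excerpt does not state which cross-validation scheme the authors used, so that choice is an assumption. The descriptor matrix and activities below are synthetic, and the function names are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Predicted activity for each row of the descriptor matrix X."""
    return coef[0] + X @ coef[1:]

def r_squared(y, y_hat):
    """Fitted correlation coefficient: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def q_squared_loo(X, y):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_tot (assumed scheme)."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i               # drop compound i
        coef = fit_linear(X[keep], y[keep])    # refit without it
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in data: 30 "compounds", 3 "descriptors" (not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = 5.0 + 0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=30)

coef = fit_linear(X, y)
r2 = r_squared(y, predict(coef, X))
q2 = q_squared_loo(X, y)
print(f"r2 = {r2:.2f}, q2 = {q2:.2f}")
```

For a least-squares fit, q2 computed this way can never exceed r2, so a large gap between the two is a common warning sign of overfitting.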

In QSAR studies of large data sets, variable selection and model building are difficult and time-consuming procedures. Different strategies have been proposed for variable selection. Genetic algorithms (GA) are relatively new techniques for variable selection.26, 27 They are inspired by Darwin's theory of natural selection, in which the members of a species struggle for survival and individuals having a high fitness survive to pass their genes to the next generation. The best individuals are reproduced by crossover and random mutations. Genetic function approximation (GFA), a combination of GA and the SPLINES (multivariate adaptive regression splines algorithm) techniques, provides multiple models with high predictive ability.28
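The selection, crossover and mutation cycle described above can be sketched as a small genetic algorithm over descriptor subsets. This is not the GFA implementation in Cerius2: it fits plain linear models with no spline terms, and the fitness function (r2 minus a per-descriptor penalty) is a simplification chosen for illustration. All names, parameter values and data below are hypothetical.

```python
import numpy as np

def score(X, y, mask, penalty=0.02):
    """Fitness of a descriptor subset: r2 of an OLS fit minus a size penalty."""
    if not mask.any():
        return -np.inf                          # empty models are never kept
    A = np.column_stack([np.ones(len(y)), X[:, mask]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return r2 - penalty * mask.sum()

def ga_select(X, y, pop_size=40, generations=60, p_mut=0.05, seed=0):
    """Evolve boolean masks over descriptor columns; return the fittest mask."""
    rng = np.random.default_rng(seed)
    n_desc = X.shape[1]
    pop = rng.random((pop_size, n_desc)) < 0.3  # random initial population
    for _ in range(generations):
        fitness = np.array([score(X, y, ind) for ind in pop])
        pop = pop[np.argsort(fitness)[::-1]]    # fittest first
        parents = pop[: pop_size // 2]          # top half survives unchanged
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_desc)       # single-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n_desc) < p_mut   # random mutation
            children.append(child ^ flip)
        pop = np.vstack([parents, np.array(children)])
    fitness = np.array([score(X, y, ind) for ind in pop])
    return pop[np.argmax(fitness)]

# Synthetic data: only columns 2 and 7 actually influence the "activity".
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 12))
y = 1.0 * X[:, 2] - 0.8 * X[:, 7] + rng.normal(scale=0.3, size=40)

selected = ga_select(X, y)
print("selected descriptor columns:", np.flatnonzero(selected))
```

Because the top half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next.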

In this paper, we assigned the different classes of inhibitors to two clusters using cluster analysis after finding that a single predictive model could not be developed for all classes together. Two models were constructed using GFA to predict the activities of the inhibitors in each cluster. Possible pharmacophores were also identified for the two clusters. These results provide additional evidence that there are probably at least two different binding sites or binding modes through which inhibitors interact with HIV integrase, and they identify which structural classes share a common binding mode. They also add to our knowledge of previously studied inhibitors and offer a route for comparing the structural diversity of different sets of inhibitors, which leads to different enzyme–inhibitor interactions and hence possibly to different binding sites or modes. We anticipate that these models can be used to predict biological activities to prioritize experimental efforts in the search for novel integrase inhibitors.
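As a rough illustration of descriptor-based cluster analysis, the sketch below groups 11 hypothetical structural classes into two clusters. The clustering method used in the paper is not given in this excerpt; average-linkage agglomerative clustering on standardized descriptors is assumed here purely for demonstration, and the descriptor values are synthetic.

```python
import numpy as np

def two_clusters(X):
    """Average-linkage agglomerative clustering, merged down to two clusters."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)    # standardize each descriptor
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)  # pairwise distances
    clusters = [[i] for i in range(len(Z))]     # start with one cluster per row
    while len(clusters) > 2:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # average distance between all members of the two clusters
                d = D[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return [sorted(c) for c in clusters]

# Synthetic class-level descriptor averages: rows 0-4 and rows 5-10 are drawn
# around two different centers (illustrative values, not from the paper).
rng = np.random.default_rng(2)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(5, 4)),
    rng.normal(4.0, 0.5, size=(6, 4)),
])

clusters = two_clusters(X)
print(clusters)
```

A separate QSAR model would then be fitted to the compounds belonging to each cluster.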


Experimental methods

In this paper, all QSAR studies were performed with the MOE29 and Cerius2 programs.30

Cluster analysis and QSAR modeling

Eleven classes of compounds with previously published experimental activities determined by the same research group have been included in our QSAR study (see Tables 1–11). Their biological activities (IC50 for 3′-processing) range from 0.1 μM to greater than 300 μM. In the QSAR study, the structural features of each inhibitor were described numerically using descriptors in several categories. GFA was employed

Conclusion

In our QSAR modeling of HIV-1 integrase inhibition, a single model developed for all classes of inhibitors cannot adequately describe the relationship between their structures and activities. Accordingly, two clusters of inhibitors have been identified and predictive QSAR models have been developed for each cluster. This finding can be rationalized in two ways. Either the two clusters of inhibitors interact at two different sites of HIV-1 integrase or an overlapping site with different sets of

Acknowledgements

Support from NIH/NIAID (Grant R15 AI 45984-01) and NSF (STI-9602656, CHE-9708517) is gratefully acknowledged. We thank the Chemical Computing Group for their donation of the MOE program. This work was also supported by funds from the University of Memphis.

References (38)

  • M.J Gait et al., Trends Biotechnol. (1995)
  • W.E.J Robinson, Infect. Med. (1998)
  • M.D Andrake et al., J. Biol. Chem. (1996)
  • N Neamati et al., Drug Discov. Today (1997)
  • P Rice et al., Curr. Opin. Struct. Biol. (1996)
  • H Yuan et al., J. Mol. Struct. (THEOCHEM) (2000)
  • R.G Nanni et al., Perspect. Drug Disc. Des. (1993)
  • Y Pommier et al., Chemotherapy (1997)
  • E De Clercq, J. Med. Chem. (1995)
  • A Mazumder et al., Biochemistry (1995)
  • H Zhao et al., J. Med. Chem. (1997)
  • M.C Nicklaus et al., J. Med. Chem. (1997)
  • Z Lin et al., J. Med. Chem. (1999)
  • N Neamati et al., Mol. Pharmacol. (1997)
  • H Zhao et al., J. Med. Chem. (1997)
  • N Neamati et al., J. Med. Chem. (1999)
  • A Mazumder et al., J. Med. Chem. (1997)
  • N Neamati et al., J. Med. Chem. (1998)
  • F Zouhiri et al., J. Med. Chem. (2000)