Web Release Date: April 4,
On the Properties of Bit String-Based Measures of Chemical Similarity
Received November 26, 1997
Abstract: With the growth of interest in database searching and compound selection, the quantification of chemical similarity has become an area of intense practical and theoretical interest. One of the most widely used methods of measuring chemical similarity is based on mapping fragments within a molecule as bits within a binary string. We present empirical results that suggest that bit strings provide a nonintuitive encoding of molecular size, shape, and global similarity. Other results, this time statistical in nature, suggest that the observed behavior of bit string-based searches has a large nonspecific component. On this basis, we question whether bit string-based similarity methods possess all the features desirable in a quantitative chemical distance measure or metric and suggest that there are instances when they may not be the most appropriate tool for searching or segregating chemical structures.
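
As an illustration of the fragment-to-bit mapping described in the abstract, the following is a minimal sketch, not the authors' method. It assumes a toy hashing scheme (`fingerprint`, with hypothetical fragment labels and a hypothetical 64-bit string length) and uses the Tanimoto coefficient, a common choice for comparing bit strings that the abstract itself does not name.

```python
import zlib


def fingerprint(fragments, n_bits=64):
    """Set one bit per fragment label (toy scheme; names are hypothetical)."""
    bits = 0
    for frag in fragments:
        # crc32 is deterministic across runs, unlike the built-in str hash.
        # Two different fragments may collide on the same bit position.
        bits |= 1 << (zlib.crc32(frag.encode("utf-8")) % n_bits)
    return bits


def tanimoto(a, b):
    """Tanimoto coefficient: bits set in both / bits set in either."""
    both = bin(a & b).count("1")
    either = bin(a | b).count("1")
    # Two empty bit strings are treated as identical by convention here.
    return both / either if either else 1.0


# Hypothetical fragment sets for a small and a larger molecule.
small = fingerprint(["C-C", "C-O", "C=O"])
large = fingerprint(["C-C", "C-O", "C=O", "C-N", "c:c", "C-Cl"])
print(tanimoto(small, large))
```

Because the denominator counts every bit set in either string, a molecule that sets many bits lowers the score against a smaller one even when all of the smaller molecule's fragments are present, which is one way size can enter the measure.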
