Bounds on Lengths of Real Valued Vectors Similar with Regard to the Tanimoto Similarity

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7802)

Abstract

The Tanimoto similarity measure finds numerous applications in chemistry, bioinformatics, information retrieval and text mining. A typical task in these applications is finding the most similar vectors. This task is very time-consuming for very large data sets. Thus, methods that efficiently restrict the number of vectors that have a chance of being sufficiently similar to a given vector are of high importance. To this end, we have recently derived bounds on the lengths of vectors similar with respect to the Tanimoto similarity. In this paper, we recall those results and derive new bounds on the lengths of real-valued vectors that have a chance of being Tanimoto-similar to a given vector to a required degree. Finally, we compare the previous and current results and illustrate their usefulness.
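
To make the idea of length-based candidate restriction concrete, here is a minimal Python sketch. It is not taken from the paper, and the abstract does not state the bounds themselves. The sketch assumes the usual real-valued definition T(u, v) = u·v / (|u|² + |v|² − u·v) and uses only the Cauchy-Schwarz inequality (u·v ≤ |u||v|), which implies that T(u, v) ≥ ε with 0 < ε ≤ 1 can hold only if |v| lies in [|u|/α, α|u|], where α = ½((1 + 1/ε) + sqrt((1 + 1/ε)² − 4)). The function names `tanimoto`, `length_bounds` and `candidates` are hypothetical, and the vectors are assumed to be nonzero.

```python
import math


def tanimoto(u, v):
    """Tanimoto similarity of two nonzero real-valued vectors (assumed definition)."""
    dot = sum(a * b for a, b in zip(u, v))
    uu = sum(a * a for a in u)
    vv = sum(b * b for b in v)
    return dot / (uu + vv - dot)


def length_bounds(u, eps):
    """Interval of Euclidean lengths |v| for which T(u, v) >= eps is possible.

    Derived here from Cauchy-Schwarz only; this is an illustrative bound,
    not necessarily the one derived in the paper.
    """
    if not 0 < eps <= 1:
        raise ValueError("eps must be in (0, 1]")
    s = 1 + 1 / eps
    alpha = 0.5 * (s + math.sqrt(s * s - 4))
    norm_u = math.sqrt(sum(a * a for a in u))
    return norm_u / alpha, norm_u * alpha


def candidates(u, vectors, eps):
    """Keep only vectors whose length falls inside the admissible interval."""
    lo, hi = length_bounds(u, eps)
    return [v for v in vectors if lo <= math.sqrt(sum(b * b for b in v)) <= hi]
```

Under these assumptions, for u = (3, 4) and ε = 0.5 the admissible lengths are roughly 1.91 to 13.09, so much shorter or much longer vectors can be discarded without computing their similarity to u. Vectors that pass the filter still need an exact `tanimoto` check.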

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kryszkiewicz, M. (2013). Bounds on Lengths of Real Valued Vectors Similar with Regard to the Tanimoto Similarity. In: Selamat, A., Nguyen, N.T., Haron, H. (eds) Intelligent Information and Database Systems. ACIIDS 2013. Lecture Notes in Computer Science, vol 7802. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36546-1_46

  • DOI: https://doi.org/10.1007/978-3-642-36546-1_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36545-4

  • Online ISBN: 978-3-642-36546-1

  • eBook Packages: Computer Science, Computer Science (R0)
