Fast Exact Algorithm to Solve Continuous Similarity Search for Evolving Queries

Yamazaki, Tomohiro; Koga, Hisashi; Toda, Takahisa

doi:10.1007/978-3-319-70145-5_7

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10648)

Included in the following conference series:

Asia Information Retrieval Symposium

Abstract

We study the recently formulated continuous similarity search problem for evolving queries. Given a data stream and a database of n sets of items, the goal is to maintain the top-k sets most similar to a query that evolves over time and consists of the latest W items in the data stream. The previous exact algorithm for this problem uses a pruning strategy: at the present time T, it selects candidates for the top-k most similar sets from past similarity values and computes similarity values only for those candidates. This paper proposes a new exact algorithm that shortens the execution time by computing similarity values only for the sets whose similarity can change between time \(T-1\) and time T. We identify such sets quickly with frequency-based inverted lists (FIL). Moreover, we derive each similarity value at T in O(1) time by updating the value computed at time \(T-1\). Experimentally, our exact algorithm runs one order of magnitude faster than the previous exact algorithm.
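The incremental idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the similarity measure or the exact layout of the FIL, so the sketch assumes a multiset Jaccard similarity (sum of minimum counts divided by sum of maximum counts) and assumes that the list stored under the key (item, f) holds every database set with at least f copies of that item. The class name `FilTopK` and its methods are hypothetical, and the final ranking step is a plain scan rather than an incrementally maintained top-k.

```python
from collections import Counter, defaultdict, deque
import heapq


class FilTopK:
    """Sliding-window query over a database of multisets (illustrative only)."""

    def __init__(self, database, window):
        self.window = window
        self.sizes = [len(items) for items in database]  # total multiplicity
        # Assumed FIL layout: fil[(item, f)] lists the ids of all sets
        # that contain at least f copies of item.
        self.fil = defaultdict(list)
        for sid, items in enumerate(database):
            for item, count in Counter(items).items():
                for f in range(1, count + 1):
                    self.fil[(item, f)].append(sid)
        self.inter = [0] * len(database)  # sum of min counts per set
        self.query = deque()              # the latest `window` stream items
        self.qcount = Counter()           # item frequencies inside the query

    def push(self, item):
        """Advance time by one step: add the new item, evict the oldest."""
        self.query.append(item)
        self.qcount[item] += 1
        # Only sets with at least qcount[item] copies gain one unit of overlap.
        for sid in self.fil.get((item, self.qcount[item]), ()):
            self.inter[sid] += 1
        if len(self.query) > self.window:
            old = self.query.popleft()
            # Only sets with at least qcount[old] copies lose one unit.
            for sid in self.fil.get((old, self.qcount[old]), ()):
                self.inter[sid] -= 1
            self.qcount[old] -= 1

    def similarity(self, sid):
        """Multiset Jaccard in O(1): sum of max = |S| + |Q| - sum of min."""
        union = self.sizes[sid] + len(self.query) - self.inter[sid]
        return self.inter[sid] / union if union else 0.0

    def top_k(self, k):
        """Simplification: rank all sets instead of maintaining the top-k."""
        return heapq.nlargest(k, range(len(self.sizes)), key=self.similarity)


if __name__ == "__main__":
    db = [["a", "b"], ["a", "a", "c"], ["d"]]
    monitor = FilTopK(db, window=3)
    for x in ["a", "a", "b", "d"]:
        monitor.push(x)
    print(monitor.top_k(2))  # [0, 2]
```

Under these assumptions, each time step touches only the sets found in at most two inverted lists, one for the arriving item and one for the evicted item, and each touched set is updated by a single increment or decrement. Every other set keeps the overlap it had at time \(T-1\).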

Acknowledgments

This work was supported by JSPS KAKENHI Grant Number JP15K00148, 2016.

Author information

Authors and Affiliations

Graduate School of Informatics and Engineering, The University of Electro-Communications, Chofugaoka 1-5-1, Chofu, Tokyo, 182-8585, Japan
Tomohiro Yamazaki, Hisashi Koga & Takahisa Toda

Corresponding author

Correspondence to Tomohiro Yamazaki.

Editor information

Editors and Affiliations

Korea Institute of Science and Technology Information, Daejeon, Korea (Republic of)
Won-Kyung Sung
Korea Institute of Science and Technology Information, Daejeon, Korea (Republic of)
Hanmin Jung
Beijing University of Technology, Beijing, China
Shuo Xu
Burapha University, Chonburi, Thailand
Krisana Chinnasarn
Kwansei Gakuin University, Himeji, Hyogo, Japan
Kazutoshi Sumiya
Korea Institute of Science and Technology Information, Daejeon, Korea (Republic of)
Jeonghoon Lee
Renmin University of China, Beijing, China
Zhicheng Dou
Georgetown University, Washington, District of Columbia, USA
Grace Hui Yang
Konkuk University, Seoul, Korea (Republic of)
Young-Guk Ha
Korea Institute of Science and Technology Information, Daejeon, Korea (Republic of)
Seungbock Lee

About this paper

Cite this paper

Yamazaki, T., Koga, H., Toda, T. (2017). Fast Exact Algorithm to Solve Continuous Similarity Search for Evolving Queries. In: Sung, WK., et al. Information Retrieval Technology. AIRS 2017. Lecture Notes in Computer Science, vol 10648. Springer, Cham. https://doi.org/10.1007/978-3-319-70145-5_7

DOI: https://doi.org/10.1007/978-3-319-70145-5_7
Published: 08 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70144-8
Online ISBN: 978-3-319-70145-5
eBook Packages: Computer Science, Computer Science (R0)
