Abstract
Given a text and a pattern over an alphabet, the classic exact matching problem searches for all occurrences of the pattern in the text. Unlike exact matching, order-preserving pattern matching (OPPM) considers the relative order of elements rather than their exact values. In this paper, we propose efficient algorithms for the OPPM problem using the “duel-and-sweep” paradigm. For a pattern of length m and a text of length n, our serial algorithm runs in \(O(n + m\log m)\) time. Our parallel algorithm runs in \(O(\log ^2 m)\) time and \(O(n \log ^2 m)\) work, with pattern preprocessing in \(O(\log m)\) time and \(O(m \log m)\) work, on the Priority Concurrent Read Concurrent Write Parallel Random-Access Machine (P-CRCW PRAM).
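To make the problem statement concrete, the following is a minimal brute-force sketch of OPPM. It is not the duel-and-sweep algorithm proposed in the paper: it takes roughly \(O(n m^2)\) time and serves only as a reference for what counts as an occurrence. It assumes that two sequences match when \(X[i] \le X[j]\) holds exactly when \(Y[i] \le Y[j]\) for all positions i and j. The function names `order_isomorphic` and `naive_oppm` are hypothetical, and positions are 0-based here.

```python
def order_isomorphic(x, y):
    """Return True if x and y have equal length and the same relative order.

    Assumed definition: x[i] <= x[j] holds exactly when y[i] <= y[j].
    """
    if len(x) != len(y):
        return False
    n = len(x)
    return all((x[i] <= x[j]) == (y[i] <= y[j])
               for i in range(n) for j in range(n))


def naive_oppm(text, pattern):
    """Return all 0-based start positions where a window of text
    is order-isomorphic to pattern (brute force, for illustration)."""
    m = len(pattern)
    return [s for s in range(len(text) - m + 1)
            if order_isomorphic(text[s:s + m], pattern)]


text = [10, 20, 15, 30, 25, 40]
pattern = [1, 3, 2]              # shape: low, high, middle
print(naive_oppm(text, pattern))  # [0, 2]
```

The windows (10, 20, 15) and (15, 30, 25) share the shape of the pattern even though no values coincide, which is the sense in which OPPM ignores exact values.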













Data availability
No datasets were generated or analysed during the current study.
Notes
Our preliminary paper [18] on this topic, presented at SOFSEM 2020, is in error.
The value of the last argument r in the function GetMismatchPos\(( Lmax _{X}, Lmin _{X}, Y, r)\) is usually 0, except in the call from Algorithm 12.
Actually Lemma 29 holds when \(i > j\) and \(j-i + 1 \le w_1\), but we are concerned only with the case where \(i < j\).
References
Amir, A., Kondratovsky, E.: Sufficient conditions for efficient indexing under different matchings. In: Proceedings of 30th annual symposium on combinatorial pattern matching (CPM 2019), Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2019)
Amir, A., Benson, G., Farach, M.: An alphabet independent approach to two-dimensional pattern matching. SIAM J. Comput. 23(2), 313–323 (1994)
Berkman, O., Schieber, B., Vishkin, U.: Optimal doubly logarithmic parallel algorithms based on finding all nearest smaller values. J. Algorithms 14(3), 344–370 (1993)
Boyer, R.S., Moore, J.S.: A fast string searching algorithm. Commun. ACM 20(10), 762–772 (1977). https://doi.org/10.1145/359842.359859
Cantone, D., Faro, S., Külekci, M.O.: An efficient skip-search approach to the order-preserving pattern matching problem. In: PSC, pp 22–35 (2015)
Chhabra, T., Tarhio, J.: A filtration method for order-preserving matching. Inf. Process. Lett. 116(2), 71–74 (2016). https://doi.org/10.1016/j.ipl.2015.10.005
Chhabra, T., Külekci, M.O., Tarhio, J.: Alternative algorithms for order-preserving matching. In: PSC, pp 36–46 (2015)
Cho, S., Na, J.C., Park, K., et al.: A fast algorithm for order-preserving pattern matching. Inf. Process. Lett. 115(2), 397–402 (2015)
Cole, R.: Parallel merge sort. SIAM J. Comput. 17(4), 770–785 (1988)
Cole, R., Hazay, C., Lewenstein, M., et al.: Two-dimensional parameterized matching. ACM Trans. Algorithms 11(2), 1–12 (2014). https://doi.org/10.1145/2650220
Crochemore, M., Iliopoulos, C.S., Kociumaka, T., et al.: Order-preserving indexing. Theor. Comput. Sci. 638, 122–135 (2016). https://doi.org/10.1016/j.tcs.2015.06.050
Faro, S., Külekci, M.O.: Efficient algorithms for the order preserving pattern matching problem. In: International Conference on Algorithmic Applications in Management, Springer, pp 185–196 (2016)
Faro, S., Lecroq, T.: The exact online string matching problem: a review of the most recent results. ACM Comput. Surv. (CSUR) 45(2), 1–42 (2013)
Hasan, M.M., Islam, A.S., Rahman, M.S., et al.: Order preserving pattern matching revisited. Pattern Recogn. Lett. 55, 15–21 (2015)
Horspool, R.N.: Practical fast searching in strings. Softw: Pract. Exp. 10(6), 501–506 (1980). https://doi.org/10.1002/spe.4380100608
JáJá, J.: An Introduction to Parallel Algorithms, vol. 17. Addison-Wesley, Reading (1992)
Jargalsaikhan, D., Diptarama, Ueki, Y., et al.: Duel and sweep algorithm for order-preserving pattern matching. In: SOFSEM 2018: Theory and Practice of Computer Science, 44th International Conference on Current Trends in Theory and Practice of Computer Science, Krems, Austria, January 29-February 2, Proceedings, pp 624–635 (2018)
Jargalsaikhan, D., Hendrian, D., Yoshinaka, R., et al.: Parallel duel-and-sweep algorithm for the order-preserving pattern matching. In: International Conference on Current Trends in Theory and Practice of Informatics, pp 211–222 (2020)
Jargalsaikhan, D., Hendrian, D., Yoshinaka, R., et al.: Parallel algorithm for pattern matching problems under substring consistent equivalence relations. In: 33rd Annual Symposium on Combinatorial Pattern Matching, CPM 2022, June 27-29, 2022, Prague, Czech Republic, LIPIcs, vol. 223, pp 28:1–28:21. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2022a). https://doi.org/10.4230/LIPIcs.CPM.2022.28
Jargalsaikhan, D., Hendrian, D., Yoshinaka, R., et al.: Parallel algorithm for pattern matching problems under substring consistent equivalence relations. CoRR abs/2202.13284 (2022b). https://arxiv.org/abs/2202.13284
Kim, J., Eades, P., Fleischer, R., et al.: Order-preserving matching. Theoret. Comput. Sci. 525, 68–79 (2014)
Knuth, D.E., Morris, J.H., Jr., Pratt, V.R.: Fast pattern matching in strings. SIAM J. Comput. 6(2), 323–350 (1977). https://doi.org/10.1137/0206024
Kubica, M., Kulczyński, T., Radoszewski, J., et al.: A linear time algorithm for consecutive permutation pattern matching. Inf. Process. Lett. 113(12), 430–433 (2013)
Matsuoka, Y., Aoki, T., Inenaga, S., et al.: Generalized pattern matching and periodicity under substring consistent equivalence relations. Theoret. Comput. Sci. 656, 225–233 (2016)
Ueki, Y., Narisawa, K., Shinohara, A.: A fast order-preserving matching with \(q\)-neighborhood filtration using SIMD instructions. In: SOFSEM (Student Research Forum Papers/Posters), pp 108–115 (2016)
Vishkin, U.: Optimal Parallel Pattern Matching in Strings. International Colloquium on Automata, Languages, and Programming, pp. 497–508. Springer, Berlin (1985)
Vishkin, U.: Deterministic sampling: a new technique for fast pattern matching. SIAM J. Comput. 20(1), 22–40 (1991)
Acknowledgements
We would like to express our sincere gratitude to the anonymous reviewers for their constructive comments and valuable feedback. Their contributions have been helpful in refining the final version of this paper. This work was supported by JSPS KAKENHI Grant Numbers JP15H05706, JP18K11150, JP19K20208, JP20H05703, and JP21K11745. This work was also supported by the ImPACT Program of the Council for Science, Technology and Innovation (Cabinet Office, Government of Japan). Davaajav Jargalsaikhan was supported by a research grant from Tohoku University Division for International Advanced Research and Education.
Contributions
All the co-authors contributed equally to this work.
Ethics declarations
Conflict of interests
The authors declare no competing interests.
Appendix A. Comparison with the parallel SCER matching algorithms
Our proposed parallel algorithm is largely based on the general matching algorithm for arbitrary substring consistent equivalence relations (SCERs) proposed by Jargalsaikhan et al. [19, 20]. An equivalence relation \(\cong \) over \(\Sigma ^*\) is called an SCER if \(X \cong Y\) implies \(|X|=|Y|\) and \(X[i \mathbin {:} j] \cong Y[i \mathbin {:} j]\) for any \(1 \le i \le j \le |X|\). Clearly, the OPM relation \(\approx \) is an SCER. The algorithm in [19] uses an encoding that satisfies the following conditions.
Definition 34
(\(\cong \)-encoding, [19, Definition 3]) Let \(\Sigma \) and \(\Delta \) be (possibly infinite) alphabets. We say a function \(f:\Sigma ^* \rightarrow \Delta ^*\) is an \(\cong \)-encoding if
(1) \(f(X) = f(Y)\) iff \(X \cong Y\),
(2) \(f(X[1:i]) = f(X)[1:i]\) for any \(i \le |X|\), and
(3) \(f(X)[i] = f(Y)[i]\) implies \(f(X[j+1:k])[i-j] = f(Y[j+1:k])[i-j]\) for any \(j < i \le k\).
If \(X \not \cong Y\), one can find a witness position i such that \(f(X)[i] \ne f(Y)[i]\). The conditions (2) and (3) imply that when a position witnesses mismatch between substrings of X and Y, then that position witnesses the mismatch between the whole strings X and Y as well. This is an important property to “transfer” witnesses on an offset for other offsets in their algorithm. Accordingly, the efficiency of their algorithm depends on the efficiency of the calculation of the encoding of a string as well as that of recalculating the encoding of a substring from the encoding of the whole string.
Indeed, every SCER \(\cong \) admits an encoding satisfying the above: let \(\Delta \) be the power set of \(\Sigma ^*\), and let \(f(\varepsilon )=\varepsilon \) and \(f(Xc) = f(X) \cdot [Xc]_{\cong }\) for \(c \in \Sigma \), where \([Y]_{\cong }\) is the equivalence class of \(Y \in \Sigma ^*\) under \(\cong \) (alternatively, \(\Delta \) can be any set of symbols that can represent those equivalence classes). This construction guarantees that their algorithm works for every SCER in theory, but computing this encoding is apparently expensive. In fact, many SCERs, such as exact match, parameterized match, and Cartesian match, admit computationally cheaper encodings satisfying Definition 34. However, we do not yet know whether there is such a reasonable encoding for OPM.
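As an illustration of a cheap encoding, the sketch below brute-force checks conditions (1)–(3) of Definition 34 for the prev-encoding commonly used for parameterized matching: each position stores the distance to the previous occurrence of the same symbol, or 0 if there is none. That this is the encoding intended in [19] is an assumption, as is treating every symbol as a parameter. The names `prev_encode`, `parameterized_match`, and `check_encoding` are hypothetical, and the check covers only short strings over a small alphabet, so it is evidence rather than a proof.

```python
from itertools import product


def prev_encode(s):
    """Prev-encoding: distance to the previous occurrence of the same
    symbol, or 0 if the symbol has not occurred before."""
    last_seen = {}
    code = []
    for pos, sym in enumerate(s):
        code.append(pos - last_seen[sym] if sym in last_seen else 0)
        last_seen[sym] = pos
    return tuple(code)


def parameterized_match(x, y):
    """True if some bijection on symbols maps x onto y
    (assumption: every symbol is a parameter)."""
    if len(x) != len(y):
        return False
    fwd, bwd = {}, {}
    for a, b in zip(x, y):
        if fwd.setdefault(a, b) != b or bwd.setdefault(b, a) != a:
            return False
    return True


def check_encoding(f, equiv, alphabet, max_len):
    """Exhaustively test conditions (1)-(3) on all strings up to max_len."""
    strings = [s for n in range(max_len + 1)
               for s in product(alphabet, repeat=n)]
    for x in strings:
        fx = f(x)
        # Condition (2): the encoding of a prefix is a prefix of the encoding.
        if any(f(x[:i]) != fx[:i] for i in range(len(x) + 1)):
            return False
        for y in strings:
            fy = f(y)
            # Condition (1): equal encodings exactly when x and y are equivalent.
            if (fx == fy) != equiv(x, y):
                return False
            # Condition (3), with 1-based j < i <= k as in the definition;
            # the Python slice x[j:k] is the substring X[j+1:k].
            for k in range(1, min(len(x), len(y)) + 1):
                for j in range(k):
                    for i in range(j + 1, k + 1):
                        if (fx[i - 1] == fy[i - 1] and
                                f(x[j:k])[i - j - 1] != f(y[j:k])[i - j - 1]):
                            return False
    return True


print(prev_encode("abab"))                                        # (0, 0, 2, 2)
print(check_encoding(prev_encode, parameterized_match, "abc", 4))  # True
```

Condition (3) holds here because the re-encoded value inside a substring depends only on the original value and the offsets: it stays the same if the previous occurrence lies inside the substring and becomes 0 otherwise.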
Concerning the OPM relation \(\approx \), recall that \(X \approx Y\) if and only if \( Lmax _{X} = Lmax _{Y}\) and \( Lmin _{X} = Lmin _{Y}\). The encoding \( Lmix _X\) of X defined as \( Lmix _X[i]= \langle Lmin _{X}[i], Lmax _{X}[i] \rangle \) fulfills (1)–(2) of Definition 34, but not (3). For example, consider how this encoding witnesses the mismatch between \(X=(4,6,1,5)\) and \(Y=(4,6,9,5)\).
The mismatch is witnessed at position 3, but not at the last position, with this encoding. On the other hand, the mismatch of their suffixes \(X'=(1,5) \not \approx Y'=(9,5)\) is witnessed at the last position.
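The example can be checked mechanically. The sketch below assumes the usual nearest-neighbour definitions, which are not restated in this appendix: \( Lmax _X[i]\) is the rightmost position \(k < i\) holding the largest value not exceeding \(X[i]\), \( Lmin _X[i]\) is the rightmost position \(k < i\) holding the smallest value not below \(X[i]\), and 0 marks the absence of such a position. The function names `lmax_lmin`, `lmix`, and `witnesses` are hypothetical, and positions are 1-based as in the text.

```python
def lmax_lmin(x):
    """Return (Lmax, Lmin) as lists of 1-based positions, 0 meaning 'none'.

    Assumed convention: ties are broken in favour of the rightmost position.
    """
    lmax, lmin = [], []
    for i in range(len(x)):
        best_le = best_ge = None
        for k in range(i):
            # Largest earlier value that is <= x[i] (rightmost on ties).
            if x[k] <= x[i] and (best_le is None or x[k] >= x[best_le]):
                best_le = k
            # Smallest earlier value that is >= x[i] (rightmost on ties).
            if x[k] >= x[i] and (best_ge is None or x[k] <= x[best_ge]):
                best_ge = k
        lmax.append(0 if best_le is None else best_le + 1)
        lmin.append(0 if best_ge is None else best_ge + 1)
    return lmax, lmin


def lmix(x):
    """Lmix[i] = <Lmin[i], Lmax[i]>, in the pair order used in the text."""
    lmax, lmin = lmax_lmin(x)
    return list(zip(lmin, lmax))


def witnesses(x, y):
    """1-based positions at which the Lmix encodings of x and y differ."""
    return [i + 1 for i, (a, b) in enumerate(zip(lmix(x), lmix(y))) if a != b]


X, Y = (4, 6, 1, 5), (4, 6, 9, 5)
print(lmix(X))                   # [(0, 0), (0, 1), (1, 0), (2, 1)]
print(lmix(Y))                   # [(0, 0), (0, 1), (0, 2), (2, 1)]
print(witnesses(X, Y))           # [3]
print(witnesses(X[2:], Y[2:]))   # [2]
```

Under these assumptions the full strings agree at position 4, while the suffixes \((1,5)\) and \((9,5)\) disagree at their last position, which corresponds to position 4 of the full strings. This is exactly a violation of condition (3) with \(j=2\) and \(i=k=4\).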
Therefore, the algorithm proposed in [19] does not work with this encoding \( Lmix \). Instead, we have designed an algorithm that uses two positions as a mismatch witness, where we do not compare the encoded characters. This modification exempts us from the encoding costs. Furthermore, the OPM relation \(\approx \) is closed under reversal, i.e., \(X \approx Y\) iff \(\textrm{rev}(X) \approx \textrm{rev}(Y)\), which is not guaranteed in general SCERs. Thanks to this property, our proposed pattern preprocessing (Algorithm 9 more specifically) runs faster than the one for general SCERs in [19].
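The closure of \(\approx \) under reversal can be sanity-checked by brute force. The sketch below assumes the pairwise definition of order-isomorphism, namely that \(X[i] \le X[j]\) holds exactly when \(Y[i] \le Y[j]\), and tests all pairs of short sequences over a small value range. The function names are hypothetical, and an exhaustive check of this kind is an illustration, not a proof.

```python
from itertools import product


def order_isomorphic(x, y):
    """True if x and y have equal length and the same relative order
    (assumed definition: x[i] <= x[j] exactly when y[i] <= y[j])."""
    return len(x) == len(y) and all(
        (x[i] <= x[j]) == (y[i] <= y[j])
        for i in range(len(x)) for j in range(len(x)))


def closed_under_reversal(values, max_len):
    """Check X ~ Y iff rev(X) ~ rev(Y) for all sequences up to max_len."""
    seqs = [s for n in range(max_len + 1)
            for s in product(values, repeat=n)]
    return all(
        order_isomorphic(x, y) == order_isomorphic(x[::-1], y[::-1])
        for x in seqs for y in seqs)


print(closed_under_reversal(range(3), 4))  # True
```

The property follows directly from the pairwise definition, since reversing both sequences only renames the index pairs being compared.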
About this article
Cite this article
Jargalsaikhan, D., Hendrian, D., Ueki, Y. et al. Serial and parallel algorithms for order-preserving pattern matching based on the duel-and-sweep paradigm. Acta Informatica 61, 415–444 (2024). https://doi.org/10.1007/s00236-024-00464-w
DOI: https://doi.org/10.1007/s00236-024-00464-w