Elsevier

Information Processing Letters

Volume 128, December 2017, Pages 54-57
Two strings at Hamming distance 1 cannot be both quasiperiodic

https://doi.org/10.1016/j.ipl.2017.08.005

Highlights

  • We prove that two strings at Hamming distance 1 cannot be both quasiperiodic.

  • This generalizes a known fact that two such strings cannot be both periodic.

  • Along the way we obtain new insights into combinatorics of quasiperiodicities.

Abstract

We present a generalization to quasiperiodicity of a known fact from combinatorics on words related to periodicity. A string is called periodic if it has a period which is at most half of its length. A string w is called quasiperiodic if it has a non-trivial cover, that is, there exists a string c that is shorter than w and such that every position in w is inside one of the occurrences of c in w. It is a folklore fact that two strings that differ at exactly one position cannot be both periodic. Here we prove a more general fact that two strings that differ at exactly one position cannot be both quasiperiodic. Along the way we obtain new insights into combinatorics of quasiperiodicities.

Introduction

A string is a finite sequence of letters over an alphabet $\Sigma$. If w is a string, then by $|w| = n$ we denote its length, by w[i] for $i \in \{1, \ldots, n\}$ we denote its i-th letter, and by w[i..j] we denote a factor of w, that is, the string composed of the letters $w[i] \cdots w[j]$ (if $i > j$, then it is the empty string). A factor w[i..j] is called a prefix if $i = 1$ and a suffix if $j = n$.

An integer p is called a period of w if $w[i] = w[i+p]$ for all $i = 1, \ldots, n-p$. A string u is called a border of w if it is both a prefix and a suffix of w. It is a fundamental fact of string periodicity that a string w has a period p if and only if it has a border of length $n-p$; see [5], [9]. If p is a period of w, then w[1..p] is called a string period of w. If w has a period p such that $p \le n/2$, then w is called periodic. In this case w has a border of length at least $n/2$.
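The correspondence between periods and borders can be checked directly on small examples. The following is a minimal brute-force sketch, not an efficient algorithm; the function names `periods`, `border_lengths` and `is_periodic` are hypothetical, strings are 0-indexed as usual in Python (the text uses 1-indexing), and restricting periods to the range 1..n and borders to lengths below n is an assumption made here for illustration.

```python
def periods(w):
    """All p in 1..n with w[i] == w[i+p] wherever both positions exist."""
    n = len(w)
    return [p for p in range(1, n + 1)
            if all(w[i] == w[i + p] for i in range(n - p))]


def border_lengths(w):
    """Lengths b < n such that the prefix and suffix of length b coincide."""
    n = len(w)
    return [b for b in range(n) if w[:b] == w[n - b:]]


def is_periodic(w):
    """True if w has a period that is at most half of its length."""
    return any(2 * p <= len(w) for p in periods(w))


if __name__ == "__main__":
    w = "abaab"
    print(periods(w))         # periods of w
    print(border_lengths(w))  # each border length equals n - p for some period p
    print(is_periodic(w), is_periodic("abab"))
```

For every string, the list of values n - p over all periods p coincides with the list of border lengths, which is the fact cited from [5], [9].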

For two strings w and w' of the same length n, we write $w =_j w'$ if $w[i] = w'[i]$ for all $i \in \{1, \ldots, n\} \setminus \{j\}$ and $w[j] \ne w'[j]$. This means that w and w' are at Hamming distance 1, where the Hamming distance counts the number of positions at which two equal-length strings differ. The following fact states a folklore property of string periodicity that we generalize in this work to string quasiperiodicity.

Fact 1

Let w and w' be two strings of length n and $j \in \{1, \ldots, n\}$ be an index. If $w =_j w'$, then at most one of the strings w, w' is periodic.

Fact 1 is, in particular, a consequence of a variant of Fine and Wilf's periodicity lemma that was proved by Berstel and Boasson in [2] in the context of partial words with one hole (a hole is a don't care symbol). For completeness we provide its proof in Section 4 without using the terms of partial words.
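As a sanity check rather than a proof, Fact 1 can be tested exhaustively for short strings. The sketch below is a brute-force experiment under stated assumptions: the helper names (`is_periodic`, `differ_at_one`, `fact1_counterexamples`) are hypothetical, and the default binary alphabet is chosen here only to keep the search small.

```python
from itertools import product


def is_periodic(w):
    """True if w has a period p with 1 <= p <= len(w) / 2."""
    n = len(w)
    return any(all(w[i] == w[i + p] for i in range(n - p))
               for p in range(1, n // 2 + 1))


def differ_at_one(w, v):
    """True if w and v have equal length and Hamming distance exactly 1."""
    return len(w) == len(v) and sum(a != b for a, b in zip(w, v)) == 1


def fact1_counterexamples(n, alphabet="ab"):
    """Pairs (w, v) at Hamming distance 1 that are both periodic."""
    bad = []
    for letters in product(alphabet, repeat=n):
        w = "".join(letters)
        if not is_periodic(w):
            continue
        for j in range(n):
            for c in alphabet:
                if c != w[j]:
                    v = w[:j] + c + w[j + 1:]  # change position j only
                    if is_periodic(v):
                        bad.append((w, v))
    return bad


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, fact1_counterexamples(n))  # Fact 1 predicts empty lists
```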

We say that a string c covers a string w ($|w| = n$) if for every position $k \in \{1, \ldots, n\}$ there exists a factor $w[i..j] = c$ such that $i \le k \le j$. Then c is called a cover of w; see Fig. 1. A string w is called quasiperiodic if it has a cover of length smaller than n.
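The definition of a cover translates into a simple left-to-right scan over the occurrences of c. This is a minimal quadratic-time sketch for illustration; the names `covers` and `is_quasiperiodic` are hypothetical, and indices are 0-based. Since position 1 of w must be covered, every cover is a prefix of w, so only prefixes need to be tried as candidates.

```python
def covers(c, w):
    """True if every position of w lies inside some occurrence of c in w."""
    n, m = len(w), len(c)
    if m == 0 or m > n:
        return False
    covered = 0  # positions 0..covered-1 are covered so far
    for i in range(n - m + 1):
        if w.startswith(c, i):
            if i > covered:      # position `covered` can no longer be covered
                return False
            covered = i + m
    return covered == n


def is_quasiperiodic(w):
    """True if some proper prefix of w covers w."""
    return any(covers(w[:m], w) for m in range(1, len(w)))


if __name__ == "__main__":
    print(covers("aba", "ababaaba"))
    print(is_quasiperiodic("abaab"))
```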

A significant amount of work has been devoted to the computation of covers in a string. A linear-time algorithm finding the shortest cover of a string was proposed by Apostolico et al. [1]. Later a linear-time algorithm computing all the covers of a string was proposed by Moore and Smyth [10]. Breslauer [3] gave an on-line O(n)-time algorithm computing the cover array of a string of length n, that is, an array specifying the lengths of shortest covers of all the prefixes of the string. Li and Smyth [8] provided a linear-time algorithm for computing the array of longest covers of all the prefixes of a string that can be used to populate all the covers of every prefix. All these papers employ various combinatorial properties of covers.

Our main contribution is stated as the following theorem. As we have already mentioned, a periodic string has a border long enough to be the string's cover. Hence, a periodic string is also quasiperiodic, and Theorem 2 generalizes Fact 1.

Theorem 2

Let w and w' be two strings of length n and $j \in \{1, \ldots, n\}$ be an index. If $w =_j w'$, then at most one of the strings w, w' is quasiperiodic.
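The statement of Theorem 2 can be tested exhaustively for short strings. The sketch below is an experiment, not a substitute for the proof; the function names are hypothetical and the small alphabets are an assumption made to keep the enumeration cheap.

```python
from itertools import product


def is_quasiperiodic(w):
    """True if some proper prefix of w covers every position of w."""
    n = len(w)
    for m in range(1, n):
        c, reach = w[:m], 0      # positions 0..reach-1 are covered
        for i in range(n - m + 1):
            if i > reach:        # gap: position `reach` stays uncovered
                break
            if w.startswith(c, i):
                reach = i + m
        if reach == n:
            return True
    return False


def theorem2_counterexamples(n, alphabet="ab"):
    """Pairs of quasiperiodic strings of length n at Hamming distance 1."""
    quasi = {"".join(t) for t in product(alphabet, repeat=n)
             if is_quasiperiodic("".join(t))}
    bad = []
    for w in sorted(quasi):
        for j in range(n):
            for c in alphabet:
                v = w[:j] + c + w[j + 1:]
                if c != w[j] and v in quasi:
                    bad.append((w, v))
    return bad


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, len(theorem2_counterexamples(n)))  # Theorem 2 predicts 0
```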

The proof of Theorem 2 is divided into three sections. In Section 2 we restate several simple preliminary observations. Then, Section 3 contains a proof of a crucial auxiliary lemma which shows a combinatorial property of seeds that we use extensively in the main result. Finally, Section 4 contains the main proof.

Preliminaries

We say that a string s is a seed of a string w if $|s| \le |w|$ and w is a factor of some string u covered by s; see Fig. 2. Furthermore, s is called a left seed of w if s is both a prefix and a seed of w. Thus a cover of w is always a left seed of w, and a left seed of w is a seed of w. The notion of seed was introduced in [6] and efficient computation of seeds was further considered in [4], [7].
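The definitions of seed and left seed can be illustrated with a brute-force test. This sketch is not one of the efficient algorithms from [4], [7]; it relies on the observation (an assumption of this example, argued informally in the comments) that the string u may be taken to extend w by fewer than |s| letters on each side, with both overhangs being parts of s. The names `covers`, `is_seed` and `is_left_seed` are hypothetical.

```python
def covers(c, w):
    """True if every position of w lies inside some occurrence of c in w."""
    n, m = len(w), len(c)
    if m == 0 or m > n:
        return False
    covered = 0
    for i in range(n - m + 1):
        if w.startswith(c, i):
            if i > covered:
                return False
            covered = i + m
    return covered == n


def is_seed(s, w):
    """True if |s| <= |w| and w is a factor of some string covered by s."""
    n, m = len(w), len(s)
    if m == 0 or m > n:
        return False
    # x letters of s hang over the left end of w, y letters over the right end;
    # any u covered by s can be trimmed to such a form without losing coverage.
    for x in range(m):
        if not w.startswith(s[x:]):
            continue
        for y in range(m):
            if w.endswith(s[:m - y]) and covers(s, s[:x] + w + s[m - y:]):
                return True
    return False


def is_left_seed(s, w):
    """True if s is both a prefix of w and a seed of w."""
    return w.startswith(s) and is_seed(s, w)


if __name__ == "__main__":
    print(is_seed("ab", "babab"))        # w is a factor of "ababab"
    print(is_left_seed("aba", "abaab"))  # left seed that is not a cover
    print(covers("aba", "abaab"))
```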

In the proof of our main result we use the following easy observations that are immediate consequences of

Auxiliary lemma

In the following lemma we observe a new property of the notion of seed. As we will see in Section 4, this lemma encapsulates the hardness of multiple cases in the proof of the main result.

Before we proceed to the lemma, however, let us introduce an additional notion lying in between periodicity and quasiperiodicity. We say that a string w of length n is almost periodic with period p if there exists an index $j \in \{1, \ldots, n-p\}$ such that $w[i] = w[i+p]$ for all $i = 1, \ldots, n-p$, $i \ne j$, and $w[j] \ne w[j+p]$. In this case we
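Read literally, the definition of almost periodicity above says that the comparison of w with its shift by p fails at exactly one position. A minimal sketch of that reading follows; the function names are hypothetical, and the mismatch positions are reported 1-indexed to match the text.

```python
def mismatch_positions(w, p):
    """1-indexed positions i in 1..n-p with w[i] != w[i+p]."""
    return [i + 1 for i in range(len(w) - p) if w[i] != w[i + p]]


def is_almost_periodic(w, p):
    """True if w[i] == w[i+p] fails for exactly one i in 1..n-p."""
    return len(mismatch_positions(w, p)) == 1


if __name__ == "__main__":
    print(mismatch_positions("abcabcabd", 3))
    print(is_almost_periodic("abcabcabd", 3))
    print(is_almost_periodic("abcabc", 3))  # a true period, so not "almost"
```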

Main result

In this section we first present a proof of the folklore property of string periodicity (Fact 1) for completeness, and then proceed to the proof of our main result being a generalization of that fact (Theorem 2).

Proof of Fact 1

Assume to the contrary that $w =_j w'$ and both strings are periodic. Let p and p' ($p, p' \le n/2$) be the shortest periods of w and w', respectively. Assume w.l.o.g. that $p' \le p$. It suffices to prove the lemma in the case that w is a square of length 2p and $j \le p$. Let us define w1=w[1..p] and w2=w[

Conclusions

In this note we have proved that every two distinct quasiperiodic strings of the same length differ at more than one position. This bound is tight, as, for instance, for every even $n \ge 2$ the strings $a^{n/2-1} b\, a^{n/2-1} b$ and $a^n$ are both quasiperiodic and differ at exactly two positions.
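The tightness example can be checked mechanically for small even n. The sketch below builds the two strings and verifies both properties; the helper names (`tight_pair`, `hamming`, `is_quasiperiodic`) are hypothetical and the quasiperiodicity test is a brute-force one.

```python
def is_quasiperiodic(w):
    """True if some proper prefix of w covers every position of w."""
    n = len(w)
    for m in range(1, n):
        c, reach = w[:m], 0
        for i in range(n - m + 1):
            if i > reach:
                break
            if w.startswith(c, i):
                reach = i + m
        if reach == n:
            return True
    return False


def hamming(w, v):
    """Number of positions at which two equal-length strings differ."""
    assert len(w) == len(v)
    return sum(a != b for a, b in zip(w, v))


def tight_pair(n):
    """The strings a^(n/2-1) b a^(n/2-1) b and a^n for even n >= 2."""
    half = "a" * (n // 2 - 1) + "b"
    return half + half, "a" * n


if __name__ == "__main__":
    for n in range(2, 13, 2):
        w, v = tight_pair(n)
        print(w, v, hamming(w, v), is_quasiperiodic(w), is_quasiperiodic(v))
```

The first string is a square, so its half covers it, and the second is covered by the single letter a.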

Acknowledgements

The authors thank Maxime Crochemore, Solon P. Pissis, and Wojciech Rytter for helpful discussions. We also thank an anonymous referee whose suggestions helped to simplify the proof of Theorem 2. Amihood Amir was partially supported by the ISF grant 571/14 and the Royal Society. Costas S. Iliopoulos was partially supported by the Onassis Foundation and the Royal Society.


1

The author is a Newton International Fellow.
