Elsevier

Discrete Applied Mathematics

Volume 210, 10 September 2016, Pages 75-87

Repetition-free longest common subsequence of random sequences

https://doi.org/10.1016/j.dam.2015.07.005
Under an Elsevier user license
open archive

Abstract

A repetition-free Longest Common Subsequence (LCS) of two sequences x and y is a longest common subsequence of x and y among those in which each symbol appears at most once. Let R denote the length of a repetition-free LCS of two sequences of n symbols, each symbol chosen uniformly and independently at random from a k-ary alphabet. We study the asymptotic behavior of R in n and k, and establish that there are three distinct regimes, depending on the relative speed of growth of n and k. For each regime we establish the limiting behavior of R. In fact, we do more: we establish tail bounds for large deviations of R from its limiting behavior.
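To make the definition of R concrete, the following is a minimal brute-force sketch, not the method of the paper. It computes the length of a repetition-free LCS by the usual LCS recursion extended with the set of symbols already used, and draws one sample of R for given n and k. The function names `repetition_free_lcs_length` and `sample_r` are illustrative assumptions, and the running time grows exponentially in k, so it is only meant for small instances.

```python
import random
from functools import lru_cache


def repetition_free_lcs_length(x, y):
    """Length of a longest common subsequence of x and y in which
    no symbol occurs more than once (brute force, exponential in
    the number of distinct symbols)."""
    n, m = len(x), len(y)

    @lru_cache(maxsize=None)
    def best(i, j, used):
        # Best length achievable on x[i:], y[j:] without reusing
        # any symbol in the frozenset `used`.
        if i == n or j == m:
            return 0
        # Option 1: skip x[i]; option 2: skip y[j].
        result = max(best(i + 1, j, used), best(i, j + 1, used))
        # Option 3: match x[i] with y[j] if the symbol is still unused.
        if x[i] == y[j] and x[i] not in used:
            result = max(result, 1 + best(i + 1, j + 1, used | {x[i]}))
        return result

    return best(0, 0, frozenset())


def sample_r(n, k, seed=None):
    """Draw two independent uniform sequences of length n over
    {0, ..., k-1} and return the repetition-free LCS length R."""
    rng = random.Random(seed)
    x = tuple(rng.randrange(k) for _ in range(n))
    y = tuple(rng.randrange(k) for _ in range(n))
    return repetition_free_lcs_length(x, y)
```

For example, `repetition_free_lcs_length("abab", "abab")` is 2, although the ordinary LCS of these two strings has length 4. In general R is at most min(n, k), since a repetition-free sequence can use each of the k symbols at most once.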

Our study is motivated by the so-called exemplar model proposed by Sankoff (1999) and the related similarity measure introduced by Adi et al. (2010). A natural question that arises in this context, and which, as we show, is related to long-standing open problems in probabilistic combinatorics, is to understand the asymptotic behavior of the parameter R in n and k.

Keywords

Repetition-free subsequence
Common subsequence
Random sequences
