Maximal strip recovery problem with gaps: Hardness and approximation algorithms

https://doi.org/10.1016/j.jda.2012.12.006

Abstract

Given two comparative maps, that is, two sequences of markers each representing a genome, the Maximal Strip Recovery problem (MSR) asks to extract a largest sequence of markers from each map such that the two extracted sequences are decomposable into non-intersecting strips (or synteny blocks). This aims at defining a robust set of synteny blocks between different species, which is key to understanding the evolutionary process since their last common ancestor. In this paper, we add a fundamental constraint to the initial problem, which expresses the biologically sustained need to bound the number of intermediate (non-selected) markers between two consecutive markers in a strip. We therefore introduce the problem δ-gap-MSR, where δ is a (usually small) non-negative integer that upper bounds the number of non-selected markers between two consecutive markers in a strip. We show that, if we restrict ourselves to comparative maps without duplicates, the problem is polynomial for δ = 0, NP-complete for δ = 1, and APX-hard for δ ≥ 2. For comparative maps with duplicates, the problem is APX-hard for all δ ≥ 0.
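
To make the problem statement concrete, the following is a minimal sketch of a feasibility check for δ-gap-MSR on maps without duplicates, together with an exhaustive search that is only usable on toy instances. It is not one of the algorithms of the paper. Several conventions are assumptions taken from the usual MSR setting rather than from the abstract: markers are signed non-zero integers, both maps contain the same markers, a strip has at least two markers, and a strip may appear in the second map either unchanged or reversed with all signs flipped. The function names `is_feasible` and `brute_force_gap_msr` are hypothetical.

```python
from itertools import combinations


def _gap_ok(pos, a, b, delta):
    """True if at most `delta` markers lie strictly between a and b in the original map."""
    return abs(pos[abs(a)] - pos[abs(b)]) - 1 <= delta


def is_feasible(g1, g2, kept, delta):
    """Check whether keeping exactly the markers in `kept` yields two sequences
    that decompose into strips of length >= 2 respecting the gap bound.

    Assumptions (not stated in the abstract): signed markers, no duplicates,
    same marker set in both maps, reversed strips have flipped signs.
    """
    r1 = [m for m in g1 if abs(m) in kept]  # extracted sequence of map 1
    r2 = [m for m in g2 if abs(m) in kept]  # extracted sequence of map 2
    pos1 = {abs(m): i for i, m in enumerate(g1)}
    pos2 = {abs(m): i for i, m in enumerate(g2)}
    adj2 = set(zip(r2, r2[1:]))  # adjacencies of the extracted map 2

    run = 1  # length of the current maximal block in r1
    for a, b in zip(r1, r1[1:]):
        conserved = (a, b) in adj2 or (-b, -a) in adj2
        if conserved and _gap_ok(pos1, a, b, delta) and _gap_ok(pos2, a, b, delta):
            run += 1
        else:
            if run < 2:  # an isolated marker cannot form a strip
                return False
            run = 1
    return not r1 or run >= 2


def brute_force_gap_msr(g1, g2, delta):
    """Exhaustive search, largest subsets first. Exponential time: toy sizes only."""
    markers = sorted(abs(m) for m in g1)
    for k in range(len(markers), -1, -1):
        for kept in combinations(markers, k):
            if is_feasible(g1, g2, set(kept), delta):
                return set(kept)
    return set()


if __name__ == "__main__":
    g1 = [1, 2, 3, 4, 5]
    g2 = [1, 3, 2, 4, 5]
    print(brute_force_gap_msr(g1, g2, delta=1))
    print(brute_force_gap_msr(g1, g2, delta=0))
```

On this toy pair of maps, allowing one non-selected marker inside a strip (δ = 1) lets markers 1, 2, 4, 5 form a single strip once marker 3 is dropped, whereas with δ = 0 only the strip made of markers 4 and 5 survives. The exhaustive search is purely illustrative; in particular, it is not the polynomial method the abstract refers to for δ = 0.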

Keywords

Algorithmic complexity
Approximation algorithms
Comparative maps
Genome comparison
Synteny blocks

A preliminary version of this paper appeared in the proceedings of the 20th International Symposium on Algorithms and Computation, ISAAC 2009, Bulteau et al. (2009) [8].