doi:10.1016/j.ins.2006.07.024
Copyright © 2006 Elsevier Inc. All rights reserved.
Parallel comparison of run-length-encoded strings on a linear systolic array
STI – University of Urbino, Piazza della Repubblica, 13, Urbino 61029, Italy
Received 20 July 2005;
revised 13 July 2006;
accepted 15 July 2006.
Available online 10 August 2006.
Abstract
The length of the longest common subsequence (LCS) between two strings of M and N characters can be computed by an O(M × N) dynamic programming algorithm that can be executed in O(M + N) steps on a linear systolic array. It has recently been shown that the LCS between run-length-encoded (RLE) strings of m and n runs can be computed by an O(nM + Nm − nm) algorithm that could be executed in O(m + n) steps by parallel hardware. However, that algorithm cannot be mapped directly onto a linear systolic array because of its irregular structure.
In this paper, we propose a modified algorithm that exhibits a more regular structure at the cost of a marginal reduction in the efficiency of RLE. We outline the algorithm and discuss its mapping onto a linear systolic array.
Keywords: Algorithms; Longest common subsequence; Run-length encoding; Systolic array; String comparison
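As an illustration of the baseline the abstract refers to, the following minimal Python sketch computes the LCS length with the classic O(M × N) recurrence, then re-evaluates the same recurrence one anti-diagonal at a time. Cells on one anti-diagonal do not depend on each other, which is why a parallel array could in principle finish in O(M + N) steps. This is a software simulation under that assumption, not the systolic design or the RLE-based algorithm proposed in the paper. The function names (lcs_length, lcs_length_wavefront, run_length_encode) are hypothetical and chosen only for this example.

```python
from itertools import groupby


def lcs_length(x, y):
    """Length of the LCS of x and y via the classic O(M*N) recurrence."""
    m, n = len(x), len(y)
    # table[i][j] holds the LCS length of x[:i] and y[:j];
    # row 0 and column 0 stay at 0 (empty prefix).
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]


def lcs_length_wavefront(x, y):
    """Same recurrence, evaluated one anti-diagonal (i + j = d) per step.

    Returns (lcs_length, steps). Every cell on a given anti-diagonal only
    reads cells from the two previous anti-diagonals, so all of them could
    be computed at once by parallel hardware (assumption of this sketch).
    """
    m, n = len(x), len(y)
    if m == 0 or n == 0:
        return 0, 0
    table = [[0] * (n + 1) for _ in range(m + 1)]
    steps = 0
    for d in range(2, m + n + 1):
        # Keep 1 <= i <= m and 1 <= j = d - i <= n.
        for i in range(max(1, d - n), min(m, d - 1) + 1):
            j = d - i
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
        steps += 1  # one parallel step per anti-diagonal
    return table[m][n], steps


def run_length_encode(s):
    """Run-length encode s as a list of (character, run_length) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]
```

For two non-empty strings the wavefront version takes M + N − 1 steps, compared with M × N cell updates in the sequential version. The number of runs produced by run_length_encode corresponds to the quantities m and n used in the abstract.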
Fig. 1. Basic dynamic programming algorithm for LCS: (a) 14 × 6 dynamic programming matrix; (b) computation steps on a systolic array of six elements; (c) binding between computation tasks and systolic array elements.
Fig. 2. RLE–LCS dynamic programming algorithm: (a) 14 × 6 dynamic programming matrix; (b) computation steps on a systolic array of 16 elements; (c) binding between computation tasks and systolic array elements.
Fig. 3. RLE–LCS algorithm for computing LCS between RLE strings.
Fig. 4. RLLE–LCS dynamic programming algorithm: (a) 14 × 6 dynamic programming matrix; (b) computation steps on a systolic array of 20 elements; (c) binding between computation tasks and systolic array elements.
Fig. 5. (a) Generic block in position (i, j) of basic and extended LCS matrices. (b) Local connections required to provide input data to the elements of block (i, j). (c) Mapping of RLLE(LMax,1)–LCS algorithm on a linear systolic array. (d) Schematic of the basic computational unit.
Fig. 6. (a) Efficiency of RLLE, expressed as the ratio between the number of runs in the RLLELMax representation and the number of characters in the original string. (b) Normalized time–space product for the parallel implementation of RLLE(LMax,1).