Abstract
We propose a new sort-based transform for lossless data compression that can replace the Burrows-Wheeler transform (BWT) in the block-sorting data compression algorithm. The proposed transform is a parametric generalization of the BWT and of the RadixZip transform proposed by Vo and Manku (VLDB, 2007), itself a recent variation of the BWT. For a class of parameters, the transform can be performed in time linear in the data length. We give an asymptotic compression bound attained by our algorithm.
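For orientation, the baseline that the abstract generalizes can be sketched in a few lines. The following is a minimal illustration of the classical BWT only, not the transform proposed in the paper. The function names `bwt` and `inverse_bwt` and the `$` sentinel are assumptions made for this sketch, and the naive rotation sort shown here is far from linear time.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Classical BWT via a naive sort of all rotations (illustrative only)."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    # Sort every cyclic rotation of the sentinel-terminated input.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform output is the last column of the sorted rotations.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Invert the BWT by repeatedly prepending the last column and re-sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        # Prepend the last column to each row, then sort the rows again.
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The original input is the row that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)
    print(inverse_bwt(transformed))
```

The sentinel is assumed to compare lower than every input symbol, which holds for `$` against ASCII letters. A block-sorting compressor would typically follow this step with a locally adaptive stage such as move-to-front and an entropy coder.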
References
Adjeroh, D., Bell, T., Mukherjee, A.: The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching. Springer, Heidelberg (2008)
Arimura, M., Yamamoto, H.: Asymptotic optimality of the block sorting data compression algorithm. IEICE Trans. Fundamentals E81-A(10), 2117–2122 (1998)
Bentley, J.L., Sleator, D.D., Tarjan, R.E., Wei, V.K.: A locally adaptive data compression scheme. Comm. ACM 29(4), 320–330 (1986)
Burrows, M., Wheeler, D.J.: A block-sorting lossless data compression algorithm, SRC Research Report 124, Digital Systems Research Center, Palo Alto (1994)
Deorowicz, S.: Improvements to Burrows–Wheeler compression algorithm. Software—Practice and Experience 30(13), 1465–1483 (2000)
Elias, P.: Universal codeword sets and representations of the integers. IEEE Trans. Inform. Theory IT-21, 194–203 (1975)
Nong, G., Zhang, S., Chan, W.H.: Computing inverse ST in linear complexity. In: Ferragina, P., Landau, G.M. (eds.) CPM 2008. LNCS, vol. 5029, pp. 178–190. Springer, Heidelberg (2008)
Schindler, M.: A fast block-sorting algorithm for lossless data compression. In: Proc. Data Compression Conf. (DCC 1997), Snowbird, UT, p. 469 (1997)
Vo, B.D., Manku, G.S.: RadixZip: Linear time compression of token streams. In: Very Large Data Bases: Proc. 33rd Intern. Conf. on Very Large Data Bases, Vienna, pp. 1162–1172 (2007)
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Inagaki, K., Tomizawa, Y., Yokoo, H. (2009). Novel and Generalized Sort-Based Transform for Lossless Data Compression. In: Karlgren, J., Tarhio, J., Hyyrö, H. (eds) String Processing and Information Retrieval. SPIRE 2009. Lecture Notes in Computer Science, vol 5721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03784-9_10
DOI: https://doi.org/10.1007/978-3-642-03784-9_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03783-2
Online ISBN: 978-3-642-03784-9
eBook Packages: Computer Science, Computer Science (R0)