Page Segmentation Techniques in Document Analysis

Kise, Koichi

doi:10.1007/978-0-85729-859-1_5


Abstract

In this chapter, we describe the notions and methods of page segmentation, the task of dividing page images into homogeneous components such as text blocks, figures, and tables. Together with the classification of segmented components described in Chap. 7 (Page Similarity and Classification), it constitutes the overall process called layout analysis. The chapter begins by classifying page layout structures from several viewpoints, including the levels of components and the printing colors. We then classify the methods that handle each class of layout according to three viewpoints: (1) the objects to be analyzed, foreground or background; (2) the primitives of analysis, such as pixels, connected components, and maximal empty rectangles; and (3) the strategy of analysis, top-down or bottom-up. The classified methods are described in detail and compared with one another to clarify their pros and cons.
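
To make the three viewpoints concrete, the following is a minimal sketch of one point in that design space: a top-down strategy that uses pixels as primitives and splits the page along blank gaps in its projection profiles (a recursive XY-cut style procedure). It is an illustration under stated assumptions, not the chapter's own implementation: the page is assumed to be a deskewed binary image given as a list of rows (1 = ink, 0 = background), and the names `find_gaps`, `xy_cut`, and the `min_gap` threshold are hypothetical.

```python
def find_gaps(profile, min_gap):
    """Return (start, end) ranges of interior zero runs at least min_gap long."""
    gaps, run_start = [], None
    for i, value in enumerate(profile):
        if value == 0:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            # A run is closed by ink; runs touching index 0 are margins, not gaps.
            if run_start > 0 and i - run_start >= min_gap:
                gaps.append((run_start, i))
            run_start = None
    return gaps  # a trailing zero run is never closed, so it is ignored


def xy_cut(image, top, bottom, left, right, min_gap=2):
    """Recursively split image[top:bottom][left:right] at its widest blank gap.

    Returns leaf blocks as (top, bottom, left, right) tuples, end-exclusive.
    """
    # Projection profiles: ink count per row and per column of the region.
    rows = [sum(image[y][left:right]) for y in range(top, bottom)]
    cols = [sum(image[y][x] for y in range(top, bottom))
            for x in range(left, right)]
    if not any(rows):
        return []  # region contains no ink

    # Shrink the region to the bounding box of its ink.
    ink_rows = [i for i, v in enumerate(rows) if v]
    ink_cols = [i for i, v in enumerate(cols) if v]
    t, b = top + ink_rows[0], top + ink_rows[-1] + 1
    l, r = left + ink_cols[0], left + ink_cols[-1] + 1
    rows = rows[ink_rows[0]:ink_rows[-1] + 1]
    cols = cols[ink_cols[0]:ink_cols[-1] + 1]

    def width(gap):
        return gap[1] - gap[0]

    best_row = max(find_gaps(rows, min_gap), key=width, default=None)
    best_col = max(find_gaps(cols, min_gap), key=width, default=None)
    if best_row is None and best_col is None:
        return [(t, b, l, r)]  # no gap wide enough: this is a leaf block

    row_w = width(best_row) if best_row else 0
    col_w = width(best_col) if best_col else 0
    if row_w >= col_w:
        start, end = best_row  # horizontal cut between two row bands
        return (xy_cut(image, t, t + start, l, r, min_gap)
                + xy_cut(image, t + end, b, l, r, min_gap))
    start, end = best_col      # vertical cut between two column bands
    return (xy_cut(image, t, b, l, l + start, min_gap)
            + xy_cut(image, t, b, l + end, r, min_gap))


# Toy page: two blocks side by side on top, one full-width block below.
page = [[1] * 4 + [0] * 4 + [1] * 4 for _ in range(4)]
page += [[0] * 12 for _ in range(3)]
page += [[1] * 12 for _ in range(3)]

blocks = xy_cut(page, 0, len(page), 0, len(page[0]))
print(blocks)  # [(0, 4, 0, 4), (0, 4, 8, 12), (7, 10, 0, 12)]
```

A bottom-up method would instead start from primitives such as connected components and merge them into larger units, and a background-based method would analyze the white space itself, for example through maximal empty rectangles, rather than the ink. The sketch above also relies on straight horizontal and vertical gaps, so it only suits layouts that such cuts can separate.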

Author information

Authors and Affiliations

Department of Computer Science and Intelligent Systems, Graduate School of Engineering, Osaka Prefecture University, 1-1 Gakuencho, Naka, 599-8531, Sakai, Osaka, Japan
Koichi Kise

Corresponding author

Correspondence to Koichi Kise.

Editor information

Editors and Affiliations

University of Maryland, College Park, MD, USA
David Doermann
Université de Lorraine, Nancy, France
Karl Tombre

About this entry

Cite this entry

Kise, K. (2014). Page Segmentation Techniques in Document Analysis. In: Doermann, D., Tombre, K. (eds) Handbook of Document Image Processing and Recognition. Springer, London. https://doi.org/10.1007/978-0-85729-859-1_5

DOI: https://doi.org/10.1007/978-0-85729-859-1_5
Published: 24 July 2019
Publisher Name: Springer, London
Print ISBN: 978-0-85729-858-4
Online ISBN: 978-0-85729-859-1
eBook Packages: Computer Science; Reference Module Computer Science and Engineering
