Page Segmentation Techniques in Document Analysis

  • Reference work entry
  • In: Handbook of Document Image Processing and Recognition

Abstract

In this chapter, we describe the main notions and methods of page segmentation, the task of dividing a page image into homogeneous components such as text blocks, figures, and tables. Together with the classification of the segmented components, described in Chap. 7 (Page Similarity and Classification), page segmentation makes up the overall process called layout analysis. The chapter begins by classifying page layout structures from several viewpoints, including the level of the components and the printing colors. We then classify the methods that handle each class of layout according to three viewpoints: (1) the object to be analyzed (foreground or background); (2) the primitive of analysis (pixels, connected components, maximal empty rectangles, etc.); and (3) the strategy of analysis (top-down or bottom-up). The methods in each class are described in detail and compared with one another to clarify their pros and cons.

Author information

Corresponding author

Correspondence to Koichi Kise.

Copyright information

© 2014 Springer-Verlag London

About this entry

Cite this entry

Kise, K. (2014). Page Segmentation Techniques in Document Analysis. In: Doermann, D., Tombre, K. (eds) Handbook of Document Image Processing and Recognition. Springer, London. https://doi.org/10.1007/978-0-85729-859-1_5
