Simple algorithm page layout analysis

  • Representation, Processing, Analysis and Understanding of Images
  • Pattern Recognition and Image Analysis

Abstract

This paper proposes an algorithm for page layout analysis (segmentation). It detects the whitespace that separates text blocks on a document page. The algorithm can be applied to document analysis and recognition problems, in particular to recognizing columns in multicolumn text and in tables. It is simple to implement.
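
The abstract does not describe the algorithm itself, so the following is only a minimal sketch of the general idea of finding whitespace between text blocks, not the authors' method. It assumes text blocks are already available as axis-aligned bounding boxes and looks for vertical strips that no box crosses. The function name `column_gaps`, the box layout `(x0, y0, x1, y1)`, and the `min_gap` threshold are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical box layout: (x0, y0, x1, y1) in page units.
Box = Tuple[float, float, float, float]


def column_gaps(
    boxes: List[Box], page_width: float, min_gap: float = 10.0
) -> List[Tuple[float, float]]:
    """Return x-intervals that no box covers and that are at least min_gap wide."""
    # Project every box onto the x-axis and sort the spans by left edge.
    spans = sorted((x0, x1) for x0, _, x1, _ in boxes)
    gaps = []
    cursor = 0.0  # right edge of the region covered so far
    for x0, x1 in spans:
        # An uncovered strip lies between the covered region and this span.
        if x0 - cursor >= min_gap:
            gaps.append((cursor, x0))
        cursor = max(cursor, x1)
    # Whitespace between the last span and the right page edge.
    if page_width - cursor >= min_gap:
        gaps.append((cursor, page_width))
    return gaps


# Two columns of two lines each on a 600-unit-wide page (made-up data).
boxes = [
    (50, 100, 280, 120),
    (50, 130, 280, 150),
    (320, 100, 550, 120),
    (320, 130, 550, 150),
]
print(column_gaps(boxes, page_width=600))
```

For this made-up input the result contains the two page margins and the gutter between the columns. A projection of this kind only finds gaps that run the full height of the region examined, so a real page would need it applied per region or replaced by a more general empty-rectangle search.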

Cite this article

Shigarov, A.O., Fedorov, R.K. Simple algorithm page layout analysis. Pattern Recognit. Image Anal. 21, 324–327 (2011). https://doi.org/10.1134/S1054661811021008

  • DOI: https://doi.org/10.1134/S1054661811021008
