SPAN: A Simple Predict & Align Network for Handwritten Paragraph Recognition

Coquenet, Denis; Chatelain, Clément; Paquet, Thierry

doi:10.1007/978-3-030-86334-0_5

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12823)

Included in the following conference series:

International Conference on Document Analysis and Recognition

Abstract

Unconstrained handwriting recognition is an essential task in document analysis. It is usually carried out in two steps. First, the document is segmented into text lines. Second, an Optical Character Recognition model is applied to these line images. We propose the Simple Predict & Align Network: an end-to-end recurrence-free Fully Convolutional Network performing OCR at paragraph level without any prior segmentation stage. The framework is as simple as the one used for the recognition of isolated lines, and we achieve competitive results on three popular datasets: RIMES, IAM and READ 2016. The proposed model does not require any dataset adaptation and can be trained without line breaks in the transcription labels. Our code and trained model weights are available at https://github.com/FactoDeepLearning/SPAN.
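
The abstract does not spell out how a paragraph image becomes a single transcription without line segmentation. One plausible reading of "predict and align", offered here only as an illustrative sketch, is that the network predicts a character label at every position of a 2D grid, the rows of that grid are concatenated into one sequence, and a CTC-style collapse (Graves et al., 2006) merges repeats and drops blanks. The toy grid, the `BLANK` symbol and all function names below are hypothetical and are not taken from the authors' repository.

```python
from itertools import groupby

# Hypothetical stand-in for the CTC blank label (assumption, not the authors' symbol).
BLANK = "_"


def flatten_rows(grid):
    """Concatenate the rows of a 2D grid of labels into one 1D sequence."""
    return [label for row in grid for label in row]


def ctc_greedy_collapse(labels, blank=BLANK):
    """Standard CTC best-path collapse: merge consecutive repeats, then drop blanks."""
    merged = [label for label, _ in groupby(labels)]
    return "".join(label for label in merged if label != blank)


def decode_paragraph(grid):
    """Flatten a 2D grid of per-position labels row by row, then collapse it."""
    return ctc_greedy_collapse(flatten_rows(grid))


# Invented example: each inner list is one row of per-position argmax labels.
# The middle row is all blanks, as might happen between two text lines.
toy_grid = [
    list("hhe_ll_lo_"),
    list("__________"),
    list(" wwoor_ld_"),
]

print(decode_paragraph(toy_grid))
```

Because the collapse operates on the flattened sequence, the target text never needs explicit line-break symbols in this sketch, which is consistent with the abstract's claim about training without line breaks in the labels.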

Acknowledgments

The present work was performed using computing resources of CRIANN (Normandy, France) and HPC resources from GENCI-IDRIS (Grant 2020-AD011012155). This work was financially supported by the French Defense Innovation Agency and by the Normandy region.

Author information

Authors and Affiliations

LITIS Laboratory - EA 4108, Rouen, France
Denis Coquenet, Clément Chatelain & Thierry Paquet
Rouen University, Rouen, France
Denis Coquenet & Thierry Paquet
Normandie University, Mont-Saint-Aignan, France
Denis Coquenet
INSA of Rouen, Saint-Étienne-du-Rouvray, France
Clément Chatelain

Corresponding author

Correspondence to Denis Coquenet.

Editor information

Editors and Affiliations

Universitat Autònoma de Barcelona, Barcelona, Spain
Josep Lladós
Lehigh University, Bethlehem, PA, USA
Daniel Lopresti
Kyushu University, Fukuoka-shi, Japan
Seiichi Uchida

About this paper

Cite this paper

Coquenet, D., Chatelain, C., Paquet, T. (2021). SPAN: A Simple Predict & Align Network for Handwritten Paragraph Recognition. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12823. Springer, Cham. https://doi.org/10.1007/978-3-030-86334-0_5

DOI: https://doi.org/10.1007/978-3-030-86334-0_5
Published: 02 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86333-3
Online ISBN: 978-3-030-86334-0
eBook Packages: Computer Science, Computer Science (R0)