Abstract
Unconstrained handwriting recognition is an essential task in document analysis. It is usually carried out in two steps: first, the document is segmented into text lines; second, an Optical Character Recognition (OCR) model is applied to these line images. We propose the Simple Predict & Align Network (SPAN): an end-to-end, recurrence-free Fully Convolutional Network that performs OCR at paragraph level without any prior segmentation stage. The framework is as simple as the one used for the recognition of isolated lines, and we achieve competitive results on three popular datasets: RIMES, IAM and READ 2016. The proposed model does not require any dataset adaptation and can be trained without line breaks in the transcription labels. Our code and trained model weights are available at https://github.com/FactoDeepLearning/SPAN.
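The abstract does not spell out how predictions are aligned with a paragraph-level transcription. As a rough illustration only, the sketch below assumes one plausible reading: the network emits a 2D grid of per-position character labels, the rows of that grid are concatenated into a single sequence, and the standard CTC collapse rule (merge repeats, drop blanks) turns it into text. The names `flatten_rows`, `ctc_greedy_collapse`, `decode_paragraph`, the toy alphabet and the toy grid are all hypothetical and are not taken from the SPAN repository.

```python
from typing import List, Sequence

BLANK = 0  # assumption: index 0 is reserved for the CTC blank label


def flatten_rows(grid: Sequence[Sequence[int]]) -> List[int]:
    """Concatenate the rows of a 2D label grid into one 1D sequence."""
    return [label for row in grid for label in row]


def ctc_greedy_collapse(labels: Sequence[int], blank: int = BLANK) -> List[int]:
    """CTC collapse rule: merge consecutive repeats, then drop blanks."""
    out: List[int] = []
    prev = None
    for label in labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out


def decode_paragraph(grid: Sequence[Sequence[int]],
                     alphabet: Sequence[str],
                     blank: int = BLANK) -> str:
    """Flatten a 2D grid of argmax labels row by row and decode it to text."""
    ids = ctc_greedy_collapse(flatten_rows(grid), blank)
    return "".join(alphabet[i] for i in ids)


# Toy alphabet (hypothetical); position 0 stands for the blank.
ALPHABET = ["", "a", "b", " "]

# Toy 2x5 grid of per-position argmax labels (hypothetical data).
GRID = [
    [1, 1, 0, 2, 0],
    [3, 0, 1, 1, 0],
]

if __name__ == "__main__":
    print(decode_paragraph(GRID, ALPHABET))
```

Because the rows are simply concatenated, no line-break symbol is needed in the target text under this reading, which is consistent with the claim that training works without line breaks in the labels. One caveat of this toy version: identical labels at the end of one row and the start of the next are merged unless a blank separates them.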
Acknowledgments
The present work was performed using computing resources of CRIANN (Normandy, France) and HPC resources from GENCI-IDRIS (Grant 2020-AD011012155). This work was financially supported by the French Defense Innovation Agency and by the Normandy region.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Coquenet, D., Chatelain, C., Paquet, T. (2021). SPAN: A Simple Predict & Align Network for Handwritten Paragraph Recognition. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12823. Springer, Cham. https://doi.org/10.1007/978-3-030-86334-0_5
DOI: https://doi.org/10.1007/978-3-030-86334-0_5
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86333-3
Online ISBN: 978-3-030-86334-0
eBook Packages: Computer Science, Computer Science (R0)