Handwriting recognition using cohort of LSTM and lexicon verification with extremely large lexicon

Multimedia Tools and Applications

Abstract

In this article, we propose a handwriting recognition model whose complexity does not depend on the lexicon size. It is an alternative to lexicon-driven decoding, based on a lexicon verification process that makes it possible to handle millions of words without any time-consuming decoding stage. This lexicon verification is embedded in a cascade framework that uses complementary LSTM RNN classifiers. We also propose an original and very efficient method, called a cohort, for obtaining hundreds of complementary LSTM RNNs from a single training run. The proposed approach achieves new state-of-the-art performance on the Rimes and IAM datasets, and reaches 90% accuracy on the Rimes dataset when dealing with a gigantic lexicon of 3 million words. The last contribution extends the ideas of cohort and lexicon verification to a ROVER combination for handwritten line recognition, and achieves state-of-the-art results on the Rimes dataset.
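The idea of lexicon verification inside a cascade can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `verify_in_cascade`, the stub recognizers, and the toy lexicon are all hypothetical, and the sketch assumes each classifier returns a single lexicon-free character string. Each hypothesis is checked for membership in a hash-based set, so the cost of one check does not grow with the number of words in the lexicon.

```python
from typing import Callable, FrozenSet, Iterable, Optional

# A recognizer maps a word image to a lexicon-free character string.
# Here it is only a type alias; real LSTM RNN classifiers are assumed, not shown.
Recognizer = Callable[[object], str]


def verify_in_cascade(
    image: object,
    recognizers: Iterable[Recognizer],
    lexicon: FrozenSet[str],
) -> Optional[str]:
    """Return the first hypothesis that belongs to the lexicon, else None.

    Classifiers are queried one after another; a hypothesis is accepted
    as soon as it is a valid lexicon word, otherwise the next classifier
    in the cascade is tried. None means the image is rejected.
    """
    for recognize in recognizers:
        hypothesis = recognize(image)
        # Hash-set membership: average cost independent of lexicon size.
        if hypothesis in lexicon:
            return hypothesis
    return None


# Toy usage with stub recognizers standing in for cohort members.
lexicon = frozenset({"je", "bonjour", "merci"})
stubs = [
    lambda img: "bonjovr",  # first classifier makes a character error
    lambda img: "bonjour",  # second classifier yields a valid word
    lambda img: "merci",    # never reached in this example
]
print(verify_in_cascade(None, stubs, lexicon))  # prints: bonjour
```

In this sketch, enlarging the lexicon changes only the size of the set, not the number of classifier calls or the per-hypothesis check, which is the property the abstract attributes to lexicon verification.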

Notes

  1. Namely on certain uppercase characters, especially the letter “j” in the word “je” or “j’”, where many ground-truth errors occur in the dataset.

  2. Evaluated on an Intel Core i7-3740QM CPU.

Author information

Corresponding author

Correspondence to Bruno Stuner.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Stuner, B., Chatelain, C. & Paquet, T. Handwriting recognition using cohort of LSTM and lexicon verification with extremely large lexicon. Multimed Tools Appl 79, 34407–34427 (2020). https://doi.org/10.1007/s11042-020-09198-6

  • DOI: https://doi.org/10.1007/s11042-020-09198-6
