Impact of the ground truth quality for handwriting recognition

Authors:
Michael Jungo, University of Applied Sciences and Arts Western Switzerland, Switzerland (ORCID: 0009-0001-1790-1687)
Lars Vögtlin, University of Fribourg, Switzerland (ORCID: 0000-0002-2543-9074)
Atefeh Fakhari, University of Fribourg, Switzerland (ORCID: 0009-0001-4727-2471)
Nathan Wegmann, University of Fribourg, Switzerland (ORCID: 0009-0009-1543-2715)
Rolf Ingold, University of Fribourg, Switzerland (ORCID: 0000-0001-7738-133X)
Andreas Fischer, University of Applied Sciences and Arts Western Switzerland, Switzerland (ORCID: 0000-0003-0069-3436)
Anna Scius-Bertrand, University of Applied Sciences and Arts Western Switzerland, Switzerland (ORCID: 0009-0006-7414-2214)

SOICT '23: Proceedings of the 12th International Symposium on Information and Communication Technology, December 2023, Pages 135–140. https://doi.org/10.1145/3628797.3628976

Published: 07 December 2023

ABSTRACT

Handwriting recognition is a key technology for accessing the content of old manuscripts and thereby helps to preserve cultural heritage. Deep learning achieves impressive performance on this task. However, to reach its full potential, it requires a large amount of labeled data, which is difficult to obtain for ancient languages and scripts. Often, a trade-off has to be made between ground truth quantity and quality, as is the case for the recently introduced Bullinger database. It contains more than a hundred thousand labeled text line images of mostly premodern German and Latin texts, obtained by automatically aligning existing page-level transcriptions with text line images. However, the alignment process introduces systematic errors, such as wrongly hyphenated words. In this paper, we investigate the impact of such errors on training and evaluation and suggest means to detect and correct typical alignment errors.
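
To illustrate why ground truth errors matter for evaluation, the following minimal sketch computes the character error rate (CER), a metric commonly used in handwriting recognition. The abstract does not state which metric the paper uses, so the choice of CER is an assumption. The function names and the example strings are hypothetical and are not taken from the Bullinger database. The sketch shows that a model output that is actually correct can still be penalized when the reference line was truncated by a faulty alignment.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of character insertions,
    deletions, and substitutions needed to turn ref into hyp."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from ref
                            curr[j - 1] + 1,      # insert h into ref
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example strings (illustration only).
clean_reference = "der gnad gottes"   # what is actually written on the line
faulty_reference = "der gnad got"     # alignment moved "tes" to the next line
model_output = "der gnad gottes"      # the model reads the line correctly

print(cer(clean_reference, model_output))   # scored against the clean reference
print(cer(faulty_reference, model_output))  # scored against the faulty reference
```

Scored against the clean reference, the output has no errors; scored against the truncated reference, the same output is charged three character errors. This is one way in which alignment errors can distort reported recognition performance, independent of their effect on training.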

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object recognition

Published in

SOICT '23: Proceedings of the 12th International Symposium on Information and Communication Technology
December 2023
1058 pages
ISBN: 9798400708916
DOI: 10.1145/3628797

Copyright © 2023 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Deep Learning
Diplomatic transcription
Ground truth improvement
Handwriting recognition
Historical document
Hyphenated word
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 147 of 318 submissions, 46%