A linguistic fuzzy recogniser of off-line handwritten characters
Introduction
Automatic recognition of handwritten characters is an area of pattern recognition with applications in numerous fields, including document analysis, mailing address interpretation, bank check processing, and signature and document verification (Impedovo, 1994). Several approaches can be considered to address the problem. They differ mainly in the type of features used to identify characters and in the computational intelligence techniques exploited to recognise them (Jain and Lazzerini, 1999). Types of features used in the literature include topological and geometrical, directional, mathematical, and structural features. The computational techniques include statistical analysis, structural and syntactic analysis, pattern matching, neural networks, fuzzy systems and genetic algorithms.
Although substantial progress has been achieved recently, the recognition of handwritten text cannot yet approach human performance. The major difficulties stem from the similarity of some characters to each other, the variability of a single writer's handwriting over time, and the practically unlimited variety of writing styles produced by different writers. Furthermore, the unavoidable presence of background noise and various kinds of distortion (such as poorly written, degraded, or overlapping characters), together with the possibly low quality of the text image, can make the recognition process even more difficult. Therefore, handwriting recognition is still an open and interesting area for research and novel ideas.
In recent years, several alternatives to traditional handwriting recognition systems have been investigated. The goal of emulating the human reading process as faithfully as possible has led to growing interest in approaches to handwriting recognition based on soft decisions. It is in this context that fuzzy logic has proved to be a powerful tool for representing vague, uncertain or imprecise character patterns (e.g., Bouslama, 1997; Chi and Yan, 1995a; Chi and Yan, 1995b; Gader et al., 1997; Malaviya and Klette, 1996; Parizeau and Plamondon, 1995). Typically, handwritten characters are broken down into component features, which are described in terms of linguistic variables and fuzzy sets. Fuzzy rules are then generated to describe sample models of characters. The fuzzy recognition of an unknown character thus consists of identifying the features composing the character and comparing them with those of the models defined in the fuzzy rule base.
This paper presents LoaFeR, a linguistic fuzzy recogniser of off-line isolated handwritten characters. LoaFeR is based on a novel method for linguistic fuzzy classification of the shape of characters. Although a large variety of writing styles exists, the general shape of characters can be described by reference models, and a fuzzy approach is used to model character shape. First, uniform fuzzy partitions are built on the horizontal and vertical axes of the character image. Then, a linguistic representation of the character is generated in the two universes of the labels associated with the horizontal and vertical fuzzy sets, respectively. Finally, a linguistic reference model of each character is derived from the linguistic representations of the samples of that character in the training set. When an unknown character has to be recognised, its linguistic representation is compared with the linguistic reference model of each character by means of a purpose-defined weighted distance. The unknown character is assigned to the class whose reference model is closest in terms of this distance.
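The overall flow described above (uniform fuzzy partitions on both axes, a per-character description, a reference model per class, and nearest-model classification under a weighted distance) can be illustrated with a minimal sketch. It is not LoaFeR itself: the text here does not define the linguistic representation, the way the reference model is derived, or the weighted distance, so the sketch assumes triangular fuzzy sets, uses a numeric membership profile as a stand-in descriptor, averages descriptors to build a model, and uses a weighted L1 distance. All function names are hypothetical.

```python
from typing import Dict, List, Sequence

Bitmap = Sequence[Sequence[int]]  # rows of 0/1 pixels, 1 = ink


def uniform_partition(n_sets: int, length: int) -> List[float]:
    """Centres of n_sets evenly spaced triangular fuzzy sets on [0, length - 1]."""
    if n_sets < 2 or length < 2:
        raise ValueError("need at least two fuzzy sets and two pixels")
    step = (length - 1) / (n_sets - 1)
    return [i * step for i in range(n_sets)]


def memberships(x: float, centres: Sequence[float]) -> List[float]:
    """Degree to which coordinate x belongs to each triangular fuzzy set."""
    step = centres[1] - centres[0]
    return [max(0.0, 1.0 - abs(x - c) / step) for c in centres]


def axis_profile(coords: Sequence[int], centres: Sequence[float]) -> List[float]:
    """Mean membership of the ink coordinates in each fuzzy set of one axis."""
    totals = [0.0] * len(centres)
    for x in coords:
        for i, m in enumerate(memberships(x, centres)):
            totals[i] += m
    n = max(len(coords), 1)  # avoid division by zero on a blank bitmap
    return [t / n for t in totals]


def describe(bitmap: Bitmap, n_h: int, n_v: int) -> List[float]:
    """Assumed stand-in descriptor: horizontal profile, then vertical profile."""
    xs = [x for row in bitmap for x, p in enumerate(row) if p]
    ys = [y for y, row in enumerate(bitmap) for p in row if p]
    return (axis_profile(xs, uniform_partition(n_h, len(bitmap[0])))
            + axis_profile(ys, uniform_partition(n_v, len(bitmap))))


def build_model(descriptions: Sequence[Sequence[float]]) -> List[float]:
    """Assumed reference model: component-wise mean of training descriptors."""
    return [sum(col) / len(descriptions) for col in zip(*descriptions)]


def weighted_distance(a: Sequence[float], b: Sequence[float],
                      weights: Sequence[float]) -> float:
    """Assumed distance: weighted L1 between two descriptors."""
    return sum(w * abs(p - q) for w, p, q in zip(weights, a, b))


def classify(description: Sequence[float],
             models: Dict[str, Sequence[float]],
             weights: Sequence[float]) -> str:
    """Return the label of the closest reference model."""
    return min(models,
               key=lambda label: weighted_distance(description, models[label], weights))


# Toy usage on two 5x5 shapes: a vertical and a horizontal bar.
v_bar = [[0, 0, 1, 0, 0] for _ in range(5)]
h_bar = [[0] * 5, [0] * 5, [1] * 5, [0] * 5, [0] * 5]
models = {"v": build_model([describe(v_bar, 3, 3)]),
          "h": build_model([describe(h_bar, 3, 3)])}
print(classify(describe(v_bar, 3, 3), models, [1.0] * 6))  # prints: v
```

With evenly spaced triangular sets the membership degrees of any coordinate inside the image sum to one, which is the usual reason for choosing a uniform partition of this kind; whether LoaFeR uses triangular sets is an assumption of this sketch.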
We tested LoaFeR on a subset of the National Institute of Standards and Technology (NIST) handwritten database. To create the reference model of each character, we used 60 samples of that character written by 60 different writers. LoaFeR was applied to recognise 1040 lower-case characters written by 40 different writers, whose handwriting was not included in the training set. In practice, each writer was asked to write the 26 (lower-case) characters of the English alphabet. The training and test recognition rates were 79.5% and 69.5%, respectively.
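The sample counts follow from a writer-disjoint split: 60 training writers and 40 test writers, each contributing one 26-letter alphabet. A minimal sketch of such a split is given below; the total of 100 writers is inferred from 60 + 40, and the function name and random seed are hypothetical rather than taken from the paper.

```python
import random
from typing import List, Tuple

LETTERS = 26  # lower-case English alphabet


def split_writers(n_writers: int, n_train: int, seed: int) -> Tuple[List[int], List[int]]:
    """Shuffle writer indices and split them so no writer appears in both sets."""
    rng = random.Random(seed)
    writers = list(range(n_writers))
    rng.shuffle(writers)
    return writers[:n_train], writers[n_train:]


train_writers, test_writers = split_writers(n_writers=100, n_train=60, seed=0)
# One alphabet per writer: 60 * 26 training samples and 40 * 26 test samples.
print(len(train_writers) * LETTERS, len(test_writers) * LETTERS)  # 1560 1040
```

Splitting by writer rather than by individual sample keeps every test character in a handwriting the models have not seen, which matches the condition stated above.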
Section snippets
Overview of LoaFeR
As shown in Fig. 1, LoaFeR consists of three basic modules, namely the linguistic descriptor, the linguistic model builder and the comparator. The input to the system is a 300 dpi B/W bit-map. The bit-map is large enough (currently 45×80 pixels) to contain each character. Each bit-map is processed by the linguistic descriptor, which models a character in terms of a linguistic expression. The output of the linguistic descriptor is a linguistic representation Lu of the character.
The comparator
Experimental results
We tested our system on the NIST handwritten character database. The testing was performed using n-fold cross-validation. We selected a data set composed of 100 alphabets written by 100 different writers, for a total of 2600 samples. We considered alphabets consisting of 26 lower-case letters. The selected data set was segmented to provide a training and a test set of 60 and 40 alphabets, respectively. The alphabets composing the training and test sets were randomly chosen. We repeated the
Discussion
Methods for recognising isolated handwritten characters use one or more of a great variety of feature types to identify characters. Different training and test sets are adopted, often drawn from handwritten character databases such as NIST, CEDAR and ETL-1. The recognition accuracy obtained by the different methods may depend on such parameters as the size and composition of the training and test sets, the complexity of the method used, the chosen initial features,
Conclusions
In this paper we have presented LoaFeR, a linguistic fuzzy recogniser of isolated handwritten characters. LoaFeR is based on a novel method for linguistic fuzzy classification of separated handwritten characters. Firstly, the method defines a uniform fuzzy partition on the horizontal and vertical axes of the character image. The optimal number of fuzzy sets for each partition is determined by an exhaustive procedure. Then, the method describes characters in terms of linguistic expressions
Acknowledgments
This work was supported by the Italian Ministero dell'Università e della Ricerca Scientifica e Tecnologica (MURST) in the framework of the MOSAICO Project. The authors thank Dr. Adriana Maggiore for valuable discussion and Gianluca Simoni for implementing some parts of LoaFeR.
References (15)
- et al. Handwritten numeral recognition using a small number of fuzzy rules with optimized defuzzification parameters. Neural Networks (1995)
- et al. Handwritten numeral recognition using self-organizing maps and fuzzy rules. Pattern Recognition (1995)
- Computational theory for interpreting handwritten text in constrained domains. Artificial Intelligence (1994)
- et al. A structural/statistical feature based vector for handwritten character recognition. Pattern Recognition Letters (1998)
- et al. A soft computing approach to handwritten numeral recognition
- Bouslama, F., 1997. Arabic character recognition by fuzzy techniques. In: Proc. EUFIT’97, Aachen, Germany,...
- et al. Neural and fuzzy methods in handwriting recognition. IEEE Computer (1997)