Towards robust writer verification by correcting unnatural slant☆
Research highlights
► The value of slant as a writer identification feature has been overrated.
► Deliberate slant change can be partly countered by the shear transform.
► Deliberate slant change introduces non-affine distortions to the handwriting.
► A new dataset of deliberately slanted handwriting was introduced.
Introduction
A salient property of Western handwriting is slant: the dominant angle of near-straight downstrokes with respect to the horizontal. Slant is caused by the choice of pen grip and the relative contributions of wrist and finger movements. It has been modeled as the effect of locally using a single actuator (muscle) in a two-dimensional neuromuscular apparatus (Dooijes, 1986). Slant plays an important role in biometric systems, as it is a major constituent of angular features (Bulacu and Schomaker, 2003; Crettez, 1995; Maarse, 1987). For example, the state-of-the-art Hinge feature (Bulacu and Schomaker, 2007) is based on angular frequencies; it is influenced by both curvature and slant. Furthermore, forensic document examiners and paleographers use slant as a discriminatory characteristic (Burgers, 1995; Hardy and Fagel, 1995). These facts suggest that slant is a key feature for writer verification. However, it is not known to what extent slant contributes as an isolated factor to the performance of biometric systems for handwriting, and its value may be overestimated.
In particular, slant is not a valuable feature in (possibly) disguised handwriting. In such a case, the handwriting was produced in a deliberately modified style, with the intention of avoiding recognition of the writer’s identity. Disguised handwriting is often used in threatening or stalking letters. In some cases, the mutilation of shapes successfully disturbs handwriting examination by forensic experts (Found and Rogers, 2005). Moreover, disguised handwriting cannot be handled by state-of-the-art systems for handwriting biometrics (writer verification and identification): computational features that are invariant to disguise do not exist. This is one of the reasons why systems for handwriting biometrics are not yet fully suitable for application in the forensic domain. Other unmet requirements are explainability of the system, robustness to variation in background effects, and robustness to forgery. Those issues have been addressed to some extent (Brink et al., 2007; Brink et al., 2008; Cha and Tappert, 2002; Franke and Köppen, 2000), but computational robustness against disguise is a largely untouched problem area.
One strategy to handle disguise is to apply an image operation that undoes the effect of the disguise, resulting in handwriting that is close to natural. This seems possible for the most frequently used kind of handwriting disguise: a change of slant. It is not surprising that slant modification is the most frequently used kind of disguise (Harris, 1953; Koppenhaver, 2007; Morris, 2000; Nickell, 2007), since humans can easily modify the slant during writing, and the effect on the visual appearance is dramatic (Koppenhaver, 2007). Therefore, an important step in making biometric systems robust to disguise is to correct the slant. An obvious approach is to perform the correction by transforming the image with the shear operation, possibly resulting in the writer’s natural handwriting.
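As a rough illustration of the shear operation mentioned above, the following Python sketch shears a single downstroke, represented as two points, to a target slant. It assumes NumPy, a y-axis that points upward, and slant measured from the horizontal as in the definition given earlier. The helper names (`shear_points`, `shear_factor`, `stroke_angle_deg`) are hypothetical and are not taken from the study, which applies the shear to scanned images rather than to point lists.

```python
import math
import numpy as np


def shear_points(points, k):
    """Apply the horizontal shear x' = x + k*y to an (N, 2) array of (x, y) points."""
    pts = np.asarray(points, dtype=float)
    shear = np.array([[1.0, k],
                      [0.0, 1.0]])
    # Row vectors times the transposed matrix give (x + k*y, y) for each point.
    return pts @ shear.T


def _cot(deg):
    """Cotangent of an angle given in degrees (undefined at 0 and 180)."""
    return 1.0 / math.tan(math.radians(deg))


def shear_factor(current_deg, target_deg):
    """Shear factor k that maps a stroke at current_deg onto target_deg.

    Both angles are measured from the horizontal, with y pointing up
    (an assumption of this sketch).
    """
    return _cot(target_deg) - _cot(current_deg)


def stroke_angle_deg(p0, p1):
    """Angle of the segment p0 -> p1 with respect to the horizontal, in degrees."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    return math.degrees(math.atan2(dy, dx))


# Toy downstroke slanted at 60 degrees, traced from bottom to top.
stroke = np.array([[0.0, 0.0],
                   [math.cos(math.radians(60)), math.sin(math.radians(60))]])

# Shear it upright (90 degrees); any other target angle works the same way.
k = shear_factor(current_deg=60, target_deg=90)
corrected = shear_points(stroke, k)
angle = stroke_angle_deg(corrected[0], corrected[1])
```

The relation behind `shear_factor` follows from the shear itself: a stroke with direction (cos θ, sin θ) is mapped to (cos θ + k sin θ, sin θ), so cot θ′ = cot θ + k. In this toy case `angle` is approximately 90 degrees, and the vertical coordinates of the points are unchanged, which is what makes the shear an affine operation.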
The objective of this study is twofold. The first objective is to determine how much information about the writer’s identity is contained in the slant characteristic of natural handwriting. This will be tested in the first experiment by eliminating the slant in natural handwriting (slant elimination) and measuring to what extent the performance of automatic writer verification degrades. This experiment contributes to the theoretical basis of computational writer features based on directionality, such as the Hinge feature (Bulacu and Schomaker, 2007). The result will direct the design of future features.
The second objective is to determine the effectiveness of the shear transform in correcting handwriting disguised by slant change, when used as a preprocessing step before applying features such as Hinge (Bulacu and Schomaker, 2007) and Fraglets (Schomaker et al., 2004). Hinge and Fraglets are state-of-the-art features, based on statistical pattern recognition, which show impressive performance in test conditions.
At the same time, the underlying question will be answered: to what extent is a change of slant during human production of handwriting functionally equivalent to a shear transform? Slant change may result in more than just a shear effect, since it requires a non-habitual movement of the finger-wrist system, which may affect curvature. It has been suggested that there must also be an effect on writing speed, pressure, connecting strokes, style, construction, and size (Morris, 2000). Furthermore, disguised handwriting is less consistent (Harris, 1953; Koppenhaver, 2007; Morris, 2000). In the second experiment, it will be quantitatively determined to what extent such other effects occur. This will be done by shearing slanted text back to the writer’s supposed natural slant angle (slant correction), and determining the performance of writer verification using state-of-the-art features. This is a first step in designing new biometric systems that are robust to disguise. To the best of our knowledge, no similar experiment has been performed before.
The experiments will be performed on a newly created public dataset: the TriGraph slant dataset, containing both natural and slanted handwriting of 47 subjects. It is described in more detail in the next section. In Sections 3 (Slant estimation) and 4 (Feature extraction and comparison), methods for slant estimation and feature extraction are described; these are preliminaries for the experiments. Experiment 1 will show that slant is not as informative as is usually assumed; it is described in Section 5. Experiment 2 will show that deliberate slant change is not equal to a simple shear transform; it is described in Section 6. Section 7 summarizes the conclusions.
Section snippets
TriGraph slant dataset
A new dataset was created, the TriGraph slant dataset: a unique collection of clean, deliberately slanted handwriting in conjunction with each writer’s natural handwriting. It consists of 188 scanned images of handwritten pages, written by 47 untrained Dutch subjects, aged 27 on average. This dataset is relatively small compared to other datasets such as Firemaker (Schomaker and Vuurpijl, 2000) (251 writers), IAM (Marti and Bunke, 1999) (657) and Srihari’s dataset (Srihari et al., 2002) (1500).
Slant estimation
Since Experiments 1 and 2 both require a reliable technique to estimate slant, a limited comparison of techniques is included here. A variety of slant estimation methods exists, based on different definitions of ‘slant’. For example, it has been defined as the average direction of near-straight or long downstrokes (Maarse and Thomassen, 1983), or “the angle between the vertical direction and the direction of the strokes that, in an ideal model of handwriting, are supposed to be vertical” (
Feature extraction and comparison
The effect of slant on features of handwriting was evaluated using three well-performing automatic features. These features will be briefly introduced below; refer to the respective papers for the details.
- The Directions feature (Bulacu and Schomaker, 2003) (p(ϕ)) is a probability distribution (p.d.) of ink directions at the contours. This encodes slant and direction usage.
- The Fraglets feature (Schomaker et al., 2004) (p(g), also named “fCO3”) is a p.d. of usage of graphemes (fragments of
Experiment 1: information in slant
The first experiment focused on determining how informative the slant value in natural handwriting is. This was determined by computing the performance of writer verification on unmodified handwriting (denoted ‘AN vs BN’), and comparing it to the performance on handwriting from which the slant had been eliminated (‘AN vs BN, elim.’). This is explained in the next subsections.
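To make the notion of verification performance concrete, the sketch below compares two feature distributions and decides "same writer" when their distance falls below a threshold. This is only an illustration under stated assumptions: the chi-square distance, the fixed threshold, the accuracy measure, and the toy histograms are all hypothetical choices, since the text above does not specify the distance measure or the performance metric that the study used.

```python
import numpy as np


def chi_square_distance(p, q, eps=1e-12):
    """Chi-square distance between two probability distributions.

    Assumption of this sketch: the excerpt does not name the measure used.
    eps avoids division by zero for bins that are empty in both inputs.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum((p - q) ** 2 / (p + q + eps)))


def same_writer(p, q, threshold):
    """Verification decision: accept the pair if the distance is below threshold."""
    return chi_square_distance(p, q) < threshold


def verification_accuracy(pairs, labels, threshold):
    """Fraction of document pairs for which the decision matches the true label."""
    decisions = [same_writer(p, q, threshold) for p, q in pairs]
    correct = sum(d == bool(label) for d, label in zip(decisions, labels))
    return correct / len(labels)


# Toy 3-bin direction histograms (hypothetical values, not data from the study).
writer_a_doc1 = [0.10, 0.60, 0.30]
writer_a_doc2 = [0.15, 0.55, 0.30]
writer_b_doc1 = [0.50, 0.20, 0.30]

pairs = [(writer_a_doc1, writer_a_doc2),   # same writer
         (writer_a_doc1, writer_b_doc1)]   # different writers
labels = [True, False]
accuracy = verification_accuracy(pairs, labels, threshold=0.1)
```

Under this setup, a comparison such as ‘AN vs BN’ against ‘AN vs BN, elim.’ would amount to running the same procedure twice, once on features from unmodified pages and once on features from slant-eliminated pages, and comparing the two performance figures.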
Experiment 2: is deliberate slant change an affine transform?
The aim of the second experiment is to determine whether deliberate slant change is functionally equivalent to a simple affine transform: shear. In this experiment, apart from natural handwriting (BN), the disguised handwriting (BL, BR) from the dataset was included as well. Thus the experiment was performed three times, each time comparing documents from AN with those from either BN, BL or BR. Furthermore, in an attempt to restore the handwriting, slant correction was used instead of slant
Conclusion
Slant is a salient feature of handwriting and it is an important factor of state-of-the-art features, but as an isolated factor, it is not essential for good writer verification performance. It is not as informative for handwriting comparison as is usually assumed. This was found in a series of writer verification experiments using three state-of-the-art statistical features: Directions, Fraglets, and Hinge. Removing the absolute slant lowered writer verification performance by only 1–5
References (27)
- et al. Slant estimation algorithm for OCR systems. Pattern Recognition (2001)
- et al. Produced and perceived writing slant: Difference between up and down strokes. Acta Psychol. (1983)
- et al. A new normalization technique for cursive handwritten words. Pattern Recognition Lett. (2001)
- Bertolami, R., Uchida, S., Zimmermann, M., Bunke, H., 2007. Non-uniform slant correction for handwritten text line...
- et al. Off-line cursive script word recognition. IEEE Trans. Pattern Anal. Machine Intell. (PAMI) (1989)
- Brink, A., Schomaker, L., Bulacu, M., 2007. Towards explainable writer verification and identification using vantage...
- Brink, A., van der Klauw, H., Schomaker, L., 2008. Automatic removal of crossed-out handwritten text and the effect on...
- et al. Writer style from oriented edge fragments.
- et al. Text-independent writer identification and verification using textural and allographic features. IEEE Trans. Pattern Anal. Machine Intell. (PAMI) (2007)
- Burgers, J., 1995. De paleografie van de documentaire bronnen in Holland en Zeeland in de dertiende eeuw. Peeters,...
- Comprehensive survey on distance/similarity measures between probability density functions. Internat. J. Math. Models Methods Appl. Sci.
☆ This research was made possible thanks to NWO Grant 634.000.434 (ToKeN/TriGraph).