1 Introduction

In a highly populated country like India, 30% of the population are children under the age of 15. Identifying children is crucial to prevent babies from being swapped or going missing in hospitals and to deliver and track their immunization. Many children do not have access to quality healthcare, and full vaccination coverage remains a challenge [1]. According to the Centers for Disease Control and Prevention (CDC), an infant should receive the first doses of vaccines within two months [2]. Vaccination plays a vital role in strengthening children’s immune systems. Children miss vaccinations for the following reasons: (i) the vaccination chart or card is lost or torn, and (ii) the parents relocate or migrate for work or personal reasons.

These problems can be overcome by identifying children from their biometrics [3]. Candidate biometric features include fingerprints, palmprints, toeprints, iris, ears, retinas, and face [4]. Among these, fingerprints are the most widely used. Fingerprint-based identification of children is helpful in reducing or eradicating malnutrition, as the fingerprint can serve as biological evidence of a child’s identity [5]. In general, fingerprint recognition systems rely on acquisition methods that require the finger to touch a scanner or sensor. Several studies have discussed contact-based methods for recognizing children [5,6,7,8]. These studies used Commercial Off-The-Shelf (COTS) scanners, fingerprint sensors, and an apparatus with a light diffuser for image acquisition. Contact-based acquisition systems can produce samples distorted by finger pressure, excessive sweat, similar skin conditions, and wounds [9]. Moreover, contact-based systems leave a latent fingerprint on the scanner surface, which raises security concerns [10]. During COVID-19 and similar outbreaks in the future, contact-based acquisition can also contribute to the spread of infection.

To maintain hygiene and avoid the problems posed by contact-based methods, this study proposes a contactless acquisition method based on a mobile phone camera. Smartphones are well suited to image acquisition because they combine high-resolution cameras with fast processors. Several studies have identified adults using fingerprints acquired in a contactless way [9, 11,12,13,14]. Convolutional Neural Network (CNN) based approaches yield good results in contactless fingerprint identification [15]. Hence, we propose Child-CLEF, a CNN-based approach for identifying children using contactless fingerprints. A mobile phone-based scanner, the CLEF capture system, is proposed for acquiring the fingerprints.

The following are the key contributions of this study:

  1. Generation of the CLCF dataset with 1016 fingerprints.

  2. Design of the mobile scanner CLEF application for capturing infant/child fingerprints.

  3. Proposal of a hybrid contactless fingerprint image enhancement method to enhance the grey-scaled fingerprint images.

  4. Proposal of the Child-CLEF Net model to extract the minutiae from children’s fingerprint images.

The rest of the paper is organised as follows: The literature on fingerprint recognition is discussed in Sect. 2. Section 3 elaborates the proposed Child-CLEF approach. Section 4 discusses the experimental results and Sect. 5 concludes the paper.

2 Literature review

Several works have addressed fingerprint recognition of adults and children. A detailed study of fingerprint biometry is given in [16]. Table 1 compares a few existing works on fingerprint recognition, covering images acquired by touching a scanner in the traditional way and images acquired through mobile phone cameras in a contactless way.

Table 1 Summary of contact-based and contactless fingerprint recognition

Koda et al. developed software for minutiae detection that could mark only up to approximately 50 minutiae [8]. The verification accuracy obtained with the software is low, and it requires improvements in the fingerprint capturing device, the image quality enhancement method, the feature extraction process, and the fingerprint matching task. Jain et al. compared fingerprints using two COTS image matchers and fused the results into a matching score [5]. In [7], two fingerprint readers were used and the results were fused to compute the similarity scores. Engelsma et al. used a fusion of a texture matcher and a COTS matcher for recognition [6]. All of these works acquired fingerprints using traditional contact-based devices to recognize children.

Tan et al. addressed the problem of pose invariance in contactless fingerprint identification [14]. Labati et al. corrected improper deformations and fingertip alignments using camera systems [13]. Yin et al. proposed an enhancement method based on intrinsic image decomposition and guided image filtering [12]. Khan et al. addressed rotation and scale invariance for images acquired through smartphones [11]. Sagiroglu et al. proposed an apparatus for fingerprint identification using smartphones [9]. Although all these authors worked on images acquired through camera scanners, none of them worked on children’s fingerprints.

In biometric applications, feature extraction is one of the challenging tasks, especially for fingerprints [17]. Fingerprints are represented by local landmarks called minutiae [18]. A good quality fingerprint image contains about 25 to 80 minutiae points that can be used for comparison [19]. Minutiae features include arch, whorl, spots, sweat pores, loop, ridge ending, bifurcation, island, etc. [20, 21]. Minutiae-based verification systems have been found to have better accuracy [22]. The existing works tabulated in Table 1 used minutiae features and CNN-based approaches.

Tang et al. showed that CNN-based fingerprint identification runs faster on a GPU [23]. They extracted relevant features while preserving the ridge information and obtained a recall rate of 53%. Tang et al. also proposed a method in which the steps of fingerprint representation, namely orientation estimation, segmentation, and enhancement, are automated using a CNN [24]. To improve its representational capacity, the network is then extended and its weights are released so that it can learn from data how to handle complex background variation while remaining end-to-end differentiable. Darlow et al. presented a comparative analysis of fingerprint identification using CNNs [25]. They developed a voting system to generate training data and then trained the minutiae network automatically on a large dataset for portability and robustness, removing the need for laborious manual labelling. Lin et al. proposed a multi-Siamese CNN architecture for matching fingerprints [15]. They combined deep feature vectors produced by several networks into a reliable deep fingerprint representation. Minaee et al. performed end-to-end fingerprint recognition using a CNN [26]. They attained very high recognition accuracy on a well-known fingerprint dataset and argued that the framework can be applied broadly to biometric recognition tasks, enabling more precise and scalable systems. Joshua et al. proposed an infant fingerprint recognition system for global good, such as the delivery of vaccinations and supplements [27]. They showed that infant prints can reliably and accurately identify, over time, infants enrolled between the ages of 2 and 3 months, allowing for the efficient administration of immunisations, medical treatment, and nutritional supplements.

All these works address either contact-based recognition of children or contactless recognition of adults, which motivated us to study contactless fingerprint identification for children. In this study, we address image quality enhancement for contactless acquisition, since the acquired fingerprint images contain external distortions. To produce accurate fingerprint matching, a CNN-based approach is designed and integrated with Atrous Spatial Pyramid Pooling (ASPP) to extract minutiae features. For fingerprint matching, the well-known and widely used NIST BOZORTH3 algorithm is used; it computes a matching score between fingerprints [28].

3 Proposed child-CLEF approach

Fig. 1 Block schematic of proposed child registration module

Fig. 2 Block schematic of proposed child identification module

Child-CLEF is a CNN-based system proposed to identify a child using fingerprints. The system consists of two modules: child registration and child identification. The fingerprints of a child are acquired in a contactless way using the proposed CLEF App. The quality of the acquired fingerprints is enhanced using image processing techniques, and the minutiae features are extracted using the proposed Child-CLEF Net. The extracted features are stored in the minutiae template database. Figure 1 depicts the block schematic of the proposed child registration module. The contactless fingerprint acquisition, quality enhancement, and feature extraction steps of the child identification module are the same as those of the child registration module, but the identification module additionally uses a fingerprint matching algorithm to accept or reject the fingerprints of a child. Figure 2 depicts the block schematic of the proposed child identification module. The following subsections discuss contactless fingerprint acquisition, quality enhancement of fingerprints, feature extraction using Child-CLEF Net, and minutiae matching.

3.1 Contactless fingerprint acquisition

The fingerprints of a child are acquired in a contactless way using the CLEF capture system. The following points were considered in the design of the CLEF application:

  • The system follows a human-centric rather than a technology-centric approach, as the skin of new-borns peels during the first week.

  • With contact-based acquisition, multiple scans are required to obtain infant fingerprints, and the sensor affects the soft skin of infant fingers.

  • A resolution of 500 ppi is not sufficient to capture infant fingerprint images, and external lighting penetrates the skin of an infant [29].

Fig. 3 Side and front view of CLEF capture system

Accordingly, a mobile device camera fitted with a macro lens is used for acquiring the fingerprints. Advances in mobile phone technology have opened new ways of using these devices in biometric systems. In particular, the camera is one of the most important features of a mobile device and has gained significance because it can be used for many purposes. Using such a device to acquire fingerprints poses the following problems: (i) low contrast between the ridge and valley patterns, (ii) noise in the background, (iii) illumination problems, and (iv) focus issues. Improved methods are used in this study to address these issues caused by contactless acquisition.

Image acquisition does not use any finger placement guide, but the finger must be placed in front of the lens at a suitable distance so that the image is in focus. We used a 20X macro lens with an aluminium alloy body for focus and coated glass to reduce reflections and light flares. In general, the lens does not work closer than 30 mm or farther than 80 mm from the subject, so the finger is placed at a distance of more than 30 mm from the lens. The camera used in this study is that of an iPhone SE with a 12-megapixel wide camera; the captured images are 4032 \(\times\) 3024 pixels (width \(\times\) height) at 1072 ppi. Figure 3 depicts the side and front view of the CLEF capture system.

Image acquisition has been made portable through an iOS mobile application called CLEF App, which we designed and developed. The CLEF App is used for the registration of children. The fingerprint image is acquired using the application, and the image is reduced to 1280 \(\times\) 720 pixels (width \(\times\) height) without compromising image quality. Before image acquisition starts, a wet wipe is used to clean the infant’s finger.

3.2 Fingerprint quality enhancement

The acquired contactless fingerprint images need dedicated pre-processing, such as noise removal, contrast enhancement, reduction of data size, and correction of orientation and illumination. Image enhancement is a process that increases the interpretability of pixels so that meaningful information can be extracted from the images [30]. The enhancement techniques for contactless fingerprints differ from those for fingerprints collected by contact-based methods. Even with contactless acquisition, several factors affect the quality of the image. The following methods are used to address these problems:

3.2.1 Sharpening

A 3 \(\times\) 3 kernel is applied to the raw image to sharpen the fingerprint, which adds contrast at the edges of the image. The sharpened image is then converted to grey scale.
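The sharpening step can be illustrated with a minimal pure-Python sketch. The kernel weights are not given in the text, so the common centre-weighted kernel below is an assumption, as are the helper names `convolve3x3` and `to_grey`. For brevity the sketch filters a single channel; for a colour image it would be applied per channel before the grey conversion.

```python
# Assumed sharpening kernel (the paper does not list its weights). The weights
# sum to 1, so flat regions keep their brightness while edges are amplified.
SHARPEN_KERNEL = [
    [0, -1, 0],
    [-1, 5, -1],
    [0, -1, 0],
]


def convolve3x3(image, kernel):
    """Apply a 3x3 kernel to a 2-D list of values; border pixels are replicated."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)  # clamp row index at the border
                    cc = min(max(c + dc, 0), w - 1)  # clamp column index at the border
                    acc += kernel[dr + 1][dc + 1] * image[rr][cc]
            out[r][c] = min(max(acc, 0), 255)  # keep the result in the 8-bit range
    return out


def to_grey(rgb):
    """Convert an (R, G, B) pixel to grey using the ITU-R BT.601 luma weights."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)
```

Because the kernel is symmetric, correlation and convolution give the same result here, so no kernel flip is needed.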

3.2.2 Normalization

The sharpened image contains noise, which appears as variation in the colour values of the image. To reduce it, the Contrast Limited Adaptive Histogram Equalization (CLAHE) method is applied. CLAHE is based on histogram equalization and operates on a block basis. The sharpened image is divided into non-overlapping blocks, and histogram equalization is applied within each block. To avoid amplifying noise, a contrast limit is applied [9]. The colour values are distributed equally in the blocks, and the average number of pixels per grey level is computed as follows:

$$\begin{aligned} N_{aver}=\frac{N_{CR-XP}\times {N_{CR-YP}}}{N_{grey}} \end{aligned}$$
(1)

where \(N_{grey}\) is the number of grey levels in block, \(N_{CR-XP}\) is the number of pixels in the X axis, and \(N_{CR-YP}\) is the number of pixels in the Y axis.

The contrast limit \(N_{CL}\) is obtained by multiplying the average number of pixels per grey level in a block, \(N_{aver}\), by the clip factor \(N_{clip}\), which sets the maximum multiple of that average allowed at each grey level:

$$\begin{aligned} N_{CL}=N_{clip} \times N_{aver} \end{aligned}$$
(2)
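As a worked illustration of Eqs. (1) and (2), the sketch below computes the average pixel count per grey level and the resulting contrast limit for one block, then clips a histogram at that limit. The block size, number of grey levels, and clip factor used in the test values are assumptions for illustration, and the uniform redistribution of the clipped excess follows the usual CLAHE description rather than anything stated in the text.

```python
def clahe_clip_limit(block_w, block_h, n_grey, n_clip):
    """Return (N_aver, N_CL) for one block, following Eqs. (1) and (2)."""
    n_aver = (block_w * block_h) / n_grey  # Eq. (1): average pixels per grey level
    n_cl = n_clip * n_aver                 # Eq. (2): contrast (clip) limit
    return n_aver, n_cl


def clip_histogram(hist, n_cl):
    """Clip every bin at n_cl and spread the clipped excess evenly over all bins."""
    excess = sum(max(count - n_cl, 0) for count in hist)
    clipped = [min(count, n_cl) for count in hist]
    bonus = excess / len(hist)
    return [count + bonus for count in clipped]
```

For example, a 64 by 64 block with 256 grey levels has an average of 16 pixels per grey level, so a clip factor of 2 yields a contrast limit of 32. The total pixel count of the histogram is preserved by the redistribution.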

3.2.3 Smoothing

The normalised image has improved overall contrast and reduced noise. The remaining high spatial frequency noise is eliminated by applying a low-pass filter. The filter moves a window operator over the image, changing the value of one pixel at a time as a function of a local region (window) of pixels, until all pixels have been processed. Here, a Gaussian filter is used. It outputs a weighted average of each pixel’s neighbourhood, with the weights favouring the central pixels. The Gaussian filter thus smooths the image and removes the high spatial frequency noise. It is computed as follows:

$$\begin{aligned} G\left( x, y\right) =\frac{1}{2\pi \sigma ^2}e^{-\frac{x^2+y^2}{2\sigma ^2}} \end{aligned}$$
(3)

where \(\sigma\) is the standard deviation of the Gaussian distribution, x is the distance from the origin in the horizontal axis, and y is the distance from the origin in the vertical axis.
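A minimal sketch of how such a smoothing kernel can be built is given below. It samples the Gaussian of Eq. (3), with the exponent taken as negative so that the weights decay with distance from the centre, and then normalises the weights to sum to one. The kernel size and \(\sigma\) are free parameters here, since the text does not state the values used.

```python
import math


def gaussian_kernel(size, sigma):
    """Build a normalised size x size Gaussian kernel (size should be odd)."""
    half = size // 2
    kernel = [
        [
            math.exp(-(x * x + y * y) / (2 * sigma * sigma)) / (2 * math.pi * sigma * sigma)
            for x in range(-half, half + 1)
        ]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in kernel)
    # Normalise so that smoothing does not change the mean brightness.
    return [[value / total for value in row] for row in kernel]
```

The centre weight is the largest, which matches the description of a weighted average that favours the central pixels.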

3.2.4 Region of interests marking

The Region of Interest (RoI) is computed from the smoothed image. To obtain the RoI, the ridge frequency is estimated. Within a local pixel neighbourhood, the grey levels of ridges and valleys can be modelled as a sinusoidal wave along the direction normal to the local ridge orientation. This orientation is computed as a block-wise mean over the fingerprint image, obtained by averaging the sines and cosines of the doubled angles. Doubling the angles prevents wrap-around errors, since orientations that differ by \(\pi\) describe the same ridge direction. Each block is rotated so that the ridge lines are aligned vertically, and the rotated block retains only the RoI, whose values are the grey values of the ridges. The ridge wavelength in the RoI is computed by dividing the distance between the first and last peak by the total number of peaks minus one, and the spatial frequency of the ridges is its reciprocal. If no peaks are detected or the wavelength is outside the allowed bounds, the frequency is set to zero.
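The two computations in this subsection can be sketched as follows. The first function averages the sines and cosines of doubled angles to obtain a block orientation; the second estimates the ridge frequency from the peaks of a grey-level profile taken across the ridges. The quotient of peak distance and peak count is treated here as the wavelength, with the frequency as its reciprocal, which is an assumption about the intended definition. The wavelength bounds are hypothetical placeholders, and fewer than two peaks is treated as "no peaks" because the quotient is undefined in that case.

```python
import math


def mean_orientation(angles):
    """Mean ridge orientation in [0, pi), averaging sin and cos of doubled angles."""
    s = sum(math.sin(2 * a) for a in angles)
    c = sum(math.cos(2 * a) for a in angles)
    return (0.5 * math.atan2(s, c)) % math.pi


def ridge_frequency(profile, min_wavelength=5, max_wavelength=15):
    """Estimate ridge frequency from a 1-D grey-level profile across the ridges."""
    peaks = [
        i
        for i in range(1, len(profile) - 1)
        if profile[i] > profile[i - 1] and profile[i] >= profile[i + 1]
    ]
    if len(peaks) < 2:
        return 0.0  # not enough peaks to measure a wavelength
    wavelength = (peaks[-1] - peaks[0]) / (len(peaks) - 1)
    if not (min_wavelength <= wavelength <= max_wavelength):
        return 0.0  # wavelength outside the allowed bounds
    return 1.0 / wavelength
```

Two orientations just either side of the horizontal, such as 0.05 and \(\pi - 0.05\) radians, average to a nearly horizontal orientation with this method, whereas a plain arithmetic mean would give \(\pi/2\).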

3.2.5 Features marking

From the RoI-extracted image, the ridge endings and bifurcations are extracted and an enhanced image is obtained. The distinct frequencies are marked from the spatial frequency estimate. These frequencies are used to generate kernel-like filters, and the image orientation is checked. A Gabor filter is applied to analyse the local feature properties of the fingerprint image and is computed as follows:

$$\begin{aligned} G \left( x, y; \theta , f \right) =exp\left\{ \frac{-1}{2} \left[ \frac{x^{2}_\theta }{\sigma ^{2}_x}+\frac{y^{2}_\theta }{\sigma ^{2}_y}\right] \right\} \cos \left( 2\pi fx_\theta \right) \end{aligned}$$
(4)

where \(x_\theta = x \cos \theta -y \sin \theta\) and \(y_\theta = x \sin \theta +y \cos \theta\), \(\theta\) is the orientation direction, f is the frequency of the cosine wave, and \(\sigma _x\) and \(\sigma _y\) are the standard deviations of the Gaussian envelope along the x and y axes respectively.
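The Gabor filter of Eq. (4) can be sampled on a small grid as in the sketch below, which uses the same rotated coordinates \(x_\theta\) and \(y_\theta\). The envelope widths and the kernel half-size are assumed default values, not settings reported in the text.

```python
import math


def gabor_kernel(theta, f, sigma_x=4.0, sigma_y=4.0, half=5):
    """Sample Eq. (4) on a (2*half+1) x (2*half+1) grid centred on the origin."""
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the local ridge orientation theta.
            x_t = x * math.cos(theta) - y * math.sin(theta)
            y_t = x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-0.5 * (x_t ** 2 / sigma_x ** 2 + y_t ** 2 / sigma_y ** 2))
            row.append(envelope * math.cos(2 * math.pi * f * x_t))
        kernel.append(row)
    return kernel
```

In a block-wise enhancement, one such kernel would be built per block from the local orientation and ridge frequency and then convolved with that block.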

Fig. 4 Fingerprint quality enhancement steps

Figure 4 depicts the fingerprint quality enhancement steps discussed above.

3.3 Minutiae extraction

Fig. 5 Architecture of proposed child-CLEF net

Minutiae extraction takes the enhanced image as input and passes it to the pooling layer followed by convolutional layers for feature extraction. Minutiae determine the uniqueness of a fingerprint, as they include ridge endings and bifurcations, which are the most prominent features for identifying fingers [26, 31].

The Child-CLEF Net architecture is proposed to extract more precise minutiae features from the processed fingerprint images. The proposed architecture is illustrated in Fig. 5. A pre-processed contactless fingerprint image is given to Child-CLEF Net for minutiae extraction. Each minutia is represented as \((x, \ y, \ \theta )\), where x and y are the minutia coordinates and \(\theta\) is its orientation. The pooling layers scale down the minutiae region patch. The outputs of the second, third, and average pooling layers are fed to an Atrous Spatial Pyramid Pooling (ASPP) network with corresponding learning rates for segmentation [32]. The third pooling layer and the average pooling layer are treated as coarse estimates, while the second pooling layer serves as the fine estimate. ASPP concatenates dilated convolutional layers with different dilation factors and thereby provides multi-scale receptive fields without increasing the architecture size. A minutiae patch is labelled 1 and a non-minutiae patch is labelled 0.
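The idea behind ASPP can be illustrated with a deliberately simplified one-dimensional sketch: the same small kernel is applied at several dilation rates and the branch outputs are concatenated per position. This is not the Child-CLEF Net implementation; the dilation rates, the shared weights, and the function names are assumptions chosen only to show how dilation widens the receptive field without adding weights.

```python
def dilated_conv1d(signal, weights, dilation):
    """1-D dilated convolution with zero padding; output has the input length."""
    k = len(weights)
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = i + (j - k // 2) * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilation):
    """Span of input positions seen by one output of a dilated convolution."""
    return (kernel_size - 1) * dilation + 1


def aspp_1d(signal, weights, rates=(1, 2, 4)):
    """Run one branch per dilation rate and concatenate the outputs per position."""
    branches = [dilated_conv1d(signal, weights, rate) for rate in rates]
    return [tuple(branch[i] for branch in branches) for i in range(len(signal))]
```

With a kernel of three taps, dilation rates of 1, 2, and 4 cover 3, 5, and 9 input positions respectively, while every branch still has only three weights.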

A block size of 16 \(\times\) 16 is used to extract ridge endings and bifurcations; the levels are fused to obtain the final minutiae score map of size h/16 \(\times\) w/16, where h and w are the height and width of the input image. Candidate regions are then pruned by removing duplicate candidates with Non-Maximum Suppression (NMS) [33]. The highest-scoring candidate is selected first, on the assumption that it covers the object to be detected. Candidates that overlap it are too close to it and are suppressed. From the remaining candidates, the next highest-scoring one is selected, and the procedure is repeated until no candidates are left. The procedure therefore requires a similarity measure between candidates and a suppression threshold. The most reliable entries of the minutiae score map are marked as minutiae in the fingerprint image, while candidate regions corresponding to spurious minutiae are deleted. A commonly used approach is to sort the candidates by score. The L2 distance between each pair of candidates is computed and compared against hard thresholds for distance and orientation. Each candidate is iteratively compared with the rest of the list, and candidates with a higher score that lies above the score threshold are retained. In detail, after the candidate list is sorted by score, high-scoring candidates are retained, and low-scoring candidates that overlap an already selected candidate by at least 50% are discarded.
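The suppression procedure can be sketched as a greedy loop over score-sorted candidates. The sketch below uses the distance and orientation variant described above, processing candidates from the highest score downward; all three thresholds are hypothetical placeholders, since the text does not report the values used, and the tuple layout `(x, y, theta, score)` is an assumption.

```python
import math


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def suppress_minutiae(candidates, dist_thresh=16.0, angle_thresh=math.pi / 6,
                      score_thresh=0.5):
    """Greedy non-maximum suppression over (x, y, theta, score) candidates."""
    kept = []
    for cand in sorted(candidates, key=lambda c: c[3], reverse=True):
        x, y, theta, score = cand
        if score < score_thresh:
            break  # all remaining candidates score even lower
        duplicate = any(
            math.hypot(x - kx, y - ky) < dist_thresh
            and angle_diff(theta, kt) < angle_thresh
            for kx, ky, kt, _ in kept
        )
        if not duplicate:
            kept.append(cand)
    return kept
```

A candidate is dropped only when it is both close to and similarly oriented to a stronger candidate, so nearby minutiae with clearly different orientations are both kept.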

3.4 Minutiae matching

Once the model is trained with the Child-CLEF Net architecture, the extracted minutiae are passed to the BOZORTH3 algorithm for matching [34]. BOZORTH3 matches minutiae points described by their location (x, y) and orientation angle theta (t), stored as xyt files, and is invariant to rotation and translation. The algorithm produces a matching score when a test image is compared with the training images.

Fig. 6 Proposed matching approach

Figure 6 depicts the proposed matching approach. The minutiae of the registered children’s fingerprints are stored in the database. When a test image is passed to the child identification system, its minutiae features are extracted and verified against the minutiae database. The matching scores for the left index, right index, left thumb, and right thumb are computed and fused to obtain the matching result. Algorithm 1 outlines the proposed work, which takes the contactless children’s fingerprints as input and computes the TN, FN, TP, and FP counts used to analyze the performance of the model.
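The fusion of the four per-finger scores can be sketched as below. The fusion rule is not specified in the text, so simple averaging is used as an assumption; the decision threshold, the finger codes, and the function names are likewise hypothetical placeholders. The per-finger scores are assumed to have been produced beforehand by the matcher.

```python
def fuse_scores(finger_scores):
    """Mean-fuse per-finger matching scores (the fusion rule is an assumption)."""
    return sum(finger_scores.values()) / len(finger_scores)


def identify(probe_scores, threshold=40.0):
    """Return the enrolled child id with the best fused score, or None.

    probe_scores maps child_id -> {finger_code: matching score}.
    The threshold is a hypothetical placeholder, not a value from the paper.
    """
    best_id, best_score = None, float("-inf")
    for child_id, scores in probe_scores.items():
        fused = fuse_scores(scores)
        if fused > best_score:
            best_id, best_score = child_id, fused
    return best_id if best_score >= threshold else None
```

Returning `None` corresponds to rejecting the probe, for instance when the child has not been registered.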

figure a

4 Experimental results and discussions

The proposed Child-CLEF approach was implemented using Python on Intel Core i7(4C) and 16 GB RAM under Ubuntu.

4.1 Description of dataset

The publicly available PolyU contactless fingerprint dataset has been utilized in this study [31]. However, the unavailability of a contactless fingerprint dataset for children motivated us to generate the CLCF dataset. A total of 1016 contactless fingerprint images from 101 different subjects were acquired at Bethesda Hospital, Chennai. Before data collection, the parents were required to sign a consent form permitting the capture of their child’s fingerprint images, and a gift (baby napkins, baby wipes, and sanitizers) worth INR 200 was provided to the parents after the data collection. Most of the samples in the dataset consist of 8 impressions from each of the two thumbs and two index fingers (left and right). The main challenge in acquiring child fingerprints was that, when a child became restless or non-cooperative, we were unable to acquire the images and ended up with fewer images for that child. The distance from the image capture area to the macro lens was never below 30 mm. The samples were collected over a period of 1 to 2 months.

Fig. 7 Sample contactless fingerprint images: (a) CLCF dataset, left thumb; (b) CLCF dataset, right thumb; (c) CLCF dataset, left index; (d) CLCF dataset, right index; (e) PolyU public fingerprint dataset

Table 2 Statistics of generated CLCF dataset

The statistics of our CLCF dataset are tabulated in Table 2. Figure 7a–d depict our CLCF left thumb, right thumb, left index, and right index images respectively. Figure 7e depicts a sample image from the publicly available contactless fingerprint dataset. The images in the dataset are labelled with the child’s unique id, followed by the child’s age and then the selected finger. For example, in 5000-00-06-12-LI1 the child’s unique id is 5000, the child is 0 years 6 months 12 days old, and the image is the first one from the left index finger. The proposed Child-CLEF Net architecture was used to extract the minutiae of all the images. One sample from each subject was held out for testing and the rest were used for training. The experiments were done with a patch size of 64, a block size of 16, and an orientation direction of 90. Figure 8a and b depict the training and validation accuracy over the epochs for the PolyU dataset and the CLCF dataset respectively. Figure 8c and d depict the training and validation loss over the epochs for the PolyU dataset and the CLCF dataset respectively. Both datasets were trained and validated for 50 epochs, and the validation curves converge more smoothly than the training curves.
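The labelling scheme and the hold-out split can be sketched as follows. Only the code LI appears in the example label, so the remaining finger codes (RI, LT, RT) are assumptions, as is the choice of which sample per child is held out; the text says only that one sample per subject was used for testing.

```python
import re
from collections import defaultdict

# Finger codes other than "LI" are assumed; only "LI" appears in the example label.
FINGERS = {"LI": "left index", "RI": "right index",
           "LT": "left thumb", "RT": "right thumb"}
LABEL_RE = re.compile(r"^(\d+)-(\d{2})-(\d{2})-(\d{2})-(LI|RI|LT|RT)(\d+)$")


def parse_label(label):
    """Split a label such as '5000-00-06-12-LI1' into its fields."""
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognised label: {label!r}")
    child_id, years, months, days, finger, impression = match.groups()
    return {
        "child_id": child_id,
        "age": (int(years), int(months), int(days)),  # (years, months, days)
        "finger": FINGERS[finger],
        "impression": int(impression),
    }


def hold_one_out_split(labels):
    """Hold out one sample per child for testing; the rest are used for training."""
    by_child = defaultdict(list)
    for label in labels:
        by_child[parse_label(label)["child_id"]].append(label)
    train, test = [], []
    for samples in by_child.values():
        samples = sorted(samples)
        test.append(samples[0])      # assumed choice: first sample in sorted order
        train.extend(samples[1:])
    return train, test
```

Grouping by child id before splitting ensures that every registered child contributes exactly one test image.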

Fig. 8 Epoch vs. performance metrics, accuracy and loss: (a) PolyU, accuracy; (b) CLCF, accuracy; (c) PolyU, loss; (d) CLCF, loss

Figure 9 depicts the marked minutiae of children’s fingerprints obtained as output from the Child-CLEF Net model. The Child-CLEF Net model extracted approximately 20 minutiae from each child’s fingerprint.

Fig. 9 Marked minutiae of children fingerprints

4.2 Performance analysis

The samples are tested against the ground truth using a threshold value. When the test image of a child is given, the system computes a matching score using BOZORTH3 and then checks the following four conditions:

  • Clause 1: If the child is being registered for the first time, so that the fingerprint samples are not in the database and no match is found, it is a True Negative (TN).

  • Clause 2: If the test image of the child is not matched with its own class, either because the orientation of the fingers differs between the test and training images or because the system was trained on clear images while the test image yields few minutiae points due to acquisition problems, it is a False Negative (FN).

  • Clause 3: If the test image matches the class to which it belongs, it is a True Positive (TP).

  • Clause 4: If the test image matches another class, it is a False Positive (FP).

The performance of the proposed approach is analysed in terms of the following metrics: Precision, Recall, F1-score, Accuracy, and Equal Error Rate (EER).
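For reference, the sketch below computes these metrics from the four counts using their standard definitions, and approximates the EER as the operating point where the false acceptance and false rejection rates are closest. The function names are illustrative, and the EER routine assumes that a higher matching score means a better match and that genuine and impostor score lists are available.

```python
def classification_metrics(tp, fp, tn, fn):
    """Precision, recall, F1-score, and accuracy from the four outcome counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


def equal_error_rate(genuine, impostor):
    """Approximate the EER by scanning every observed score as a threshold."""
    best_gap, best_eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false acceptance rate
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejection rate
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer
```

As a purely illustrative calculation, counts of TP = 8, FP = 2, TN = 9, and FN = 1 give a precision of 0.8, a recall of 8/9, and an accuracy of 0.85; these numbers are made up for the example and are not results from the experiments.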

Table 3 Comparison of proposed approach with existing Methods

The proposed approach was tested on the PolyU dataset [26] and the contactless CLCF dataset and compared with the open-source extractor MINDTCT from NBIS [25] and with Contactless Net [11]. The comparative analysis is tabulated in Table 3. The proposed approach outperforms the existing approaches on both datasets. Figure 10 depicts the relationship between the false acceptance rate and the true acceptance rate on the CLCF dataset for the different fingerprint recognition systems. The proposed Child-CLEF Net approach provides a lower false acceptance rate and a higher true acceptance rate, with a larger area under the curve, than the existing approaches.

Fig. 10 False acceptance rate vs. true acceptance rate

5 Conclusion

In this study, a CNN-based contactless biometric system named Child-CLEF is proposed for the recognition of children. The child fingerprints were acquired using a mobile phone-based scanner named CLEF. The quality of the captured contactless fingerprints was enhanced using the proposed image enhancement method. As the quality of the images improved, so did the quality of the minutiae features, namely the ridge endings and bifurcations. The minutiae features were extracted using the proposed Child-CLEF Net model, and the children were recognized using the BOZORTH3 algorithm. Experiments were conducted using the generated child contactless fingerprint dataset and the publicly available PolyU fingerprint dataset. The proposed approach was compared with existing fingerprint recognition systems and achieved an accuracy of 98.46% with an equal error rate of 1.99%. This study could be extended by extracting patches around the fingerprint minutiae to identify infants. Patches should be used instead of the whole fingerprint because networks built for a designer-chosen patch size, rather than for the fingerprint image size, are independent of the fingerprint size.