Forensic Population Genetics – Original Research
Toward DNA-based facial composites: Preliminary results and validation

https://doi.org/10.1016/j.fsigen.2014.08.008

Highlights

  • Facial variation is a complex and multipartite trait that requires proper modeling techniques.

  • Using a genetic basis of 24 SNPs, sex and genomic ancestry we create DNA-based facial composites.

  • Physical accuracy of the predictions is mainly determined by sex and genomic ancestry.

  • The SNP-effects significantly increase the distinctiveness of the predictions.

Abstract

The potential to construct useful DNA-based facial composites is of great forensic interest. Given the significant identity information coded in the human face, these predictions could help investigations out of an impasse. Although there is substantial evidence that much of the total variation in facial features is genetically mediated, the discovery of which genes and gene variants underlie normal facial variation has been hampered primarily by the multipartite nature of facial variation. Traditionally, such physical complexity is reduced to scalar measurements defined a priori, such as nose or mouth width, or alternatively handled with dimensionality reduction techniques such as principal component analysis, where each principal component is then treated as a scalar trait. However, as shown in previous and related work, a more impartial and systematic approach to modeling facial morphology is available and can facilitate both the gene discovery steps, as we recently showed, and DNA-based facial composite construction, as we show here. We first use genomic ancestry and sex to create a base-face, which is simply an average sex- and ancestry-matched face. Subsequently, the effects of 24 individual SNPs that have been shown to have significant effects on facial variation are overlaid on the base-face, forming the predicted-face in a process akin to a photomontage or image blending. We next evaluate the accuracy of predicted faces using cross-validation. Physical accuracy of the facial predictions, either locally in particular parts of the face or in terms of overall similarity, is mainly determined by sex and genomic ancestry. The SNP-effects maintain the physical accuracy while significantly increasing the distinctiveness of the facial predictions, which would be expected to reduce false positives in perceptual identification tasks.
To the best of our knowledge, this is the first effort at generating facial composites from DNA. The results are preliminary but promising, especially considering the limited amount of genetic information about the face contained in these 24 SNPs. The approach can incorporate additional SNPs as these are discovered and their effects documented. In this context we discuss three main avenues of research: expanding our knowledge of the genetic architecture of facial morphology, improving the predictive modeling of facial morphology by exploring and incorporating alternative prediction models, and increasing the value of the results through the weighted encoding of physical measurements in terms of human perception of faces.

Introduction

The ultimate goal of evaluating evidentiary DNA is to assign a biological origin to the sample with a high degree of statistical certainty [1], [2]. Standard forensic DNA-based identity analysis relies on comparative grounds: typically, a short tandem repeat (STR) profile generated from an evidentiary DNA sample is compared to a known STR profile from a reference, either an individual in a reference database or a person of interest [3], [4]. If the STR profile from the evidentiary DNA sample does not match an STR profile in the database or that of the person of interest, then the information obtained from the STR markers is of little further use in identifying the origin of the evidentiary DNA. In order to help an investigation out of an impasse or to progress the re-investigation of cold cases, a DNA-based prediction of externally visible characteristics (EVC) [5] or ancestry [6] from the evidentiary sample can be considered. The process by which forensically useful phenotypes are estimated from the analysis of an individual's DNA sample has been termed molecular photofitting [7].
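The comparative step described above can be illustrated with a minimal Python sketch. It assumes a profile is stored as a mapping from locus name to the set of observed alleles and declares a match only when every shared locus agrees exactly. The function names and the toy allele values are hypothetical, and real casework additionally involves match probabilities, mixtures, and allelic drop-out, none of which are modeled here.

```python
from typing import Dict, FrozenSet, List

# Assumed representation: locus name -> set of allele repeat counts at that locus.
Profile = Dict[str, FrozenSet[float]]


def profiles_match(evidence: Profile, reference: Profile) -> bool:
    """Return True when every locus typed in both profiles carries the same alleles."""
    shared = evidence.keys() & reference.keys()
    if not shared:
        return False  # nothing to compare, so no match can be declared
    return all(evidence[locus] == reference[locus] for locus in shared)


def search_database(evidence: Profile, database: Dict[str, Profile]) -> List[str]:
    """Return the identifiers of all reference profiles that match the evidence."""
    return [name for name, ref in database.items() if profiles_match(evidence, ref)]


# Hypothetical toy data: the allele values are invented for illustration only.
evidence = {
    "D3S1358": frozenset({15, 17}),
    "TH01": frozenset({6, 9.3}),
    "FGA": frozenset({21, 24}),
}
database = {
    "ref_A": {
        "D3S1358": frozenset({15, 17}),
        "TH01": frozenset({6, 9.3}),
        "FGA": frozenset({21, 24}),
    },
    "ref_B": {
        "D3S1358": frozenset({14, 16}),
        "TH01": frozenset({7, 9}),
        "FGA": frozenset({20, 22}),
    },
}
hits = search_database(evidence, database)
```

When `hits` is empty, the STR profile offers no further lead, which is the situation in which a phenotype prediction from the same sample becomes relevant.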

In the context of molecular photofitting, where any number of identifying traits could be predicted, the generation of a DNA-based (in contrast to eyewitness-based) facial composite is forensically of high value. The face is probably the single most telling part of the human body as it advertises our age, sex, ancestry, solar exposure, general health, kinship, intentions, and emotional state of mind. Facial recognition and individualization is a specialized human ability and a widely accepted identification and authentication method [8], [9], [10], [11]. Craniofacial reconstruction is a technique focused on identification of the deceased, which has its foundation in facial anatomy and its relationship to the underlying skull [12], [13], [14]. In theory, a DNA-based facial composite should be possible given the compelling evidence that facial features are under strong genetic control [15]. Such evidence includes remarkable facial similarity between identical twins [16], clear facial resemblances within families, distinctive facial features associated with particular genetic conditions [17], [18], and facial similarities within geographic populations [19] and within the sexes [20]. This suggests that inter-individual variation in facial morphology is, in most cases, primarily determined by genetic variation.

Constructing a DNA-based facial composite remains challenging due to both the genetic and physical complexities of the human face. Genetically, facial morphology is a complex trait and, in contrast to Mendelian traits, its developmental path from genotype to phenotype involves numerous genes and environmental factors. At present, the underlying genetic architecture of facial morphology is largely unknown. Physically, facial morphology is a multipartite trait and, in contrast to most traits that have been subject to prediction so far (e.g., eye color, height, weight, starvation stress, and milk fat percentage), can only be described in high-dimensional and multivariate quantitative terms. Traditionally, such physical complexity is simplified extensively a priori by breaking the trait down into smaller parts from which aspects are extracted as univariate scalar measures. Common examples are distances between facial landmarks (e.g., nose or mouth width, eye spacing, and facial height). Alternatively, dimensionality reduction techniques, such as principal component analysis (PCA) on landmark configurations, are applied and the resulting principal component (PC) axes are then treated as separate univariate phenotypic traits. Both strategies of physical simplification have been used in two recent GWAS analyses of normal-range facial variation [21], [22]. Together they identified only five genes [PAX3 (in both studies), PRDM16, TP63, C5orf50, COL17A1], which is a small number considering the compelling evidence for genetic effects on facial features and the large numbers of both faces and genetic markers screened in these efforts. A major problem with these approaches is that the shape effects of particular genes do not necessarily coincide with either the shape features typically selected a priori or with the PC axes that emerge.
In order to improve statistical modeling of the relationships between genetic variation and facial morphology, the physical complexity cannot be simplified a priori. Instead it should be modeled systematically and impartially as recently shown in Peng et al. [23] and Claes et al. [24].
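To make the two traditional simplifications concrete, the following Python sketch computes an inter-landmark distance and PC scores from landmark configurations. It is an illustration only: the landmark array is synthetic, the landmark indices are arbitrary, and the configurations are assumed to be already aligned (for example by Procrustes superimposition), a step not shown here.

```python
import numpy as np


def inter_landmark_distance(landmarks: np.ndarray, i: int, j: int) -> np.ndarray:
    """Euclidean distance between landmarks i and j for every face.

    landmarks has shape (n_faces, n_landmarks, 3).
    """
    return np.linalg.norm(landmarks[:, i, :] - landmarks[:, j, :], axis=1)


def pca_scores(landmarks: np.ndarray, n_components: int) -> np.ndarray:
    """Project flattened landmark configurations onto their leading PC axes."""
    flat = landmarks.reshape(landmarks.shape[0], -1)
    centered = flat - flat.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Synthetic stand-in for aligned 3D landmarks: 50 faces, 10 landmarks each.
rng = np.random.default_rng(0)
faces = rng.normal(size=(50, 10, 3))

# Strategy 1: one scalar trait per face, chosen a priori (indices are arbitrary).
width = inter_landmark_distance(faces, 0, 1)

# Strategy 2: each column of scores is treated as a separate univariate trait.
scores = pca_scores(faces, n_components=3)
```

In both cases each face is reduced to a handful of scalars before any genetic association is tested, so a gene whose shape effect does not line up with the chosen distance or PC axis is easily missed.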

We recently discovered 24 SNPs in 20 genes showing significant effects on normal-range facial morphology using a sample of 592 persons with mixed African and European ancestry [24]. Using a novel relationship modeling technique known as bootstrapped response-based imputation modeling (BRIM), we did this without simplifying the physical complexity, which revealed the individual effects pictorially. Although this work is clearly preliminary given the relatively small sample size and the fact that only 46 genes were screened, these results provide some useful insights on the genetic basis of normal-range facial morphology. (Note that the genotyping for this work was completed before the publication of the two GWAS scans [21], [22], so the five genes discovered in these were not included.) The purpose here is to illustrate possible forensic implications of the work by presenting preliminary results on the combination of these pictorial effects into a DNA-based facial composite. To this end, a method based on image blending is presented that deals with the physical complexity of facial morphology in DNA-based predictions. These initial results are illustrated with example cases and evaluated using cross-validation. We conclude with a discussion of the results together with future and alternative avenues of research to improve the creation of DNA-based facial composites.
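The two-stage construction outlined in the abstract (a base-face from sex and ancestry, with SNP effects overlaid) can be sketched as follows. This is not the BRIM-based procedure of [24]. It assumes, purely for illustration, that every effect is a per-vertex 3D displacement field added linearly to an average face, that sex is coded as a signed scalar, that ancestry is a proportion, and that genotypes are counts of an effect allele. All function names, codings, and numbers are hypothetical.

```python
import numpy as np


def base_face(mean_face, sex_effect, ancestry_effect, sex, ancestry):
    """Average face displaced by sex and ancestry effects (linear form is an assumption)."""
    return mean_face + sex * sex_effect + ancestry * ancestry_effect


def predicted_face(base, snp_effects, allele_counts):
    """Overlay per-SNP displacement fields on the base-face.

    snp_effects has shape (n_snps, n_vertices, 3): one displacement field per
    copy of the effect allele. allele_counts has shape (n_snps,), values 0, 1 or 2.
    """
    return base + np.tensordot(allele_counts, snp_effects, axes=1)


# Synthetic stand-ins; none of these numbers come from the study.
rng = np.random.default_rng(1)
n_vertices, n_snps = 100, 24
mean_face = rng.normal(size=(n_vertices, 3))
sex_effect = rng.normal(scale=0.1, size=(n_vertices, 3))
ancestry_effect = rng.normal(scale=0.1, size=(n_vertices, 3))
snp_effects = rng.normal(scale=0.01, size=(n_snps, n_vertices, 3))
allele_counts = rng.integers(0, 3, size=n_snps)

base = base_face(mean_face, sex_effect, ancestry_effect, sex=1.0, ancestry=0.4)
face = predicted_face(base, snp_effects, allele_counts)
```

In this toy setup the SNP displacement fields are small relative to the sex and ancestry effects, so the base-face dominates overall shape while the SNP terms differentiate individuals who share the same sex and ancestry.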

Data sampling and preprocessing

The same dataset and the results from [24] are used as input. Briefly, 592 participants from admixed populations in the US, Brazil, and Cape Verde were sampled under a Penn State University Institutional Review Board (IRB) approved research protocol titled “Genetics of Human Pigmentation, Ancestry and Facial Features”. These three populations have varying levels of West African and European genomic ancestry. We restricted this analysis to participants between the ages of 18 and 40 to minimize

Results

To give a perceptual idea of the possible quality and shortcomings of the predictions, Fig. 3 depicts four test cases across the range of the dataset in terms of ancestry and sex from a subset of participants who provided additional photo-release consent. Another prediction was published in the New Scientist [35] that appeared in conjunction with the publication of previous work [24]. This work presents an opportunity to appreciate this particular prediction within the given methodological

Discussion

Starting from previous work [24] that provided initial findings on the genetic basis of normal-range facial morphology and a systematic and unbiased framework for modeling 3D morphology, we propose a novel approach to generate DNA-based facial composites. Using an admixed population dataset, the effects of sex, genomic-ancestry, and 24 individual SNPs across 20 genes were modeled onto facial morphology using BRIM [24]. First, genomic ancestry and sex are used to create a base-face, which is an

Conclusion

When confronted with an evidentiary DNA sample that does not match with a reference sample, the DNA-based prediction of externally visible characteristics or molecular photofitting can help the investigation out of an impasse [5], [7]. In this context, a DNA-based facial composite is forensically of great interest, due to the identity information coded in the human face [9], [14]. Strong evidence exists supporting the genetic control of facial features [15], but the actual prediction of facial

Acknowledgements

This work was supported by grants to MDS from the Science Foundation of Ireland Walton Fellowship (04.W4/B643) and from the National Institute of Justice (2008-DN-BX-K125). PC is supported by the Flemish Institute for the Promotion of Innovation by Science and Technology in Flanders (IWT Vlaanderen) and the Research Fund (BOF) KU Leuven. HH is supported by discovery projects of the Australian Research Council. Funding sources had no involvement in the conduct of the research.

References (48)

  • C.N. Stephan et al. The reproducibility of facial approximation accuracy results generated from photo-spread tests. Forensic Sci. Int. (2010)
  • M.A. Jobling et al. Encoded evidence: DNA in forensic analysis. Nat. Rev. Genet. (2004)
  • J.M. Butler. Fundamentals of Forensic DNA Typing. (2009)
  • B. Budowle et al. CODIS STR loci data from 41 sample populations. J. Forensic Sci. (2001)
  • J.M. Butler. Forensic DNA Typing: Biology, Technology, and Genetics of STR Markers. (2005)
  • M.D. Shriver et al. Ethnic-affiliation estimation by use of population-specific DNA markers. Am. J. Hum. Genet. (1997)
  • T. Frudakis. Molecular Photofitting: Predicting Ancestry and Phenotype Using DNA. (2010)
  • D. Smeets et al. A comparative study of 3D face recognition under expression variations. IEEE Trans. Syst. Man Cybern. C: Appl. Rev. (2012)
  • K. Lai et al. Application of biometric technologies in biomedical systems.
  • C. Wilkinson. Facial reconstruction – anatomical art or artistic anatomy? J. Anat. (2010)
  • L. Kohn. The role of genetics in craniofacial morphology and growth. Annu. Rev. Anthropol. (1991)
  • S.M. Weinberg et al. Heritability of face shape in twins: a preliminary study using 3D stereophotogrammetry and geometric morphometrics. Dent. 3000 (2013)
  • P. Hammond. The use of 3D shape modelling in dysmorphology. Arch. Dis. Child. (2007)
  • G. Baynam et al. The facial evolution: looking backwards and moving forward. Hum. Mutat. (2013)