The use of biometrics in identification is commonplace across a variety of contexts. For example, face photographs are featured in many forms of documentation internationally, including passports and driving licenses. Our reliance on the face as a means of identification is likely a result of our belief that we are face experts. However, in reality, we are only familiar face experts (Young & Burton, 2018). Numerous studies have now shown that we are error-prone when making decisions based upon unfamiliar faces (e.g. Bruce, Henderson, Newman, & Burton, 2001; Burton, White, & McNeill, 2010; Jenkins, White, Van Montfort, & Burton, 2011; Kemp, Towell, & Pike, 1997). Further, and perhaps surprisingly, trained passport officers perform at similar levels to untrained university students (White, Kemp, Jenkins, Matheson, & Burton, 2014).
Errors with unfamiliar faces become especially problematic when dealing with various types of fraudulent identification. For instance, researchers in recent years have begun to investigate the issue of “face morphing attacks” (Ferrara, Franco, & Maltoni, 2014). This term refers to the following three-step process to obtain a passport fraudulently. Person A (who has no criminal record) creates a morphed photo of himself and person B (whose prior record prevents him from international travel). First, person A submits this AB morph as his ID photograph with his passport application. Second, the morph is compared with previous images of person A that are kept on file and the application is subsequently approved by the passport issuing officer on the grounds that the image sufficiently resembles him. Third, person A gives this fraudulently obtained genuine (FOG; Interpol, n.d.) passport to person B, who then proceeds to use it during travel as he also resembles the morph image sufficiently to pass through border control.
Problematically, since the document itself is genuine, typical anti-counterfeit measures (e.g. the use of security watermarks, inks, and fibers) are powerless to detect these types of fraud. Therefore, detection must rely upon comparing the morph with previously stored face photographs (at the point of issuance) or the “live” face (at the point of presentation for travel). As digital image manipulation software becomes more advanced, the resulting morphs become more difficult to detect. One approach is to develop increasingly sophisticated computer methods for morph detection (e.g. Makrushin, Neubert, & Dittmann, 2017; Neubert, 2017; Raghavendra, Raja, Venkatesh, & Busch, 2017a, 2017b; Scherhag, Nautsch, et al., 2017; Scherhag, Raghavendra, et al., 2017; Seibold, Samek, Hilsmann, & Eisert, 2017, 2018). For example, inconsistencies between the reflections visible in the eyes and skin could signal a morphed image (Seibold, Hilsmann, & Eisert, 2018). Such techniques may be incorporated into automated border control (ABC) systems in order to prevent the use of morph images.
In many situations, however, the decision to accept an ID image is left to a human operator. Indeed, even in face matching scenarios where algorithms are initially employed, human users are often presented with a “candidate list” and are required to make the final selection, potentially reducing the overall accuracy of the process (White, Dunn, Schmid, & Kemp, 2015). Although important across a variety of contexts, the question of whether people are able to detect morphs and/or whether they accept such images as genuine ID photographs has received little attention to date.
Ferrara, Franco, and Maltoni (2016) provided evidence that several computer algorithms performed with high error rates when tasked with detecting morph images. In addition, they found that human performance on their task was also poor, with morphs going undetected in most cases (see Makrushin et al., 2017, for similar findings). In line with previous work on face matching with expert populations (White et al., 2014), their results also showed that professionals working in the field (border guards) were no better than university students and employees in detecting morphs.
Recently, two articles by Robertson and colleagues have specifically focused on human performance in the matching and detection of morphs. In the first, participants completed computer tasks in which they decided whether two face images onscreen depicted the same person or not (Robertson, Kramer, & Burton, 2017). In seven trials, the two images were different photographs of the same face, and in another seven trials, the images were photographs of two different people. For the remaining 35 trials, a face photograph was paired with a morph containing differing amounts of that face and a second person. (When creating morphs, the researcher can specify the percentage weighting of each identity contained in the final image.) The results demonstrated that 50/50 morphs (weighting both identities equally) were accepted as “matches” for the faces they were paired with on 68% of trials. After providing instructions regarding the nature of morphs, and with the additional response option of “morphed image,” participants subsequently accepted them as “matches” on only 21% of trials. Taken together, the authors suggested that erroneously accepting morphs as ID images was common, but these errors can be significantly reduced through instruction.
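The percentage weighting mentioned above can be illustrated with a minimal sketch. It assumes two already aligned, same-sized grayscale images represented as flat lists of pixel intensities; the function name `blend_pixels` and the parameter `weight_a` are hypothetical and are not taken from the morphing software used in the cited studies. Morphing software also warps both faces to a shared shape before combining them, so this sketch shows only the weighting step and not a complete morph.

```python
def blend_pixels(face_a, face_b, weight_a=0.5):
    """Weighted average of two aligned images given as flat intensity lists.

    weight_a is the proportion of identity A in the result; identity B
    receives the remainder (1 - weight_a). A value of 0.5 corresponds to
    a 50/50 morph in the sense used in the text.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie between 0 and 1")
    if len(face_a) != len(face_b):
        raise ValueError("images must contain the same number of pixels")
    weight_b = 1.0 - weight_a
    # Combine each pair of corresponding pixels according to the weights.
    return [weight_a * a + weight_b * b for a, b in zip(face_a, face_b)]


# Illustrative two-pixel "images" (made-up values, not real data).
fifty_fifty = blend_pixels([0, 100], [100, 200], weight_a=0.5)
```

Setting `weight_a` to 1.0 returns identity A unchanged, and intermediate values shift the result towards one identity or the other.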
In the second article, the researchers investigated whether people were able to detect morph images and whether training could help with this task (Robertson et al., 2018). Participants were shown ten-image arrays containing a mixture of the morph images and exemplars (original, unmorphed faces) used in the previous article and were asked to identify which were the morphs. Performance was poor, with the 50/50 morphs resulting in average d’ sensitivities of 0.56 and 0.96 for the two groups that took part (training and no training, respectively), suggesting that morphs were not readily detected. However, providing information regarding the nature of the morphs, along with some tips to help with identifying them, resulted in a significant increase in sensitivity (to 2.69 and 2.32, respectively). An additional training protocol, in which feedback was provided via a two-alternative forced choice (2AFC) task, also led to a further benefit for the group that received it (the training group, listed first in the values above). The authors concluded that people were poor at detecting morphs, but that training could significantly improve performance.
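For readers unfamiliar with the d’ measure, the following minimal sketch shows the standard signal detection calculation, in which sensitivity is the difference between the z-transformed hit rate and the z-transformed false alarm rate. The function name `d_prime` and the log-linear correction (adding 0.5 to each cell) are our assumptions for illustration; the cited studies may have computed the rates or handled extreme values differently.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity d' = z(hit rate) - z(false alarm rate).

    Here a "hit" is a morph correctly identified as a morph, and a
    "false alarm" is an unmorphed exemplar wrongly labelled a morph.
    """
    # Assumed log-linear correction: keeps both rates strictly between
    # 0 and 1 so that the inverse normal transform is always defined.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Made-up counts: equal hit and false alarm rates give no sensitivity.
chance_level = d_prime(hits=5, misses=5, false_alarms=5, correct_rejections=5)
```

A value near zero indicates that morphs and exemplars were not distinguished, and larger positive values indicate better discrimination.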
As mentioned earlier, with improvements in image manipulation techniques, and in combination with a criminal’s determination to avoid being caught, we should expect that real-world morphs will be made with a level of sophistication that renders them virtually undetectable to the human eye. A limitation of the two articles investigating human acceptance and detection of morphs (Robertson et al., 2017, 2018) is that the images used did not reflect the cutting-edge methods that fraudsters are likely to apply. Although the initial face averaging was carried out using advanced morphing software (JPsychomorph; e.g. Benson & Perrett, 1993), there was no subsequent “touch up” stage in order to remove artefacts that are known to result from the averaging process (e.g. the presence of a secondary outline for the hair). As Fig. 1 (top row) illustrates, the 50/50 morph (center) included obvious artefacts that can be easily removed using image-editing software. Indeed, these artefacts were highlighted to participants during the morph detection training phase of both previous studies: “look for a ‘ghost-like’ outline of another face; look for the outline of another person’s hair over the forehead” (Robertson et al., 2018, p. 4). In addition, because the faces were cropped to remove the neck and background, these images did not conform to real-world ID specifications and also signaled to participants that all the images had been altered to some extent. For these reasons, we predict that the performance levels reported, along with the apparent training benefits, may be of only limited utility with regard to real-world behaviors when using more realistic images.
In the current set of studies, we aim to address these issues by creating higher-quality morph images and investigating both human and computer detection of these images. It is important to determine whether people accept morphs, or can detect their use, when every effort is made to produce images that reflect real-world fraud. For example, if training methods were implemented with the assumption that morph detection would be significantly improved, this might result in a false sense of security (literally) for passport control and issuing officers. Therefore, in this paper, we investigate human morph detection performance with and without training, reflecting a passport-issuing context (Experiments 1 and 2), whether people accept morphs as ID images in a “live” task, reflecting a border control scenario (Experiment 3), and, finally, whether computational modelling outperforms human detection, providing a more suitable alternative than training people (Experiment 4).