Hyper-realistic face masks: a new challenge in person identification

Sanders, Jet Gabrielle; Ueda, Yoshiyuki; Minemoto, Kazusa; Noyes, Eilidh; Yoshikawa, Sakiko; Jenkins, Rob

doi:10.1186/s41235-017-0079-y

Original article
Open access
Published: 25 October 2017

Jet Gabrielle Sanders¹,
Yoshiyuki Ueda²,
Kazusa Minemoto²,
Eilidh Noyes¹,
Sakiko Yoshikawa² &
Rob Jenkins¹

Cognitive Research: Principles and Implications volume 2, Article number: 43 (2017)

Abstract

We often identify people using face images. This is true in occupational settings such as passport control as well as in everyday social environments. Mapping between images and identities assumes that facial appearance is stable within certain bounds. For example, a person’s apparent age, gender and ethnicity change slowly, if at all. It also assumes that deliberate changes beyond these bounds (i.e., disguises) would be easy to spot. Hyper-realistic face masks overturn these assumptions by allowing the wearer to look like an entirely different person. If unnoticed, these masks break the link between facial appearance and personal identity, with clear implications for applied face recognition. However, to date, no one has assessed the realism of these masks, or specified conditions under which they may be accepted as real faces. Herein, we examined incidental detection of unexpected but attended hyper-realistic masks in both photographic and live presentations. Experiment 1 (UK; n = 60) revealed no evidence for overt detection of hyper-realistic masks among real face photos, and little evidence of covert detection. Experiment 2 (Japan; n = 60) extended these findings to different masks, mask-wearers and participant pools. In Experiment 3 (UK and Japan; n = 407), passers-by failed to notice that a live confederate was wearing a hyper-realistic mask and showed limited evidence of covert detection, even at close viewing distance (5 vs. 20 m). Across all of these studies, viewers accepted hyper-realistic masks as real faces. Specific countermeasures will be required if detection rates are to be improved.

Significance

In several high-profile criminal cases, offenders have used hyper-realistic face masks to transform their appearance, leading police to pursue suspects who look nothing like the offenders themselves (e.g., different race or age). In other settings, airline passengers wearing hyper-realistic masks have boarded international flights without the deception being noticed. Such incidents are likely to become more common as hyper-realistic masks become easier to manufacture. These developments have potentially far-reaching implications for security and crime prevention. Face identification requires a one-to-one mapping between faces and people, so that appearance can be traced to identity unambiguously. If viewers do not distinguish between hyper-realistic masks and real faces, the mapping can be compromised, and facial appearance is no longer informative for identification. We find that viewers fail to detect hyper-realistic masks, even when they attend to facial appearance. Exceptions to this pattern hint at possible methods for improving detection performance.

Background

Face recognition is a common means of identifying people and an important component of security and crime prevention internationally. For example, passport issuance (White, Kemp, Jenkins, Matheson, & Burton, 2014) and passport control (McCaffery & Burton, 2016) both involve facial image comparison. Conviction of criminal suspects can sometimes hinge on eyewitness testimony (Wells & Olson, 2003; Bruce, 1988; https://www.innocenceproject.org) or CCTV footage (Burton, Wilson, Cowan, & Bruce, 1999; Davis & Valentine, 2009). In many countries, a photo-ID is required for the purchase of age-restricted goods (Gosselt, van Hoof, de Jong, & Prinsen, 2007; Vestlund, Langeborg, Sörqvist, & Eriksson, 2009). Because face identification carries such weight in these situations, it is also a major focus for identity fraud and deception (Robertson, Kramer, & Burton, 2017). In particular, individuals may wish to impersonate someone else or to avoid being recognised themselves (Dhamecha, Singh, Vatsa, & Kumar, 2014).

One way to conceal identity is simply to cover the face, for example, using fabric or a mask (Fecher & Watt, 2013). Covering the face is generally effective in obscuring identity (Burton et al., 1999), but it is also visually and socially salient, and likely to arouse the suspicion of onlookers (Zajonc, 1968). Over the past decade, this limitation has been challenged by the emergence of hyper-realistic, hand-painted silicone masks (Fig. 1), originally developed in the special effects industry as an alternative to multi-hour make-up sessions. The flexibility and strength of silicone confer several advantages in this situation. Unlike traditional masks that cover the face only, a silicone mask may cover the whole head and neck so that it extends below the collar without any joins. This seamless construction creates the impression that the visible face is part of a continuous body surface rather than being a separate overlay (Anderson, Singh, & Fleming, 2002). Realism is further enhanced by transmission of non-rigid movement (e.g., rotation of the head relative to the body, opening and closing of the mouth, gross changes in facial expression) from the surface of the face to the surface of the mask. Importantly, the wearer’s real eyes, nostrils and mouth cavity are all visible through the mask via close-fitting holes that match the topology of the face beneath. Several manufacturers offer hand-punched human hair and stubble as optional extras.

These advances in mask fabrication raise the question of how realistic a mask can be. For the present purposes, we adopt a pragmatic definition of realism, namely a mask is realistic if it is perceived as a real face. This criterion has the advantage of being testable and can be applied across different viewers and viewing conditions. It also gets to the heart of the practical problem. If covering one’s face arouses suspicion, the ability to cover one’s face without arousing suspicion would seem to favour the deceiver.

There are reasons to doubt that this level of realism can be achieved in practice. For one, the visual system is highly attuned to face stimuli, including subtleties of skin tone (Fink, Grammer, & Matts, 2006; Frost, 1988; Bindemann & Burton, 2009) and face shape (Oosterhof & Todorov, 2008; Ekman, 2003). Thus, it seems plausible that even minor departures from authentic appearance at the physical level could loom large at the perceptual level. Paradoxically, some demands of the perceptual system may become harder to satisfy as authenticity increases. The ‘uncanny valley’ refers to the phenomenon whereby human response to humanoid artifacts (e.g., robots, dolls, puppets) shifts from empathy to revulsion as the humanoid approaches, but fails to attain, lifelike appearance (Mori, 1970; see Mori, MacDorman, & Kageki, 2012, for an English language translation). Given humans’ particular sensitivity to face stimuli, one might expect the uncanny valley to pose a particular challenge for masks (Seyama & Nagayama, 2007). A sense of eeriness could undermine an otherwise compelling overall impression of realism.

Theoretical concerns aside, the important question is whether these masks actually fool anyone. There is now a good deal of anecdotal evidence that hyper-realistic masks can pass for real faces in everyday life. In one incident, a white bank robber used a silicone mask to disguise himself as a black man for a string of robberies in the USA. Six out of seven bank tellers wrongly identified a black man as the culprit in a photo line-up; only when the robber’s girlfriend intervened was the black suspect released from jail (Bernstein, 2010). In another case, a young Asian man disguised himself as an elderly white man using a silicone mask and boarded a flight from Hong Kong to Canada (Zamost, 2010). The deception was only detected when the passenger removed the mask midflight and a fellow traveller brought the change in appearance to the attention of the crew. These examples imply that realistic masks can be mistaken for real faces, even when the viewer’s attention is focused on facial appearance (as is the case in police line-ups and passport checks). Surprisingly, however, there has been no experimental research into hyper-realistic masks and the conditions under which they can be detected.

Herein, we address these questions in three experiments. We examine mask detection from static photographs (Experiments 1 and 2) and in live viewing (Experiment 3) to assess performance in these two modes of face identification. We had the opportunity to collect data from both British and Japanese participants, allowing us to compare performance for own-race and other-race faces. A large body of research on the other-race effect has shown that identification performance is more reliable for own-race faces than for other-race faces (Meissner & Brigham, 2001). Our question here is whether a similar bias operates when distinguishing hyper-realistic masks from real faces.

Experiment 1

In Experiment 1, we secretly embedded photos of hyper-realistic masks among photos of real faces. Participants worked through these photos sequentially, rating the person in each photo on a series of social dimensions. This task ensured that participants processed the images, but did not draw attention to the distinction between real faces and masks. We then asked a series of graded questions to determine whether or not they had noticed any masks among the faces. After explaining the manipulation, we showed the stimuli again and asked participants to pick out any photos that contained masks. We predicted that, when participants were not expecting to see masks (i.e., during the rating phase), realistic masks might not be detected, resulting in few spontaneous reports of masks in post-test questioning. However, when participants were expecting to see masks (i.e., after the manipulation had been explained), they should be able to distinguish realistic masks from real faces, merely by inspecting the photographs.

Method

Ethics statement

Ethical approval was granted by the departmental ethics committee at the University of York.

Participants

Sixty undergraduate and postgraduate members of the volunteer panel at the University of York (10 males; mean age = 21, age range 18–39 years) took part in exchange for a small payment or course credit.

Stimuli and design

We used three different mask models from Realflesh Masks, Quebec, Canada – The Pensioner (Old Male Mask), The Fighter (Young Male Mask) and The Grandma (Old Female Mask). The company offers a range of hair options for its masks. We opted for punched human hair eyebrows on all three and a full head of hair on The Grandma.

To generate mask images, we took multiple photographs of the same volunteer model wearing each of the three masks. We took photos indoors and outdoors under different viewing conditions to approximate the range of variability seen in natural face images (Jenkins, White, Van Montfort, & Burton, 2011). For each mask, we selected two different photos that depicted the mask in frontal view with no occlusions (six mask images in total).

To generate real face images, we entered the terms ‘young male’, ‘old male’, ‘young female’ and ‘old female’ into Google Image search. For each of these four face types, we selected the first five colour photos of unfamiliar Caucasian faces that (1) exceeded 200 pixels in height, (2) showed the face in roughly frontal aspect and (3) were free from occlusions (20 real face images in total). All photos (masks and real faces) were cropped to show the head region only and resized to 540 pixels high × 385 pixels wide for presentation.

Starting with the 20 real face photos, we created different stimulus sets by substituting one mask for one real face of the same type (young male, old male, or old female). This resulted in six variant image sets, each consisting of one mask photo embedded in 19 real face photos. Ten participants saw each variant.
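The substitution design can be sketched in a few lines of Python. This is only an illustration of the counting logic, not the authors' materials: the photo labels are hypothetical, and because the text does not say which real face of a given type was replaced, the `replaced_index` parameter is an assumption.

```python
FACE_TYPES = ["young_male", "old_male", "young_female", "old_female"]

# Hypothetical labels: five real-face photos per face type (20 in total).
real_faces = {t: [f"{t}_real_{i}" for i in range(1, 6)] for t in FACE_TYPES}

# Three masks with two photos each (six mask images), keyed by matching face type.
mask_photos = {
    "old_male": ["pensioner_a", "pensioner_b"],
    "young_male": ["fighter_a", "fighter_b"],
    "old_female": ["grandma_a", "grandma_b"],
}


def build_variant(mask_type, mask_photo, replaced_index=0):
    """Return 20 (label, is_mask) pairs with one real face swapped for a mask.

    replaced_index is an assumption: the source does not specify which
    real face of the matching type was substituted.
    """
    variant = []
    for face_type, photos in real_faces.items():
        for i, photo in enumerate(photos):
            if face_type == mask_type and i == replaced_index:
                variant.append((mask_photo, True))   # the single mask image
            else:
                variant.append((photo, False))       # an untouched real face
    return variant


# One variant per mask photo: six sets of 19 real faces plus 1 mask.
variants = [
    build_variant(mask_type, photo)
    for mask_type, photos in mask_photos.items()
    for photo in photos
]
```

With 60 participants spread evenly over the six variants, each set is seen by ten people, which matches the allocation described above.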

Procedure

Participants viewed 20 photographs (19 real faces and 1 hyper-realistic mask), one at a time, in a random order. To encourage deep processing of facial appearance, we asked participants to estimate the age of the person in each photo, and to rate the person for ‘Trustworthiness’, ‘Dominance’ and ‘Attractiveness’, using a 7-point Likert scale. There was no time limit for this task and photos remained on screen until all responses were made. This rating task was followed by a series of graded questions to assess detection of the mask. Question 1, ‘What did you think of the faces you saw?’, was deliberately open and was intended to capture spontaneous, overt detection of the mask. Question 2, ‘Did you notice anything unusual about any of the faces?’, encouraged participants to report any suspicions that they may have had during the task (i.e., more covert detection). Both of these questions invited typed responses. Question 3, ‘In this experiment, half of the participants are in the Mask group (where at least one of the photos contains a mask). The other half are in the No Mask group (where none of the photos contained a mask). Which group do you think you were in (Mask vs. No Mask)?’ led to a two-alternative forced choice (2AFC), which was intended to provide a more sensitive measure. After responding, participants were informed that they were in the Mask group. They were then presented with all 20 of the photos they had rated (19 real faces and 1 mask) in a randomly ordered 5 × 4 array and asked to indicate any photo that contained a mask (Question 4; Fig. 2). At the end of the experiment, participants were debriefed and asked to indicate whether or not they had prior knowledge of realistic silicone masks before the start of the experiment.

Results

Mask detection

We first tested for overt detection of the masks by analysing the content of typed responses to Question 1 (‘What did you think of the faces you saw?’) and Question 2 (‘Did you notice anything unusual about any of the faces?’). To avoid imposing our own interpretations on these responses, we simply coded for the presence (1) or absence (0) of the word ‘mask’ in the text. As it turned out, none of the 60 participants included the word ‘mask’ in either response. That is, there were no cases of overt detection (see Additional file 1 for raw data). For the 2AFC item (Question 3), only 21.7% of participants guessed that they were in the Mask group, significantly lower than the chance level of 50% (t(59) = 5.28, P < 0.001, d = −17). Finally, in the array challenge (Question 4), 70% of participants correctly picked out the mask. However, participants also picked out an average of 2.5 (range 0–10) real faces (Fig. 3, left). In fact, all but one of the real faces (YM1) was reported as a mask at least once. χ² analysis revealed no significant differences in detection performance across mask types (2AFC: χ² (3, n = 60) = 0.79, P = 0.68, Cramer’s v = 0.13; Array challenge: χ² (3, n = 60) = 1.43, P = 0.490, v = 0.12).
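For readers who want to retrace the comparison against chance on the 2AFC item, the following is a minimal sketch of a one-sample t-test on binary responses. It assumes that 21.7% corresponds to 13 of the 60 participants choosing 'Mask' (13/60 = 21.7%); that count, the variable names and the choice of Cohen's d as (mean − chance)/SD are our assumptions, not details taken from the analysis script.

```python
import math
import statistics

n_total = 60
n_mask_guess = 13   # assumed count: 13/60 = 21.7% chose the Mask group
chance = 0.5        # chance level for a two-alternative forced choice

# Code each response as 1 (guessed Mask) or 0 (guessed No Mask).
responses = [1] * n_mask_guess + [0] * (n_total - n_mask_guess)

mean = statistics.mean(responses)
sd = statistics.stdev(responses)          # sample SD (n - 1 denominator)

# One-sample t statistic against the chance level, with n - 1 = 59 df.
t_stat = (mean - chance) / (sd / math.sqrt(n_total))

# Standardised effect size: distance from chance in SD units.
cohens_d = (mean - chance) / sd
```

Under these assumptions the magnitude of `t_stat` reproduces the reported value of 5.28; the sign is negative simply because the observed proportion lies below chance.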

Mask knowledge

Overall, 38 of the 60 participants declared prior knowledge of hyper-realistic masks. χ² analyses revealed no significant difference in 2AFC performance between Knowledge (n = 38; 21.1%) and No Knowledge (n = 22; 22.7%) subgroups (χ² (2, n = 60) = 0.02, P = 0.807, v = 0.03). However, prior knowledge conferred a significant advantage in the array challenge (Knowledge: 78.9%; No Knowledge: 54.5%; χ² (3, n = 60) = 3.95, P = 0.046, v = 0.28).
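The array-challenge comparison between subgroups can be retraced with a Pearson χ² test on a 2 × 2 table. The sketch below assumes cell counts inferred from the reported percentages (30 of 38 in the Knowledge subgroup and 12 of 22 in the No Knowledge subgroup picked out the mask); those counts and the function name are our assumptions, and the test is computed without a continuity correction.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    # Effect size for a 2 x 2 table: sqrt(chi2 / n).
    v = math.sqrt(chi2 / n)
    return chi2, p, v


# Rows: Knowledge, No Knowledge. Columns: picked the mask, missed the mask.
# Assumed counts: 30/38 = 78.9% and 12/22 = 54.5%.
chi2, p_value, cramers_v = chi_square_2x2(30, 8, 12, 10)
```

With these counts the statistic comes out at about 3.95, in line with the reported value, and the p-value falls just below 0.05.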

Discussion

We find it quite striking that not a single participant volunteered that they had seen a mask. Even under 2AFC questioning, only 22% thought that a mask might have been presented. These findings suggest that, at least in the context of viewing photos, participants need to both (1) be informed that a mask may be present and (2) have the images available for inspection, if they are to distinguish hyper-realistic masks from real faces. Even when these conditions were met (in the array challenge), 30% of participants missed the mask and 78% picked out at least one real face. The message from this experiment is that detecting hyper-realistic masks is hard, even when the test conditions are highly favourable. We next consider a situation in which the test conditions may be less favourable – viewing other-race faces.

Experiment 2

Viewers are generally poor at identifying other-race faces compared with own-race faces. This is true for tasks involving recognition memory (Meissner & Brigham, 2001) and also for tasks involving perceptual comparison of face photographs (e.g., Megreya, White, & Burton, 2011). The perceptual explanation of this own-race bias is that the ability to distinguish individuals is refined by experience, namely that viewers become attuned to the variability that surrounds them and remain relatively insensitive to variability outside of this range (O’Toole, Deffenbacher, Valentin, & Abdi, 1994). This differential sensitivity supports finer perceptual discriminations for own-race faces than for other-race faces. In the case of hyper-realistic masks, distinguishing a mask from a real face also requires fine perceptual discriminations, perhaps akin to distinguishing one person from another. If so, the task of hyper-realistic mask detection may also be susceptible to own-race bias. In Experiment 2, we had the opportunity to replicate Experiment 1 in Japan, using the same stimuli and procedure as before, but now with Japanese participants. Given that all of our stimuli showed Western (Caucasian) faces and masks, our main interest was whether hyper-realistic masks would be more readily accepted by Japanese participants compared with the UK participants in Experiment 1.