Human face-recognition ability varies widely from person to person. People who perform with exceptionally high accuracy on face-recognition tasks are called “super”-recognisers (see Noyes, Phillips, & O’Toole, 2017 for a review). People with prosopagnosia are at the other end of the spectrum: they experience severe difficulty with face recognition (see Kress & Daum, 2003). The ability of the rest of the population is dispersed between these two extremes. Many jobs require accurate identifications to be made for security and legal purposes. Screening candidates for these jobs on their person-identification abilities would, in theory, create a workforce of the people best suited to the job. To date, screening has focused exclusively on face recognition. However, real-world identification scenarios often include other information that can aid identification, such as the body or a person’s movement.
More broadly, research suggests that the body provides information that can aid identification. Accurate identifications are made more frequently from images of faces than from images of the body (Burton, Wilson, Cowan, & Bruce, 1999; O'Toole et al., 2011; Robbins & Coltheart, 2012). Despite this, above-chance accuracy has been achieved on matching tasks that involve pairs of body images (O'Toole et al., 2011). Additionally, fusing identification decisions from the face with those from the body provided more accurate identity decisions than decisions from the face alone (O'Toole et al., 2011). The role of the body in identifications is supported further by Robbins and Coltheart (2015), who reported that participants made more accurate identification decisions from full-person stimuli (video footage or static image) than from face-only or body-only stimuli. It follows that people rely more on the face than the body to inform their identity judgements (Robbins & Coltheart, 2012), even when they identify familiar people (Burton et al., 1999). However, when it is difficult to extract information from the face, the body can be relied upon to inform identification, even without conscious awareness of this reliance (Hahn, O’Toole, & Phillips, 2016; Rice, Phillips, Natu, An, & O’Toole, 2013).
A person’s movement can also facilitate identification (O'Toole et al., 2011; Robbins & Coltheart, 2015; Simhi & Yovel, 2016). O'Toole et al. (2011) reported that participants achieved higher matching accuracy when they viewed video footage of people’s bodies (face obscured) than when they viewed static body images. Conversely, Robbins and Coltheart (2015) reported no benefit of movement for person recognition on images that showed the face only, the body only, or the full person. These studies tested for the benefits of movement using natural videos of faces and bodies. Often, studies that have examined the role of motion in recognition use point-light biological motion to isolate movement from an image of a body. These point-light videos were originally created by attaching lights, fluorescent tape, or markers to the joints of people who were dressed in dark clothing, and then filming the movements of these people in a dark room (see Johansson, 1973). They can now be created with computer software by attaching markers to the joints and key reference points of models (e.g., head, arms, legs, shoulders, elbows, knees). Identity information can be extracted from biological motion, with studies showing that familiar people can be identified from point-light motion videos (Barclay, Cutting, & Kozlowski, 1978; Beardsworth & Buckner, 1981; Cutting & Kozlowski, 1977; Jacobs & Shiffrar, 2004; Loula, Prasad, Harber, & Shiffrar, 2005). There is also evidence of unfamiliar-person learning and matching from point-light motion displays (Baragchizadeh & O’Toole, 2017; Loula et al., 2005; Stevenage, Nixon, & Vince, 1999; Troje, 2005).
Here we asked whether a standard screening test of face recognition ability can be used to predict a person’s ability to make identifications from the face, the body, and biological motion. To date, there is only one study that has tested for an association between face-recognition ability and body-recognition ability (Biotti, Gray, & Cook, 2017). In that study, participants with prosopagnosia and control subjects were tested on face and body recognition skill on static images (Biotti et al., 2017). As a group, people with prosopagnosia were significantly worse than controls at recognising images of faces and bodies. Scatterplots highlighted that some, but not all, prosopagnosic participants were impaired at body recognition. We are interested in the relationship between face recognition, body recognition, and recognition from biological motion across the spectrum of face-recognition ability encountered in the general population.
In generating hypotheses about the relationship between face-recognition accuracy and accuracy on other person-identification tasks, we can consider the use of processing strategies. For example, previous studies show that people recruit similar holistic processing strategies when they view face images (Collishaw & Hole, 2000; Maurer, Le Grand, & Mondloch, 2002; Murphy, Gray, & Cook, 2017; Rossion, 2013; Tanaka & Simonyi, 2016; Tanaka & Farah, 1993; Young, Hellawell, & Hay, 1987), body images (Aviezer & Todorov, 2012; Robbins & Coltheart, 2012; Seitz, 2002), and biological motion videos (Bertenthal & Pinto, 1994; Chatterjee, Freyd, & Shiffrar, 1996; Thompson, Clarke, Stewart, & Puce, 2005). Moreover, similar inversion effects have been found for face, body, and full-person images (Minnebusch, Suchan, & Daum, 2008; Reed, Stone, Bozova, & Tanaka, 2017; Robbins & Coltheart, 2012; Yovel, Pelc, & Lubetzky, 2010). However, other studies have reported no inversion effects for headless bodies (Minnebusch et al., 2008; Robbins & Coltheart, 2012; Yovel et al., 2010) and Bauser, Suchan, and Daum (2011) found no evidence of integration of the top and bottom half of body-only images. Inversion effects have been reported also for biological-motion stimuli (Pavlova & Sokolov, 2000; Sumi, 1984; Troje & Westhoff, 2006).
A different hypothesis may be generated from an examination of the neural processes activated by the face, body, and movement. Despite the similarities in processing strategies described above, distinct neural processes are activated by face and body images (Kanwisher & Yovel, 2006; Peelen & Downing, 2007). A complex cortical network of brain regions is involved in face recognition (Calder & Young, 2005; Haxby, Hoffman, & Gobbini, 2000), which includes the fusiform face area (FFA) (Gobbini & Haxby, 2007; Grill-Spector, Knouf, & Kanwisher, 2004; Kanwisher, McDermott, & Chun, 1997; Kanwisher & Yovel, 2006) and the occipital face area (OFA) (Pitcher, Walsh, & Duchaine, 2011; Pitcher, Walsh, Yovel, & Duchaine, 2007). Body recognition has been linked to activation of the extra-striate body area (EBA) and the fusiform body area (FBA) (Downing, 2001; Kanwisher & Yovel, 2006) in the ventral visual stream. Furthermore, the posterior superior temporal sulcus (pSTS) in the dorsal visual stream is activated by viewing motion of the face, motion of the body, and more generally, biological motion (Allison, Puce, & McCarthy, 2000; Beauchamp, Lee, Haxby, & Martin, 2003; Fox, Iaria, & Barton, 2009; Giese & Poggio, 2003; Gilaie-Dotan, Kanai, Bahrami, Rees, & Saygin, 2013; Yovel & O’Toole, 2016). The distributed model of Haxby et al. (2000) predicts that the invariant information about faces and bodies is processed by ventral stream areas (FFA, OFA, FBA, and EBA). The changeable information from biological motion is processed by the dorsal stream regions in the pSTS. If recognition abilities reflect the neural processing systems, face and body processing ability may be linked, while biological motion processing could be independent of these other abilities.
Turning now to the methodological questions of how skills are related, there are two main challenges inherent in any investigation of the relationship among person recognition skills. First, there is the challenge of incorporating individual differences into conclusions. For example, the literature depicts super-recognisers as consistent high performers across a range of face-recognition tasks (Bobak, Bennetts, Parris, Jansari, & Bate, 2016; Bobak, Dowsett, & Bate, 2016; Bobak, Hancock, & Bate, 2016; Davis, Lander, Evans, & Jansari, 2016; Noyes et al., 2017; Robertson, Noyes, Dowsett, Jenkins, & Burton, 2016; Russell, Duchaine, & Nakayama, 2009). This conclusion most often reflects group-level results (Noyes et al., 2017). In this literature, group-level results most often compare the average performance of super-recognisers on a task against the average performance of control participants on the same task. At a group level, super-recognisers outperform control groups on a range of face-recognition tasks. However, there are often complex patterns of individual performance across face-recognition tasks (Bobak, Bennetts, et al., 2016; Bobak, Dowsett, & Bate, 2016; Davis et al., 2016; Noyes et al., 2017; Robertson et al., 2016). In other words, whereas summary statistics examine the overall pattern of performance, individual performance is best seen directly from the distribution of subjects’ performance. In a review of the literature on super-recognisers, Noyes et al. (2017) point to several instances where group-level claims did not fully represent the data. Specifically, there are cases where super-recognisers perform with lower accuracy than controls, and instances when controls outperform some super-recognisers. 
Individual differences are often acknowledged within experimental results, either presented in scatterplots or violin plots, or analysed statistically with modified t tests for individual cases (Bobak, Bennetts, et al., 2016; Bobak, Dowsett, & Bate, 2016; Davis et al., 2016; Robertson et al., 2016). However, they are often presented as an afterthought or as caveats to the group data. Critically, conclusions tend to be based on the group result, and these group results have been reported in the media. As we will see in the current study, it is better policy to begin with individual distributions before performing any group analysis.
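The difference between a group-level summary and an individual-level view can be made concrete with a small sketch. The scores below are invented purely for illustration (they are not data from any cited study), and the function name `overlap_summary` is hypothetical. The sketch reports the two group means alongside the proportion of control versus super-recogniser pairings in which the control scores at least as high.

```python
from statistics import mean

def overlap_summary(supers, controls):
    """Return group means and the share of (control, super-recogniser)
    pairs in which the control scores at least as high."""
    pairs = [(c, s) for c in controls for s in supers]
    control_at_least = sum(1 for c, s in pairs if c >= s)
    return {
        "mean_super": mean(supers),
        "mean_control": mean(controls),
        "control_at_least_as_high": control_at_least / len(pairs),
    }

# Invented percent-correct scores, for illustration only.
supers = [95, 92, 88, 97, 81]
controls = [78, 85, 90, 70, 83]

summary = overlap_summary(supers, controls)
print(summary)
```

With these made-up numbers the super-recogniser mean is clearly higher, yet a non-trivial share of pairings still favours the control, which is the kind of pattern a group mean alone would hide.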
The second challenge in understanding how skills (e.g., body recognition and face recognition) relate to one another is that it is difficult to distinguish “genuine ability” from the more general factor of motivation/conscientiousness. At the outset, it is reasonable to assume that motivation or conscientiousness will have some predictive value across tasks. Thus, highly conscientious people will make a sustained effort across all tasks and will likely perform better than less motivated individuals. If high performance in one task is strongly related to high performance on another task, it is unclear whether the relationship is due to skill, motivation, or a combination of the two. This problem is particularly vexing when we observe strong correlations between face-recognition performance and performance on other tasks. We should assume that motivation is part of this correlation. Moreover, when processes are also inherently related (e.g., generated by similar neural mechanisms or psychological strategies), performance on all tasks will be strongly correlated because of the underlying relationship among processes. It is difficult, perhaps impossible, to parcel out what part of the correlation is based on participant motivation versus skill. The easier case is when a strong dissociation is seen between different tasks. In that case, it is reasonably safe to assume that motivation alone does not account for the observed pattern of task performance.
Here our goal was to determine whether a standard face-matching test (the Glasgow Face Matching Test (GFMT), short version) is an accurate screening measure of person matching. The GFMT was chosen as the screening test because it is frequently reported in the literature as a measure of face-matching ability. Moreover, standardised norm scores are available for the test (Burton, White, & McNeill, 2010). Specifically, we tested whether accuracy on the GFMT relates to identity-matching performance for face images, body images, and biological motion. In this study, we progressed through a carefully selected set of analyses that build from exploratory analyses of individual performance to inferentially based group analysis. We first divided participants into face-recognition ability groups based on their performance on the face-matching screening task (GFMT). In the exploratory analysis, to visualise performance accuracy across our three identity-matching tasks (face, body, and biological motion), we created plots that show the full distribution of identity-matching ability for each task, colour-coded by performance on the face-screening task. Multivariate analysis was also deployed to visualise the pattern of performance of individual subjects within the array of test types. Next, to compare performance of specific individuals across tasks, we created scatterplots and tested for correlation. Finally, we analysed data at the group level using an analysis of variance (ANOVA). The visualisation methods showed that the GFMT screening predicted performance on the face task, but not on the body or biological motion task. However, group analysis supported the misleading conclusion that face-recognition ability is related to performance on each of the other tasks.
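The correlation and group-level steps of the progression described above can be sketched as follows. This is a minimal illustration and not the analysis code used in the study: the data are synthetic, the group cut-offs are arbitrary assumptions rather than GFMT norms, and the helper names (`pearson_r`, `split_by_screening`, `one_way_anova_f`) are hypothetical. The multivariate visualisation step is not covered.

```python
import random
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split_by_screening(screen, task, cutoffs):
    """Place each participant's task score into a screening-ability group.
    cutoffs must be ascending; a score at or above a cutoff moves up a group."""
    groups = [[] for _ in range(len(cutoffs) + 1)]
    for s, t in zip(screen, task):
        groups[sum(s >= c for c in cutoffs)].append(t)
    return groups

def one_way_anova_f(groups):
    """F statistic for a one-way between-subjects ANOVA."""
    scores = [x for g in groups for x in g]
    grand, k, n = mean(scores), len(groups), len(scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Synthetic scores, for illustration only (not study data).
random.seed(0)
screen = [random.uniform(60, 100) for _ in range(90)]
# "face" tracks the screening score by construction; "body" does not.
face = [min(100.0, max(0.0, s + random.gauss(0, 5))) for s in screen]
body = [random.uniform(50, 90) for _ in screen]

cutoffs = [75, 90]  # assumed group boundaries, not GFMT norms
for name, task in [("face", face), ("body", body)]:
    r = pearson_r(screen, task)
    f = one_way_anova_f(split_by_screening(screen, task, cutoffs))
    print(f"{name}: r = {r:.2f}, F = {f:.2f}")
```

Running the individual-level correlation before the grouped ANOVA mirrors the order of analyses described here; in practice a statistics package would also supply p values and effect sizes, which this sketch omits.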