Due to the memory-matching nature of the RT-CIT, it makes theoretical sense that any factor that reduces the resemblance between test stimuli and one’s memory representations of the Probes could also compromise the RT-CIT. One such factor, and the focus of the current study, is the angle from which the Probe is photographed. That is, can the RT-CIT still be effective when the crime-related items are photographed from a different angle or perspective from that which the perpetrator saw (or remembers)? This problem is not uncommon in the field since the forensic team does not always have information (e.g., security footage) regarding the culprit’s gaze and head angle at the scene. Thus, relevant RT-CIT images can often be photographs of Probe items laid flat against the ground from a 180° angle in a two-dimensional manner, which may not be the way the culprit perceived or remembered the same item. Therefore, the answer to this question would have far-reaching implications for how the forensic team should photograph the Probe items at the scene of the crime.
At the heart of this issue is the longstanding debate of whether object recognition (and, hence, memory detection) is viewpoint-dependent or viewpoint-invariant (e.g., Tarr, Williams, Hayward, & Gauthier, 1998). Based on the classic findings from Shepard and Metzler’s (1971) mental rotation study where participants’ RT for same/different judgment increased linearly with the amount of angular rotation between two objects, there is empirical support to suspect that this may be the case in the RT-CIT. In addition, by manipulating the deviations between study and test angles of geons, Tarr et al. (1998) also found that object memory matching became increasingly more difficult as the difference between studied and tested viewpoints increased. Crucially, this effect of viewpoint-dependence is largest when recognition has to be object-specific instead of categorical (Hamm & McMullen, 1998), which raises concern for RT-CIT validity because we do not want the culprit to just recognize a superordinate category (e.g., gun), but a specific sample within that category (e.g., the gun I used last Friday).
Taken together, if object representation in memory is viewpoint-dependent, then the Probe–Irrelevant RT difference may decrease as a function of increasing dissimilarity between the Probe photograph and what the culprit actually saw. In this case, there would likely be an upper limit of the angular difference that, if exceeded, would make a guilty suspect appear innocent (i.e., a non-significant RT difference between Probe and Irrelevant). Findings from the object recognition literature seem to suggest this critical angle to be within the range of 60°–120° (Jolicoeur, 1985, 1990; Jolicoeur, Regehr, Smith, & Smith, 1985). If so, this would imply that the forensic team should follow in the footsteps of the culprit closely and approximate their height when taking pictures of the Probe items in order to provide a close match. However, if angular differences do not significantly affect the RT gap between Probe and Irrelevant responses, then perhaps the RT-CIT is robust against angularly-mismatched photographs, and thus no additional guidelines are needed when photographing the Probe items. To answer this question, in this study we manipulated the angle, from 0° to 330° in 30° increments (12 angles), between the pictures that were seen initially and the ones that were eventually used for the RT-CIT.
Methods
Participants
Twenty-seven participants (12 male, 15 female; age 20–35 years, mean age = 24.03) with normal eyesight or corrected-to-normal eyesight were recruited for this experiment. All participants gave informed consent prior to their participation, and received financial compensation for their time. Three participants’ data were excluded from analysis due to low accuracy in the main task. All experimental procedures were approved by the Joint Institutional Review Board of Taipei Medical University, Taiwan.
Apparatus and materials
Study and test materials were images of carefully selected shoes (matched based on similar color, function, shape, etc.) across Targets, Probes, and Irrelevants. The entire experiment used a total of 60 shoes. During the design phase, we randomly chose 10 shoes to serve as Probes, then specifically selected another 10 that matched the Probes in appearance/function to serve as Targets, and used the remaining 40 as Irrelevants. These 10 Probes, 10 Targets, and 40 Irrelevants were then used as experimental stimuli for all participants (see Supplementary Figure).
A total of 720 unique images were made out of these 60 shoes: namely, for every shoe, 12 images were rendered from the 360° rotation view, starting from 0° (i.e., tip of the shoe facing directly to the right), in increments of 30° (i.e., 0°, 30°, 60°, 90°, 120°, 150°, 180°, 210°, 240°, 270°, 300°, 330°). This resulted in a total of 120 Probe images, 120 Target images, and 480 Irrelevant images (see Footnote 1).
One angle per item (i.e., 10 Probes and 10 Targets) was selected to be the angle that the participants looked at during the studying phase (i.e., 20 images in total; see Supplementary Figure). These randomly selected images of Targets and Probes were then used for all participants, and became the 0° basis for computing deviation angles for further analysis.
The ratio of the Target, Probe, and Irrelevant stimuli in the experiment was 1:1:4 (e.g., Verschuere & De Houwer, 2011). The present study used a multiple-probe protocol using 10 Probe, 10 Target, and 40 Irrelevant shoes (for a comparison between single-probe and multiple-probe protocols, see Verschuere, Kleinberg, & Theocharidou, 2015).
Design and procedure
The experiment consisted of two phases: a study phase (for Target and Probe items) and a test phase.
Study phase
Participants first viewed and memorized images of 10 Target items and 10 Probe items, and were explicitly told that they would later be tested and must suppress their memory/knowledge of the Probe items. Within each category (Target vs Probe), each of the 10 items was shown at a different angle, thus covering 10 of the 12 angles; 90° and 270° were excluded from the study phase due to their high level of difficulty (but were included in the test phase).
To ensure vivid memory of all Target and Probe items, participants went through three different tasks during the study phase. They first filled out a questionnaire regarding each of the Target and Probe items. They were asked to identify the category, colors, and materials of the item in each image, and to answer questions regarding how much the shoe in the picture appealed to them, how fashionable they thought it was, how much they thought the retail price would be, and how much they would be willing to pay for it. This was done for all 20 images (approximately 15 min), and was designed to motivate the participants to pay close attention to various details of the Target and Probe items. After the questionnaire, participants were given jigsaw puzzles to complete, one at a time, of all 10 Target images, followed by all 10 Probe images (at exactly the same angles as studied). This task was designed to encourage the participants to encode the holistic structure of the items, and the average time spent in this section was also about 15 min. Again, these activities were done in order to strengthen different aspects of the visual impression of these stimuli and increase their memory strength for subsequent testing.
Lastly, before going into the main test, participants completed a pretest. In this pretest, one image at a time was presented (10 Target, 10 Probe, and 20 new images) for 5 s in randomized order, and participants were to press “F” for Target, “J” for Probe, or the spacebar for new images that they had not seen before. A response had to be made within 5 s to be scored as accurate. At the end of each trial, immediate feedback was displayed together with the cumulative overall accuracy (500-ms duration). This pretest was conducted twice for all participants.
Between the first and second pretests, participants performed one final rehearsal with a deck of 20 flashcards of the studied Target and Probe items. Participants were to shuffle the deck and speak out that they “have seen it before” or “have not seen it before” upon seeing Target and Probe items, respectively. The average duration for the final rehearsal was around 5 min.
At the end of the pretest, only participants who had correctly categorized at least 36 of the 40 images (90% accuracy) were admitted into the formal test phase. This was done to ensure that both Targets and Probes were well encoded into participants’ memory with minimal confusion. Participants who scored under 90% accuracy had to restudy the flashcards and redo the pretest until they achieved at least 90% accuracy (Fig. 1, top).
Note that although participants completed extensive training during the study phase, they had only seen one angle for each Target or Probe item. That is, the same Target and Probe images were used repeatedly throughout the memorization period, jigsaw puzzle, and pretest. Therefore, although the participants were very familiar with every Target and Probe item (from a particular angle) thus far studied, there were 11 other versions, or angles, of the same Target and Probe item that they had not seen before the test phase.
Test phase
During the formal experimental test phase, there were 760 trials in total, which came from 140 Target trials (all 12 angles for 10 items, plus one repeating trial for 0° and 180° per item), 140 Probe trials (all 12 angles for 10 items, plus one repeating trial for 0° and 180° per item; see Footnote 2), and 480 Irrelevant trials (all 12 angles for 40 items). All trial types and all angle deviations were intermixed in randomized order. At the start of every trial, a 500-ms fixation cross was displayed at the center, followed by an image with a duration of 1200 ms, and the trial ended with a 500-ms inter-trial interval (Fig. 1, middle). Participants were to press “F” for Target, “J” for Irrelevant, and again “J” for Probes that they recognized but were instructed to conceal. Participants were instructed to react as fast as possible while being highly accurate. A break was provided every 60 trials.
In the test phase, participants with low accuracy in the Target condition were excluded from further data analysis. Although low accuracy in Targets is typically observed in the RT-CIT due to Targets’ low probability of occurrence, an unusually low Target accuracy (< 50%; Kleinberg & Verschuere, 2015) may indicate either poor learning of the Targets or a deliberate attempt to say “no” for every item. To this end, we set the exclusion criterion for Target accuracy at 1.5 grand SD (i.e., 52.80% accuracy) below the grand mean Target accuracy, which resulted in the exclusion of three participants (Target accuracy: 42.37%, 45.00%, and 47.14%).
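The trial counts above can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the constant and function names are illustrative, and the session-length estimate assumes that every trial runs for the full 500 + 1200 + 500 ms and ignores breaks, which the text does not confirm.

```python
# Trial bookkeeping for the test phase, using only counts stated in the text.
N_ANGLES = 12                 # presentation angles per shoe (0° to 330° in 30° steps)
N_TARGETS, N_PROBES, N_IRRELEVANTS = 10, 10, 40
N_REPEATED = 2                # 0° and 180° are shown one extra time per Target/Probe item


def trials_per_type():
    """Return the number of test trials for each trial type."""
    return {
        "Target": N_TARGETS * (N_ANGLES + N_REPEATED),
        "Probe": N_PROBES * (N_ANGLES + N_REPEATED),
        "Irrelevant": N_IRRELEVANTS * N_ANGLES,
    }


counts = trials_per_type()
total_trials = sum(counts.values())

# Assumption: each trial lasts fixation + image + inter-trial interval in full.
TRIAL_MS = 500 + 1200 + 500
total_minutes = total_trials * TRIAL_MS / 60_000  # pure trial time, breaks excluded

print(counts, total_trials, round(total_minutes, 1))
```

Under these assumptions the 760 trials amount to a little under half an hour of pure trial time.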
Results
We first analyzed the accuracy and RT between Probe, Irrelevant, and Target trials without considering the angles by collapsing all trials of different angular rotations together (see Footnote 3). This gives us the traditional RT-CIT comparisons. Furthermore, error trials with wrong or no responses were discarded from analysis of RT data.
In terms of RT, a repeated-measures one-way ANOVA (Probe, Target, Irrelevant) showed a significant main effect of trial type, F(2,46) = 81.236, p < 0.001, ηp² = 0.779. Subsequent post-hoc comparisons using Bonferroni correction revealed significant differences between RT from all trial types: Target vs Probe, t(23) = 5.934, p < 0.001, d = 1.211; Target vs Irrelevant, t(23) = 11.758, p < 0.001, d = 2.400; and Probe vs Irrelevant, t(23) = 7.350, p < 0.001, d = 1.500 (Fig. 2, right). Critically, the significant difference between Probe and Irrelevant replicates the classic RT-CIT finding, and our observed RT difference is in the same range as previous reports (Noordraven & Verschuere, 2013; Verschuere, Kleinberg, et al., 2015). Similar observations were also made in terms of accuracy (Fig. 2, left). A repeated-measures one-way ANOVA revealed a significant main effect of trial type, F(2,46) = 28.185, p < 0.001, ηp² = 0.551, where post-hoc comparisons with Bonferroni correction also showed significant differences between all trial types: Target vs Probe, t(23) = −3.745, p = 0.002, d = −0.764; Target vs Irrelevant, t(23) = −9.862, p < 0.001, d = −2.013; and Probe vs Irrelevant, t(23) = −2.919, p = 0.008, d = −0.596. Together, these results highlight the robustness of the RT-CIT, and suggest that our participants do take longer to deny recognition of a previously-seen object. In particular, Target trials have lower accuracy and longer RT, which is also similar to previous reports. This is perhaps due to the unequal 1:5 probability of yes and no responses (Target vs Probe and Irrelevant) that has biased most participants to use “no” as their default response.
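For readers who wish to relate the reported effect sizes to the test statistics, the following sketch applies two standard conversions: partial eta squared from an F ratio and its degrees of freedom, and a paired-samples Cohen's d computed as t divided by the square root of the sample size. The text does not state how the effect sizes were computed, so the choice of these formulas is an assumption; the function names are illustrative. The values reported above are consistent with both conversions for N = 24.

```python
import math


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def cohens_d_paired(t_value, n):
    """Paired-samples effect size (sometimes written d_z): t divided by sqrt(N)."""
    return t_value / math.sqrt(n)


# RT main effect of trial type: F(2,46) = 81.236
rt_eta = round(partial_eta_squared(81.236, 2, 46), 3)

# RT post-hoc comparisons with N = 24 participants, i.e. t(23)
d_target_probe = round(cohens_d_paired(5.934, 24), 3)
d_target_irrelevant = round(cohens_d_paired(11.758, 24), 3)
d_probe_irrelevant = round(cohens_d_paired(7.350, 24), 3)

print(rt_eta, d_target_probe, d_target_irrelevant, d_probe_irrelevant)
```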
Effect of study-test angular deviations
To investigate the possible effect of angle differences between the study and RT-CIT sessions, angular deviation was analyzed symmetrically, such that any given deviation between 0° and 180° can come from two rotations. For example, if a given item was seen at 120° (i.e., presentation angle), an angular deviation of 30° can be obtained bidirectionally from the same item photographed at 150° and 90°, or a deviation of 60° from the 180° and 60° images, and so on. Thus, although there are 12 different presentation angles (0°–330°, whose effect is presented in the following section), there are only six nonzero angular deviations (30°–180°), in addition to the 0° baseline in which the test image matches the studied image. It is worth emphasizing again that the present study reports the effects of two kinds of angles, presentation and deviation, although we are primarily interested in the latter (e.g., seeing a shoe at a 30° presentation angle and later being tested using a 90° presentation angle would yield a study-test deviation angle of 60° in our analysis).
In accuracy, a repeated-measures two-way ANOVA with the factors of trial type (Probe, Target, Irrelevant) and angular deviation (0°, 30°, 60°, 90°, 120°, 150°, 180°) revealed a main effect of both trial type, F(2,46) = 27.830, p < 0.001, ηp² = 0.548, and deviation, F(6,138) = 12.434, p < 0.001, ηp² = 0.351, as well as a significant interaction between the two, F(12,276) = 9.564, p < 0.001, ηp² = 0.294. Subsequent one-way ANOVAs showed that there were significant effects of angular deviations for Targets, F(6,138) = 15.833, p < 0.001, ηp² = 0.408, and Irrelevants, F(6,138) = 6.250, p < 0.001, ηp² = 0.214, but not for Probes, F(6,138) = 0.857, p = 0.561, ηp² = 0.036. These results indicated that trial type was an important factor, but no effect of angular deviation was detected for Probe items (Fig. 3).
Importantly, is the RT-CIT’s effectiveness compromised by differing angles between the study and test phases? A 3 × 7 repeated-measures ANOVA on participants’ RT data showed a significant effect of trial type, F(2,46) = 84.899, p < 0.001, ηp² = 0.787, and deviation, F(6,138) = 7.455, p < 0.001, ηp² = 0.245, but no interaction between the two, F(12,276) = 1.871, p = 0.080, ηp² = 0.075. To help interpretation of the null interaction, a Bayesian repeated-measures ANOVA was performed using the open-source software package JASP (Wagenmakers et al., 2018). The data strongly supported a main effect of trial type, reflected in a Bayes factor that overwhelmingly favored the alternative hypothesis (H1: trial type influences participants’ RT; BF10 = 6.467 × 10⁸⁴) over the null hypothesis. In contrast, the Bayes factor for the effect of angular deviation was less than one (BF10 = 0.121). For the interaction, we compared the BF10 value for the model with interaction against the BF10 value for the model with only two main effects: the Bayes factor for the interaction term was 0.058, providing strong evidence for no interaction (17.367 times more likely than the alternative hypothesis).
Looking at the Probe RTs across all angles, it is quite apparent that the RT distribution is flat, which is reflected by the absence of a significant one-way ANOVA, F(6,138) = 0.724, p = 0.575, ηp² = 0.031. Also, none of the paired-sample t tests between any two Probe angles reached statistical significance. Most importantly, is the Probe–Irrelevant difference altered by angular rotations? To answer this question, we computed the RT differences between Probe and Irrelevant conditions at every angle, and submitted them to a one-way repeated-measures ANOVA. The RT differences between Probe and Irrelevant were flat across all angles, which was reflected by the absence of a significant one-way ANOVA, F(6,138) = 0.942, p = 0.467, ηp² = 0.039 (Fig. 3). Separate comparisons also showed that Probe RTs are significantly slower than their Irrelevant counterparts at every angle: 0°, t(23) = 4.884, p < 0.001, d = 0.997; 30°, t(23) = 6.343, p < 0.001, d = 1.295; 60°, t(23) = 4.728, p < 0.001, d = 0.965; 90°, t(23) = 3.980, p = 0.004, d = 0.812; 120°, t(23) = 7.645, p < 0.001, d = 1.561; 150°, t(23) = 6.955, p < 0.001, d = 1.420; and 180°, t(23) = 6.309, p < 0.001, d = 1.288 (Bonferroni correction). Therefore, these data suggest that the Probe RT remains significantly slower than the Irrelevant RT in a fairly consistent manner, and that we could not detect an effect of angular deviations.
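The symmetric folding of presentation angles into study-test deviations can be sketched as follows. This is an illustrative reconstruction of the mapping described above, not the analysis code used in the study; the function names are hypothetical.

```python
from collections import Counter

PRESENTATION_ANGLES = range(0, 360, 30)  # 0°, 30°, ..., 330°


def deviation(study_angle, test_angle):
    """Smallest rotation (in degrees) between studied and tested angle, 0 to 180."""
    diff = abs(test_angle - study_angle) % 360
    return min(diff, 360 - diff)


def deviation_counts(study_angle):
    """How many of the 12 test images fall at each deviation from the studied angle."""
    return Counter(deviation(study_angle, a) for a in PRESENTATION_ANGLES)


# Examples from the text: an item studied at 120°
assert deviation(120, 150) == deviation(120, 90) == 30
assert deviation(120, 180) == deviation(120, 60) == 60

print(sorted(deviation_counts(120).items()))
```

For any studied angle, the deviations of 30° through 150° each arise from two test images, whereas 0° and 180° each arise from only one. This would be consistent with the one additional repeated trial at 0° and 180° per item in the test phase, although that reading of the design is an inference rather than something the text states.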
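As a reading aid for the Bayes factors above: evidence for the null is the reciprocal of BF10. The sketch below performs that inversion and attaches descriptive labels using cut-offs at 3, 10, 30, and 100, which are common in Bayesian reporting; whether the authors used exactly these labels is an assumption, and the function names are illustrative. Inverting the rounded value 0.058 gives about 17.24, close to the reported 17.367, which was presumably computed from the unrounded Bayes factor.

```python
def bf01_from_bf10(bf10):
    """Evidence for the null relative to the alternative is the reciprocal of BF10."""
    return 1.0 / bf10


def evidence_label(bf):
    """Descriptive label for a Bayes factor >= 1 (assumed cut-offs: 3, 10, 30, 100)."""
    if bf < 1:
        raise ValueError("invert the Bayes factor first so that it is >= 1")
    for threshold, label in [(3, "anecdotal"), (10, "moderate"),
                             (30, "strong"), (100, "very strong")]:
        if bf < threshold:
            return label
    return "extreme"


interaction_bf01 = bf01_from_bf10(0.058)  # about 17.24 from the rounded BF10
deviation_bf01 = bf01_from_bf10(0.121)    # about 8.26 from the rounded BF10

print(round(interaction_bf01, 2), evidence_label(interaction_bf01))
print(round(deviation_bf01, 2), evidence_label(deviation_bf01))
```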
Effect of presentation angle
Apart from deviations in angle rotations, some viewing angles of an object may naturally be more difficult than others because they provide less visual cues or information regarding object identity. For example, in the context of this experiment, images from the 90° and 270° categories might be much harder to identify due to the least amount of information that they contain (Fig. 4, top), which was why these angles were excluded from the study phase (but were used in the test phase to ensure that all deviation angles were covered). Although these angles would be stimulus-specific, and are not the main focus of the present study, it is important to consider whether some angles may require caution for a particular set of Probe images. To this end, to explore whether there are any differences in levels of difficulty that are naturally associated with a specific image angle, here we tested whether some angles might be harder than others for the participants to recognize, regardless of angular deviation.
In both accuracy and RT, a two-way repeated-measures ANOVA with the factors of trial type (Target, Probe, Irrelevant) and presentation angle (12 angles from 0° to 330°) showed a main effect for both trial type (accuracy, F(2,46) = 28.691, p < 0.001, ηp² = 0.555; RT, F(2,46) = 83.368, p < 0.001, ηp² = 0.784) and presentation angle (accuracy, F(11,253) = 8.807, p < 0.001, ηp² = 0.277; RT, F(11,253) = 5.910, p < 0.001, ηp² = 0.204), as well as a significant interaction between them in accuracy, F(22,506) = 3.224, p = 0.004, ηp² = 0.111, but not a statistically significant interaction in RT, F(22,506) = 1.207, p = 0.310, ηp² = 0.050.
We then conducted a trend analysis for each trial type. For accuracy, there were significant quartic trends for Target and Irrelevant items, but not Probe items (Target, F(1,23) = 38.903, p < 0.001, ηp² = 0.628; Irrelevant, F(1,23) = 9.456, p = 0.005, ηp² = 0.291; Probe, F(1,23) = 0.919, p = 0.348, ηp² = 0.038). For RT, significant quartic trends were observed for both Target and Irrelevant items, but not Probes (Target, F(1,23) = 16.191, p = 0.001, ηp² = 0.413; Irrelevant, F(1,23) = 37.670, p < 0.001, ηp² = 0.621; Probe, F(1,23) = 2.182, p = 0.153, ηp² = 0.087). These results are consistent with our initial suspicion that the 90° and 270° angles were more difficult to recognize than others. Thus, police investigators may wish to consider avoiding those angles in forensic contexts. Furthermore, there seems to be a weak quartic trend in Probe RT as well (Fig. 4, right) but, perhaps due to the limited number of trials, this trend was not statistically significant.
RT-CIT efficacy over time
Lastly, one follow-up question was whether the RT-CIT’s effectiveness would decrease as participants gained more exposure to the different rotated versions (11 total) of the Probe image. It seems plausible that multiple exposures, although of photographs from different angles, would nonetheless facilitate a more complete mental representation of the Probe item. To investigate this, we divided the 760 trials into 3 epochs by time (i.e., first 253, middle 253, and last 254 trials). A repeated-measures ANOVA on RT data with factors of trial type (Probe, Target, Irrelevant) and epoch (1, 2, 3) revealed main effects of both epoch, F(2,46) = 19.353, p < 0.001, ηp² = 0.457, and trial type, F(2,46) = 67.252, p < 0.001, ηp² = 0.745, as well as a significant interaction between them, F(4,92) = 5.265, p = 0.014, ηp² = 0.186. To further explore the relationship between Probe and Irrelevant stimuli, post-hoc analysis with Bonferroni correction suggests that the interaction was driven by the closing gap between the Probe and Irrelevant RTs from the first, second, and third epochs, which was due to a steeper decrease of RT for the Probes (first, t(23) = 6.504, p < 0.001, d = 1.328; second, t(23) = 6.364, p < 0.001, d = 1.299; third, t(23) = 3.617, p = 0.004, d = 0.738; see Fig. 5, right). However, even at the third epoch the Probe–Irrelevant RT difference was statistically significant, once again validating the robustness of the RT-CIT. These results suggest that, although participants were becoming faster in Probe RT, in the context of ~800 trials the RT-CIT is good enough to still separate Probe from Irrelevant trials.
In contrast, the accuracy measure from the RT-CIT is more susceptible to multiple exposures of Probe images over time. In a two-way ANOVA, we observed a main effect of trial type, F(2,46) = 30.183, p < 0.001, ηp² = 0.568, and epoch (1, 2, 3), F(2,46) = 4.749, p = 0.021, ηp² = 0.171, and a significant interaction between them, F(4,92) = 4.738, p = 0.012, ηp² = 0.171. The interaction was also driven by the rapidly improving accuracy of Probe items (Fig. 5, left). Post-hoc analysis with Bonferroni correction showed that Probe accuracy went from ~80% in the first epoch to ~90% in the second epoch and ~93% in the third epoch (Probe vs Irrelevant: first, t(23) = −4.208, p = 0.001, d = −0.859; second, t(23) = −1.691, p = 0.313, d = −0.345; third, t(23) = −0.943, p = 1.000, d = −0.193). Therefore, unlike the robust effect in RT, there was no significant difference between Probe and Irrelevant accuracy after the first ~250 trials.
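The division of 760 trials into epochs of 253, 253, and 254 trials can be reproduced with a simple index split. This is a minimal sketch assuming that the leftover trial is assigned to the last epoch, as the reported epoch sizes suggest; the function name is hypothetical.

```python
def split_into_epochs(n_trials, n_epochs=3):
    """Return (start, stop) index pairs; leftover trials go to the latest epochs."""
    base, remainder = divmod(n_trials, n_epochs)
    sizes = [base] * n_epochs
    for i in range(remainder):
        sizes[n_epochs - 1 - i] += 1  # pad from the last epoch backwards
    bounds, start = [], 0
    for size in sizes:
        bounds.append((start, start + size))
        start += size
    return bounds


epochs = split_into_epochs(760)
epoch_sizes = [stop - start for start, stop in epochs]
print(epochs, epoch_sizes)
```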
Discussion
In this experiment we investigated whether angle differences between RT-CIT photographs and participants’ actual exposure and memory would have an impact on RT-CIT accuracy. To our surprise, we could not detect an effect of angular rotations. Therefore, although the participants had never seen 11 out of the 12 angles used here, they nonetheless were able to recognize the Probe items and show a significant RT difference between Probes and Irrelevants at all angular rotations (Fig. 3, lower panel). In addition, we observed a significant presentation angle effect at 90° and 270°, and found that the slower RT of Probes decreased over time but remained significantly different from Irrelevants at almost 800 trials.
The absence of a detectable angular effect in the RT-CIT is encouraging, but also puzzling. Most studies to date have supported the viewpoint-dependent view (e.g., Riesenhuber & Poggio, 2000). In particular, the classic Shepard and Metzler (1971) study was the first to show a linear relationship between angular rotation and recognition RT. However, since our participants here were comparing one displayed photograph with others stored in their memory, it is plausible that such a process may be different from the online, juxtaposed picture-comparison paradigm that is often used in object recognition studies. In other words, the explanation may lie in the different processes involved in memory-based (e.g., RT-CIT) and perception-based (e.g., object recognition literature) comparisons.
One possible factor that is critical in memory-based comparisons, but absent in perception-based comparisons, is memory strength, or depth of encoding. In the context of the RT-CIT, one important study by Seymour and Fraynt (2009) investigated the role of encoding strength (and its possible interaction with memory decay time). These authors randomly assigned participants to either a deep or shallow probe-study condition, as well as to one of three delay conditions: 10 min, 24 h, or 1 week. In the deep-encoding condition, their participants first performed cued recall, and then performed picture matching, word jumble, handwriting, and word shouting tasks to enhance their memory for the Probes. In contrast, the shallow-encoding group only performed cued recall and a paraphrasing task about a news story covering the mock crime. Seymour and Fraynt found that, for the deep-encoding condition, RT-CIT efficacy remained robust across all three delay timeframes. Importantly, for the shallow-encoding condition, RT-CIT accuracy decreased as the delay increased, thus demonstrating that encoding strength does have an impact on RT-CIT accuracy.
In the context of the current task, we ensured adequate memory strength by asking our participants to go through the questionnaire (15 min), jigsaw puzzle (15 min), and a final rehearsal (5 min), plus the two pretests in the study phase before their participation in the RT-CIT. The number of tasks in this phase is quite similar to the deep encoding condition by Seymour and Fraynt (2009), which resulted in our study phase being at least 35 min. This similarity to a deep-encoding design may have contributed to the unexpected absence of the angular rotation effect from the present experiment. To this end, we conducted Experiment 2 with a much shorter and easier study phase to see whether the possible effect of photograph rotations in the RT-CIT would surface with shallow encoding.