Methods
Design
The experiment used a 3 (Item type: Consistent, Neutral, Misleading) × 3 (Group: Standard, Interim Test, Emphasized Details) mixed design. Item type was manipulated within subjects, and Group was manipulated between subjects.
Participants
Experiment 1 included 132 participants recruited from the Human Participant Pool at Tufts University. Sample size for each experiment was calculated using G*Power 3 (Faul, Erdfelder, Lang, & Buchner 2007), with the goal of determining an appropriate sample size under moderate parameters (power = 0.80, effect size f = 0.30). Participants ranged in age from 18 to 23 years, all spoke English as their primary language, and none had previously been exposed to the experimental material. Participants were randomly assigned to one of three groups, with an equal number of participants in each group.
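Random assignment with equal group sizes, as described above, can be sketched as a shuffled list of group labels. This is a minimal illustration only, not the authors' actual procedure: the function name `assign_groups`, the seed, and the use of 0-based participant indices are all assumptions introduced here.

```python
import random

GROUPS = ("Standard", "Interim Test", "Emphasized Details")

def assign_groups(n_participants, groups=GROUPS, seed=None):
    """Randomly assign participants to groups of equal size.

    Hypothetical helper: the source states only that assignment was
    random with equal group sizes, not how it was carried out.
    """
    if n_participants % len(groups) != 0:
        raise ValueError("n_participants must divide evenly across groups")
    per_group = n_participants // len(groups)
    # Repeat each group label equally often, then shuffle the labels.
    labels = [g for g in groups for _ in range(per_group)]
    rng = random.Random(seed)
    rng.shuffle(labels)
    # Map each participant index (0-based) to its assigned group.
    return dict(enumerate(labels))

# 132 participants split across three groups gives 44 per group.
assignment = assign_groups(132, seed=1)
```

Shuffling a pre-balanced label list guarantees equal group sizes, which independent per-participant random draws would not.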
Materials and procedure
The original event was a 42-minute episode of the television show 24 (20th Century Fox Television 2001). Following the informed consent procedure, participants were instructed to watch the episode and were told that a memory test about it would follow. After viewing the video, participants in the Interim Test group took an immediate cued recall test on 33 details of the video (e.g., Question: What did the terrorist use to knock out the flight attendant? Answer [not provided to participants]: A hypodermic syringe). Questions were presented via E-prime software (Version 2.1; Schneider, Eschman, & Zuccolotto 2002), and participants were required to respond to all questions. No corrective feedback was provided. The 33 questions on the interim test corresponded directly to the 33 critical details presented in the post-event synopsis. Participants in the Standard Misinformation and Emphasized Details groups played Tetris (a computerized falling-rock puzzle game) instead of taking the first test. Testing and game play lasted 12 minutes. All participants then completed a brief demographic questionnaire and a vocabulary test (Salthouse 1993). Participants were given 8 minutes to complete these tasks.
All participants then read the post-event synopsis at their own pace. The synopsis was presented visually using E-prime 2.1 in sequential segments; participants were instructed to read each segment and press the spacebar to move forward. Thirteen segments were presented, each containing between one and three critical details. A total of 33 critical details were presented: 11 sentences contained misleading information (misleading, e.g., The terrorist knocks the flight attendant unconscious with a chloroform rag), 11 contained information consistent with the video (consistent, e.g., The terrorist knocks the flight attendant unconscious with a hypodermic syringe), and 11 served as neutral control sentences (neutral, e.g., The terrorist knocks the flight attendant unconscious). The misleading information always involved replacing a specific item with a plausible alternative. Each critical detail appeared only once in the narrative, and whether the detail was consistent, neutral, or misleading was counterbalanced across participants. Both focal and non-focal details were manipulated.
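One way to counterbalance item type across participants is a Latin-square style rotation. The sketch below assumes three synopsis versions in which each critical detail rotates through the three item types. The number of versions, the rotation scheme, and the name `build_counterbalanced_versions` are assumptions for illustration; the text above does not report how the counterbalancing was implemented.

```python
ITEM_TYPES = ("consistent", "neutral", "misleading")

def build_counterbalanced_versions(n_details=33, item_types=ITEM_TYPES):
    """Return one list of item types per synopsis version.

    Assumption: versions are rotated Latin-square fashion, so every
    detail takes each item type exactly once across the versions.
    """
    k = len(item_types)
    if n_details % k != 0:
        raise ValueError("n_details must divide evenly across item types")
    versions = []
    for shift in range(k):
        # Detail d gets the item type at position (d + shift) mod k.
        version = [item_types[(d + shift) % k] for d in range(n_details)]
        versions.append(version)
    return versions

versions = build_counterbalanced_versions()
# versions[v][d] is the item type of critical detail d in version v.
```

Under this scheme every version contains 11 details of each type, and assigning equal numbers of participants to each version balances item type over details.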
Participants in the Interim Testing and Standard Misinformation groups received the same narratives. In these groups, the narrative was written in 16-point black Arial font and presented against a white background. Participants in the Emphasized Details group received the narrative in the same fashion, with one important exception: sentences containing critical details were presented in red font, and the critical details themselves were underlined. All critical details (consistent, neutral, misleading) were emphasized in this manner. Immediately following the narrative, all participants took a 33-question, forced cued recall test. This test was identical to the one used as the interim test. Participants were instructed to respond with only details from the video, thereby forcing them to discriminate between the original event and the post-event synopsis. Test question order was the same across all groups and followed the narrative structure of the video. Testing was untimed; however, participants could not advance to the next question before responding. A schematic of the procedure can be found in Fig. 1.
Results
Accurate recall on the interim test
All follow-up comparisons used a Bonferroni correction unless otherwise stated. Accurate recall on the interim and final tests was calculated by dividing the total number of trials in which participants produced correct video details by the total number of trials for that given item type. On the interim test, 0.55 of participants’ responses were accurate and 0.05 consisted of spontaneous misinformation production.
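The scoring rule described above (number of trials with a given outcome divided by the number of trials of that item type) can be sketched as follows. The data layout, the response codes ("correct", "misinformation", "other"), and the example trial counts are hypothetical and are not taken from the experiment.

```python
from collections import defaultdict

def proportion_by_item_type(trials, outcome="correct"):
    """Proportion of trials with the given outcome, per item type.

    trials: iterable of (item_type, response_code) pairs.
    The response codes are hypothetical labels chosen for this sketch.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for item_type, response in trials:
        totals[item_type] += 1
        if response == outcome:
            hits[item_type] += 1
    # Divide matching trials by all trials of that item type.
    return {t: hits[t] / totals[t] for t in totals}

# Made-up responses for one participant (11 trials per item type).
example_trials = (
    [("consistent", "correct")] * 9 + [("consistent", "other")] * 2
    + [("neutral", "correct")] * 6 + [("neutral", "other")] * 5
    + [("misleading", "correct")] * 5
    + [("misleading", "misinformation")] * 4
    + [("misleading", "other")] * 2
)

accuracy = proportion_by_item_type(example_trials, outcome="correct")
commission = proportion_by_item_type(example_trials, outcome="misinformation")
```

The same function covers both dependent measures, accurate recall and misleading errors of commission, by changing the `outcome` argument.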
Accurate recall on the final test
A 3 (Item type: Consistent, Neutral, Misleading) × 3 (Group: Standard, Interim Testing, Emphasized Details) ANOVA on average final test accuracy revealed a main effect of item type, F(2, 258) = 148.72, P < 0.001, \( {\eta}_p^2=0.53 \). As illustrated in Fig. 2, accuracy was significantly greater on consistent trials (M = 0.81) than on neutral trials (M = 0.57, t(131) = 11.98, P < 0.01, d = 1.42). In addition, participants were more accurate on neutral trials than on misleading trials (M = 0.47, t(131) = 4.86, P < 0.01, d = 0.49). We also found an interaction between item type and group, F(4, 258) = 4.17, P < 0.005, \( {\eta}_p^2=0.06 \). This interaction was driven by differences across the three groups in the gap between performance on neutral trials and misleading trials. As Fig. 2 illustrates, this difference was small in the Standard Misinformation group and non-significant when examined using a Bonferroni-corrected t-test (t(43) = 0.40, P = 0.70). However, participants in the Emphasized Details group (t(43) = 5.31, P < 0.001, d = 0.69) and participants in the Interim Test group (t(43) = 3.51, P < 0.001, d = 0.58) were significantly less accurate on misleading trials than on neutral trials. No other comparisons on final test accuracy were significant.
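The degrees of freedom and effect sizes reported for this mixed ANOVA can be checked with two standard identities: the within-subjects error term has (N − number of groups) × (levels − 1) degrees of freedom, and partial eta squared can be recovered from an F ratio as F·df_effect / (F·df_effect + df_error). The sketch below applies them to the values in the text; the function names are introduced here for illustration only.

```python
def within_error_df(n_participants, n_groups, n_levels):
    """Error df for within-subjects effects in a mixed ANOVA."""
    return (n_participants - n_groups) * (n_levels - 1)

def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Design from the text: 132 participants, 3 groups, 3 item types.
df_item = 3 - 1                            # main effect of item type
df_interaction = (3 - 1) * (3 - 1)         # item type x group
df_error = within_error_df(132, 3, 3)      # (132 - 3) * 2

# Interaction on final test accuracy, F(4, 258) = 4.17.
eta_interaction = partial_eta_squared(4.17, df_interaction, df_error)
print(df_item, df_interaction, df_error, round(eta_interaction, 2))
```

Because the reported F values are themselves rounded, effect sizes recovered this way can differ from the published ones in the last decimal place.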
Misleading errors of commission on the final test
A 3 (Item type: Consistent, Neutral, Misleading) × 3 (Group: Standard, Interim Testing, Emphasized Details) ANOVA on average misleading errors of commission revealed a main effect of item type, F(2, 258) = 189.12, P < 0.001, \( {\eta}_p^2=0.59 \). As expected, misleading errors of commission were more likely to occur after the presentation of misleading details in the synopsis than spontaneously on consistent or neutral trials. We also found an interaction between item type and group, F(4, 258) = 5.27, P < 0.005, \( {\eta}_p^2=0.08 \). Consistent with previous RES literature, participants in the Interim Testing group (M = 0.34) were more likely to produce misleading errors of commission on the final test than participants in the Standard Misinformation group (M = 0.23, t(86) = 2.26, P < 0.05, d = 0.66). Participants in the Emphasized Details group (M = 0.33) were also significantly more likely to produce misleading errors of commission than those in the Standard Misinformation group (t(86) = 3.25, P < 0.005, d = 0.49). The difference in mean misleading errors of commission between the Interim Test and Emphasized Details groups did not reach statistical significance (t < 1). These data are presented in Fig. 3.
Discussion
Experiment 1 demonstrated that misinformation susceptibility was similar for participants in the Interim Test and Emphasized Details groups. That is, participants in these groups showed a greater difference in accuracy between neutral and misleading trials than participants in the Standard Misinformation group. Further, these participants were more likely to produce misleading errors of commission on the final test than participants in the Standard Misinformation group. Consistent with previous research, these data suggest that interim testing changes how the post-test narrative is processed. Behaviorally, the increase in misinformation susceptibility following interim testing was similar to that produced by highlighting critical details in the present experiment. Greater susceptibility to misinformation after both interim testing and emphasizing details suggests that both procedures may increase the accessibility of synopsis details, and that accessibility may influence misinformation error production on the final test. Thus, both interim testing and emphasizing details may produce an ironic effect, boosting suggestibility. Although the findings of the present experiment align with previous research, it remains unclear why interim testing in this eyewitness paradigm does not result in better learning of previously tested information. We hypothesized that such benefits may only emerge when final testing is delayed, because misleading information will no longer exert influence on memory.