Examining the replicability of backfire effects after standalone corrections

Prike, Toby; Blackley, Phoebe; Swire-Thompson, Briony; Ecker, Ullrich K. H.

doi:10.1186/s41235-023-00492-z

Original article
Open access
Published: 03 July 2023

Cognitive Research: Principles and Implications volume 8, Article number: 39 (2023)

Abstract

Corrections are a frequently used and effective tool for countering misinformation. However, concerns have been raised that corrections may introduce false claims to new audiences when the misinformation is novel. This is because boosting the familiarity of a claim can increase belief in that claim, and thus exposing new audiences to novel misinformation—even as part of a correction—may inadvertently increase misinformation belief. Such an outcome could be conceptualized as a familiarity backfire effect, whereby a familiarity boost increases false-claim endorsement above a control-condition or pre-correction baseline. Here, we examined whether standalone corrections—that is, corrections presented without initial misinformation exposure—can backfire and increase participants’ reliance on the misinformation in their subsequent inferential reasoning, relative to a no-misinformation, no-correction control condition. Across three experiments (total N = 1156) we found that standalone corrections did not backfire immediately (Experiment 1) or after a one-week delay (Experiment 2). However, there was some mixed evidence suggesting corrections may backfire when there is skepticism regarding the correction (Experiment 3). Specifically, in Experiment 3, we found the standalone correction to backfire in open-ended responses, but only when there was skepticism towards the correction. However, this did not replicate with the rating scales measure. Future research should further examine whether skepticism towards the correction is the first replicable mechanism for backfire effects to occur.

Significance statement

Belief in false claims and relying on misinformation in one’s reasoning and decision making can have wide-ranging negative consequences for both individuals and society. Therefore, it is crucial to find effective tools to counter misinformation and to ensure those tools do not inadvertently increase belief in the misinformation. Corrections are one of the most common tools used for tackling misinformation, with individuals and organizations regularly issuing fact-checks or correcting news stories as information becomes available. There is considerable research showing that corrections effectively reduce misinformation belief and reliance. However, concerns have been raised that if—as part of a correction—misinformation is spread to new audiences, then this may lead to greater misinformation reliance. We conducted a series of three experiments (total N = 1156) to test whether standalone corrections—that is, corrections presented without initial misinformation exposure—are at risk of backfiring and increasing misinformation reliance. Corrections did not backfire when misinformation reliance was measured immediately (Experiment 1) or after a one-week delay (Experiment 2). However, when we intentionally chose scenarios to induce skepticism in the correction (Experiment 3), there was some mixed evidence that corrections may backfire. Future research should further examine whether correction skepticism reliably leads to backfire effects. For now, we advise those combating misinformation to continue to use corrections as part of their toolkit. As long as people are not skeptical of the correction, there is low risk that corrections targeting novel misinformation or reaching new audiences will backfire.

Introduction

Misinformation—false or misleading information potentially believed to be true—presents a significant societal challenge (Ecker et al., 2022). Misinformation about health (e.g., “doctor dies following COVID vaccination”; Widmer, 2021) or politics (e.g., “the 2020 US election was stolen”; Cassidy, 2021) can negatively impact both individuals and society (Ha et al., 2021; Horne et al., 2015; Lewandowsky et al., 2017; MacFarlane et al., 2021; Swire-Thompson & Lazer, 2022; Thorson, 2016). It is therefore crucial to develop effective interventions for countering misinformation. Corrections are one of the most widely used and studied interventions, with research clearly indicating that they are an effective intervention for reducing misconceptions and misinformed reasoning and decision making (Lewandowsky et al., 2020; Paynter et al., 2019). That being said, it is also clear that corrections are generally only partially effective, with considerable evidence showing that people continue to rely on misinformation in their reasoning even after being given corrections. This continued reliance on misinformation has been termed the continued influence effect (Johnson & Seifert, 1994; for a review see Ecker et al., 2022).

Beyond corrections not being fully efficacious, an even greater concern has been that under certain conditions, corrections can be entirely ineffective or may even backfire, resulting in increased misinformation reliance (Lewandowsky et al., 2012). The current evidence suggests that this is a rare phenomenon that can occur if a correction attacks a worldview-bolstering belief (i.e., the worldview backfire effect; Nyhan & Reifler, 2010; but see Ecker & Ang, 2019; Ecker et al., 2021; Wood & Porter, 2019). A second type of backfire effect that has been proposed is a familiarity-driven effect (Lewandowsky et al., 2012; Schwarz et al., 2007). Specifically, if a correction repeats the misinformation in order to invalidate it, the repetition of the false claim may boost its familiarity and thus inadvertently increase claim belief (Pluviano et al., 2017, 2019; Skurnik et al. 2007 [unpublished; discussed in Schwarz et al., 2007]). However, there is little empirical evidence in support of this familiarity backfire effect (Cameron et al., 2013; Ecker et al., 2017; Ecker et al., 2020c; Ecker et al., 2023; Kemp et al., 2022a, 2022b; Swire et al., 2017; Wahlheim et al., 2020; for reviews, see Ecker et al., 2022; Swire-Thompson et al., 2022).

Despite this relative lack of evidence, it has been proposed that there are several situations where familiarity backfire effects may be especially likely to occur. One situation is when a person encounters a correction that negates a novel piece of misinformation (Schwarz et al., 2016). Put simply, a person learning that “x did not happen” may develop a stronger belief in “x” than someone who was never given the x-denying correction (or any other information about x). This may be due to the correction boosting the familiarity of the novel claim. Indeed, a recent study by Autry and Duarte (2021) found that presenting participants with a correction backfired when they had not been exposed to the initial, novel misinformation. In other words, a standalone correction seemed to cause greater misinformation reliance relative to a situation where participants were exposed to neither the misinformation nor the correction. A second situation where such an effect may be likely to arise is if people are particularly skeptical of the correction. For instance, the fact that a correction is issued may be interpreted as evidence that the misinformation was once believed to be true or is believed to be true by some people. Therefore, in the current study we sought to conceptually replicate the findings of Autry and Duarte (2021), to ascertain whether corrections can continue to be safely used even if people may not have encountered the targeted piece of misinformation before.

Theoretical accounts of the continued influence effect may also provide insight into why corrections may potentially backfire. The two dominant accounts of continued influence are the mental-model account and the selective-retrieval account. The mental-model account posits that people desire a complete mental model of an event and its associated cause (Johnson & Seifert, 1994). Therefore, people may be motivated to continue to rely on false information post-correction because it allows them to retain a complete mental model of the event and avoid the psychological discomfort associated with an incomplete mental model (Ecker et al., 2011; Susmann & Wegener, 2022). When encountering a standalone correction (e.g., drug use did not cause an athlete’s suspension), readers learn that an event (the athlete’s suspension) has occurred, without receiving a validated cause. As such, some people may increase their belief in the negated information (drug use) to form and retain a complete mental model that includes a cause of the event, even though they were exposed to the cause only as part of a correction negating it.

The second account of the continued influence effect proposes that misinformation and corrective information are concurrently stored in memory (Ayers & Reder, 1998), and that continued influence is caused by the selective retrieval of misinformation (Ecker et al., 2010). One variant of this account is based on dual-process theories of memory, which assume a rapid, automatic retrieval process driven by familiarity, and a slow, strategic retrieval process required to recollect contextual details, including information source and veracity (Yonelinas, 2002). According to this account, continued influence can arise if misinformation is automatically retrieved based on its familiarity, and strategic recollection of corrective information fails. It follows that misinformation familiarity can be a driver of continued influence, which is in line with the illusory truth effect, the finding that the more familiar a piece of information is, the more likely it is perceived as true (Begg et al., 1992; De keersmaecker et al., 2020; Dechêne et al., 2010; Fazio et al., 2015; Pennycook et al., 2018; Unkelbach, 2007). Because corrections typically repeat the misinformation (e.g., the correction “the athlete’s suspension was not caused by a failed drug test” inevitably repeats the two concepts “suspension” and “drug” and their association), presenting a correction without initial misinformation exposure may boost the familiarity of the misinformation compared to baseline, increasing the subsequent likelihood of misinformation being retrieved and relied upon.

Although these accounts offer some theoretical justification for why standalone corrections may backfire, some previous studies using standalone corrections without initial misinformation exposure have not found any evidence of deleterious effects (Ecker et al., 2020b, 2020c; Gordon et al., 2019). However, Autry and Duarte (2021) argued that the reason for this is that those studies used corrections that were licensed negations. A licensed negation is one that counters either a known (e.g., based on common knowledge or previous exposure) or an easily activated claim (e.g., a stereotype; Mayo et al., 2004). Therefore, because previous studies either corrected a common stereotype (e.g., that a robber was not Black; Gordon et al., 2019) or used a fact-checking approach that presented the false statement in an affirmative format together with a false tag (e.g., “Hospitals are busier on full moons—FALSE”; Ecker et al., 2020b, 2020c), Autry and Duarte suggested that the corrections were licensed.

When a licensed negation is presented, it is relatively easy for people to understand why the negation is being presented and what it is referring to (i.e., what claim is being corrected and why). This is because the specific claim being negated is either known or stereotypical. However, Autry and Duarte (2021) argued that unlicensed corrections, that is, those that negate a piece of information that is unexpected or novel (Mayo et al., 2004), may be at greater risk of backfiring. This is because unlicensed corrections may require more processing, as they negate an unexpected or novel piece of information. This greater level of processing may mean that unlicensed negations are at greater risk of boosting the familiarity of the corrected misinformation than licensed corrections (Autry & Levine, 2012), thereby increasing the likelihood that the misinformation will later be selectively retrieved and relied upon.

Accordingly, Autry and Duarte (2021) ensured their misinformation was not stereotypical, and their negating corrections were not presented as tagged affirmative claims. They presented participants with a multi-paragraph passage in which participants either were or were not exposed to initial misinformation (e.g., “he saw a blue car”). Participants then received a correction (“the car was not blue”), replacement (“the car was red”), or no correction (“the car was his neighbor’s new vehicle”). Autry and Duarte found that unlicensed standalone corrections significantly increased misinformation reliance relative to a no-misinformation, no-correction condition. However, it should be noted that this finding was based on a single event report and that the effect was no longer statistically significant in a second experiment that used a broader range of materials.

Nevertheless, this finding raises the possibility that, unlike licensed negations, unlicensed negations of novel misinformation might be at unique risk of backfiring. However, before accepting this conclusion, it is important to establish that unlicensed negations reliably lead to backfire effects. Moreover, to be relevant to the real world, it is also important to establish that such effects occur when unlicensed negations correct information that carries some relevance. Some corrections used by Autry and Duarte (2021) negated arbitrary side details (e.g., that a dining table was “not square”) which may be less well-remembered and less relevant to meaningful, real-world corrections than an unlicensed negation of a more central and important piece of information (e.g., the cause of an event). Additionally, presenting unlicensed negations of arbitrary side details may be perceived as odd because it violates Gricean maxims of communication (Grice, 1975). Specifically, information relevance is essential to effective communication, and therefore referring to a “big table which turned out to be not square” when a table had never previously been mentioned may be perceived as unexpected or odd by readers. Such norm violations may lead readers to appraise the information in unintended ways. For example, if the information seems entirely irrelevant, it seems particularly plausible to assume that it is only being mentioned (in a negation format) because there is some reason to believe it is true. Therefore, rather than reflecting a familiarity backfire effect, Autry and Duarte’s findings may instead be the result of communication-norm violations leading participants to infer that there are unmentioned reasons to believe the misinformation or to be skeptical of the correction.

The present study

The current study examined the potential for unlicensed negations to backfire while ensuring that the negations referred to core (causal) event details and were not perceived as odd by participants. As in Autry and Duarte (2021), we used multi-paragraph passages, although our reports were somewhat shorter in length (approx. 200–250 words vs. 420–500). A set of eight news reports were developed and pilot-tested to ensure that the reports selected for inclusion featured causal misinformation that was not highly stereotypical (nor highly unexpected), and was not perceived as odd in the context of a standalone correction. Additionally, because we were interested in the effect of presenting (vs. not presenting) standalone corrections, we did not include the replacement condition that replaced the target misinformation with an alternative (e.g., “blue” being replaced with “red”; for further details see Autry & Duarte, 2021). Experiment 1 was a conceptual replication of Autry and Duarte (2021), which used these newly developed and tested materials to examine whether unlicensed negations of novel misinformation would backfire and increase misinformation reliance. Experiments 2 and 3 then further examined two key factors that may increase the risk of corrections backfiring, namely a delay between exposure and test, and skepticism regarding the correction, respectively.

In Experiments 1 and 2, participants were (or were not) exposed to initial misinformation, and then were (or were not) presented with a misinformation-negating correction, creating four within-subject conditions: misinformation/no-correction, misinformation/correction, no-misinformation/correction, and no-misinformation/no-correction (control). In Experiment 3, participants were never initially exposed to misinformation, with standalone corrections directly contrasted with the control condition. The extent to which participants relied on the misinformation in their event-related inferential reasoning was measured via questionnaire.

We expected participants in the misinformation/no-correction condition to have the highest level of misinformation reliance.^{Footnote 1} In line with previous research (Ecker et al., 2017; Ecker et al., 2020b, 2020c; Gordon et al., 2019) we expected a correction that negates a previously presented piece of misinformation (misinformation/correction condition) to reduce but not entirely eliminate misinformation reliance (i.e., we expected a continued influence effect to emerge). Given that there is a large body of evidence demonstrating that corrections do not backfire (Cameron et al., 2013; Ecker et al., 2017, 2023; Ecker et al., 2020c; Kemp et al., 2022a, 2022b; Swire et al., 2017; Wahlheim et al., 2020) and the inconsistent results in Autry and Duarte (2021), we did not expect standalone corrections to backfire, and thus predicted misinformation reliance in the no-misinformation/correction condition to not be significantly higher than control.

Experiment 1

Method

Experiment 1 used a 2 × 2 within-subjects design with the independent variables of misinformation exposure (no misinformation; misinformation) and correction (no correction; correction). The dependent variable, reliance on misinformation, was measured by open-ended responses to event-summary and inference questions. Memory for report details was measured with multiple-choice questions. The experiment used a Qualtrics survey (Qualtrics, Provo, UT, USA, 2022) and was administered online.

Participants

Based on an a-priori power analysis (G*Power 3.1; Faul et al., 2007), a minimum sample size of 200 participants was required to detect an interaction effect of size f = 0.20 (with α = 0.05 and 1 – β = 0.80).^{Footnote 2} This effect size was chosen because it is the effect size used in the power analysis reported by Autry and Duarte (2021). This effect size is also consistent with recommendations by Brysbaert (2019), which suggest that Cohen’s d of 0.4 (f = 0.2) is a good first estimate of the smallest effect size of interest in psychological research. To account for potential exclusions and ensure ample statistical power, 283 participants were recruited from the online testing platform Amazon Mechanical Turk (MTurk) via CloudResearch (Litman et al., 2017). Participants were eligible if they resided in the United States of America and had previously completed more than 5000 MTurk tasks (HITs) with a minimum approval rating of 97%. The data were screened using a-priori criteria to exclude any participants who did not report their English proficiency as at least “good” (> 2 on a 5-point scale ranging from 1, poor to 5, excellent; n = 0), indicated they did not reside in the U.S. (n = 0), self-nominated their data to be excluded because of low effort (n = 2), provided uniform responses (n = 1), or did not meet the minimum memory score to ensure adequate encoding of materials (n = 4; see details below). The final sample size for analysis was thus N = 276. The sample included 133 females, 141 males, 1 non-binary individual, and 1 individual preferring not to declare their gender. Participant age ranged from 18 to 77 years (M = 43.04; SD = 11.65).^{Footnote 3} The experiment took approximately 15 min to complete; participants were paid US$2.50 for their participation.
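The figures in this paragraph can be checked with simple arithmetic. The sketch below is illustrative only: it converts Cohen's d to Cohen's f using the standard two-group relation f = d / 2 and tallies the reported exclusions. The function name and dictionary keys are hypothetical labels, not part of the study materials, and the sketch does not reproduce the G*Power calculation itself.

```python
def cohens_d_to_f(d: float) -> float:
    """Convert Cohen's d to Cohen's f for a two-group comparison (f = d / 2)."""
    return d / 2.0


# Exclusion counts as reported in the Participants paragraph.
recruited = 283
exclusions = {
    "english_proficiency_below_good": 0,
    "not_us_resident": 0,
    "self_nominated_low_effort": 2,
    "uniform_responses": 1,
    "memory_score_below_minimum": 4,
}

final_n = recruited - sum(exclusions.values())

print(f"f for d = 0.4: {cohens_d_to_f(0.4):.2f}")
print(f"final sample size: {final_n}")
```

The tally matches the reported final sample of N = 276, which also equals the sum of the reported gender counts (133 + 141 + 1 + 1).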

Materials

News reports

Eight novel news reports were created and pilot-tested for the current study, leading to the selection of four news reports for inclusion in Experiment 1 (see Additional file 1). In the pilot test, independent samples of N = 100 MTurk participants rated the reports on cause stereotypicality and oddness of a standalone correction, respectively, on a 0 to 10 rating scale (see Additional file 1 for full details). The four reports with lowest cause stereotypicality and standalone-correction oddness were selected for inclusion in Experiment 1. Each report described a fictional event; for example, one report detailed the exclusion of a football club’s star player from an important match (“FC Tokyo’s left winger Yasuto Tanaka has been side-lined for next Wednesday’s J1-League game”); the others related to a local government budget deficit, flight delays, and a server crash. Each report existed in four versions, depending on whether or not it contained misinformation and whether or not it contained a correction (see Table 1 for an example). In the report versions containing misinformation, the first section of the report provided a cause of the event (e.g., “It is believed that Tanaka’s exclusion is due to a failed drug test”); in the no-misinformation versions, the cause was replaced with a neutral, arbitrary statement (e.g., “It is believed that there will be a record crowd for the much-anticipated game”). Irrespective of whether the report provided misinformation initially, the report versions containing a correction provided a negating correction in the second section (e.g., “At today’s press conference the team chairman explained that Tanaka’s exclusion was not due to a failed drug test”); in the no-correction versions, this was replaced with a neutral statement (e.g., “At today’s press conference the team chairman explained that the team still had high hopes of winning the title”). The no-misinformation/no-correction control condition thus simply reported the event without mentioning any cause. Each news report was presented in two parts, on successive screens.

Table 1 Example scenario: athlete sidelined

Test questionnaires

Each scenario had a corresponding test questionnaire that included ten questions: an open-ended event-summary recall question; three multiple-choice questions that assessed memory for report details; five open-ended inference questions that provided an opportunity to mention the misinformation; and one open-ended direct-inference question asking about the event’s cause (all questions are provided in Additional file 1). For methodological consistency with Autry and Duarte (2021), misinformation reliance in Experiment 1 was measured using open-ended questions. This is also consistent with much of the existing work on misinformation and the continued influence effect, which has also often used open-ended questions (Ecker et al., 2010, 2011; Johnson & Seifert, 1994; Seifert, 2002). Three of the inference questions were identical across all scenarios: “What would be a good headline for the report?”; “How could such a situation be avoided in the future?”; and “What should happen next?”. The remaining two inference questions were scenario-specific (e.g., “Why might Tanaka’s season have been described as ‘up-and-down’?”). The memory questions explicitly tested memory for details in the news reports that were unrelated to the event cause and thus the experimental manipulation (e.g., “who will FC Tokyo compete with in the upcoming game?”).

Procedure

Participants initially received an information sheet approved by the University of Western Australia’s Human Research Ethics Office (Ethics ID: RA/4/20/6423) and provided informed consent. Participants answered some basic demographic questions about their English proficiency, age, gender, and country of residence. Participants then read the four fictional news reports—one per experimental condition. Presentation order and assignment of reports to conditions were counterbalanced across participants using a Graeco-Latin-square design. Reading was self-paced but a minimum presentation time was enforced (set at approx. 150 ms per word). Participants were unable to revisit the reports once they had continued. After a one-minute filler-task (a word sleuth), participants completed the four questionnaires, which were presented in the same order as the reports. Lastly, participants were asked if they had put in a reasonable effort and if their data should be included in the analysis, before being fully debriefed.
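To illustrate the counterbalancing idea, the following is a minimal sketch of an order-4 Graeco-Latin square, built from two orthogonal Latin squares over the four-element field GF(4). It is not the authors' actual allocation scheme: the mapping of rows to counterbalancing groups and columns to presentation positions is an assumption, and the report labels are shorthand for the four scenarios.

```python
REPORTS = ["athlete", "budget_deficit", "flight_delays", "server_crash"]
CONDITIONS = [
    "misinformation/no-correction",
    "misinformation/correction",
    "no-misinformation/correction",
    "no-misinformation/no-correction",
]

# Multiplication by the generator of GF(4), with field elements coded 0..3.
# Addition in GF(4) is bitwise XOR under this coding.
_GF4_TIMES_ALPHA = [0, 2, 3, 1]


def graeco_latin_square_4():
    """Return a 4 x 4 grid of (report_index, condition_index) pairs.

    Assumed layout: rows are counterbalancing groups, columns are
    presentation positions.
    """
    return [
        [(i ^ j, _GF4_TIMES_ALPHA[i] ^ j) for j in range(4)]
        for i in range(4)
    ]


def is_graeco_latin(square):
    """Check both layers are Latin squares and that the layers are orthogonal."""
    n = len(square)
    symbols = set(range(n))
    for layer in (0, 1):
        rows_ok = all({cell[layer] for cell in row} == symbols for row in square)
        cols_ok = all(
            {square[i][j][layer] for i in range(n)} == symbols for j in range(n)
        )
        if not (rows_ok and cols_ok):
            return False
    # Orthogonality: every (report, condition) pairing occurs exactly once.
    pairs = {cell for row in square for cell in row}
    return len(pairs) == n * n


square = graeco_latin_square_4()
for group, row in enumerate(square, start=1):
    order = [(REPORTS[r], CONDITIONS[c]) for r, c in row]
    print(f"group {group}: {order}")
```

In such a design, each report and each condition appears once per group and once per presentation position, and every report is paired with every condition exactly once across the four groups.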

Results

Memory for report details

Memory was assessed only to ensure all participants included in the main analyses had encoded the reports. Memory scores were calculated across reports, based on the number of correct responses to the three multiple-choice questions per report; the maximum possible score was thus 12. Participants were required to correctly answer at least one question per news report on average (i.e., memory score ≥ 4) for their data to be included in the analyses, leading to four participants being excluded (see Participants section for more details). For the final sample (i.e., after exclusions, N = 276), the mean memory score was M = 9.12, SD = 2.36.
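The inclusion rule described above can be expressed as a short sketch. The function names and the input format (a list with the number of correct answers per report) are assumptions made for illustration.

```python
QUESTIONS_PER_REPORT = 3
N_REPORTS = 4
MIN_MEMORY_SCORE = 4  # at least one correct answer per report on average


def memory_score(correct_by_report):
    """Sum correct multiple-choice answers across the four reports (0 to 12)."""
    if len(correct_by_report) != N_REPORTS:
        raise ValueError("expected one count per report")
    if any(not 0 <= c <= QUESTIONS_PER_REPORT for c in correct_by_report):
        raise ValueError("each count must be between 0 and 3")
    return sum(correct_by_report)


def passes_memory_check(correct_by_report):
    """Apply the inclusion criterion: total memory score of at least 4."""
    return memory_score(correct_by_report) >= MIN_MEMORY_SCORE


print(passes_memory_check([1, 1, 1, 1]))  # meets the threshold on average
print(passes_memory_check([3, 0, 0, 0]))  # total of 3 falls below the threshold
```

Because the criterion applies to the total across reports, a participant could in principle pass with zero correct answers on one report as long as the sum reaches 4.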

Scoring of misinformation reliance for open-ended responses

Reliance on misinformation was calculated by summing references made to misinformation in response to the open-ended event-summary recall question, the five open-ended inference questions, and the open-ended direct-inference question. To this end, each response was scored using values of 0, 0.5, or 1, based on a detailed written scoring guide created specifically for the data set (see Additional file 1). Any direct reference to the target misinformation or a response that implied belief in the target misinformation was scored as 1 (e.g., “Tanaka could have avoided taking drugs” or “Drugs and sports don’t mix”). Scores of 0.5 were awarded for responses that referred to the misinformation but expressed uncertainty, for example implying there was a chance that the event could be due to a reason other than the misinformation (e.g., “player was side-lined, presumably due to a failed drug test”). A score of 0 was awarded if the misinformation was mentioned but controverted (e.g., “soccer player excluded, but not due to drugs”) or if the participant did not mention the misinformation at all in their response. The maximum possible inference score was seven for each report (i.e., in each condition). A primary scorer scored all responses according to the scoring guide; ambiguous cases were additionally scored by a secondary scorer; discrepancies were resolved through discussion. To determine interrater reliability, a third scorer then scored the responses of a subsample of 36 participants, using the same scoring guide. Reliability was found to be satisfactorily high, r = 0.95. All scorers were blind to experimental conditions.
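The scoring scheme can be sketched as follows. The per-response values (0, 0.5, 1) and the seven scored questions come from the text; the function names are hypothetical. The text reports interrater reliability as r = 0.95 without naming the coefficient, so the Pearson correlation shown here is an assumption.

```python
from math import sqrt

ALLOWED_SCORES = {0.0, 0.5, 1.0}
N_SCORED_QUESTIONS = 7  # 1 event summary + 5 inference + 1 direct inference


def reliance_score(question_scores):
    """Sum the per-question scores for one report (range 0 to 7)."""
    if len(question_scores) != N_SCORED_QUESTIONS:
        raise ValueError("expected seven scored responses per report")
    if any(float(s) not in ALLOWED_SCORES for s in question_scores):
        raise ValueError("each response must be scored 0, 0.5, or 1")
    return float(sum(question_scores))


def pearson_r(x, y):
    """Pearson correlation between two scorers' totals (assumed reliability index)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Example: one direct reference, one hedged reference, five responses without
# misinformation or with the misinformation controverted.
print(reliance_score([1, 0.5, 0, 0, 0, 0, 0]))
```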

Misinformation reliance

Mean misinformation reliance across conditions is shown in Fig. 1. The misinformation reliance measure included a large proportion of zeros, especially in the no-misinformation conditions. Inspection of skewness and kurtosis revealed that the no-misinformation conditions violated the assumption of normality, with skew values ≥ 8.98 and kurtosis values ≥ 27.11. In addition, Shapiro–Wilk tests indicated departures from normality in all conditions, all Ws ≤ 0.22, all ps < 0.001. The deviation was considered so severe that no data transformation was deemed suitable. Therefore, rather than using a within-subjects ANOVA, a zero-inflated Poisson (ZIP) regression model was used for analysis. The ZIP regression model addresses the high frequency of zeros often encountered in count data by concurrently modelling a discrete count distribution and the inflated number of zeros (Green, 2021; Lambert, 1992). It can be considered a two-component mixture model combining a point mass at zero with a proper count distribution; zero scores may thus come from either the point mass or the count component. The specific function used was zeroinfl from the R package pscl (Jackman, 2020; Zeileis et al., 2008); it fits a Poisson count model and a logit model for predicting excess zeros.
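The two-component mixture can be made concrete with the ZIP probability mass function. The sketch below is a plain-Python illustration of the distribution only, not of the regression fitted with zeroinfl; the parameter names lam (Poisson mean) and pi (probability of an excess zero) are chosen here for illustration.

```python
from math import exp, factorial


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """Zero-inflated Poisson probability of observing count k.

    With probability pi the observation is an excess zero (point mass at
    zero); with probability 1 - pi it is drawn from Poisson(lam).
    """
    if k < 0:
        return 0.0
    poisson = exp(-lam) * lam**k / factorial(k)
    if k == 0:
        # Zeros can come from either component of the mixture.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


# Illustrative parameter values only (not estimates from the study).
for k in range(4):
    print(k, round(zip_pmf(k, lam=2.0, pi=0.6), 4))
```

In a ZIP regression, lam and pi are each linked to predictors (typically through a log link and a logit link, respectively), which is why predictors can enter both the count and the zero-inflation component.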

The two experimental factors, misinformation exposure and correction, as well as their interaction, were used to predict the number of misinformation references made by participants. The specific scenario was also included as a predictor in the model, while the repeated-measures design was accounted for by including participant ID as a predictor of both the count and the zero-inflation component. As the zeroinfl function expects count data and therefore cannot handle half scores, misinformation-reference scores were multiplied by two prior to analysis. There were statistically significant main effects of misinformation exposure, β = 4.40, 95% CI [3.74, 5.07], SE = 0.34, z = 12.94, p < 0.001, and correction, β = 1.62, 95% CI [1.20, 2.05], SE = 0.22, z = 7.52, p < 0.001, indicating greater reliance on misinformation after misinformation exposure and reduced reliance after a correction. There was also a statistically significant interaction between misinformation exposure and correction, β = 1.24, 95% CI [0.85, 1.63], SE = 0.20, z = 6.23, p < 0.001, indicating that a correction reduced misinformation reliance only in the condition exposed to the misinformation. The specific scenario used was not a significant predictor of misinformation reliance, β = 0.002, 95% CI [−0.04, 0.04], SE = 0.02, z = 0.10, p = 0.924, nor was participant ID, β < 0.001, SE < 0.001, z = 0.61, p = 0.542.
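Two small steps in this paragraph lend themselves to a worked sketch: converting half-point scores to integer counts, and relating a coefficient to its z value and confidence interval. The helpers below are hypothetical, and the Wald form (z = β / SE, CI = β ± 1.96 × SE) is an assumption about how the reported statistics were derived. Because the inputs are the rounded values from the text, recomputed figures can differ from the reported ones in the last decimal place.

```python
def to_count(score: float) -> int:
    """Double a score in steps of 0.5 so it becomes an integer count."""
    doubled = score * 2
    if doubled != int(doubled):
        raise ValueError("score must be a multiple of 0.5")
    return int(doubled)


def wald_z(beta: float, se: float) -> float:
    """Wald z statistic for a single coefficient."""
    return beta / se


def wald_ci(beta: float, se: float, z_crit: float = 1.96):
    """Approximate 95% Wald confidence interval for a coefficient."""
    return (beta - z_crit * se, beta + z_crit * se)


# A reliance score of 2.5 out of 7 becomes a count of 5 out of 14.
print(to_count(2.5))

# Main effect of misinformation exposure, using the rounded values from the text.
z = wald_z(4.40, 0.34)
lower, upper = wald_ci(4.40, 0.34)
print(f"z = {z:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```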

To establish whether a continued influence effect was present, analysis was restricted to the two conditions featuring a correction (i.e., the misinformation/correction and no-misinformation/correction conditions). The model used condition (misinformation vs. no misinformation) and scenario to predict the number of misinformation references made by participants after a misinformation-negating correction. Participant ID was again additionally included as a predictor of both count and zero-inflation components. There was a statistically significant difference between the two conditions featuring a correction, β = 1.53, 95% CI [1.22, 1.84], SE = 0.16, z = 9.54, p < 0.001. This demonstrates a continued influence effect: a correction following misinformation exposure did not reduce the number of misinformation references to the baseline level associated with presenting a correction in the absence of misinformation exposure. The scenario used was again not a significant predictor, β = 0.03, 95% CI [−0.03, 0.10], SE = 0.03, z = 1.80, p = 0.073, nor was participant ID, β < 0.001, SE < 0.001, z = −0.29, p = 0.199.

The main focus of this research, however, was on the impact of a standalone correction on misinformation reliance relative to a no-misinformation/no-correction control condition. Therefore, if the results of Autry and Duarte (2021) replicate, then we would expect misinformation reliance to be higher in the no-misinformation/correction condition than the no-misinformation/no-correction control condition. To this end, a second restricted ZIP regression analysis was conducted to investigate the difference between the two no-misinformation conditions. The model used condition (correction vs. no correction) and scenario to predict the number of misinformation references made by participants after no initial exposure to the misinformation. Participant ID was again included as a predictor of both count and zero-inflation components. The model provided no evidence of a significant difference between the two conditions, β = 0.39, 95% CI [−0.14, 0.91], SE = 0.27, z = 1.44, p = 0.151. Thus, there was no evidence to suggest that reading a negating correction of novel misinformation, with no initial exposure to the misinformation, increased reliance on the novel misinformation relative to a control condition. The scenario used in the news report was not a significant predictor, β = 0.11, 95% CI [−0.15, 0.38], SE = 0.14, z = 0.82, p = 0.410, nor was participant ID, β = 0.002, SE = 0.001, z = −0.89, p = 0.372.

The scenario used was not found to be a significant predictor of references made to misinformation. However, for the sake of thoroughness, an exploratory post-hoc review of the no-misinformation/correction and no-misinformation/no-correction control conditions uncovered some variation in misinformation reliance across scenarios. Mean reliance on misinformation across the two no-misinformation conditions and scenarios is shown in Fig. 2. As can be seen, only the ‘government-deficit’ scenario, and to a lesser extent the ‘athlete-exclusion’ scenario, showed a numerical increase in misinformation reliance in the no-misinformation/correction condition relative to control. Examination of these results was purely exploratory, and as such no inferential statistical tests were conducted.

Discussion

In line with our hypotheses, Experiment 1 found that corrections of novel misinformation did not lead to increased misinformation reliance, even when the corrections were presented without initial misinformation exposure. This finding is inconsistent with the results of Autry and Duarte (2021) and other previous research that suggests that corrections can backfire due to boosting claim familiarity (Pluviano et al., 2017, 2019; Skurnik et al., 2007 [unpublished]). However, the results are consistent with previous research that has not found evidence of backfire effects with either novel (Ecker et al., 2020b; Gordon et al., 2019) or potentially non-novel misinformation (Cameron et al., 2013; Ecker et al., 2017; Ecker et al., 2020c; Swire et al., 2017; Swire-Thompson et al., 2022).

In Experiment 1, we chose to use open-ended questions for methodological consistency with previous research including Autry and Duarte (2021). Although there are several benefits to this method, there are also some limitations; for instance, belief in misinformation may be underreported due to the effort of writing responses (see Connor Desai & Reimers, 2019). Furthermore, given that we are examining a familiarity-based effect, more familiarity-based procedures such as rating scales may be more sensitive than recall-based measures. We therefore switched to rating scales in Experiment 2.

Experiment 2

Given the widespread use of corrections, even if standalone corrections of novel misinformation do not generally backfire, there could still be serious negative consequences if there are specific circumstances in which they do. Therefore, building on the results of Experiment 1, in Experiment 2 we introduced a one-week delay between reading the articles and completing the test questionnaires. Previous research has found that correction effectiveness is reduced over time (Ecker et al., 2020b; Ecker et al., 2020c; Rich & Zaragoza, 2020; Swire et al., 2017; Swire-Thompson et al., 2023), and of the few studies that have reported familiarity backfire effects, most reported these effects only after a one-week delay (Pluviano et al., 2017, 2019; Skurnik et al., 2007 [unpublished]). There are also theoretical reasons to expect that a delay may increase the risk of a correction backfiring. If corrections can inadvertently lead to increased misinformation reliance because participants rely on familiarity cues and/or fail to retrieve the correction, then introducing a delay should increase the risk of standalone corrections backfiring, because familiarity is less sensitive to time delays than recollection (Yonelinas & Levy, 2002).