Performance-linked visual feedback slows response times during a sustained attention task
Cognitive Research: Principles and Implications volume 8, Article number: 32 (2023)
Abstract
In the present study, we tested a visual feedback triggering system based on real-time tracking of response time (RT) in a sustained attention task. In our task, at certain points, brief visual feedback epochs were presented without interrupting the task itself. When these feedback epochs were performance-linked—meaning that they were triggered because participants were responding more quickly than usual—RTs were slowed after the presentation of feedback. However, visual feedback epochs displayed at predetermined times that were independent of participants’ performance did not slow RTs. Results from a second experiment support the idea that this is not simply a return to baseline that would have occurred had the feedback not been presented, but instead suggest that the feedback itself was effective in altering participants’ responses. In a third experiment, we replicated this result with both written word feedback and visual symbolic feedback, as well as in cases where participants were explicitly told that the feedback was linked to their performance. Altogether, these data provide insight into potential mechanisms for detecting and disrupting lapses in sustained attention without interrupting a continuous task.
Even the most competent drivers occasionally need a passenger to exclaim “Watch out!” to alert them to a potential hazard. This phenomenon exemplifies how easily our minds can wander, especially when trying to balance tasks such as talking and driving, but also how a timely alert can immediately snap us back into focus. In attention-demanding tasks, lapses in focus during key moments can have dangerous consequences. Studies conducted with pilots, for example, demonstrate that repetitive tasks carried out over an extended period of time are vulnerable to lapses in attention, even for experts (Casner & Schooler, 2015; Dehais et al., 2014). A loss of focus by a radar operator or a radiologist might be the difference between life and death for airplane passengers and patients. Therefore, determining why, when, and how these attentional decrements, or lapses in focus, occur is crucial to understanding how we can intervene to prevent hazardous and costly mistakes in tasks that require sustained attention.
Methods of studying sustained attention
The study of sustained attention gained prominence following the experiences of radar operators in World War II, when the Royal Air Force realized that radar operators were becoming less efficient over the course of a shift. In 1948, Norman Mackworth devised the Mackworth clock test, considered to be the first task measuring sustained attention. In this task, participants were instructed to watch a blank clock face with a black pointer (clock hand) that moved in short, constant increments for 2 h, and to respond whenever the pointer jumped double the usual increment. This study found that attentional decrements arise after 30 min of performing a sustained attention task (Mackworth, 1948).
As technology has improved, sustained attention has most consistently been examined through tasks with a constant stream of stimuli in which the participant continuously responds to targets, thus allowing for a more fine-grained analysis of behavior. Two types of tasks have emerged as standard measures of sustained attention—the Continuous Performance Task (CPT) and the Not-X-CPT. In a CPT, participants are presented with one letter (e.g., “A”) on each trial and are instructed to perform a keypress only when the target letter, often the letter “X,” appears on the screen (e.g., Cohen, 2011). Not-X-CPTs, also known as the Conners Continuous Performance Task, similarly present a stream of letter stimuli sequentially; however, participants are directed to perform a keypress for every letter except the letter “X”—trials on which the “X” appears can thus be referred to as “no-go” trials (Conners & Sitarenios, 2011). CPTs are generally longer tasks and have been shown to be a reliable measure of vigilance decrements—periods of declining accuracy and performance due to lapses of sustained attention (Riccio et al., 2002). Not-X-CPTs, on the other hand, have been inconsistent in measuring vigilance decrements, but studies have demonstrated that they are a dependable measure of sustained attention based on variability in response times (RTs) (Folsom & Levin, 2013). More specifically, in Not-X-CPTs, atypically fast RTs predict errors committed on “X” trials due to vigilance decrements (Cheyne et al., 2006; Rosenberg et al., 2013).
Robertson’s Sustained Attention to Response Task (SART) pioneered the go/no-go continuous performance task (CPT). In the SART, participants are instructed to perform keypresses for a random, sequential stream of numbers and to withhold keypresses to the number three (the no-go target), which appears infrequently. The SART was developed to study vigilance decrements in patients who had experienced traumatic brain injuries, but it has since been utilized in numerous studies of sustained attention. The SART demonstrated that erroneous keypresses on no-go trials could be predicted by a decrease in RT on the trials directly preceding the error (e.g., Robertson et al., 1997). Although the SART is a CPT, this finding demonstrated that the SART has some of the benefits of Not-X-CPTs, and it paved the way for future iterations of CPTs that can reliably measure vigilance decrements and predict lapses of sustained attention based on variability in response time.
In an effort to fully examine both of these measures of sustained attention simultaneously, Rosenberg et al. (2013) created a new task called the gradual-onset continuous performance task (gradCPT), which allows for sustained attention to be measured in terms of both response times and vigilance decrements. In the gradCPT, participants are presented with images of either a rare non-target or a target on each trial and are instructed to perform a keypress for each target stimulus and withhold a response for the rare non-target stimuli. Using this novel sustained attention task, Rosenberg et al. (2013) found that increased variability in RT resulted in more errors of commission, or erroneous responses to non-target stimuli. Furthermore, commission errors steadily increased over the course of the task. This task has been subsequently used to study many aspects of sustained attention such as brain networks associated with fluctuations in sustained attention (Esterman et al., 2012), sustained attention over the lifespan (Fortenbaugh et al., 2015), and the minimal impact of rewards on sustained attention (Esterman et al., 2014).
Theories and approaches to improve sustained attention
In order to target lapses in sustained attention for intervention, it is necessary to understand why they occur. At least two potential explanations have been discussed in the literature. The overload theory attributes attentional decrements in sustained attention tasks to resource depletion: there is a limited pool of resources available for cognitive processing, and these resources deplete with attentional exertion and increased task difficulty (Grier et al., 2003; Pattyn et al., 2008; Ralph et al., 2017; Smit et al., 2004; Thomson et al., 2015; Warm et al., 2008). On this account, interventions such as rest breaks are the best way to reduce attentional decrements in a sustained attention-demanding task. Underload theory, on the other hand, attributes attentional decrements to a lack of arousal and the monotonous nature of a task. According to underload theory, in order to increase attention, arousal must be increased, for example with an alternate task that engages the attention of the participant (Manly et al., 1999; Pattyn et al., 2008; Ralph et al., 2017; Thomson et al., 2015). These two contrasting theoretical approaches suggest that different interventions for reducing lapses of sustained attention may be effective depending on the nature of the task.
Multiple studies have investigated ways to reduce attentional decrements. For example, one study examined the effect of live neurofeedback on the attentional state of an individual. Participants were placed in a functional magnetic resonance imaging (fMRI) scanner, shown a composite image of a face and a scene, and asked to focus on one of the two images. The attention of the participant was tracked using whole-brain fMRI analysis, specifically real-time fMRI (rtfMRI) with multivariate pattern analysis, to provide feedback. When the participant's attention began to slip as measured by rtfMRI, the target of the composite image became harder to see as the images blurred together even more. Likewise, when attention improved, the composite image target became clearer and easier to decipher. Utilizing rtfMRI, the researchers discerned that live neurofeedback training affected regions of the brain including the frontal cortex, ventral temporal cortex, and basal ganglia (specifically the striatum and globus pallidus). Critically, the study found that this live neurofeedback mechanism decreased attentional decrements over time (deBettencourt et al., 2015).
Simpler behavioral approaches can also be used to improve performance on sustained attention tasks. Ralph et al. (2017) found that both taking breaks and engaging in an alternate task decreased participants’ response times, thereby improving their sustained attention. Participants in that study completed a mentally challenging version of the Mackworth clock task, in which they were asked to watch a clock’s hand move and to click a button if the hand “skipped” a tick. Three versions of this task were administered: one where participants performed the Clock Task continually, one where participants were instructed to take a rest break, and one where participants were instructed to complete another visuospatial task (the Car Task) during a break period. Ralph et al. (2017) demonstrated that interrupting a repetitive task, even with interruptions that were demanding tasks themselves, could reduce lapses of sustained attention. In addition to breaks and alternate tasks, the effects of monetary incentives on sustained attention have also been examined. Specifically, Esterman et al. (2016) found that having a large looming loss (e.g., loss of money) as the consequence of a single potential error reduced the trend of increasing errors over the course of the task. Interestingly, small rewards over time improved overall performance, but observers still experienced the same trend of increasing errors over time.
The studies described above illustrate multiple ways in which sustained attention decrements can be reduced. However, many of these approaches are not easily applicable to real-life situations in which sustained attention is necessary. For example, a study that examined ways to improve sustained attention in pilots found that interruptions, or having the pilot engage in other cockpit activities, were a distraction and led to more attentional decrements and more mistakes (Casner & Schooler, 2015). This example highlights the reality that breaks, alternate tasks, or looming threats do not align with many tasks that require sustained attention, because taking a break or changing one’s attentional target could result in major mistakes. Furthermore, in many real-world tasks these strategies would simply not be feasible: operating heavy machinery or conducting surgery may require hours of continuous sustained attention without the opportunity for breaks. For this reason, the aim of the current study is to examine more applicable alternatives in which attentional decrements may be attenuated with simple, real-time interventions that respond to changes in a participant’s behavior without interrupting the task.
Effects of alerts on attention
One potential mechanism to reduce attentional decrements involves deploying an alert when an individual is most vulnerable to lapses in focus. The use of an alert system is particularly relevant to the operation of vehicles such as cars and planes. Much of the literature employs either tactile or auditory alerts in an effort to minimize the consequences of attentional decrements and reinstate attention on the assigned task (e.g., Graham, 1999; Hester et al., 2017; Lees et al., 2011; Nees et al., 2016). There is, however, a relatively limited literature available on real-time visual feedback alerts improving performance in a sustained attention task.
A predominant focus of the literature on alert systems is improving the attention of vehicle operators through simulations in which alerts are employed. For example, Fitch et al. (2007) found that seat vibration alerts in a car, delivered at the location on the seat corresponding to the region of the car where a crash was about to occur, were the most effective at alerting a driver to a potential crash. This suggests that alerts delivered right before a potential mistake can help participants maintain accuracy. Gonzalez et al. (2012) also examined alerts while driving, finding that while auditory alerts increased urgency, they also increased annoyance. Wiese and Lee (2001) likewise examined the effects of auditory alerts on driver performance and attitude and similarly found that annoyance was a significant characteristic that can affect workload and performance. Together, these findings suggest that auditory alerts may have unintended negative consequences, especially for participants’ opinions of alerts.
Returning to the discussion of sustained attention, deBettencourt et al. (2019) examined the relationship between sustained attention and working memory using real-time tracking of sustained attention based on participants’ response times. Participants were instructed to press keys based on the shapes in a visual array. To track sustained attention in real time, deBettencourt et al. monitored each participant’s cumulative mean RT and standard deviation, as well as the trailing mean RT of the three trials preceding each trial. Whenever this trailing mean was less than the cumulative mean minus the cumulative standard deviation, or greater than the cumulative mean plus the cumulative standard deviation, a working memory probe was deployed that tested memory for the colors of all shapes on the previous trial. When attention was low, as reflected by RTs more than one standard deviation below the mean, participants remembered fewer items from prior trials (deBettencourt et al., 2019). This indicates that lapses in attention, signaled by fast response times, lead to worse working memory performance.
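For concreteness, this trailing-mean trigger rule can be sketched in a few lines of JavaScript (the language used for our own stimulus code). This is a minimal illustration under assumed variable names, not deBettencourt et al.'s actual implementation:

```javascript
// Minimal sketch of a trailing-mean trigger in the spirit of
// deBettencourt et al. (2019). All names are illustrative.
function shouldProbe(rts) {
  // rts: array of RTs (ms) for all completed trials so far
  if (rts.length < 4) return false; // need history plus three trailing trials
  const mean = rts.reduce((a, b) => a + b, 0) / rts.length;
  const sd = Math.sqrt(
    rts.reduce((a, b) => a + (b - mean) ** 2, 0) / rts.length
  );
  const trailing = rts.slice(-3).reduce((a, b) => a + b, 0) / 3;
  // Probe when the trailing mean falls outside one SD of the cumulative mean
  return trailing < mean - sd || trailing > mean + sd;
}
```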
This novel probing method is a promising approach to studying fluctuations in attention and implementing changes in real time, without interrupting the task at hand—a feature of vital importance to occupations where taking a break or completing alternate assignments during sustained attention tasks is not feasible and may even cause harm.
In the present study, we adapted the approach from deBettencourt et al. (2019) to test how the appearance of visual feedback, prompted by user performance, can change behavior in a sustained attention task. We used real-time tracking of sustained attention (similar to deBettencourt et al., 2019) to detect when participants were likely to be in a less focused attentional state, and feedback epochs were presented when attention appeared to be declining based on the live-tracking of RT. Feedback consisted of words (“Correct!” or “Incorrect”) appearing under the image or colored circles surrounding the image for short bursts of 5 trials according to how the participant responded. Importantly, this intervention did not require any pausing of the task.
In Experiment 1, participants completed a version of the gradCPT (adapted from Fortenbaugh et al., 2015). Short bursts of feedback were presented either when participants were in a less-focused state based on our real-time tracking of RT, or at predetermined times. This allowed us to determine whether changes in performance following feedback epochs occurred only because of the feedback itself, or because the feedback was triggered by specific patterns in participant performance. In Experiment 2, all participants were subject to the real-time RT-based feedback. For each period of shorter-than-usual RT that was detected, feedback epochs were either displayed to the participant (“visible” feedback) or hidden from the participant (“invisible” feedback), allowing us to determine whether changes in performance observed in Experiment 1 were due to the feedback itself, or to a return to baseline that naturally occurs over time even if feedback was not presented. Across both experiments, we hypothesized that when feedback was triggered by periods of shorter-than-usual RT, commission errors would be attenuated and RTs would increase in trials immediately following the feedback, reflecting a return to a more focused attentional state. In other words, we predicted that the feedback epochs would serve as a useful alert that improved participant performance and forestalled potential upcoming errors. Finally, in Experiment 3, we manipulated the type of feedback, using either written words as in Experiments 1 and 2, or visual symbols in the form of green and blue circles. We also manipulated the participants’ knowledge of the connection between the feedback and their own performance. This allowed us to explore the extent to which performance-triggered feedback might increase RT across a broad variety of contexts.
Experiment 1
Methods
Participants
101 participants completed the experiment (females = 51 and males = 47; mean age = 28 years; three participants failed to report demographic information). We removed from analyses all participants who went at least twenty trials without pressing a key, under the assumption that this would capture participants who were not engaged with the task throughout the experiment. This resulted in the elimination of three participants, leaving 98 total participants. All participants had normal or corrected-to-normal vision. Experiments were posted and publicly available on Prolific (http://www.prolific.co). Participants were required to have a United States-based location. All participants provided informed consent and were provided monetary compensation for their participation. The protocol was approved by the Connecticut College Institutional Review Board. Sample size was based on prior studies using a similar task (deBettencourt et al., 2019; Rosenberg et al., 2013), but with a larger sample, as resources allowed, to compensate for the potentially more variable data of online collection.
Stimuli
We used a version of the gradCPT adapted from Fortenbaugh et al. (2015). Participants were shown a series of sequential images that were either cities (85% of the time) or mountains (15% of the time). Note that we used a higher proportion of mountains relative to prior studies in order to increase the number of no-go trials during critical periods of interest. Ten round grayscale images measuring 200 × 200 pixels were used from each category. Images were presented in a randomized order, with the exception that an image was never presented on consecutive trials. Each image faded in over 800 ms, from 0% to 100% opacity, and then faded back out over 800 ms. Each segment of the experiment began and ended with a gray circle of the same dimensions. Images overlapped, such that as one image was fading out, the next image was fading in at the same central location. When feedback was presented, it appeared in the form of text below each image that read either “Correct!” or “Incorrect.” These messages stayed on the display for 500 ms (see Fig. 1 for examples of the feedback). Feedback was triggered either by a response, or by the end of the trial if no response occurred during the trial. During the primary task, when feedback was presented it was shown for epochs of 5 consecutive trials. We will subsequently refer to these periods as feedback epochs. The timing of these feedback epochs was determined as a function of experimental condition (see Procedure for more details).
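To illustrate the presentation scheme, the cross-fade can be expressed as a simple opacity function. The following JavaScript sketch uses the 800-ms timing described above; the function and variable names are ours:

```javascript
// Sketch of the gradCPT cross-fade described above. The 800-ms timing comes
// from the Methods; the names are illustrative.
const FADE_MS = 800;

// Opacity of an image as a function of time (ms) since its own onset.
function opacityAt(tSinceOnset) {
  if (tSinceOnset < 0 || tSinceOnset >= 2 * FADE_MS) return 0;
  return tSinceOnset < FADE_MS
    ? tSinceOnset / FADE_MS                  // fade in: 0% -> 100%
    : 1 - (tSinceOnset - FADE_MS) / FADE_MS; // fade out: 100% -> 0%
}

// Successive images begin FADE_MS apart, so image n + 1 fades in
// exactly while image n fades out at the same central location.
```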
We used custom JavaScript code for stimulus presentation adapted from PsiTurk (Gureckis et al., 2016). At the end of the experiment, we asked participants a series of questions from an ADHD self-report questionnaire (Adult ADHD Self-Report Scale [ASRS-v1.1] Part A) along with a simple self-report mind-wandering questionnaire (see “Appendix A”) and a series of questions about their awareness and feelings about the feedback epochs. The latter included questions about whether the participant thought the feedback epochs were responding to their performance, whether they found the feedback epochs helpful, whether they thought the feedback epochs had a positive impact on their performance, and whether they thought the feedback epochs were annoying. ASRS data were collected for a separate project but are publicly available along with other subject data at our OSF website (https://osf.io/mkgej/).
Procedure
Participants were told they would see a series of pictures of cities and mountain scenes, and that every time they saw a city scene they should press the ‘M’ key, but they should withhold the keypress when they saw a mountain scene. Participants completed 50 practice trials that ramped up in speed and difficulty until they were practicing the primary task. City and mountain scenes appeared equally often in practice, unlike the experiment, where mountains appeared on a randomly selected 15% of all trials. Practice trials were divided into three sections. The first section had no transitions, slow speed, and feedback (10 trials). The second section had transitioning images, slow speed, and feedback (20 trials). The third section had transitioning images, full pace, and no feedback (20 trials). Participants saw instructions before each set of practice trials informing them of the changes; on each screen, they were instructed to be as accurate as possible. After practice, participants completed 4 blocks of 100 trials each with no breaks in between.
There were two between-subjects experimental conditions regarding feedback during the primary task. Participants were randomly assigned to one of these two groups prior to the experiment. For both groups, no feedback was presented during the first block of trials in order to establish a stable mean RT for each participant. For the Predetermined group, participants were randomly assigned to see either 3, 4, 5, or 6 feedback epochs. Each epoch began at a predetermined trial number. The spacing between each epoch was roughly equal, with shorter spacings for participants who received more frequent feedback epochs; however, the exact trial on which each epoch appeared was randomly jittered for each participant by up to 5 trials in either direction to make the appearance of the epochs less predictable. For the Triggered group, feedback epochs were triggered by a period of atypically rapid responding. These periods were defined as three consecutive city trials with correct responses in which the participant’s mean RT was at least 1 standard deviation lower than their cumulative mean RT up to that point of the experiment (similar to deBettencourt et al., 2019). After one feedback epoch was triggered, the next epoch could not be triggered until at least 30 additional trials had passed. Pressing the key was considered a correct commission on city trials and a commission error on mountain trials.
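As a concrete sketch, the trigger rule for the Triggered group might be implemented as below. This is a hedged JavaScript illustration with names of our own choosing, not the experiment's verbatim code:

```javascript
// Hedged sketch of the triggered-feedback rule for the Triggered group.
// Variable and function names are ours, not the experiment's actual code.
const MIN_GAP = 30; // trials that must elapse after an epoch before the next
let lastEpochTrial = -Infinity;

// history: array of {index, isCity, correct, rt} for completed trials;
// cumMean and cumSD: cumulative mean and SD of RT up to this point.
function shouldTriggerEpoch(history, cumMean, cumSD) {
  const lastThree = history.slice(-3);
  if (lastThree.length < 3) return false;
  const current = history[history.length - 1].index;
  if (current - lastEpochTrial < MIN_GAP) return false; // enforce cooldown
  // Three consecutive correct responses on city (go) trials...
  if (!lastThree.every(t => t.isCity && t.correct)) return false;
  // ...whose mean RT is at least 1 SD below the cumulative mean RT.
  const trailingMean = lastThree.reduce((sum, t) => sum + t.rt, 0) / 3;
  if (trailingMean <= cumMean - cumSD) {
    lastEpochTrial = current;
    return true; // show feedback for the next 5 trials
  }
  return false;
}
```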
Data analysis
Keypresses were linked to trials using the same algorithm as Fortenbaugh et al. (2015) in which a keypress was assigned to a trial using an iterative procedure taking into account when the most recent keypress was made, the relative time at which recent images appeared, the trial type of the most recent and current trial, and whether a response was already recorded for the prior trial. Once assigned, RT was considered relative to the onset of an image, such that a response time of less than 800 ms would be a keypress that occurred before the image reached 100% opacity.
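A greatly simplified sketch of this kind of assignment logic follows. The actual iterative algorithm in Fortenbaugh et al. (2015) weighs more factors than shown here, and the 320-ms cutoff is a purely hypothetical parameter for illustration:

```javascript
// Greatly simplified sketch of keypress-to-trial assignment; the real
// algorithm is iterative and handles more cases. The 320-ms cutoff is
// hypothetical, chosen only to illustrate the idea.
function assignKeypress(pressTime, trials, currentIdx) {
  const current = trials[currentIdx];
  const prev = trials[currentIdx - 1];
  // RT relative to image onset; < 800 ms means the image had not yet
  // reached 100% opacity when the key was pressed.
  const rtCurrent = pressTime - current.onset;
  // A very early press more plausibly belongs to the still-fading previous
  // image, provided that trial was a go trial with no response yet.
  if (rtCurrent < 320 && prev && !prev.responded && prev.isCity) {
    prev.responded = true;
    prev.rt = pressTime - prev.onset;
  } else {
    current.responded = true;
    current.rt = rtCurrent;
  }
}
```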
Results
Self-report
We conducted independent-samples t-tests between the predetermined and triggered groups to determine whether there were group-level differences on self-report questions (see “Appendix” for question details). A small number of participants did not respond to all questions—for each analysis, we included all participants who answered the particular questions in that analysis. The groups showed no difference in the extent to which they thought the feedback epochs were responding to their performance, t(86) < 1. Critically, this suggests that participants were not aware of when the feedback was being triggered by their performance as opposed to when feedback epochs were presented at predetermined times that were entirely independent of their performance. In other words, participants were not consciously aware of the key manipulation of the present study. We also found no differences across other self-report questions related to mind-wandering or participants’ perceptions of the feedback epochs, all ts < 1. On average, participants found the feedback epochs to be annoying (M: 3.83) more so than helpful (M: 3.11) or positively affecting their performance (M: 3.49), though none of these comparisons reached statistical significance, ps > 0.05. See Table 1 for all means and test statistics.
Total feedback epochs
Our goal in the present study was to approximately match the total number of feedback epochs across our two groups. However, because it was not possible to predict exactly how many feedback epochs would occur in the triggered group, this was an approximation done before data collection. Overall, more feedback epochs did occur in the triggered group (M: 5.5) compared to the predetermined group (M: 4.7), t(76.50) = -3.89, p < 0.001.
There were two primary approaches we used to determine the extent to which feedback impacted performance. The first was to examine overall group-level differences on primary metrics of performance; we refer to these as global analyses. The logic for these analyses is that performance-triggered feedback might lead to global changes in performance relative to feedback that is unrelated to an individual’s performance. The second was to examine local changes surrounding feedback epochs—that is, whether there are changes in performance in the immediate aftermath of the sudden appearance of visual feedback. We refer to these as local analyses.
Global analyses
We examined measures of response time (RT), coefficient of variation (CV, calculated as the SD divided by the mean for a given condition, multiplied by 100), commission errors, and correct commission rates across each block of trials. Responses on trials during which feedback was occurring were not included in these analyses. Finally, we considered the possibility that group-level differences took time to emerge over the course of the experiment; thus, we looked at each of these measures only within the final block of 100 trials in a separate analysis.
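These per-block measures are straightforward to compute. The following JavaScript sketch shows one way to do so; function and field names are illustrative:

```javascript
// Sketch of the global per-block measures defined above (names are ours).
// rts: correct city-trial RTs for the block; the count arguments give the
// numerators and denominators for the two commission rates.
function blockStats(rts, mountainPresses, mountains, cityPresses, cities) {
  const mean = rts.reduce((a, b) => a + b, 0) / rts.length;
  const sd = Math.sqrt(
    rts.reduce((a, b) => a + (b - mean) ** 2, 0) / rts.length
  );
  return {
    meanRT: mean,
    cv: (sd / mean) * 100,                            // coefficient of variation
    commissionErrorRate: mountainPresses / mountains, // presses on no-go trials
    correctCommissionRate: cityPresses / cities,      // presses on go trials
  };
}
```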
A 2 × 4 ANOVA was run on RT, CV (see Footnote 1), commission errors, and correct commission rates with factors of block (1, 2, 3, or 4) and condition (triggered or predetermined). For RT, there was no main effect of condition, F(1,96) = 0.30, p = 0.58, or block, F(3,288) = 1.5, p = 0.21, nor an interaction, F(3,288) = 0.04, p = 0.99. For CV, there was a significant effect of block, F(3,262) = 34.76, p < 0.001, ηp2 = 0.27, but no effect of condition, F(1,96) = 0.06, p = 0.81, nor an interaction, F(3,262) = 0.64, p = 0.57. The main effect of block reflects an increasing CV as the experiment progressed. For commission errors, there was no effect of condition, F(1,96) = 0.02, p = 0.88, nor an interaction, F(3,288) = 0.30, p = 0.82. There was a significant effect of block, F(3,288) = 20.92, p < 0.001, ηp2 = 0.18 (Fig. 2). These results follow similar patterns to previous findings (e.g., Rosenberg et al., 2013) in which errors increase in the gradCPT as the task progresses. For correct commission rates, there was again no main effect of condition nor an interaction, ps > 0.05. There was again a significant effect of block, F(3,288) = 13.38, p < 0.001, ηp2 = 0.12, following the same pattern: performance decreased as time-on-task increased. See Additional file 1: Table S1 for means and standard deviations. Finally, we conducted independent-samples t-tests to compare commission errors, correct commission rates, CV, and RT for just the last 100 trials across the two conditions; no results were significant, all ps > 0.05.
Altogether, we saw no evidence of a global change in performance generated by the presence of feedback epochs. This does not mean that performance-triggered feedback could not lead to overall improvements in performance; rather, it suggests that the current approach does not do so, either because the impact of feedback epochs is the same regardless of whether the feedback is triggered by performance or not, or because the feedback does not improve performance in this context. Future studies should directly compare these conditions against a condition in which no feedback is presented.
Local analyses
To better understand the impact of feedback epochs on performance, we examined RT, CV, correct commission rates, and commission error rates immediately before and after feedback epochs across both conditions. Because commission error rates can only be measured on mountain trials, which appear on only 15% of trials, we examined means from windows of the 8 trials immediately preceding the onset of feedback epochs and the 8 trials immediately following their conclusion; all but one participant had at least one trial in each condition for this analysis (that participant was removed from these analyses). A 2 × 2 mixed factorial ANOVA was run with factors of condition (predetermined vs. triggered) and time (before vs. after a feedback epoch) for each dependent variable.
For RT, there were main effects of condition, F(1,96) = 6.36, p = 0.013, ηp2 = 0.06, and time, F(1,96) = 26.16, p < 0.001, ηp2 = 0.21, that were mediated by a significant interaction, F(1,96) = 11.68, p = 0.001, ηp2 = 0.11 (Fig. 3). Follow-up t-tests showed that RT increased following feedback epochs in the triggered condition (before: 619 ms, after: 676 ms), t(47) = 6.66, p < 0.001. However, in the predetermined condition, the difference in RT failed to reach significance (before: 679 ms, after: 690 ms), t(49) = 1.11, p = 0.27. These results indicate that triggered feedback linked to performance altered participants’ behavior in a way that feedback independent of performance did not. Specifically, the triggered feedback significantly slowed participants’ responses.
There was a main effect of time on all other measures as well. CV was higher following feedback epochs (before: 19; after: 22), F(1,96) = 10.44, p = 0.002, ηp2 = 0.10. Commission error rates were higher following feedback epochs (before: 26.2%; after: 36.4%), F(1,94) = 10.10, p = 0.002, ηp2 = 0.10. For the commission error rate analysis, two participants had no mountain trials in at least one of the conditions and thus were removed from that analysis. Finally, correct commission rates were lower following feedback epochs (before: 96.6%; after: 95.5%), F(1,96) = 4.51, p = 0.04, ηp2 = 0.05. This suggests that feedback epochs on the whole were quite disruptive; rather than improving performance by re-focusing attention on the task, they appeared to distract participants and disrupt focused attention.
There was a significant main effect of condition on commission error rates, F(1,94) = 4.13, p = 0.045, ηp2 = 0.04, with higher commission error rates in the predetermined condition (35.1%) compared to the triggered condition (27.5%). However, this difference was not observed when the global data were analyzed as reported in the earlier section, and may be an artifact of differences in the pre-feedback requirements in the triggered condition (e.g., a series of consecutive accurate responses that were required to trigger feedback). Thus, we caution against interpreting this result as suggesting that overall commission error rates were truly lower in the triggered condition.
Critically, there were no other main effects of condition or interactions for any of these other measures, all ps > 0.05. In other words, these measures were otherwise not differentially affected by whether the feedback was triggered or predetermined; RT was the only measure that was differentially affected. Together, these data highlight two key findings. First, feedback epochs on the whole were disruptive rather than helpful to performance. Second, however, these disruptive feedback epochs were successful in slowing participants down when they were responding rapidly—normally a key indicator that they are losing focus and likely to soon commit an error.
Because the size of the time window was somewhat arbitrary, we conducted a follow-up analysis examining RT at a more fine-grained level (Fig. 4). Starting at the trials immediately before and immediately after feedback epochs, we created a moving-window average of 3 trials to examine RT. For example, trial 1 would include the first trial following the conclusion of the feedback epoch and the two trials after that; trial -1 would include the trial immediately before the feedback epoch began and the two trials preceding it. In these windows, we only included the RT for a given trial in a participant’s mean if it was a city trial and the response was accurate. As can be seen in Fig. 4, there is a clear pattern in the triggered condition in which RT was decreasing leading up to the triggering of the feedback epochs, as would be expected based on the algorithm that was used to trigger those epochs. Notably, following feedback, RT in the two conditions looks almost identical. In a three-way ANOVA with factors of time, condition, and trial number, there was a significant three-way interaction, F(7,96) = 26.51, p < 0.001. To better parse this, we conducted separate two-way ANOVAs with factors of trial number and condition for the trials before the feedback epochs and the trials after. Confirming the description above, an interaction was observed for the trials before the feedback epoch, F(7,672) = 55.51, p < 0.001, ηp2 = 0.37, but not after, F(7,672) = 0.75, p = 0.63.
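The moving-window average itself is simple; in JavaScript it might look like the following sketch (names and the forward-looking orientation are illustrative; for pre-epoch positions the same window is applied in the reverse direction):

```javascript
// Sketch of the 3-trial moving-window RT average around feedback epochs.
// rts: RTs of correct city trials in order; startIdx indexes the first
// trial of a window.
function movingWindow(rts, startIdx, windowSize = 3) {
  const windowRTs = rts.slice(startIdx, startIdx + windowSize);
  if (windowRTs.length < windowSize) return null; // incomplete window
  return windowRTs.reduce((a, b) => a + b, 0) / windowSize;
}
```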
Experiment 2
The data from Experiment 1 suggest that feedback epochs that are triggered by periods where participants are responding faster than normal can alter subsequent behavior, as RT is subsequently slowed. Conversely, randomly appearing feedback epochs have little effect on RT. However, a possible alternative explanation is that the feedback epochs are not causally involved in the post-feedback slowing we observed in the triggered condition. That is, participants would have slowed down anyway even if we had not presented them with feedback epochs, because the periods that preceded that feedback involved atypically fast responding. Furthermore, differences in the frequency of feedback epochs across conditions potentially complicated the interpretation of analyses from Experiment 1.
To address these concerns, in Experiment 2 we generated a within-subjects comparison in which periods of atypically short RTs would only trigger feedback epochs half the time. The other half of the time, the feedback epochs were still triggered by the algorithm but made invisible, such that from the participant’s perspective, nothing changed. This way, we can directly compare what happens to a participant's pattern of responses following a period of atypically short RTs as a function of whether feedback epochs were triggered or not.
Methods
Participants
48 participants completed the experiment (females = 18, males = 29, nonbinary = 1; mean age = 28); four were removed from analysis using the same criteria as Experiment 1. Because of the within-subjects design as compared to the between-subjects design of Experiment 1, we aimed to collect a sample size approximately half that of Experiment 1. All participants had normal or corrected-to-normal vision. All other aspects were identical to Experiment 1.
Stimuli and procedure
Experiment 2 was identical to Experiment 1 except where otherwise noted. There was no predetermined condition in Experiment 2. Instead, feedback epochs were generated for all participants based on their performance, using an algorithm similar to Experiment 1. The only difference was that half the time the algorithm triggered a feedback epoch, the feedback was not displayed. Therefore, there were visible feedback epochs, which were similar to Experiment 1, and invisible epochs, in which participants had triggered a feedback epoch but no feedback was displayed. These appeared in alternating order, such that if the prior period of atypically short RTs triggered a visible feedback epoch, the next one would trigger an invisible feedback epoch. Whether the first period of short RTs triggered a visible or invisible epoch was randomly assigned for each participant.
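This alternation logic can be sketched in a few lines of JavaScript (a minimal illustration; names are ours):

```javascript
// Sketch of the visible/invisible alternation in Experiment 2 (names ours).
let nextEpochVisible = Math.random() < 0.5; // random per-participant start

function onEpochTriggered() {
  const visible = nextEpochVisible;
  nextEpochVisible = !nextEpochVisible; // alternate for the next trigger
  return visible; // if false, log the epoch but display no feedback
}
```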
Results
Self-report
As in Experiment 1, we collected the same self-report measures. Here, however, there were no between-subjects variables with which to make a comparison. As in Experiment 1, scores were higher for reporting the feedback to be annoying (M: 4.70) rather than helpful (M: 3.44) or positively affecting performance (M: 3.98). Responses were significantly higher for annoying compared to helpful, t(39) = 2.09, p = 0.04, but the other comparisons were not significant, ps > 0.05. See Table 2 for means and standard deviations.
Local analyses
We conducted 2 × 2 ANOVAs with factors of feedback type (visible vs. invisible) and time (before vs. after feedback epochs), using windows of 8 trials as in Experiment 1. The key analysis was focused on RT as that was the variable that was differentially affected by performance-triggered feedback epochs in Experiment 1. There were main effects of feedback type, F(1,43) = 4.17, p = 0.047, ηp2 = 0.09, and time, F(1,43) = 38.83, p < 0.001, ηp2 = 0.48, mediated by a significant interaction, F(1,43) = 5.07, p = 0.03, ηp2 = 0.11 (Fig. 5). Follow-up t-tests revealed that RT was slower following feedback epochs in both feedback conditions, but the magnitude of the slowing was much larger in the visible condition (65 ms) compared to the invisible condition (29 ms). Furthermore, after feedback, RTs were longer in the visible condition (703 ms) compared to the invisible condition (671 ms), t(43) = 2.48, p = 0.02. In other words, the slowing that occurs following feedback epochs is not simply the result of a reversion to the mean; rather, the visible feedback itself plays a role in slowing participants down.
As in Experiment 1, more commission errors occurred after the feedback epochs (33.2%) than before (18.6%), F(1,33) = 9.18, p = 0.005, ηp2 = 0.22. Ten participants were excluded from the commission error analysis because they had no mountain trials in one of the conditions; because this experiment was within-subjects, there were in some cases fewer total observations in some of the conditions relative to Experiment 1. No other main effects or interactions were significant for commission error or for correct commission rates, ps > 0.05.
There was a main effect of visibility for CV, F(1,43) = 5.82, p = 0.02, ηp2 = 0.12, mediated by a significant interaction, F(1,43) = 5.13, p = 0.03, ηp2 = 0.11. Follow-up t-tests revealed a significant decrease in CV (a decrease of 3) following invisible feedback epochs, t(43) = 2.14, p = 0.04, but no significant change following visible feedback, t(43) = 1.36, p = 0.18. The direction of the CV change in the visible condition was the same as in Experiment 1 (an increase of 2) despite not reaching statistical significance. The reduction in CV in the invisible feedback condition may reflect a return to baseline performance that includes reduced variability relative to the period of fast responding that triggered the feedback.
A similar analysis as in Experiment 1 was conducted examining a moving average of trials preceding and following the feedback epochs (Fig. 6). Performance on the trials before the feedback epochs was approximately equivalent across conditions. We conducted a 2 × 2 × 8 ANOVA with factors of timing, feedback type, and trial. There were main effects of feedback type, timing, and trial, ps < 0.05, that were mediated by interactions. Critically, there was an interaction between timing and feedback type, F(1,301) = 5.32, p = 0.03, ηp2 = 0.11. Simple main effects analyses showed that there was an effect of feedback type after feedback epochs, F(1,43) = 7.99, p = 0.01, with longer RTs following visible feedback (708 ms) compared to invisible feedback (672 ms), but no effect of feedback type before feedback epochs, F(1,43) = 0.02, p = 0.89. There was also an interaction between timing and trial, F(7,301) = 59.74, p < 0.001, ηp2 = 0.58, reflecting a large effect of trial on RT in the trials before feedback epochs but not after, as seen in Fig. 6. No other interactions reached significance, ps > 0.05.
Critically, these results show that visible feedback slows RT; with invisible feedback, RT remained shorter for at least the window we analyzed. In other words, without intervention, participants continued responding faster than normal throughout this period.
The results from Experiment 2 show promising evidence that when feedback is triggered at a particular moment in time, the feedback alters response patterns by slowing participants down. However, one issue with the approach utilized in Experiment 2 may be the distracting nature of the words themselves; for example, prior research has suggested that words are automatically read regardless of task-relevance (see, e.g., Augustinova & Ferrand, 2014, for a review), and this process may have drawn limited resources away from the primary task in the current study. Furthermore, the presentation of words as feedback may be infeasible in many real-world situations (see Footnote 2). In Experiment 3, given the potential for words to distract, we introduce a new form of feedback: a colored circle surrounding the stimulus to indicate accuracy. Additionally, we explore the effects of explicit knowledge about the link between performance and feedback by introducing a condition in which the task instructions inform participants that feedback epochs are tied to their performance. Robison et al. (2021) examined the effects of altering task instructions on a sustained attention task, finding that participants who received specific goal-setting instructions (easy and difficult) at the outset of the experiment demonstrated reduced RTs and vigilance decrements. In their second experiment, they coupled the goal-setting conditions with RT feedback at the end of each block, which was found to increase task engagement. This study suggests that explicit knowledge about a task, given to participants in the instructions, can modify performance on a sustained attention task. In Experiment 3, we wanted to explore whether explicit knowledge that feedback epochs were linked to performance could improve task performance and increase task engagement. It is possible that explicit knowledge could help participants better understand that the feedback is there to help them alter their behavior during lapses of sustained attention. Alternatively, they may be upset at being told that their performance is declining, or they may feel increased pressure to perform following feedback, which could backfire (see Belletier et al., 2015). With these two major changes to feedback epochs and task instructions, Experiment 3 aims to build on the success of Experiment 2 and address potential flaws in the previous approach.
Experiment 3
Methods
Participants
394 participants completed the experiment (females = 181, males = 197, nonbinary = 5, declined to respond = 11; mean age = 27.98); forty-one were removed from analysis using the same criteria as Experiment 1.
Stimuli and procedure
Experiment 3 was identical to Experiment 2 except where otherwise noted. Between-subjects conditions were broken up into feedback epochs (word feedback vs. visual circle) and instruction type (explicit vs. implicit). The word feedback condition was identical to the feedback epochs given in Experiment 2. In the visual circle condition, a green (correct) or blue (incorrect) circle appeared around the stimulus during feedback epochs instead of word feedback, with the same timing and triggering parameters as the word feedback.
In the explicit instruction type condition, participants were told at the beginning of the experiment that word or circle feedback epochs would appear based on their performance. Specifically, they were told: “Based on your performance, we will identify times when we think you might be prone to make mistakes. During those periods, we will give you brief periods of feedback to help you stay focused.” In the implicit condition, this instruction was replaced with the instruction, “Please do your best to stay focused.” Thus, in the explicit condition, participants were aware that feedback was tied to their performance and triggered at times when their performance indicated they were losing focus, while in the implicit condition participants were not told that the feedback was related to their performance.
In the explicit condition (181 participants included), there were two between-subjects conditions, word feedback (n = 92) and visual circle (n = 89). In the implicit condition (172 participants included), there were two between-subjects conditions, word feedback (n = 87) and visual circle (n = 85). In all conditions, feedback epochs were triggered based on the same criteria from earlier experiments, when there were three consecutive accurate responses on city trials where the mean RT for those three trials was 1.0 SD or greater below the mean RT up to that point in the experiment. These conditions were all randomly assigned to participants; all other aspects of the design were identical to Experiment 2.
Results
Self-report
As in Experiments 1 and 2, we collected the same self-report measures. For each analysis, we included only participants who answered the self-report questions relevant to that analysis.
A 2 × 2 between-subjects ANOVA with factors of instruction type (explicit or implicit) and feedback epoch (word feedback vs. visual circle) was run on each measure. The groups did not differ in whether they thought the feedback epochs were affecting their performance, in how annoying they found the epochs, or in self-reported mind-wandering, all ps > 0.05. Similar to Experiment 1, participants on average, regardless of condition, found the feedback epochs to be annoying (M: 3.58) more so than helpful (M: 3.43) or positive (M: 3.40), though these differences did not reach statistical significance in paired-samples t-tests, ps > 0.05. See Table 3 for all means and test statistics.
Total feedback epochs
A 2 × 2 between-subjects ANOVA with factors of instruction (explicit or implicit) and feedback epoch (word feedback vs. visual circle) was run on total feedback epochs; all ps > 0.05. We did not find that more feedback epochs occurred in the explicit group (M: 5.35) compared to the implicit group (M: 5.38), nor in the word group (M: 5.22) compared to the circle group (M: 5.51). These data suggest that in a sustained attention task, the type of feedback triggered by periods of fast responding does not change how frequently participants enter those periods of fast responding.
Global analyses
We examined measures of RT, CV, commission errors, and correct commission rates across each block of trials, the same measures as in Experiment 2. Responses on trials during which feedback was occurring were not included in these analyses. To be consistent with previous analyses, we also looked at each of these measures only within the final block of 100 trials in a separate analysis.
A 2 × 2 × 4 mixed-factor ANOVA was run on RT, CV, commission errors, and correct commission rate with factors of block (1, 2, 3, or 4), instruction (explicit or implicit), and feedback epoch (word feedback vs. visual circle). For RT, there was an interaction between instruction and feedback epoch, F(1, 349) = 4.37, p = 0.037, ηp2 = 0.01. Simple main effects analyses revealed that for word feedback, RTs were shorter in the implicit condition (699 ms) compared to the explicit condition (718 ms), p = 0.05, whereas there was no significant difference in the circle feedback condition, p = 0.33.
For CV, there was a significant effect of block, F(3, 1047) = 190.41, p < 0.001, ηp2 = 0.35, where CV increased as time on task increased. For commission errors, there was a significant effect of block, F(3, 1047) = 140.10, p < 0.001, ηp2 = 0.29, where commission errors increased as time on task increased. These results replicate patterns from Experiment 1 (and prior research, e.g., Rosenberg et al., 2013) in which errors and variance increase in the gradCPT as the task progresses. For correct commission rate, there was again a significant effect of block, F(3, 1047) = 51.73, p < 0.001, ηp2 = 0.13, where performance decreased as time-on-task increased. There were no other main effects or interactions (ps > 0.05). See Table 3 for means and standard deviations.
Finally, we conducted 2 × 2 ANOVAs to compare commission errors, correct commission rate, CV, and RT for just the last 100 trials across instruction and feedback epoch conditions. There were no differences for either factor in CV, commission errors, or correct commission rates. There was an interaction between instruction and feedback epoch for RT, F(1, 349) = 5.38, p = 0.02, ηp2 = 0.02, reflecting a similar pattern to the interaction observed across all blocks, but no main effect of instruction or feedback epoch (ps > 0.05).
Apart from one relatively modest overall interaction for RT, there were no global differences in any of our measures across the various types of feedback and instruction. Overall, this indicates that the specific type of feedback, or whether the participant is aware that feedback is triggered by their behavior, has little effect on global task performance.
Local analyses
We conducted 2 × 2 × 2 × 2 mixed factorial ANOVAs with within-subjects factors of feedback type (visible vs. invisible) and time (before vs. after feedback epochs) and between-subjects factors of instruction (explicit vs. implicit) and feedback epoch (word feedback vs visual circle). These were conducted for all the same dependent variables as the prior experiments.
There was a main effect of time on RT, F(1, 343) = 345.41, p < 0.001, ηp2 = 0.50, such that before a feedback epoch, mean RT was 640 ms, and after a feedback epoch mean RT was 701 ms. This was expected, as the feedback epochs were triggered specifically by periods of atypically short RTs. Critically, there was one significant interaction for RT, between time and feedback type, F(1, 343) = 4.24, p < 0.01, ηp2 = 0.01. Simple main effects analyses revealed that prior to the feedback, RT was not significantly different between the invisible (641 ms) and visible (640 ms) condition, F(1,343) = 0.059, p = 0.81. However, after the feedback, RTs were significantly longer in the visible condition (708 ms) compared to the invisible condition (695 ms), F(1, 343) = 4.66, p = 0.03 (Fig. 7). No other main effects or interactions were found (ps > 0.05).
This result is critical in two respects. First, it replicates our finding from Experiment 2, that feedback triggered by atypically fast periods of responding can in fact slow subsequent responses. Second, it demonstrates that this effect does not interact with the specific type of feedback (words or circles), or whether the participant has knowledge that the feedback is linked to performance (implicit vs. explicit). In other words, the slowing of responses following performance-triggered feedback is robust across a variety of contexts.
The same analysis with CV as a dependent variable exhibited no main effects or interactions, ps > 0.05. For correct commissions, there was a significant main effect of time, F(1, 343) = 38.66, p < 0.001, ηp2 = 0.11. Accuracy before a feedback epoch was 97.2% and after a feedback epoch was 94.6%. This was expected because in order to trigger an alert, participants needed to have three consecutive accurate and fast responses to city trials. There were no other significant main effects or interactions (ps > 0.05).
Finally, for commission errors, there was a significant main effect of time, F(1, 253) = 8.63, p < 0.01, ηp2 = 0.03, with more errors after a feedback epoch (35.5%) compared to before a feedback epoch (28.0%). Notably, as in Experiment 2, there was no interaction between time and feedback type, F(1, 253) = 1.60, p = 0.21, ηp2 = 0.006. This suggests that the increase in errors is not due to the appearance of the feedback itself, but instead occurs following a period of atypically fast responding regardless of whether feedback is shown. There was also a three-way interaction between instruction, feedback epoch, and feedback type, F(1, 253) = 4.75, p = 0.03, ηp2 = 0.018. See Additional file 1: Table S3 and Fig. 8 for means of commission errors. Critically, there were no other main effects or interactions for any remaining measures, all ps > 0.05. In other words, the remaining measures were not differentially affected by whether the feedback was visible or invisible, or by the instruction and feedback-format conditions.
As in Experiments 1 and 2, because the size of the time window was somewhat arbitrary, we conducted a follow-up analysis examining RT at a more fine-grained level (Fig. 9), using the moving-window average described in Experiment 1. As can be seen in Fig. 9, there is a clear pattern in which RT was decreasing leading up to the triggering of the feedback epochs, as would be expected based on the triggering algorithm. As in Experiment 2, following feedback, RTs were higher in the visible condition compared to the invisible condition.
We conducted a 2 × 2 × 2 × 2 × 8 ANOVA with within-subjects factors of feedback type, time, and trial, and between-subjects factors of instruction and feedback epoch. There were main effects of trial and time, ps < 0.001, both mediated by interactions. As in Experiment 2, there was a significant interaction between feedback type and time, F(1,2345) = 7.95, p = 0.005, ηp2 = 0.02. Simple main effects analyses revealed that RTs were longer following visible (705 ms) compared to invisible (691 ms) feedback, F(1,335) = 6.81, p = 0.01. However, prior to feedback epochs, there was no effect of feedback type, F(1,335) = 1.18, p = 0.28. These results replicate previous findings from Experiment 2 and reaffirm our analysis in the section above, emphasizing that visible performance-triggered feedback does increase RT.
There was also once again a significant interaction between time and trial, F(1,2345) = 322.22, p < 0.001, ηp2 = 0.49, reflecting a change in RTs over trial prior to the feedback epoch but not after. This was expected based on how feedback epochs were triggered and matches what was observed in prior experiments. No other main effects or interactions were significant, ps > 0.05.
Discussion
The goal of the present study was to determine whether performance-linked feedback could improve performance in a sustained attention task. We found evidence that real-time visual feedback can modify participants’ performance. Specifically, short feedback epochs linked to participant performance, that were triggered when participants were responding significantly faster than usual, caused participants to subsequently slow their responses. In Experiment 1, we demonstrated that these longer RTs occurred only when feedback was linked to performance; when feedback occurred at predetermined intervals, feedback epochs did not produce longer RTs. In Experiment 2, we showed that the longer RTs produced by feedback epochs were not a result of a natural process that would occur with or without feedback; when feedback epochs were not visible after being triggered by a period of fast responding, the same return to baseline in RT did not occur. In Experiment 3, we found that the RT slowing effect was robust across different types of feedback as well as conditions in which participants explicitly knew about the link between their own performance and the feedback.
Prior studies have found that shorter RTs are linked to increased errors (e.g., Rosenberg et al., 2013). Thus, in theory, this kind of performance-linked feedback could improve performance in sustained attention tasks by slowing participants down. However, we did not observe a reduction in errors immediately following feedback epochs. Instead, performance overall appeared to worsen following feedback epochs regardless of whether those epochs were linked to performance, and in Experiments 2 and 3 this was also the case following invisible feedback. A plausible explanation, therefore, is that following periods of atypically fast responding, participants are more prone to make mistakes whether or not feedback is presented. This is in line with previous research showing that errors are typically high when participants enter periods of atypically fast responding (e.g., Rosenberg et al., 2013). The feedback does not reduce those errors, but neither does it exacerbate the problem. More broadly, this suggests that while the feedback epochs implemented in the current study successfully demonstrated an ability to manipulate behavior in real time, they were not effective in improving accuracy overall. Perhaps this is linked to participants' perception of the feedback, as participants generally reported finding it annoying across the experiments.
The repeated RT slowing effect across all three experiments is promising: real-time triggering of visual feedback can serve as an effective intervention to detect and interrupt lapses of sustained attention and ultimately modify participants' behavior. While we failed to reduce the rate of commission errors in the current study, there may be real-world applications where slowing by itself is a sufficiently useful outcome.
These results build on the prior literature on cueing during sustained attention tasks. For example, rather than using a trailing mean as in the present study (illustrated in the sketch following this paragraph), Manly et al. (2004) established a mean RT threshold, computed from a modified SART that participants completed prior to the experimental trials, and used it to trigger non-specific, simple auditory tone cues whenever participants' response times fell below that threshold. The authors found that the cued condition significantly improved accuracy, but it did not matter whether the cues were triggered by below-threshold responses or presented at random intervals; the mere presence of cues caused the effect. As noted by Manly et al. (2004), examining RT more closely after the auditory cue reveals a brief, short-lasting period of increased RT. We have reliably reproduced this RT slowing effect with visual feedback epochs in three experiments. There were several key differences between the two studies, however. First, in Manly et al., participants were explicitly instructed to use the cues to stay on task, whereas we gave no explicit instructions related to feedback epochs. Second, in their study, cues improved performance regardless of whether they were tied to performance; in our Experiments 2 and 3, we demonstrated that cues could change performance specifically when tied to a particular level of performance. Finally, their cues reduced errors whereas ours did not. This may be due to differences in modality, instruction, task, or other factors that future research may examine.
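To make this contrast concrete, the sketch below illustrates a trailing-mean trigger in the spirit of the deBettencourt et al. (2019) approach used here, as opposed to a fixed pre-task threshold. The window sizes and the one-standard-deviation rule are illustrative assumptions, not the published parameters of either study.

```python
import numpy as np

def should_trigger(rts, long_window=50, short_window=3):
    """Flag when recent RTs are atypically fast relative to a trailing
    baseline. Window sizes and the 1-SD rule are illustrative assumptions."""
    rts = np.asarray(rts, dtype=float)
    if len(rts) < long_window:
        return False  # not enough history for a stable baseline yet
    baseline = rts[-long_window:]
    recent = rts[-short_window:]
    # Trigger when the recent average falls more than 1 SD below baseline
    return recent.mean() < baseline.mean() - baseline.std()

# Example: steady responding followed by a run of fast responses
history = [700] * 50 + [540, 530, 520]
print(should_trigger(history))  # True under these assumed parameters
```

A fixed-threshold scheme like that of Manly et al. would instead compare each RT to a constant computed before the task began, so it could not adapt as a participant's overall pace drifts over the session.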
Future research might adapt this method to explore the nearly limitless number of possible ways to intervene at critical moments. There is still a great deal to learn about which interventions may be effective and how people respond to different types of warning systems. In earlier pilot testing, for example, we tested alerts that instructed participants to focus, but found that participants were extremely annoyed by this intervention (in some cases even quitting the task in anger). Another issue with the current approach is that there are relatively few mountain trials, so the data on commission errors following feedback are noisy. For example, in Fig. 8, it appears that in the explicit condition with circle feedback, commission errors are actually reduced following feedback; however, the relevant interaction did not reach significance. Future research might employ methods that are more sensitive to commission errors following feedback, such as bursts of mountain trials or other approaches.
Additionally, Seli et al. (2012) investigated speed-accuracy trade-offs in the SART. When task instructions emphasized accuracy, errors decreased. However, sustained attention task instructions typically prioritize both speed and accuracy, so participants may occasionally sacrifice accuracy in an effort to maintain consistent speed. Interestingly, specific auditory alerts that participants were instructed to use to increase awareness during the sustained attention task resulted in increased errors and RT variability. This result somewhat parallels the present findings. Our visual feedback epochs of "Correct!" and "Incorrect", or green and blue circles, simultaneously provided participants with information about their accuracy and alerted them to potential mind-wandering. While feedback about accuracy can decrease errors in some circumstances, in other cases feedback may serve more as a distraction than anything else, increasing errors and RT variability. As such, any benefits the present study may have gained from providing accuracy information may have been negated by the distracting nature of the visual feedback epoch itself. Future research may explore other types of visual or non-visual feedback in the hope of minimizing distraction while maximizing the benefits of interrupting performance when attention wanes.
Notably, another study used "speeding tickets", consisting of several irritating beeps and on-screen text, to slow down participants who were responding too quickly during a visual search task (Wolfe et al., 2007). The speeding ticket was triggered only when an observer indicated a target was absent with an RT shorter than the determined speed limit for that particular type of trial. This approach considerably slowed responses; similar to the present study, however, the sizable slowing effect caused by the speeding tickets did not significantly alter response accuracy. There were notable differences between that study and the present one; for example, no accuracy feedback was involved with speeding tickets, and observers who received a speeding ticket were required to complete a second response following the speeding-ticket trial. Still, those results are broadly consistent with our findings: it is possible to slow participants down when they would otherwise be responding too quickly, but this does not mean that they will reduce their errors. We are hopeful that the current methodological approach provides a valuable new tool to study such interventions and to test their efficacy when linked to real-time performance. Because we were able to test these interventions online, the current approach provides a template for flexible and rapid testing of different intervention approaches to improve performance in sustained attention tasks.
Supplementing the underload and overload theories of sustained attention, several neurocognitive models have been proposed. Esterman and Rothlein (2019) describe four such models: arousal, control, opportunity cost, and efficiency. In the arousal model, sustained attention sits on a bell curve of physiological arousal, reminiscent of the Yerkes-Dodson law; hypoarousal and hyperarousal in this model are correlated with low and high locus coeruleus activity, respectively. Research supporting the arousal model presupposes that mind-wandering/underload is responsible for generating lapses of sustained attention. The control model, based on resource-control theory, holds that different brain regions are engaged in a continuous battle for attentional resources. Generally, there is an inverse relationship between the default mode network (mind wandering) and the frontoparietal and dorsal attention networks (attentional control); however, Thomson et al. (2015) note that attentional resources will always be biased toward mind-wandering. In the opportunity cost model, cognitive control depends on how the observer subjectively values the task compared to other mental activities, suggesting that motivation to receive rewards can serve as a powerful modulator of task performance. Lastly, the efficiency model holds that optimal visual processing occurs during "in-the-zone" periods, when fewer computational resources and less effort are required (Esterman & Rothlein, 2019). Together, these four neurocognitive models offer different approaches to the same goal: explaining how observers maintain sustained attention in a challenging visual task.
The present study utilized the deBettencourt et al. (2019) RT triggering paradigm, which is rooted in the underload theory; the visual feedback epochs in the present study can thus be viewed as an attempt to reduce lapses in sustained attention from the mind-wandering perspective. Conversely, if the overload theory is in fact the cause of vigilance decrements, then the additional stimulation of visual feedback epochs may contribute to further overload. In that case, the RT slowing effect may reflect resource-control dynamics (i.e., participants draining their attentional resources through overload) rather than a purposeful pause following the visual feedback epoch. Future studies should elucidate the true cause of the RT slowing effect, specifically whether visual feedback epochs triggered during lapses of sustained attention cause participants to purposefully slow down or to experience overload. For example, eye tracking can be used to measure mind wandering in the gradCPT; van den Brink et al. (2016) found that lapses in attention (commission errors) coincided with smaller pupil diameter.
Future research could also explore optimal interventions along a variety of dimensions, including other modalities (e.g., auditory alerts), longer or shorter periods of feedback, and other types of performance-triggered interventions that may be effective in reducing errors (e.g., Hester et al., 2017; Lees et al., 2011; Nees et al., 2016). For example, prior research has shown that an increase in CV may be a better predictor of lapses in sustained attention than faster or slower RT alone (e.g., Rosenberg et al., 2013), suggesting that future research focusing on slower RT or CV might highlight other ways to track behavior and intervene at an optimal time; a sketch of a CV-based trigger appears below. Attitudes about the intervention may also be important. Our pilot data suggest that directly instructing observers to focus at points of inattention may elicit angry responses, yet we also observed in Experiment 3 that instructions at the beginning of the experiment explicitly linking the feedback to lapses in focus had little effect on overall performance. Thus, one challenge for future research is to find ways to intervene that change behavior without eliciting emotional responses that might counteract the benefits of the intervention.
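To illustrate what a CV-based trigger might track, the sketch below computes the coefficient of variation (RTSD divided by mean RT) over a trailing window; rising CV with a stable mean would be invisible to a mean-RT trigger. The window size and example values are illustrative assumptions.

```python
import numpy as np

def trailing_cv(rts, window=20):
    """Coefficient of variation (SD / mean) of the last `window` RTs.
    The window size is an illustrative assumption."""
    recent = np.asarray(rts[-window:], dtype=float)
    return recent.std() / recent.mean()

# Example: variability rises even though the mean RT stays similar
stable = [700, 710, 690, 705, 695] * 4
erratic = stable[:-5] + [550, 860, 600, 840, 640]
print(round(trailing_cv(stable), 3), round(trailing_cv(erratic), 3))
```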
In occupations such as radiology or baggage screening, where lapses of sustained attention can have serious consequences such as missing a tumor or a weapon, it is critical to develop alert systems that can detect these periods of mind-wandering and reduce human error. Radiologists, for example, may interpret up to 4% of their daily caseload of radiological scans incorrectly (e.g., Berlin, 2007), and in one study, when radiologists examining abdominal computed tomography (CT) scans exceeded a daily caseload of twenty radiologic examinations, error rates more than doubled (FitzGerald, 2005). While 4% may seem like a low number, any mistake could result in further harm to the patient or, in the worst case, death. In the case of airport baggage screening, the Transportation Security Administration discovered an average of 12.1 firearms in carry-on luggage per day in 2019; of the 4432 firearms confiscated that year, 87% were loaded (Wagner, 2020). If baggage screening agents were to miss any of these loaded firearms, they could endanger the lives of passengers and crew alike. A system that could effectively monitor performance in real time and trigger alerts during lapses of sustained attention could therefore help save lives across a number of professional applications.
Just like the passenger who blurts out "Watch out!" when the driver approaches a roadside hazard, the visual feedback alert system can slow you down but may not always prevent you from making a mistake. However, the fact that visual feedback repeatedly demonstrated a slowing effect across three experiments gives us confidence that performance-triggered interventions are capable of modulating behavior during sustained attention tasks without requiring the participant to stop performing the task. We are hopeful that the current studies provide a blueprint for future work exploring interventions that can not only slow people down but also prevent them from making mistakes. Developing a real-time intervention system that can do both is of utmost importance for preventing life-threatening mistakes that occur when sustained attention lapses across a number of real-world domains.
Availability of data and materials
All relevant subject-level data are publicly available at https://osf.io/mkgej/. Raw data and other materials are available from the authors on reasonable request.
Notes
1. We also provide measures of response time standard deviation (RTSD) in our data files on our OSF website, https://osf.io/mkgej/.
2. We thank an anonymous reviewer for raising these points.
Abbreviations
- RT: Response time
- CPT: Continuous performance task
- gradCPT: Gradual-onset continuous performance task
- fMRI: Functional magnetic resonance imaging
- rtfMRI: Real-time fMRI
- ASRS-v1.1: Adult ADHD Self-Report Scale
- RTSD: Response time standard deviation
- CV: Coefficient of variation
- CT: Computed tomography
References
Augustinova, M., & Ferrand, L. (2014). Automaticity of word reading: Evidence from the semantic Stroop paradigm. Current Directions in Psychological Science, 23(5), 343–348.
Belletier, C., Davranche, K., Tellier, I. S., Dumas, F., Vidal, F., Hasbroucq, T., & Huguet, P. (2015). Choking under monitoring pressure: Being watched by the experimenter reduces executive attention. Psychonomic Bulletin & Review, 22(5), 1410–1416. https://doi.org/10.3758/s13423-015-0804-9
Berlin, L. (2007). Radiologic errors and malpractice: A blurry distinction. American Journal of Roentgenology, 189(3), 517–522. https://doi.org/10.2214/AJR.07.2209
Casner, S. M., & Schooler, J. W. (2015). Vigilance impossible: Diligence, distraction, and daydreaming all lead to failures in a practical monitoring task. Consciousness and Cognition, 35, 33–41. https://doi.org/10.1016/j.concog.2015.04.019
Cheyne, J. A., Carriere, J. S., & Smilek, D. (2006). Absent-mindedness: Lapses of conscious awareness and everyday cognitive failures. Consciousness and Cognition, 15(3), 578–592. https://doi.org/10.1016/j.concog.2005.11.009
Cohen, R. A. (2011). Continuous performance tests. In J. S. Kreutzer, J. DeLuca, & B. Caplan (Eds.), Encyclopedia of clinical neuropsychology. Springer. https://doi.org/10.1007/978-0-387-79948-3_1280
Conners, C. K., & Sitarenios, G. (2011). Conners’ continuous performance test (CPT). In J. S. Kreutzer, J. DeLuca, & B. Caplan (Eds.), Encyclopedia of clinical neuropsychology. Springer. https://doi.org/10.1007/978-0-387-79948-3_1535
deBettencourt, M. T., Cohen, J. D., Lee, R. F., Norman, K. A., & Turk-Browne, N. B. (2015). Closed-loop training of attention with real-time brain imaging. Nature Neuroscience, 18(3), 470–475. https://doi.org/10.1038/nn.3940
deBettencourt, M. T., Keene, P. A., Awh, E., & Vogel, E. K. (2019). Real-time triggering reveals concurrent lapses of attention and working memory. Nature Human Behaviour. https://doi.org/10.1038/s41562-019-0606-6
Dehais, F., Causse, M., Vachon, F., Régis, N., Menant, E., & Tremblay, S. (2014). Failure to detect critical auditory alerts in the cockpit: Evidence for inattentional deafness. Human Factors, 56(4), 631–644. https://doi.org/10.1177/0018720813510735
Esterman, M., Grosso, M., Liu, G., Mitko, A., Morris, R., & DeGutis, J. (2016). Anticipation of monetary reward can attenuate the vigilance decrement. PLoS ONE. https://doi.org/10.1371/journal.pone.0159741
Esterman, M., Noonan, S. K., Rosenberg, M., & DeGutis, J. (2012). In the zone or zoning out? Tracking behavioral and neural fluctuations during sustained attention. Cerebral Cortex, 23(11), 2712–2723. https://doi.org/10.1093/cercor/bhs261
Esterman, M., Reagan, A., Liu, G., Turner, C., & DeGutis, J. (2014). Reward reveals dissociable aspects of sustained attention. Journal of Experimental Psychology: General. https://doi.org/10.1037/xge0000019
Esterman, M., & Rothlein, D. (2019). Models of sustained attention. Current Opinion in Psychology, 29, 174–180. https://doi.org/10.1016/j.copsyc.2019.03.005
Fitch, G. M., Kiefer, R. J., Hankey, J. M., & Kleiner, B. M. (2007). Toward developing an approach for alerting drivers to the direction of a crash threat. Human Factors, 49(4), 710–720. https://doi.org/10.1518/001872007X215782
FitzGerald, R. (2005). Radiological error: Analysis, standard setting, targeted instruction and teamworking. European Radiology, 15(8), 1760–1767. https://doi.org/10.1007/s00330-005-2662-8
Folsom, R., & Levin, P. (2013). Conners’ continuous performance test. In F. R. Volkmar (Ed.), Encyclopedia of autism spectrum disorders. Springer. https://doi.org/10.1007/978-1-4419-1698-3_216
Fortenbaugh, F. C., DeGutis, J., Germine, L., Wilmer, J. B., Grosso, M., Russo, K., & Esterman, M. (2015). Sustained attention across the life span in a sample of 10,000: Dissociating ability and strategy. Psychological Science, 26(9), 1497–1510. https://doi.org/10.1177/0956797615594896
Gonzalez, C., Lewis, B. A., Roberts, D. M., Pratt, S. M., & Baldwin, C. L. (2012). Perceived urgency and annoyance of auditory alerts in a driving context. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 56(1), 1684–1687. https://doi.org/10.1177/1071181312561337
Graham, R. (1999). Use of auditory icons as emergency warnings: Evaluation within a vehicle collision avoidance application. Ergonomics, 42(9), 1233–1248. https://doi.org/10.1080/001401399185108
Grier, R. A., Warm, J. S., Dember, W. N., Matthews, G., Galinsky, T. L., & Parasuraman, R. (2003). The vigilance decrement reflects limitations in effortful attention, not mindlessness. Human Factors, 45(3), 349–359. https://doi.org/10.1518/hfes.45.3.349.27253
Gureckis, T. M., Martin, J., McDonnell, J., Rich, A. S., Markant, D., Coenen, A., Halpern, D., Hamrick, J. B., & Chan, P. (2016). psiTurk: An open-source framework for conducting replicable behavioral experiments online. Behavior Research Methods, 48, 829–842.
Hester, M., Lee, K., & Dyre, B. P. (2017). “Driver take over”: A preliminary exploration of driver trust and performance in autonomous vehicles. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 61(1), 1969–1973. https://doi.org/10.1177/1541931213601971
Lees, M., Cosman, J., Lee, J., Vecera, S., Dawson, J., & Rizzo, M. (2011). Cross-modal warnings for orienting attention in older drivers with and without attention impairments. Applied Ergonomics, 43(4), 768–776. https://doi.org/10.1016/j.apergo.2011.11.012
Mackworth, N. H. (1948). The breakdown of vigilance during prolonged visual search. Quarterly Journal of Experimental Psychology, 1, 6–21. https://doi.org/10.1080/17470214808416738
Manly, T., Heutink, J., Davison, B., Gaynord, B., Greenfield, E., Parr, A., Ridgeway, V., & Robertson, I. H. (2004). An electronic knot in the handkerchief: “Content free cueing” and the maintenance of attentive control. Neuropsychological Rehabilitation, 14(1–2), 89–116.
Manly, T., Robertson, I. H., Galloway, M., & Hawkins, K. (1999). The absent mind: Further investigations of sustained attention to response. Neuropsychologia, 37(6), 661–670. https://doi.org/10.1016/s0028-3932(98)00127-4
Nees, M. A., Helbein, B., & Porter, A. (2016). Speech auditory alerts promote memory for alerted events in a video-simulated self-driving car ride. Human Factors, 58(3), 416–426. https://doi.org/10.1177/0018720816629279
Pattyn, N., Neyt, X., Henderickx, D., & Soetens, E. (2008). Psychophysiological investigations of vigilance decrement: Boredom or cognitive fatigue? Physiology & Behavior, 93, 369–378. https://doi.org/10.1016/j.physbeh.2007.09.016
Ralph, B. C. W., Onderwater, K., Thomson, D. R., & Smilek, D. (2017). Disrupting monotony while increasing demand: Benefits of rest and intervening tasks on vigilance. Psychological Research Psychologische Forschung, 81(2), 432–444. https://doi.org/10.1007/s00426-016-0752-7
Riccio, C. A., Reynolds, C. R., Lowe, P., & Moore, J. J. (2002). The continuous performance test: A window on the neural substrates for attention? Archives of Clinical Neuropsychology, 17(3), 235–272. https://doi.org/10.1093/arclin/17.3.235
Robertson, I. H., Manly, T., Andrade, J., Baddeley, B. T., & Yiend, J. (1997). “Oops!”: Performance correlates of everyday attentional failures in traumatic brain injured and normal subjects. Neuropsychologia, 35(6), 747–758. https://doi.org/10.1016/s0028-3932(97)00015-8
Robison, M. K., Unsworth, N., & Brewer, G. A. (2021). Examining the effects of goal-setting, feedback, and incentives on sustained attention. Journal of Experimental Psychology: Human Perception and Performance, 47(6), 869–891. https://doi.org/10.1037/xhp0000926
Rosenberg, M., Noonan, S., DeGutis, J., & Esterman, M. (2013). Sustaining visual attention in the face of distraction: A novel gradual-onset continuous performance task. Attention, Perception, & Psychophysics, 75(3), 426–439. https://doi.org/10.3758/s13414-012-0413-x
Seli, P., Cheyne, J. A., & Smilek, D. (2012). Attention failures versus misplaced diligence: Separating attention lapses from speed–accuracy trade-offs. Consciousness and Cognition, 21(1), 277–291.
Smit, A. S., Eling, P. A. T. M., & Coenen, A. M. L. (2004). Mental effort causes vigilance decrease due to resource depletion. Acta Psychologica, 115, 35–42. https://doi.org/10.1016/j.actpsy.2003.11.001
Thomson, D. R., Besner, D., & Smilek, D. (2015). A resource-control account of sustained attention: Evidence from mind-wandering and vigilance paradigms. Perspectives on Psychological Science, 10, 82–96. https://doi.org/10.1177/1745691614556681
van den Brink, R. L., Murphy, P. R., & Nieuwenhuis, S. (2016). Pupil diameter tracks lapses of attention. PLoS ONE. https://doi.org/10.1371/journal.pone.0165274
Wagner. (2020, January 15). TSA year in review: 2019 [Blog post]. Retrieved from https://www.tsa.gov/blog/2020/01/15/tsa-year-review-2019
Warm, J. S., Parasuraman, R., & Matthews, G. (2008). Vigilance requires hard mental work and is stressful. Human Factors, 50(3), 433–441. https://doi.org/10.1518/001872008X312152
Wiese, E., & Lee, J. D. (2001). Effects of multiple auditory alerts for in-vehicle information systems on driver attitudes and performance. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 45(23), 1632–1636. https://doi.org/10.1177/154193120104502315
Wolfe, J. M., Horowitz, T. S., Van Wert, M. J., Kenner, N. M., Place, S. S., & Kibbi, N. (2007). Low target prevalence is a stubborn source of errors in visual search tasks. Journal of Experimental Psychology: General, 136(4), 623–638. https://doi.org/10.1037/0096-3445.136.4.623
Acknowledgements
Not applicable.
Significance statement
The ability to maintain sustained attention is critical to performance in numerous real-world situations, such as driving a car or flying a plane. Lapses in sustained attention can lead to detrimental consequences, and taking breaks to refresh attention may not be possible in many circumstances. In the present study, we tested possible interventions that could be implemented at key moments in a sustained attention task to help participants re-focus attention without interrupting the task. Specifically, we tracked participants' responses, and when they were responding in a way that indicated they were losing focus, we presented short bursts of task feedback to determine whether this simple intervention at a key moment could help prevent errors. These interventions were successful at slowing participants down, though preventing errors remains elusive. This could be useful in real-world situations where slowing people down without interrupting the task is a key outcome. We are also hopeful that this simple methodological approach provides a blueprint for future studies to test other interventions, and that successful interventions could be used to re-focus sustained attention and limit life-threatening mistakes in relevant real-world applications.
Funding
This project was supported by a grant from the National Institute of Neurological Disorders and Stroke: R15NS113135.
Author information
Contributions
AFS, ACS, and JM contributed equally to the writing of the manuscript. EOP, ACS, AFS, and JM contributed equally to research design and methodology. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
The Connecticut College Human Subjects Institutional Review Board (IRB) approved this study, and signed informed consent forms were obtained from all participants.
Consent for publication
Not applicable.
Competing interests
The authors have declared that no competing interests exist.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1.
Statistic descriptives of means and standard deviations.
Appendix A
1. On a scale from 1 to 7 (1 = not at all, 7 = very much), please indicate which best represents the degree to which you felt your mind wandered during the task you just completed.
   I noticed visual feedback occasionally during the experiment (not just the practice). (yes/no question) If you answered no (note: this does not occur in some conditions, so if you didn't see it, that doesn't necessarily mean anything went wrong on your end), you can skip the next four questions.
2. On a scale from 1 to 7 (1 = not at all, 7 = very much), did you find the visual feedback during the experiment to be helpful?
3. On a scale from 1 to 7 (1 = not at all, 7 = very much), did you feel that the visual feedback during the experiment was responding to changes in your performance (that it turned on because of something you did)?
4. On a scale from 1 to 7 (1 = not at all, 7 = very much), do you think the visual feedback during the experiment positively affected your ability to focus on the task?
5. On a scale from 1 to 7 (1 = not at all, 7 = very much), did you find the visual feedback during the experiment to be annoying?
6. Any other thoughts or comments? (optional)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Steinkrauss, A.C., Shaikh, A.F., O’Brien Powers, E. et al. Performance-linked visual feedback slows response times during a sustained attention task. Cogn. Research 8, 32 (2023). https://doi.org/10.1186/s41235-023-00487-w