All experimental procedures were approved by the Washington University Human Research Protections Office prior to data collection. Participants provided informed consent and were compensated $10/hour for all study procedures, with the opportunity to gain up to an additional $8 bonus, based on the experimental tasks.
A sample of healthy adults (N = 31, 18–23 years old) completed a pilot study to assess the feasibility of completing cognitive effort discounting procedures across both working memory and speech comprehension domains (see A1 for further details). As a brief overview, participants completed a task familiarization phase in which they performed either an N-back task, with working memory load varied across blocks (i.e., how many previous items need to be stored in working memory; N = 1–4, with higher N indicating increased cognitive demands), or a speech-in-noise task, with effortful speech comprehension varied across blocks (i.e., listening to spoken sentences presented with different levels of background noise; signal-to-noise ratios [SNRs] ranging from − 12 to 0 dB, with lower numbers corresponding to greater cognitive demands). Following the familiarization phase, participants completed a decision-making phase, by performing the COG-ED in each of the two domains (i.e., N-Back, speech-in-noise). In the COG-ED, with conditions adapted from prior work (Westbrook et al., 2013), participants were required to make a series of decisions between performing high-effort task levels (e.g., 2, 3, 4 back; − 4, − 8, − 12 SNR) for a higher monetary reward or low-effort task levels (e.g., 1-back; 0 SNR) for a lower monetary reward. Critically, a within-subject design was employed, with each participant completing both the familiarization and discounting phases in both working memory and speech comprehension domains (counterbalanced across participants).
This design enabled us to quantify the subjective costs of cognitive effort for each participant in each domain, and to look at relationships between them.
We found that across both domains, participants discount task load (i.e., cognitive effort) similarly, whereby more difficult levels of the task (i.e., purple; 4-Back, − 12 SNR) are discounted more, or have a lower subjective value, relative to easier task levels (i.e., red; 2-Back, − 4 SNR), B = − 0.15 [− 0.12, − 0.18], SD = 0.02, with no differences observed across domains, B = 0.08 [− 0.02, 0.17], SD = 0.05 (Additional file 1: Fig. S1). Furthermore, examining the average subjective value of cognitive effort across working memory and speech domains revealed a strong within-subjects association, r = 0.521 [0.234, 0.744], BF10 = 39.21. In other words, participants who exhibited a low subjective value of cognitive effort (i.e., find engaging in cognitive effort to be costlier) in the working memory domain also tended to have a low subjective value of cognitive effort in the speech comprehension domain (Additional file 1: Fig. S2). The relationship between the costs of cognitive effort in working memory and speech comprehension domains remained, even after controlling for individual differences related to task difficulty and performance in each respective domain (working memory: hit rate, correct rejection rate, mean RT; speech comprehension: intelligibility), r = 0.400 [0.213, 0.558].
Self-reported ratings of mental demand, effort, and frustration provided further support of the costs of cognitive effort in each domain. There was a main effect of task load across ratings of mental demand B = 13.95 [11.09, 16.73], SD = 1.43, effort B = 11.81 [9.09, 14.49], SD = 1.38, and frustration B = 8.49 [5.68, 11.31], SD = 1.43. This suggests that as task load level increased, subjective ratings of mental demand, effort, and frustration increased. However, in contrast to the behavioral findings, there was also a main effect of domain for self-reported ratings of effort B = − 15.81 [− 23.23, − 8.23], SD = 3.79, and mental demand B = − 10.37 [− 17.50, − 3.19], SD = 3.65, which indicated that participants rated the speech-in-noise task to be less mentally demanding and effortful overall, relative to the working memory task. Frustration ratings did not differ across task domain, B = − 0.39 [− 7.82, 7.00], SD = 3.82.
Furthermore, we did not find conclusive evidence for a relationship between self-reported (e.g., NCS) and behavioral measures of cognitive motivation (e.g., cognitive effort discounting) in the pilot sample. Correlations between NCS and the working memory COG-ED (r = 0.115 [− 0.218, 0.451], BF10 = 0.33), speech comprehension COG-ED (r = 0.174 [− 0.145, 0.493], BF10 = 0.45), and the composite COG-ED score (r = 0.158 [− 0.197, 0.471], BF10 = 0.42) all yielded Bayes factors in the anecdotal range. It is important to note that in the pilot data, other potential covariates, such as working memory capacity or personality traits, were not assessed.
To examine the relationship between experimental measures of cognitive effort, we used the COG-ED (Westbrook et al., 2013) to estimate the subjective value (i.e., cost) of cognitive effort across two domains (i.e., working memory, speech comprehension) and test for within-subject associations between these two domains. Moreover, we obtained individual difference measures of the component processes that seemed likely to contribute to the computation of cognitive effort costs (i.e., working memory capacity, reward sensitivity). The assessment of these other measures provided the means to statistically control for their influence (via partial correlation) when assessing the strength of the association of cognitive effort discounting across working memory and speech comprehension domains.
Orthogonal to our main hypotheses of interest, we also collected a self-reported measure of cognitive motivation (NCS), in order to test for the strength of the association between self-reported and behavioral indices of cognitive motivation. Although the NCS assessment and analyses were not the primary scope of this experiment, collecting these data provided an important baseline of research needed to rigorously explore the relationships between self-reported and behavioral measures of cognitive motivation in future work.
The experiment took place via remote online testing, across two separate sessions, scheduled approximately 24 h apart. All questionnaires and tasks were self-administered using the software platform Inquisit 6 (www.millisecond.com). In the first experimental session, participants were assessed with a range of individual difference measures that indexed working memory capacity (Listening-span, L-span; Cai et al., 2015; Operation-Span, O-Span; Symmetry-Span, Sym-Span; Unsworth et al., 2005). In addition, we collected self-report measures of reward motivation: Behavioral Inhibition and Behavioral Activation Scales (BIS/BAS; Carver & White, 1994), Generalized Reward and Punishment Expectancy Scale (GRAPES; Ball & Zuckerman, 1990), and Sensitivity to Punishment and Sensitivity to Reward Questionnaire (SPSRQ; Torrubia et al., 2001). Self-reported cognitive motivation (NCS; Cacioppo & Petty, 1982) was also collected for use with exploratory analyses. All tasks and questionnaires during this session were administered in the same order across participants.
Working memory familiarization phase
In the second experimental session, participants completed the familiarization and decision-making phases of the COG-ED within each cognitive domain. During the familiarization phase, participants first experienced variously demanding levels of either the working memory or speech-in-noise task; task order was fixed across participants. Both familiarization blocks (working memory, speech comprehension) were roughly equated in total duration. In the working memory task (N-Back), participants respond to each of a sequence of letters, presented one at a time in the center of a computer screen. The task requires that participants indicate when the current stimulus (i.e., letter) matches the letter from N steps earlier in the sequence (target) or when the stimulus differs from the letter presented N steps earlier (non-target). Prior work has shown that as the level of N increases, the task becomes progressively more difficult and effortful (Ewing & Fairclough, 2010). Participants completed one 20-trial run (5 targets; 15 non-targets) of each level of the task (1-back, 2-back, 3-back, 4-back) in ascending order of difficulty. Each level of the task was assigned a color (e.g., 2-Back = “red”) to avoid anchoring effects (i.e., cognitive biases that could cause subjects to base judgments on an initial level of difficulty; Ariely et al., 2003). Thus, participants learned to associate each task level with its assigned color before beginning the discounting procedure. This discounting procedure has been successfully used across multiple participant populations, showing robust effects (Culbreth et al., 2019; Westbrook et al., 2013). N-back task performance during this familiarization phase was assessed in terms of hit rate, correct rejection rate, and mean RT for each task load level. As described below, these values were used to statistically control for individual differences in task performance when estimating cognitive motivation in the working memory domain.
Speech comprehension familiarization phase
During the speech-in-noise task, adapted from McLaughlin et al. (2021), participants were presented with sentences with varying levels of noise. We used speech-shaped noise, that is, steady noise with a spectrum matching that of the sentences. Prior to starting the experiment, participants were encouraged to move to a quiet space and use headphones for the task, if possible. The signal-to-noise ratio (SNR) was adjusted to manipulate task difficulty; negative SNR values indicate that the signal is presented at a lower level than the noise. Sentences were presented at various levels of noise (SNRs of 0 dB, − 4 dB, − 8 dB, and − 12 dB), and participants were instructed to type the sentence they heard back into a text box on each trial. If they were unsure of any words in a sentence, they were instructed to make their best guess. Each task level consisted of 15 self-paced trials wherein participants heard a sentence, typed it back into a text box, and then used the spacebar to begin the next trial. As in the working memory task, participants completed task blocks in order of difficulty, from easiest (0 dB SNR) to hardest (− 12 dB SNR), with the same color mappings for task difficulty used in the working memory task. Speech task performance during this familiarization phase was assessed in terms of intelligibility, operationalized as the number of key words in each sentence that were entered correctly (each sentence included four key words). As described below, these values were used to statistically control for individual differences in task performance when estimating cognitive motivation in the speech comprehension domain.
Following each run of the familiarization task (i.e., completing the 1-Back or 0 SNR task), participants completed self-reported ratings of the mental demand, physical demand, temporal demand, effort, frustration, and performance from the preceding task block using the NASA Task Load Index (Hart, 2006). Participants provided their responses using a visual analog scale ranging from 1 (very low) to 21 (very high). These ratings served as a manipulation check to ensure that participants found the tasks to be effortful and mentally demanding across each load level.
After the familiarization phase, in which each load level was experienced and practiced, the critical decision-making phase of the COG-ED occurred. In this phase, participants made repeated choices about whether to repeat performance of a higher load-level of the task (e.g., 2-, 3-, or 4-back; − 4, − 8, or − 12 SNR) or instead perform the easiest load level (1-back, 0 SNR). In the first trial of each higher- and low-effort pairing, participants were presented with equal reward amounts (either $2, $3, or $4) for completing the chosen task (e.g., $2 for 1-back vs. $2 for 2-back). The offer for the chosen task was then stepwise titrated across a series of 5 calibration trials, to estimate the value at which participants were indifferent between the two offers (i.e., they would be equally likely to choose either offer). For example, if a participant chose the $2 for 1-Back over $2 for the 2-Back, then the next calibration trial would present the participant with the offer of performing the 1-Back for $1 (i.e., half of the amount of the previous offer) or performing the 2-Back for $2 (i.e., fixed offer amount). On the other hand, if the participant instead chose to perform the 2-Back for $2 on the first trial (relative to $2 for the 1-Back) then the offer amount for the higher effort option would be stepwise titrated until the indifference point was reached. The point of subjective indifference is critical because it quantifies how much more subjectively costly the unchosen task level is relative to the chosen task. As a result, these indifference points estimate the “cost” of cognitive effort. In other words, the indifference point is the amount of money an individual is willing to forgo to avoid performing the unchosen task.
Participants completed a total of 45 decision trials in each domain (3 task load levels × 3 monetary reward levels × 5 calibration trials, with the task load and reward levels randomly intermixed) after they completed the corresponding familiarization phase. Critically, participants were informed that one of their choices would be used to determine task-based compensation and that they would be asked to repeat the task they chose, for the amount of money offered (i.e., $2 for the “red” task). Task-based compensation was not based on performance from the familiarization phase, but rather, participants were told that in order to successfully earn the money for repeating the chosen task, they would need to maintain their effort from the familiarization block when repeating the task block.
After completing all task blocks in each domain, participants completed a post-task questionnaire to assess how much their choices during the discounting phase were based on the difficulty, effort, or monetary reward associated with the task. In addition, after completing the speech comprehension phase, participants were asked what device was used to complete the task (e.g., speakers, headphones). Complete descriptions of all self-report questionnaires are provided in Additional file 1. Data collection and analysis were not performed blind to the conditions of the experiments.
We used Bayes factor design analysis (BFDA) to determine the sample size for this experiment. Adopting a sequential design with maximal N using BFDA helped to ensure that we were collecting sufficient evidence while maintaining efficiency in our design (Schönbrodt & Wagenmakers, 2018; Schönbrodt et al., 2017). As an overview, in sequential designs, sampling is continued until the desired level of the strength of evidence is reached (i.e., Bayes factor; BF10), which in this case is 10 times in favor of the experimental hypothesis over the null hypothesis, or vice versa. To strike a balance between the feasibility and interpretability of the results, we planned to stop all data collection after the maximal N for this study (N = 300) was collected, if the Bayes factor threshold had not already been reached. To aid in the calculation of the approximate sample size, we used the BFDA package (Schönbrodt & Stefan, 2019), which runs 10,000 Monte Carlo simulations based on the pre-specified prior distribution and effect size estimates provided by the user. For this experiment, we opted to follow the approach of a safeguard power analysis (Perugini et al., 2014), choosing a smaller effect size (r = 0.3) than what was previously observed in our pilot study (r ~ 0.5 or r ~ 0.4 after controlling for task performance) in order to avoid underestimating the sample size. Furthermore, we decided to use an uninformed prior, a central Cauchy distribution with a scaling parameter of r = √2/2, as is default in the BayesFactor (Morey & Rouder, 2018) package in R, taking a more conservative approach to power analysis.
Results from the simulations suggested that the median sample size needed to obtain a Bayes factor ≥ 10 given the parameters specified above was N = 112, and, conversely, finding evidence in support for the null hypothesis, BF10 ≤ 0.1, would require a median sample size of N = 140 (results summarized in Additional file 1). Thus, we planned to sample, at minimum, 100 participants; after reaching this sample size, we then tested for sufficient evidence every ten participants thereafter, until the Bayes factor threshold (i.e., BF10 ≥ 10 or BF10 ≤ 0.1) was reached or until we collected data from 300 participants, the maximal N.
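As a rough illustration of this stopping rule, the sketch below encodes the decision made after each participant is added. The function name and return labels are hypothetical, and the Bayes factor itself is assumed to be computed elsewhere (for example, with the BayesFactor package in R).

```python
def sequential_decision(n: int, bf10: float, n_min: int = 100,
                        check_every: int = 10, n_max: int = 300,
                        threshold: float = 10.0) -> str:
    """Decide whether to keep sampling under the sequential design.

    n: current sample size; bf10: Bayes factor at that sample size.
    Evidence is only evaluated at n_min and every `check_every`
    participants thereafter.
    """
    if n < n_min:
        return "continue"
    at_checkpoint = (n - n_min) % check_every == 0
    if at_checkpoint and bf10 >= threshold:
        return "stop_h1"      # sufficient evidence for a correlation
    if at_checkpoint and bf10 <= 1 / threshold:
        return "stop_h0"      # sufficient evidence for the null
    if n >= n_max:
        return "stop_max_n"   # maximal N reached without a decision
    return "continue"


print(sequential_decision(100, 12.0))  # stop_h1
print(sequential_decision(105, 12.0))  # continue
```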
Participants were healthy adults, ages 18–40 years, recruited through the online research platform Prolific (www.prolific.co) (Palan & Schitter, 2018). Inclusion criteria for participation included English as native language, with no lifetime history of neurological trauma, seizures, hearing difficulty, or mental illness, and no current use of psychotropic medications. After completing the first experimental session indexing individual differences in working memory capacity and reward sensitivity, participants were invited back to participate in the second experimental session (e.g., discounting) if they completed all tasks and questionnaires from the first session (n = 184). From this sample, 52 participants declined to participate in the second experimental session. In addition, participants were excluded from the final sample if they reported not using headphones during the speech comprehension task (n = 10) or if they did not complete all parts of both discounting tasks (n = 18). The final sample consisted of 104 participants (47 females; 18–40 years, M = 27.3, SD = 5.8; 1 American Indian or Alaskan Native, 13 Asian, 11 Black or African American, 73 White, 5 more than 1 race, 1 not reported; 18 Hispanic or Latinx). We strived to use all available data in the subsequent analyses. However, a small subset of participants (n = 5) exhibited a behavioral profile that suggested possible non-compliance with the task instructions (e.g., almost always choosing the high-effort option; average subjective value > 1, or in other words, a reverse discounting pattern). As such, we performed additional analyses both with and without the excluded participant(s) and report both sets of values.
Bayesian linear mixed effect models were fit with the package brms (version 2.16.1; Bürkner, 2017, 2018) in R version 4.1.0 (RRID:SCR_001905; R Core Team, 2018) to estimate the effects of task load, domain (e.g., working memory, speech comprehension), and performance variables (N-Back: hit rate, correct rejection rate, mean RT; speech: intelligibility) on participants' discounting behaviors. Additional analyses estimated the effects of task load and domain on participants’ self-reported ratings of mental demand, effort, and frustration. In all models, task, domain, and performance variables were entered as fixed effects, with a random effect of intercept. Further, we used the default prior distributions in brms for each of the fixed effects (i.e., flat prior; central t-distribution, df = 3) and default number of iterations (4000) for each of these models, providing an estimate equivalent to maximum likelihood approaches used in multilevel modeling (using the package lme4; Bates et al., 2015). In the reported results, we provide the beta estimate (i.e., mean of the posterior distribution), the 95% credible intervals, standard deviation of the posterior distribution (i.e., error), and a Bayesian approximation of R² (for more information, see Gelman et al., 2018).
The main variable of interest for our analysis was the subjective value (i.e., cost) of cognitive effort. The subjective value was calculated using each participant’s responses during the discounting procedure; as an overview, participants made repeated choices between high- and low-effort tasks, each at equal offer amounts at fixed values ($2, $3, $4), and the monetary values of the chosen option (either high- or low-effort task) were then stepwise titrated across a series of 5 calibration trials, with each trial in the series utilizing the participant’s prior responses to set the current value. The value of the titrated reward at the end of the calibration series provided the indifference point (i.e., the value at which the participant was equally likely to choose either the low- or high-effort option) for a given amount and task load pairing. For task choices following trials in which participants initially chose the low-effort option (i.e., discounting the high-effort option), each indifference point was divided by the corresponding monetary value of the high-effort option (either $2, $3, or $4) to summarize the subjective value of engaging in cognitive effort, a positive value ranging from 0 to 1. If participants initially chose the high-effort option when presented with equal monetary rewards for performing the high- or low-effort task (i.e., discounting the low-effort option), we subtracted the indifference point from the fixed monetary reward amount and divided by the value of the fixed monetary reward. We transformed all subjective value estimates in which participants initially chose the high-effort option by adding 1 to the estimate, such that the subjective value estimate ranged from 0 to 2; values > 1 indicate preferences for higher effort tasks, whereas values < 1 indicate preference for the easy task.
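The two-branch calculation above can be written compactly as follows. This is a sketch of the arithmetic as described in the text, with a hypothetical function name and argument names; it is not the authors' analysis code.

```python
def subjective_value(indifference_point: float, amount: float,
                     chose_high_effort_first: bool) -> float:
    """Convert an indifference point into a subjective value (0 to 2).

    amount: the fixed offer for the pairing ($2, $3, or $4).
    chose_high_effort_first: True when the participant picked the
        high-effort task on the initial equal-offer trial.
    """
    if not chose_high_effort_first:
        # Low-effort option was titrated: SV falls between 0 and 1.
        return indifference_point / amount
    # High-effort option was titrated: proportion of the fixed amount
    # forgone, shifted by 1 so that SV falls between 1 and 2.
    return 1 + (amount - indifference_point) / amount


print(subjective_value(1.0, 2.0, False))  # 0.5
print(subjective_value(1.5, 2.0, True))   # 1.25
```

In the first call, a participant indifferent between $1 for the easy task and $2 for the hard task has a subjective value of 0.5; in the second, a participant who accepts as little as $1.50 for the hard task over $2 for the easy task has a subjective value of 1.25.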
The initial stage of analyses was to examine the subjective value estimates in each domain, in order to evaluate the effect of reward amount and task load factors. Additionally, we examined the effect of these factors on self-reported ratings of mental demand, effort, and frustration. In the first test of our hypothesis, we measured the zero-order correlation between cognitive effort discounting, estimated separately from the working memory and speech comprehension domains. For this analysis, we first calculated the average subjective value across all task conditions (3 monetary reward amounts × 3 task load levels) for each participant in each domain, then using the Correlation package in R (Makowski et al., 2020), which implements Bayesian correlations using the package bayestestR (Makowski et al., 2019), we correlated those two subjective value estimates with each other. An uninformed prior was used for this analysis, Cauchy distribution (µ = 0, r = √2/2). We report the correlation value as the median of the posterior distribution, in addition to the 95% credible intervals. Further, we report the Bayes factor, which contrasts the strength of the experimental model (i.e., correlation between effort costs across domains) relative to the null hypothesis (i.e., no correlation between effort costs across domains). This analysis served to replicate the initial finding in our pilot sample, which showed a strong association between the subjective value of cognitive effort across working memory and speech comprehension domains.
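The averaging and correlation steps can be sketched as follows. The reported analysis is a Bayesian correlation with a Cauchy prior; the plain Pearson coefficient below is only a simplified stand-in that shows the data flow, and the helper names and example values are hypothetical.

```python
from math import sqrt
from statistics import mean


def participant_means(sv_by_participant: dict) -> dict:
    """Average each participant's subjective values across the
    9 conditions (3 reward amounts x 3 task load levels)."""
    return {pid: mean(values) for pid, values in sv_by_participant.items()}


def pearson_r(x: list, y: list) -> float:
    """Plain Pearson correlation (stand-in for the Bayesian estimate)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: three participants, truncated to two conditions each.
wm = participant_means({"p1": [0.4, 0.6], "p2": [0.7, 0.9], "p3": [0.2, 0.4]})
speech = participant_means({"p1": [0.5, 0.5], "p2": [0.9, 0.7], "p3": [0.4, 0.2]})
ids = sorted(wm)
r = pearson_r([wm[i] for i in ids], [speech[i] for i in ids])
```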
For the second test of our hypothesis, we first statistically controlled for task load and performance in each respective domain prior to computing the correlation between cognitive effort discounting in working memory and speech comprehension domains. To accomplish this, we entered task-level and relevant task performance variables (N-Back: hit rate, correct rejection rate, mean RT; speech: intelligibility) as covariates in a multilevel model predicting subjective value in each domain separately using the package brms in R (Bürkner, 2017, 2018); the averaged residuals from each participant within each domain were then correlated with each other using the same uninformed prior distribution as detailed above in order to quantify the strength of the relationship between effort discounting across domains (for model specifications, see Additional file 2). This analysis helped to ensure that we were accounting for task-specific variables, such as performance, that could influence the subjective value of cognitive effort across domains.
To extend the results of our pilot study, we then conducted a third test of our hypothesis, by additionally controlling for the influences of trait-level individual differences in working memory capacity and reward sensitivity when examining the association between the subjective value of cognitive effort across working memory and speech comprehension domains. For working memory capacity, we created a composite score, for which we summed the z-scores from the total score from each working memory measure (L-span, O-Span, Sym-Span). Reward sensitivity was calculated by summing the z-scores obtained in each reward sensitivity measure (BAS total score, GRAPES reward expectancy score, and the SPSRQ reward sensitivity score). The first step of the analysis was to examine the distributions and zero-order correlations involving these composite variables, as well as their association with the two subjective value estimates. Next, the two composite variables (working memory capacity, reward sensitivity) were included as covariates in a partial correlation analysis that used the cognitive effort discounting residual scores estimated for the second stage of analysis. We used the same uninformed prior distribution as detailed above, to measure the strength of the relationship between the subjective value (i.e., costs) of cognitive effort between working memory and speech comprehension domains, when controlling for the two individual difference measures.
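The composite scores can be illustrated with a short sketch that z-scores each measure across participants and sums the z-scores per participant. The function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which denominator was used.

```python
from statistics import mean, stdev


def z_scores(scores: list) -> list:
    """Standardize one measure across participants.
    ASSUMPTION: sample standard deviation (n - 1 denominator)."""
    m, s = mean(scores), stdev(scores)
    return [(x - m) / s for x in scores]


def composite(*measures: list) -> list:
    """Sum z-scores across measures, one composite per participant.
    Each measure is a list ordered by participant."""
    standardized = [z_scores(m) for m in measures]
    return [sum(per_participant) for per_participant in zip(*standardized)]


# Hypothetical totals for three participants on three span tasks.
l_span = [1, 2, 3]
o_span = [10, 20, 30]
sym_span = [2, 4, 6]
print(composite(l_span, o_span, sym_span))  # [-3.0, 0.0, 3.0]
```

The same helper would apply to the reward sensitivity composite by passing the BAS, GRAPES, and SPSRQ scores instead.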
This third stage of analysis was critical for determining whether the data provided support for a domain-general motivational construct that reflects the costs of cognitive effort, controlling for other relevant processes. We hypothesized that if this relationship existed, it would suggest that cognitive motivation can be indexed as a trait-like measure, such that measuring the subjective value of cognitive effort in one domain (e.g., working memory), would predict that an individual exhibits similar behavior in other cognitive domains. In contrast, if we found that the first hypothesis (a correlation between indifference points across the two effort discounting tasks) was confirmed, but the second hypothesis (a persistent correlation with added covariates) was disconfirmed, we would have concluded that cognitive motivation is domain specific. In other words, it is an individual’s working memory capacity and/or reward sensitivity that accounts for the relationship between the costs of cognitive effort across multiple cognitive (working memory, speech) domains. If the results from the third stage of analysis were inconclusive, we would decide that we could not draw firm conclusions regarding whether trait-level individual differences, such as working memory capacity and reward sensitivity, can account for the relationship between the subjective costs of cognitive effort across domains.