Method
Participants
Participants were 212 students at Colorado State University (CSU) who received course credit in exchange for their participation. This sample size gives us 80% power to find statistical significance at α = 0.05 for a one-sample t-test with an effect size of d = 0.19 and also 80% power to find statistical significance for a correlation of r = 0.19. The sample was 71.1% female, 84.3% White, 82.9% non-Hispanic, and had a mean age of 19.2 years (SD = 1.54). The study had the approval of the Colorado State University Institutional Review Board.
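The sensitivity figures above can be checked approximately by hand. The following is a minimal sketch using a normal approximation (and the Fisher z transformation for the correlation), not the procedure actually used for the reported power analysis; the function names are hypothetical.

```python
from math import sqrt, tanh
from statistics import NormalDist


def min_detectable_d(n, alpha=0.05, power=0.80):
    """Smallest one-sample effect size d detectable with the given power
    (two-sided test, normal approximation to the t distribution)."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / sqrt(n)


def min_detectable_r(n, alpha=0.05, power=0.80):
    """Smallest correlation r detectable with the given power
    (two-sided test, Fisher z approximation)."""
    z = NormalDist()
    return tanh((z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / sqrt(n - 3))


d_min = min_detectable_d(212)
r_min = min_detectable_r(212)
print(round(d_min, 2), round(r_min, 2))
```

With n = 212, both values round to 0.19 under these approximations, in line with the figures stated above.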
Procedure
Participants completed the gun perception task and a variety of individual differences measures.
Gun perception task
The first task participants completed was the gun perception task. For this task, participants stood approximately 5 feet away from a 55″ liquid crystal display television screen and responded to a series of images. The images contained a White person, dressed in black and wearing a black ski mask, who was holding either a black gun pointed at the camera or a white shoe held perpendicular to the line of sight to the camera (see Fig. 1). The images were taken from screen shots of movies filmed in 19 different locations. For each location, one movie was made with the gun and another with the shoe, and the screen shots were selected so that the scene and the man’s stance were as similar as possible between the two images. The 38 images were the same as those used in the previous study (Witt and Brockmole 2012).
Participants completed two blocks of trials, which varied only in the object they used to respond. For one block, participants held a Wii gun (21.6 cm length, 3.8 cm width, 12.7 cm height). For the other block, participants held a spatula (34.3 cm length, 10 cm width, 3.1 cm height, Oxo brand). Motion tracking markers were affixed to each object to track the trajectories of the participants’ movements (see Fig. 2). Movements were measured using a Vicon 3D motion tracking camera system (Vicon Nexus 2.0, Vicon Bodybuilder 3.6.1, and Bonita 10 Camera) that tracked the location of the markers on each object in X, Y, and Z coordinates. The Z-coordinate corresponds to up and down and was the coordinate used to determine movements (see the “Data preprocessing and planned analyses” section). The first object held was counterbalanced across participants.
At the start of each trial, the participant placed the held object (gun or spatula) onto a wireless mouse that was affixed to a music stand (see Fig. 3). Depressing the mouse button was necessary to start the trial. An initial series of practice trials used arrows to help participants become familiar with the task. Upon pressing the mouse button, an up or down arrow would appear, and participants would move their object off the mouse and point up or down as indicated by the arrow. They were instructed to initiate their response only once they knew which direction to move and to make the movement in one smooth motion. Participants were reminded of this instruction throughout the experiment as needed. Participants completed 12 practice trials, 6 in each direction, and order was randomized.
For the test trials, text on the screen said “Get ready” to instruct participants to place the object onto the mouse. Once the mouse was depressed, a blank screen with a fixation cross was presented for a randomly selected delay between 500 and 1000 ms. The delay duration was randomized to minimize anticipatory responses. One of the test images was then presented. As displayed on the screen, it measured 57.5 × 96.8 cm. The image remained on the screen for 850 ms or until participants released the mouse button. Participants moved up when they saw a gun and moved down when they saw a shoe. Then a blank screen with three asterisks was presented until the research assistant coded the movement. The research assistant coded movements as 1 for up, 2 for down, and 3 for an error such as slipping off the mouse or reversing directions. The next trial then began. Due to a programming error, the experimenter’s entry was not recorded for the first group of participants. For analysis, we used the motion tracking data unless they were missing because the cameras were unable to detect the markers, in which case we used the experimenter’s entry instead.
Participants completed two blocks of trials, one for each held object. Order of held object was counterbalanced. Each block contained 152 trials, which included two presentations of each of the 38 images and of a mirror-reversal version of each. Order within block was randomized.
Measures of individual differences
Participants then completed several self-report measures of individual factors via laptop computer and the Qualtrics survey platform (Qualtrics Inc 2019). First, participants indicated how often they use a gun. Possible responses were ‘never’, ‘less than once a year’, ‘1 to 5 times per year’, ‘6 to 12 times per year’, and ‘13 or more times per year’. The frequency was later dichotomized into ‘never’ and ‘at least once’ for the purposes of analysis. Participants were also asked whether they play gun-oriented video games 4 or more hours per week, and responded with either a yes or no. Afterward, participants completed the Gun Attitudes Scale (Tenhundfeld et al. 2020), which measured a general attitude toward gun use. Next, they completed an assessment of the Big Five Personality Traits (John and Srivastava 1999), which measured extroversion, conscientiousness, agreeableness, neuroticism, and openness to experience. The Sensation Seeking Personality Type (SSPT) scale (Conner 2019) was used to assess sensation seeking. The SSPT contains two subscales: experience seeking, which measures an individual’s propensity for seeking novel experiences, and risk seeking, which measures an individual’s willingness to take risks. The UPPS + P scale (Lynam et al. 2006; Whiteside and Lynam 2001) was administered to measure impulsivity-related traits including positive urgency, negative urgency, lack of perseverance, and lack of premeditation. Emotion dysregulation was measured using the Difficulty with Emotion Regulation Scale (DERS) (Gratz and Roemer 2004), which assesses various aspects of emotion regulation, including emotional clarity, emotional awareness, emotionally driven impulsivity, emotional non-acceptance, difficulty engaging in goal-directed behavior while emotionally distressed, and access to emotion regulation strategies.
Finally, participants completed the Levenson Multidimensional Locus of Control Scales (Levenson 1973), which measured a participant’s locus of control related to Internal Locus of Control, Powerful Others, and Chance.
Following the surveys, participants completed the Stop Signal Task, a behavioral measure known to correlate with inhibition and impulsive responding (Lappin and Eriksen 1966; Li et al. 2006). During this task, participants were shown a large, gray box on the computer monitor. In the box, a target moved from one side of the screen to the other. Participants were instructed to click on the target before it reached the other side. On 20% of the trials a whistle sounded at varying delays (50, 150, 250 and 350 ms) after trial initiation. On such trials the participant was instructed to inhibit the click response. A total of 120 trials (24 stop signal, 96 no signal) were included in each session. The percent correct responses were calculated for trials without the stop signal, and for each stop signal delay. Percent correct is inversely related to impulsivity.
Data preprocessing and planned analyses
For each participant, there were 3 data files: two motion tracking files that coded the x, y, and z-coordinates for each object held (spatula and gun) and one file that coded the stimuli, response time (RT), and experimenter’s coding of the movement. We used custom code to align the three data files. Alignment could not be achieved for 19 participants due to noise or issues with the motion tracking data. For 9 of these participants, hand-coded responses were available and used; data were excluded for the remaining 10 participants. Motion tracking data were missing for one or both conditions for 20 participants, so their data were excluded. For the 182 participants in the final analysis, trials with no movement were excluded; these comprised less than 0.5% of the data.
The next step was to classify the movement on each trial. Movements were classified based on the vertical position of the object during each trial. Thresholds for moving up and for moving down were calculated as 30% above and below the median location. Trials with movements that did not reach either threshold were categorized as no movement trials and excluded. Trials with movements that reached both thresholds were categorized as reversals and coded as errors. Experimenter entry was used to verify coded motion tracking data and resolve any discrepancies.
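The classification rule described above can be sketched as follows. This is an illustration only, not the custom code used in the study: it assumes the thresholds are the median vertical position multiplied by 1.30 and 0.70, that vertical positions are positive, and the function and label names are hypothetical.

```python
def classify_trial(z_samples, median_z, margin=0.30):
    """Classify one trial from its vertical (Z) positions.

    Assumption: the up/down thresholds are `margin` (30%) above and below
    the median location, with median_z > 0.
    """
    upper = median_z * (1 + margin)
    lower = median_z * (1 - margin)
    reached_up = any(z >= upper for z in z_samples)
    reached_down = any(z <= lower for z in z_samples)
    if reached_up and reached_down:
        return "reversal"      # both thresholds reached: coded as an error
    if reached_up:
        return "up"            # response indicating a gun
    if reached_down:
        return "down"          # response indicating a shoe
    return "no_movement"       # neither threshold reached: trial excluded


print(classify_trial([100, 110, 135], median_z=100))
print(classify_trial([100, 135, 65], median_z=100))
```

Under these assumptions, the first call is classified as an upward movement and the second as a reversal.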
Data analyses were done in R (R Core Team 2017). A critical question was whether there was a difference in responses when holding a gun versus holding a spatula. Scores were computed for each of the hold conditions (gun and spatula) and analyzed with a paired-samples t-test. Another critical question was whether the gun embodiment effect correlated with other individual differences, and these data were analyzed with Pearson correlations. For both the t-tests and the correlations, we calculated Bayes factors (BFs) using the R BayesFactor package with the default Cauchy prior (Morey et al. 2014). BFs are presented as the Bayes factor in favor of the alternative (BF10), so values greater than 1 are evidence in favor of the alternative over the null hypothesis, with values greater than 3 and greater than 10 constituting substantial and strong evidence, respectively (Jefferys 1990; Lakens 2016). Values less than 1 are evidence in favor of the null over the alternative hypothesis, and values less than 0.33 and less than 0.10 are considered substantial and strong evidence for the null hypothesis over the alternative, respectively. Cohen’s dz was calculated using the cohensD function in the lsr package (Navarro 2015).
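To make the quantities above concrete, the following sketch computes a paired-samples t statistic and Cohen’s dz from difference scores and applies the stated Bayes factor cutoffs. It is written in Python rather than R, does not compute Bayes factors, and uses made-up scores; the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_and_dz(cond_a, cond_b):
    """Return (t, df, dz) for paired scores from two conditions."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    m, s = mean(diffs), stdev(diffs)
    dz = m / s                 # Cohen's dz: mean difference / SD of differences
    t = m / (s / sqrt(n))      # equivalently, dz * sqrt(n)
    return t, n - 1, dz


def describe_bf10(bf10):
    """Label a BF10 value using the cutoffs stated in the text."""
    if bf10 > 10:
        return "strong evidence for the alternative"
    if bf10 > 3:
        return "substantial evidence for the alternative"
    if bf10 < 0.10:
        return "strong evidence for the null"
    if bf10 < 0.33:
        return "substantial evidence for the null"
    return "inconclusive"


# Hypothetical scores for five participants (not data from the study).
hold_gun = [5, 7, 6, 9, 8]
hold_spatula = [4, 5, 6, 6, 7]
t, df, dz = paired_t_and_dz(hold_gun, hold_spatula)
print(round(t, 2), df, round(dz, 2), describe_bf10(14.98))
```

Because t = dz × √n for paired data, a reported t and sample size can be used to check a reported dz.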
Multiverse analysis plan
To reduce the influence of experimenter degrees of freedom, we conducted a multiverse analysis (Steegen et al. 2016). For a multiverse analysis, the same test (e.g., the paired-samples t-test) is calculated many times, once for each possible variation, such as variations in the dependent measures used or variations in outlier exclusion. For the current study, there were four forks in the “garden of forking paths” (Gelman and Loken 2013). One fork concerned the dependent measure, which could be accuracy, signal detection measures of A′ and B″, or reaction time. The other 3 forks concerned outlier exclusion. For reaction times, outliers can be determined on the basis of raw reaction times, mean reaction times, or differences in reaction times. For other measures, outliers can be determined on the basis of mean scores or differences in scores. At each fork, we conducted a multiverse with three criteria: no outliers excluded, outliers beyond 3 times the interquartile range (IQR), and outliers beyond 1.5 times the IQR. The multiverse analysis showed consistent patterns regardless of the outlier criteria selected (Additional file 1). This suggests the effects (or lack thereof) are robust to outlier exclusion. Thus, for ease of presentation, only one path for each dependent measure is presented in the Results section. This path used the criteria of excluding RTs that were beyond 1.5 times the subject- and stimulus-specific IQR and excluding both mean scores and difference scores that were beyond 1.5 times the group’s IQR for each dependent measure assessed. Because this criterion was applied based on the specific measure being analyzed, different participants were excluded, and the degrees of freedom differed across the various analyses.
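The three outlier criteria can be illustrated with a short sketch. It assumes that “beyond k times the IQR” means below Q1 − k × IQR or above Q3 + k × IQR, which is a common convention but is not spelled out above; the quartile method, function names, and reaction times are likewise illustrative.

```python
from statistics import quantiles


def iqr_bounds(values, k):
    """Lower and upper fences at k times the IQR beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def exclude_outliers(values, k=None):
    """Apply one multiverse criterion: k=None keeps everything,
    k=3 or k=1.5 drops values outside the corresponding fences."""
    if k is None:
        return list(values)
    lo, hi = iqr_bounds(values, k)
    return [v for v in values if lo <= v <= hi]


# Hypothetical reaction times in ms (not data from the study).
rts = [400, 420, 430, 450, 460, 480, 500, 700, 1500]
for criterion in (None, 3, 1.5):
    print(criterion, len(exclude_outliers(rts, criterion)))
```

In this example the stricter 1.5 × IQR criterion removes two values, the 3 × IQR criterion removes one, and the first criterion removes none, which is the kind of variation the multiverse is meant to expose.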
Results
The analyses are split into two sections that coincide with the two aims: replicating the gun embodiment effect and determining whether individual differences moderate the effect.
Gun embodiment effect
The gun embodiment effect was originally defined as the bias to report the presence of a gun more often when holding a gun than when holding a neutral object. To quantify this bias, we calculated the nonparametric signal detection theory measures of A′ and B″. A′ provides a nonparametric measure of discriminability, which refers to the ability to distinguish when a gun versus a shoe is present. B″ provides a nonparametric measure of bias, which refers to the tendency to report that a gun (or a shoe) is present. For each participant for each hold condition, we calculated hit rates and false alarm rates based on the proportion correct scores (see Fig. 4). Hits refer to when a gun was present and the participant reported that a gun was shown in the picture (i.e., an up movement). False alarms refer to when a shoe was present but the participant incorrectly reported that a gun was shown. From the hit and false alarm rates, we calculated A′ and B″ scores based on formulas in Stanislaw and Todorov (1999). B″ scores could not be computed for 4 participants due to perfect performance (100% hits and 0% false alarms) in one of the conditions.
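For reference, the nonparametric measures can be sketched as below, following the A′ and B″ formulas as we understand them from Stanislaw and Todorov (1999). The function names and the choice to return None when B″ is undefined are illustrative, not taken from the analysis code.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def a_prime(h, f):
    """Nonparametric discriminability from hit rate h and false alarm rate f."""
    if h == f:
        return 0.5  # chance-level discrimination
    diff = h - f
    return 0.5 + sign(diff) * (diff ** 2 + abs(diff)) / (4 * max(h, f) - 4 * h * f)


def b_double_prime(h, f):
    """Nonparametric bias; undefined (None) when the denominator is zero,
    e.g., with 100% hits and 0% false alarms."""
    denom = h * (1 - h) + f * (1 - f)
    if denom == 0:
        return None
    return sign(h - f) * (h * (1 - h) - f * (1 - f)) / denom


print(round(a_prime(0.9, 0.2), 3), round(b_double_prime(0.9, 0.2), 2))
print(b_double_prime(1.0, 0.0))
```

The second call shows why B″ could not be computed for participants with perfect performance: both terms in the denominator are zero.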
We conducted a paired-samples t-test between the B″ scores when holding the gun and the B″ scores when holding the spatula. The effect was not statistically significant, and the Bayes factor showed 7 times more support for the null hypothesis than the alternative, t(146) = 0.95, p = 0.34, BF = 0.14, dz = 0.08, 95% CI [− 0.08, 0.24]. The original paper (Witt and Brockmole 2012) reported a significant difference in B″ scores between holding a gun versus a neutral object. The current analysis showed a clear failure to replicate this effect (see Fig. 5).
The original research showed a gun embodiment effect on the bias scores but no effect on reaction times. In contrast, the current study showed no effect on the bias scores but an effect on reaction times (RTs). We calculated mean RTs for each participant for each hold condition and for each stimulus condition, and then calculated the difference in mean RTs to respond to a gun compared with a shoe for each hold condition. These two difference scores were analyzed with a paired-samples t-test to determine whether the speed to respond to a gun versus a shoe differed when holding a gun versus a spatula. The effect was statistically significant, and the Bayes factor showed over 10 times more support for the alternative than the null hypothesis, t(159) = 3.29, p = 0.001, BF = 14.98, dz = 0.26, 95% CI [0.10, 0.42], Mdiff = 6 ms, 95% CI [2, 10 ms], 59% of participants showed a positive effect (see Figs. 6, 7).
The data also showed a significant gun embodiment effect on accuracy (see Fig. 4). Accuracy was calculated as the difference score in the proportion of correct responses when seeing a gun minus when seeing a shoe for both the hold gun and hold spatula conditions. This difference score was larger when holding a gun than when holding a spatula, t(142) = 3.07, p = 0.003, BF = 8.19, dz = 0.26, 95% CI [0.09, 0.42], Mdiff = 1.4%, 95% CI [0.5, 2.3%], 58% of participants showed a positive effect.
To summarize the results thus far, the gun embodiment effect was present. However, the gun embodiment effect did not express itself in bias scores, as had been found in the original research (Witt and Brockmole 2012). Instead, the gun embodiment effect expressed itself in reaction time and in accuracy. Regardless of how the gun embodiment effect was measured, the effect was small (dz = 0.26 for both measures). To achieve 80% power to obtain a p value < 0.05 for an effect of dz = 0.26, an experiment would need approximately 118 participants (or 93 participants with a one-sided t-test). That power was achieved in the current study, so we next assessed whether any individual differences moderated the gun embodiment effect.
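The sample sizes quoted above can be approximated with a standard formula. The sketch below uses a normal approximation plus a common small-sample correction term for the t-test; it is not the calculation used for the reported values, and the function name is hypothetical.

```python
from statistics import NormalDist


def required_n(dz, alpha=0.05, power=0.80, two_sided=True):
    """Approximate number of participants for a paired (one-sample) t-test.

    Normal approximation, (z_alpha + z_beta)^2 / dz^2, plus the correction
    term z_alpha^2 / 2 that accounts for using t rather than z.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)
    return ((z_alpha + z_beta) / dz) ** 2 + z_alpha ** 2 / 2


n_two_sided = required_n(0.26)
n_one_sided = required_n(0.26, two_sided=False)
print(round(n_two_sided), round(n_one_sided))
```

Under this approximation the two-sided and one-sided requirements come out at about 118 and 93 participants, respectively.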
Individual differences as moderators of the gun embodiment effect
The next research question pertained to the universal nature of the gun embodiment effect. If no individual differences measures moderate the magnitude of the gun embodiment effect, this would be evidence for a universal and fixed effect. In contrast, if some individual differences cause the gun embodiment effect to amplify, or to be eliminated, this would suggest the effect is malleable and flexible.
Before exploring moderators, we had to select which dependent measure to use to quantify the gun embodiment effect. To do individual differences research, it is necessary that a measure have high reliability. Also, even if a measure does not show a main effect, it can still show a correlation (Miller and Schwarz 2018). We evaluated the reliability of the gun embodiment effect as measured with reaction times, accuracy, and bias scores. For each score, we calculated the relevant difference score on the odd trials and on the even trials (split based on condition) and calculated the correlation between the two. Only RTs showed high reliability (see Fig. 8). Thus, all subsequent analysis involved only the gun embodiment effect as measured with RTs.
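The odd-even reliability check can be sketched as follows. The data layout, condition labels, and sign convention of the difference score are assumptions made for illustration, and the example values are invented rather than taken from the study.

```python
from math import sqrt
from statistics import mean

CONDITIONS = ("hold_gun_see_gun", "hold_gun_see_shoe",
              "hold_spatula_see_gun", "hold_spatula_see_shoe")


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def effect_score(rts, start):
    """Gun embodiment difference score from every other trial per condition.

    Assumed sign convention: (gun - shoe | hold gun) - (gun - shoe | hold spatula).
    """
    m = {c: mean(rts[c][start::2]) for c in CONDITIONS}
    return ((m["hold_gun_see_gun"] - m["hold_gun_see_shoe"])
            - (m["hold_spatula_see_gun"] - m["hold_spatula_see_shoe"]))


def split_half_reliability(participants):
    """Correlate the effect computed on one half of trials with the other half."""
    first = [effect_score(p, 0) for p in participants]
    second = [effect_score(p, 1) for p in participants]
    return pearson_r(first, second)


# Hypothetical participants whose effect is perfectly stable across halves.
participants = [
    {"hold_gun_see_gun": [500 + 10 * i] * 4, "hold_gun_see_shoe": [500] * 4,
     "hold_spatula_see_gun": [500] * 4, "hold_spatula_see_shoe": [500] * 4}
    for i in range(3)
]
reliability = split_half_reliability(participants)
print(round(reliability, 2))
```

Splitting within each condition keeps the two halves balanced across the four cells of the design, which appears to be the intent of splitting “based on condition”.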
We calculated the correlation between the gun embodiment effect (measured with RT difference scores) and each of the individual differences measures. Overall, there were few correlations of any notable magnitude (see Fig. 9).
The largest effect was the correlation between prior gun experience and the gun embodiment effect, r = − 0.25, p = 0.002, BF = 18. Given that gun experience was coded into binary categories of never and at least once, we ran a t-test. A two-sample t-test revealed that gun experience affected the gun embodiment effect, t = 3.26, df = 133.62, p = 0.001, BF = 15, d = 0.53, 95% CI [0.19, 0.87]. As shown in Fig. 10, the gun embodiment effect was present for people who had never used a gun, t = 4.57, df = 91, p < 0.001, BF > 1000, d = 0.48, 95% CI [0.26, 0.69], but the gun embodiment effect was not present for people who had used a gun, t = − 0.19, df = 56, p = 0.84, BF = 0.15, d = 0.03, 95% CI [− 0.23, 0.28].
That gun experience modulates the gun embodiment effect is direct evidence against the idea that the gun embodiment effect is universal and fixed. Instead, the result suggests the effect is malleable and can be modified with prior experience. However, this conclusion is not fully supported by the data. When we looked at the gun embodiment effect as measured with accuracy scores, people with prior gun experience showed some tendency toward the gun embodiment effect, t = 1.92, df = 56, p = 0.060, BF = 0.80, d = 0.25, 95% CI [− 0.01, 0.51] (see Fig. 11). This hint of an effect with accuracy scores casts doubt on whether people with prior gun experience are truly immune to the gun embodiment effect. We return to this issue in Experiment 2.
Regarding the other correlations, only one other correlation was close to significance. This was the correlation between extroversion and the gun embodiment effect, as measured with reaction time, r = 0.22, p = 0.006, df = 146, BF = 7 (see Fig. 12). Bonferroni correction for multiple comparisons would set the alpha equal to 0.002 (0.05/25 comparisons), in which case, this result would not be deemed significant. Bonferroni is perhaps overly conservative, although the same conclusion of failing to achieve significance resulted from Benjamini–Hochberg correction as well. However, the Bayes factor showed the data provided substantial support for the alternative over the null hypothesis. Together, the conclusions are mixed. We think it best to treat this finding as preliminary and a possible avenue for future exploration but not definitive evidence for modulation of the gun embodiment effect.
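The two corrections discussed above can be sketched as follows. The p values in the example are invented for illustration and are not the 25 p values from the study; the function names are hypothetical.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject each hypothesis whose p value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p value is at most (k / m) * alpha.
    max_k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below that k.
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_k:
            reject[i] = True
    return reject


# Hypothetical set of 25 p values (not the study's results).
p_values = [0.0005, 0.006] + [0.10 + 0.03 * i for i in range(23)]
bonf = bonferroni_reject(p_values)
bh = benjamini_hochberg_reject(p_values)
print(bonf[:2], bh[:2])
```

In this example a p value of 0.006 survives neither correction, because the Benjamini-Hochberg threshold for the second-smallest of 25 p values is 0.004.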
None of the other factors showed significant correlations with the gun embodiment effect. In a few cases, the correlation could be present, but we did not have enough power to detect it. We had approximately 80% power to detect a correlation of r = 0.23, so perhaps some of the correlations were smaller than this value (see Fig. 9).
In other cases, there was substantial evidence from Bayes factors that no correlation existed between the measure and the gun embodiment score. Measures with Bayes factors less than 0.33 (and thus with substantial evidence for the null hypothesis over the alternative hypothesis) included 5 of the 6 measures of the Difficulty with Emotion Regulation Scale (DERS; Gratz and Roemer 2004), 3 of the Big Five personality measures (agreeableness, openness, and conscientiousness), the 3 locus of control measures (Levenson 1973), and the 4 measures of impulsivity-related traits (Lynam et al. 2006; Whiteside and Lynam 2001). These items were some of our best guesses as to factors that would moderate the gun embodiment effect. That so few correlations emerged, and that so many even had substantial evidence for the null hypothesis, speaks to the robustness and inflexibility of the gun embodiment effect, even if the effect is small.
Discussion
The goal of Experiment 1 was first to attempt to replicate the gun embodiment effect and second to determine whether any individual differences modulate the effect. The data showed strong support for the gun embodiment effect. However, in contrast to the previously published research (Witt and Brockmole 2012), the current results show the effect is small (dz = 0.26) and the gun embodiment effect is more likely to be expressed in reaction time or accuracy rather than the nonparametric signal detection measure of bias.
Regarding the second goal, we conducted 25 tests for modulation. Two of these tests suggested modulation. One was the personality measure of extroversion. We had predicted that people higher in extroversion would show a stronger gun embodiment effect because they are more likely to make rash decisions and to act without forethought. However, the p value did not reach statistical significance after Bonferroni correction. The other effect related to prior gun experience. When measured using reaction times, people with prior gun experience did not show a gun embodiment effect. However, there was a hint of an effect when we used accuracy instead. To further explore this issue, we conducted Experiment 2, which used the same gun embodiment task but with a different gun.