The ShadowHunt paradigm used in Patton et al. (2021) places a high load on working memory because participants must track the current and previous locations of both their own ship and the distractor ships. Reducing working memory demand may therefore increase accuracy in detecting the hostile ship. Although history trails have not been examined in this type of paradigm, they are expected to offload into perception the spatial working memory demand of remembering the prior locations of all items. This should free cognitive resources for other aspects of detecting the hostile ship and thereby improve performance.
Patton et al. (2021) found a strong decline in detection accuracy when the hostile ship was hunting at a far distance. Current aids, based mainly on artificial intelligence and highlighting (St. John et al., 2005; see Riverio et al., 2018 for review), tend to fail at far distances (Dahlbom & Nordlund, 2013). Therefore, it is important to specifically examine the impact of history trails at far distances when evaluating their potential for performance enhancement.
Two hypotheses were proposed. First, it was hypothesized that history trails would increase overall detection accuracy for both hunting and shadowing because they would reduce the load on working memory and allow more cognitive resources to be used for detection. Second, as found in Patton et al. (2021), it was hypothesized that the detrimental effect of distance on detection and diagnosis would be larger for hunting than for shadowing. Additionally, given prior work suggesting that distance effects can be important, we were interested in the impact of trails at far distances across both behaviors, although no directional hypothesis was posed.
Methods
Participants
Each participant gave informed consent prior to commencing the experiment. Data were collected from 35 people on Prolific, all of whom were located in the United States. Two datasets were removed due to a combination of below-chance performance, use of fewer than eight of the 35 possible steps on average, and further evidence of inattention in the form of large time lags between interactions with the program.
Task
Participants viewed a computer screen (see Fig. 1) containing a green cross indicating their ship’s position, which they could control, and six white circles with numbers which represented other ships and were controlled by a software application.
On each trial, the starting locations of all ships were randomly generated. The participant’s ship could be moved in one of four directions (up, down, left, or right) by clicking the arrow keys at the bottom of the screen. Each movement produced a small jump by the usership in the chosen direction on the screen. The arrow keys could be clicked only once per second, to prevent rapid keystrokes from creating apparent motion. There was no time limit on when the next movement had to be made. Each movement of the participant’s ship was accompanied by an update of the computer-controlled ships, which, unlike the participant’s ship, were able to move diagonally. Thus, all ships moved at the same time, with at least a one-second delay between movements.
On every trial, one of the computer-controlled ships was randomly selected to act in a hostile fashion. All ships were assigned a number for identification, thus making every ship a potential target. The hostile ship’s movements were contingent on the user’s movements. The hostile ship would do one of two things: hunt or shadow. Hunting meant moving in such a way that it would eventually reach the usership. An algorithm computed which directional movement produced the greatest reduction in distance between the usership and the hostile ship, and moved the hostile ship in that direction as the usership moved. Shadowing aimed to keep a roughly constant distance from the usership by replicating its movements. For instance, if the user moved left, the shadowing ship also moved left. If the usership moved toward the shadowing ship, the shadowing ship moved in the same direction as the user so that the distance between the ships stayed the same. These target movements occurred simultaneously with the usership movement that triggered them.
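The two hostile behaviors described above can be illustrated with a minimal sketch. It is not the software used in the experiment: the unit-step grid, the eight-direction move set, the use of Euclidean distance, and the function names `hunt_step` and `shadow_step` are all assumptions made for illustration.

```python
import math

# Assumed move set: eight unit steps, diagonals included, because the text
# states that computer-controlled ships could move diagonally.
MOVES = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]


def hunt_step(hostile, user):
    """Return the hostile ship's position after one hunting move.

    Chooses the candidate move that leaves the smallest Euclidean distance
    to the user's ship, i.e. the greatest reduction in distance.
    """
    hx, hy = hostile
    candidates = ((hx + dx, hy + dy) for dx, dy in MOVES)
    return min(candidates, key=lambda pos: math.dist(pos, user))


def shadow_step(hostile, user_move):
    """Return the hostile ship's position after one shadowing move.

    Copies the user's displacement so the separation stays constant.
    """
    return (hostile[0] + user_move[0], hostile[1] + user_move[1])
```

Under these assumptions, a hunter closes on the user along the steepest available direction, whereas a shadower preserves the offset between the two ships regardless of where the user moves.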
The other five ships on the display moved independently of the user’s actions and were randomly assigned other movement patterns. Three of the ships moved toward their own fixed target location, coded as an invisible point on the coordinate grid. The other two ships exhibited “patrol” behaviors, in which they moved along a rectangular course that covered either 1/3, 1/2, or 2/3 of the screen. The course could be oriented in any direction, and the ship could start at any point on the path. A passive version of the task in which the viewer is not actively controlling ship movement can be accessed through the files at the link [https://osf.io/vkfdr/].
Movements of all computer-controlled ships contained 25% noise, such that, on average, one out of every four moves deviated from that ship’s programmed behavior. For example, if the hostile ship was shadowing, approximately one out of every four of its moves would not match the usership’s move. On half of the trials, all ships left history trails: dots indicating the ship’s previous nine positions, with lines connecting the dots. The usership’s trail was white, and all computer-controlled ships’ trails were green. Trials with and without history trails were blocked, and block order was randomized. Participants received four blocks of nine trials, two blocks with and two blocks without history trails.
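A minimal sketch of the movement noise and the nine-position history trail follows. The source does not say how an off-pattern move was chosen, so drawing it uniformly from the remaining moves is an assumption, as are the names `noisy_move`, `Ship`, and the eight-direction move set.

```python
import random
from collections import deque

NOISE_RATE = 0.25   # from the text: about one in four moves is off-pattern
TRAIL_LENGTH = 9    # from the text: trails show the previous nine positions

# Assumed move set: eight unit steps, diagonals included.
MOVES = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]


def noisy_move(intended, rng=random):
    """Return the intended move, or with probability NOISE_RATE a different one.

    Assumption: the replacement is drawn uniformly from the other moves.
    """
    if rng.random() < NOISE_RATE:
        return rng.choice([m for m in MOVES if m != intended])
    return intended


class Ship:
    """A ship that remembers only its most recent TRAIL_LENGTH positions."""

    def __init__(self, position):
        self.position = position
        # A bounded deque silently drops the oldest entry once it is full.
        self.trail = deque(maxlen=TRAIL_LENGTH)

    def move(self, delta):
        """Record the current position in the trail, then apply the step."""
        self.trail.append(self.position)
        self.position = (self.position[0] + delta[0], self.position[1] + delta[1])
```

Drawing the trail then amounts to plotting `ship.trail` as connected dots behind `ship.position`; the bounded deque keeps the display limited to nine past positions without any explicit bookkeeping.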
Two initial practice trials demonstrated hunting and shadowing behaviors, with no data collected. Unlike in the experimental trials, on each practice trial the hostile ship was a different color and the hostile behavior was announced when the trial started. This allowed participants to practice working through a scenario but also showed the difference between hostile behaviors.
On each trial, the participant was required to make at least five moves, but no more than 35 moves, in whatever pattern they chose before determining which ship they believed was hostile. Once they made a decision, they clicked an “End” button. The ship display froze, and the participant indicated whether they were being hunted, shadowed, or neither. If they chose hunting or shadowing, the next question asked them to choose which ship was exhibiting that behavior by clicking the radio button that matched the ship number they believed was hostile. They then clicked “submit” and were given feedback only on the correctness of their response, but not on the correct target nor the hostile behavior exhibited on the trial. Ending the trial before 35 moves was at the discretion of the participant. Participants completed 36 trials (18 with and 18 without history trails), which took approximately 45 min.
Results
Overall, participants correctly detected the hostile ship and behavior 62% of the time (chance performance for guessing both the correct ship and type of hostile behavior is 8.3%). Accuracy, operationalized as correct detection of both the hostile ship and its behavior, was the dependent variable for all analyses. Notably, when the correct ship was detected, the correct behavior was also detected 92% of the time. When the correct behavior was chosen, it was assigned to the wrong ship only 16% of the time; conversely, when the wrong ship was chosen, it was assigned the correct behavior 38% of the time. This indicates that ship and behavior detections were closely coupled. Additionally, there was a significant (t(31) = −2.12, p = 0.003, d = 0.55) bias to report shadowing (73%) more than hunting (37%) on error trials.
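The chance level quoted above follows from a simple count, assuming that a guess selects one of the six computer-controlled ships and one of the two hostile behaviors uniformly at random and never selects “neither” (an assumption; the source reports only the resulting figure):

\[
P(\text{correct by chance}) = \frac{1}{6} \times \frac{1}{2} = \frac{1}{12} \approx 8.3\%
\]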
Because starting distance was randomized across trials, participants received very different numbers of trials at each distance, so inferential statistics were not conducted on the full three-way combination of distance, trails, and behavior. Instead, a 2 (history trail) × 2 (behavior) repeated measures ANOVA was conducted on the proportion of correct trials, calculated for each participant for each behavior with and without trails. There was a main effect of history trail (Fig. 2), with small but significant benefits to performance on trials with history trails (66%) compared to those without (57%; F(1,32) = 8.15, p = 0.007, \(\eta_{p}^{2}\) = 0.20). There was no significant main effect of behavior (F(1,32) = 1.52, p = 0.22, \(\eta_{p}^{2}\) = 0.04), nor an interaction between trails and behavior (F(1,32) = 0.81, p = 0.37, \(\eta_{p}^{2}\) = 0.02), consistent with comparable performance benefits from history trails for both types of hostile intent.
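As a reading aid, partial eta squared can be recovered from a reported F ratio and its degrees of freedom using the standard identity \(\eta_{p}^{2} = F \cdot df_{\text{effect}} / (F \cdot df_{\text{effect}} + df_{\text{error}})\). The short sketch below applies it to the history trail main effect; the function name is hypothetical, and this is a consistency check on reported values rather than the analysis pipeline used in the study.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Convert an F ratio and its degrees of freedom to partial eta squared."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Main effect of history trail as reported: F(1,32) = 8.15
trail_effect = partial_eta_squared(8.15, 1, 32)
print(round(trail_effect, 2))  # prints 0.2
```

Because the result depends only on F and the two degrees of freedom, small differences from published values usually reflect rounding of the reported F ratio.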
Based on Patton et al.’s (2021) finding of a highly degrading effect of increasing distance on hunting detection but not on shadowing detection, a planned examination of the impact of history trails under those circumstances was conducted. First, a 2 (behavior) × 4 (starting distance separation quartile) repeated measures ANOVA was conducted. As shown in Fig. 3, there was a main effect of distance (F(3,78) = 10.67, p < 0.001, \(\eta_{p}^{2}\) = 0.29), with worse performance at farther distances, and, as before, no main effect of behavior (F(1,26) = 1.68, p = 0.20, \(\eta_{p}^{2}\) = 0.06). The interaction was significant (F(3,78) = 5.98, p < 0.001, \(\eta_{p}^{2}\) = 0.18), reflecting the minimal degrading effect of distance for shadowing (simple main effect: F(3,96) = 2.18, p = 0.09, \(\eta_{p}^{2}\) = 0.06) compared to the large drop-off for hunting (simple main effect: F(3,78) = 12.28, p < 0.0001, \(\eta_{p}^{2}\) = 0.32), thus replicating the prior findings of Patton et al. (2021) and supporting the second hypothesis. Second, using accuracy for only those participants who encountered all four conditions at the longest distance, a 2 (trails) × 2 (behavior) repeated measures ANOVA produced no hint of an interaction (F(1,21) = 0.025, p = 0.87, \(\eta_{p}^{2}\) = 0.001), but a significant main effect indicating a consistent benefit of trails (F(1,21) = 6.54, p = 0.01, \(\eta_{p}^{2}\) = 0.23).
Speed-accuracy trade-off
There was no difference in the mean number of steps used on trials with history trails (15.5) compared to trials without (15.8; t(34) = −0.51, p = 0.61, d = 0.03). The similarity in steps used combined with differences in accuracy indicates that history trails allowed people to accumulate more diagnostic evidence from the same number of steps. There was no difference in average time spent between steps (M = 2.0 s) with and without history trails.
We examined the speed-accuracy trade-off across participants to assess the extent to which those who accumulated more evidence (more steps) also performed better. This examination revealed a positive correlation of r = 0.21 between the average number of steps (per participant) and mean accuracy.
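A between-participant correlation of this kind can be computed as sketched below. The function name and the two input lists (one mean step count and one mean accuracy per participant) are assumptions for illustration; no data from the study are reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)
```

Calling `pearson_r(mean_steps, mean_accuracy)` with one value per participant in each list would yield the coefficient; a positive value means that participants who took more steps tended to be more accurate.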
Discussion
Our first hypothesis was that history trails would increase accuracy, which was confirmed, although the gains observed were rather modest. Specifically, history trails supported an overall improvement in detection (9 percentage points), including at farther distances, which is important to note because current hostile intention detection aids tend to fail at far distances (Dahlbom & Nordlund, 2013). The findings are therefore consistent with history trails reducing working memory demands. Working memory (WM) is involved in the detection of hostile ships because movements had to be held in WM, then combined to form a trajectory, and then compared to the usership trajectory. We infer that WM decay and capacity limits impair the ability of an operator to hold all of the trajectories in memory, as shown in other studies involving multi-object tracking (e.g., Gao et al., 2019; Harris et al., 2020). With the visual aid of the history trail, trajectories could be perceived rather than remembered and imagined. This approach is congruent with the concept from ecological interface design that replacing memory with perception improves performance (Bennett & Flach, 2013), as well as the idea that a “visual echo” can offset vulnerabilities of working memory (Helleberg & Wickens, 2003). The offloading of trajectories also allowed detection accuracy to improve without a change in the number of steps or evidence accumulated.
However, the size of the benefit derived from the addition of history trails was fairly modest, given that even with trails, accuracy was still only 66%. Thus, the limits on performance are either not purely resulting from working memory, or the remaining demands on working memory in the performance of the task continue to overwhelm its limited capacity even with history trails supporting certain aspects. We return to this issue in Experiment 2.
The second hypothesis, that there would be a degrading effect of distance on hunting but not shadowing (as seen in Patton et al., 2021), was confirmed. The clear decrement in detection of hunting at a distance indicates that something about hunting behavior is qualitatively different from shadowing for the perception of patterns. This could be because a shadowing hostile ship and the usership can be treated as if they are connected by a virtual semi-rigid line, which has been shown to improve tracking performance (Yantis, 1992; see Patton et al., 2021 for further discussion).