Graphical literacy – the ability to read, construct, and interpret visual displays of information – is a critical skill for all citizens, but particularly for those in STEM fields. Everyday STEM tasks require people to judge whether information in a graph supports a claim, identify the cause of a problem on the basis of anomalous data, and extrapolate trends to predict future performance. Although graphs are often perceived as intuitive because they group relevant information together in a visuospatial format (e.g., Larkin & Simon, 1987), graph interpretation requires analysis of the parts of a graph over time, more akin to slowly reading a paragraph than to glancing at a picture (e.g., Carpenter & Shah, 1998). Because of visual capacity limitations, the relations that people extract from graphs may be tightly constrained by the order in which they attend to graphical elements.
Specifically, we propose that extracting a between-value graph comparison (e.g., “Are there more X than Y?”) elicits a serial operation that we refer to as a visual routine (Cavanagh, 2004; Ullman, 1984) in which attention shifts to at least one of the objects in the relation to guide the comparison process. For instance, when judging whether a two-bar graph depicts a specific relation configuration (i.e., [short tall] or [tall short]?), most people either systematically attend to the left bar first or the taller bar first (Michal, Parrot, & Franconeri, Three modes for seeing relations between objects, in preparation). The requirement of using a visual routine to extract such relations is not unique to graphs – people also judge color configurations of objects (e.g., “green circle left of red circle?”) by shifting attention to one of the colored objects (Franconeri, Scimeca, Roth, Helseth, & Kahn, 2012; Holcombe, Linares, & Vaziri-Pashkam, 2011; Yuan, Uttal, & Franconeri, 2016).
These visual routines may be needed because of exceptionally tight capacity limits on these types of visual relation extraction (Logan, 1994, 1995; Wolfe, 1998), with one proposal requiring strictly serial processing of objects within a relation (Franconeri et al., 2012). In order to judge a particular spatial relation between two objects (e.g., “Is the larger object on the left?”), one object has to be designated as the target (e.g., the larger object) and one object has to be designated as the referent (e.g., the smaller object). Although the individual features of multiple objects may initially be available at a global, scene-statistic level, such as knowing there are two sizes, two contrast values, and two locations in a display (e.g., Alvarez, 2011), the visual system must have a mechanism for determining which features belong to the target object and which belong to the referent object, and one likely mechanism is strict spatial isolation of attention across temporal intervals (Franconeri et al., 2012; Treisman & Gelade, 1980; but see Hummel & Biederman, 1992). Thus, when judging whether the left of two objects is the larger object, a visual routine would isolate the two features “larger” and “left” to a single object, making the location of the larger object explicit.
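To make the notion of a strictly serial routine concrete, the following toy sketch (our own illustrative Python, not a model from the cited work; all names are hypothetical) reads features from only one attended object at a time, so that “larger” and “left” become bound only once attention has isolated a single object.

```python
from dataclasses import dataclass

@dataclass
class Bar:
    location: str   # "left" or "right"
    height: float

def larger_is_on_left(bars):
    """Toy serial routine: features are read from one attended object at a
    time, so 'larger' and 'left' are only bound within a single object."""
    # Step 1: shift attention to a target selected by one feature (the taller bar).
    target = max(bars, key=lambda bar: bar.height)
    # Step 2: read the second feature (location) from the attended object only.
    return target.location == "left"

# The taller bar is on the right, so the relation "larger on the left" is false.
print(larger_is_on_left([Bar("left", 2.0), Bar("right", 5.0)]))  # False
```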
Past evidence shows that visual routines occur during graph comprehension (e.g., Michal, Uttal, Shah, & Franconeri, 2016), but converging evidence is needed to show that these routines are instrumental for extracting specific relations. For example, when judging the size configuration of the bars in the top graph of Fig. 1, we claim that looking toward the taller bar first allows the viewer to extract the relation “the taller bar is on the right,” rather than “the shorter bar is on the left.” That claim requires evidence that the type of relation judged affects the way that attention shifts (as reflected in eye movements).
In support of this idea, we recently found that interpretations of magnitude relations depended on which item people attended to first (Michal et al., 2016). Participants were asked to verify whether a two-bar graph matched a question such as “Are there more blueberries than oranges?”, and visual routines that mimicked the linguistic order of the question led to faster responses. Attending first to the relational target (e.g., the blueberry bar) made people more likely to interpret the graph in a way that was aligned with the question (i.e., as “more blueberries than oranges” rather than “fewer oranges than blueberries”), which in turn sped their responses.
Because Michal et al. (2016) asked participants to verify a specific framing of a visual relation (e.g., “blue bar larger than orange bar?”), participants may have been primed to attend to the graph using a sentence-order visual routine. We must therefore additionally show that people naturally interpret graphs in a sentence-like way even when there is no verbal statement to compare the graph to, and when they are free to attend to the graphed values in any temporal order they choose.
One way to test whether visual routines are associated with specific relational comparisons is to ask people to judge relations for which there are multiple ways to compare the two objects – and therefore multiple potential routines. For example, although the bottom graph in Fig. 1 (under the words “until response”) has only two bars, one could compare the bars’ locations, sizes, or contrast values. Each comparison can also be described on the basis of different dimensions, such as “the taller bar is on the right” or “the darker bar is on the left.” Finally, each comparison can be framed differently; for instance, judging that “the taller bar is on the right” is distinct from judging that “the shorter bar is on the left” (e.g., Clark & Chase, 1972). Thus, when there are several possible ways of comparing data points in a graph, it is critical to be able to make the comparison that is relevant to the task at hand. For instance, if a person wanted to verify whether the right bar was larger than the left bar in the bottom graph of Fig. 1, they would need to compare the sizes of the bars and not their contrasts or spatial locations.
We asked participants to judge configurations of a two-bar graph on the basis of size (i.e., “[short tall] or [tall short]?”) and contrast values (i.e., “[light dark] or [dark light]?”). By measuring which bar participants attended to first over a range of display types, we could infer the features or “anchor points” (Couclelis, Golledge, Gale, & Tobler, 1987) that guided their comparisons. If people are systematic about the feature that they attend to first, then it is likely that their interpretation of the graph is driven by that guiding feature. For example, a typical observer judging the top graph in Fig. 1 might use the taller bar as an anchor point when comparing the sizes of the bars, whereas they might use the darker bar as an anchor point when comparing the contrasts (middle graph of Fig. 1). But do these visual routines happen simply because of the relative bottom-up salience of those feature values, or are the saccades guided by the relational decision in a top-down manner? The example in the lowest graph of Fig. 1 pits the two strategies of attending to the taller or darker bar first against each other, because the graph varies in both size and contrast but only one dimension is task-relevant. Thus, if visual routines are associated with specific graph interpretations, then the routine used to compare two data points should be based on whichever dimension is currently relevant for the comparison. A viewer who prefers the darker and the taller bar as anchor points should therefore attend to the left bar when judging contrast but to the right bar when judging size, even though the display itself is identical across the two judgments.
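The logic of this prediction can be sketched as follows (a hypothetical illustration in Python, not analysis code from the study; the layout and preference values are assumptions consistent with the example above): for a fixed display with the darker bar on the left and the taller bar on the right, the predicted target of the first saccade flips with the task-relevant dimension.

```python
# Hypothetical display like the bottom graph of Fig. 1 (assumed layout:
# darker bar on the left, taller bar on the right); contrast is coded so
# that a higher value means a darker bar.
display = {"left":  {"height": 2.0, "contrast": 0.9},
           "right": {"height": 5.0, "contrast": 0.3}}

# Assumed anchor preferences for a viewer who anchors on "tall" and "dark".
anchor_preference = {"size": "taller", "contrast": "darker"}

def predicted_first_saccade(task, display, anchor_preference):
    """Predict which bar attracts the first saccade if the routine is guided
    by the task-relevant dimension (a sketch, not the authors' model)."""
    if task == "size":
        feature = "height"
        pick_max = anchor_preference["size"] == "taller"
    else:  # task == "contrast"
        feature = "contrast"
        pick_max = anchor_preference["contrast"] == "darker"
    chooser = max if pick_max else min
    return chooser(display, key=lambda side: display[side][feature])

print(predicted_first_saccade("size", display, anchor_preference))      # right
print(predicted_first_saccade("contrast", display, anchor_preference))  # left
```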
By measuring which object participants’ first saccades targeted over a range of configurations, we first identified anchor points for each participant while they judged graph configurations on the basis of size (with task-irrelevant contrast kept constant) and contrast (with task-irrelevant size kept constant). After completing these single-dimension tasks, participants completed an orthogonal task in which graphs varied in both size and contrast configurations. Half of the participants judged size configurations while task-irrelevant contrast values varied independently (Experiment 1a), and the other half judged contrast configurations while task-irrelevant size values varied independently (Experiment 1b). If visual routines are guided by the task-relevant dimension (size or contrast) during graph comprehension, then people should show the same anchor point biases in the orthogonal task as in the corresponding single-dimension task. Furthermore, if visual routines are implemented in a task-relevant way, then the order in which a routine is executed might affect how people interpret relations in graphs; if so, then exploring the “right” order for a given problem could have pedagogical implications.
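As a hedged sketch of how such anchor-point biases might be quantified (hypothetical column names, toy data, and our own choice of summary statistic, not the study’s actual pipeline): classify each trial by the feature of the bar that the first saccade landed on, compute each participant’s bias per task, and check whether the single-dimension bias carries over to the orthogonal task.

```python
import pandas as pd

# Hypothetical trial-level eye-tracking data; column names are assumptions.
# "first_saccade_feature" codes whether the first saccade landed on the
# taller or the shorter bar on that size-judgment trial.
trials = pd.DataFrame({
    "participant": [1, 1, 1, 2, 2, 2],
    "task": ["size", "size", "orthogonal_size",
             "size", "size", "orthogonal_size"],
    "first_saccade_feature": ["taller", "taller", "taller",
                              "shorter", "shorter", "shorter"],
})

# Proportion of first saccades to the taller bar, per participant and task.
bias = (trials.assign(to_taller=trials["first_saccade_feature"].eq("taller"))
              .groupby(["participant", "task"])["to_taller"]
              .mean()
              .unstack("task"))
print(bias)

# If routines are guided by the task-relevant dimension, single-dimension
# biases should track biases in the orthogonal task across participants.
print(bias["size"].corr(bias["orthogonal_size"]))
```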