Visual working memory for connected 3D objects: effects of stimulus complexity, dimensionality and connectivity

He, Chuanxiuyue; Gunalp, Peri; Meyerhoff, Hauke S.; Rathbun, Zoe; Stieff, Mike; Franconeri, Steven L.; Hegarty, Mary

doi:10.1186/s41235-022-00367-9

Original article
Open access
Published: 19 February 2022

Chuanxiuyue He ORCID: orcid.org/0000-0002-5819-7171¹,
Peri Gunalp¹,
Hauke S. Meyerhoff²,
Zoe Rathbun¹,
Mike Stieff³,
Steven L. Franconeri⁴ &
Mary Hegarty¹

Cognitive Research: Principles and Implications volume 7, Article number: 19 (2022)

Abstract

Visual working memory (VWM) is typically measured using arrays of two-dimensional isolated stimuli with simple visual identities (e.g., color or shape), and these studies typically find strong capacity limits. Science, technology, engineering and mathematics (STEM) experts are tasked with reasoning with representations of three-dimensional (3D) connected objects, raising questions about whether those stimuli would be subject to the same limits. Here, we use a color change detection task to examine working memory capacity for 3D objects made up of differently colored cubes. Experiment 1a shows that increasing the number of parts of an object leads to less sensitivity to color changes, while change-irrelevant structural dimensionality (the number of dimensions into which parts of the structure extend) does not. Experiment 1b shows that sensitivity to color changes decreases similarly with increased complexity for multipart 3D connected objects and disconnected 2D squares, while sensitivity is slightly higher with 3D objects. Experiments 2a and 2b find that when other stimulus characteristics, such as size and visual angle, are controlled, change-irrelevant dimensionality and connectivity have no effect on performance. These results suggest that detecting color changes on 3D connected objects and on displays of isolated 2D stimuli are subject to similar set size effects and are not affected by dimensionality and connectivity when these properties are change-irrelevant, ruling out one possible explanation for scientists’ advantages in storing and manipulating representations of complex 3D objects.

Introduction

Constructing and maintaining representations of three-dimensional structures is important for success in science, technology, engineering and mathematics (STEM) disciplines (National Research Council, 2006). Disciplines such as chemistry, geology and engineering often require an ability to both understand the spatial properties of multipart objects and maintain representations of those objects (see Fig. 1). For example, in organic chemistry, two molecules with the same structure can have critically different properties depending on which atoms are bound to the structure. Scientists and their students can quickly detect changes in complex representations made up of many parts (Morphew et al., 2015). Research on visual working memory (VWM) suggests a capacity limit for simple items (such as shapes and colors) of around 3–4 items (Brady et al., 2011; Cowan, 2001; Luck & Vogel, 1997, 2013). Encoding and making judgments about STEM representations therefore seem to exceed working memory limits, raising questions about the relative working memory demands of these types of representations.

Working memory capacity is often measured by a change detection paradigm in which participants are shown a set of two stimuli separated by a brief delay and have to indicate whether the two displays are the same or different. As the number of items in the display increases beyond four, sensitivity to a change decreases (Brady et al., 2011; Luck & Vogel, 1997; Vogel et al., 2001). Visual working memory studies typically use displays composed of abstract, two-dimensional, isolated items (see Fig. 2a). In contrast, STEM representations, such as molecular representations (see Figs. 1a, 2e), often comprise complex three-dimensional objects made up of many connected parts. Here, we explore whether the set size effect found with displays of isolated objects also applies to representations of multipart objects and whether connectivity and dimensionality contribute to the apparent visual memory advantages for STEM representations. To preview our results, we find a set size effect for the number of parts of an object but no evidence that connectivity and dimensionality enhance visual working memory for STEM-like representations.

Detecting the replacement of an atom in a chemical reaction is somewhat analogous to detecting the replacement of a color in a visual display, in that different atoms are represented by different colors in ball-and-stick molecular representations (see examples in Fig. 3). Inspired by this similarity, a recent study used a change detection task to examine visual working memory for these molecular representations (Stieff et al., 2020). Sensitivity to a change was better when the changes involved groups that correspond to recurring patterns of atoms in organic molecules (e.g., a hydroxyl group consisting of one oxygen atom bonded with a hydrogen atom) that formed visual “chunks” (see Fig. 3), compared to when the changes were to other atoms in the molecule. Interestingly, these effects were found for both organic chemistry students and students naive to chemistry, suggesting that students were sensitive to spatial groupings in these visual stimuli, regardless of their knowledge of the meaning of these groupings. This study motivated our current studies on other properties of the representations of 3D multipart objects that might affect working memory capacity.

The present study

Here, we used a color change detection task to examine working memory for stimuli that have similar properties to molecular representations, in that they are complex 3D objects made up of connected solids, with different colors, and extending in different spatial dimensions. First, we examined how the number of colored parts of a single object affects performance in color change detection tasks when the number of parts exceeds two. All stimuli in the study by Stieff et al. (2020) were made up of the same number of atoms, so that study could not establish how the number of parts of a single complex object affects visual working memory. Previous research on visual working memory has been conducted with 2-part stimuli of different colors (e.g., Luck & Vogel, 1997) or different color–shape combinations (e.g., Xu, 2006) (see Fig. 2c, d). However, these stimuli differ from the type of complex visual representations used in STEM in that they are 2D and contain isolated objects made up of only 2 parts. To our knowledge, basic research in visual cognition has not systematically examined whether the number of parts of a single object similarly affects performance in change detection tasks.

In addition to set size, we study the effects of dimensionality and connectivity on color change detection in complex objects. Molecular representations exhibit two different aspects of dimensionality: structural dimensionality and object dimensionality. Structural dimensionality refers to the number of dimensions into which a structure extends (one, two or three dimensions; the x-, y- and z-planes, see Fig. 3). Object dimensionality refers to the more traditional meaning of dimensionality; that is, the number of dimensions each stimulus unit has (e.g., 2D shapes vs. 3D geons). While dimensionality is irrelevant to a color change, a long history of visual cognition research has shown that 3D object-like stimuli are easier to perceive (Purcell & Stewart, 1991) and also enhance perception (Lanze et al., 1982, 1985; Weisstein & Harris, 1974) and memory (Ankrum & Palmer, 1991) for line-drawn stimuli. Moreover, color change detection can be enhanced by including depth information in stereoscopic displays, when isolated colored squares are shown in different depth planes (Chunharas et al., 2019; Sarno et al., 2019; Xu & Nakayama, 2007). In contrast, Stieff et al. (2020) found that structural dimensionality (2D vs. 3D) had no effect on change detection. However, because the changed elements of those ecologically valid ball-and-stick stimuli include multiple features such as the relative sizes of parts and angles between bonds in addition to color, detection of a single feature change could not be experimentally manipulated in that study without sacrificing ecological validity (see Fig. 3).

For connectivity, previous research suggests that accuracy in change detection tasks increases when features to be remembered are present on the same part of a multipart object (Xu, 2002b, 2006), are presented in close proximity to one another (e.g., Peterson & Berryhill, 2013; Wang et al., 2016) or are connected (Delvenne & Bruyer, 2006; Woodman et al., 2003; Xu, 2006). These effects are typically stronger for connectivity than for proximity (Woodman et al., 2003), possibly because the to-be-remembered information forms objects (Luck & Vogel, 1997) or facilitates the integration of feature conjunctions (Xu, 2006).

Overall, the present work examined questions important to STEM representations that no research to date has systematically investigated. First, we studied how the number of parts and dimensionality of a single object affect demands on working memory capacity (Experiments 1a and 1b) as measured by a color change detection task. Second, we examined the effects of two aspects of dimensionality, structural dimensionality (Experiment 1a) and object dimensionality (Experiment 1b), that is, stimulus properties that are irrelevant to a color change, on this task. Third, we examined the separate effects of change-irrelevant connectivity and dimensionality on color change detection (Experiments 2a and 2b).

Experiment 1a

In Experiment 1a, we varied two aspects of the stimuli: complexity (number of cube constituents) and structural dimensionality (the number of dimensions in which the cubes extended); see Fig. 4 for examples of stimuli. We predicted that set size effects found with isolated stimuli (Brady et al., 2011; Cowan, 2001; Luck & Vogel, 1997, 2013) would generalize to multipart objects, such that sensitivity to changes would decrease as the number of parts of an object increases. Structural dimensionality, however, could positively affect performance because participants can compress common features in a complex object by noting the locations of similar features (e.g., the red–orange–yellow group of colors were here, here and here), reducing the need to represent the actual color three times. This might boost performance even when the locations of those features are not explicitly relevant to the task (Brady & Alvarez, 2015a, 2015b; Brady et al., 2009a). It is less likely that this would happen in one-dimensional (1D) arrangements than in 2D and 3D arrangements, which provide progressively richer location representations and more items in close proximity to each other. According to this hypothesis (the configural hypothesis), sensitivity to a change should be lower in 1D objects than in 2D or 3D objects.

Method

Participants

Fifty-five students (35 female) participated. For all experiments, the participants were students from the University of California who had normal or corrected-to-normal vision and received course credit for participation. Participants were excluded from analysis if they had lower than 80% accuracy on a verbal concurrent task or if they had lower than chance (50%) accuracy on the change detection task. In Experiment 1a, three (female) students were excluded for failure to reach the 80% criterion on the verbal concurrent task and one was excluded for lower than chance accuracy. An a priori power analysis for ANOVA using G*Power (Faul et al., 2007) with an alpha level of 0.05, power of 0.8 and an effect size of f = 0.176 (corresponding to a small effect size; η_p² = 0.03) indicated that our sample size (51) exceeded the minimum number of participants needed for this experiment to be sufficiently powered.
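The two effect-size values quoted for the power analysis are related by the standard conversion f = sqrt(η² / (1 − η²)). The short sketch below only checks that relationship; it does not reproduce the G*Power sample-size calculation, and the function names are illustrative rather than taken from the study.

```python
import math


def eta_sq_to_f(eta_sq: float) -> float:
    """Convert (partial) eta squared to Cohen's f."""
    if not 0.0 <= eta_sq < 1.0:
        raise ValueError("eta_sq must lie in [0, 1)")
    return math.sqrt(eta_sq / (1.0 - eta_sq))


def f_to_eta_sq(f: float) -> float:
    """Convert Cohen's f back to (partial) eta squared."""
    return (f * f) / (1.0 + f * f)


if __name__ == "__main__":
    print(round(eta_sq_to_f(0.03), 3))   # 0.176
    print(round(f_to_eta_sq(0.176), 2))  # 0.03
```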

Materials

Apparatus

Stimuli were presented on a 24-inch ASUS VG248 monitor with an AMD Radeon R7 450 graphics card, 1920 × 1080 resolution, 60 Hz refresh rate and 8-bit depth.

Experimental task

A change detection task, programmed with the PsychoPy libraries (Peirce, 2007), was employed in which participants were shown a set of two stimuli separated by a brief delay and were asked to assess whether the second (test) stimulus differed from the first stimulus. To avoid use of verbal encoding strategies and verbal working memory (WM), a concurrent verbal task was employed. Stimuli were presented within a 20.6° region in the center of the computer monitor with a white background and viewed at a distance of approximately 70 cm.

Stimuli

The stimuli were pictures of objects consisting of connected colored cubes. Each cube within the stimulus had a unique color, which was selected randomly (without replacement) from a set of nine: red, orange, yellow, green, blue, purple, pink, brown and gray. Colors were selected using Color Brewer 2^{Footnote 1} (n = 9, qualitative; Brewer, 2006) to ensure contrast between each of the values (RGB values are presented in Appendix). Objects were created and rendered using Blender version 2.78. Because each object was three-dimensional, the addition of depth and shading meant that there was variation in the luminance values of the colors of the objects.

Objects were composed of 4, 6 or 8 units. As the relative size of the cubes was preserved between conditions, objects with more units had a greater visual angle: the maximum visual angle was 10.2° for 4-unit objects, 15.9° for 6-unit objects and 20.6° for 8-unit objects. The objects also varied in structural dimensionality. One-dimensional (1D) objects had cubes extending in only the x-coordinate plane, two-dimensional (2D) objects extended into both the x and y planes and three-dimensional (3D) objects extended in the x, y and z planes (see Fig. 4).
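For readers who want to relate the reported visual angles to physical size on the screen, the usual relation is extent = 2 · d · tan(θ / 2). The sketch below applies it at the approximate 70 cm viewing distance stated in the task description. The function names are illustrative, and the resulting on-screen extents are derived values, not figures reported in the article.

```python
import math

VIEWING_DISTANCE_CM = 70.0  # approximate viewing distance stated in the text


def extent_cm(angle_deg: float, distance_cm: float) -> float:
    """On-screen extent subtending angle_deg at the given viewing distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def visual_angle_deg(extent: float, distance_cm: float) -> float:
    """Visual angle subtended by an extent (same unit as distance_cm)."""
    return math.degrees(2.0 * math.atan(extent / (2.0 * distance_cm)))


if __name__ == "__main__":
    # Maximum visual angles reported for 4-, 6- and 8-unit objects.
    for n_units, angle in [(4, 10.2), (6, 15.9), (8, 20.6)]:
        size = extent_cm(angle, VIEWING_DISTANCE_CM)
        print(f"{n_units}-unit object: {angle} deg is about {size:.1f} cm")
```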

On half of the trials the sample and test stimuli were identical (except for a rotation of 10° clockwise or counterclockwise from the sample stimulus, to minimize the ability to detect changes by monitoring for local pixel changes). On the other half, they were identical except for a change in color of a single constituent cube and the same rotation. The new color was selected randomly from the remaining colors in the nine-color set (i.e., a color not used in the sample stimulus). There were eight trials for each condition of the 3 (number of units) by 3 (structural dimensionality) by 2 (change, no change) factorial design for a total of 144 trials.
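The design described above (unique colors per cube, a single color swap on change trials, eight trials per cell) can be sketched as a trial list. This is a minimal illustration, not the authors' PsychoPy code: the dictionary keys, the seed, and the final shuffling of trial order are assumptions, and the actual stimuli were pre-rendered images rather than color lists.

```python
import itertools
import random

COLORS = ["red", "orange", "yellow", "green", "blue",
          "purple", "pink", "brown", "gray"]
N_UNITS = (4, 6, 8)
DIMENSIONALITY = ("1D", "2D", "3D")
CHANGE = (False, True)
TRIALS_PER_CELL = 8


def build_trials(seed: int = 0) -> list:
    """Build one trial dict per repetition of each design cell (3 x 3 x 2 x 8)."""
    rng = random.Random(seed)
    trials = []
    for n_units, dim, change in itertools.product(N_UNITS, DIMENSIONALITY, CHANGE):
        for _ in range(TRIALS_PER_CELL):
            # Each cube gets a unique color, sampled without replacement.
            sample_colors = rng.sample(COLORS, n_units)
            test_colors = list(sample_colors)
            if change:
                # Swap one cube to a color that was not used in the sample.
                unused = [c for c in COLORS if c not in sample_colors]
                test_colors[rng.randrange(n_units)] = rng.choice(unused)
            trials.append({
                "n_units": n_units,
                "dimensionality": dim,
                "change": change,
                "sample_colors": sample_colors,
                "test_colors": test_colors,
                # Test stimulus is rotated 10 degrees in either direction.
                "rotation_deg": rng.choice((-10, 10)),
            })
    rng.shuffle(trials)  # assumption: trial order is randomized
    return trials


if __name__ == "__main__":
    print(len(build_trials()))  # 144
```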

Spatial ability measures

We also included two measures of spatial ability. Details of these measures and their correlations with performance are presented in Additional file 1.

Color blindness measure

The Ishihara compatible pseudoisochromatic plate (PIPIC) color vision test (Waggoner, 2005) was used to test for color blindness.

Procedure

Participants were first administered the color blindness measure and then given instructions for the experimental task. They were first instructed on the verbal concurrent task and were told that they would be repeating four letters aloud throughout each trial. They were then given instructions on the experimental task, which explained that two structures would appear sequentially on the screen and, after seeing the second structure, their task was to indicate whether the two structures were the same or different. Participants were reminded that they should repeat the four letters throughout the trial and that they would be prompted to report the letters on randomly selected trials. Participants completed four practice trials, and if they were not confident in their understanding or performed poorly on these practice trials, they were asked to repeat them before proceeding.

The experimental procedure is shown in Fig. 5. In each trial, four randomly selected distinct consonants were first presented to the participants for 3000 ms. Participants were instructed to repeat this string of consonants aloud throughout the trial. After a 500 ms inter-stimulus interval, the sample stimulus was presented in the center of the screen for 1000 ms, followed by a 1000 ms retention interval. Finally, the test stimulus was presented (in the same location but rotated 10 degrees clockwise or counterclockwise from the sample stimulus, to minimize any memory contributions from similar retinotopic or afterimage-based representations) until the participant responded or until 3000 ms had elapsed, at which point the trial timed out. Participants responded by pressing one of two keys (“1” for different, “9” for same) on a standard keyboard for the visual working memory task and were given immediate feedback on their answer. On 20% of trials, they were prompted to report the string of consonants they had been repeating. On these trials, a box appeared in the center of the screen and participants typed the letters and again were given immediate feedback.
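The trial sequence can be summarized as a simple timeline. The sketch below encodes only the durations stated in the text and computes phase onsets and the maximum trial length; feedback and the occasional letter-report prompt are left out because their durations are not given. The phase names and the `Phase` class are illustrative, not part of the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    duration_ms: int


# Durations as stated in the procedure; the test phase lasts at most 3000 ms.
TRIAL_PHASES = (
    Phase("consonant_string", 3000),
    Phase("inter_stimulus_interval", 500),
    Phase("sample_stimulus", 1000),
    Phase("retention_interval", 1000),
    Phase("test_stimulus_max", 3000),
)


def phase_onsets(phases):
    """Return ({phase name: onset in ms}, total duration in ms)."""
    onsets = {}
    elapsed = 0
    for phase in phases:
        onsets[phase.name] = elapsed
        elapsed += phase.duration_ms
    return onsets, elapsed


if __name__ == "__main__":
    onsets, max_trial_ms = phase_onsets(TRIAL_PHASES)
    print(onsets["test_stimulus_max"])  # 5500
    print(max_trial_ms)                 # 8500
```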

After completing the experimental task, participants were administered spatial ability measures (see Additional file 1) and an online questionnaire, which asked questions about strategies used to complete the structure comparison task and demographics.^{Footnote 2}

Results and discussion

Accuracy as a function of structural dimensionality, number of parts and target stimulus change is shown in Table 1. In all experiments in this study (see Tables 1, 2), participants had a positive response bias in all conditions, so additional analyses were conducted using d' (graphed in Figs. 6, 8, 10) as a measure of performance (see Additional file 1 for response times for all experiments).
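Because response bias can inflate or deflate raw accuracy, sensitivity is expressed as d' = z(hit rate) − z(false-alarm rate). The sketch below shows one common way to compute it from trial counts. The log-linear correction for hit or false-alarm rates of 0 or 1 is an assumption, since the article does not state how extreme rates were handled, and the sign convention for the criterion is likewise only one of several in use.

```python
from statistics import NormalDist

_Z = NormalDist().inv_cdf  # inverse of the standard normal CDF


def _rates(hits, misses, false_alarms, correct_rejections):
    # Assumption: log-linear correction (add 0.5 to each count) so that
    # rates of exactly 0 or 1 do not produce infinite z-scores.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return hit_rate, fa_rate


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity: z(hit rate) - z(false-alarm rate)."""
    hit_rate, fa_rate = _rates(hits, misses, false_alarms, correct_rejections)
    return _Z(hit_rate) - _Z(fa_rate)


def criterion(hits, misses, false_alarms, correct_rejections):
    """Response criterion c; negative values mean a bias toward 'different'."""
    hit_rate, fa_rate = _rates(hits, misses, false_alarms, correct_rejections)
    return -0.5 * (_Z(hit_rate) + _Z(fa_rate))


if __name__ == "__main__":
    # Hypothetical counts for one design cell (8 change, 8 no-change trials).
    print(round(d_prime(6, 2, 1, 7), 2))
```

With eight change and eight no-change trials per cell, a participant with a perfect hit rate would otherwise yield an undefined d', which is why some correction of this kind is usually applied.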

Table 1 Means (standard errors in parentheses) for measures of accuracy for Experiments 1a and 1b

Table 2 Means (standard errors in parentheses) for measures of accuracy for Experiments 2a and 2b

A 3 (number of parts: 4, 6, 8) × 3 (structural dimensionality: 1D, 2D, 3D) repeated-measures ANOVA on d' found a large significant main effect of number of parts, F(2, 400) = 137.25, p < 0.001, η_p² = 0.41, no significant main effect of structural dimensionality, p = 0.20, and no significant interaction, p = 0.23 (see Fig. 6). Notably, the Bayes Factor (BF₁₀) for structural dimensionality was 0.059, indicating strong evidence that structural dimensionality had no effect, a result that is consistent with Stieff et al. (2020).
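Two of the reported statistics can be cross-checked with simple arithmetic. Partial eta squared follows from the F ratio and its degrees of freedom as η_p² = F · df₁ / (F · df₁ + df₂), and a Bayes factor in favor of the null is the reciprocal of BF₁₀. The sketch below is a consistency check on the reported values, not a re-analysis, and the function names are illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def bf01_from_bf10(bf10: float) -> float:
    """Evidence for the null relative to the alternative."""
    return 1.0 / bf10


if __name__ == "__main__":
    # Main effect of number of parts: F(2, 400) = 137.25
    print(round(partial_eta_squared(137.25, 2, 400), 2))  # 0.41
    # Structural dimensionality: BF10 = 0.059
    print(round(bf01_from_bf10(0.059), 1))                # 16.9
```

A BF₀₁ of roughly 17 falls in the range conventionally labeled strong evidence for the null, which matches the interpretation given in the text.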

In sum, Experiment 1a showed that working memory capacity for our object-like stimuli was similar to working memory limits for simpler displays, in that sensitivity to a change decreased with more units in the structure. Moreover, we found that structural dimensionality, which is irrelevant to a color change, did not affect sensitivity. The Bayes Factor strongly supports the null hypothesis and there was no evidence for the alternative (configural) hypothesis.

Experiment 1b

Experiment 1b directly compared the decline in sensitivity with more parts of a 3D multipart object to the decline in sensitivity with more isolated elements in a 2D display (typically used in visual working memory tasks). In this study, the same participants performed a change detection task with these two types of displays. We did not attempt to control other stimulus properties and the two types of stimuli varied in both structural dimensionality (the number of dimensions into which they extend) and object dimensionality (2D squares vs. 3D cubes).