Understanding dynamic systems
Mastering dynamic systems is a recurrent task in our lives. In school, we learn the behavior of neurons, the growth of plants, the behavior of molecules, and the events leading to the French Revolution; in our everyday lives, we file income taxes, operate the proverbial VCR, and use new software; in our public lives, we try to understand the workings of the electoral college, the behavior of the stock market, and the actions of the various political and religious factions in the Middle East. These systems can be decomposed into parts, the actions of the parts over time, and the consequences of the actions; hence, the term “dynamic systems.” Grasping some dynamic systems is difficult because the systems are not thoroughly understood or are probabilistic, but even well-understood dynamic systems are challenging. Dynamic systems ordinarily have one or more structural layers and one or more layers of action. Structural layers consist of a set of parts, typically with specific associated properties, and their interrelations. Layers of action, behavior, process or causality consist of sequences of kinds of actions and their consequences. The structural layer is static, and if only for that reason, is easier to understand. The action layer is dynamic; it consists of changes in time, specifically, a sequence of varying actions and outcomes that are the consequences of the actions, often accompanied by causal reasons. Smart undergraduates who happen to score below the median in a test of mechanical ability (that is, half the undergraduates) have difficulties understanding the behavior of dynamic systems, even relatively simple ones like the workings of a car brake or bicycle pump or pulley system, though they readily grasp the structure of the system parts (e.g., Hmelo-Silver & Pfeffer, 2004; Tversky, Heiser, & Morrison, 2013).
Understanding the behavior of dynamic systems entails comprehending the temporal sequence of the actions of the parts of the system, the nature of the actions, the changes that result, and the causal dependencies between the actions and the changes.
Representing dynamic systems in graphics
The structural levels of dynamic systems, configurations of parts, can be readily mapped to diagrams, and that is, in fact, a common approach to representing them. Putting concepts into the world in the form of sketches, models, diagrams, artifacts and the like is well known to promote memory, thinking and learning (e.g., Card, Mackinlay & Shneiderman, 1999; Hegarty, 2011; Larkin & Simon, 1987; Mayer, 2005; Schon, 1983; Tufte, 1983; Tversky, 2001, 2011). For simplicity, let us call the various forms of externalizing thought “graphics.” Putting and arranging thought in the world using graphics can spatialize that information as well as expand memory and promote information processing. Importantly, the ways that elements are shown and spatially arranged can abstract and structure thought more directly and congruently than language. The parts of a system that are close or interacting can be shown as close and interacting. The parts and whole can be depicted, as can some kinds of actions. Sequences of actions can be indicated by arrows. Representing the objects and arrangements of thought in the world provides a platform for inference and discovery (e.g., Tversky, 2011).
Representing change over time in graphics
Graphics are for the most part static; they can stay in front of the eyes to be contemplated. Yet, exactly because graphics are static, conveying dynamic systems that entail action, process, behavior, or change in time has proved challenging.
Several solutions have been devised to convey dynamic information in graphics, including arrows, successive still diagrams and animated diagrams; none have proved to be universally satisfactory. As noted, a common and often successful solution is to use arrows. People readily produce and interpret arrows as temporal and/or causal relations (e.g., Heiser & Tversky, 2006). However, arrows can be ambiguous because they have a multitude of uses in diagrams. They can be used to label, to indicate temporal sequence, to indicate movement, to indicate causal connection, to show invisible forces, and more (e.g., Tversky, 2011). Many diagrams in the social sciences, biological and physical sciences, and engineering use arrows in multiple ways without disambiguating their meanings, resulting in diagrams that can be confusing and difficult to comprehend (Tversky, Heiser, MacKenzie, Lozano, & Morrison, 2007). In addition, showing the qualitative properties of important kinds of actions, such as forming alliances or chemical bonding or explosions or condensation, takes more than arrows. Another common method to show change in time is a sequence of still diagrams; however, successive stills also have limitations. Like arrows, they cannot readily show qualitative aspects of actions. In addition, they require integrating the separate still diagrams in turn, not an easy task. The separate diagrams must be compared by eye, and the changes between them imagined. Yet another way to convey action is by animations. Animations are especially compelling because they are conceptually congruent with what they convey: they use change in time to convey change in time (Tversky, Morrison, & Betrancourt, 2002). However, a broad survey comparing animated and still graphics relaying the same information and designed to educate viewers about complex processes that occur over time showed no benefits from animated graphics (Tversky et al., 2002). 
Three reasons were proposed for the failure to find benefits of animated over static graphics for conveying processes in time. One reason for the lack of success of animated educational graphics is perceptual: too much happens at the same time, so it is hard to grasp the sequence and nature of the changes. Another shortcoming of most educational animations is that they do not break the changes into their natural units. Instead, they show change in time continuously, proportionate to real time. The explanations that teachers and lay people in general provide are not continuous in time and proportionate to real time. Instead, explanations provided by people generally break processes into natural steps. Here is a simple example: when explaining routes, people segment them as a sequence of turns at landmarks (Denis, 1997; Tversky & Lee, 1998). Similarly, in describing actions that are continuous in time, like doing the dishes or making a bed, people segment the actions into discrete steps and substeps by accomplishment of goals and subgoals, not by time per se (e.g., Tversky, Zacks, & Hard, 2008). Animations typically fail to segment processes into their natural steps. Finally, showing is not explaining. Animations can show some changes, but in and of themselves do not explain the causal connections. In fact, animations accompanied by explanations can improve understanding when compared with animations without explanations (e.g., Mayer, 2005).
The roles of gesture in expressing and understanding thought
An underused and understudied possibility for effectively explaining dynamic systems is to use gestures. Gestures are actions; they should be natural for conveying actions (e.g., Cartmill, Beilock, & Goldin-Meadow, 2012; Hostetter & Alibali, 2008). Numerous studies have shown that people spontaneously gesture when explaining to themselves or to others (e.g., Alibali, Bassok, Solomon, Syc, & Goldin-Meadow, 1999; Alibali, Spencer, Knox, & Kita, 2011; Atit, Gagnier, & Shipley, 2014; Cartmill et al., 2012; Chu & Kita, 2011; Emmorey, Tversky, & Taylor, 2000; Ehrlich, Levine, & Goldin-Meadow, 2006; Engle, 1998; Goldin-Meadow & Beilock, 2010; Goldin-Meadow & Alibali, 1999; Goldin-Meadow, Kim, & Singer, 1999; Goldin-Meadow, Nusbaum, Kelly, & Wagner, 2001; Gukson, Goldin-Meadow, Newcombe, & Shipley, 2013; Hostetter & Alibali, 2008; Kang, Tversky, & Black, 2014; Schwartz & Black, 1996). In many cases, gestures carry information that is not carried in speech. Considerable research has shown that information carried solely by gesture can facilitate learning, thinking and understanding in both children and adults in a broad range of tasks including conservation (e.g., Church, Ayman-Nolley, & Mahootian, 2004; Ping & Goldin-Meadow, 2008), word learning (McGregor, Rohlfing, Bean, & Marschner, 2009), problem solving (Beilock & Goldin-Meadow, 2010; Singer & Goldin-Meadow, 2005; Tversky & Kessell, 2014), sentence memory (Thompson, Driscoll, & Markson, 1998), asymmetry (Valenzeno, Alibali, & Klatzky, 2003), math (e.g., Alibali & DiRusso, 1999; Cook, Duffy, & Fenn, 2013; Cook & Goldin-Meadow, 2006; Goldin-Meadow et al., 1999; Segal, Tversky, & Black, 2014), math analogies (Richland & McDonough, 2010), cyclical and simultaneous time (Jamalian & Tversky, 2012), story understanding (Beattie & Shovelton, 1999), and more.
Gestures can represent and resemble action
Gestures are frequently produced spontaneously to express both structure and action (e.g., Atit et al., 2014; Cartmill et al., 2012; Chu & Kita, 2011; Emmorey et al., 2000; Enfield, 2003; Engle, 1998; Goldin-Meadow & Beilock, 2010; Gukson et al., 2013; Kang et al., 2014). In previous research showing effects of communicative gestures that convey actions, the gestures used have been single actions on visible objects, such as lifting a disk (Goldin-Meadow & Beilock, 2010), counting (Carlson, Avraamides, Cary, & Strasberg, 2007) or rotating an imagined object (Alibali et al., 2011; Chu & Kita, 2011; Schwartz & Black, 1996). The present research examines the role of an integrated sequence of gestures representing a sequence of actions on named rather than instantiated objects. In order to convey structure, action, or other concepts, gestures must be custom-crafted to represent the specific content. Like effective graphics, effective gestures should be congruent with the meanings they express. As with graphics, gestures can map meanings more directly than language. A sequence of pointing gestures in gesture space can map the relative spatial locations of landmarks in an environment, much like a schematic map (Emmorey et al., 2000). A circling gesture is a more direct and congruent representation of circling motion than the word “circling.” Gestures are themselves actions and can be three-dimensional, so they can represent complex manners of action certainly more directly than words and in many cases also more directly than flat diagrams or animations. Note that in these congruent mappings of meaning, the gestures both represent the concept to be conveyed and resemble the concept to be conveyed. Both the word “circling” and a circular motion of the finger represent circling motion, but only the circular motion resembles circling. A circling gesture can be more readily apprehended than a word, which is an arbitrary mapping of meaning to sound requiring knowledge of the language.
Neuroscience and action
Gesture, then, should have a special role in representing action for explanations and understanding. Gestures are spontaneously used to convey action and gestures can both represent and resemble actions. Neuroscience research also shows connections between thought, action and gesture. Watching actions performed by others, especially well-known actions, has been shown to activate regions of the brain involved in planning or making actions, a phenomenon known as motor resonance (e.g., Decety et al., 1997; Iacoboni, Rizzolatti & Craighero, 2004; Iacoboni et al., 1999; Molenberghs, Cunnington, & Mattingly, 2012; Rizzolatti & Craighero, 2004; Rizzolatti, Fogassi, & Gallese, 2001; Uithol, van Rooij, Bekkering, & Haselager, 2011). The general view is that this kind of motor mirroring serves action understanding. Seeing action gestures, then, should induce motor resonance, adding a layer of meaning and understanding of action.
This analysis suggests that gestures showing a sequence of actions could deepen understanding of the actions of a dynamic system, the goal of the present study. After considering previous research and extensive pretesting, we selected the four-stroke engine typically found in automobiles as a test platform. Previous research has used mechanical systems such as a bicycle pump, a pulley system or car brake, or biological systems such as the heart (e.g., Mayer, 2005). However, these systems do not have many differentiated actions or are already familiar to many undergraduates. An engine has several different kinds of integrated actions and is more complex and less known than the systems typically studied. Yet, it does not assume the background knowledge required in studies of chemistry, biology or physics. In the present study, students viewed one of two videos explaining the behavior of an engine accompanied by one of two types of gesture. The text of the explanation was exactly the same for both conditions and both videos were accompanied by the same rudimentary diagram of the engine showing the named parts in the proper configuration. In the action-gesture video, the explanation was accompanied by gestures that portrayed the actions of each part of the system, for example, opening, closing, expelling, exploding, igniting, compressing, reducing, letting in, rotating, descending, going in, going up, and going out. In the structure-gesture video, the explanation was accompanied by an identical number of gestures that portrayed the structure of each part of the system, for example, the crankshaft, the cylinder, the intake valve, the piston, the spark plug and the exhaust valve. In pretesting, two viewings of the video resulted in only chance performance on the knowledge test but four viewings led to a reasonable level of comprehension, above chance but not perfect, similar to previous work on learning complex environments (e.g., Taylor & Tversky, 1992).
Understanding was evaluated in several ways: by questions about structure and action that could be answered solely from the text, by student-created visual explanations and by student-created oral explanations to peers. We were especially interested in the students’ creations, their visual and oral explanations, because these require both understanding the information and reformulating it. If seeing action gestures creates a deeper understanding of action, those who viewed them should represent more action in their diagrams and include more action information in their verbal explanations by using more action words and more action gestures. Because structure is typically easier to learn than action and because both groups viewed a rudimentary diagram of structure, little or no benefit was expected from seeing the structure gestures.