The rapid technological development of the past two decades has spawned a variety of software agents that can perceive and act with some degree of autonomy (Rudowsky, 2004; Russell & Norvig, 2010; Wooldridge & Jennings, 1995). When people interact with these software agents, they may call upon many of the cognitive skills that underlie human-to-human interaction (e.g., Kuchenbrandt, Eyssel, Bobinger, & Neufeld, 2013; Malle, 2015). People may, for example, have to interpret a software agent’s request, reason about its goals, or make predictions about its behavior. But the unique properties of software agents can make these tasks challenging. Software agents, by design, behave in some respects like humans but in other respects like machines, and different software agents may reflect different aspects of human thought. As a result, interacting with software agents requires people both to call upon concepts of how human and mechanical agents operate and to deploy these concepts effectively given the pragmatics of the interaction.
For example, when one encounters a software agent during a service call, a successful interaction requires more than a simple decision of whether to treat the agent as a person or a machine. It also requires explicit consideration of particular ways in which the agent is likely to be person-like. For instance, the service-call agent may have some forms of knowledge and may be able to respond to emotions, but it is unlikely to know much about topics irrelevant to the typical service call or to have non-auditory sensory functions.
Further, a user’s interaction with the automated system may elicit responses that are incompatible with how the user thought the system was operating. These responses will help the user calibrate how to conceptualize this particular automated system, and they may also help refine the user’s deployment of agency concepts in future interactions with other systems (Epley, Waytz, & Cacioppo, 2007; Gopnik & Wellman, 1994; Levin, Saylor, Adams, & Biswas, 2013; Levin, Saylor, & Lynn, 2012). This form of learning about agents has rarely been explored empirically, but it may be quite important, especially given recent arguments that it could induce a fundamental change in our understanding of the ontological distinction between living and nonliving things (Kahn et al., 2012).
In this paper, we examine the reciprocal relationship between agency concepts and agent interactions in one particularly important context: the classroom. We report three experiments in which middle school students used an established teachable-agent-based computer learning environment called Betty’s Brain (Blair, Schwartz, Biswas, & Leelawong, 2007; Leelawong & Biswas, 2008) for lessons on scientific topics. We find that students with stronger pragmatic understanding of human agency—that is, students who make more intentional predictions about human behavior on a behavioral prediction measure—learn more effectively from the teachable agent system. We also find that the process of interacting with the system sharpened students’ distinctions between human behavior and mechanical behavior.
Conceptualizing agents
When considering how people conceptualize software agents, it is helpful to start with a definition of “software agents”. Although researchers have relied upon a range of definitions, one relatively uncontroversial definition is that software agents are programmed entities that include some form of autonomy, ability to learn, and ability to interact socially with human users (see Nwana, 1996 for review). As this definition suggests, when users need to understand software agents, they likely draw upon their understandings of human thinking.
A variety of findings suggest that, when an unfamiliar entity exhibits minimal cues of agency (e.g., an entity has “eyes”, appears to make goal-directed movements, or behaves unpredictably), people are quick to anthropomorphize it, using their knowledge about human thought and behavior as a framework for understanding and drawing inferences about the entity’s internal operations (Barrett & Lanman, 2008; Epley et al., 2007; Gray, Gray, & Wegner, 2007; Heider & Simmel, 1944; Jipson & Gelman, 2007; Kahn et al., 2012; Levin, Saylor, Adams, & Biswas, 2013; Levin et al., 2012; Martini, Gonzalez, & Wiese, 2016; Melson et al., 2009). For example, when asked to describe shapes moving around a screen in a pre-determined pattern, people tend to do so in human, goal-oriented terms, saying things like “the big triangle was chasing the little one” or “the big triangle is aggressive” (Heider & Simmel, 1944). On one view, this can be understood as extending “theory of mind” to perceived agents, imputing beliefs, desires, and goals that can explain and support predictions about their behavior (Baron-Cohen, Leslie, & Frith, 1985; Gopnik & Wellman, 1992, 1994; Wimmer & Perner, 1983).
A key concept underlying theory of mind is the distinction between intentional and nonintentional representations. Intentional representations are characteristic of human thought and are closely linked to their referents. One referent cannot be freely substituted for another, as the representation–referent link is embedded in a rich set of contextual knowledge and perceptual experiences (Dennett, 1991). Non-intentional representations, on the other hand, are more characteristic of computers. These representations are less closely linked to their referents, serving as symbolic placeholders that the system acts upon with little importance placed on their semantic content (Searle, 1986). One way of summarizing this contrast is to suggest that intentional representations reflect truly situated semantic knowledge about the world while non-intentional representations are more like pointers to a representing system that does not really “know” the true meaning of the representations. In this paper, we refer to the ability or tendency of an entity to use, or behave as though it is using, intentional representations as “agency” (Schlosser, 2015). Speaking generally, the use of intentional representations enables agents to engage in the types of coherent, goal-directed behavior characteristic of humans, while the use of non-intentional representations does not.
Although the distinction between intentional and nonintentional representations is abstract, it is possible to understand it more concretely by considering how children begin to generate different expectations for humans and inanimate objects over their first few years of life (Kuhlmeier, Bloom, & Wynn, 2004; Spelke, Phillips, & Woodward, 1995; Woodward, 1998). For example, Woodward (1998) repeatedly showed nine-month-old infants either a human hand or an inanimate reaching device (e.g., a stick) moving toward one of a pair of objects (a teddy bear or a ball). Then, on the critical trial, the locations of the two objects were switched, and the hand or inanimate stick either moved toward the same object in its new location or toward the other object in the same location. Woodward hypothesized that when the hand moves toward the previously reached-for object in the new location, the action is explainable based on a goal that is supported by an intentional representation of the object (the person wants that object). Alternatively, the hand that moves toward a different object in the previously reached-for location is behaving consistently with non-intentional representations: rather than acting upon a particular object, this agent is repeatedly acting on a location, meaning that the goal object can be freely substituted across trials without consequence. Woodward found that infants viewing the critical trial looked longer (indicating surprise) when the hand moved to the new object at the old location, suggesting that the infants interpreted the reach by the hand as a goal-driven intentional action. Importantly, the same action by an inanimate stick produced no such effect, implying that the infants were limiting the inference of goal-directed intentional action to the human agent.
While much research has demonstrated that children develop the basic concepts of goal-directedness and theory of mind at young ages, this does not mean that these concepts are fully elaborated or that they are consistently applied to new situations (Birch & Bloom, 2007; Christensen & Michael, 2016; Keysar, Lin, & Barr, 2003). Indeed, there are reliably measurable individual differences both in older children’s (Baron-Cohen, O’Riordan, Stone, Jones, & Plaisted, 1999) and adults’ (Baron-Cohen & Wheelwright, 2004) theory of mind, an observation that led Apperly (2012) to emphasize the importance of “the varying capacity to deploy [theory of mind] concepts in a timely and contextually appropriate manner” (p. 385).
Similarly, people likely vary in their capacity to apply these concepts in support of interactions with artificial agents such as software agents. We propose two factors that may be particularly relevant to this capacity. First, on the assumption that people use their understanding of human cognition as a base for understanding artificial agents (Epley et al., 2007), they must know enough about the set of skills that comprise human cognition to explicitly judge which of these skills a given software agent may possess. This is important because software agents vary considerably—they may simulate intentions but lack emotion, they may simulate knowledge but lack any capability of making decisions “on their own”, and they may be able to “think” in some ways but be unable to sense information in their surroundings. Having a good understanding of these skills and the dividing lines between them can prevent users from over- or under-generalizing when considering evidence about a software agent’s capabilities. Second, people must have some sense of how the pragmatics of different situations will call upon these various skills. For example, in a situation where an intelligent animated software agent assists with a word processing program, the user would benefit from understanding that the software agent’s role will require it to have knowledge about word processing and to make decisions about whether to interrupt the user with hints, but will not require the agent to possess emotions or the ability to see.
People who more easily recognize the purposes that software agents serve and the subset of human-like skills most relevant to those purposes will likely interact with software agents more effectively for a number of reasons. First, people with these abilities may be better able to “get inside the head” of a software agent, and therefore reap benefits analogous to those afforded by theory of mind in human-to-human social interactions. Second, if people cannot judge the subset of skills that a software agent is likely to exhibit in a given setting, they may become frustrated with agents that lack expected skills, or, conversely, with agents that do more than expected (de Graaf, Ben Allouch, & van Dijk, 2016; Scheeff, Pinto, Rahardja, Snibbe, & Tow, 2002). Further, even in the absence of a negative emotional response, poor pragmatic understanding of agents may cause cognitive inefficiencies, either as a result of engaging in capacity-absorbing social responses that do not facilitate problem solving (Herberg, Levin, & Saylor, 2012), or by engaging in cognitive elaborations on agents that interfere with more basic information processing (Baker, Hymel, & Levin, 2016).
Measuring pragmatic understanding of agency
In previous work, we constructed and validated a measure of how readily adults deploy knowledge about different types of agents to novel situations. This measure asks participants to predict the behavior of multiple types of agents (e.g., a human, a computer, and a robot) in a series of scenarios (e.g., Levin, Killingsworth, Saylor, Gordon, & Kawamura, 2013). The scenarios were designed so that participants’ behavioral predictions would differ depending on whether the agent exhibited intentional, goal-directed behavior or non-intentional, mechanical behavior. For example, one of the scenarios—the “object-goal” scenario—closely followed Woodward’s (1998) experiment, asking participants to imagine that a particular agent had repeatedly selected one of two objects and then asking which object the agent would select when the locations of the two objects were switched. If participants believe the agent to be acting in an intentional and goal-directed manner, they should predict that the agent will maintain the same goal (choose the same object), but if participants believe the agent to be acting in a rote or non-intentional manner, they should predict that the agent will maintain its movement pattern without regard to goal state by reaching to the new object at the old location. Other scenarios focused on categorization, with participants predicting whether agents would classify objects using taxonomic categories (for example, “office supplies”), or more surface-level, feature-based categories (for example, “rectilinear objects”; Bloom, 1997; Deak & Bauer, 1996).
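For concreteness, the sketch below illustrates one way responses to such a measure could be coded: each scenario pairs an “intentional” response option against a “non-intentional” one, and each agent receives the proportion of intentional predictions a participant made for it. The scenario names, response wording, and data structures are illustrative assumptions, not the actual items or coding scheme used in the studies cited above.

```python
# Hypothetical sketch of scoring a behavioral prediction measure.
# Scenario names and response options are illustrative, not the actual
# items or coding scheme from Levin et al.
from dataclasses import dataclass

@dataclass
class Scenario:
    name: str
    intentional_option: str      # response implying goal-directed behavior
    nonintentional_option: str   # response implying rote, mechanical behavior

SCENARIOS = [
    Scenario("object-goal", "same object, new location", "new object, old location"),
    Scenario("categorization", "taxonomic (office supplies)", "feature-based (rectilinear objects)"),
]

def intentionality_score(predictions: dict) -> float:
    """Proportion of scenarios on which a participant predicted
    intentional, goal-directed behavior for a given agent."""
    hits = sum(1 for s in SCENARIOS if predictions.get(s.name) == s.intentional_option)
    return hits / len(SCENARIOS)

# Example: one participant's predictions for a human and for a computer.
human = {"object-goal": "same object, new location",
         "categorization": "taxonomic (office supplies)"}
computer = {"object-goal": "new object, old location",
            "categorization": "feature-based (rectilinear objects)"}

print(intentionality_score(human))     # 1.0 (fully intentional predictions)
print(intentionality_score(computer))  # 0.0 (fully non-intentional predictions)
```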
On average, adult participants made more “intentional” behavioral predictions for the human and more “non-intentional” behavioral predictions for mechanical agents, providing evidence of construct validity (Levin, Saylor, et al., 2013). Additional work has demonstrated the robustness of the pattern of predictions (Levin, Harriott, Paul, Zhang, & Adams, 2013) and demonstrated that neither perceived limits in current technology nor perceived intelligence of the agents fully accounts for the differences in predictions, as the same pattern occurs when participants consider agents from the distant future (Levin, Killingsworth, & Saylor, 2008).
Importantly, however, this pattern of predictions is not obvious or universal, even among adults. While some participants consistently made “intentional” predictions for humans and “non-intentional” predictions for mechanical agents, others did not. This variability demonstrates that, although basic concepts of goal-directed behavior are typically in place—and limited to humans—at a young age (e.g., Woodward, 1998), elaboration and effective deployment of these concepts varies across development and into adulthood (Apperly, 2012; Keysar et al., 2003).
A key feature of the behavioral prediction measure is that “correct” responses require explicit recognition that the situation tests a specific agency concept. For example, consider the object-goal scenario based on the Woodward (1998) study. In the case of a human, the nominally correct prediction is that the person will maintain the same goal and reach to the old object, now in a new location. At some level, this is a simple prediction that relies on basic concepts of goal-directedness that even infants understand (Woodward, 1998). But the basic concept of goal-directedness does not totally determine the agent’s actions in this situation because a person could, for some reason, have the goal of reaching to a given location (for example, to change the weight distribution on the table where the objects rest). A correct response requires recognizing not only that such a goal would be atypical, but also that the pragmatic intent of the question is to assess typical goals. We observe that this need to coordinate basic knowledge about goal-directedness with the pragmatics of a specific situation is similar to the demands that teachable agents place upon learners in educational contexts. Specifically, in the case of a teachable agent, it is likely helpful for learners to understand how the pedagogical setting that the agent inhabits constitutes a pragmatic constraint that determines how the agent’s mental processes will operate. For example, it is useful for students to effectively merge their understanding that the Betty system is meant to teach causal biological relationships with their understanding that Betty can be said to have the goal of learning biology, while she does not have goals related to cultivating personal relationships or getting lunch.
Finally, we note that participants’ behavioral predictions can be modified by experience with agents (Levin et al., 2008; Levin, Harriott, et al., 2013). This is consistent with more general evidence suggesting that experience may increase or decrease attributions of agency to machines (Nass & Moon, 2000; Somanader, Saylor, & Levin, 2011; for review, see Epley et al., 2007; Jaeger & Levin, 2016). This is also an important component of pragmatic understanding of agency: people can re-calibrate the concepts they apply to a particular agent based on cues from the agent and the environment. The studies we report in this article investigate how experience with software agents affects understanding of agency as well as the reverse relationship.
Teachable agents and pragmatic understanding of agency
The present studies focus on the role that pragmatic understanding of agency plays in the middle school classroom, where several well-known technology-based learning systems seek to help students learn material by having them teach it to “teachable agents”. A teachable agent is a graphical character in a computer environment that can be taught concepts by students. Using artificial intelligence, the teachable agent answers questions, completes quizzes, and provides explanations based entirely on what the student has taught it (for review, see Chase, Chin, Oppezzo, & Schwartz, 2009). The teachable agent is often viewed as an extension of the learning-by-teaching paradigm in which students not only learn more effectively by providing explanations to other students (Webb, 1983), but also learn more effectively by merely preparing to explain material to other students (Bargh & Schul, 1980; Bransford, Brophy, & Williams, 2000).
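To make concrete the sense in which a teachable agent’s answers are grounded entirely in what the student has taught it, the sketch below shows a toy causal concept map and a function that chains qualitative causal links to answer questions. This is a simplified, hypothetical illustration of the general idea, not the reasoning engine actually used in Betty’s Brain; the entities and links are invented for the example.

```python
# Toy illustration: a teachable agent that answers causal questions only
# from the concept map the student has built. Simplified for exposition;
# not the actual Betty's Brain implementation.

# Directed causal links taught by the student, with a qualitative sign
# (+1 = an increase in the source increases the target, -1 = decreases it).
concept_map = {
    ("algae", "dissolved oxygen"): +1,
    ("dissolved oxygen", "fish"): +1,
    ("fish", "waste"): +1,
}

def predict_effect(cause, effect, visited=None):
    """Chain causal links to predict how a change in `cause` affects
    `effect`. Returns +1, -1, or None if the student never taught a path."""
    if visited is None:
        visited = {cause}
    for (src, dst), sign in concept_map.items():
        if src != cause or dst in visited:
            continue
        if dst == effect:
            return sign
        downstream = predict_effect(dst, effect, visited | {dst})
        if downstream is not None:
            return sign * downstream
    return None

# The agent's quiz answers reflect only what it has been taught:
print(predict_effect("algae", "fish"))   # +1 (increase, via dissolved oxygen)
print(predict_effect("fish", "algae"))   # None (no causal path was taught)
```

Because the agent’s answers derive only from the student-built map, wrong or missing quiz answers point the student back to gaps in his or her own understanding, which is part of what makes the learning-by-teaching interaction productive.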
Schwartz et al. (2009) argue that learning by teaching is effective because it invokes metacognitive monitoring of both the student’s own knowledge and their partner’s (i.e., the teachable agent’s) knowledge. Betty’s Brain, an extensively studied teachable agent system, is thought to be an effective teaching tool because it invokes this type of metacognitive monitoring, among other reasons (for review, see Blair et al., 2007).
The use of teachable agent software, however, may present unique challenges for some students. Students with weaker pragmatic understanding of agency may be at a disadvantage because they are less able to optimally deploy agency concepts that facilitate learning. Further, the cognitive effort involved in deploying the appropriate concepts at the appropriate times, along with the effort of monitoring cues from the software to determine which concepts are most helpful in what circumstances, may disproportionately tax the cognitive resources of these students. This would leave fewer resources available for the metacognitive monitoring critical to the learning-by-teaching paradigm—monitoring which is itself resource-intensive (Schwartz et al., 2009).
For these reasons, we hypothesize that pragmatic understanding of agency facilitates learning from teachable agent systems. There are, however, several possibilities as to which understandings of agency (understandings about a human, a computer, or a particular teachable agent) are most relevant to learning. Students’ understandings of human agents are the broadest and most commonly used. Indeed, understanding of human intentionality underlies much of everyday thought and provides a foundation for theory of mind. Students generally have far more knowledge about how people operate than how computers or teachable agents operate. The ability to draw on this broad knowledge of human intentionality should help students interact with software agents like Betty, who are in many ways designed to imitate humans. It is also possible, however, that students’ understandings of computers facilitate learning: teachable agents are, ultimately, symbols in computer systems. It may be that understanding the non-intentional representations used by computer systems helps students navigate some of the teachable agents’ limitations. Another interesting possibility is that students’ understandings of the particular teachable agent (e.g., Betty in Betty’s Brain) are most relevant to learning—at least if the students’ experience with the teachable agent system allows them to build a robust understanding of the particular agent. Further, it seems plausible that learning could be facilitated by either an intentional understanding of Betty (enabling students to better “play along” with the idea that Betty is human in the context of the software) or a non-intentional understanding of Betty (by enabling students to better recognize and cope with some of Betty’s limitations in the learning environment).
We also hypothesize that using teachable agent software will improve students’ pragmatic understandings of agency (Levin, Saylor, et al., 2013). Specifically, we expect that time spent dealing with software agents—and grappling with the attendant difficulties in deploying the appropriate agency concepts—will help refine students’ agency concepts and their deployment of those concepts in other contexts. Again, there are a number of ways this learning could manifest itself. One possibility is that students’ interactions with Betty will increase their attributions of intentionality and agency to her. This would be consistent with research from Nass and Moon (2000) documenting automatic social responses caused by interactions with computers. However, it is also possible that students will learn about some of the differences between people and artificial agents as they come to know Betty’s limitations. In such a case, one might expect a decrease in attributions of agency to Betty, and possibly even an increase in attributions of agency to people as the interaction clarifies for students the salience of goals in everyday behavior.