How causal information affects decisions

Zheng, Min; Marsh, Jessecae K.; Nickerson, Jeffrey V.; Kleinberg, Samantha

doi:10.1186/s41235-020-0206-z

Original Article
Open access
Published: 13 February 2020

Cognitive Research: Principles and Implications volume 5, Article number: 6 (2020)

Abstract

Background

Causality is inherently linked to decision-making, as causes let us better predict the future and intervene to change it by showing which variables have the capacity to affect others. Recent advances in machine learning have made it possible to learn causal models from observational data. While these models have the potential to aid human decisions, it is not yet known whether the output of these algorithms improves decision-making. That is, causal inference methods have been evaluated on their accuracy at uncovering ground truth, but not the utility of such output for human consumption. Simply presenting more information to people may not have the intended effects, particularly when they must combine this information with their existing knowledge and beliefs. While psychological studies have shown that causal models can be used to choose interventions and predict outcomes, that work has not tested structures of the complexity found in machine learning, or how such information is interpreted in the context of existing knowledge.

Results

Through experiments on Amazon Mechanical Turk, we study how people use causal information to make everyday decisions about diet, health, and personal finance. Our first experiment, using decisions about maintaining bodyweight, shows that causal information can actually lead to worse decisions than no information at all. In Experiment 2, we test decisions about diabetes management, where some participants have personal domain experience and others do not. We find that individuals without such experience are aided by causal information, while individuals with experience do worse. Finally, our last two experiments probe how prior experience interacts with causal information. We find that while causal information reduces confidence in individuals with prior experience, it has the opposite effect on those without experience. In Experiment 4 we show that our results are not due to an inability to use causal models, and that they may be due to familiarity with a domain rather than actual knowledge.

Conclusion

While causal inference can potentially lead to more informed decisions, we find that more work is needed to make causal models useful for the types of decisions found in daily life.

Significance statement

Causality is at the core of decision-making, yet little is known about how well people use causal models to make real-world decisions. Methods to go from data to causes have been introduced in machine learning, statistics, economics, and other areas. These algorithms are evaluated based on how accurately they can recover a causal structure, but it is not clear how such models will interact with what individuals already know. In this work we show that causal models can aid decision-making in unfamiliar situations, yet when individuals have prior experience with a domain, causal models can reduce confidence and lead to less accurate decisions. This work has implications for the development of algorithms to find causes, and the presentation of information. First, extracting increasingly complex and detailed models will not necessarily lead to better decisions. Thus, new methods for evaluating the utility of causes are needed. Second, causal information may need to be tailored to each individual’s experience and beliefs.

Background

Causal relationships let us robustly predict future events, change the future through individual actions and public policies, and look backward to explain why events such as a person’s heart attack happened. A key goal of finding causes, rather than correlations, is to help humans make better decisions. Many computational methods have been developed to extract causal relationships from large quantities of data, for example, to find graphical models of causal links in datasets (Pearl 2000), to uncover the timing of relationships (Friedman et al. 1998), and to identify logically complex relationships (Kleinberg 2012). Given the increasing availability of data, computational methods for extracting causal structure have been applied to a wide variety of problems, including identifying risk factors for heart failure (Kleinberg and Elhadad 2013), finding connectivity from functional magnetic resonance imaging data (Friston et al. 2003), and uncovering causes of sentiment change in online social networks (Bui et al. 2016). While methods for causal inference from data are routinely evaluated on their ability to correctly, completely, and efficiently identify the underlying causal model of a system, the utility of such models in helping people understand real-life decision-making situations has not yet been explored. It is now possible to uncover increasingly more complex and detailed causal models, but it is not known whether these models are necessarily the most useful ones. Thus, there is a need to understand how and when causal models can help people to make decisions.

Prior work in psychology has focused primarily on understanding how people learn about causal structures, rather than on how people use these models for real-world decision-making. Bayesian networks are a common output for machine learning, and many of the models of human causal reasoning are based on causal Bayesian networks (CBNs) (Tenenbaum et al. 2011; Griffiths and Tenenbaum 2005; Rottman and Hastie 2014). There is evidence that the process by which people learn about causes can be captured to a large extent by CBNs, though there are cases where judgment deviates from what would be predicted by a CBN (Rottman and Hastie 2016). Prior knowledge can also affect how we interpret correlations or data (Fugelsang and Thompson 2003; Griffiths et al. 2011), which may have implications for how people combine their knowledge with a CBN to make a decision. However, regardless of whether CBNs can model the process of learning, it is an open question as to whether and to what extent causal information, such as that presented in a CBN, influences decisions.

In particular, prior work has not examined the use of high-level machine learning output (e.g., relationships between variables such as carbohydrates and blood glucose) to make specific choices in daily life (e.g., deciding between items on a restaurant menu). Instead, work on causal learning mainly focuses on testing scenarios that can be understood without prior knowledge.^{Footnote 1} Such work has focused on making decisions within a model (e.g., which variable to intervene upon to make another variable true), as opposed to how people link newly presented information to what they already know to make real-world decisions. For example, an individual could use guidance on preventing type 2 diabetes to influence their food choices, but this guidance will exist alongside what they already know about diet and health plus their experiences and preferences. Making decisions in a novel system can demonstrate that people understand conceptually how causes work, but this ability does not necessarily imply that people can translate this knowledge to decisions in familiar domains.

Our primary goal in this paper is to understand how causal information can potentially assist people in making decisions, such as what to eat and how to plan for retirement, by better linking actions to goals. While it is not yet known whether provided causal models can actually improve human decision-making, they could potentially reduce cognitive load by acting as heuristics that simplify decisions (Garcia-Retamero and Hoffrage 2006) or by helping to overcome limits on working memory (Easterday et al. 2009). Similarly, causal information could aid decision-making by providing supporting reasons for a choice (Shafir et al. 1993), clarifying the valuation of options (Usher and McClelland 2001; Busemeyer and Townsend 1993), or serving as the basis of task information feedback (Karelaia and Hogarth 2008; Balzer et al. 1992). Influence diagrams, which are visually similar to Bayesian networks, have been used to support communication and decision-making around risk (Fischhoff and Downs 1997; Fischhoff and Davis 2014). Causes are often thought of as intervention strategies (Woodward 2003), and Hagmayer and Sloman (Sloman and Hagmayer 2006; Hagmayer and Sloman 2009) proposed that decisions can be viewed as interventions as well. Understanding the cause of a phenomenon may further determine whether people think an intervention is required at all. For example, Kim and LoSavio (2009) found that when participants were given causal explanations of an individual’s psychological symptoms, they perceived the individual as less in need of treatment compared to when they did not receive such explanations. Despite all of these possible ways that causal information could support decision-making, the hypothesis that causal information can aid in real-world decision-making has not yet been tested.

Furthermore, a core open problem is understanding the interaction between information provided during a decision and an individual’s existing knowledge. In domains where individuals have no personal experience or which they perceive as complex (e.g., decisions about repairing an aircraft engine), causal models may provide welcome aid (Kominsky et al. 2018). However, people are notoriously bad at assessing their own knowledge (Rozenblit and Keil 2002), and therefore they may more often feel they have adequate knowledge without needing the help of a causal model. The interaction between this knowledge and new information (e.g., causal diagrams of varying complexity) is still poorly understood.

For machine learning to have an impact on real-world decisions, we need to better understand how people reason with causes and how this interacts with their prior knowledge and experience. In this work we specifically aim to understand how people use causal information to make the types of decisions found in daily life, rather than decisions that do not relate to prior knowledge and expectations. Understanding this type of decision-making will help us better comprehend how computational methods may actually help the average person. We further aim to understand how the use of causal information is aided or impeded by an individual’s personal experience within a domain, by an individual’s perceptions of their knowledge relative to others, and by an individual’s actual knowledge. In particular, we examine the use of causal information presented as text and as graphical models, as these are the most common output of computational methods for causal inference. Using large-scale experiments on the Amazon Mechanical Turk (MTurk) platform, we test whether additional causal information leads to better decisions in a domain where people should presumably have some personal experience, namely weight management (Experiment 1). In Experiment 2, we expand to a domain where people vary more in their familiarity (type 2 diabetes), allowing us to test people with and without personal experience making decisions in this domain. This personal domain experience means they may have existing beliefs and knowledge about the topic, though participants will not necessarily have made decisions in the specific contexts posed in each of our study questions. To further understand the role of experience, we explore how information presented at decision time affects confidence in decisions depending on personal experience in the domain (Experiment 3). Finally, we explore whether people’s perceptions of their knowledge or their actual knowledge may drive our effects (Experiment 4). Across these experiments we demonstrate the intricacies of how causal information can influence decision-making.

Experimental overview

While prior work has provided insight into causal reasoning and decision-making, we do not yet know how useful causal information is for supporting everyday decisions that relate to an individual’s existing knowledge and experience. We build on prior work showing the possibility of using Amazon’s MTurk platform for studies involving assessing the complexity of causal systems (Kominsky et al. 2018) and for behavioral research more generally (Crump et al. 2013).

We conduct a series of four experiments testing (1) whether causal information improves decision-making, (2) whether the impact of causal information differs for people with and without personal domain experience, (3) how causal information at decision time affects decision confidence, and (4) how perceived and actual knowledge affect use of causal information for decision-making. The four experiments have the same basic structure:

Introduction: This page contains general information including expected duration, compensation, and qualifications.
Instructions: This screen provides instructions on the task and explains the diagrams that appear in some questions. Participants are told they may be shown diagrams that could assist them with the questions.
Decision-making question(s): The core task involves one or more multiple choice decision-making questions, with one shown on each page.
Post-task assessment: This section varies across the experiments: a survey on the helpfulness of diagrams (Experiments 1–2), an assessment of confidence in the decision (Experiment 3), and an assessment of domain knowledge (Experiment 4). Details are provided within the relevant experiments.
Demographic survey: We collect the following information from participants: age, sex, country of birth, race and ethnicity, level of education completed, and current participation in education. For Experiments 2–4, we also collect information about personal domain experience: we ask whether the individual has diabetes (type 1, type 2, or unsure of status) or is a caregiver for a person with diabetes. For Experiment 4 we additionally ask whether the individual participates in a retirement plan and whether they have made active choices in other investment types. The specific questions are provided within each relevant experiment.
Debrief: Participants are shown a post-task information screen informing them of the purpose of the study.
Feedback: Finally, we provide a form for open-ended comments or feedback on the task.

Based on the expected duration (15 min) and minimum wage at the time of the study, we paid $2 for completing the task. Experiment 4 was longer (approximately 20 min), so payment was increased to $2.75. The task was open to all US resident MTurk workers aged 18–64 who could understand written English. Individuals were only able to complete one survey, and they were prohibited from using the back button to ensure questions were answered in the order presented. We did not restrict to workers who have completed a specified number of tasks or who have achieved a certain approval rating. We communicated an estimated task time of 15 min (20 min for Experiment 4), but allowed 60 min. The task was posted during daytime hours on weekdays to reduce the effects of time of day on demographics while allowing us to meet recruiting targets. Using the TurkPrime microbatch feature (Litman et al. 2016), each task was reposted in small batches, so samples were collected evenly throughout the day. For each condition in each experiment, we used a sample size of at least 100. To calculate this sample size, we ran small pilot versions of each experiment to estimate the expected differences in proportions. For each experiment, using the estimated proportion difference and assuming a power of 0.8, we found that approximately 100 participants per condition should be sufficient to detect differences between groups with 95% confidence while also allowing for participant exclusions. Experiment-specific considerations regarding sample size are described within the participants section of each experiment.
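The sample-size reasoning above can be illustrated with a minimal sketch. It uses the standard normal-approximation formula for comparing two independent proportions, which is an assumption: the text does not state which calculation was used. The function name `n_per_group` and the pilot proportions in the usage lines (0.85 and 0.70) are hypothetical and are not taken from the study.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate participants needed per condition to detect p1 vs p2.

    Two-sided test of two independent proportions, normal approximation.
    alpha=0.05 corresponds to 95% confidence; power=0.8 matches the text.
    """
    if p1 == p2:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    p_bar = (p1 + p2) / 2                          # pooled proportion under H0
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical pilot estimates (assumed for illustration only).
print(n_per_group(0.85, 0.70))
# A smaller expected difference requires more participants per condition.
print(n_per_group(0.85, 0.75))
```

Because the required sample grows with the inverse square of the expected difference, small changes in the pilot estimate can shift the target substantially, which is one reason to recruit somewhat above the computed minimum when exclusions are expected.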

Experiment 1

Our first experiment is designed to test whether the type of causal information extracted by machine learning methods is beneficial for the kinds of decisions made in daily life. That is, rather than only learning how a novel system works or making decisions in scenarios where people have no prior experience, real-world decisions such as what to eat or whether to walk somewhere or drive involve combining prior knowledge and experience with any new information presented. Causal information, and particularly that represented in diagrams, could potentially improve decision-making by reducing cognitive load. For example, work on learning has shown that causal diagrams can improve synthesis and retention of knowledge (Corter et al. 2011), and other work suggested causal information can reduce cognitive load by serving as heuristics (Garcia-Retamero and Hoffrage 2006). We now test whether seeing causal information at the time of decision leads to better decision-making.

Method

Participants

A total of 1800 people recruited through Amazon MTurk participated in the study. There were 18 possible combinations of questions (3 causal information question versions × 3 control question versions × 2 orderings). We were unsure whether the ordering or combination of questions might influence results, so we recruited a sample of 100 participants for each combination, per our minimum sample size. Participants were US residents, aged 18–64. Participants’ data were excluded if they reported being outside the allowed age range, if they failed to complete the survey, or if they submitted an incorrect code at the end of the survey. Of the 1800 participants, 76 submitted an incorrect code or failed to complete the study, and 6 reported being outside the 18–64 age range. Thus, 1718 participants remained in the analysis. Detailed demographic information for all participants can be found in the Appendix.

Materials

For the decision-making task we selected a domain about which we believed a sample of American participants would have ample experience making decisions: weight management. Note that we do not mean that individuals necessarily have experience with the specific decision type posed in the problem, but rather that they are likely to have thought about the domain (weight management, making diet and activity choices) and are likely to have beliefs about it. Participants were shown a short scenario and asked to give advice to another person. We believe that this set-up is more likely to focus the participants on the question as presented rather than elicit idiosyncratic preferences from their own life (see, for example, Polman 2012; Lu et al. 2013). For the causal information, we use a model that is relatively simple but reflects current guidance on factors affecting bodyweight.^{Footnote 2} As a result, there are multiple direct causes of weight, as well as an indirect cause. The decision-making question used is as follows.

Bodyweight real-world question:

Jane just started college and is adjusting to her busy schedule of classes and extracurricular activities. She has heard about the “freshman 15,” where new college students gain 15 pounds during their first year of college. Jane wants to avoid this, while also having fun, making new friends, and leaving time for homework and studying.

What is the ONE thing you think Jane should do to achieve her goal?

A. Go for a 30-min walk every weekend

B. Maintain a healthy diet

C. Avoid hanging out with friends

D. Watch less TV

Answer choices were presented to all participants in the order shown. The correct answer was designated as the choice that was the most direct cause of weight change, i.e., choice B. Detailed explanations for this and other answers can be found in the Appendix.

To augment the decision-making question, we created two different formats of causal model information that are relevant to the problem and reflect the type of guidance and level of detail commonly provided to individuals. We created a causal text addition that described current guidelines for how to manage bodyweight (Fig. 1a). We also created a causal diagram, seen in Fig. 1b, that presented the same information as the text in a graphical format akin to a CBN. We augmented edges (i.e., the arrows) with “+” or “–” signs to indicate whether the cause produces or prevents the effect.
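As a rough illustration of how a signed causal diagram of this kind can be represented and queried, the sketch below stores each edge with a sign (+1 for "produces", -1 for "prevents") and finds the shortest directed path from a candidate action to the outcome. The edges in `EDGES` are hypothetical and are not a transcription of Fig. 1b; the function name `shortest_causal_path` is likewise an assumption made for this example.

```python
from collections import deque

# Hypothetical signed edges, (cause, effect) -> sign. NOT the actual Fig. 1b.
# +1 means the cause produces the effect, -1 means it prevents the effect.
EDGES = {
    ("healthy diet", "weight gain"): -1,
    ("physical activity", "weight gain"): -1,
    ("watching TV", "physical activity"): -1,
}


def shortest_causal_path(edges, source, target):
    """Return (number of edges, net sign) of the shortest directed path.

    The net sign is the product of the edge signs along the path.
    Returns None if the target cannot be reached from the source.
    """
    queue = deque([(source, 0, 1)])
    seen = {source}
    while queue:
        node, length, sign = queue.popleft()
        if node == target:
            return length, sign
        for (cause, effect), edge_sign in edges.items():
            if cause == node and effect not in seen:
                seen.add(effect)
                queue.append((effect, length + 1, sign * edge_sign))
    return None


# A direct cause is one edge away; an indirect cause is two or more edges away.
print(shortest_causal_path(EDGES, "healthy diet", "weight gain"))
print(shortest_causal_path(EDGES, "watching TV", "weight gain"))
```

In this toy graph, a path length of 1 marks a direct cause, and multiplying signs along a longer path gives the net direction of an indirect influence (two "prevents" edges combine into a net positive effect).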

Procedure

Participants were randomly assigned to receive one of three information conditions in answering the real-world question: (1) no extra information beyond the question (no info condition; n=573), (2) text-based causal information (causal text condition; n=572; see Fig. 1a), or (3) a simple causal diagram (causal diagram condition; n=573; see Fig. 1b). For the causal text and causal diagram conditions, the extra information was displayed visually in between the question and the answers. In the instructions for the task, participants were informed that some questions may have diagrams to assist them, and they were provided with an introduction to the meaning of the diagrams as well as the meaning of features such as plus or minus signs along the edges. Participants who were shown a diagram received a questionnaire afterward asking: Did you consult the table or diagram when answering the FIRST/SECOND question? This is the question that asked about weight management. Answer choices are: Yes, and it was helpful; Yes, but it did not affect my answer; No; Not sure/can’t remember.^{Footnote 3}

Participants also received a question that did not pertain to real-world knowledge, namely a question about a blicket detector (n=1718). This question also varied the causal information presented to aid with decision-making. The version of this question that included a causal diagram may have been confusing, as it was intended to convey a lack of effect but may have been interpreted by participants as conveying a preventative causal relationship. As such, we do not feel the results of this question can be interpreted meaningfully, and we do not include them in the analyses here. Participants were randomly assigned to receive the real-world or the control blicket question first. The order in which the real-world question was answered had no effect on performance on that question (p=0.2943). Thus, we collapse responses to the real-world question across orderings, regardless of whether it was answered first or second. We revisit the comparison with a control question that does not involve real-world knowledge in Experiment 4.

Results and discussion

Effect of causal information on decision-making

Our main question of interest was whether people would be more likely to pick the correct behavior if provided a causal diagram. In the following analyses, we compare the percentages of people who chose the correct answer across conditions. We test for significant differences in these percentages in this experiment and in all subsequent experiments using two-tailed Fisher exact tests and report odds ratios (ORs) to provide insight into effect sizes.

As shown in Fig. 2 and Table 1, a large percentage of participants (88.8%) picked the correct answer in our decision-making paradigm when no extra causal information was provided.^{Footnote 4} However, contrary to our expectations, causal information did not lead to better decisions. Fewer participants correctly answered the question when given more information, regardless of whether it was presented as causal text (82.7% correct responses) or as a causal diagram (80.1% correct responses). Thus, a causal diagram led to 8.7 percentage points fewer correct responses than no information at all (p<0.0001, OR=1.98), and causal text led to 6.1 percentage points fewer (p=0.0031, OR=1.66). The difference between the diagram and text conditions was not significant (p=0.2877).^{Footnote 5} Accuracy in answering the real-world question in the causal diagram condition was not driven by perceived helpfulness of the diagram: the percentage of correct answers did not differ significantly between those who said the diagram was helpful and those who said it was not (helpful and informed answer = 80.7% correct, not helpful = 86.5% correct, p=0.4439).^{Footnote 6}
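For readers who want to see how a two-tailed Fisher exact test and an odds ratio are computed on data of this shape, here is a minimal pure-Python sketch. The cell counts are an assumption: they are reconstructed by rounding the reported percentages against the condition sizes (for example, 88.8% of 573 is taken as 509 correct), and they are not the raw data. The function names are hypothetical, and this is not necessarily the software or exact procedure used for the reported values.

```python
from math import exp, lgamma


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_tailed(a: int, b: int, c: int, d: int) -> float:
    """Two-tailed Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    log_denominator = log_comb(row1 + row2, col1)

    def log_prob(x: int) -> float:
        # Log-probability that the top-left cell equals x, margins fixed.
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_denominator

    observed = log_prob(a)
    tolerance = 1e-7  # guards against floating-point ties
    p_value = sum(
        exp(log_prob(x))
        for x in range(max(0, col1 - row2), min(row1, col1) + 1)
        if log_prob(x) <= observed + tolerance
    )
    return min(1.0, p_value)


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds of a correct answer in row 1 relative to row 2."""
    return (a * d) / (b * c)


# Reconstructed (assumed) counts: (correct, incorrect) per condition.
no_info = (509, 64)    # about 88.8% of 573
diagram = (459, 114)   # about 80.1% of 573
text = (473, 99)       # about 82.7% of 572

for label, cond in (("diagram", diagram), ("text", text)):
    print(label,
          round(odds_ratio(*no_info, *cond), 2),
          fisher_exact_two_tailed(*no_info, *cond))
```

The odds ratios from these reconstructed counts round to the reported values of 1.98 and 1.66. The p-values may differ slightly from those in the text because of rounding in the reconstruction.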

Table 1 Percentage of respondents selecting each option across the three conditions in Experiment 1

Our findings are surprising given that past work has suggested presenting more information (often in the form of complementary text and diagrams) results in better inferences (Mayer 2014). One possible explanation of our results is that people are doing worse when shown causal information (regardless of whether it is represented as causal text or a causal diagram) precisely because they have experience in this decision-making domain. This may be related to the expertise reversal effect, which posits that the more someone knows, the less useful some enhancements such as visualizations are (Kalyuga et al. 2003). We test this possibility in Experiment 2.

Experiment 2

In Experiment 1 we showed how causal information, presented as either text or a diagram, can lead to fewer correct responses in real-world decisions where information must be combined with existing knowledge and transferred from these general causal claims to specific options. In that experiment we used a scenario where most people can be expected to have prior knowledge or beliefs. We now follow up on this using a decision-making scenario where we can expect some participants to be less experienced than others. While people with prior beliefs may experience conflicts between the diagrams and their existing understanding, people without experience in a domain may be more likely to take the new information at face value.

To test this, we employed the example of managing type 2 diabetes (T2D), using diet and exercise to keep blood glucose (BG) in a healthy range. We chose this question as diabetes affects a substantial portion of the US population (so we can expect a sample of MTurk workers to include people with diabetes) and it is challenging to manage, as many other factors also affect glucose, including stress and physical activity. If personal domain experience is truly responsible for lower accuracy with the causal diagram, then participants with diabetes experience should perform closer to the sample of Experiment 1, while inexperienced participants should perform better than experienced participants in the causal diagram condition. While the question relates to diet and exercise (as these are important for managing T2D), this is in the context of managing diabetes, so we believe that most participants without experience managing diabetes will have few beliefs about the causal structure.