Diagnostic errors in medicine are major contributors to poor patient outcomes (Gandhi et al., 2006). One of the main causes is physicians’ errors in probabilistic reasoning, such as prematurely settling on a diagnosis (Croskerry, 2002; Graber, Franklin, and Gordon, 2005; Voytovich, Rippey, and Suffredini, 1985). Bayesian reasoning is fundamental to the normative diagnostic process (Ledley and Lusted, 1959; Pauker and Kassirer, 1980). To calculate the posttest probability of a disease, Bayes’ rule combines the pretest probability of disease (the prior probability, base rate, or prevalence) with the likelihood ratio (which is derived from the sensitivity and specificity of the test). The same Bayesian framework also applies when combining the prevalence of a disease with a patient’s symptoms to determine the likelihood of different diagnoses, which was the focus of the present study.
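For concreteness, the calculation for a positive test result can be written in the odds form of Bayes’ rule. The notation below (p for the pretest probability, LR+ for the positive likelihood ratio) and the numerical values in the worked example are illustrative assumptions and are not taken from the study.

```latex
% Odds form of Bayes' rule for a positive test result
\begin{align*}
\text{pretest odds} &= \frac{p}{1 - p}, \\
LR^{+} &= \frac{\text{sensitivity}}{1 - \text{specificity}}, \\
\text{posttest odds} &= \text{pretest odds} \times LR^{+}, \\
\text{posttest probability} &= \frac{\text{posttest odds}}{1 + \text{posttest odds}}.
\end{align*}

% Worked example with assumed values:
% p = 0.10, sensitivity = 0.90, specificity = 0.80
\begin{align*}
\text{pretest odds} &= \frac{0.10}{0.90} = \frac{1}{9}, &
LR^{+} &= \frac{0.90}{0.20} = 4.5, \\
\text{posttest odds} &= \frac{1}{9} \times 4.5 = 0.5, &
\text{posttest probability} &= \frac{0.5}{1.5} \approx 0.33.
\end{align*}
```

With these assumed values, a positive result raises the probability of disease from 10% to about 33%, which illustrates how strongly a low base rate constrains the posttest probability even for a fairly accurate test.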
The effect of prevalence beliefs on diagnosis
Whether, when, and how people use base rates is a subject of long-standing debate (Barbey and Sloman, 2007; Gigerenzer and Hoffrage, 1995; Koehler, 1996). However, most of the experiments providing a basis for this debate on base rate “neglect” or underuse have given participants what resembles an algebra word problem: Participants are provided with the prior probability and likelihood ratio and are expected to come up with Bayes’ rule and apply the equation to the supplied statistics (Agoritsas, Courvoisier, Combescure, Deom, and Perneger, 2011; Casscells, Schoenberger, and Graboys, 1978; Chambers, Mirchel, and Lundergan, 2010; Eddy, 1982; Lyman and Balducci, 1994; Puhan, Steurer, Bachmann, and ter Riet, 2005; Sox, Doctor, Koepsell, and Christakis, 2009; Steurer, Fischer, Bachmann, Koller, and ter Riet, 2002). In everyday clinical practice, physicians are not provided with external prevalence estimates. Though they could seek out prevalence estimates from the literature, they often rely on their own beliefs about prevalence, based on either their previous reading of the literature or their experience.
The main question in this study was whether physicians use their beliefs about prevalence when making preliminary diagnoses. This question has long been discussed within medical communities. Theodore E. Woodward, a famous medical researcher and diagnostician, cautioned students to think of “horses” (common diseases) when hearing hoofbeats (symptoms), not “zebras” (rare diseases). It is possible that physicians spontaneously use their own prevalence beliefs more or less than they use externally provided statistics.
Researchers in two previous studies assessed whether doctors use prevalence beliefs based on their own experience in diagnosis. Unfortunately, these studies have critical limitations that prohibit strong conclusions. In one of the studies (Christensen-Szalanski and Bushyhead, 1981), the researchers found a correlation between physicians’ judgments of the probability of pneumonia given different symptoms (e.g., cough) and the objective predictive value of the symptoms, which has been widely cited as evidence that doctors use base rates (Christensen-Szalanski and Beach, 1982; Koehler, 1996; Medin and Edelson, 1988). However, it is possible that the doctors relied only on their knowledge of which symptoms are more (e.g., crackling sound while breathing) or less (e.g., stomachache) predictive of pneumonia and did not use base rates at all (Kleiter et al., 1997). Additionally, the doctors grossly overestimated the likelihood of pneumonia relative to chest x-ray results, implying that they did not attend to the low base rate of pneumonia.
In another widely cited study, family practitioners judged the likelihood of diagnoses for vignette cases (Weber, Böckenholt, Hilton, and Wallace, 1993). They judged high-prevalence diseases as being more likely than low-prevalence diseases, which is consistent with use of base rates. However, it is also possible that the symptoms were more consistent with the high-prevalence diseases in those vignettes.
In summary, the key studies on whether physicians use the prevalence of diseases when making a diagnosis have strong alternative explanations. Our goal was to test this question with a paradigm that controls for these alternative explanations.
Origins of prevalence beliefs
Another important question is how physicians develop prevalence beliefs in the first place (Richardson, 1999). Though published prevalence estimates could serve as a general guide, prevalence can vary by geographic location, patient demographics, and clinical setting. There is considerable variability in physicians’ prevalence estimates of diseases (Dolan, Bordley, and Mushlin, 1986). The question addressed here is which factors influence prevalence beliefs among residents.
One likely factor is residents’ own experiences with patients. Each resident treats an idiosyncratic set of patients, which could lead to different prevalence beliefs. Additionally, highly memorable patients may alter subjective judgments of prevalence (Detmer, Fryback, and Gassner, 1978; Lichtenstein, Slovic, Fischhoff, Layman, and Combs, 1978; Tversky and Kahneman, 1973). In the present study, we did not have a way to capture residents’ full experiences, nor could we determine which experiences were most memorable to them. However, we were able to investigate other hypotheses about how experience may influence prevalence beliefs.
Specifically, we hypothesized that residents’ prevalence beliefs may become more precise and accurate across the 3 years of residency. As the residents in the same program gain experience, the law of large numbers tends to make their experiences more similar, which should increase precision and accuracy. Prevalence estimates may also become more accurate and precise if more experienced residents (measured by residency year) have been exposed to more literature on the true prevalence of a disease.
Present study
In the present study, we examined the effect of residency year on prevalence judgments, as well as the association between prevalence judgments and diagnostic judgments. We tested whether residents judge a diagnosis as more plausible when they personally believe the disease to have a higher prevalence, relative to other residents who believe the disease to have a lower prevalence. The main difference between this approach and past research is that we tested whether physicians’ own prevalence beliefs predict their diagnostic judgments, an across-subject, within-disease effect.
We caution that the relationship between prevalence and likelihood of diagnosis is expected to be small. First, as clinical findings accumulate, the prevalence of the diseases normally should become a smaller factor in diagnosis. Even in our short vignettes, there were many clinical findings. Second, as already explained, instead of testing whether residents believe that more prevalent diseases are more likely, we tested whether residents with different beliefs about the prevalence of a single disease make different diagnostic likelihood judgments for that disease. This is a much more subtle effect.
To study the origins of the residents’ prevalence beliefs, we examined the influence of experience on both the accuracy and the precision of prevalence judgments. For precision, we tested whether the standard deviations of the prevalence estimates for a given disease decreased across the 3 years of residency. We assessed the influence of experience on accuracy in two ways. First, we tested whether the absolute deviation between the mean prevalence estimate for each disease and the actual prevalence in the general medicine service at the University of Chicago decreased across the 3 years of residency. Second, we also report the average correlation between an individual resident’s prevalence estimates and the actual prevalence of the diseases as another overall measure of accuracy.
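The three measures described above can be sketched in code. The following is a minimal illustration under stated assumptions, not the analysis code used in the study: the function names and all numbers are hypothetical, the sample standard deviation is assumed for precision, and the per-resident correlations are combined with a simple unweighted mean.

```python
import math
import statistics


def precision_sd(estimates):
    """Precision: sample SD of residents' prevalence estimates for one disease."""
    return statistics.stdev(estimates)


def abs_deviation_of_mean(estimates, actual):
    """Accuracy (1): absolute gap between the mean estimate and the actual prevalence."""
    return abs(statistics.mean(estimates) - actual)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_accuracy_correlation(per_resident_estimates, actual_prevalences):
    """Accuracy (2): mean of each resident's correlation with actual prevalence.

    Each inner list holds one resident's estimates across diseases, in the
    same disease order as actual_prevalences. The unweighted mean is an
    assumption made for this sketch.
    """
    rs = [pearson_r(est, actual_prevalences) for est in per_resident_estimates]
    return statistics.mean(rs)


# Hypothetical prevalence estimates (%) for ONE disease, grouped by residency year.
estimates_by_year = {
    1: [5, 20, 12, 30],
    2: [8, 15, 12, 18],
    3: [9, 11, 10, 12],
}
actual_prevalence = 10  # hypothetical actual prevalence (%) for that disease

for year, estimates in sorted(estimates_by_year.items()):
    sd = precision_sd(estimates)
    dev = abs_deviation_of_mean(estimates, actual_prevalence)
    print(f"Year {year}: SD = {sd:.2f}, |mean - actual| = {dev:.2f}")
```

In this made-up data set, both the standard deviation and the absolute deviation shrink from Year 1 to Year 3, which is the pattern the hypotheses predict; real data need not behave this way.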