In 2002, The Institute of Medicine1 (IOM) reported results from its congressionally mandated study of existing evidence of racial and ethnic disparities in healthcare. The Agency for Healthcare Research and Quality's National Healthcare Disparities Report2 reinforced the IOM's findings, crystallizing public concern about racial and ethnic disparities in healthcare in the United States. The IOM report attempted to identify likely causal factors, highlighting the role of clinician-patient interaction as a contributing factor to service disparities. Researchers such as van Ryn and Burke3 and Balsa and McGuire4 emphasize how unconscious and statistical biases may influence clinician judgment of patients' illness under uncertain circumstances, such as those induced by incomplete or inaccurate information. Widespread attention has been paid to clinician bias in clinical assessments.5 Despite these reports' emphasis on the role of clinical uncertainty, there is a dearth of research investigating clinicians' processes of obtaining and using information during the clinical encounter.
Diagnostic assessment bias occurs when clinicians make systematic errors in the collection or processing of clinical information that could lead to misdiagnosis,6 false-positives, or false-negatives.7 Reducing diagnostic bias is one way to eliminate misdiagnosis8 and improve service delivery. But identification of the patient's main problem, which is the foundation for the proper treatment of psychiatric disorders, is challenging given the level of unavoidable uncertainty in diagnostic decision making.9 In fact, the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) was expected to make substantial improvements to diagnostic formulation by offering a checklist of symptoms, whereby clinicians would first determine which diagnostic criteria were present, whether enough criteria had been fulfilled to justify the diagnosis, and then rule out medical conditions or other psychiatric conditions that could account for these symptoms.10 However, it is not only the information collected in diagnostic assessment, but also how the information is applied in decision making that is critical for an accurate diagnosis. Paul Meehl11 showed that "actuarial" methods (eg, formal, algorithmic procedures whereby symptoms are collected in a checklist and statistically analyzed to reach a prediction) for combining diagnostic information were superior to clinical judgments (eg, those that rely on human judgment to merge information, discuss it with others, and reach a diagnostic impression). Yet clinicians resist actuarial or statistical methods in diagnostic formulation.12 A structured interview may seem to constrain clinicians to prescribed questions and clinicians may feel that establishing a good rapport with patients is better undertaken through a more open dialogue, thus making this a priority over actuarial methods. 
If clinicians eschew actuarial methods, what patterns of reasoning do they use to weight symptoms, and does this have any link to problems in diagnostic bias? To inform practice and develop interventions that reduce disparities in the clinical encounter, it is necessary to understand what happens in usual care regarding diagnostic formulation.
This article focuses on clinicians treating socioeconomically disadvantaged ethnic/racial minority patients in naturalistic settings. We study diagnostic assessments conducted within usual conditions of intake in public "safety-net" clinics so that our findings lead to usable research for knowledge translation and integration.13 Clinicians often complain that research findings may not apply to their particular settings14 because practitioners typically contend with a much more heterogeneous population and more complex care processes than those researched in the literature.15 Our goal in this article is to describe how diagnostic bias might take place while assessing patients (particularly minority patients) for diagnosis.
During intake, clinicians must determine which information will help identify a patient's diagnosis and the "cultural formulation" of that diagnosis. A person's ethnicity/race/culture may affect what clinicians ask the patient to report and how clinicians interpret the information provided. On the other hand, some researchers question whether much is gained by focusing on cultural and sociocontextual phenomena.16 Furthermore, the system of care can also play a role in what information is gathered and in clinicians' opportunities to engage patients. Clinical determinations in safety-net facilities must be made in severely resource-constrained environments; therefore, any theory of diagnostic formulation needs to deal with the issue of missing information. One possible approach that may be useful in this context is that of Simple Heuristics,17 that is, inferences made with limited time, limited knowledge, and reduced computational power. This literature suggests the importance of ecological rationality, whereby one utilizes adaptive behavior resulting from the fit between the mind's mechanisms and the structure of the environment in which it operates. Time pressure, resource constraints, and the consequent need to rely on cognitive shortcuts are additional factors that might limit the information collected in the clinical encounter.
Methods
Selecting clinics, clinicians, and patients
The convenience sample of 47 clinicians and 129 patients participating in mental health intakes comes from the Patient-Provider Encounter Study conducted in 2006-2008. Data were collected in eight primarily safety-net outpatient clinics in the Northeast offering mental health and substance abuse services to a diverse and socioeconomically disadvantaged patient population. This study complied with human subject protocols in all clinics where data were collected. Twenty-eight percent of clinicians were psychiatrists, 26 percent psychologists, 38 percent social workers, and the remainder were nurses or other, with the majority of clinicians (70%) having more than 5 years of clinical practice. Approximately 53 percent of clinicians self-identified as non-Latino Whites, while 36 percent self-identified as Latino, 9 percent as non-Latino Black (African American or Afro-Caribbean), and 2 percent as Asian. Clinicians were predominantly women (66%), aged 35 to 49 years (45%), and permanent staff at these clinics (68%).
All patient participants included in the study were seen by participating clinicians. Inclusion criteria were broad: all participants were adults aged 18 and over who were not identified as actively psychotic or suicidal by clinicians, who did not require interpreter services, and who could provide informed consent. The majority of patient participants were women (60%) and Latino (50%), with 39 percent self-identifying as non-Latino White, and the remaining 12 percent as African American or Afro-Caribbean. Almost two thirds of the sample (65%) had completed high school and 45 percent were employed. Approximately 64 percent reported a personal income of less than $15000 per year and approximately 50 percent were on Medicaid.
Collecting data from diagnostic interview
Clinician participants in the study were recruited at the clinics through introductory informational meetings. Patient participants' recruitment was conducted through direct person-to-person solicitation upon presentation for intake. All patient participants completed an assessment of their capacity to consent prior to their participation. The capacity to consent was established using a 10-item screening measure on the basis of the four legal standards of demonstrating capacity (understanding, appreciation, reasoning, and voluntarism18).
Participation in the study consisted of three components for both clinician and patient: (1) videotaping of intake; (2) participating in a postdiagnostic qualitative interview following intake; and (3) completing survey measures. Immediately following the videotaped intake, the qualitative research interviews were conducted separately with both patients and clinicians using a semistructured guide. These interviews, lasting approximately 30 minutes, were conducted in English or Spanish according to participants' language preference and focused on understanding clinician and patient experiences during intake. Clinician interviews included questions regarding their understanding of the patients' presenting problem, the process of clinical decision making, perceived rapport, and the role of sociocultural factors in patients' presenting problem and care offered. Patient interviews included questions regarding the presenting problem, perceived rapport, and significance of sociocultural factors in the presenting problem and care sought. All interviews were conducted by trained research assistants. Supervision was provided throughout the data collection process by an expert in medical ethnography, including a review of interviews, reflections on the process, and feedback aimed at improvement of interviewing skills. Patients and clinicians then completed survey measures as noted above. These were based on measures used in the National Latino and Asian American Study.19 Measures included questions regarding demographic information, language abilities, and level of acculturation.
Analyzing the information exchange between clinicians and patients
We designed a tool that documents the information discussed during intake. The tool, which we refer to as the information checklist, included 128 items and more than 200 subitems. Items covered symptoms related to major Axis I disorders, including major depression, dysthymia, alcohol and substance use, panic disorder, agoraphobia, generalized anxiety disorder, posttraumatic stress disorder, social phobia and specific phobia, and screening items for bipolar disorder, psychotic disorders, and adjustment disorder. All items originated from the diagnostic criteria in the DSM-IV20 and the Alcohol Use Disorder and Associated Disabilities Interview Schedule.21 In addition, we conducted an extensive literature review to identify sociocultural factors that play an important role in psychiatric and substance use disorders. We integrated items that reflect personal and familial history related to mental illness (eg, history of loss, employment status), items that describe physical symptoms/illness and conditions of disability, and items describing any treatment history of the patient.
Each intake videotape was coded by a clinician coder using the information checklist to document the content of information exchanged between clinician and patient. First, each item was coded for whether it was discussed and whether the information was volunteered by the patient, elicited by the clinician, or observed by the coder (for a limited set of items such as tangential speech) during the intake. Finally, each item that the patient discussed was coded for whether it was endorsed, denied, not answered, or the patient did not know. For instance, if a patient mentioned that he could not sleep, this item was coded as discussed, volunteered, and endorsed by the patient. All items describing symptomatology were coded as present only if the patient was currently experiencing them at the time of intake or had experienced them during the 12 months before intake. Eight mental health clinicians coded for the information checklist. These clinicians consisted of three psychiatrists, one social worker, one psychologist, and three advanced clinical psychology graduate students.
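The coding scheme described above can be illustrated with a minimal sketch. The class and field names (`CodedItem`, `Source`, `Response`) are hypothetical labels chosen for illustration and are not taken from the information checklist itself; the sketch only mirrors the three coding decisions reported in the text (whether an item was discussed, how the information arose, and how the patient responded).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Source(Enum):
    """How the information came up during intake."""
    VOLUNTEERED = "volunteered by the patient"
    ELICITED = "elicited by the clinician"
    OBSERVED = "observed by the coder"


class Response(Enum):
    """How the patient responded to a discussed item."""
    ENDORSED = "endorsed"
    DENIED = "denied"
    NOT_ANSWERED = "not answered"
    DID_NOT_KNOW = "did not know"


@dataclass(frozen=True)
class CodedItem:
    """One checklist item as coded from a videotaped intake (hypothetical layout)."""
    item: str
    discussed: bool
    source: Optional[Source] = None
    response: Optional[Response] = None

    def __post_init__(self) -> None:
        # An item that never came up cannot carry a source or a response.
        if not self.discussed and (self.source is not None or self.response is not None):
            raise ValueError("an item that was not discussed has no source or response")


# The example from the text: a patient mentions that he could not sleep.
sleep = CodedItem("sleep disturbance", True, Source.VOLUNTEERED, Response.ENDORSED)
```

Keeping "discussed" separate from "endorsed" matters for the later analyses, which are based on whether a symptom was discussed rather than on whether the patient reported having it.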
Five practice interviews per clinician were taped and discussed with the clinical supervisor who met with clinicians once every 2 weeks to provide feedback and monitor the quality of reassessment done by clinician coders. Adequate reliability was established using the five training tapes. Overall agreement was 86 percent to 87 percent between each coder and the master coding, and percent "yes" agreement for the discussed items ranged from 66 percent to 73 percent. Both the clinician coder reviewing the intake and the clinician were asked to independently give all diagnoses that applied to the patient. The clinician coder was blind to the diagnoses provided by the clinician doing the intake. The reappraisal produced data about the comparability of diagnoses arrived at by the clinician doing the intake and the clinician coder watching the videotaped intake and coding symptoms. We then estimated the concordance between clinician and coder for the aggregated disorders (eg, any depressive disorder, any anxiety disorder, or any substance disorder) and also by specific disorders, such as major depressive disorder, dysthymia, panic, posttraumatic stress disorder, generalized anxiety disorder, alcohol abuse or dependence, or drug abuse or dependence. We evaluated the level of agreement for each diagnosis using the kappa coefficient.
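For readers unfamiliar with the kappa coefficient, the following is a minimal sketch of how chance-corrected agreement between two raters can be computed. It assumes Cohen's kappa for two raters, which the text does not specify by name, and the diagnosis vectors are invented toy data rather than study data.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must rate the same non-empty set of cases")

    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's marginal category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1.0 - expected)


# Toy data (not from the study): 1 = disorder present, 0 = absent, for 10 intakes.
intake_clinician = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
clinician_coder = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
kappa = cohen_kappa(intake_clinician, clinician_coder)
# Observed agreement is 0.8 and chance agreement is 0.5, so kappa is about 0.6.
```

Because kappa discounts agreement expected by chance, it can be much lower than raw percent agreement when one category dominates.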
The patient and provider interviews were analyzed in a series of steps using NVivo 7 (QSR International Pty Ltd, Victoria, Australia) to identify major themes across interviews in the content of patients' information. Patient interviews were read independently to identify sections of the transcripts where patients' experience with the intake was discussed. We then described the symptoms discussed in each case to address our main question, focusing on the indicators that lead the clinician to determine whether a patient experiences a psychiatric disorder, as compared with the judgment made by the coder.
Summarizing patterns of elicited diagnostic information
We conducted factor analysis to determine clusters of symptom items used to determine diagnoses. Before examining the data, we conceptually grouped symptom items into depressive, anxiety, and substance use-related items along with other symptom items. Then we conducted exploratory factor analysis and confirmatory factor analysis separately for each symptom group to determine the number of latent factors needed to adequately describe the observed patterns in the data. Our underlying assumption was that the latent factors are normally distributed and that they explain the information contained in the observed symptom items. For models with more than one latent factor, we loaded items onto the latent factor with the higher varimax rotated factor loadings. If the factor loadings for an item were equally high for multiple factors, then we allowed the item to load on multiple factors. If an item had a negative loading, then we recoded the item so that all items loaded positively onto the latent factors. We selected the most parsimonious latent factor models that had good fit with the data as assessed by the χ² test of model fit (P > .05), comparative fit index (>0.95), Tucker-Lewis index (>0.95), and root-mean-square error of approximation (<0.05). To demonstrate the components of each factor, we displayed a subset of symptom items with factor loadings of at least 0.4. We then generated a factor score from each of the latent factor models associated with each symptom group for each patient participant in our study. Because all symptom items were indicators of whether that symptom was discussed, one may interpret these factor scores as the amount of information on each of the symptom clusters. The latent factor analysis was conducted using all samples in Mplus Version 5 (Muthen & Muthen, Los Angeles, California).
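The model selection rule above can be sketched as a simple check. This is an illustration only: the analysis itself was run in Mplus, the names `FitIndices`, `has_good_fit`, and `most_parsimonious` are hypothetical, the 0.05 threshold for the χ² test is read here as a P value, and the fit values in the example are invented.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FitIndices:
    """Fit statistics for one candidate latent factor model."""
    chi2_p: float  # P value of the chi-square test of model fit
    cfi: float     # comparative fit index
    tli: float     # Tucker-Lewis index
    rmsea: float   # root-mean-square error of approximation


def has_good_fit(fit: FitIndices) -> bool:
    """Apply the four thresholds stated in the text."""
    return (
        fit.chi2_p > 0.05
        and fit.cfi > 0.95
        and fit.tli > 0.95
        and fit.rmsea < 0.05
    )


def most_parsimonious(candidates: Dict[int, FitIndices]) -> Optional[int]:
    """Return the smallest number of factors whose model fits well, or None."""
    good = [n_factors for n_factors, fit in candidates.items() if has_good_fit(fit)]
    return min(good) if good else None


# Invented fit values for one symptom group, keyed by number of latent factors.
candidates = {
    1: FitIndices(chi2_p=0.01, cfi=0.90, tli=0.89, rmsea=0.09),
    2: FitIndices(chi2_p=0.21, cfi=0.97, tli=0.96, rmsea=0.03),
    3: FitIndices(chi2_p=0.40, cfi=0.99, tli=0.98, rmsea=0.02),
}
selected = most_parsimonious(candidates)  # the 2-factor model is the simplest that fits
```

Requiring all four criteria at once is a conservative reading of the text; a model that passes only some of them is treated here as not fitting.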
We then used logistic regression models to estimate the relationships of latent factor scores and race/ethnicity with the risk of a particular diagnosis as reported by the intake clinician. We conducted these analyses including only Latinos and Whites (excluding Blacks because of small sample size) and fitted separate models for each diagnosis (depression, anxiety, and substance abuse/dependence). The logistic regression models included the latent factor scores, race/ethnicity, the interaction of the latent factor scores with race/ethnicity, and adjusted for patients' sex and clinician type. We determined whether race/ethnicity was a modifier of the diagnostic information collected from patients by examination of the statistical significance of the regression coefficients of the interaction terms. For statistically significant interactions, we derived odds ratios to demonstrate the modifying effects of race/ethnicity given the highest and lowest factor scores. Model discrimination was measured by the area under the receiver operating characteristic curve. The logistic regression analysis was conducted in Stata Version 9.0 (StataCorp LP, College Station, Texas). By construction, we have no missing data for the items discussed during the intake interview. We eliminated 8 patients because of poor quality of videotaping, leaving a final sample of 121 participants.
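To make the derivation of odds ratios from an interaction model concrete, consider a model of the form logit P(diagnosis) = b0 + b_f F + b_e E + b_fe (F x E) + covariates, where F is a latent factor score and E indicates Latino (1) versus White (0) ethnicity. Under this model, the odds ratio for Latino versus White patients at a given factor score f is exp(b_e + b_fe f). The sketch below evaluates that expression at the lowest and highest scores of a factor. The coefficient values are hypothetical and chosen only for illustration; the score range is the one reported in this article for the family history of abuse/victimization and disability factor.

```python
import math


def ethnicity_odds_ratio(beta_ethnicity: float,
                         beta_interaction: float,
                         factor_score: float) -> float:
    """Odds ratio (Latino vs White) at a given latent factor score.

    Follows from a logistic model with an ethnicity main effect and an
    ethnicity-by-factor-score interaction: OR(f) = exp(b_e + b_fe * f).
    """
    return math.exp(beta_ethnicity + beta_interaction * factor_score)


# Hypothetical coefficients, NOT estimates from the study.
beta_ethnicity = -0.2
beta_interaction = 1.0

# Range of factor scores reported for the family history of
# abuse/victimization and disability factor.
lowest_score, highest_score = -0.382, 2.126

or_at_lowest = ethnicity_odds_ratio(beta_ethnicity, beta_interaction, lowest_score)
or_at_highest = ethnicity_odds_ratio(beta_ethnicity, beta_interaction, highest_score)
```

With a positive interaction coefficient, the odds ratio grows as more of the topic is discussed, which is the kind of effect modification examined here: the same amount of discussed information shifts the odds of diagnosis differently by ethnicity.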
Results
Qualitative analyses
Patients talked about the importance of being able to tell their story without being interrupted by the provider asking repeated questions. A brief excerpt illustrates the experience of providers when they tried to use an actuarial approach in the clinical intake:
I got the sense that he wanted to, you know, have to kind of like tell his story and that just wasn't in the cards for today, 'cause we needed to fill out the intake, so, I could tell that he was unpleased that, you know, I would be interrupting him and asking him all these questions. You know, he said at the end that he didn't feel that he could really talk to me, that sometimes he didn't think I was listening. [450CN]*
Repeatedly, providers stated the tension they experienced in trying to fulfill multiple roles in the intake process: understanding what brought the patient in; completing a diagnostic assessment; establishing a good working alliance; planning treatment; addressing questions for the referral source; and completing intake forms. They described how achieving these multiple goals required using time efficiently.
Concordance in diagnoses
Comparing the diagnosis given during the intake session and the one given by the clinician coder (data not shown), we found greater levels of agreement for substance-related disorders (κ = 0.70), lower levels for anxiety disorders (κ = 0.35) and depressive disorders (κ = 0.33), and very low agreement for most specific disorders except drug abuse/dependence (κ = 0.80), obsessive-compulsive disorder (κ = 0.66), and panic disorder (κ = 0.64).* These kappa coefficients would support the contention that clinicians weigh information differently.
We also observed two major patterns (data not shown). First, cases for which there is agreement between intake clinician and clinician coder on the presence of a particular disorder (ie, concordant positive) are more likely to exhibit more symptom information linked to the observed disorder, whereas cases in which they agree that the patient does not have the disorder (ie, concordant negative) or they do not agree on the patient's diagnosis (ie, discordant) have substantially less information collected. For example, in 94 percent of concordant-positive cases of depressive disorder there was discussion of depression; in 73 percent of these cases suicide was brought up. For discordant cases in which the intake clinician defined the case as positive for depression but the clinician coder evaluated the case as negative, we see that depressed mood (73%) or depression (64%) was raised in the majority of these cases. However, when we look at the frequency of discussing specific symptoms of depression, in only about a third of cases were sleep disturbance or weight loss talked about during the intake. In cases where the clinician coder regarded the case as positive for depression but the intake clinician did not, the opposite pattern was seen, with symptoms of depression being more likely to be mentioned (72.7% discussed sleep disturbance, 63.6% brought forth poor appetite, 54.5% mentioned loss of energy) but apparently disregarded by the intake clinician.
Omission of diagnostic information in intake was more common for anxiety and substance disorders. Most criteria for anxiety and substance disorders were not assessed, according to the clinician coders who evaluated what items were discussed during the clinical interview. Again, for discordant cases, anxiety and exposure to traumatic events were discussed in approximately half of these cases, but specific symptoms of anxiety were generally not discussed. A similar pattern was observed for substance use disorders, where only in concordant-positive cases do we see specific symptoms linked to substance abuse or dependence criteria as part of the intake in comparison with negative or discordant cases. Table 1 shows that most cases, independently of race or ethnicity, were screened with general probes for mood, anxiety, and substance disorders. However, except for major depressive disorder (64.9%), most criteria to fulfill a diagnosis were not assessed in actual practice for patients belonging to any of the racial and ethnic groups (see Table 2). Compared with Latino and Black patients, Whites had more criteria assessed for alcohol disorder (26.9%) and for drug abuse disorder (44.8%).
Factor analyses to identify latent symptom clusters
The factors included the following: (1) a depression factor (eg, with items on diminished interest or pleasure in activities, weight change, sleep disturbance, fatigue or loss of energy, worthlessness, excessive guilt, diminished ability to think, suicidality, hopelessness, and poor functioning; range of factor scores = [-1.827, 1.465]); (2) an anxiety factor (eg, with items on any mention of anxiety, anxiety in places where escape will be difficult, recurrent and unexpected panic attacks, somatic symptoms of anxiety, marked and persistent fear of objects/situations; obsessions; or being worried, nervous, or tense; range of factor scores = [-0.99, 2.261]); (3) a trauma factor (eg, with items on exposure to a traumatic event, traumatic event persistently experienced or persistent symptoms of increased arousal; range of factor scores = [-0.787, 1.793]); (4) substance history, life stressor, and mental health history factor (eg, lifetime substance abuse treatment, current substance abuse treatment, history of respondent and/or family substance use, availability of substances, history of mental disorder, current psychiatric hospitalization, problems with physical health, relationship conflict, academic failure in school, trouble with police, history of driving under the influence; range of factor scores = [-0.247, 2.481]); (5) family history of abuse/victimization and disability factor that describes other symptoms (eg, respondent or family member victim of a crime, family member perpetrator, history of physical, sexual or emotional abuse; and general role difficulties/disability; range of factor scores = [-0.382, 2.126]); (6) an alcohol use factor (eg, any information on alcohol use, symptoms of alcohol abuse and/or dependence; range of factor scores = [-1.511, 1.705]); and (7) a drug use factor (eg, any information on substance use, symptoms of substance abuse and/or dependence; range of factor scores = [-1.21, 1.804]).
Rates of discussing symptoms and topics by race/ethnicity
In this next set of analyses, we look at what information factors are actually collected during the diagnostic interview and whether the rate of discussion of different topic areas (factors) differs by patients' race/ethnicity. Table 3 displays the rates of discussion for the items related to the latent factors that have factor loadings greater than 0.40 and also the P values for testing differences in discussion rates for each item between Whites and Latinos. Examining the discussion rates, we found that although marked dysfunction with depression was more likely to be discussed with Latinos than with Whites, having depressed mood was more frequently discussed with Whites than with Latinos (P < .01). For anxiety items, any mention of anxiety and obsession was more likely to be discussed with Whites than with Latinos (P < .05), whereas exposure to traumatic events was more likely to be discussed with Latinos than with White patients (P < .05). Most alcohol factor symptoms, drug factor symptoms, substance history, and life stressor factor symptoms were substantially more frequently discussed with White patients than with Latino patients, as shown in Table 3.
Association of race/ethnicity and latent symptom factors with diagnosis
Female sex and discussion of exposure to trauma were associated with an increased likelihood of a depression diagnosis by the clinician doing the intake, whereas discussion of anxiety symptoms decreased the likelihood of a depression diagnosis (Table 4). Ethnicity alone was not associated with the odds of receiving a diagnosis of depression. There was, however, a significant interaction between Latino ethnicity and discussion of family history of abuse/victimization and disability. Latinos who had more discussion of family history of abuse/victimization and disability with the provider were more likely to be given a depression diagnosis than non-Latino Whites who had the same discussions. Symptoms of anxiety (anxiety factor) and being Latino (as compared with White) were found to be associated with increased likelihood of receiving an anxiety diagnosis (Table 4). There were no significant interactions between race/ethnicity and the latent factor scores associated with anxiety disorder diagnosis. Being female and being Latino were associated with a decreased risk of substance use diagnosis, whereas discussing symptoms of the drug abuse/dependence factors was related to increased odds of receiving a substance use disorder diagnosis. Because of low event rates, we were unable to examine the race/ethnicity and latent factor interactions for diagnoses of substance use disorders.
Discussion
Our findings support previous work23,24 suggesting that factors accounting for differences in diagnosis of minority patients have to do with differences in the availability of patient information (information variance); differences in how diagnostic criteria are applied (criterion variance); the use of structured interviewing procedures versus unstructured interviewing procedures (procedural variance); and differences in interpretation of symptom probes because of culture or language (culture/language variance). Diagnostic criteria seldom appear to be comprehensively assessed in the intake process, partly because the clinician must address the multiple demands of intake (eg, establishing therapeutic alliance, treatment planning, understanding the patient). This is consistent with work by Krupinski and Tiller,25 who found that only 28 percent of cases of depression had been evaluated in clinical practice as having sufficient symptoms to meet criteria for DSM-IV major depressive disorder.
Our findings suggest that most clinicians use general information of patients mentioning depression, anxiety, or substance use to identify the presence of these disorders. This brings forth the problem of missing information and how clinicians base their judgments on the most generalized statement of illness, exposure to trauma, and history of family victimization or abuse. Our results show that differential assessment of certain symptoms is observed depending on the race/ethnicity of patients. Even with similar information collected during the clinical intake, such as history of victimization or abuse, clinicians will weight the information differently to assign a diagnosis depending on the race/ethnicity of the patient. Differential discussion of symptom areas and differential weighting of the same diagnostic information, depending on patients' ethnicity, can lead to differential diagnosis and increased likelihood of diagnostic bias.
The analyses of concordance of diagnosis between the clinician coder and the intake clinician suggest that in the absence of more detailed symptom information for depression or anxiety disorders, there is increased likelihood that two clinicians will arrive at different diagnoses. However, such is not the case for substance disorders. This seems to imply that lack of information will not necessarily lead to a differential diagnosis; rather, its effect might depend on the diagnosis.
This problem of missing information might best be understood by clinicians using a system, whereby they begin with the assumption that all cases coming to treatment are likely to have depression, anxiety, or substance disorders. If the patient reports depression or anxiety, for example, the assumption of the clinician is confirmed and additional information is sought regarding past or current history of treatment of the disorder or family psychiatric history. With this limited information, the clinician can optimize the use of time for diagnostic purposes, while allowing sufficient time for getting to know the patient and establishing rapport in the clinical encounter. Nonetheless, as can be seen, clinicians confront the problem of missing information (particularly for anxiety and substance disorders) and make assumptions regarding whether data presented are sufficient to account for a diagnosis. We found ethnicity to be a modifying factor of the association between symptom reports and likelihood of a depression diagnosis, leading to diagnostic bias.
Serious consideration should be given to the time constraint of the initial interview. Alternative models such as devoting significant time to complete a comprehensive assessment while allowing sufficient time to develop rapport should be considered. This might require having ancillary staff that could help patients confidentially answer a brief symptom assessment, do rapid coding, and flag areas for in-depth assessment by the clinician. Preparing the patient for reviewing the brief diagnostic assessment before the conclusion of the intake process and after the patient has a chance to tell their story might help fulfill patients' and providers' mutual goals, while reducing diagnostic bias. Providers should be given information on how diagnostic bias may play a role in their decision making. Our next step is to test different ways of blending the engagement and assessment goals during clinical intake to be able to provide specific recommendations of how to decrease diagnostic bias. We are also developing a brief symptom measure to efficiently assess diagnosis and having providers suggest their own recommendations for how to intervene with diagnostic bias.
REFERENCES