
A Ten-Year Review of the Validity and Clinical Utility of Depression Screening

Published Online: https://doi.org/10.1176/ps.49.1.55

Abstract

OBJECTIVE: Use of depression screening instruments in primary care is controversial. The authors reviewed research studies published since the development of national practice guidelines to determine whether new evidence might favor screening. The review focused on evidence related to the validity and clinical utility of depression screening instruments. METHODS: Silver Platter MEDLINE was searched for English-language studies of depression screening instruments published between 1986 and 1995. Studies were classified by type: reviews of studies, outcome studies, or validation studies. RESULTS AND CONCLUSIONS: Fifty-nine studies met criteria for review. Validation studies were the most frequent type (39 studies) and were subclassified according to population, type of comparison, and analytical method. These studies documented the validity of screening instruments compared with formal criteria and demonstrated consistently better performance for systematic approaches compared with clinical impressions. Thirteen studies were reviews; those reviewing evidence for effectiveness disagreed in their conclusions. Only seven outcome studies related to depression screening instruments were found, and none showed measurable benefit in a screened population. Several studies showed that very brief instruments performed about as well as longer, well-validated questionnaires for screening in general populations.

Researchers, governmental agencies, payers for health care, and others have noted the high prevalence, morbidity, mortality, and costs associated with depressive disorders (1-3). The Epidemiologic Catchment Area (ECA) survey reported a 2.3 percent one-month prevalence rate for a major depressive episode in the community at large (4). Similarly, the Agency for Health Care Policy and Research reported a point prevalence of major depressive disorder of 2.3 to 3.2 percent for men and 4.5 to 9.3 percent for women in general community studies and 4.8 to 8.6 percent for both sexes combined in outpatient primary care settings (5). It also reported a lifetime risk for major depressive disorder of 7 to 12 percent in men and 20 to 25 percent in women.

Katon and Schulberg (6) cited studies in which major depression was estimated to occur in 2 to 4 percent of persons in the community, 5 to 10 percent of primary care patients, and 10 to 14 percent of medical inpatients. In each of these settings, Katon and Schulberg estimated that there are “two to three times as many persons with depressive symptoms that fall short of major depression criteria.” Koenig and colleagues (7) reported rates of major depressive disorder in hospitalized medically ill men of 22.4 percent for those under age 40 and 13.5 percent for those age 70 or over; they reported rates of minor depression of 18.1 percent and 29.2 percent, respectively, for the two age groups.

As described more fully below, numerous authorities have indicated that evidence is insufficient to recommend screening for depression in primary care settings, despite the high prevalence of major depressive disorder in general populations. This paper reviews literature from the last ten years on depression screening instruments to determine whether evidence favoring screening has been found.

Background

When broadly defined, depression has a greater impact on morbidity and mortality than its well-established relationship with suicide and suicide attempts would suggest. Depressive symptoms are associated with increases in utilization of and costs for care in health maintenance organizations (8,9), as well as longer lengths of stay and greater rates of rehospitalization (10).

Morbidity and mortality

Wells and coworkers (11) reported that among depressed patients, deterioration in function associated with depressive symptoms was at least as severe as, and added to, functional decline associated with eight chronic medical conditions. Among 2,980 ECA study participants, Broadhead and associates (12) found a 4.78 relative risk of disability in persons with major depression and a 1.55 relative risk in persons with minor depression and mood disturbance but not major depression. They noted that individuals with minor depression had 51 percent more disability days in the community than persons with major depression because of the greater prevalence of the former (12).

In a study of 2,393 subjects from the Los Angeles ECA site, Judd and colleagues (13) concluded that “significantly more people with subsyndromal depressive symptoms or major depression reported impairment in eight of ten functional domains than did subjects with no disorder.” More recently, Murray and Lopez (14) used disability-adjusted life years to show that depression is now fourth on a list of diseases that shorten life or cause disabilities in the entire world population.

Other researchers have reported increased mortality and specific disease morbidity associated with depressive disorders. In a 27-year follow-up epidemiological study that used multivariate analysis, Barefoot and Schroll (15) noted a 59 percent increased risk of death and a 71 percent increased risk of myocardial infarction for patients with high scores on depressive symptoms on a validated 40-item psychological assessment instrument. These findings remained unchanged even after the analysis controlled for risk factors and signs of disease at baseline. Morris and coworkers (16) reported that 53 percent of patients they identified as having either major or minor depression approximately two weeks after a stroke died over the course of a ten-year follow-up. The odds ratio for death among depressed patients was 3.4, compared with nondepressed patients, independent of age, sex, social class, type of stroke, lesion location, and level of social functioning. Koenig and associates (17) reported that depression increased mortality independent of severity of physical illness.

These and other studies clearly indicate that depressive symptoms have a potentially significant impact on patients' functioning and outcomes and affect rates of utilization of health services. What may not be as clear is whether identification and treatment of depressive symptoms result in improved outcomes or lower utilization and cost in the general medical sector.

Benefits of depression treatment

Research in defined populations of mental health patients documents the efficacy of currently available treatments for depression. Both pharmacotherapy (18-20) and specific forms of psychotherapy (21) reduce depressive symptoms both in the acute stage of illness (22) and over the long term (23,24).

Fewer studies have focused on depression in the primary care setting. Wells and Sturm (25) reported that appropriate treatment improved patients' functioning and outcome. They also showed potential cost-effectiveness when depressive disorders of primary care patients are adequately recognized and treated. Von Korff and coworkers (26) showed that improvement in depressive symptoms was associated with reduced disability days and disability scores. In their study, depression and disability improved simultaneously.

Many fewer individuals with depression in primary care settings are identified and treated than could benefit from appropriate care (7,27,28). Spitzer and associates (28) investigated the Primary Care Evaluation of Mental Disorders (PRIME-MD) system, a structured screening and diagnostic system for mental disorders in primary care settings. They found that “48 percent of 287 patients with a PRIME-MD diagnosis who were somewhat or fairly well known to their physicians had not been recognized to have that diagnosis before the PRIME-MD evaluation.”

Recognition alone is not sufficient (29). Several researchers have reported improved rates of treatment by using screening instruments and other approaches to standardize care. In a randomized controlled trial among primary care patients with major and minor depression, Katon and coworkers (30) showed that a collaborative management approach in primary care settings improved patients' satisfaction with care and depression outcomes among patients with major but not minor depression. Because this model is labor intensive, practitioners and health care systems may desire models for care that are as effective but less expensive. Use of a standard screening instrument might be a way of reducing costs of diagnosis and monitoring.

The utility of depressive screening instruments in primary care settings is controversial. The U.S. Preventive Services Task Force (31) found “insufficient evidence to recommend for or against the routine use of standardized questionnaires to screen for depression in asymptomatic primary care patients.” Similar conclusions have been reached by other authorities (5,32). Full screens such as the Beck Depression Inventory (33) and the Hamilton Rating Scale for Depression (34) may be unfamiliar to non-mental-health professionals and more time intensive to administer and score. Shorter instruments could mitigate this problem.

Rationale for the literature review

In late 1995 and early 1996, as part of a Department of Veterans Affairs (VA) effort to develop and implement national practice guidelines for the recognition and management of major depressive disorder, we reviewed current literature on depression screening instruments. Members of the expert panels who were leading the VA's effort had expressed interest in screening for depression, despite known limitations of screening instruments. Two reasons were advanced: first, that depression is more prevalent among VA patients than in the general population, and second, that research done since the publication of the national practice guidelines (5) has favored screening. This review addresses the second point by presenting evidence from scientific literature published during the past ten years.

Materials and methods

We searched the published medical literature using Silver Platter MEDLINE (35). Articles for inclusion and primary references were identified from 1986 through 1995 MEDLINE files of articles published in the English language. Key words and the National Library of Medicine's subject headings included Beck Depression Inventory, Carroll Rating, CES-D (Center for Epidemiologic Studies Depression Scale), depression, depression—diagnosis, evaluation, General Health Questionnaire, Geriatric Depression Scale, Hamilton Rating Scale, Hopkins Symptom, IDS (Inventory for Depressive Symptomatology), mass screening, psychiatric status rating scales, and Zung Depression Scale.

We also used references obtained from bibliographies in review articles generated by the MEDLINE searches, published guidelines, and American Psychiatric Association publications. Articles were screened to determine if they were one of three types of publication: reviews of studies of screening instruments for depression; studies of any outcome of use of screening instruments, with “outcome” including health status change, therapeutic intervention, recognition of depression by health care providers, or health services utilization; and studies that validated instruments by direct comparison with other instruments used with the same subjects.

We specifically excluded articles that addressed only the reliability of instruments and those in which the only measure of validity was comparative prevalence estimates in similar populations. Although many instruments have been translated into foreign languages, we elected to use only articles about studies of English-speaking subjects.

For each type of study, slightly different information was abstracted from the selected articles. For the review articles, the abstracted elements were the instruments studied, the comparisons made, and the conclusions. For the validation studies, the elements were the instruments evaluated, the criterion measure of depression (if stated), a description of the population in which the instrument was tested, and the main statistical findings (area under the receiver operating curve, sensitivity, specificity, predictive value, kappa, and correlation coefficients). For outcome studies, the elements were the outcomes examined, the measures used, a description of the population in which the instrument was tested, and the results.
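Kappa, one of the statistics abstracted from the validation studies, expresses agreement between two classifications after subtracting the agreement expected by chance. The following is a minimal sketch of that calculation for a two-by-two table. The function name `cohens_kappa` and the counts are invented for illustration and are not taken from any study in this review.

```python
def cohens_kappa(both_pos, inst_only, crit_only, both_neg):
    """Cohen's kappa for a 2x2 table of instrument versus criterion results.

    both_pos  -- instrument positive, criterion positive
    inst_only -- instrument positive, criterion negative
    crit_only -- instrument negative, criterion positive
    both_neg  -- instrument negative, criterion negative
    """
    n = both_pos + inst_only + crit_only + both_neg
    # Proportion of subjects on whom the two classifications agree.
    observed = (both_pos + both_neg) / n
    # Agreement expected by chance, from the marginal totals.
    expected = (
        (both_pos + inst_only) * (both_pos + crit_only)
        + (crit_only + both_neg) * (inst_only + both_neg)
    ) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical counts for 100 subjects (illustration only).
kappa = cohens_kappa(both_pos=20, inst_only=10, crit_only=5, both_neg=65)
print(f"kappa = {kappa:.3f}")
```

With these invented counts, raw agreement is 85 percent, but 60 percent agreement would be expected by chance alone, so kappa is well below the raw figure.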

Results

Fifty-nine articles met criteria for inclusion. A complete list of the articles and a 25-page evidence table describing the studies and summarizing their findings is available from the first author. Also available is a companion glossary of the instruments reviewed in the articles. Thirty-nine of the 59 articles (66 percent) were validation studies, 13 (22 percent) were reviews, and seven (12 percent) were outcome studies.

Of more than 40 instruments mentioned in the articles, only five were mentioned in 20 percent or more of the studies. They were the Geriatric Depression Scale (GDS) (36), mentioned in 19 articles; the Beck Depression Inventory (BDI), mentioned in 16 articles; the General Health Questionnaire (GHQ) (37) and the short version of the Zung Depression Scale (SDS) (38), both mentioned in 15 articles; and the Center for Epidemiologic Studies Depression Scale (CES-D) (39), mentioned in 12 articles. The BDI, GDS, and SDS are depression-specific self-administered instruments. The GDS is intended for older persons. The CES-D is a subscale of a larger, population-based research screening tool. The GHQ is not specific for depression.

A rich variety of other tools was formally studied less frequently. They included several specifically intended for use in primary care settings, such as the Symptom Driven Diagnostic System for Primary Care (40) and the PRIME-MD (28).

Review articles

The 13 literature reviews were of three general types: users' guides, which consisted largely of expert opinion about the strengths and weaknesses of different instruments; quantitative comparisons and meta-analyses, which summarized studies that included statistical measures of validity against specific criterion instruments; and reviews of evidence, which examined whether using screening instruments produced favorable outcomes. The articles by Applegate and associates (41), Gallagher (42), Leserman and Koch (43), Kavan and colleagues (44), and Van Gorp and Cummings (45) typify users' guides. Leserman and Koch made a unique contribution by summarizing the studies showing instruments' sensitivity to change with treatment of depression.

Quantitative comparisons varied in sophistication. Mulrow and her co-investigators (46) used a computerized literature search with defined criteria to identify studies systematically, focusing on research comparing instruments to criterion standards in primary care settings. Studies were evaluated based on whether the criterion assessment of patients was independent of screening and whether a large proportion of subjects were both screened and diagnosed. The review includes a meta-analysis of sensitivities and specificities of the instruments included.

The review by Coulehan and colleagues (47) produced similar results to that by Mulrow, although with less rigorous analysis of the original research. Sensitivity and specificity measures cited in Coulehan's review appeared to be higher than those in Mulrow's review. Both reviews examined the BDI, CES-D, GHQ, and SDS.

Clark and Watson (48) explored the reasons depression screening instruments often lack specificity. Using meta-analytic techniques, the authors identified a nonspecific distress factor that is present to some degree in measures of both depression and anxiety and that confounds screening scales.

The reviews of evidence for effectiveness disagreed in their conclusions. Feightner and Worrall (49) identified only four studies of effectiveness of early intervention using screening and concluded that instruments have been shown only to increase detection of depression, with no documented impact on outcome. These authors advised caution in screening general populations of primary care patients for depression. Zung's analysis (50) did not include effectiveness studies but did cover six studies demonstrating that offering physicians knowledge of the results of depression screening tests increased their recognition of depression. Zung suggested that depression screening instruments be used as a “depression thermometer,” monitoring a patient's mental state in the same way a thermometer tracks a physical sign of health status.

Validation studies

There is no obvious single classification system to describe the mixture of 39 papers identified under the general heading of validation studies. Several classification approaches may be informative, although not necessarily exhaustive. Three are described here.

One way of classifying the studies is by type of comparison. Two major groups were noted—those comparing an instrument to a criterion measure, such as a DSM-III diagnosis (27 studies), and those comparing clinical judgment or diagnosis to a criterion measure (seven studies). The few studies that did neither were wide ranging, from those comparing an instrument to a subset (for example, Andresen and Malmgren [51]); an instrument to itself (for example, Burrows and associates [52]); or an instrument to measures that are not depression specific (for example, Wilkinson and Barczak [53]).

Studies comparing an instrument to a criterion measure test validity in the classic sense—if the instrument is in agreement with the criterion measure, it can be said to measure what it purports to measure, and therefore is valid, at least under the conditions tested. The circumstances, population, and definition of “agreement” are critical to such validity studies. As described below, the studies differed markedly in each of these areas. For example, five of the studies that compared the instrument and a criterion dealt with general populations. Of those, four reported sensitivity and specificity as the measure of agreement, while one reported area under the receiver operating curve, a summary measure of screening instrument performance in comparison to a criterion.
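As a minimal sketch of how agreement with a criterion is commonly summarized, the following Python computes sensitivity and specificity for one cutoff score. The scores, diagnoses, cutoff, and the function name `sensitivity_specificity` are all invented for illustration; they do not come from any of the reviewed studies.

```python
def sensitivity_specificity(scores, criterion, cutoff):
    """Compare screen-positive status (score >= cutoff) with a criterion diagnosis.

    scores    -- instrument scores, one per subject
    criterion -- True where the criterion measure diagnosed depression
    cutoff    -- score at or above which the screen counts as positive
    """
    tp = fn = tn = fp = 0
    for score, depressed in zip(scores, criterion):
        positive = score >= cutoff
        if depressed and positive:
            tp += 1  # true positive
        elif depressed:
            fn += 1  # false negative: depressed but screened negative
        elif positive:
            fp += 1  # false positive: not depressed but screened positive
        else:
            tn += 1  # true negative
    sensitivity = tp / (tp + fn)  # share of depressed subjects detected
    specificity = tn / (tn + fp)  # share of nondepressed subjects cleared
    return sensitivity, specificity


# Hypothetical scores and criterion diagnoses for ten subjects.
scores = [3, 8, 12, 15, 5, 20, 9, 11, 2, 17]
criterion = [False, False, True, True, False, True, True, False, False, True]

sens, spec = sensitivity_specificity(scores, criterion, cutoff=10)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Both values depend on the cutoff chosen, which is one reason results from studies using different cutoffs are difficult to compare directly.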

The studies of clinicians' judgment were a diverse group. Three studies—by Pond and associates (54), Coyne and colleagues (55), and Gerber and coworkers (56)—documented relatively poor performance of nonpsychiatrist physicians when their judgments were compared with criterion standards. Spitzer and colleagues (28) found high specificity but only moderate sensitivity of primary care physicians in detecting depression with an instrument-guided inquiry.

Studies may also be classified according to the population used. Such classification is important because instruments may be optimized for specific groups or because of concern that performance may suffer in particular groups. Twenty-two studies dealt with elderly patients, four specifically comparing cognitively normal groups and impaired groups. Nine focused on more general populations, mostly patients in ambulatory settings. Considered as a group, these papers suggested that several screening instruments, including ones not intended for detecting depression or not specific for depression, are nearly equivalent in measured validity against criterion standards. In general populations, clinicians' judgment did not perform as well as standardized approaches, as noted above.

A third way of looking at these studies is by method of analysis. Two statistical approaches appeared to be prevalent. Eleven of the studies presented results as a receiver operating characteristic (ROC) curve, plotting true-positive versus false-positive rates at different cutoff scores of an instrument. These studies often included calculation of the area under the ROC curve as a summary measure of the instrument's performance. Sixteen papers reported sensitivity and specificity of instruments at stated cutoff scores. This approach often included calculation of positive and negative predictive values in the population studied. The remainder, which reported neither ROC curves nor sensitivity and specificity, employed a broad range of other techniques such as kappa scores, correlation coefficients, and rates of agreement.
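The ROC approach can be sketched in a few lines of Python. The example below steps through every observed score as a cutoff, records the resulting false-positive and true-positive rates, and estimates the area under the curve with the trapezoidal rule. The data and the function names `roc_points` and `area_under_curve` are hypothetical, and the reviewed studies may have used other estimation methods.

```python
def roc_points(scores, criterion):
    """Return (false-positive rate, true-positive rate) pairs, one per cutoff."""
    n_pos = sum(criterion)             # subjects depressed by the criterion
    n_neg = len(criterion) - n_pos     # subjects not depressed by the criterion
    points = [(0.0, 0.0)]              # cutoff above every score: no one screens positive
    # Lower the cutoff step by step so both rates can only rise.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, criterion) if d and s >= cutoff)
        fp = sum(1 for s, d in zip(scores, criterion) if not d and s >= cutoff)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under ROC points ordered by false-positive rate."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical scores and criterion diagnoses for ten subjects.
scores = [3, 8, 12, 15, 5, 20, 9, 11, 2, 17]
criterion = [False, False, True, True, False, True, True, False, False, True]

points = roc_points(scores, criterion)
auc = area_under_curve(points)
print(f"area under the ROC curve = {auc:.2f}")
```

An area of 0.5 corresponds to chance performance and 1.0 to perfect separation of depressed from nondepressed subjects, which is why the area serves as a single summary across all cutoffs.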

Extensive discussion of the properties of the different analytic measures is beyond the scope of this review. Bailar and Mosteller (57) have briefly described the basic statistical methods, and Somoza and colleagues (58) presented a detailed explanation of the application of ROC methods to validation of depression screening tests.

Finally, one issue raised in five different studies is the minimum number of questions needed for efficient depression screening. Berwick and colleagues (59) reported that carefully selected short subsets of the Mental Health Inventory (MHI) (60) performed nearly as well as the complete instrument in detecting patients with major depression. Indeed, Berwick's team was able to identify a single item that worked comparably to the entire MHI. Broadhead's group (40) studied a set of just four questions and found sensitivity and specificity comparable to published validation studies of established, and much longer, instruments at their optimum cutoff points. Steer and his coworkers (61) identified two symptoms from the BDI that distinguished between anxious and depressed patients almost as well as the entire instrument.

Rost and associates (62) found that two-item subsets of the Diagnostic Interview Schedule (DIS) had 99 percent negative predictive value in three ECA populations when compared with the full DIS. Wyshak and Barsky (63) tested a single question using both physician and patient ratings and found that its performance was comparable to longer instruments, at least in the special population studied.
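Predictive values, unlike sensitivity and specificity, depend on how common depression is in the population screened. The sketch below applies Bayes' rule to show this. The sensitivity and specificity of 0.80 are assumed for illustration only, and the two prevalence values are chosen from within the community and primary care ranges cited earlier from Katon and Schulberg (6); `predictive_values` is a hypothetical helper name.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)   # P(depressed | screen positive)
    npv = true_neg / (true_neg + false_neg)   # P(not depressed | screen negative)
    return ppv, npv


# Assumed instrument performance (illustration only, not from the review).
SENSITIVITY = 0.80
SPECIFICITY = 0.80

ppv_low, npv_low = predictive_values(SENSITIVITY, SPECIFICITY, prevalence=0.03)
ppv_high, npv_high = predictive_values(SENSITIVITY, SPECIFICITY, prevalence=0.10)

print(f"prevalence 3%:  PPV = {ppv_low:.3f}, NPV = {npv_low:.3f}")
print(f"prevalence 10%: PPV = {ppv_high:.3f}, NPV = {npv_high:.3f}")
```

Under these assumptions the negative predictive value stays high at both prevalence levels while the positive predictive value remains modest, so a high negative predictive value in a low-prevalence population is not by itself strong evidence of instrument accuracy.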

Outcome studies

A well-designed outcome study showing significant benefit to persons screened, identified, and treated for depression would be important evidence in a decision to implement routine depression screening. None of the seven studies in this review met that standard. Most dealt with recognition—whether physicians given results of screening would better recognize depression—and treatment, whether persons screening positive would be more likely to receive treatment for depression.

Results of recognition studies were mixed. Iliffe and colleagues (64) typified this approach, showing that use of screening instruments led to increased recognition of depression in one of two practices studied. Gold and Baraff (65) showed that providing GHQ scores to emergency physicians increased recognition of depression and referral for psychosocial services. Magruder-Habib and colleagues (66) also documented increased recognition and treatment if physicians were provided SDS scores. On the other hand, Shapiro's group (67) found that although screening did result in improved recognition for at least some patients, it did not lead to increased medical management of depression.

Two studies dealt with longer-term effects of screening. Berwick and associates (68) discovered that patients in a health maintenance organization who scored high on the GHQ were more likely to make medical visits in the subsequent year than those with lower scores. Of course, the GHQ is not specific for depression. Magruder-Habib and her co-investigators (69) found that patients who were identified using the SDS and whose physicians were told the scores were more likely to receive antidepressants than patients whose physicians did not receive the scores, but the difference was not statistically significant. They also noted that levels of depressive symptoms did not change over 12 months of follow-up for all patients.

Conclusions

In this literature review, we failed to confirm the hypothesis that studies over the past ten years have found better or conclusive evidence of benefit from screening for depression in general ambulatory populations. The conclusion of the U.S. Preventive Services Task Force is still borne out in the published literature. Few outcome studies related to depression screening instruments exist, and none show that screening leads to measurable benefit in a screened population. This finding does not negate studies showing that treatment of depression improves outcome.

Validation studies published after the time period covered by the formal review have produced similar conclusions about validity of screening instruments, particularly in geriatric populations. Loke and colleagues (70) demonstrated good sensitivity and specificity of two brief instruments compared with a structured diagnostic schedule and found geriatricians' diagnoses to have low sensitivity. Hermann and colleagues (71) showed excellent agreement between a 15-item GDS subset and the Montgomery-Åsberg Depression Rating Scale among geriatric outpatients.

In one outcome study, BDI screening was used to measure the effect of informing physicians of undiagnosed depressed individuals in their practices (72). Patients whose BDI score was disclosed to their physician did no better than control subjects over 12 months of follow-up. If anything, patients with diagnosed depression deteriorated over the follow-up compared with those without such a diagnosis.

Another outcome study published after the period of formal literature review found little impact on three-month postscreening health status when the Symptom Driven Diagnostic System for Primary Care (SDDS-PC) was used for screening (73). Patients who had at least one mental health concern identified by the instrument and whose physicians were given complete SDDS-PC results made fewer specialist visits over the next three months than those whose physicians did not have SDDS-PC data. The study was small (185 patients and 172 controls) and not limited to depression. The results of these two outcome studies would not have affected the conclusion of this review.

This collection of relatively recent studies of depression screening instruments is noteworthy in several respects. First, it does support the notion that depression screening instruments measure more than depression, as argued by Clark and Watson (74). This finding would explain the lack of specificity of even the most respected instruments.

Second, the instruments do work. One can detect clinically significant depression in diverse populations of English-speaking people using one or another of the validated tools. Several are effective among elderly individuals, but severe dementia reduces performance.

Third, several papers suggest that systematic approaches to diagnosis are better than nonsystematic ones, and their authors encourage better training of clinicians. The unfavorable comparisons of clinician judgment with standardized methods would seem to support this view, at least at the stage of diagnosis of depression. Higgins' review (75) of recognition and treatment of mental illness in primary care settings, which was not limited to depression, reached a similar conclusion. Higgins found only one large study, reported in conference proceedings, that showed improvement in measured outcomes over six months of follow-up among patients cared for by physicians with special training in interview techniques compared with patients whose physicians did not receive the training. Research supporting a systematic approach to diagnosis or better physician training in managing psychiatric disorders is not evidence in favor of the adoption of a screening tool in an entire population of patients.

Finally, evidence exists that “less is more” in depression screening. Because short instruments with well-selected questions appear to perform as well as more elaborate ones (for case finding), brevity may be a key feature.

The results of this review helped the VA decide to use a brief screening instrument for depression among its ambulatory patients. This recommendation is part of the VA's national guidelines for major depressive disorder, which are now being implemented systemwide. Although findings about the benefit to patient outcomes of brief screening remain controversial, the evidence indicates that brief screening instruments perform as well as longer instruments, can be administered with minimal personnel cost, and may lead to improvement in overall patient health. Further research is planned to determine whether these conclusions tilt the cost-benefit balance in favor of universal screening. Evaluation of the impact of more uniform recognition and treatment of depressive disorders is planned through the VA's external peer review program.

Acknowledgments

The review was supported by contract V101(93)P-1369 between the U.S. Department of Veterans Affairs Office of Performance and Quality and the West Virginia Medical Institute. The authors thank Carolyn D. Schade, M.L.S., who performed the computerized literature searches and screened the articles.

Dr. Schade is a medical epidemiologist at the West Virginia Medical Institute, 3001 Chesterfield Place, Charleston, West Virginia 25304. Dr. Jones is assistant chief of the psychiatry service at the Veterans Affairs Medical Center in Salem, Virginia. Dr. Wittlin is section chief of satellite clinics in the psychiatry service at the Veterans Affairs Medical Center in San Francisco.

References

1. Simon GE, Von Korff M, Barlow W: Health care costs of primary care patients with recognized depression. Archives of General Psychiatry 52:850-856, 1995Crossref, MedlineGoogle Scholar

2. McFarland BH: Cost-effectiveness considerations for managed care systems: treating depression in primary care. American Journal of Medicine 97(suppl 6A):47S-58S, 1994Google Scholar

3. Sturm R, Wells KB: How can care for depression become more cost-effective? JAMA 273:51-58, 1995Google Scholar

4. Regier DA, Farmer ME, Rae DS, et al: Comorbidity of mental disorders with alcohol and other drug abuse: results from the Epidemiologic Catchment Area (ECA) study. JAMA 264:2511-2518, 1990Crossref, MedlineGoogle Scholar

5. Rush AJ, Golden WE, Hall GW, et al: Depression in Primary Care, vol 1. Clinical practice guideline 5. AHCPR pub 93-0550. Rockville, Md, Agency for Health Care Policy and Research, Apr 1993Google Scholar

6. Katon W, Schulberg H: Epidemiology of depression in primary care. General Hospital Psychiatry 14:237-247, 1992Crossref, MedlineGoogle Scholar

7. Koenig HG, Meador KG, Shelp F, et al: Major depressive disorder in hospitalized medically ill patients: an examination of young and elderly male veterans. Journal of the American Geriatrics Society 39:881- 890, 1991Crossref, MedlineGoogle Scholar

8. Henk HJ, Katzelnick DJ, Kobak KA, et al: Medical costs attributed to depression among patients with a history of high medical expenses in a health maintenance organization. Archives of General Psychiatry 53:899-904, 1996Crossref, MedlineGoogle Scholar

9. Von Korff M, Ormel J, Katon W, et al: Disability and depression among high utilizers of health care: a longitudinal analysis. Archives of General Psychiatry 49:91-100, 1992Crossref, MedlineGoogle Scholar

10. Saravay SM, Pollack S, Steinberg MD, et al: Four-year follow-up of the influence of psychological comorbidity on medical rehospitalization. American Journal of Psychiatry 153:397-403, 1996LinkGoogle Scholar

11. Wells KB, Stewart A, Hays RD, et al: The functioning and well-being of depressed patients: results from the Medical Outcomes Study. JAMA 262:914-919, 1989

12. Broadhead WE, Blazer DG, George LK, et al: Depression, disability days, and days lost from work in a prospective epidemiologic survey. JAMA 264:2524-2528, 1990

13. Judd LL, Paulus MP, Wells KB, et al: Socioeconomic burden of subsyndromal depressive symptoms and major depression in a sample of the general population. American Journal of Psychiatry 153:1411-1417, 1996

14. Murray CJL, Lopez AD: Evidence-based health policy: lessons from the Global Burden of Disease Study. Science 274:740-743, 1996

15. Barefoot JC, Schroll M: Symptoms of depression, acute myocardial infarction, and total mortality in a community sample. Circulation 93:1976-1980, 1996

16. Morris PLP, Robinson RG, Andrzejewski P, et al: Association of depression with 10-year post-stroke mortality. American Journal of Psychiatry 150:124-129, 1993

17. Koenig HG, Shelp F, Goli V, et al: Survival and health care utilization in elderly medical inpatients with major depression. Journal of the American Geriatrics Society 37:599-606, 1989

18. Depression Guideline Panel: Depression in Primary Care, vol 2: Treatment of Major Depression. Clinical practice guideline 5. AHCPR pub 93-0551. Rockville, Md, Agency for Health Care Policy and Research, Apr 1993

19. Thase ME, Kupfer DJ: Recent developments in the pharmacotherapy of mood disorders. Journal of Consulting and Clinical Psychology 64:646-659, 1996

20. Work Group on Major Depressive Disorder: Practice Guideline for Major Depressive Disorder in Adults. American Journal of Psychiatry 150(Apr suppl):1-26, 1993

21. Elkin I, Shea T, Watkins JT, et al: National Institute of Mental Health treatment of depression collaborative research program. Archives of General Psychiatry 46:971-982, 1989

22. Kaplan HI, Sadock BJ (eds): Comprehensive Textbook of Psychiatry, 6th ed. Baltimore, Williams & Wilkins, 1995, pp 1152-1189

23. Kupfer DJ, Frank E, Perel JM, et al: Five-year outcome for maintenance therapies in recurrent depression. Archives of General Psychiatry 49:769-773, 1992

24. Keller MB, Lavori PW, Mueller TL, et al: Time to recovery, chronicity, and levels of psychopathology in major depression: a 5-year prospective follow-up of 431 subjects. Archives of General Psychiatry 49:809-816, 1992

25. Wells KB, Sturm R: Informing the policy process: from efficacy to effectiveness data on pharmacotherapy. Journal of Consulting and Clinical Psychology 64:638-645, 1996

26. Von Korff M, Ormel J, Katon W, et al: Disability and depression among high utilizers of health care: a longitudinal analysis. Archives of General Psychiatry 49:91-100, 1992

27. Sherbourne CD, Wells KB, Hays RD, et al: Subthreshold depression and depressive disorder: clinical characteristics of general medical and mental health specialty outpatients. American Journal of Psychiatry 151:1777-1784, 1994

28. Spitzer RL, Williams JBW, Kroenke K, et al: Utility of a new procedure for diagnosing mental disorders in primary care: the PRIME-MD 1000 study. JAMA 272:1749-1756, 1994

29. Tiemens BG, Ormel J, Simon GE: Occurrence, recognition, and outcome of psychological disorders in primary care. American Journal of Psychiatry 153:636-644, 1996

30. Katon W, Von Korff M, Lin E, et al: Collaborative management to achieve treatment guidelines: impact on depression in primary care. JAMA 273:1026-1031, 1995

31. US Preventive Services Task Force: Guide to Clinical Preventive Services, 2nd ed. Alexandria, Va, International Medical Publishing, 1996, pp 541-546

32. Feightner JW, Worrall G: Early detection of depression by primary care physicians. Canadian Medical Association Journal 142:1215-1220, 1990

33. Beck AT, Ward CH, Mendelson M, et al: An inventory for measuring depression. Archives of General Psychiatry 4:561-571, 1961

34. Hamilton M: Development of a rating scale for primary depressive illness. British Journal of Social and Clinical Psychology 6:278-296, 1967

35. Silver Platter Software, version 3.1. Norwood, Mass, Silver Platter International, 1992

36. Yesavage JA, Brink TL, Rose TL, et al: Development and validation of a geriatric depression screening scale: a preliminary report. Journal of Psychiatric Research 17:37-49, 1983

37. Goldberg DP, Blackwell B: Psychiatric illness in general practice: a detailed study using a new method of case identification. British Medical Journal 1(707):439-443, 1970

38. Zung WW: A self-rating depression scale. Archives of General Psychiatry 12:63-70, 1965

39. Radloff LS: The CES-D Scale. Applied Psychological Measurement 1:385-401, 1977

40. Broadhead WE, Leon AC, Weissman MM, et al: Development and validation of the SDDS-PC screen for multiple mental disorders in primary care. Archives of Family Medicine 4:211-219, 1995

41. Applegate WB, Blass JP, Williams TF: Instruments for the functional assessment of older patients. New England Journal of Medicine 322:1207-1214, 1990

42. Gallagher D: Assessing affect in the elderly. Clinics in Geriatric Medicine 3:65-85, 1987

43. Leserman J, Koch G: Review of self-report depression and anxiety measures. Drug Information Journal 27:537-548, 1993

44. Kavan MG, Pace TM, Ponterotto JG, et al: Screening for depression: use of patient questionnaires. American Family Physician 41:897-904, 1990

45. Van Gorp WG, Cummings JL: Assessment of mood, affect, and personality. Clinics in Geriatric Medicine 5:441-459, 1989

46. Mulrow CD, Williams JW, Gerety MB, et al: Case-finding instruments for depression in primary care settings. Annals of Internal Medicine 122:913-921, 1995

47. Coulehan JL, Schulberg HC, Block MR: The efficiency of depression questionnaires for case finding in primary medical care. Journal of General Internal Medicine 4:541-547, 1989

48. Clark LA, Watson D: Tripartite model of anxiety and depression: psychometric evidence and taxonomic implications. Journal of Abnormal Psychology 100:316-336, 1991

49. Feightner JW, Worrall G: Early detection of depression by primary care physicians. Canadian Medical Association Journal 142:1215-1220, 1990

50. Zung WWK: Role of rating scales in the identification and management of the depressed patient in the primary care setting. Journal of Clinical Psychiatry 51(June suppl):72-76, 1990

51. Andresen EM, Malmgren JA: Screening for depression in well older adults: evaluation of a short form of the CES-D. American Journal of Preventive Medicine 10:77-84, 1994

52. Burrows AB, Satlin A, Salzman C, et al: Depression in a long-term care facility: clinical features and discordance between nursing assessment and patient interviews. Journal of the American Geriatrics Society 43:1118-1122, 1995

53. Wilkinson MJB, Barczak P: Psychiatric screening in general practice: comparison of the General Health Questionnaire and the Hospital Anxiety Depression Scale. Journal of the Royal College of General Practitioners 38:311-313, 1988

54. Pond CD, Mant A, Bridges-Webb C: Recognition of depression in the elderly: a comparison of general practitioner opinions and the Geriatric Depression Scale. Family Practice 7:190-194, 1990

55. Coyne JC, Schwenk TL, Smolinski M: Recognizing depression: a comparison of family physician ratings, self report, and interview measures. Journal of the American Board of Family Practitioners 4:207-215, 1991

56. Gerber PD, Barrett J, Barrett J, et al: Recognition of depression by internists in primary care. Journal of General Internal Medicine 4:7-13, 1989

57. Bailar JC, Mosteller F: Medical technology assessment, in Medical Uses of Statistics, 2nd ed. Edited by Bailar JC, Mosteller F. Boston, NEJM Books, 1992

58. Somoza E, Steer RA, Beck AT, et al: Differentiating major depression and panic disorders by self-report and clinical rating scales: ROC analysis and information theory. Behaviour Research and Therapy 32:771-782, 1994

59. Berwick DM, Murphy JM, Goldman PA, et al: Performance of a five-item mental health screening test. Medical Care 29:169-176, 1991

60. Ware JE, Johnson SA, Davies-Avery A, et al: Conceptualization and Measurement of Health for Adults in the Health Insurance Experiment, vol 3. Santa Monica, Calif, Rand Corp, 1979

61. Steer RA, Beck AT, Riskind JH, et al: Differentiation of depressive disorders from generalized anxiety by the Beck Depression Inventory. Journal of Clinical Psychology 42:475-478, 1986

62. Rost K, Burnham A, Smith GR: Development of screeners for depressive disorders and substance abuse history. Medical Care 31:189-200, 1993

63. Wyshak G, Barsky AJ: Relationship between patient self-ratings and physician ratings of general health, depression, and anxiety. Archives of Family Medicine 3:419-424, 1994

64. Iliffe S, Mitchley S, Gould M, et al: Evaluation of the use of brief screening instruments for dementia, depression, and problem drinking among elderly people in general practice. British Journal of General Practice 44:503-507, 1994

65. Gold I, Baraff LJ: Psychiatric screening in the emergency department: its effect on physician behavior. Annals of Emergency Medicine 18:875-880, 1989

66. Magruder-Habib K, Zung WWK, Feussner JR: Improving physicians' recognition and treatment of depression in general medical care: results from a randomized clinical trial. Medical Care 28:239-250, 1990

67. Shapiro S, German PS, Skinner EA, et al: An experiment to change detection and management of mental morbidity in primary care. Medical Care 25:327-339, 1987

68. Berwick DM, Budman S, Damico-White J, et al: Assessment of psychological morbidity in primary care: explorations with the General Health Questionnaire. Journal of Chronic Diseases 40(suppl 1):71S-79S, 1987

69. Magruder-Habib K, Zung WWK, Feussner JR, et al: Management of general medical patients with symptoms of depression. General Hospital Psychiatry 11:201-206, 1989

70. Loke B, Nicklakson F, Burvill P: Screening for depression: clinical validation of geriatricians' diagnosis, the Brief Assessment Schedule depression cards and the 5-item version of the Symptom Check List among non-demented geriatric inpatients. International Journal of Geriatric Psychiatry 11:461-465, 1996

71. Hermann N, Mittmann N, Silver IL, et al: A validation study of the Geriatric Depression Scale short form. International Journal of Geriatric Psychiatry 11:457-460, 1996

72. Dowrick C, Buchan I: Twelve-month outcome of depression in general practice: does detection or disclosure make a difference? British Medical Journal 311:1274-1276, 1995

73. Reifler DR, Kessler HS, Bernhard EJ, et al: Impact of screening for mental health concerns on health service utilization and functional status in primary care patients. Archives of Internal Medicine 156:2593-2599, 1996

74. Clark LA, Watson D: Tripartite model of anxiety and depression: psychometric evidence and taxonomic implications. Journal of Abnormal Psychology 100:316-336, 1991

75. Higgins ES: A review of unrecognized mental illness in primary care. Archives of Family Medicine 3:908-917, 1994