Elsevier

Sleep Medicine Reviews

Volume 18, Issue 4, August 2014, Pages 321-331
Clinical review
Evaluation of the measurement properties of the Epworth sleepiness scale: A systematic review

https://doi.org/10.1016/j.smrv.2013.08.002

Summary

Objective

To examine published evidence on the psychometric properties of the Epworth sleepiness scale (ESS) for describing the level of daytime sleepiness (DS) in adults.

Methods

Articles were located on MEDLINE and EMBASE. Psychometric properties were appraised using the COnsensus-based Standards for the selection of health status Measurement Instruments (COSMIN) checklist.

Results

We found 35 studies evaluating psychometric properties of the ESS in adults. Of these, 27 examined construct validity, 14 known-group validity, eight internal consistency and four test–retest reliability. Study quality ranged from excellent to poor, with the majority being fair. Internal consistency, as measured by Cronbach's alpha, was good (0.73–0.86). There is little available evidence on test–retest reliability. Pooled correlations of the ESS with other constructs varied from moderate (the maintenance of wakefulness test; ρ = −0.43) to weak (the multiple sleep latency test, ρ = −0.27; sleep apnea-related variables, ρ from 0.11 to 0.23). Although ESS scores varied significantly across groups of subjects with known differences in DS, not all differences were clinically important.

Conclusion

There have been relatively few high-quality studies of the psychometric properties of the ESS. The internal consistency of the ESS suggests that this instrument can be recommended for group-level but not individual-level comparisons. Correlations with other measures of DS were stronger than those with sleep apnea-related or general health measures, but still lower than expected. Further studies of the test–retest reliability of the ESS are required.

Introduction

Daytime sleepiness (DS) is a complicated clinical problem, often indicating a serious underlying physiological abnormality [1]. DS is associated with higher mortality [2], [3], an increased risk for motor-vehicle crashes [4] and work-related accidents, and a higher prevalence of co-morbid conditions such as diabetes, myocardial infarction, and stroke [5], [6], [7].

Accurately estimating an individual's level of DS is important, both to better understand the factors associated with it and to estimate its health and social consequences. However, the wide spectrum of definitions currently associated with sleepiness complicates its quantification [8]. The International classification of sleep disorders – second edition (2005) [9] defines DS as a difficulty in maintaining the alert awake state during the wake phase of the 24-h sleep–wake cycle. DS has been operationalized as drowsiness, as propensity to sleep [10], or in terms of the impact that sleepiness has on various aspects of daily life [8]. The most often used operational definition of sleepiness is the speed, ease or likelihood of falling asleep as opposed to remaining awake; it is represented by instantaneous sleep propensity (ISP), situational sleep propensity (SSP) and average sleep propensity (ASP) [11], [12]. The ISP describes a person's sleep propensity over the preceding few minutes in one particular situation, at one particular time. Multiple ISP values for one situation combine to form the SSP, the person's usual sleepiness in that particular situation. In the same way, multiple SSP results for varying situations form the ASP, a person's general level of sleepiness across a variety of situations commonly encountered in daily life. ASP relates only to the propensity to sleep and measures only that component of DS which persists from week to week in a given subject. As such, ASP differs from feeling tired, sleepy or drowsy in particular situations and does not measure the impact of sleepiness on aspects of daily life [13].

Different methods have been proposed for measuring sleepiness, and they can be classified according to their operational definitions [12], [14]. The Epworth sleepiness scale (ESS) is the measure of sleepiness most commonly used in sleep research and clinical settings. A PubMed search conducted on July 22nd, 2013 using “Epworth sleepiness scale” as a search term returned 1868 articles in total: 163 in 2010, rising to 208 in 2011 and 238 in 2012. By contrast, a search for “Stanford sleepiness scale” returned only 259 articles in total, 15 of them in 2012. The ESS is the only English-language tool available to measure a person's ASP in daily life. This distinguishes it from the multiple sleep latency test (MSLT) [15] and the maintenance of wakefulness test (MWT) [16], which measure a person's SSP. Unlike the Karolinska [17] and the Stanford [18] sleepiness scales, the ESS does not measure subjective feelings of drowsiness.

The ESS was developed in 1991 using data from healthy subjects and patients with a variety of sleep disorders to describe “the general level of [DS], as distinct from feelings of sleepiness at a particular time” [19]. The ESS asks people to rate, on a four-point scale, their usual chances of falling asleep in eight different situations, chosen to represent the different levels of “somnificity” that most people encounter as part of their daily lives [19]. Somnificity was defined as “the general characteristic of a posture, activity and situation that reflects its capacity to facilitate sleep-onset in the majority of subjects” [20]. The ESS is inexpensive and easy to administer, complete and score. ESS item-scores are recorded as a number from 0 to 3 written in a single box for each item [19]. The total ESS score is the sum of item-scores and ranges between 0 and 24; the higher the score, the higher the person's level of DS. From the sleep propensity viewpoint, each of the eight ESS item-scores represents a different subjectively-reported SSP [21]. The total score gives a subjectively-reported ASP across the eight ESS situations [20].
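The scoring rule described above (eight item-scores, each from 0 to 3, summed to a total between 0 and 24) can be illustrated with a minimal sketch. The function name `ess_total_score` and the example responses are hypothetical and chosen only for illustration; the sketch assumes complete responses and does not address how missing items should be handled.

```python
from typing import Sequence

# Constants taken from the scale description in the text:
# eight situations, each rated on a four-point scale (0 to 3).
ESS_ITEM_COUNT = 8
ESS_ITEM_MIN = 0
ESS_ITEM_MAX = 3


def ess_total_score(item_scores: Sequence[int]) -> int:
    """Return the total ESS score (0 to 24) as the sum of the eight item-scores.

    Hypothetical helper: raises ValueError for a wrong number of items or an
    item-score that is not an integer in the range 0 to 3.
    """
    if len(item_scores) != ESS_ITEM_COUNT:
        raise ValueError(f"expected {ESS_ITEM_COUNT} item-scores, got {len(item_scores)}")
    for score in item_scores:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(score, bool) or not isinstance(score, int):
            raise ValueError(f"item-score must be an integer, got {score!r}")
        if not ESS_ITEM_MIN <= score <= ESS_ITEM_MAX:
            raise ValueError(f"item-score out of range 0 to 3: {score}")
    return sum(item_scores)


# Made-up responses for one respondent (not data from any study).
example_responses = [1, 2, 0, 3, 1, 0, 2, 1]
print(ess_total_score(example_responses))  # prints 10
```

In the sleep propensity terms used above, each element of the list corresponds to one subjectively reported SSP and the returned sum to the subjectively reported ASP.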

Given its widespread use in sleep research, it is surprising that there has not been a comprehensive review of the measurement properties of the ESS. Individual papers have examined various aspects of its psychometric properties, but these studies have not been examined together. The purpose of this paper is to fill this evidence gap by reviewing the available research on the measurement properties of the ESS for describing DS in adults. In doing so, we hope to inform researchers who are deciding whether to use the ESS to describe DS in future studies.

Section snippets

Search strategy

A broad search of the literature was performed incorporating both electronic and manual components. Two electronic databases, MEDLINE and EMBASE, were searched. Table 1 displays the terms for psychometric properties used in the search. Finally, we carried out manual searches of the references of all articles deemed relevant.

Selection criteria

Searches were limited to studies in adult populations and English language articles published between 1991 (when the scale was first reported) and June 2012. We included

Literature search

From 462 papers found through the online search, 46 describing or commenting on the psychometric properties of the ESS were selected by scanning their titles and abstracts (Fig. 1). By applying our selection criteria through full text review, 35 primary articles were selected to form the basis of this review: eight studies evaluated internal consistency, four evaluated test–retest reliability, 27 evaluated convergent construct validity, and 14 evaluated known-group validity. The general

Discussion

The comprehensive literature search identified 35 studies that evaluated psychometric properties of the ESS in an adult population. The bulk of these studies examined construct validity; eight evaluated internal consistency and only four examined test–retest reliability. The study quality ranged from excellent to poor, with the majority being fair. We discuss the results below under the domains of reliability and construct validity.
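Internal consistency in the reviewed studies was reported as Cronbach's alpha. As a rough illustration of what that statistic measures, the sketch below computes alpha from a respondents-by-items table using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the toy data are hypothetical, and the use of sample variances is an assumption; individual studies may have used different software or conventions.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a table with one row per respondent and one column per item.

    Hypothetical helper using sample variances (n - 1 denominator).
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("all respondents must answer the same number of items")

    # Variance of each item (column) across respondents.
    item_variances = [
        variance([row[j] for row in responses]) for j in range(n_items)
    ]
    # Variance of the total score (row sum) across respondents.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Toy data (made up): two items that always agree, so alpha is exactly 1.
toy_responses = [[0, 0], [1, 1], [2, 2]]
print(cronbach_alpha(toy_responses))  # prints 1.0
```

For the ESS, each row would hold one respondent's eight item-scores, so the table would have eight columns.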

Conclusion

Although the ESS is widely used in sleep research and clinical settings, overall it has only modest measurement properties, and there have been relatively few high-quality studies of its psychometric properties.

The internal consistency of the ESS suggests that this instrument can be used for group-level comparisons, but caution is recommended when using the ESS for individual-level comparisons. Questions remain about the unidimensionality of the ESS, particularly for items that may occur

Acknowledgments

The first author, Dr. Tetyana Kendzerska, is supported by a 2011/2012 Ontario Graduate Scholarship, a 2011/2012 Hunter Graduate Scholarship (University of Toronto) and a 2012/2013 Doctoral Research Award from the Canadian Institutes of Health Research. Dr. Peter Smith is supported by a Discovery Early Career Research Award from the Australian Research Council.

References (86)

  • R.D. Chervin et al. Comparison of the results of the Epworth Sleepiness Scale and the Multiple Sleep Latency Test. J Psychosom Res (1997)
  • E.M. Weaver et al. Polysomnography indexes are discordant with quality of life, symptoms, and reaction times in sleep apnea patients. Otolaryngol Head Neck Surg (2005)
  • M. Skibitsky et al. Can standardized sleep questionnaires be used to identify excessive daytime sleeping in older post-acute rehabilitation patients? J Am Med Dir Assoc (2012)
  • A. Sharafkhaneh et al. Contextual factors and perceived self-reported sleepiness: a preliminary report. Sleep Med (2003)
  • R. Cluydts et al. Daytime sleepiness and its evaluation. Sleep Med Rev (2002)
  • M. Johns. Rethinking the assessment of sleepiness. Sleep Med Rev (1998)
  • T.J. Walter et al. Comparison of Epworth Sleepiness Scale scores by patients with obstructive sleep apnea and their bed partners. Sleep Med (2002)
  • A.T. Mulgrew et al. Residual sleep apnea on polysomnography after 3 months of CPAP therapy: clinical implications, predictors and patterns. Sleep Med (2010)
  • M. Takegami et al. Development of a Japanese version of the Epworth Sleepiness Scale (JESS) based on item response theory. Sleep Med (2009)
  • M.W. Johns. Daytime sleepiness, snoring, and obstructive sleep apnea. The Epworth Sleepiness Scale. Chest (1993)
  • K. Stavitsky et al. Sleep in Parkinson's disease: a comparison of actigraphy and subjective measures. Parkinsonism Relat Disord (2010)
  • K. Ruggles et al. Evaluation of excessive daytime sleepiness. WMJ (2003)
  • J.C. Hays et al. Risk of napping: excessive daytime sleepiness and mortality in an older community population. J Am Geriatr Soc (1996)
  • A.B. Newman et al. Daytime sleepiness predicts mortality and cardiovascular disease in older adults. The Cardiovascular Health Study Research Group. J Am Geriatr Soc (2000)
  • E.R. Chasens et al. Daytime sleepiness and functional outcomes in older adults with diabetes. Diabetes Educ (2009)
  • I. Koutsourelakis et al. Predictors of residual sleepiness in adequately treated obstructive sleep apnoea patients. Eur Respir J (2009)
  • T.B. Young. Epidemiology of daytime sleepiness: definitions, symptomatology, and prevalence. J Clin Psychiatry (2004)
  • American Academy of Sleep Medicine. International classification of sleep disorders: diagnostic and coding manual, second edition (ICSD-2) (2005)
  • M.W. Johns. The subjective measurement of excessive daytime sleepiness
  • W. Johns. A new perspective on sleepiness. Sleep Biol Rhythm (2010)
  • M.W. Johns. Reliability and factor analysis of the Epworth Sleepiness Scale. Sleep (1992)
  • T.E. Weaver et al. An instrument to measure functional status outcomes for disorders of excessive sleepiness. Sleep (1997)
  • M.A. Carskadon et al. The multiple sleep latency test: what does it measure? Sleep (1982)
  • T. Akerstedt et al. Subjective and objective sleepiness in the active individual. Int J Neurosci (1990)
  • E. Hoddes et al. The development and use of the Stanford sleepiness scale. Psychophysiology (1972)
  • M.W. Johns. A new method for measuring daytime sleepiness: the Epworth sleepiness scale. Sleep (1991)
  • M.W. Johns. Sleep propensity varies with behaviour and the situation in which it is measured: the concept of somnificity. J Sleep Res (2002)
  • M.W. Johns. Sleepiness in different situations measured by the Epworth Sleepiness Scale. Sleep (1994)
  • F. Winne. Distortions of construct validity in multiple regression analysis. Can J Behav Sci (1983)
  • L.B. Mokkink et al. The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: an international Delphi study. Qual Life Res (2010)
  • J.M. Cortina. What is coefficient alpha? An examination of theory and applications. J Appl Psychol (1993)
  • L.G. Portney et al. Foundations of clinical research: applications to practice (2000)
  • D.L. Streiner et al. Health measurement scales: a practical guide to their development and use (2008)