Introduction

Primary open-angle glaucoma is generally considered a progressive optic neuropathy, including damage at the optic nerve head (ONH) and/or retinal nerve fiber layer (RNFL), and reduced visual function.1 Randomized clinical trials indicate that the first detectable glaucomatous change at early stages of the disease can be either functional or structural.2, 3 The agreement among RNFL, neuroretinal rim, and visual field (VF) measurements for detecting progression, however, is poor,4, 5, 6 and a combination of structural and functional tests is therefore used to increase the diagnostic sensitivity and detection of progression.7, 8

Standard automated perimetry (SAP) has become the functional clinical standard for diagnosing and monitoring patients with glaucoma. Particularly, the 24-2 Swedish Interactive Threshold Algorithm (SITA) of the Humphrey Field Analyzer (HFA)9 and the G1 Tendency-Oriented Perimetry strategy (TOP) of the Octopus10 are among the most commonly used VF algorithms worldwide. Objective structural imaging instruments have also been standardized for the diagnosis and follow-up of patients with, or at risk for, glaucoma. One of these tools is optical coherence tomography (OCT), which provides quantitative and reproducible measurements of the ONH parameters and peripapillary RNFL thickness.11, 12, 13, 14, 15 Understanding their comparative roles and performance in clinical practice is key to the management of glaucoma.

Current knowledge of the relative diagnostic performance of both SAPs (HFA and Octopus) and OCT comes from studies in which the data were analyzed independently for each test or where only one kind of SAP was compared with a structural test.11, 12, 13, 14, 15, 16, 17, 18, 19, 20 Some studies compared tests for evaluating the VF based on stimuli other than white-on-white perimetry,21, 22, 23, 24, 25, 26, 27 or evaluated the structure–function relationship.28, 29, 30, 31, 32, 33 The objective of the present study was to evaluate and compare the glaucoma diagnostic accuracy of the most widely used VF testing algorithms in clinical practice (24-2 SITA Standard strategy of HFA and G1 TOP strategy of Octopus) with one of the most advanced structural imaging tests (OCT). To the best of our knowledge, this is the first study aimed at comparing the diagnostic accuracy of these tests.

Materials and methods

Participants

The Institutional Review Board approved the prospective study protocol. All participants provided informed consent prior to enrollment and the study methodology adhered to the tenets of the Helsinki Declaration for biomedical research.

A sample of 249 consecutive subjects was prospectively preselected. Glaucoma patients were recruited from the Department of Ophthalmology of the Gregorio Maranon University Hospital and the Glaucoma Clinic of the Moncloa Hospital (Madrid, Spain). Controls were enrolled from patients referred for refraction without abnormal ocular findings, relatives of patients, and friends and family of the hospital staff.

Inclusion criteria for the glaucoma group were a glaucomatous optic disc morphology and intraocular pressure (IOP) ≥21 mm Hg, regardless of the SAP or OCT outcomes. The control group had normal optic disc morphology and IOP <21 mm Hg.

The morphology of the ONH was evaluated by slit-lamp indirect ophthalmoscopy with a 90-diopter lens. The optic discs were evaluated by two glaucoma specialists masked to patient identity and clinical history, and any disagreement was resolved by consensus. Glaucomatous optic disc appearance was defined as focal (localized notching) or diffuse neuroretinal rim narrowing with concentric enlargement of the optic cup or both.

All participants met the following inclusion criteria: best-corrected visual acuity >20/40, refractive error <5 D sphere and <2 D cylinder, transparent ocular media (nuclear color/opalescence, cortical or posterior subcapsular lens opacity <1) according to the Lens Opacities Classification System III,34 and open anterior chamber angle. Subjects with previous intraocular surgery, diabetes or other systemic diseases, history of ocular or neurologic disease, or current use of a medication that could affect VF sensitivity were excluded.

Participants underwent full ophthalmologic examination: clinical history, visual acuity, biomicroscopy of the anterior segment using a slit lamp, gonioscopy, Goldmann applanation tonometry, central corneal ultrasonic pachymetry, and ophthalmoscopy of the posterior segment.

Visual field measurements

All participants underwent at least two white-on-white SAPs with the HFA (Humphrey Zeiss Systems, Dublin, CA, USA; 24-2 SITA Standard strategy) and at least two SAPs with the Octopus (Haag-Streit International, Koeniz, Switzerland; G1 TOP test strategy) to minimize the learning effect. A perimetry was considered reliable when fixation losses were <20% and false-positive and false-negative rates were <15%; unreliable tests were repeated. The last reliable perimetry was included in the statistical analysis. The subjects completed the perimetry tests before undergoing clinical examination or structural testing. Each perimetry examination was performed on a different day to avoid a fatigue effect.
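The reliability rule can be sketched in a few lines of Python. This is a minimal illustration, not the study's software: the `PerimetryExam` class, its field names, and the helper functions are hypothetical, and the thresholds are read as the criteria a test must meet to count as reliable.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PerimetryExam:
    # Hypothetical container; rates are fractions between 0 and 1.
    fixation_loss_rate: float
    false_positive_rate: float
    false_negative_rate: float


def is_reliable(exam: PerimetryExam,
                max_fixation_loss: float = 0.20,
                max_false_rate: float = 0.15) -> bool:
    """True when all three reliability indices are below their limits."""
    return (exam.fixation_loss_rate < max_fixation_loss
            and exam.false_positive_rate < max_false_rate
            and exam.false_negative_rate < max_false_rate)


def last_reliable(exams: Sequence[PerimetryExam]) -> Optional[PerimetryExam]:
    """Return the most recent reliable exam (exams are in chronological order)."""
    for exam in reversed(exams):
        if is_reliable(exam):
            return exam
    return None
```

Returning `None` when no exam qualifies makes the "repeat the test" case explicit to the caller.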

HFA and Octopus perimeters use the same background lighting of 31.5 apostilbs (asb). Retinal sensitivity measured by both perimetry types is indicated in decibels (dB), which are tenths of a log unit. Nonetheless, the maximal luminance varied between the instruments. A 0 dB value was the maximum brightness, which corresponds to a 10 000 asb stimulus intensity for HFA, and 4000 asb for Octopus perimetry.
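Because the two instruments place 0 dB at different maximum luminances, the same dB reading corresponds to different stimulus intensities. The sketch below converts between the scales using only the values stated above (1 dB = 0.1 log unit; 10 000 asb for HFA and 4000 asb for Octopus at 0 dB). The function names are illustrative, and the conversion covers the luminance scale only; it does not account for any other difference between the instruments or strategies.

```python
import math

# Maximum stimulus luminance (0 dB) for each perimeter, in apostilbs.
MAX_LUMINANCE_ASB = {"HFA": 10_000.0, "Octopus": 4_000.0}


def db_to_asb(db: float, instrument: str) -> float:
    """Stimulus luminance in asb for a dB attenuation on the given instrument."""
    return MAX_LUMINANCE_ASB[instrument] * 10 ** (-db / 10)


def hfa_db_to_octopus_db(db_hfa: float) -> float:
    """Octopus dB value that denotes the same stimulus luminance as an HFA dB value."""
    offset = 10 * math.log10(MAX_LUMINANCE_ASB["HFA"] / MAX_LUMINANCE_ASB["Octopus"])
    return db_hfa - offset


# The two scales differ by a constant 10*log10(10000/4000), roughly 4 dB.
print(round(hfa_db_to_octopus_db(30.0), 2))
```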

OCT measurements

Peripapillary RNFL thickness and optic disc morphometric parameters were measured using the Optic Disc Cube 200 × 200 scanning protocol of the Cirrus OCT (Carl Zeiss Meditec). For image acquisition, scanning laser images were focused after subjects were seated and properly positioned. The left eye data were converted to a right eye format (Figure 1). All images were artifact-free and acquired with a quality greater than 6/10. The same trained operator performed all scans.
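The text does not describe how left eye data were converted. One common convention, assumed here, is to mirror the clock-hour sectors about the vertical axis so that hour 3 is nasal and hour 9 is temporal in both eyes. The function names below are hypothetical.

```python
from typing import Dict


def mirror_clock_hour(hour: int) -> int:
    """Mirror a clock hour (1-12) about the vertical 12-6 axis."""
    if not 1 <= hour <= 12:
        raise ValueError("clock hour must be between 1 and 12")
    return 12 if hour == 12 else 12 - hour


def left_to_right_eye_format(thickness_by_hour: Dict[int, float]) -> Dict[int, float]:
    """Re-index left-eye clock-hour RNFL thicknesses into right-eye format."""
    return {mirror_clock_hour(h): v for h, v in thickness_by_hour.items()}
```

Under this assumption, hours 12 and 6 are unchanged, while 3 and 9 swap.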

Figure 1

The grid of HFA was numbered as shown at the top, while each of the test points of the Octopus was numbered as shown in the middle. Bottom: The 12 OCT sectors were numbered according to the 12 clock-hour positions.

Statistical analyses

All statistical analyses were performed using SPSS for Mac (version 22.0, IBM Corporation, Somers, NY, USA) and MedCalc statistical software for Windows (version 15; Mariakerke, Belgium). The minimum sample size was estimated at eight individuals per group, taking a difference of 25.1 μm in average thickness as significant,14 with a type 1 error rate of 0.01 and a power of 90%. All variables studied were normally distributed, as verified by the Kolmogorov–Smirnov test. Demographics, HFA, Octopus, and OCT parameters were compared between groups with independent t-tests.
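For orientation, a sample size of this kind is often obtained from the standard normal-approximation formula for comparing two means, n = 2(z₁₋α/₂ + z₁₋β)²(σ/Δ)² per group. The sketch below assumes a two-sided test and that formula; the source does not state which formula or which standard deviation was used, so the `sd` value in the example is a hypothetical placeholder.

```python
import math
from statistics import NormalDist


def n_per_group(delta: float, sd: float,
                alpha: float = 0.01, power: float = 0.90) -> int:
    """Per-group sample size for a two-sided, two-sample comparison of means
    (normal approximation; equal group sizes and equal variances assumed)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * (sd / delta) ** 2)


# delta = 25.1 um, alpha = 0.01 and power = 0.90 come from the text;
# sd = 13.0 um is a hypothetical value chosen only for illustration.
print(n_per_group(delta=25.1, sd=13.0))
```

The result is sensitive to the assumed standard deviation: doubling `sd` roughly quadruples the required sample.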

Each of the mean threshold values at each point of the VF was numbered (Figure 1). To evaluate the diagnostic ability for glaucoma, the receiver-operating characteristic (ROC) curves were plotted for the RNFL and ONH parameters acquired with OCT, and for all the HFA and Octopus study points and main parameters. Sensitivities at 85 and 95% fixed-specificities were also calculated. The best areas under the ROC curves (AUCs) were compared using the DeLong method, which is an algorithm for the calculation of the standard error of the AUC and of the difference between two AUCs.35 Using Bonferroni's correction for multiple comparisons, a P-value ≤0.001 was considered significant.
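The two ROC summaries used here can be illustrated with plain Python. This is a sketch on toy numbers, not the SPSS/MedCalc procedure: the AUC is computed as the Mann–Whitney probability that a glaucoma eye scores higher than a control eye, and higher scores are assumed to indicate disease (negate parameters such as RNFL thickness first). The DeLong comparison itself is not implemented.

```python
from typing import Sequence


def auc(cases: Sequence[float], controls: Sequence[float]) -> float:
    """Area under the ROC curve as P(case > control) + 0.5 * P(tie)."""
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def sensitivity_at_specificity(cases: Sequence[float],
                               controls: Sequence[float],
                               target_specificity: float) -> float:
    """Highest sensitivity among cutoffs whose specificity meets the target.
    An eye is called positive when its score is strictly above the cutoff."""
    best = 0.0
    for cutoff in sorted(set(cases) | set(controls)):
        specificity = sum(k <= cutoff for k in controls) / len(controls)
        if specificity >= target_specificity:
            sensitivity = sum(c > cutoff for c in cases) / len(cases)
            best = max(best, sensitivity)
    return best


# Toy scores, invented purely for illustration.
toy_cases = [3, 4, 5, 6]
toy_controls = [1, 2, 3, 4]
print(auc(toy_cases, toy_controls))                               # 0.875
print(sensitivity_at_specificity(toy_cases, toy_controls, 0.75))  # 0.75
```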

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Results

Of the 249 preselected participants, 11 did not complete all required tests and were excluded from further analysis. In total, 150 eyes of 150 glaucoma patients and 88 eyes of 88 healthy subjects were included. Their clinical characteristics are summarized in Table 1. Comparison of the clinical characteristics revealed significant differences (P<0.001) in all the parameters except age, sex, and laterality of the eye included.

Table 1 Demographic characteristics of the study population.

Among the 150 glaucoma participants, 24 had normal HFA results according to pattern standard deviation (PSD) or the glaucoma hemifield test. Three participants of the control group presented with PSD beyond 5% probability level. This finding can be explained as part of the VF learning process. Among the 150 glaucoma participants, 32 had normal Octopus results based on mean deviation (MD <2 dB). All the controls had normal Octopus results.

Figure 2 represents the mean sensitivity (MS) at each study point of the VF evaluated with the HFA and Octopus in the two study groups. There was a clear depression of the sensitivity at all points of the VF in the glaucoma group, which was especially significant in the area corresponding to the upper arcuate defects and nasal step (orange-red color in HFA and yellow-orange in Octopus). All threshold values of both the HFA and Octopus were significantly different (P<0.001) between normal subjects and the glaucoma group (Student’s t-test).

Figure 2

Mean threshold values at each study point of the HFA and Octopus perimetries. Top: HFA perimetry; Bottom: Octopus perimetry. The control and glaucoma groups are represented on the left and right sides, respectively.

All peripapillary RNFL thickness and ONH parameters measured with OCT were significantly different between healthy subjects and glaucoma patients, except at the 3 (nasal sector; P=0.306) and 9 clock-hour positions (temporal sector; P=0.035), and the disc area (P=0.536).

Diagnostic ability of the RNFL thickness and ONH parameters measured by OCT

To evaluate the diagnostic ability of the three studied tests for glaucoma, the AUCs were calculated with their 95% confidence intervals. AUCs of the RNFL thickness measurements and ONH parameters measured by OCT are presented in Figure 3 (bottom). Among these parameters, Mean cup-to-disc (C/D) ratio had the largest AUC (0.958; 95% CI: 0.937–0.980; P<0.001), followed by the Vertical C/D ratio (0.957; 95% CI: 0.935–0.978; P<0.001), and Rim area (0.952; 95% CI: 0.927–0.977; P<0.001). Except for the Disc area (0.513; 95% CI: 0.437–0.587; P=0.74), all ONH parameters had AUCs >0.90 (P<0.001).

Figure 3

Top: area under the ROC curves of the main HFA (left) and Octopus (right) parameters. Bottom: area under the ROC curves of the best RNFL parameters (left) and of the optic disc morphometric parameters (right) measured by Cirrus OCT. H=RNFL thickness at clock hour position for a right eye.

The RNFL thicknesses showed good diagnostic ability in the superior and inferior quadrants and 5–7 and 11–2 clock-hour positions (AUCs >0.774; P<0.001). The AUCs of the RNFL thickness in the nasal and temporal quadrants, and at the 3, 4, and 8 to 10 clock-hour positions, however, ranged from 0.599 to 0.710 (Figure 3). The inferior quadrant thickness (0.926; 95% CI: 0.894–0.958; P<0.001), Mean thickness (0.918; 95% CI: 0.885–0.952; P<0.001), and RNFL thickness at the 7 clock-hour position (0.905; 95% CI: 0.869–0.940; P<0.001) had the largest AUCs.

Diagnostic ability of HFA

Figure 3 (top left) shows the AUCs of the main HFA indices. The largest AUCs were observed for the MD (0.966; 95% CI: 0.945–0.987; P<0.001) and VF index (VFI) (0.961; 95% CI: 0.934–0.987; P<0.001). Among the individual test points, the largest AUCs were found for the threshold values at point 7 (0.901; 95% CI: 0.857–0.944; P<0.001), point 19 (0.911; 95% CI: 0.867–0.954; P<0.001), point 20 (0.923; 95% CI: 0.886–0.961; P<0.001), and point 21 (0.914; 95% CI: 0.875–0.954; P<0.001).

Diagnostic ability of Octopus Perimetry

Figure 3 (top right) shows the AUCs of the main Octopus indices. The largest AUC was observed for MS (0.941; 95% CI: 0.913–0.969; P<0.001). Among the individual test points, the largest AUCs were observed for the threshold values at point 3 (0.907; 95% CI: 0.870–0.944; P<0.001), point 5 (0.900; 95% CI: 0.861–0.940; P<0.001) and point 21 (0.911; 95% CI: 0.877–0.946; P<0.001).

Comparison of the diagnostic ability between the HFA, Octopus, and OCT

Overall, the best sensitivity/specificity balance was observed for the VFI of HFA (sensitivity 91.3%, specificity 89.8%; cutoff point ≤98) and the PSD of HFA (sensitivity 87.3%, specificity 93.2%; cutoff point >1.88). The Octopus parameter with the best sensitivity/specificity balance was MD (sensitivity 84.6%, specificity 94.3%; cutoff point >0.75). Table 2 shows sensitivities at 85 and 95% fixed-specificities. At 85% fixed-specificity, the best parameter for discriminating between control and glaucoma eyes was the VFI of HFA (93.3%), and at 95% fixed-specificity the best parameter was the PSD of HFA (82.0%).

Table 2 Sensitivities at 85 and 95% fixed-specificities.

The comparison of the best AUCs of the three tests studied (DeLong method) did not show significant differences. With significance set at P≤0.001, no differences were detected between the MD of the HFA and the MS of the Octopus (P=0.042), or between the MD of the HFA and the RNFL thickness at the inferior quadrant (P=0.027).
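The effect of the corrected threshold on these two comparisons can be made explicit with a short check. The P-values and the 0.001 threshold are taken from the text; the function name and the conventional 0.05 level used for contrast are illustrative assumptions.

```python
def is_significant(p_value: float, threshold: float = 0.001) -> bool:
    """Apply the study's corrected criterion: significant when P <= threshold."""
    return p_value <= threshold


comparisons = {
    "HFA MD vs Octopus MS": 0.042,
    "HFA MD vs inferior-quadrant RNFL thickness": 0.027,
}

for label, p in comparisons.items():
    corrected = is_significant(p)               # study criterion, P <= 0.001
    uncorrected = is_significant(p, 0.05)       # conventional level, for contrast only
    print(f"{label}: P={p} corrected={corrected} uncorrected={uncorrected}")
```

Both comparisons fall below the conventional 0.05 level but not below the corrected one, which is why the result depends on the chosen threshold.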

Discussion

Our study was designed to assess and compare the diagnostic accuracy of the 24-2 SITA-Standard algorithm of HFA, the G1 TOP strategy of the Octopus, and OCT to discriminate between healthy subjects and glaucoma patients. The best parameter to discriminate between healthy and glaucoma eyes at 95% fixed specificity was the PSD of the HFA (sensitivity 82.0%). Additionally, HFA parameters presented numerically better AUCs than the Octopus and OCT parameters, but comparison of the ROC curves revealed no significant differences in the diagnostic ability of the three tests. Nevertheless, the P≤0.001 threshold for statistical significance was stringent because of the correction for multiple comparisons, and this limits the ability to detect differences between the tests. Although the DeLong method did not show significant differences, the low P-values (P=0.042 and P=0.027) are suggestive of a difference, even if they do not reach the level considered significant in this study.

The main parameters of the HFA and Octopus (Figure 3, top) had good accuracy for discriminating between normal and glaucoma eyes (all AUCs >0.929; P<0.001). The VF parameters with the best AUCs were MD of the HFA (0.966; 95% CI: 0.945–0.987; P<0.001) and MS of the Octopus (0.941; 95% CI: 0.913–0.969; P<0.001). Our inclusion criteria were very strict and only included participants with transparent ocular media. This fact may have led to these results, where the MD of the HFA and the MS of the Octopus had larger AUCs than PSD of the HFA and square root of loss of variance (sLV) of the Octopus, which are usually more sensitive for detecting focal losses instead of diffuse vision loss. The study population was selected to avoid any other ocular pathologies than glaucoma in the glaucoma group. Participants presented with very good visual acuities, no cataracts, and no previous intraocular surgeries.

We found similar sensitivities at fixed specificities for the main indices of the Octopus. In a study published in 2006 by De la Rosa et al18 analyzing the diagnostic accuracy and reproducibility of TOP in glaucoma, the Octopus MD and Octopus LV had similar diagnostic precision for moderate and advanced glaucoma, but the LV was the best diagnostic index when the MD was <6 dB.

The largest AUCs were for the MD of the HFA (0.966; 95% CI: 0.945–0.987; P<0.001) and the VFI of the HFA (0.961; 95% CI: 0.934–0.987; P<0.001). Although the VFI presented the best sensitivity/specificity balance and the best sensitivity (93.3%) at 85% fixed specificity, clinicians should take into account the ceiling effect that occurs in VFI with MDs better than −5 dB.36

Study points with the largest AUCs were located in the superior hemifield: points 7, 19, 20, and 21 of the HFA, and points 3 and 21 of the Octopus. These study points represent a superior arcuate defect and a nasal step, locations of very typical glaucomatous VF defects.37, 38, 39 Paracentral points (19 and 10) have also been documented as very typical of glaucoma, especially when regionally enhanced spatial resolution is used,40 but points close to the fovea did not present with good AUCs in our sample (mild glaucoma). We used non-equivalent strategies from both perimetry types, SITA Standard and TOP, because they are the default strategies used in clinical practice. Our objective was to analyze the strategies that are normally used in clinical practice.

In this study, optic disc parameters presented with numerically better AUCs than RNFL thickness measurements. Three of the optic disc parameters measured by the OCT had the largest AUCs: average C/D ratio (0.958; 95% CI: 0.937–0.980; P<0.001), vertical C/D ratio (0.957; 95% CI: 0.935–0.978; P<0.001), and rim area (0.952; 95% CI: 0.927–0.977; P<0.001). Our results are among the best in the literature, especially considering that our glaucoma group basically comprised early glaucomatous eyes according to the Hodapp–Parrish–Anderson score (MD of HFA was −5.42±4.6 dB).41 It is important to note, however, that patients were selected because of the optic disc appearance regardless of the VF or the OCT results, which may have biased the ONH parameters to be more accurate for differentiating between normal eyes and eyes with glaucomatous optic neuropathy. Mwanza et al13 found similar OCT diagnostic ability. They studied 73 glaucomatous eyes and 146 control eyes, and observed the largest AUCs for the rim area (0.912) and the vertical C/D ratio (0.890). This last parameter, vertical C/D ratio, had the best AUC (0.962) in a very similar study published by the same group that included 58 glaucomatous eyes and 99 controls.42 A recent similar study43 that included a large sample (209 glaucomatous eyes, 405 pre-perimetric glaucomatous eyes, and 109 controls) had worse results compared with our study and that of Mwanza et al.13 Only rim area presented a relatively good AUC and it was worse than that of the average RNFL thickness. One potential reason for this is that they included glaucomatous eyes at an earlier stage of the disease. Indeed, the diagnostic ability improved when advanced glaucoma cases were selected. Pollet-Villard et al32 reported similar results.

Many studies have confirmed the good diagnostic ability of peripapillary RNFL thickness measured by OCT.11, 12, 13 In our study, the best AUC was for the RNFL thickness at the inferior quadrant (0.926; 95% CI: 0.894–0.958; P<0.001), followed by the average RNFL thickness (0.918; 95% CI: 0.885–0.952; P<0.001) and the RNFL thickness at the 7 clock-hour position (0.905; 95% CI: 0.869–0.940; P<0.001). These three parameters also had the best AUCs in the Mwanza et al study.13

Comparison of measurements obtained by the Cirrus and Stratus OCT indicates that both tests have the same diagnostic ability despite the better reproducibility of the Cirrus OCT.26, 44, 45 Most of the studies show consistency in the parameters with the best AUCs: RNFL thickness at the inferior quadrant, RNFL thickness at the 7 and 6 clock-hour positions, and average RNFL thickness.40, 46 These results support the idea that early glaucoma damage usually starts at the superior and inferior optic disc poles.14, 30, 47, 48

In conclusion, the HFA, Octopus, and Cirrus OCT did not significantly differ in their ability to discriminate glaucomatous optic neuropathy. The three tests demonstrated very good diagnostic ability for discriminating between healthy and glaucomatous eyes.