Introduction

Monitoring patients with glaucoma to detect progression and determine the rate of visual function loss is the mainstay of glaucoma care. In clinical settings, the recommended tests for monitoring include tonometry, clinical examination of the optic disc and retinal nerve fibre layer (RNFL), and visual field testing. The evaluation of structural changes is complemented by quantitative measurement using optical coherence tomography (OCT). Standard automated perimetry (SAP) is the reference standard for assessment of visual function in glaucoma [1]. It is a subjective method relying on patients’ cooperation, and in some patients there is a high variability of mean deviation over time which decreases the ability to detect true change from noise [2]. Electroretinography (ERG) is an objective method, and both pattern ERG (PERG) and the photopic negative response (PhNR) of the ERG are sensitive markers of the retinal ganglion cell (RGC) dysfunction that is a characteristic of glaucoma [3,4,5,6].

The PERG is a measure of the electrical activity of the RGC population of the central retina (more than 40% of the total RGC population) in response to a suprathreshold stimulus [7]. The PhNR of the light-adapted ERG is a negative-going wave that occurs after the b-wave in response to a brief flash. It reflects generalized activity of the RGC and their axons [8], and its amplitude, similarly to PERG, can be reduced early in diseases that affect the innermost retina [5].

A recent review article on the clinical applicability of electrophysiological tests in glaucoma found a reasonable correlation between amplitudes and latency of electrophysiological measures and routine tests for glaucoma, mainly SAP and OCT [9]. However, it remains unclear what is the role of these tests in early detection and monitoring of glaucoma. Requirement of complex protocols, equipment and experienced personnel limits the use of electrophysiology to special cases and research. In our previous cross-sectional study, we found that patients with suspect and early glaucoma had significantly reduced PERG N95 and PhNR amplitudes compared to controls, indicating high sensitivity of both electrophysiological measures for early detection of ganglion cell damage [10]. In addition, in eyes with suspect glaucoma, a greater decrease in PhNR amplitude was associated with small changes in peripapillary retinal nerve fibre layer (pRNFL) thickness that may be predictive of glaucoma progression [10].

The aim of the present study was to investigate the value of PERG and PhNR in monitoring glaucoma compared to standard clinical tests (SAP and optic disc assessment) and structural measurements using spectral-domain OCT.

Methods

This longitudinal study was performed according to the tenets of the Declaration of Helsinki and was approved by the National Ethics Committee, University Medical Centre Ljubljana, Ljubljana, Slovenia (KME 33/11/11). All of the participants were fully informed about possible consequences of the research protocol and signed their informed consent before enrolment.

Thirty-two patients with ocular hypertension (OHT), suspect glaucoma or early open-angle glaucoma were recruited from the Glaucoma Clinic of the Department of Ophthalmology, University Medical Centre Ljubljana, Slovenia. The enrolment started in January 2012, and the participants had at least 4 years of follow-up. The patients were aged 25 to 81 years (mean age ± SD, 59.5 ± 12.0 years), with 9 males and 23 females. The inclusion criteria were visual acuity ≥ 0.8 Snellen, clear optic media and myopia < −5D. Exclusion criteria were treatment with topical or systemic corticosteroids, diabetes or neurological disorders (e.g. Parkinson’s disease, multiple sclerosis).

At baseline, all of these patients underwent complete ophthalmological examination and visual field testing and were diagnosed as: OHT, characterized by untreated intraocular pressure (IOP) consistently > 21 mmHg, and normal optic disc and visual field; suspect glaucoma, characterized by suspicious appearing optic disc, with normal or suspicious visual field; or early glaucoma, characterized by the presence of glaucomatous changes at the optic disc, and corresponding reproducible visual field loss, with mean defect from 2 to 6 dB. A glaucomatous optic disc appearance included focal and/or diffuse thinning of the neuroretinal rim, and asymmetry in the optic disc cupping between the eyes > 0.2 that was not caused by the difference in optic disc size or shape [11]. Due to the large variation of optic disc appearance among healthy subjects (in size, shape), there are no clear criteria for early glaucomatous disc changes. Therefore, the term suspicious optic disc was used when the discs had features resembling glaucomatous optic disc changes and a definite diagnosis of glaucoma can only be ascertained with the follow-up [11]. The assessment of optic disc was performed by a glaucoma consultant (BC) using the above criteria.

Visual field defects were defined as three or more adjacent points of ≥ 5 dB loss or two or more points ≥ 10 dB loss, in the absence of other changes that could explain the defect.

The patients with glaucoma and high-risk OHT were treated with topical hypotensive medication, either as a monotherapy or as a combination of drugs, such as prostaglandin analogues, beta-blockers, alpha-2 agonists, and carbonic anhydrase inhibitors.

After ophthalmological examination, IOP measurement and visual field testing, all subjects underwent the ERG and OCT tests.

Visual field testing

Standard automated perimetry (SAP) was performed in all subjects, using an Octopus 900 perimeter (Haag-Streit AG, Koeniz, Switzerland) with the Dynamic Strategy G2 program. Only reproducible tests with < 20% false-positive and < 20% false-negative response rates were used in the evaluation. The following global visual field indices were recorded: mean defect (MD), and square root of loss variance (sLV). The MD is a positive value using Octopus perimetry and represents the average visual field loss from all locations, whereas sLV is a measure of variability across the visual field and increases in localized defects.

Electroretinography

Electroretinographic responses were recorded using an Espion visual electrophysiology testing system (Diagnosys LLC, Littleton, MA, USA). The recording procedure for the PERG and the PhNR followed the standards and guidelines of International Society for Clinical Electrophysiology of Vision (ISCEV) [12, 13]. A HK loop served as a recording electrode and was placed in the fornix of the lower eyelid [14]. The silver chloride reference electrode was placed on the ipsilateral temple, and the ground electrode was positioned on the forehead. The PERG recording does not require pupil dilation; therefore, it was recorded first. It was elicited with a 0.8° checkerboard pattern with 99% contrast that reversed 1.8 times per second, which was presented on a 21.6° X 27.8° cathode ray tube screen stimulator. The patients were sitting 1 m away from the screen stimulator and used optimal refractive correction during the recording. One hundred sweeps were collected for each recording and repeated at least twice. Later, the pupils were dilated with 1% tropicamide (Mydriacyl, Alcon) and the patients were light-adapted for 10 min. Photopic ERGs were elicited with a Ganzfeld ColorDome stimulator (Diagnosys LLC, Littleton, MA, USA), using 2.5 cd s/m2 monochromatic red stimuli (635 nm) on a 10 cd/m2 blue background (470 nm). The rate of stimulation was 1 Hz, and 30 sweeps were collected for each recording, which was repeated at least three times. Sweeps that included artefacts with an amplitude larger than 500 μV were rejected automatically during the recording, while sweeps with low-amplitude artefacts that influenced the baseline or the expected waveform of the response up till 80 ms after the stimulus onset were rejected manually. The average of two most repeatable or all three recordings (collected from 50–80 sweeps) was taken into further analysis. The signals were amplified with a band pass from 0.1 to 500 Hz. For the PERG, the P50 amplitude was measured from the N35 trough, while the N95 amplitude was measured from the P50 peak. For the photopic ERG, the PhNR amplitude was measured from the baseline to the negative trough that clearly appeared after the b-wave and the i-wave. The ratio between the PhNR and b-wave amplitude (PhNR ratio = PhNR amplitude/b-wave amplitude) was also calculated and used for further analysis.

OCT measurements

Spectral-domain OCT (Topcon 3D OCT-2000; Topcon Inc., Tokyo, Japan) was performed following the ERG test. The following two scan protocols were used: 6.0 × 6.0 mm three-dimensional (3D) disc (512 A-scans by 128 B-scans) and 6.0 × 6.0 mm 3D macula (512 A-scans by 128 B-scans). The commercial software derives a peripapillary retinal nerve fibre layer (pRNFL) thickness plot from the segmentation of the 3D disc scan, by centring a circle after the scan is obtained. The data were exported by the software and analysed by the proprietor’s automatic segmentation algorithm. The following layers were collected: the mean pRNFL thickness, the mean thicknesses of the macular nerve fibre layer (NFL), the ganglion cell–inner plexiform layer (mGCIPL) and the ganglion cell complex (GCC). Only scans with the image quality > 70 were accepted.

Follow-up

The patients were examined over the minimum of 4 years, using OCT and electroretinography yearly (within the interval 11–13 months), and clinical examination was performed according to the European Glaucoma Society guidelines within the period 6–12 months. Criteria for glaucoma progression were based on the SAP and/or documented changes of the optic disc/RNFL at ophthalmoscopy from baseline (change in the neuroretinal rim thinning, disc haemorrhage). To define visual field progression, a trend-based analysis was performed in at least six reproducible visual field tests using the commercially available software (EyeSuite Progression Analysis™). Progression was defined as diffuse (MD) and/or local (sLV) worsening at P < 1% [15].

Statistical analysis

The statistical analyses were carried out using the Statistical Package for Social Sciences (SPSS 2013, version 22; Inc., Chicago, Illinois, USA) and Origin 8.0 (OriginLab Corporation, Northampton, USA). Data of paired eyes are likely to be correlated, so by default only the right eye was included in the analysis. The exception were patients with progression in both eyes, for which the eye with faster rate of progression in the visual field was analysed. In a few patients, the left eye was included because of the better ERG signal. The normality of the distributions for dependent variables was tested using the Shapiro–Wilk test. Most variables showed normally distributed data. Therefore, the means’ comparison was made with the analysis of variance (ANOVA) and the Dunnett’s post hoc test. The trends of changes over time for clinical, electrophysiological and OCT parameters were calculated by applying linear regression line: y = a + bx, to the mean values of each parameter over time. The steepness of slope is indicated by the parameter b and the parameter a is the intercept, indicating the value of each parameter at the time point 0. The calculated fitted lines were compared between the stable and progressing group using the F test to determine whether the two data sets were significantly different from each other and to detect differences in progression over time. Correlations of ERG and OCT measures with the visual field indices were calculated using the Pearson correlation test. The Bland–Altman analysis was used to assess test–retest variability for the first two visits in the stable group, and 95% confidence intervals were constructed to assess the precision of the limits of agreement (LoA), as described by Bland and Altman [16]. LoAs were also calculated as a percentage of the mean value to allow between-session findings to be compared across techniques [17]. The coefficient of variation was calculated in the stable group as the ratio of the standard deviation to the mean value for each parameter at all five visits mainly to compare the difference in the variability among examination methods (ERG, OCT, visual field). To discriminate between the progressing and stable glaucoma eyes at the first visit, receiver operating characteristic (ROC) curves for all the variables were constructed. All the statistical tests were two-sided, and a p value < 0.05 was considered statistically significant.

Results

Thirty-two eyes of 32 patients were analysed in this longitudinal study. According to their clinical appearance at baseline, these eyes were classified as OHT (6 eyes), suspect glaucoma (16 eyes) or early glaucoma (10 eyes). The mean follow-up was 52 months (SD 4.7 months; range 48–64 months). Participants’ characteristics and clinical measurements from one eye per subject at baseline are presented in Table 1. During follow-up, 13 patients showed glaucoma progression in both eyes and 19 patients were stable. In Fig. 1, two case examples are shown: case 1 an eye without progression in the visual field, PERG, PhNR and pRNFL thickness, and case 2 an eye with glaucoma progression with early/minimal changes in the visual field (nasal step) and the pRNFL thickness, but at the same time (2013) an important decrease in the PhNR and PERG. With follow-up, significant progression in SAP and OCT was noted, while the ERG abnormality remained stable.

Table 1 Clinical characteristics of participants and their measurement data at baseline
Fig. 1
figure 1

SAP a, ERG b and OCT c findings at follow-up visits for 2 patients, Case 1 with stable clinical picture and Case 2 with fast progression, that was seen in all measures analysed. At the PhNR traces, yellow arrows indicate borderline reduction in the response, while red arrows indicate a notable abnormality of the response

The clinical measurements (visual acuity, IOP, visual field indices), ERG parameters (P50, N95, PhNR, PhNR ratio) and OCT data at baseline, at the third, intermediate visit and at the last follow-up visit for the stable and progressing eyes are summarized in Table 2. For the stable group, all the clinical and electrophysiological parameters, MD, sLV, NFL and GCC thickness remained unchanged during the follow-up, while for the pRNFL and mGCIPL thickness a significant worsening was observed at the last visit only. For the progressing group, a significant worsening was seen for MD, sLV, N95 and all the OCT parameters, while P50 and PhNR showed only slight, but not significant decrease over time of follow-up.

Table 2 Means comparison between the first, intermediate (3rd) and the last (5th) visit for clinical findings, SAP, ERG and OCT measures for the stable (1) and progressing (2) group (ANOVA with Dunnett’s post hoc test)

The trend of change over time was further analysed and compared between the groups by applying linear regression curve to the mean values of each parameter at all the visits, as shown in Fig. 2. In the stable group, visual field indices, electrophysiological measures and OCT parameters showed more or less a horizontal fitted curve. In the progressing group, there was a steeper slope of the fitted line, indicating worsening of the values seen for the visual field indices, N95, PhNR, PhNR ratio and all the OCT parameters. The difference in slopes over time was significant for all the parameters except visual acuity and was highly significant for the GCC, pRNFL and mGCIPL thicknesses.

Fig. 2
figure 2

Mean value (± standard deviation) of the two SAP measures—mean defect (MD) and square root of loss variance (sLV), ERG measures—P50 amplitude (P50), N95 amplitude (N95), PhNR amplitude (PhNR) and PhNR amplitude ratio (PhNR ratio), and OCT measures—pRNFL thickness (pRNFL), macular NFL thickness (NFL), GCC thickness (GCC) and mGCIPL thickness (mGCIPL) at the follow-up visits for the stable and progressing group. Blue and red linear regression lines (y = a + bx) indicate the trend of changes over time for stable and progressive groups, respectively

At the first visit, there were significant moderate correlations of PERG N95 and OCT measures with the visual field indices, with the highest negative correlation between pRNFL thicknesses and MD, and sLV (r = -0.50, p = 0.004 and r = -0−55, P = 0.001, respectively) (Table 3). At the last visit, the strength of association with visual field indices increased, more so for the OCT parameters (r>0.7, p < 0.001) than for the N95. The PhNR amplitude demonstrated significant, but modest correlation with the MD (r = -0.35, p = 0.047) (Table 3).

Table 3 Correlation of ERG and OCT measures with the visual field indices

Inter-session repeatability of ERG and OCT for the stable group between the first and second visit is shown as LoAs (Table 4) and graphically in the Supplementary file (Fig. S). OCT had better inter-session repeatability with smaller % LoAs (range from 3.6% for mGCIPL to 17.4% for pRNFL thicknesses) than ERG (range from 35.9% for N95 to 59.9% for PhNR amplitude). In eyes without progression (stable group), the coefficient of variation was calculated for repeated measurements at first and follow-up visits to compare the rate of inter-session variability between the ERG and OCT measures (Table 4). ERG measures (N95 and PhNR) showed higher measurement variability in relation to the mean than OCT measures. Among OCT measurements, the coefficient of variation was lower for macular parameters than for pRNFL thicknesses.

Table 4 Inter-session repeatability between visit 1 (V1) and 2 (V2), assessed by limits of agreement (LoA) and coefficient of variation (CoV) between all 5 visits for the ERG and OCT measures in the non-progressing eyes (n = 19)

To determine if the analysed parameters could be sensitive in distinguishing between the stable and progressing eyes, ROC analysis was applied to the data at the first visit (Table 5). The GCC and pRNFL thicknesses significantly discriminated between the progressing and non-progressing eyes with the area under the ROC curve of 0.72 and 0.71, respectively. The GCC thickness cut-off values of 97.5 µm had a 63% sensitivity and 85% specificity, whereas pRNFL thickness of 78.5 µm had 74% sensitivity and 77% specificity.

Table 5 Areas under the ROC curves at baseline for detecting progression

Discussion

The purpose of this study was to evaluate PERG and PhNR in monitoring subjects and compare the ERG and structural OCT measures between the progressing and non-progressing eyes over a 4-year follow-up. The stable group had significant thinning for the pRNFL and mGCIPL thicknesses at the last follow-up visit only, whereas the progressing group showed significant deterioration in the visual field indices and OCT parameters at the intermediate visit, and significant changes for all parameters, except in the PhNR at the last visit. In progressing eyes, all measures (PERG N95, PhNR, RNFL and macular thicknesses) showed significantly steeper linear regression slopes, which were highly significant for the pRNFL and GCC thicknesses. At baseline, both OCT measures and N95 moderately correlated with the visual field indices, whereas at the last visit strong correlation with the visual field indices was present only for the OCT measures. The ROC curves showed that at the first visit the pRNFL and GCC thicknesses were the only measures that discriminated between the progressing and stable eyes.

The majority of previous reports address PERG and PhNR as an objective method to help in the early diagnosis of glaucoma or investigate the correlation of ERG measures with visual field parameters and structural changes [3,4,5,6,7, 9, 10, 18, 19]. Many glaucoma studies used a steady-state PERG, recorded with a faster reversal rate of pattern stimulus (typically 16 reversals per second (rps)) which generates a steady-state, sinusoidal waveform, whose period corresponds to the reversal frequency. The steady-state PERG reflects mainly spike-related ON pathway activity, whereas transient PERG (used in our study) receives nearly equal amplitude contributions from ON and OFF pathways with N95 reflecting spiking activity of ganglion cells and P50 non-spiking activity as well [20]. PERG and PhNR are measures of RGC integrity, and lowering of IOP in OHT and early glaucoma eyes was associated with an increase in PERG and PhNR amplitudes indicating an improvement in the inner retinal function [21,22,23,24]. It would be expected that there is weak to modest correlation of electrophysiological with structural measures and visual field as these assess different aspects of pathological process that do not occur at the same time (i.e. retinal dysfunction preceding cell death) [18, 25]. In addition, PERG and PhNR reflect function of the inner retina (PERG is a central response, while PhNR is a diffuse response from the whole retina, and therefore the two tests may offer different levels of information) [26], whereas SAP represents not only the retinal activity but also the activity of the whole visual pathway. OCT measures structural changes of the optic nerve, RNFL and macular parameters which can help clinician to distinguish the anatomic changes in glaucoma patients when compared with normal subjects [27]. Like ERG, the diagnostic ability of OCT is modest in suspect glaucoma and improves with the severity of glaucoma [28, 29].

Cross-sectional studies reported variable, usually weak to moderate correlation between PERG/PhNR and SAP/structural parameters [30,31,32,33,34] or even lack of correlation [25, 35]. Different stages of disease (e.g. OHT, suspect, early or advanced glaucoma) and a high variability of PERG/PhNR amplitudes in the normal population may account for different findings. This variability in normal subjects can affect the results that patients with glaucomatous visual field defect can still have normal PERG [25]. Similarly, we found a considerable overlap of PERG and PhNR amplitudes among OHT, suspect and early glaucoma patients at baseline (Supplementary file) and consequently absence of correlation or moderate correlation of the PhNR and N95 with the visual field parameters.

Several prospective studies assessed the ability of electrophysiological measures to predict progression in OHT or suspect glaucoma eyes using mainly steady-state PERG [36,37,38,39]. Bach et al. [40] comparing two steady-state PERG protocols with the same reversal rate (15 rps) showed that the PERGLA protocol using skin electrodes detected glaucoma similarly to the PERG ratio protocol (PERG to two check sizes of 0.8° and 16°) using corneal electrodes. The largest study including 120 eyes of 64 patients with OHT and the mean follow-up of 10.3 years found that 10% of eyes converted to glaucoma with visual field defect [39]. A PERG amplitude ratio (for standard/large checks reversing at 15 rps) had a significantly steeper mean negative slope over time in converters when compared to non-converters and detected glaucoma patients 4 years before visual field changes occurred. The PERG ratio showed an area under ROC curve of 0.75 (sensitivity of 75%, specificity of 76%) [39]. Ventura et al. [41] monitored RGC function in suspect glaucoma patients (with normal visual field) using PERGs to check alternating at 15 rps (PERGLA paradigm) over a mean of 5.7 years. The PERG amplitude showed a significant negative slope in 15% to 20% of suspect glaucoma eyes, while significant progression of SAP-MD was found in only 0% to 2% of eyes. Banitt et al. [42] evaluated longitudinal rates of change for the pRNFL thickness using OCT and PERG amplitude in suspect glaucoma patients. They found that patients with significantly reduced baseline PERG amplitude (≤ 50% of its age-adjusted normative value) had lower baseline RNFL thicknesses, and the fastest rate of RNFL thinning over the subsequent 5 years. In our study, at baseline visit, only the pRNFL and GCC thicknesses significantly discriminated between the stable and progressing eyes. Similarly, Siesky et al. reported that thinner mean RNFL thickness at baseline was associated with shorter time to visual field progression over the 5-year follow-up [43]. In a recent retrospective study including 357 glaucoma suspects with a follow-up of up to 5.7 years, faster thinning of pRNFL (-1.13 ± 0.85 µm/year) and mGCIPL (-0.71 ± 0.57 µm/year) thicknesses predicted development of visual field defects [44]. In the same study, the rate of change in average pRNFL (-0.27 ± 0.64 µm/year) and GCIPL (-0.19 ± 0.32 µm/year) thicknesses was significant over time also in glaucoma suspect eyes that did not show visual field changes [44]. In our study, a significant decline over time was found for pRNFL and mGCIPL thicknesses in the stable group as well. Longitudinal and cross-sectional analyses showed a consistent rate of approximately 0.2% per year of age-related thinning in NFL and GCC thicknesses [45], but presently commercially available OCT algorithms for monitoring progression do not incorporate thinning due to ageing effect.

The main goal of our study was to investigate the role of both ERG and OCT in monitoring glaucoma patients. In this study, the mean PhNR and N95 amplitude showed a significant negative slope for progressing eyes, but the OCT measures demonstrated even steeper linear slopes between the two groups and appear to be more useful in detecting progression. In addition, we recorded high variability of ERG measurements in stable eyes over time. The coefficients of variation for the non-progressing group were similar to those found by others [40, 46, 47]. These normalized coefficients of variation (11.9% for N95 and 23.6% for PhNR) were high compared with those for anatomical measures (1.6%-5.6%), limiting their sensitivity for detecting changes. However, a certain percentage of variability found for the pRNFL and mGCIPL thicknesses in the stable group may have been caused by true progression due to ageing effect, as both OCT measures showed a significant thinning at the last follow-up visit. Furthermore, the inter-session variability/repeatability was also calculated as LoAs for the first two visits in the stable group and showed that OCT had lower inter-session variability than ERG measurements. OCT mGCIPL thickness had the smallest test–retest variation, within ± 3.6% of the mean; the inter-session variation of PERG N95 amplitude was within ± 35.9% of the mean and of PhNR within 59.9% of the mean. Two studies reported much larger variation for PhNR of ± 88.4% and ± 148.3% of mean amplitude [17, 48].

One of the limitations of our study was the small number of mixed cases including subjects with OHT, suspect and early glaucoma patients compared to prospective studies including suspect glaucoma only. Furthermore, medical treatment has been changed in some patients to achieve lower intraocular pressure which may affect retinal function with a potential improvement in ERG responses [22]. The strength of our study was that clinical criteria for progression were used: change in the optic disc and/or visual field deterioration confirmed by trend analysis.

Although a considerable body of evidence exists that supports the usefulness of PERG and PhNR in predicting and detecting early glaucomatous damage, the present study shows that both have limited applicability in monitoring glaucoma progression, mainly due to high inter-session variability, which hinders detection of true changes over time from noise. Conversely, OCT measures show low inter-session variability and might have a better predicting value for early differentiation of progressing cases in clinical practice.