Introduction

In recent years, changes in the structure of upper gastrointestinal (GI) tract diseases have occurred in Asian countries secondary to the Westernization of dietary habits, including changes in the gastric environment as represented by the decrease in the rate of Helicobacter pylori infection1. Among the various upper GI diseases, gastroesophageal reflux disease (GERD) is considered one of the most prevalent2. With these epidemiological changes, both accurate diagnosis of GERD and an appropriate treatment strategy are required. Recent reports from Western countries have indicated that clinicians tend to overdiagnose GERD among patients complaining of gastric pain and/or heartburn and that proton pump inhibitors (PPIs) are overprescribed3,4.

GERD is classified into two types according to endoscopic findings: non-erosive GERD (NERD) and erosive GERD (ERD). The Los Angeles (LA) classification is widely used internationally for endoscopic classification of ERD5. In Japan, LA classification grade M (minimal change) has been introduced6. Several studies have investigated the differences among LA grades M, N, and A in terms of histology, acid exposure time, and esophageal motility dysfunction7,8,9,10,11. However, the clinical significance of classification of NERD into LA grades N and M remains unknown. To clarify the clinical characteristics of LA grade M, we compared the patients’ backgrounds, intensity of GERD and functional dyspepsia (FD) symptoms, impact on daily life, psychiatric bias, quality of life, and response to PPI treatment among LA grades M, N, and A.

Results

Patient characteristics

Among 290 consecutive patients with GERD, 201 were enrolled at baseline: 45 (22.4%) with LA grade M, 62 (30.8%) with LA grade N, and 94 (46.8%) with LA grade A. Of these patients, 108 (53.7%) were men and 93 (46.3%) were women. Their mean age was 57.1 ± 14.4 years, and their mean body mass index (BMI) was 23.6 ± 3.5 kg/m². At baseline, the mean GERD symptom subscale (GERD-SS) score was 3.4 ± 1.2. The mean scores on the dissatisfaction with daily life scale (DS-SS) [i.e., dissatisfaction with eating (Q6), dissatisfaction with sleep (Q7), dissatisfaction with daily activities (Q8), and dissatisfaction with mood (Q9)] were 2.1 ± 1.1, 2.1 ± 1.1, 2.1 ± 1.0, and 2.7 ± 1.1, respectively. The mean Hospital Anxiety and Depression Scale (HADS) anxiety and depression scores were 6.5 ± 3.5 and 5.6 ± 3.6, respectively. The mean acute (1-week recall) version of the health-related quality of life survey [Short Form-8 (SF-8)] physical component summary score was 45.4 ± 6.6, and the mean SF-8 mental component summary score was 46.3 ± 6.8.

Comparison of patients’ backgrounds and clinical characteristics among the three groups

Table 1 compares the patients’ backgrounds and clinical characteristics among the three groups. For all items of the patients’ backgrounds and clinical characteristics, there were no significant differences between LA grades N and M. In contrast, there were significant differences in the BMI and DS-SS scores (sleeping, mood, and daily life) and differences with a tendency toward statistical significance in the DS-SS scores (eating and social activity) between patients with LA grades M and A. There were significant differences in the BMI and pretreatment FD-SS, FD-postprandial distress syndrome-SS, and DS-SS scores (eating, sleeping, social activity, mood, and daily life) and differences with a tendency toward statistical significance in the GERD-SS scores between patients with LA grades N and A.

Table 1 Comparison of patients’ backgrounds and clinical characteristics among the three groups.

Degree of symptom improvement after 4 weeks of PPI administration

Figure 1 shows the changes in the GERD-SS scores before and after 4 weeks of PPI administration according to LA grades N, M, and A. Figure 2 shows the difference in the residual rate of GERD symptoms among the three groups. There was no significant difference in the residual rate of GERD symptoms among the three groups, although the residual rate tended to be lower in patients with LA grade A (Fig. 2). Figure 3 shows that there was no difference in symptom improvement by patient impressions among the three groups. Figure 4 shows the difference in the numeric rating scale score among the three groups. There was a significant difference between LA grades A and N, but no significant difference between LA grades N and M.

Figure 1

Changes in gastroesophageal reflux disease symptom subscale scores before and after 4 weeks of proton pump inhibitor administration.

Figure 2

Comparison of symptom improvement among the three groups after 4 weeks of proton pump inhibitor administration: Residual rate of gastroesophageal reflux disease symptoms. NOTE: a = analysis of variance, b = Tukey’s honestly significant difference test, d = Cohen’s d.

Figure 3

Comparison of symptom improvement among the three groups after 4 weeks of proton pump inhibitor administration: Symptom improvement through patient impressions. NOTE: a = analysis of variance.

Figure 4

Comparison of symptom improvement among the three groups after 4 weeks of proton pump inhibitor administration: Global assessment scale. NOTE: a = analysis of variance, b = Tukey’s honestly significant difference test, d = Cohen’s d.

Discussion

In this multicenter prospective observational study, the clinical characteristics of patients with LA grade M were similar to those of patients with LA grade N both before and during treatment; however, they were clearly different from those of patients with LA grade A. These findings indicate that it is clinically meaningful to distinguish patients with NERD from those with ERD, but it is not meaningful to distinguish between patients with and without minimal change.

Because upper GI endoscopy appears to be the only diagnostic modality that can diagnose ERD with high sensitivity and can also diagnose Barrett’s esophagus, endoscopy should be the first-line procedure when testing for ERD is necessary12. Most patients with GERD do not have endoscopic evidence of mucosal breaks; therefore, negative endoscopy findings do not exclude the diagnosis of GERD. In other words, endoscopic examination seems to be less sensitive for symptomatic GERD. Traditionally, a diagnosis of NERD has been given to patients with heartburn or acid regurgitation without any esophageal mucosal breaks; generally, patients with NERD tend to have negative pH tests13. The characteristics of acid reflux and the pattern of symptoms suggest that this patient population is heterogeneous. In Japan, LA classification grade M has been introduced to classify such patients more clearly14. However, this terminology has caused some confusion among endoscopists; research has shown that interobserver agreement on the endoscopic diagnosis of LA grade M is too low to be of clinical significance15. According to one review, at least two-thirds of patients with NERD have microscopic esophageal mucosal damage, such as dilated intercellular spaces, and this microscopic esophageal injury is considered a clinical sign of a response to PPI therapy16. Although minimal change such as redness of the esophageal mucosa can occur due to gastric acid reflux, this phenomenon can also occur even within the normal range of gastric acid reflux; therefore, the presence of minimal change does not mean that a patient necessarily has GERD. When the diagnosis of LA grade M is made by endoscopy, the condition is not the same as LA grade A but is the same as LA grade N9,13. This is consistent with our study results.

Several studies have focused on the pathogenesis of minimal change esophagitis7,9, but the results have been conflicting. Lei et al.9 studied the pathogenesis of minimal change esophagitis in 100 patients (esophagitis without minimal change, n = 52; esophagitis with minimal change, n = 48). The rate of effective peristalsis was comparable in patients with and without minimal change esophagitis (p = not significant). There was no difference in the esophageal acid reflux status or DeMeester score between the two groups (p = not significant). Additionally, intragastric acidity (pH < 4) was comparable in patients with and without minimal change esophagitis. The authors concluded that among patients with NERD, the disease characteristics in terms of esophageal acid exposure and motor dysfunction may be similar between patients with and without minimal change esophagitis9. This finding is consistent with our results. Another study from Japan examined the difference in the pathogenesis with respect to esophageal acid reflux using 24-h pH monitoring between patients with LA grade M and those with LA grade N7. There was no significant difference in the quality of life score or patients’ backgrounds between the two groups. However, 57.1% (8/14) of patients with grade M had a pH of < 4 for ≥ 4% of the total time (abnormal acid reflux), compared with only 11.8% (2/17) of patients with grade N (p = 0.018). Nevertheless, the median percent time with a pH of < 4.0 was 1.5% (range, 0.0%–11.1%) for grade N and 6.4% (range, 0.3%–14.9%) for grade M, neither of which is high. Therefore, it is possible that this difference was not large enough to result in a difference in PPI responsiveness.

This study has several limitations. First, the interobserver agreement on the endoscopic diagnosis of LA grade M is known to be low, and we did not examine the interobserver agreement among the endoscopists at all institutions with regard to an endoscopic diagnosis of minimal change. Before the study, we had an opportunity to present representative endoscopic images and explain the endoscopic diagnosis of minimal change, and we obtained a consensus from the endoscopists participating in the study. We also believe that the accuracy of the endoscopic diagnosis of GERD is reliable because all doctors involved in this study were actively engaged in the treatment of GERD on a daily basis and were endoscopists certified by the Japanese Society of Endoscopy. Second, 24-h pH-impedance monitoring and high-resolution esophageal manometry were not performed in this study; therefore, patients with esophageal hypersensitivity, functional heartburn, or esophageal dysmotility may have been included in the NERD group. Third, because the LA grade was determined based on the endoscopy findings of patients who presented with symptoms that met the Montreal criteria, the NERD group may have included patients who previously had ERD and subsequently changed to NERD because of treatment or the natural history of the disease. Fourth, the power of the study was insufficient because of the small number of cases. Even when the effect size (Cohen’s d) was moderate (around 0.5), no significant difference was found between some groups, indicating that the study may not have sufficiently demonstrated a clinically meaningful difference. A study with a larger number of cases is desirable.

In conclusion, minimal change as an endoscopic classification of GERD is unnecessary in clinical practice. From the viewpoint of clinical characteristics, the classification of NERD should be sufficient for endoscopy of patients who have heartburn without mucosal breaks.

Methods

Study design

This multicenter prospective observational study was conducted at 29 institutions in Japan. At least one investigator per institution was a member of the GERD Society, a Japanese collaborative research group consisting of experts in the clinical practice of GERD treatment. This study was conducted in accordance with the Declaration of Helsinki (sixth revision, 2008) after obtaining approval from the ethics committee of each institution or the central ethics committee of Nishi Clinic, Osaka, Japan. The study was registered with the University Hospital Medical Information Network Center Clinical Trials Registry in Japan (reference number UMIN000006614).

The study design is shown in Fig. 5. Eligible patients were asked to complete the following questionnaires to evaluate patients’ clinical characteristics. Symptoms of GERD/FD and quality of life were assessed using the GERD-TEST17 and the SF-818 at weeks 0, 2, and 4. Psychiatric assessments were conducted using the HADS19 at weeks 0 and 4. All questionnaires were completed by the study participants themselves and mailed to the data center.

Figure 5

Study design.

Patients

Outpatients with gastroesophageal reflux symptoms were enrolled in this study. Patients were considered to have gastroesophageal reflux symptoms if they had experienced moderate or severe heartburn or acid regurgitation at least once a week or mild heartburn or acid regurgitation at least twice a week during the 2 weeks prior to this study, according to the Montreal definition20. After upper GI endoscopy, the patients were administered a PPI at the dose approved in Japan; i.e., omeprazole at 20 mg once daily, lansoprazole at 30 mg once daily, or rabeprazole at 10 or 20 mg once daily.

The eligibility criteria were (1) an endoscopic diagnosis of LA grade N, M, or A GERD according to the modified LA classification system (Fig. 6)6; (2) age of > 20 years at the time of providing consent; and (3) provision of written informed consent.

Figure 6

Typical endoscopic images of LA grades N and M. (a) LA grade N = no endoscopic changes in esophageal mucosa. (b) LA grade M = endoscopic appearance of discoloration of the esophageal mucosa.

The exclusion criteria were (1) concomitant or prior diseases that may affect the study results (e.g., Zollinger–Ellison syndrome, inflammatory bowel disease, irritable bowel syndrome, esophageal stricture, eosinophilic esophagitis, esophageal achalasia, malabsorption, cerebrovascular disease); (2) vomiting associated with other diseases, peptic ulcers except in the scar stage, or other symptoms of severe liver disease, renal disease, cardiac disease, psychiatric disease, metabolic disorder, neurological disease, or collagen disease; (3) confirmed or suspected malignancy; (4) history of GI surgery or vagotomy; (5) history of hypersensitivity to PPIs or their excipients; (6) eradication of H. pylori within 6 months prior to enrollment; (7) pregnancy, possible pregnancy, or lactation; (8) medication with a PPI or histamine type 2 receptor antagonist within 1 week prior to enrollment; and (9) ineligibility for the study as determined by the physician. Prohibited medications were those, other than the study medication, that may affect the study results (PPIs, histamine type 2 receptor antagonists, prokinetic agents, gastric mucosal protectants, and anticholinergics) and those that may interact with any of the study medications.

Details of questionnaires for data collection

The GERD-TEST is a patient-reported 13-item questionnaire developed to investigate the symptoms of GERD and dyspepsia, their impact on the patient’s daily life, and the patient’s impression of the treatment. The GERD-SS is defined as the mean of the heartburn (Q1) and acid regurgitation (Q2) scores, and the FD symptom subscale (FD-SS) is defined as the mean of epigastric pain/burning symptoms (Q3) and postprandial distress symptoms [postprandial fullness (Q4) and early satiation (Q5)]. The DS-SS is divided into dissatisfaction with eating (Q6), dissatisfaction with sleep (Q7), dissatisfaction with daily activity (Q8), and dissatisfaction with mood (Q9). Questions 10 to 13 focus on the effects of PPI treatment. The details of the GERD-TEST have been discussed in our previous report21.
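The subscale definitions above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the FD-SS is computed here as the simple mean of Q3, Q4, and Q5, which is one reading of the definition and should be checked against the original GERD-TEST report21.

```python
from statistics import mean


def gerd_ss(q1_heartburn, q2_regurgitation):
    """GERD symptom subscale: mean of the Q1 and Q2 item scores."""
    return mean([q1_heartburn, q2_regurgitation])


def fd_ss(q3_epigastric, q4_fullness, q5_satiation):
    """FD symptom subscale.

    ASSUMPTION: taken as the unweighted mean of Q3, Q4, and Q5.
    """
    return mean([q3_epigastric, q4_fullness, q5_satiation])


if __name__ == "__main__":
    # Illustrative item scores only, not study data.
    print(gerd_ss(4, 3))
    print(fd_ss(2, 3, 4))
```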

The SF-8 is a questionnaire used to assess patients’ health status and consists of a physical component summary and a mental component summary22. Other details about the SF-8 have been previously reported22.

The HADS is a well-established measure of psychiatric bias with subscales for anxiety and depression, each comprising seven items19. A higher score indicates a higher level of anxiety or depression. The anxiety and depression scores were compared among the three groups in the present study.

Therapeutic response to PPI therapy

The efficacy of PPI therapy in patients with GERD was evaluated using the following three indices, as we previously reported21: (1) the GERD-SS residual symptom rate: 100 (%) × (Week 4 GERD-SS score − 1)/(Week 0 GERD-SS score − 1), (2) the patient’s impression of the treatment: Q11 score on GERD-TEST (score of the impression that GERD symptoms have improved compared with those before the current medication; 1 = very much improved, 2 = improved, 3 = somewhat improved, 4 = no change, and 5 = worsened), and (3) the numeric rating scale score for GERD symptoms (Q12 of GERD-TEST): a numerical rating of the relative intensity of GERD symptoms (0 points = no symptoms, 10 points = same severity of symptoms as before treatment).
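As a worked illustration of index (1), the residual symptom rate can be computed as follows. The function name is hypothetical. Subtracting 1 from each score suggests that a GERD-SS of 1 corresponds to no symptoms; that reading is an assumption here, and the sketch simply refuses to compute a rate when the Week 0 score equals 1, because the denominator would be zero.

```python
def gerd_ss_residual_rate(week0_score, week4_score):
    """Residual symptom rate (%) = 100 * (Week 4 - 1) / (Week 0 - 1)."""
    baseline_excess = week0_score - 1
    if baseline_excess == 0:
        # ASSUMPTION: a score of 1 means no symptoms at baseline,
        # so the rate is undefined rather than zero.
        raise ValueError("Week 0 GERD-SS of 1 leaves the rate undefined")
    return 100 * (week4_score - 1) / baseline_excess


if __name__ == "__main__":
    # Illustrative scores only, not study data.
    print(gerd_ss_residual_rate(3.0, 2.0))
```

Under this formula, a lower rate means that fewer of the baseline symptoms remain after 4 weeks of treatment; a rate of 100% means no change from baseline.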

Statistical analysis

Data from patients who underwent upper GI endoscopy; completed questionnaires within 4 weeks before starting treatment; provided information on sex, age, height, and body weight; and had a medication adherence rate of ≥ 75% were analyzed. Patients with gastroesophageal reflux symptoms were divided into three groups according to the endoscopic findings: LA grade N, LA grade M, and LA grade A. Data are expressed as mean ± standard deviation. The statistical methods used to compare patients’ characteristics and treatment effects among the three groups were analysis of variance followed by Tukey’s honestly significant difference test, and Fisher’s exact test. The size of the difference between groups was evaluated by the effect size (Cohen’s d). Cohen’s d values of ≥ 0.20, ≥ 0.50, and ≥ 0.80 were defined as small, medium, and large effects, respectively23. JMP 12.0.1 software (SAS Institute Inc., Cary, NC, USA) was used for data analysis, and a p value of < 0.05 was considered to indicate statistical significance.
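The effect size thresholds above can be illustrated with a short Python sketch. It assumes the common pooled-standard-deviation form of Cohen’s d computed from group summary statistics; the text does not state which variant was used in JMP, so this is not necessarily the calculation performed in the study. The function names and the "negligible" label for values below 0.20 are hypothetical.

```python
import math


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d from summary statistics.

    ASSUMPTION: uses the pooled standard deviation of the two groups.
    """
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


def effect_label(d):
    """Classify |d| with the thresholds given in the text (0.20, 0.50, 0.80)."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "negligible"  # ASSUMPTION: label not defined in the text


if __name__ == "__main__":
    # Illustrative summary statistics only, not study data.
    d = cohens_d(10.0, 2.0, 20, 9.0, 2.0, 20)
    print(d, effect_label(d))
```

Because d is scaled by the standard deviation rather than by the sample size, a medium effect can coexist with a nonsignificant p value when groups are small.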

Ethics approval and consent to participate and publish

All procedures were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and the Helsinki Declaration of 1964 and later versions. Informed consent or a substitute was obtained from all patients for their participation in the study.