International literature provides evidence that bariatric surgery can contribute to substantial weight loss and a positive effect on obesity-related comorbidities [1, 2]. On the other hand, bariatric surgery may also lead to severe postoperative complications, as well as endocrine and metabolic complications [1, 3,4,5]. Psychological consequences of bariatric surgery were also described as complex and not always entirely understood [6,7,8,9]. Therefore, bariatric surgery requires a detailed evaluation of its impact at the patient level, which is best assessed with a Quality of Life (QoL) assessment [10].

Standard clinical outcomes, such as weight loss and resolution of comorbidities, are mostly objectively measured in registries, resulting in quantitative data, which are convenient for analyses. QoL assessments, however, are primarily patient-reported measurements and may be more challenging to interpret. On the other hand, it is essential to include QoL assessments in the evaluation of health interventions of bariatric surgery as the patient perspective can provide valuable information on the effectiveness of bariatric surgery that cannot be obtained from clinical outcome measures alone [10,11,12]. QoL could be measured by using questionnaires reflecting the patient’s perspective on the effects of the provided healthcare or treatment given to the patient in their daily lives [10, 11]. Literature showed that the Short Form 36-item Health Survey® (SF-36) is the most commonly used QoL measurement in bariatric surgery [10]. A nearly identical questionnaire is the RAND 36-item Health Survey (RAND-36) that evaluates the same domains as the SF-36. The difference between these two questionnaires mainly consists of the commercial fees required for using the questionnaire [13, 14]. The RAND-36 is also the standard QoL measuring tool offered to patients in all Dutch bariatric hospitals.

Recent studies mainly focussed on clinical outcomes such as total weight loss and obesity-related comorbidity reduction [15, 16]. The few initial studies which compared QoL after sleeve gastrectomy (SG) and Roux-en-Y gastric bypass (RYGB) did not use the RAND-36 and included a low sample size (range 50–1703) with only two studies of more than 1000 patients [17, 18]. Another pitfall in previously conducted studies is the low volume of postoperative respondents as well as single-centre studies, making the results on improvement after bariatric surgery less reliable and also not generalizable to other settings in daily practice [19]. Furthermore, a recent study comparing RYGB and SG in three different European countries showed differences in preoperative characteristics, which may have been the reason for a different surgical approach but could also affect the outcomes including postoperative QoL after bariatric surgery [20]. We therefore compared changes between these procedures, not only looking at statistical significance but also considering clinically relevant differences.

The aim of this study is to compare improvement in QoL after primary bariatric surgery for the two mainly performed primary bariatric procedures in the Netherlands: SG and RYGB. In addition, the study compares postoperative values with reference values for the general Dutch population. A multicentre study design is chosen for a better representation across multiple sites.

Materials and methods

QoL-data were prospectively collected from all patients undergoing a primary RYGB or SG in the five participating hospitals in the Netherlands between 1 January 2015 and 1 January 2017. QoL-data were linked to the national bariatric DATO-registry covering all centres providing bariatric surgery [21].

The scientific committee, which coordinates the national DATO-registry, represents all participating bariatric centres, and all members are mandated by the hospital where they practise. This committee approved the research proposal for the present study and the manuscript for publication. A more in-depth description of the scientific committee is given in an earlier scientific article [21].

Patients

In the Netherlands, patients with a body mass index (BMI) ≥ 40.0 kg/m2 or with a BMI ≥ 35.0 kg/m2 and one or more obesity associated comorbidities were eligible for bariatric surgery during the study period [21, 22]. These obesity associated comorbidities were type-2 diabetes mellitus (T2DM), hypertension (HT), dyslipidaemia, gastroesophageal reflux disease (GERD), obstructive sleep apnoea syndrome (OSAS) and musculoskeletal pain. Further treatment strategies, including the choice for SG or RYGB, were determined by a multidisciplinary team and by shared decision making with the patient.

Baseline characteristics in patients undergoing SG or RYGB were compared using the mean ± standard deviation (SD) for normally distributed variables and the median with interquartile range for non-normally distributed variables. The Mann–Whitney U test was performed for continuous variables and χ2 for categorical variables. The threshold for significance has been set at 0.05.

Quality of life (QoL)

During the development process of the nationwide DATO-registry, several questionnaires were considered, including the Bariatric Analysis and Reporting Outcome System (BAROS) [23, 24], SF-36 and RAND-36 questionnaires. Given the controversy surrounding BAROS, a generic QoL questionnaire was preferred. The RAND-36 and SF-36 are identical, except for different scoring algorithms for the pain and general health perception scales, resulting in the choice for the RAND-36 questionnaire [13, 14, 25, 26].

RAND-36

The Dutch version of the RAND-36 is a validated and standardized translation of the original RAND-36 questionnaire [13, 14]. The questionnaire contains 36 questions within nine scales. These scales are physical functioning, social functioning, physical role limitations, emotional role limitations, mental health, vitality, pain, general health perception and health change perception. Previous studies have shown this to be a valid tool for the measurement of QoL among obese patients undergoing bariatric surgery [25, 27, 28].

Each patient undergoing bariatric surgery in one of the five participating bariatric centres was included in the study. The preoperative questionnaire was completed during the initial screening for bariatric surgery. The postoperative questionnaire was administered 12 months (range 9–15) after primary surgery. The questionnaires were part of the standard given care in the five participating centres.

Analysing the questionnaire

All completed questionnaires were analysed by a predefined algorithm, provided by the RAND-36 research group and included in the original article [13, 14]. A brief summary is given below.

All scores were recoded following the provided algorithm: a high score indicates a more favourable health state (or outcome) of the patient [14, 26]. Each item was scored on a 0 to 100 range. An average of all item scores in each of the nine individual scales was calculated. Missing values were replaced with the personal mean of the specific scale if at least half of the questions of that scale were answered [13, 14, 26].
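The scale-level step described above (averaging the recoded items and applying the half rule for missing answers) can be sketched as follows. This is a minimal illustration in Python, not the RAND-36 research group's algorithm: the function name `score_scale` is hypothetical, the items are assumed to be already recoded to the 0 to 100 range, and the official item recoding tables are not reproduced here.

```python
from statistics import mean
from typing import Optional, Sequence


def score_scale(items: Sequence[Optional[float]]) -> Optional[float]:
    """Return the 0-100 scale score, or None if too few items were answered.

    `items` holds the already recoded item scores of one scale
    (0-100, higher = more favourable); None marks an unanswered question.
    """
    answered = [x for x in items if x is not None]
    # Half rule (as described in the text): at least half of the
    # questions of the scale must have been answered.
    if not answered or len(answered) * 2 < len(items):
        return None
    # Replace every missing item with the personal mean of the answered items.
    personal_mean = mean(answered)
    filled = [personal_mean if x is None else x for x in items]
    # The scale score is the average of all (filled) item scores.
    return mean(filled)
```

Because each missing item is replaced by the personal mean, the resulting scale score equals the mean of the answered items; the explicit fill step is kept only to mirror the description.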

Comparing to the Dutch reference population

First, postoperative RAND-36 scores were divided into six age groups (18–24, 25–34, 35–44, 45–54, 55–64 and 65 +). Second, the extent of improvement after surgery was calculated by subtracting preoperative RAND-36 scores from the postoperative RAND-36 scores, providing the delta separately for SG and RYGB. Third, the postoperative RAND-36 scores were compared with the Dutch reference values [13], in order to see if patients experience the same QoL postoperatively as the Dutch reference group.

Finally, for a valid comparison, the age distribution of the Dutch reference population [13] was applied to the age-specific QoL-values of the SG or RYGB patients to prevent overall values being different because of a difference in age distribution. The age-standardized QoL scores for each of the nine scales were compared for SG and RYGB patients with the Dutch population using the t-test.
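The age standardization described above corresponds to direct standardization: the age-specific mean scores of the surgical group are weighted by the age distribution of the reference population. The sketch below is illustrative only; the function name, the dictionary layout and all numbers in the example are assumptions and are not taken from the study or from the Dutch reference data.

```python
AGE_GROUPS = ["18-24", "25-34", "35-44", "45-54", "55-64", "65+"]


def age_standardize(group_means: dict, reference_weights: dict) -> float:
    """Weight age-specific mean scores by the reference age distribution.

    `group_means` maps each age group to the mean scale score of the patients.
    `reference_weights` maps each age group to its share (or count) in the
    reference population; the weights need not sum to 1.
    """
    total_weight = sum(reference_weights[g] for g in AGE_GROUPS)
    if total_weight <= 0:
        raise ValueError("reference weights must sum to a positive number")
    weighted_sum = sum(group_means[g] * reference_weights[g] for g in AGE_GROUPS)
    return weighted_sum / total_weight


# Made-up example values, for illustration only.
example_means = {"18-24": 90.0, "25-34": 88.0, "35-44": 85.0,
                 "45-54": 82.0, "55-64": 78.0, "65+": 70.0}
example_weights = {"18-24": 0.10, "25-34": 0.20, "35-44": 0.20,
                   "45-54": 0.20, "55-64": 0.15, "65+": 0.15}
standardized_score = age_standardize(example_means, example_weights)
```

With this weighting, two groups with identical age-specific scores obtain the same standardized score even if their own age distributions differ.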

Postoperative influences on the QoL outcomes

The Clavien–Dindo classification (CD) was used to determine whether a patient had experienced a severe postoperative complication [29, 30]. All patients with a complication of CD grade ≥ 3 within 30 days after primary surgery were registered as having a severe complication. Due to the low number of severe complications, both operative techniques were combined and a t-test was used to compare the group with severe complications with the uncomplicated group.

A distinction has also been made to see whether the achievement of Total Weight Loss (TWL) influences the QoL outcomes [21, 31]. All patients were subdivided into patients who reached 20% TWL within 12 months postoperatively and patients who did not. There were no patients in this cohort missing preoperative or postoperative weight scores. Both groups were compared using the t-test.
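The subdivision by weight loss can be illustrated with the usual definition of percentage total weight loss, %TWL = (preoperative weight − postoperative weight) / preoperative weight × 100. The sketch below is a hedged example: the function names are hypothetical, and treating exactly 20% as "reached" is an assumption, since the text does not state how the boundary was handled.

```python
def percent_twl(preop_kg: float, postop_kg: float) -> float:
    """Percentage total weight loss relative to the preoperative weight."""
    if preop_kg <= 0:
        raise ValueError("preoperative weight must be positive")
    return (preop_kg - postop_kg) * 100.0 / preop_kg


def reached_twl_target(preop_kg: float, postop_kg: float,
                       threshold: float = 20.0) -> bool:
    """True if %TWL is at or above the threshold (boundary handling assumed)."""
    return percent_twl(preop_kg, postop_kg) >= threshold
```

For example, a patient going from 120 kg to 90 kg has lost 25% of the preoperative weight and would fall in the group that reached the 20% target.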

Comparison between SG and RYGB

To compare SG and RYGB, we compared the extent of improvement between SG and RYGB patients, adjusted for patient variables that differed at baseline, using multivariate linear regression analysis and reporting the beta estimate and p-value.
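The idea of an adjusted comparison can be sketched with an ordinary least-squares fit in which the delta score is regressed on a procedure indicator plus baseline covariates; the coefficient of the indicator is then the adjusted beta estimate. This is an illustrative Python sketch only, not the code used in the study: the solver is a plain normal-equations implementation, the column layout (intercept, RYGB indicator, one comorbidity) is an assumption, and standard errors and p-values are not computed.

```python
from typing import List, Sequence


def ols(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    n_cols = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(n_cols)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(n_cols)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n_cols):
        pivot = max(range(col, n_cols), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n_cols):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n_cols] / A[i][i] for i in range(n_cols)]


# Made-up rows: [intercept, RYGB indicator (1 = RYGB, 0 = SG), T2DM (1 = yes)].
design = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 1, 0], [1, 0, 1]]
# Made-up delta scores, constructed as 10 + 3 * RYGB + 5 * T2DM.
deltas = [10, 13, 15, 18, 13, 15]
intercept, beta_rygb, beta_t2dm = ols(design, deltas)
```

In this constructed example the fitted coefficient for the procedure indicator recovers the built-in difference of 3 points between the procedures after adjusting for the comorbidity column.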

Analyses were performed using R version 3.5.1 with the R packages ‘Companion to Applied Regression’ (car 3.0–2), ‘A Grammar of Data Manipulation’ (dplyr 0.7.8) and ‘Table 1’ (tableone 0.9.3).

Table 1 Baseline characteristics

Results

A total of 5574 unique patients underwent a primary SG or RYGB. Patients who underwent surgery but did not complete both questionnaires (n = 710) were excluded. A total of 4864 (87.3%) patients were eligible for analyses, having completed both a preoperative and postoperative questionnaire. Baseline characteristics are shown in Table 1 and correspond to the national bariatric benchmark in the Netherlands [21].

Patients with a SG were significantly heavier (p < 0.001), reflected in both higher BMI and higher waist circumference (p < 0.001). Statistically significant differences in obesity-related diseases were seen for T2DM, hypertension, dyslipidaemia and musculoskeletal pain. All these obesity-related diseases were seen significantly more often in the RYGB group (p < 0.001) compared to SG.

Sleeve gastrectomy

For patients undergoing SG, the best postoperative QoL scores were seen in relatively young bariatric patients (Table 2a). However, the older bariatric patients showed a larger positive delta score and therefore showed a bigger improvement compared to the younger and middle-aged patients (Table 2b). For example, the youngest patients scored postoperatively better on physical functioning and physical role limitations. On the other hand, the delta scores for these domains are slightly lower for the younger patients compared to the oldest group (Table 2a, b).

Table 2 Postoperative changes after sleeve gastrectomy

Roux-en-Y gastric bypass

For RYGB patients, especially the three youngest categories showed better postoperative scores in almost all RAND-36 domains except for the domain vitality and general health perception (Table 3a). However, the largest improvement (delta) was seen in the oldest group (65 +). This applies to all domains except physical role limitations, emotional role limitations and health change. For these domains, the second-oldest group (55–64) showed the largest improvement (Table 3b).

Table 3 Postoperative changes after Roux-en-Y gastric bypass

Overall quality of life

For the comparison with the Dutch reference values, both the postoperative SG and RYGB values were standardized to the age distribution of the Dutch reference group [13]. As suggested in recent scientific literature, a statistically significant QoL-score difference of > 5% is considered a minimal important difference (MID) [32, 33].

Results showed a MID in the domains physical functioning (for RYGB), physical role limitations and health change (for both SG and RYGB) compared to Dutch reference values. Especially, for the domain health change, a large difference was observed (Table 4a). However, patients postoperatively still report lower scores on the domain general health perception. In addition, the SG scores were slightly lower than the RYGB in all RAND-36 domains (Table 4a).

Table 4 Postoperative measured changes compared to the Dutch benchmark

Complications and total weight loss

Hypothetically, a postoperatively complicated course has a negative influence on the postoperative QoL outcomes. To make reliable calculations, both operative techniques were combined and the postoperative complicated group was compared with the uncomplicated group. Table 4b shows that the positive effects in the domains physical functioning and physical role limitations (shown in Table 4a) have disappeared. A MID was seen in the domains of social functioning, physical role limitations, emotional role limitations, vitality and pain. Patients postoperatively still report lower scores on the domain general health perception, while they still report a significant health change (Table 4b).

SG vs. RYGB

Comparing the extent of improvement between SG and RYGB patients on each domain, a significant difference was seen in the domains physical functioning and general health perception when adjusted for differences in baseline characteristics: T2DM, hypertension, dyslipidaemia and musculoskeletal pain. These significant differences were mostly seen in the RYGB group (Table 5).

Table 5 Postoperative delta scale scores of sleeve gastrectomy and Roux-en-Y gastric bypass

Discussion

Current studies on bariatric surgery particularly focus on weight loss and improvement of obesity-related diseases but do not sufficiently take the patient's perspective into account [21, 34,35,36,37,38]. It is important to focus more on postoperative outcomes from a patient’s perspective, because of the enormous increase in bariatric procedures worldwide [22, 39,40,41].

There were a few initial studies comparing QoL after SG and RYGB, but these studies mostly had a small sample size and did not use the RAND 36-item Health Survey (RAND-36) [17, 18]. In addition, these studies were almost all single-centre studies and most of them reported low postoperative response rates [19].

However, there were two larger population-based studies comparing postoperative QoL after bariatric surgery. The first study, from Waljee et al., had a larger sample size compared with the previously noted studies, but did not distinguish between SG and RYGB and had a poor follow-up rate [42, 43]. The second study, from Sarwer et al., focused on the QoL and sexual functioning of patients with obesity and looked specifically at the changes in these domains, but did not make a distinction between RYGB and SG either [44]. As a result, the question remained which postoperative differences in QoL could be measured between the two most commonly used surgical techniques and how these changes relate to the Dutch population. Therefore, the present study was conducted as the first multicentre study comparing QoL between SG and RYGB with a large sample size and a postoperative response rate of more than 85%.

Results showed that bariatric patients had meaningfully higher postoperative scores on physical functioning, physical role limitations and health change for both SG and RYGB compared to Dutch reference values, but meaningfully lower scores on general health perception. It may be concluded that patients feel better postoperatively, but not yet fully healthy. These results could be a reason to focus more on these domains, so that bariatric patients do not end up in social isolation and come to feel as healthy as the national average.

Table 4b clearly showed that a severely complicated postoperative course or failure to achieve the desired weight loss influences the QoL outcomes. Whereas a meaningful positive postoperative score in physical functioning and physical role limitations was first seen (Table 4a), a significant negative score was now seen in the group with severe complications and the unsuccessful %TWL group. In addition, a negative trend was also observed in almost all other domains. This argues for better postoperative psychological support for patients whose outcomes do not meet Textbook Outcome [45].

The two bariatric surgical techniques showed a similar QoL improvement in all domains except for physical functioning and general health perception, for which RYGB patients showed a larger postoperative improvement. This difference could be explained by the underlying indication for treatment. Particularly for patients with a high BMI (> 50 kg/m2), a SG is preferred, so that a second-stage procedure may follow [46]. The use of the SG for morbidly obese patients stems from the use of this procedure as a modification to the duodenal switch. Later on, it was used as the first part of a two-stage gastric bypass procedure in morbidly obese patients. At the beginning of this century, multiple studies were published on laparoscopic sleeve gastrectomy as an isolated bariatric procedure, with promising results [47, 48]. During the time when this study was conducted, the RYGB was often used as a second-stage procedure in Dutch bariatric hospitals. In recent years, an increase in the use of the "one anastomosis gastric bypass" (OAGB) and the "single anastomosis duodeno-ileal bypass with sleeve gastrectomy" (SADI-s) was seen. In addition, minor modifications have been applied to the existing RYGB, making it more successful in patients with a higher BMI. This has led to a decrease in the number of SG procedures nowadays.

Patients with a SG could experience a worse health perception compared to patients with a single RYGB operation. Although BMI was included in the case-mix model, weight loss may also be experienced differently in the two groups.

Another significant difference was seen in the preoperative registered obesity-related diseases in the RYGB group. Several studies suggest that the RYGB has a greater beneficial effect on obesity-related diseases after surgery compared to SG [49,50,51,52]. This also could have an effect on the surgeon's choice for the type of bariatric surgery and therefore the differences in experienced QoL.

However, the proper interpretation of these results remains a point of discussion. As has been shown for other types of surgery and diseases, the RAND-36 is a generic questionnaire and may not be specific enough to fully analyse the QoL in bariatric patients [53, 54]. For example, when looking at physical functioning, the score was calculated on the basis of ten questions. These questions relate to typical activities during the day and may be among the key items for patients with obesity, but can only be answered with a limited number of options (limited a lot, limited a little or not limited at all), which is likely to capture only the very severe physical limitations, e.g. those induced by severe obesity. More subtle differences may not be measured adequately, while this is essential for obese and bariatric patients. Using a bariatric-specific quality of life questionnaire may detect more clinically relevant differences. However, as already mentioned in the introduction, there are currently no suitable bariatric questionnaires that can be applied in scientific research.

Another point of debate is calculating and reporting the MID. Not only may the baseline scores vary by population and context, but the differences experienced and noticed by the patient may also vary. Therefore, the proper interpretation of these results remains a point of discussion [55]. This means, for example, that an increase of 10 points should be interpreted differently if the baseline values differ: an individual rise from 10 to 20 on a 100-point scale can be interpreted differently than a rise from 60 to 70.

As mentioned earlier in the discussion, other studies have not shown differences between SG and RYGB in QoL, while different outcomes following these two operations can be hypothesized [17, 18]. For example, the indication for the type of bariatric surgery is mainly based on the choice and expertise of the specific surgeon. Some studies suggest that the RYGB has a more beneficial effect on metabolic obesity-related diseases [46]. We tried to correct for this by adding these baseline differences (Table 1) to the case-mix model.

There are some limitations, despite the accuracy of this study. A possible response bias could be generated by excluding patients without postoperative measurements from the study. However, this study has a high response rate, so it can be assumed that this influence is very small. In addition, this study did not focus on statistically significant differences alone but also described whether these differences were clinically relevant (minimal important difference; MID), which at the same time safeguards against finding chance differences. This study does not have specific data to make a comparison between patients who participated in a multidisciplinary postoperative coaching programme and patients who did not. These coaching programmes could consist of participating in support groups, appointments with a dietician and psychological follow-up by trained professionals.

There is still no consensus on whether to correct for multiple testing. Current statements say that an adjustment is particularly required in confirmatory analyses with multiple analyses leading to one final conclusion [56]. This study, on the other hand, is exploratory in nature, with the aim of not missing potentially important findings through a standard adjustment for multiple testing [57]. Therefore, the results in this study were not corrected for multiple testing.

The strength of this study was the large sample size, high response rate and the prospective study design. Most other studies collecting patient-reported outcome data struggle to get a sufficiently high response rate as a part of daily routine clinical care. Therefore, the number of requested items has been kept to a minimum, but at the cost of having limited other data to e.g. adjust for patient characteristics. Furthermore, this was a multicentre study including hospitals located in different geographic areas; therefore, a representative group from the population was obtained. A limitation of the current study was the availability of only one-year follow-up data. When all Dutch hospitals have implemented the PROMs registration, the follow-up will be extended to an annual follow-up up to five years after the primary surgery in the national DATO-registry. This will further substantiate the current outcomes of this study.

Conclusion

This study showed that bariatric patients achieve better postoperative physical functioning, physical role limitations and health change for both SG and RYGB compared to Dutch reference values, but worse general health perception. In addition, a larger improvement in general health perception was seen in patients who underwent RYGB compared to SG. Further studies are needed to develop a specific QoL questionnaire, which focuses on the different aspects of the bariatric patient and the different inclusion criteria for a specific procedure.