Introduction

Bariatric surgery produces major and sustained weight loss and improves metabolic comorbidities. It improves glycaemic control and can even induce diabetes remission, which can be complete or partial, defined by fasting glycaemia and HbA1c normalisation without glucose-lowering treatment 1 year post-bariatric surgery [1]. These observations recently led to the revision of treatment guidelines, which now recommend bariatric surgery in the treatment of type 2 diabetes at any stage of obesity [2]. These guidelines are expected to substantially augment the already increasing number of bariatric surgery interventions worldwide [3]. However, despite the beneficial effects of bariatric surgery on metabolic conditions, there is significant inter-individual variability in outcomes among individuals with type 2 diabetes. Outcomes are dependent on various factors including bariatric surgery procedure type and severity of type 2 diabetes before surgery.

A meta-analysis using an earlier definition of diabetes remission found that 78% of individuals with type 2 diabetes achieved diabetes remission post-bariatric surgery [4]. However, when the latest ADA proposed definitions [1] were applied across all bariatric surgery procedures, the proportion achieving diabetes remission decreased to 35%. When specifically focusing on the Roux-en-Y gastric bypass (RYGB), 40–60% of individuals achieve diabetes remission within 1 year [5, 6]. This remission rate decreases to 37% 5 years post-RYGB, denoting an important prevalence of relapse [7]. Furthermore, although bariatric surgery has overall beneficial health outcomes, perioperative morbidity and mortality rates remain at 3.4% and 0.3%, respectively [4]. Deleterious effects, such as nutritional deficiency, are also observed with different types of bariatric surgery [8, 9]. Together, the anticipated increasing number of bariatric surgery procedures and the uncertainty in predicting clinical outcomes, both short- and long-term, emphasise the need to establish useful and clinically applicable tools to predict metabolic/bariatric surgery outcomes [2].

Current clinical predictors include preoperative clinical variables (e.g. young age, short diabetes duration, type 2 diabetes control [i.e. low HbA1c], no insulin requirement), as well as post-bariatric surgery outcomes (e.g. significant post-bariatric surgery weight loss). Several scoring systems or statistical models based on these and other variables [10,11,12,13] currently help to predict diabetes remission post-bariatric surgery. Among them, the DiaRem, a scoring system based on preoperative age, HbA1c and the use of some glucose-lowering treatments, has a predictive accuracy of 84% 1 year post-RYGB [14]. However, the use of the DiaRem across the entire score spectrum has limitations. Individuals with a mid-range DiaRem score (i.e. between 8 and 17) only show a 50% probability of diabetes remission [13]. Furthermore, one-third of patients with a high score (i.e. those predicted to not achieve diabetes remission) manage to attain diabetes remission [15]. Importantly, the current DiaRem does not take into account novel glucose-lowering agents, such as glucagon-like peptide-1 (GLP-1) analogues, dipeptidyl peptidase-4 (DPP-IV) inhibitors or sodium–glucose co-transporter-2 (SGLT2) inhibitors, which may also influence diabetes remission [13]. Collectively, these observations prompted us to examine how to optimise this scoring system in order to improve outcome prediction before bariatric surgery.

We aimed to develop an improved predictive scoring system (i.e. the advanced [Ad]-DiaRem) for diabetes remission post-bariatric surgery by adding easily accessible clinical variables, and to test its predictive accuracy in a test cohort. We then validated this improved tool in two independent confirmation cohorts from France and Israel.

Methods

Study design and participants

We leveraged our ongoing cohort (‘BARICAN’ [BARiatric surgery cohort of the Institute of CArdiometabolism and Nutrition] recorded in CNIL [Commission nationale de l’informatique et des libertés] no. 1222666) followed in the Pitié-Salpêtrière Hospital Nutrition Department (Paris, France), which consists of obese individuals who meet standard guideline recommendations for bariatric surgery intervention [16]. We selected individuals who had undergone RYGB, excluding revisional surgery, and who had a very detailed clinical dataset at 1 year follow-up. Intending to build this putative optimised scoring system, we identified individuals with type 2 diabetes with baseline bioclinical and anthropometric variables, obesity-related disease information and detailed information on treatment usage, blood metabolic and inflammatory variables, adipocyte size and liver histological diagnosis.

The test cohort consisted of 213 participants with type 2 diabetes with complete data for all of the abovementioned variables (patients’ characteristics are displayed in Table 1 and ESM Table 1). It enabled the development of two different scoring systems: (1) the Ad-DiaRem, which included additional simple clinical variables that significantly differed at baseline between participants who had achieved diabetes remission and those who had not, and which we further compared with the existing DiaRem; and (2) the Costly-DiaRem, which was constructed for individuals falling within the Ad-DiaRem middle score category in order to further improve prediction accuracy for this group.

Table 1 Baseline characteristics of participants with type 2 diabetes before bariatric surgery according to remission status at 1 year post-surgery (test cohort)

A French confirmation cohort, also drawn from the ‘BARICAN’ cohort, consisted of 134 participants with type 2 diabetes with available information for each of the variables used in the Ad-DiaRem (Table 2). We further examined the Ad-DiaRem in another independent cohort from Israel, comprising 99 individuals with type 2 diabetes who had undergone RYGB as described previously [17]. The data for these participants were taken retrospectively from the electronic medical records of Clalit Health Services (CHS) and included individuals with type 2 diabetes who underwent bariatric surgery between 1999 and 2011, and had follow-up data until December 2014. Data from the CHS electronic database included the variables from the DiaRem and Ad-DiaRem. Figure 1 presents the study flow chart.

Table 2 Ad-DiaRem scoring system
Fig. 1

Study flow chart. The study included obese individuals with type 2 diabetes who underwent RYGB from a French cohort (n = 352) and an Israeli cohort (n = 99). The test cohort (n = 213) consisted of participants from the French cohort who had a complete set of data available at baseline. The French (n = 134) and Israeli (n = 99) confirmation cohorts were used for Ad-DiaRem external validation. DR, complete diabetes remission; PDR, partial diabetes remission; NDR, non-remission

Ethics approval was obtained from the French Research Ethics Committee of CPP Ile de France-1 No. 13533 and the Rabin Medical Center Ethics Committee. All participants provided written informed consent.

Definition of diabetes and 1-year remission outcomes

Type 2 diabetes was defined according to ADA criteria [18]. In the French and Israeli confirmation cohorts, 1-year remission outcomes were defined according to the latest ADA definition [1] (i.e. partial remission: HbA1c <6.5%, fasting plasma glucose [FPG] <7.0 mmol/l and no use of glucose-lowering agents at 1 year; complete remission: HbA1c <6.0%, FPG <5.6 mmol/l and no use of glucose-lowering agents at 1 year). All participants who achieved complete or partial remission were considered as the ‘remission group’ because they displayed blood glucose control normalisation without the use of glucose-lowering agents.
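The remission definitions above can be expressed as a short classification rule. The following is a minimal Python sketch, assuming HbA1c is given in % and FPG in mmol/l; the function and variable names are hypothetical and are not taken from the study's analysis code.

```python
from typing import Literal

RemissionStatus = Literal["complete", "partial", "non-remission"]


def classify_remission(
    hba1c_pct: float,
    fpg_mmol_l: float,
    on_glucose_lowering_agents: bool,
) -> RemissionStatus:
    """Classify 1-year outcome using the thresholds quoted in the text."""
    # Any glucose-lowering agent at 1 year rules out remission.
    if on_glucose_lowering_agents:
        return "non-remission"
    # Complete remission: HbA1c <6.0% and FPG <5.6 mmol/l.
    if hba1c_pct < 6.0 and fpg_mmol_l < 5.6:
        return "complete"
    # Partial remission: HbA1c <6.5% and FPG <7.0 mmol/l.
    if hba1c_pct < 6.5 and fpg_mmol_l < 7.0:
        return "partial"
    return "non-remission"


def in_remission_group(status: RemissionStatus) -> bool:
    """Complete and partial remission are pooled into one 'remission group'."""
    return status in ("complete", "partial")
```

Checking the stricter complete-remission thresholds first ensures that a participant who meets both definitions is labelled as complete rather than partial.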

Test cohort: bioclinical, anthropometric and histological variables

Baseline clinical information on diabetes duration (i.e. duration up to RYGB intervention), use of glucose-lowering agents, and obesity-related comorbidities and treatments (e.g. hypertension, obstructive sleep apnoea and dyslipidaemia) was collected as previously described [16]. Glucose-lowering medication groups were classified as follows: GLP-1 analogues, DPP-IV inhibitors, sulfonylureas, thiazolidinediones (TZDs), glinides, α-glucosidase inhibitors, metformin and insulin (basal and/or bolus). The number of glucose-lowering agents prescribed was defined as the number of the above drug categories in use.
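As an illustration of how the number of glucose-lowering agents could be derived from a medication list, the sketch below counts distinct drug categories. The category keys are hypothetical labels for the eight groups named above, and the reading that basal and bolus insulin together count once is an assumption based on the phrase 'insulin (basal and/or bolus)'.

```python
from typing import Iterable

# Hypothetical keys for the eight medication groups listed in the text.
DRUG_CATEGORIES = frozenset({
    "glp1_analogue",
    "dpp4_inhibitor",
    "sulfonylurea",
    "tzd",
    "glinide",
    "alpha_glucosidase_inhibitor",
    "metformin",
    "insulin",  # basal and/or bolus, assumed to count as a single category
})


def count_glucose_lowering_agents(prescribed: Iterable[str]) -> int:
    """Return the number of distinct drug categories a participant receives."""
    categories = set(prescribed)
    unknown = categories - DRUG_CATEGORIES
    if unknown:
        raise ValueError(f"Unknown drug categories: {sorted(unknown)}")
    return len(categories)
```

For example, a participant on metformin, a sulfonylurea and insulin would be counted as receiving three agents.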

Blood samples were collected at baseline after a 12 h overnight fast. Pancreatic beta cell function (insulin secretion) and insulin resistance were estimated using HOMA-β and HOMA-IR, respectively [19]. Body composition was evaluated by whole body dual-energy X-ray absorptiometry scan (DXA, Hologic Discovery W) [20].
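The text cites [19] for the HOMA indices without stating the formulas. The sketch below assumes the original (HOMA1) closed-form equations, with fasting glucose in mmol/l and fasting insulin in µU/ml; if the computer-based HOMA2 model was used instead, the values would differ. Function names are hypothetical.

```python
def homa_ir(fpg_mmol_l: float, insulin_uU_ml: float) -> float:
    """HOMA1 insulin resistance index: glucose x insulin / 22.5."""
    return fpg_mmol_l * insulin_uU_ml / 22.5


def homa_beta(fpg_mmol_l: float, insulin_uU_ml: float) -> float:
    """HOMA1 beta cell function (%): 20 x insulin / (glucose - 3.5)."""
    if fpg_mmol_l <= 3.5:
        # The formula is undefined or negative at or below 3.5 mmol/l.
        raise ValueError("HOMA-beta requires fasting glucose > 3.5 mmol/l")
    return 20.0 * insulin_uU_ml / (fpg_mmol_l - 3.5)
```

Both indices rely on a single fasting sample, which is why they are described in the text as estimates of insulin resistance and insulin secretion rather than direct measurements.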

Adipocyte diameter, which enabled the calculation of adipocyte morphology [21], was evaluated with Perfect Image (Clara Vision, Verrières le Busson, France) from subcutaneous adipose tissue (scAT) needle-aspirated biopsies after collagenase digestion as previously described [22]. Perioperative surgical liver biopsies were collected to assess non-alcoholic fatty liver disease (NAFLD) or non-alcoholic steatohepatitis (NASH) using the steatosis activity fibrosis (SAF) score [23, 24].

The DiaRem score

The DiaRem was initially established to predict the probability of complete or partial diabetes remission following bariatric surgery. The DiaRem score, ranging from 0 to 22, was calculated for each participant using age, HbA1c, and use of some glucose-lowering medications and insulin, each with a defined weight as described in [13] (see electronic supplementary material [ESM] Table 2).
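To make the structure of the score concrete, the following Python sketch computes a DiaRem-style score from the four inputs. The point values are not given in this text; those used here are the weights as commonly reported for the published DiaRem [13] and are an assumption that should be checked against ESM Table 2 before any use. They do sum to the stated maximum of 22. Function and parameter names are hypothetical.

```python
def diarem_score(
    age_years: float,
    hba1c_pct: float,
    on_other_agents: bool,
    on_insulin: bool,
) -> int:
    """DiaRem-style score (0 to 22); weights are assumed, see lead-in.

    on_other_agents: sulfonylureas or insulin-sensitising agents other
    than metformin (assumed definition of 'some glucose-lowering
    medications').
    """
    # Age points (assumed): <40 -> 0, 40-49 -> 1, 50-59 -> 2, >=60 -> 3.
    if age_years < 40:
        age_pts = 0
    elif age_years < 50:
        age_pts = 1
    elif age_years < 60:
        age_pts = 2
    else:
        age_pts = 3

    # HbA1c points (assumed): <6.5 -> 0, 6.5-6.9 -> 2, 7.0-8.9 -> 4, >=9.0 -> 6.
    if hba1c_pct < 6.5:
        hba1c_pts = 0
    elif hba1c_pct < 7.0:
        hba1c_pts = 2
    elif hba1c_pct < 9.0:
        hba1c_pts = 4
    else:
        hba1c_pts = 6

    other_pts = 3 if on_other_agents else 0   # assumed weight
    insulin_pts = 10 if on_insulin else 0     # assumed weight
    return age_pts + hba1c_pts + other_pts + insulin_pts
```

Lower scores correspond to a higher predicted probability of remission, so insulin use alone moves a participant well into the upper half of the scale under these assumed weights.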

Development of an optimised scoring system: Ad-DiaRem

We examined 43 baseline variables (11 clinical variables, 27 laboratory variables and five scAT and liver biopsy variables; Table 1) as potential variables that could improve the predictive power of the DiaRem. Multivariate logistic regression was performed to estimate the OR of potential predictors of remission. The variables whose ORs were significant (i.e. p < 0.05) were selected and included in the Ad-DiaRem scoring system (i.e. all the variables included in the DiaRem plus two easily accessible clinical variables: the number of glucose-lowering agents and diabetes duration).

Statistical analyses

Categorical variables are expressed as numbers and percentages, continuous data as means ± SD. Categorical data were analysed using Fisher’s exact test for two groups. Continuous data were analysed using the Student’s t test. The analyses were adjusted by age. Two-tailed p values were considered significant at p < 0.05. All analyses were conducted using R software version 3.0.3 (http://www.r-project.org) and GraphPad Prism 6.0.

Learning Ad-DiaRem

A clinical scoring system should be able to select relevant clinical variables, propose interpretable clinical thresholds and estimate weights for corresponding bins. We applied a machine learning method that simultaneously learns the restricted set of informative variables to retain. This method learns an interpretable binning of each variable with respect to the outcome class (complete/partial diabetes remission or non-remission) and provides optimal weights for these bins, which contribute to the score. For machine learning, we minimised empirical risk given the diabetes cohort and performed tenfold cross-validation to avoid possible overfitting. Specifically, as a classification algorithm, we used a sparse support vector machine. To solve the score-learning problem, we formulated it as a linear integer programming task and used the IBM ILOG CPLEX Optimization Studio (http://www-03.ibm.com/software), which is a state-of-the-art solver for constrained optimisation problems. We added integrality constraints to our task so that the resulting weights are integers. Further constraints shrink similar variables towards each other, creating bins and ordering them. The computations were done with R version 3.1.3 and the ‘Rcplex’ package, which is the interface to the IBM CPLEX Studio. The predictive performance of the different scores was evaluated by the area under the receiver operating characteristic (AUROC) curve using the DeLong method.
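For readers unfamiliar with the AUROC, the sketch below computes it for an integer score using the rank-based (Mann-Whitney) formulation: the probability that a randomly chosen non-remission participant has a higher score than a randomly chosen remission participant, with ties counted as one half. This is an illustrative Python example with hypothetical names, not the authors' R code, and it does not implement the DeLong method for comparing two AUROCs.

```python
from typing import Sequence


def auroc(scores: Sequence[float], non_remission: Sequence[bool]) -> float:
    """AUROC of a score where HIGHER values should indicate non-remission."""
    pos = [s for s, y in zip(scores, non_remission) if y]
    neg = [s for s, y in zip(scores, non_remission) if not y]
    if not pos or not neg:
        raise ValueError("Both outcome classes must be present")
    # Count pairs where the non-remission score is higher; ties count 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Because integer scores produce many tied pairs, the half-credit for ties matters in practice; it corresponds to the diagonal segments that ties produce on a ROC curve.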

Results

Clinical variables associated with 1-year diabetes remission post-RYGB

In the test cohort, 64% of participants achieved diabetes remission within 1 year (Fig. 1), concordant with previous reports [25]. Participants who achieved diabetes remission were younger, had significantly lower FPG and HbA1c, and were less likely to be treated with insulin or oral glucose-lowering agents other than metformin presurgery compared with those who did not achieve remission (Table 1, Fig. 2a). Those who achieved remission displayed a significantly higher BMI, higher DXA-evaluated fat mass and less abdominal fat distribution at baseline. Importantly, after adjustment for age, although differences in fat mass and its deleterious deposition (android/gynoid fat mass) remained significant, BMI did not. Participants who achieved remission also exhibited a shorter type 2 diabetes duration and potentially increased beta cell function as estimated by HOMA-β (Table 1). These differences remained significant after adjustment for age. The sex ratio was not significantly different between groups.

Fig. 2

(a) Number of glucose-lowering medications at baseline in the test cohort. Each bar represents the percentage of participants in the remission and non-remission groups receiving no glucose-lowering treatments (white), one glucose-lowering treatment (light grey), two glucose-lowering treatments (dark grey) or three or more glucose-lowering treatments (black). (b) Evaluation of DiaRem and Ad-DiaRem scores for the remission vs non-remission groups for the test cohort (DiaRem: AUROC 0.856, accuracy 0.789; Ad-DiaRem: AUROC 0.911, accuracy 0.841) and the French confirmation cohort (DiaRem: AUROC 0.893, accuracy 0.881; Ad-DiaRem: AUROC 0.939, accuracy 0.896). (c) The percentage of participants achieving remission according to DiaRem score in the test cohort (DR, complete diabetes remission; PDR, partial diabetes remission). (d) Distribution of participants according to each DiaRem score in the test cohort. (e) Distribution of participants according to Ad-DiaRem score in the test cohort in the remission vs the non-remission groups. (f) Evaluation of DiaRem and Ad-DiaRem scores in participants with low (0–2) compared with high scores (19–22 for DiaRem and 19–21 for Ad-DiaRem) for the remission vs the non-remission groups in the test (DiaRem test: AUROC 0.857, accuracy 0.873; Ad-DiaRem test: AUROC 0.955, accuracy 0.944) and French confirmation cohorts (DiaRem: AUROC 0.899, accuracy 0.846; Ad-DiaRem: AUROC 0.977, accuracy 0.96). (g) Evaluation of DiaRem and Ad-DiaRem scores in participants with low (0–5) compared with high scores (15–22 for DiaRem and 15–21 for the Ad-DiaRem) for the remission vs the non-remission groups in the test (DiaRem: AUROC 0.857, accuracy 0.887; Ad-DiaRem: AUROC 0.935, accuracy 0.965) and French confirmation cohorts (DiaRem: AUROC 0.891, accuracy 0.91; Ad-DiaRem: AUROC 0.964, accuracy 0.96). (h) DiaRem and Ad-DiaRem scores in participants with mid-range scores (8–14) for the remission vs non-remission groups in the test and confirmation cohorts. 
(i–k) Distribution of participants for the remission vs non-remission groups according to (i) Ad-DiaRem score in the French confirmation cohort, (j) DiaRem score in the Israeli confirmation cohort, and (k) Ad-DiaRem score in the Israeli confirmation cohort. (l) Evaluation of Ad-DiaRem in all participants in the Israeli confirmation cohort for the remission vs non-remission groups (AUROC 0.882). Diagonal segments are produced by ties. (d, e, i–k) Red bars, non-remission; green bars, remission; red background, 80% of participants with non-remission; green background, 80% of participants in remission. (b, f–h, l) Red lines, test cohort; blue lines, confirmation cohorts; dotted lines, DiaRem; solid lines, Ad-DiaRem

Although adipocyte diameter was increased in individuals with type 2 diabetes compared with those without diabetes (data not shown), it was not significantly different between the remission and non-remission groups at 1 year. Liver fibrosis scores (from liver biopsies) were more severe in participants who had not achieved remission compared with those who had (ESM Fig. 1), whereas other liver alterations (i.e. steatosis, inflammation activity, NAFLD/NASH scores) were similar between the groups. This exploration revealed that: (1) current DiaRem variables differed between the remission and non-remission groups, and (2) additional factors (i.e. number of glucose-lowering agents, diabetes duration and body composition variables) also varied.

DiaRem score in the test cohort

When evaluating the DiaRem in our test cohort, we found an AUROC of 85% (Fig. 2b). Using the Youden method, the threshold for remission was calculated to be a score of 7 (i.e. participants with a DiaRem score <7 should achieve diabetes remission), confirming our previous findings in another independent group [14]. Although the overall predictive accuracy of DiaRem was 78.9% (Fig. 2b), the false-positive (remission was predicted in participants who failed to achieve remission, n = 9) and false-negative (non-remission was predicted in participants achieving remission, n = 41) rates were quite high. Positive predictive value (PPV) was high (0.91) but negative predictive value (NPV) was much lower (0.62).
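The threshold selection and predictive values described above can be sketched as follows. This Python example assumes the decision rule 'predict remission if score < threshold' and chooses the threshold that maximises Youden's J (sensitivity + specificity − 1). The data in the usage note are toy values made up for illustration and are not the study data; all names are hypothetical.

```python
from typing import Sequence, Tuple


def confusion(
    scores: Sequence[int], remission: Sequence[bool], threshold: int
) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN), where 'positive' means predicted remission."""
    tp = fp = tn = fn = 0
    for s, r in zip(scores, remission):
        predicted = s < threshold
        if predicted and r:
            tp += 1
        elif predicted and not r:
            fp += 1  # remission predicted, but not achieved
        elif not predicted and r:
            fn += 1  # non-remission predicted, but remission achieved
        else:
            tn += 1
    return tp, fp, tn, fn


def youden_threshold(
    scores: Sequence[int], remission: Sequence[bool]
) -> Tuple[int, float]:
    """Return (threshold, J) maximising Youden's J; the lowest threshold wins ties."""
    if not any(remission) or all(remission):
        raise ValueError("Both outcome classes must be present")
    best_t, best_j = min(scores), -1.0
    for t in range(min(scores), max(scores) + 2):
        tp, fp, tn, fn = confusion(scores, remission, t)
        j = tp / (tp + fn) + tn / (tn + fp) - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def ppv_npv(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Positive and negative predictive values from confusion counts."""
    return tp / (tp + fp), tn / (tn + fn)
```

On the toy data `scores = [0, 2, 4, 6, 8, 12, 15, 20]` with `remission = [True, True, True, True, False, True, False, False]`, the rule selects a threshold of 7, at which one remitter (score 12) is a false negative and there are no false positives.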

Subsequently, participants were stratified into five groups according to their DiaRem score: 0–2 (highest probability of remission), 3–7, 8–12, 13–17 and 18–22 (lowest probability of remission) (Fig. 2c). A high proportion of participants with low scores (i.e. 0–2 or 3–7) achieved remission, indicating a good predictive value of DiaRem for individuals in this range (Fig. 2c). However, only about half of the participants with scores ranging from 8 to 12 attained remission, demonstrating a poor predictive performance in this intermediate range. We observed a high degree of misclassification in this middle range (i.e. 27 participants [12.6%] with a DiaRem score of 8–17 still experienced remission) (Fig. 2c). Altogether, the majority of the individuals in either the remission or the non-remission groups were not readily separable by DiaRem, with an overlap between the score ranges that cumulatively included 80% of either group (Fig. 2d).
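The five-group stratification described above, and the remission proportion within each group, can be reproduced with a few lines of Python. This is a minimal sketch with hypothetical function names; the group boundaries are those stated in the text.

```python
from bisect import bisect_left
from collections import defaultdict
from typing import Dict, Sequence

# Upper bounds and labels of the five DiaRem score groups from the text.
GROUP_UPPER_BOUNDS = [2, 7, 12, 17, 22]
GROUP_LABELS = ["0-2", "3-7", "8-12", "13-17", "18-22"]


def diarem_group(score: int) -> str:
    """Map a DiaRem score (0 to 22) to its stratification group."""
    if not 0 <= score <= 22:
        raise ValueError("DiaRem score must be between 0 and 22")
    # bisect_left finds the first group whose upper bound is >= score.
    return GROUP_LABELS[bisect_left(GROUP_UPPER_BOUNDS, score)]


def remission_rate_by_group(
    scores: Sequence[int], remission: Sequence[bool]
) -> Dict[str, float]:
    """Proportion of participants in remission within each non-empty group."""
    totals: Dict[str, int] = defaultdict(int)
    remitted: Dict[str, int] = defaultdict(int)
    for s, r in zip(scores, remission):
        group = diarem_group(s)
        totals[group] += 1
        remitted[group] += int(r)
    return {g: remitted[g] / totals[g] for g in GROUP_LABELS if g in totals}
```

A per-group proportion near 0.5, as reported for the 8–12 range, means the score carries little information for participants in that group.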

These results indicate a satisfactory predictive value of the DiaRem score for the extreme ranges, but many participants remained incorrectly classified. This prompted us to evaluate whether other variables could improve predictive accuracy.

Ad-DiaRem score improves prediction of diabetes remission 1 year post-RYGB

We examined baseline variables that significantly differed between the remission and non-remission groups (i.e. p < 0.05, Table 1) to develop an improved tool for the prediction of remission (Ad-DiaRem; ESM Table 3). After adjustment for the four variables already present in the DiaRem, the ORs were significant for number of glucose-lowering agents, diabetes duration and DXA-evaluated body composition, but not BMI. Since DXA might not be easily accessible in all clinical settings, we first tested whether including only two additional clinical variables would be sufficient to improve the DiaRem accuracy.

The Ad-DiaRem (Table 2) led to a better classification of participants achieving remission with an improved AUROC and accuracy compared with the DiaRem (0.911 vs 0.856 and 0.841 vs 0.789, respectively; p = 0.03) (Fig. 2b, e). Compared with the DiaRem (Fig. 2d), the Ad-DiaRem created a better separation of 80% of the participants that achieved remission vs those that did not (i.e. the majority [80%] of both groups did not overlap with the Ad-DiaRem) (Fig. 2e). Additionally, the Ad-DiaRem demonstrated better PPV and NPV (0.93 and 0.72, respectively) compared with the DiaRem (0.91 and 0.62, respectively); thus leading to improved classification of 16 (8%) participants who were initially misclassified. In total, the DiaRem correctly classified 164 of 213 participants from the test cohort, whereas 180 participants were correctly classified by the Ad-DiaRem. Using the Youden method, the threshold for remission was calculated to be below a score of 10 (i.e. participants with an Ad-DiaRem score <10 should achieve diabetes remission).

The predictive improvement was most noticeable for participants with low (0–5) or high (15–21) scores. As a consequence, the AUROC and accuracy calculation of Ad-DiaRem was better for the extreme ranges compared with the DiaRem, nearly reaching significance (Fig. 2f, g; p = 0.06 for comparison between scores 0–5 and 17–21/22 in the DiaRem and Ad-DiaRem).

For participants with mid-range scores, the Ad-DiaRem correctly reclassified 12 out of the 24 that the DiaRem incorrectly predicted as non-remission. Although AUROC and accuracy were increased with Ad-DiaRem for these participants (Fig. 2h), the difference did not reach statistical significance comparing the two scores.

Next we examined the Ad-DiaRem prediction accuracy in French and Israeli confirmation cohorts. In the French cohort, 58% of the participants achieved remission 1 year post-bariatric surgery (Fig. 1). Figure 2b and i show that, compared with the DiaRem, the Ad-DiaRem better classified participants in the French cohort, with an increased proportion with low scores (0–2 and 3–5) achieving remission and a very high proportion with high scores (17–22) with non-remission. This improvement was retained in different scoring sub-categories (Fig. 2f–h). Compared with the DiaRem, the Ad-DiaRem score correctly reclassified ten (7.4%) participants, and the overall accuracy and AUROC of Ad-DiaRem in predicting remission (vs non-remission) was superior in the test and confirmation cohorts (Fig. 2b; p = 0.03). NPV also increased with Ad-DiaRem in this confirmation cohort compared with the DiaRem (0.82 vs 0.75, respectively). A similar added value of the Ad-DiaRem was found when comparing participants with complete diabetes remission vs non-remission in the test and confirmation cohorts (i.e. excluding participants with partial remission) (ESM Fig. 2).

Of the 99 participants from the Israeli cohort, 58% achieved diabetes remission. Similar to the observations made in the French cohort, Ad-DiaRem clearly separated the majority (80%) of those individuals who achieved remission from those who did not (Fig. 2k), whereas DiaRem exhibited an overlap between the groups (Fig. 2j). Furthermore, the AUROC increased from 0.825 with DiaRem to 0.882 with Ad-DiaRem (Fig. 2l). The distribution of Ad-DiaRem scores among the remission and non-remission groups in the Israeli confirmation cohort is shown in ESM Table 4.

Added value of other bioclinical variables for predicting diabetes remission post-RYGB

To evaluate if we could further improve Ad-DiaRem performance for participants with mid-range scores (8–14), we tested the addition of other variables, e.g. DXA-measured fat mass, fat-free mass proportion, fat mass/fat-free mass ratio, serum C-reactive protein (CRP) and HOMA-β. At baseline, these variables significantly differed between participants who achieved remission and those who did not.

Using a binning method, we developed the Costly-DiaRem scoring system, which penalised for low fat mass (%), high fat mass/fat-free mass ratio, high android/gynoid fat mass ratio, high serum CRP and low HOMA-β (see ESM Table 5). Despite the inclusion of these additional bioclinical variables providing deeper phenotyping, the Costly-DiaRem did not perform better than the Ad-DiaRem in any scoring range (ESM Fig. 3). Furthermore, prediction accuracy was not improved by the simple addition of HOMA-β to the Ad-DiaRem (data not shown).

Discussion

Here, we show that the Ad-DiaRem scoring system improves predictive accuracy for diabetes remission 1 year post-RYGB compared with the currently proposed DiaRem in a population of severely obese individuals with type 2 diabetes. Of the 347 individuals from the French cohort (214 with complete/partial diabetes remission), 16 were correctly reclassified using the Ad-DiaRem. This improved scoring system has two additional variables that are easily recordable in clinical practice (i.e. diabetes duration and number of glucose-lowering agents) and has modified scoring for each variable. The accuracy for predicting diabetes remission was significantly increased with Ad-DiaRem, as evidenced in French and Israeli confirmation cohorts. Developing an accurate scoring system to better predict the outcomes of bariatric surgery is becoming necessary as the number of bariatric surgery procedures being undertaken increases worldwide [3]. This increase is compounded by new guidelines for type 2 diabetes management now recommending a lower BMI cut-off for bariatric surgery in treatment algorithms [2]. Not all individuals experience the same beneficial outcomes following bariatric surgery, both in the extent of weight loss [26] and metabolic improvements [4]. Therefore, the development of reliable predictive tools will help routine care decision making and, in the future, support the personalisation of pre- and postoperative care pathways.

The DiaRem, recently created using Cox regression analysis of 5-year follow-up data in 690 participants [13], demonstrated good predictive performance for 1-year remission, despite slightly lower accuracy in confirmation cohorts [13]. Here, although we confirmed the performance of the DiaRem in French and Israeli cohorts, a significant number of individuals remained misclassified [13, 15], primarily those in the medium score range (8–17), which comprised about one-third of participants. The Ad-DiaRem significantly decreased the predictive errors for the overall cohorts and for participants with mid-range scores. The Ad-DiaRem exhibited a PPV of 0.93 and NPV of 0.72 in predicting remission in the test cohort, thus improving the predictive accuracy of the previously published DiaRem.

The improved performance of Ad-DiaRem is likely to be due to multiple factors. First, the DiaRem score included age, a rather indirect marker of diabetes duration. Although increasing age is generally correlated with longer diabetes duration, it is known that, with the dramatic increase in obesity prevalence worldwide, type 2 diabetes now occurs earlier [27]. Therefore, the small penalty assigned for age below 40 years in the DiaRem might not be fully accurate anymore [13]. Diabetes duration is regarded as a consistent marker of disease progression and recognised as the best predictor of post-bariatric surgery diabetes remission [2, 28, 29]. Since this variable was not available in the Still et al database used for the DiaRem calculation, it could not be integrated [13]. Diabetes duration was integrated into another predictive tool for diabetes remission post-bariatric surgery, the ABCD [12]. However, the ABCD scoring system did not perform as well as the DiaRem [14] and might not be convenient for routine use as it relies on fasting C-peptide, an expensive serum marker that is not easily available in routine care. Admittedly, diabetes duration is not absolutely accurate. It is usually self-reported and the true onset of disease is indolent. Frequently, type 2 diabetes is diagnosed long after beta cell function has declined [30]. Still, diabetes duration is easy to collect and its value is demonstrable in the Ad-DiaRem. Second, the DiaRem does not take into account currently available drugs for type 2 diabetes treatment, mainly DPP-IV inhibitors and GLP-1 analogues. This latter class is widely used in individuals with type 2 diabetes who are obese because it improves glucose control and decreases weight in some individuals [31]. We hypothesised that taking into account the overall number of drugs might be more reflective of disease progression during the preoperative stage. Thus, we integrated this information into the Ad-DiaRem (Table 2). 
Furthermore, since the choice of glucose-lowering agents is not standardised among countries [32, 33] and agents are prescribed according to tolerance and side effects, we believe that including the number of glucose-lowering agents in the score will better account for individual heterogeneity.

By using this retrospective cohort of individuals with type 2 diabetes who had undergone RYGB and were extensively phenotyped at baseline, we also describe new clinical variables that were associated with non-remission. Compared with those individuals who achieved at least partial remission, those with non-remission had less adipose tissue (lower fat mass); however, these individuals displayed a more android fat mass distribution at baseline, which is recognised as detrimental for metabolic complications [34]. They also displayed liver fibrosis more frequently.

The Ad-DiaRem improved predictive accuracy compared with the DiaRem but did not fully solve misclassification of individuals with mid-range scores. Despite our effort to add other detailed phenotypic characteristics differing at baseline between participants who achieved remission and those who did not (i.e. body composition data and insulin secretion index) to the Ad-DiaRem, we were unable to further improve prediction accuracy. This opens up the possibility of testing other biological markers. For example, recent literature points to the importance of genomic variation (single nucleotide polymorphisms) related to insulin secretion in the prediction of diabetes remission post-bariatric surgery, suggesting that measures related to pancreas failure to (hyper-)secrete insulin might be of interest. The added value of genetic scoring must be examined in comparison with scores using clinical variables and other measurements linked to individuals’ impaired metabolism. However, it is unknown whether adding more complex individual information derived from high throughput analysis, such as systemic proteomics, metabolomics or metagenomics [35, 36], or tissue alterations would be helpful in improving prediction, particularly in individuals with mid-range scores. As such, we previously described that adipose tissue fibrosis is associated with reduced weight loss post-bariatric surgery [22, 37]. Whether adipose tissue scoring might be useful for predicting post-bariatric surgery outcomes needs to be investigated in the future.

Our study has some limitations. First, we focused on the achievement of complete/partial diabetes remission 1 year post-RYGB. Studies now demonstrate that remission is not sustained in all individuals [28]. For instance, 43% of participants who achieved 1-year remission displayed type 2 diabetes recurrence 5 years post-bariatric surgery [38]. This highlights the need to evaluate long-term glycaemic outcomes in type 2 diabetes and test the relevance of the Ad-DiaRem in the long term [39]. Indeed, type 2 diabetes is a progressive disease that worsens with time [40, 41] and bariatric surgery may only induce transient remission followed by resurgence or exacerbation. Nevertheless, even individuals who do not achieve remission improve their glycaemic control, with reductions in the number of glucose-lowering agents and in HbA1c, as observed in two long-term randomised controlled trials [7, 42]. When tested for prolonged remission 5 years post-RYGB, the DiaRem was not optimal for predicting remission in individuals with high scores [39]. Second, the Ad-DiaRem needs to be investigated for other types of bariatric surgery procedures, in particular sleeve gastrectomy, a procedure that is being used increasingly worldwide [3]. Finally, it should be noted that we tested the validity of the Ad-DiaRem solely in a population of severely obese individuals, which, to date, represents the majority of bariatric surgery candidates [43, 44]. However, the Ad-DiaRem should be further tested in diabetic individuals with less severe obesity as these will increasingly become candidates for bariatric surgery procedures based on recent recommendations [2].

Conclusions

Here, we described the development of the Ad-DiaRem for predicting diabetes remission following Roux-en-Y gastric bypass in obese individuals with type 2 diabetes. We demonstrated its ability to better separate between individuals predicted to achieve remission and those who will not, and its improved predictive performance over the original DiaRem scoring system. In the future, individuals with type 2 diabetes who are not expected to achieve remission might benefit from a care pathway with a more intensive follow-up and/or increased physical activity. These approaches need to be further tested and new guidelines proposed.