Patients and data acquisition
This population-based cohort study involved a secondary analysis of data from the National Neurology Registry (NNEUR) of Malaysia. Data on all Malaysian patients with a history of index IS from August 2009 to December 2016 were extracted from the NNEUR. Details of the National Stroke Registry of Malaysia have been published previously (21-23). Stroke was diagnosed according to the World Health Organization’s criteria (24), and all diagnoses were confirmed using brain computed tomography or magnetic resonance imaging. Index IS was defined as the first stroke registered in the NNEUR for a patient from 2009 to 2016. Recurrent IS was defined as any IS event recorded by participating hospitals after the index IS for a specific patient in the NNEUR database. Malaysian adults aged above 18 years with a history of IS who were registered with the NNEUR were included. Non-Malaysian citizens and patients with diagnoses other than IS were excluded. The minimum number of events needed to develop this prognostic model was calculated as 228 using an online sample size calculator for survival analysis (Sample Size Calculators, sample-size.net).
Stroke Registry in Malaysia
The NNEUR in Malaysia was established in 2009. It is a multicentre, hospital-based registry that has recorded data on stroke cases in a multi-ethnic population from 13 states in the country. The NNEUR aims to provide comprehensive epidemiological data on the country’s stroke statistics, trends, and management. The registry development is funded by the Ministry of Health, Malaysia (MOH). A comprehensive explanation of the NNEUR has been published previously (25).
Ethics Approval
Ethical approval for this study was obtained from the Medical Research and Ethics Committee (MREC), Ministry of Health, Malaysia (Research ID: NMRR-08-1631-3189). All methods were performed in accordance with the Declaration of Helsinki. Informed consent was obtained from all subjects included in this study.
Collected variables
Demographic data and concomitant diseases, including DM, HTN, HPLD, IHD, and hyperuricemia, were collected and tested as potential covariates. Concomitant diseases were defined by physician diagnosis, from patients’ electronic records, or deduced from the medication history and the medications prescribed at discharge.
Data for external validation
To perform external validation of the developed model, data on demographics and on the significant covariates identified in the final model were extracted for all Malaysian patients with a history of index IS registered in the NNEUR from January 2017 to December 2020.
Analysis
Repeated time to recurrent IS events and the factors predicting recurrent IS were quantified using NONMEM version 7.5 and Perl speaks NONMEM (PsN) version 4.1.0. An event was defined as a recurrent IS after the index IS. All event times were treated as exact, i.e., each event was assumed to occur at the recorded observation time. Three models (exponential, Gompertz, and Weibull) were investigated for the baseline hazard.
Model development
The model was developed in the following two steps: (i) a base model without any explanatory factors and (ii) exploration of covariates.
Development of the base model
A parametric survival function according to Equation 1 was used to describe the repeated time to the recurrent IS.
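Assuming the standard relationship between the survivor function and the hazard, which matches the definitions that follow, Equation 1 can be written in this form (the integration variable u is notation introduced here):

```latex
S(t) = \exp\left( -\int_{0}^{t} h(u)\, du \right) \tag{1}
```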
where S(t) is the survivor function, i.e., the probability of not experiencing any recurrent IS between time zero and time t, and h(t) is the time-varying hazard. S(t) is therefore a function of the cumulative hazard over this interval.
The base model was developed by exploring different functions for the hazard h(t), starting from a simple time-independent constant hazard and then gradually progressing to more complex functions, including Gompertz and Weibull according to Equations (2), (3), and (4) respectively (26).
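For orientation, commonly used forms of these three hazard functions are sketched here. The symbols λ (scale) and γ (shape) are notation introduced for this sketch, and the Weibull parameterization shown is one of several in use, so the exact forms applied in the analysis may differ:

```latex
\begin{align}
h(t) &= \lambda \tag{2} \\
h(t) &= \lambda \exp(\gamma t) \tag{3} \\
h(t) &= \lambda \gamma (\lambda t)^{\gamma - 1} \tag{4}
\end{align}
```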
The baseline hazard of recurrent IS at different time points after the index IS was quantified based on Equation 5, which gives an example of how the baseline hazard h0(t) changes across different time intervals.
Between-subject variability around the hazard was estimated, assuming an exponential distribution for the random effect.
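The link between a candidate hazard function and the resulting survival probability can be illustrated with a minimal Python sketch. It is not the NONMEM implementation used in the analysis: the function names, the parameter names lam and gamma, the Weibull parameterization, and the numerical values in the usage note are all assumptions made for illustration, and between-subject variability is ignored.

```python
import math


def hazard_exponential(t, lam):
    """Constant (time-independent) hazard."""
    return lam


def hazard_gompertz(t, lam, gamma):
    """Gompertz hazard: changes exponentially with time."""
    return lam * math.exp(gamma * t)


def hazard_weibull(t, lam, gamma):
    """Weibull hazard (one common parameterization, assumed here)."""
    return lam * gamma * (lam * t) ** (gamma - 1)


def survival(hazard, t, steps=10000, **params):
    """S(t) = exp(-cumulative hazard), integrated with the midpoint rule.

    The midpoint rule never evaluates the hazard at t = 0, which avoids
    the singularity of the Weibull hazard when gamma < 1.
    """
    if t <= 0:
        return 1.0
    du = t / steps
    cumulative = sum(hazard((i + 0.5) * du, **params) for i in range(steps)) * du
    return math.exp(-cumulative)
```

With a hypothetical constant hazard of 0.1 per year, `survival(hazard_exponential, 2.0, lam=0.1)` gives the probability of remaining free of recurrence at 2 years, which equals exp(-0.2) for this model.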
Development of the covariate model
Possible explanatory variables that may influence or predict the changes in hazard were explored by including each explanatory variable in the hazard function. A parameter, βn, for each of the n explanatory variables, Xn, was estimated using the following equation.
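Assuming the usual proportional hazards form, which is consistent with the definitions that follow, the covariate model can be written as:

```latex
h(t) = h_0(t) \cdot \exp\left( \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n \right)
```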
where h0 is the baseline hazard and βn is the coefficient for the explanatory variable Xn, describing how the hazard varies with that variable. Exponentiating the coefficient gives the hazard ratio (HR), which reflects the influence of the explanatory variable relative to the hazard when the explanatory variable is not present.
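The relationship between a coefficient, its hazard ratio, and the resulting hazard can be sketched in Python. This is a simplified illustration under a proportional hazards assumption without between-subject variability; the function names are hypothetical and any coefficient values passed in are placeholders, not estimates from this study.

```python
import math


def hazard_ratio(beta):
    """HR for a binary covariate: exp(beta)."""
    return math.exp(beta)


def hazard_with_covariates(h0, betas, xs):
    """Baseline hazard scaled by exp(sum of beta_n * X_n)."""
    linear_predictor = sum(b * x for b, x in zip(betas, xs))
    return h0 * math.exp(linear_predictor)


def survival_with_covariates(s0, betas, xs):
    """Under proportional hazards, S(t | X) = S0(t) ** exp(sum(beta_n * X_n))."""
    linear_predictor = sum(b * x for b, x in zip(betas, xs))
    return s0 ** math.exp(linear_predictor)
```

For example, a hypothetical coefficient of ln(2) for a covariate that is present (X = 1) doubles the hazard, so a baseline survival probability of 0.9 becomes 0.9 squared, or 0.81.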
Initially, the covariates were tested in a univariate manner, i.e., each covariate relationship was evaluated on the base hazard individually. Then, based on the results, covariate relationships were selected for a systematic covariate search using a stepwise approach, i.e., stepwise forward inclusion followed by backward elimination (27).
In the forward inclusion, the statistical significance level was set to P < 0.05, which corresponds to a reduction in the objective function value (OFV) of at least 3.84 for one degree of freedom (addition of one covariate parameter). In the backward elimination, the significance level was set to P < 0.01; a covariate was kept in the model only if its removal increased the OFV by at least 6.64 for one degree of freedom.
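The two decision rules can be expressed as a short Python sketch. It uses only the thresholds stated above; the function names and the OFV values in the usage note are hypothetical. The p-value helper relies on the identity that, for one degree of freedom, the chi-square upper tail probability equals erfc(sqrt(x / 2)).

```python
import math

FORWARD_DELTA_OFV = 3.84   # P < 0.05, one degree of freedom
BACKWARD_DELTA_OFV = 6.64  # P < 0.01, one degree of freedom


def chi2_1df_p_value(delta_ofv):
    """Upper-tail probability of a chi-square variable with 1 df."""
    if delta_ofv <= 0:
        return 1.0
    return math.erfc(math.sqrt(delta_ofv / 2.0))


def include_in_forward_step(ofv_base, ofv_with_covariate):
    """Include the covariate if adding it lowers the OFV by at least 3.84."""
    return (ofv_base - ofv_with_covariate) >= FORWARD_DELTA_OFV


def keep_in_backward_step(ofv_full, ofv_without_covariate):
    """Keep the covariate if removing it raises the OFV by at least 6.64."""
    return (ofv_without_covariate - ofv_full) >= BACKWARD_DELTA_OFV
```

As an illustration with made-up values, a covariate that lowers the OFV from 1000.0 to 995.0 passes forward inclusion (a drop of 5.0), but would be removed in backward elimination because the same 5.0-point difference is below 6.64.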
Model evaluation
Parameters were estimated using the LAPLACE method (ADVAN=6 TOL=9 NSIG=3) in NONMEM to obtain maximum likelihood estimates of the time-to-event parameters. The parametric repeated time-to-event (RTTE) analysis was performed using NONMEM v7.5 and Perl speaks NONMEM (PsN) version 4.1.0.7. Model selection was based on comparison of the OFV between models, bootstrap confidence intervals for the parameter estimates, and biological plausibility. Improvement in fit was measured by a decrease (28) in the OFV generated by NONMEM. The difference in OFV between two hierarchical models is approximately χ² distributed and can be tested for significance, with a critical value of χ² = 3.84 for one degree of freedom at α = 0.05.
The predictive performance of the model throughout model building was evaluated using Kaplan-Meier visual predictive checks (VPCs), for both internal and external validation, generated with Xpose4 (version 4.7.1) functions (29, 30) in RStudio (version 1.1.456, RStudio, Inc., Boston, MA, http://www.rstudio.com/). The plots were based on 1000 simulated datasets. To enable simulations at time points where no clinical observations had been made, extra dummy time points were added to the dataset up to 7.37 years for all individuals. Parameter precision was evaluated using the relative standard error (RSE) obtained from the sampling importance resampling (SIR) method (31).
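For readers unfamiliar with the observed curve that underlies a Kaplan-Meier VPC, a minimal product-limit estimator is sketched here in Python. It is not the Xpose4 implementation and handles only time to first event with right censoring; the function name and the example data in the usage note are hypothetical. In a VPC, this observed curve is compared with the spread of the same curve computed from each simulated dataset.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of S(t) at each distinct event time.

    times  -- follow-up time for each subject
    events -- 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) pairs.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_censored = 0
        # Collect all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_censored += 1 - data[i][1]
            i += 1
        if n_events > 0:
            surv *= 1 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_events + n_censored
    return curve
```

With four hypothetical subjects followed for 1, 2, 3, and 4 years, of whom the second is censored, `kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])` steps down at years 1, 3, and 4.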
External validation
Data from 2692 patients with and without recurrent IS were used to externally validate the final model. The parameter estimates obtained from the final model were used to simulate 1000 replicates of the dataset and to plot the VPC. The predictive performance of the final model was then evaluated by its ability to predict the probability of not having recurrent IS in the validation data, by overlaying the VPC on the Kaplan-Meier curve of the validation data.
Clinical application of the developed model
An online prognostic calculator for the risk of recurrent IS was developed based on the final model. The probabilities of early (within a year) and late (at 2 years and after 4 years following the index IS) IS recurrence were predicted for two clinical scenarios using the calculator. The scenarios were as follows:
Scenario 1: A patient with a history of IHD, HTN and HPLD had the first IS attack. The probability of recurrent IS was calculated.
Scenario 2: A patient with no concomitant diseases had the first IS attack. The probability of recurrent IS was calculated.