Introduction

Health care systems need to balance potentially unlimited demand for services with scarce health care resources1. This balance might become even more difficult in the future due to an aging population, which increases demand for health care services while the medical workforce shrinks2. Hospitals often consume a large part of total health care budgets3. A possible way to mitigate resource constraints in hospitals is the use of modern technologies to make services more cost-effective4. For instance, the increasing availability of data makes it possible to support medical decision making in hospitals with information derived from machine learning algorithms5,6.

Efficient resource allocation in hospitals requires the management of volatile demand and available resources. This management is more critical in hospitals than in other areas, since a lack of timely and sufficient services can lead to negative patient outcomes7. Insufficient staffing can lead to increased morbidity and mortality8,9. Given the shortage of trained medical staff in many health care systems, securing sufficient medical staff to consistently meet patient needs becomes a critical objective10. Inpatient mental health care is more staff-intensive than other medical disciplines due to the personal nature of many interventions11,12.

The COVID-19 pandemic has had strong effects on most health care systems and individual service providers13. A sudden surge in patients requiring intensive respiratory care and the possible shortage of ICU capacities led to political supply-side interventions in many health care systems14. In Hesse, Germany, somatic and psychiatric hospitals were required from 16 March 2020 to restrict new admissions to urgent cases and to avoid elective admissions. Furthermore, new hospital hygiene regulations and the requirement to quarantine patients reduced hospital capacities.

Forecasting admissions can support the efficient organisation of hospital care and the adjustment of resources to sudden changes in patient volumes. The demand for health care services is usually relatively stable. Several recent studies have compared methods to forecast the spread of the COVID-19 pandemic15,16. However, there is still a lack of evidence on how forecasts of hospital service volumes should account for external shocks, which may be impossible to incorporate directly and prospectively into modelling approaches. Therefore, we aimed to forecast the number of admissions to psychiatric hospitals before and during the COVID-19 pandemic, and we compared the performance of machine learning models and time series models.

Methods

Data

We included all inpatient admissions from 01 January 2017 to 31 December 2020 to nine hospitals in Hesse, Germany. These hospitals are part of a common service provider and account for about half of all inpatient mental health care in the state of Hesse. Aggregated admission numbers per day were obtained from the hospital administrations and did not contain individual patient data. Returns after planned interruptions, such as home leave, were excluded. Multiple separate admissions of the same patient were counted individually. Admissions to the departments of child and adolescent psychiatry and admissions to the departments for psychosomatic medicine were excluded.

We obtained weather and climate data from the Climate Data Centre of Germany’s National Meteorological Service17. We used the gtrendsR package version 1.5.1 to query Google Trends data for Hesse, Germany18. School holidays and public holidays were obtained from publicly available calendars.

Analyses

We used machine learning and time series models to predict the number of hospital admissions for each day of the years 2019 and 2020. The machine learning models were (a) gradient boosting with trees (XGB)19, (b) support vector machines (SVM)20 and (c) elastic nets21. The time series models were (a) exponential smoothing state space models (ETS)22, (b) exponential smoothing state space models with screening for Box-Cox transformation, ARMA errors and trend and seasonal components (TBATS)23 and (c) additive models with non-linear trends fitted by seasonal effects (PROPHET)24. The selection of modelling approaches was based on their performance in previous research and cannot be exhaustive. However, several other approaches have been used successfully for forecasting in the COVID-19 context and might be relevant to the interested reader16,25,26. We compared models forecasting one week ahead, one month ahead and a whole year ahead, the latter issued one week before the year started.
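As a minimal sketch of how the time series models can be fitted in R, assuming a daily admissions vector `admissions` and a corresponding `dates` vector (the exact model settings used in our study are given in the supplement, not here), the forecast and prophet packages could be used:

```r
library(forecast)
library(prophet)

# Encode weekly and yearly seasonality of the daily admission counts
y <- msts(admissions, seasonal.periods = c(7, 365.25))

fit_ets   <- ets(ts(admissions, frequency = 7))  # ETS with a single weekly season
fit_tbats <- tbats(y)  # screens Box-Cox, ARMA errors, trend and seasonal terms

fc_week <- forecast(fit_tbats, h = 7)  # one-week-ahead forecast

# PROPHET expects a data frame with columns 'ds' (date) and 'y' (value)
m      <- prophet(data.frame(ds = dates, y = admissions))
future <- make_future_dataframe(m, periods = 7)
fc_prophet <- predict(m, future)
```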

Features

Our machine learning models used calendrical variables, climate and weather data, Google Trends data, Fourier terms and lagged numbers of admissions as features. All features are listed with a detailed explanation in Table S1. The calendrical features were day of the week, weekend, public holiday, school holiday, quarter of the year, month of the year, bridge days (days between a public holiday and the weekend) and the end of the year (the days between Christmas and New Year’s Eve). The climate and weather data were wind speed, cloudiness, air pressure, precipitation depth and type, duration of sunshine, snow height, air temperature and humidity. Since the weather of future days was unknown at the point of prediction, we used lagged values, i.e. the weekly models used the weather 7 days before the predicted day and the monthly models used the weather data 28 days before the predicted day. We did not use weather data for the yearly model.
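As an illustration, calendrical features and Fourier terms of the orders used here could be constructed as follows (a sketch with assumed column names; public holidays, school holidays and the weather joins are omitted):

```r
library(lubridate)
library(forecast)

calendar <- data.frame(
  date        = dates,
  day_of_week = wday(dates, label = TRUE),
  weekend     = wday(dates) %in% c(1, 7),   # Sunday = 1, Saturday = 7
  month       = month(dates),
  quarter     = quarter(dates),
  adm_lag14   = dplyr::lag(admissions, 14)  # last admission count known 7 days ahead
)

# Fourier terms: weekly seasonality up to order 2, yearly up to order 5
fterms <- fourier(msts(admissions, seasonal.periods = c(7, 365.25)), K = c(2, 5))
```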

Google Trends data were retrieved using the gtrendsR package18 in the R environment for statistical computing27. We used the German translations of the following keywords: depression, sadness, sad, suicide, mania, fear, panic, dread, addiction, dependence, alcohol, drugs, schizophrenia, psychosis and hallucinations. The relative frequency of searches for these keywords in the region of Hesse, Germany, was used as a feature. As for the weather data, we used lagged values of the Google Trends data. As an additional feature, the weekly models used the number of admissions 14 days before the predicted day, because admission numbers were not yet known 7 days before the prediction; the monthly models used these values with a lag of 35 days. Our time series models did not use feature variables.
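A hypothetical query for a subset of the German keywords might look as follows (the region code "DE-HE" denotes Hesse; gtrendsR accepts at most five keywords per call, so the full list would require several queries):

```r
library(gtrendsR)

res <- gtrends(keyword = c("Depression", "Sucht", "Psychose"),
               geo     = "DE-HE",
               time    = "2017-01-01 2020-12-31")
head(res$interest_over_time)  # relative search frequency per keyword and date
```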

Training and testing

We used prospectively sliding time windows to validate (2018) and test (2019 and 2020) model performance. The final weekly models predicted each day of one full week of hospital admissions seven days in advance. We tested one model for each week and study site in 2019 and 2020, incrementally prolonging the training period and advancing the 7-day testing period by one week at a time. The monthly models each predicted 28 days of hospital admissions in advance, with incremental slides of 28 days. In the yearly models, we predicted the whole of 2019 and 2020, each one week before the year started.
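The weekly evaluation scheme corresponds to a rolling-origin forecast; a minimal sketch for one study site, using TBATS as an example and assuming the objects from the sketches above, is:

```r
horizon    <- 7
first_test <- which(dates == as.Date("2019-01-01"))
origins    <- seq(first_test - 1, length(admissions) - horizon, by = horizon)

weekly_errors <- lapply(origins, function(n) {
  train <- msts(admissions[1:n], seasonal.periods = c(7, 365.25))
  fc    <- forecast(tbats(train), h = horizon)  # refit on the prolonged window
  admissions[(n + 1):(n + horizon)] - as.numeric(fc$mean)
})
```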

We compared model performance with the Root Mean Squared Error (RMSE), the R2, the Mean Absolute Error (MAE) and a seasonal Mean Absolute Scaled Error (sMASE) as follows28:

$$\text{Observation at time } t = Y_{t}$$
$$\text{Forecast of } Y_{t} = F_{t}$$
$$\text{Forecast error} = e_{t} = Y_{t} - F_{t}$$
$$MSE = \operatorname{mean}\left( e_{t}^{2} \right)$$
$$RMSE = \sqrt{MSE}$$
$$R^{2} = \operatorname{correlation}\left( Y_{t}, F_{t} \right)^{2}$$
$$MAE = \operatorname{mean}\left( \left| e_{t} \right| \right)$$
$$sMASE = \frac{MAE}{\text{seasonally adjusted naive } MAE}$$

The sMASE was calculated by dividing the MAE of our weekly, monthly and yearly forecasts by the MAE of a naïve forecast based on the number of admissions observed 14, 35 and 364 days before the predicted day, respectively. Variable importance was calculated for each variable in the best performing model using model-specific metrics, i.e. in the case of elastic nets the absolute values of the coefficients after standardizing each feature. An advantage of model-specific metrics over model-agnostic measures is that they should account better for collinearity between features29.
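The metrics above translate directly into R; in the sketch below, `y_naive` stands for the admissions observed 14, 35 or 364 days earlier, depending on the forecast horizon:

```r
forecast_metrics <- function(y, f, y_naive) {
  e <- y - f                                      # forecast error
  c(RMSE  = sqrt(mean(e^2)),
    R2    = cor(y, f)^2,                          # squared correlation
    MAE   = mean(abs(e)),
    sMASE = mean(abs(e)) / mean(abs(y - y_naive)))
}
```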

Ethics approval and consent to participate

Our study did not involve individual patient data but only aggregated numbers of admissions per day. The ethics committee of Hannover Medical School confirmed that our study did not require ethical oversight.

Results

The total number of admissions showed a relatively strong weekly seasonality and a yearly seasonality. Figure 1 provides the results of a multiple seasonal decomposition of the number of daily admissions by loess30. There was no strong trend in admission numbers during the first three years, until the commencement of the Corona hospital regulation on 16 March 2020 had a clear negative effect on the number of admissions.

Figure 1

Multiple seasonal decomposition by loess. The y-axes show the number of admissions and are scaled differently between the facets. Loess = Locally weighted scatterplot smoothing.

Table 1 shows the forecasting performance in 2019 and in 2020 at all study sites combined. The naïve seasonal forecasts were based on the number of admissions 14, 35 and 364 days before the predicted day for the weekly, monthly and yearly models, respectively. In absolute terms, the best model in 2019 was the weekly elastic net, which achieved an MAE of 7.25 and an explained variance of 90%. Compared to a naïve forecast based on the number of admissions observed two weeks earlier, this model achieved a forecast improvement of 38% (sMASE = 0.62). In absolute terms, the best model in 2020 was the weekly TBATS model. However, compared to the monthly naïve forecast, i.e. the number of admissions observed 35 days earlier, the highest improvement was achieved with the monthly SVM.

Table 1 Forecasting performance in 2019 and in 2020.

The error accumulation in 2019 and 2020 at all study sites combined is shown in Fig. 2. While model performance was relatively similar in 2019, errors diverged after the commencement of the Corona hospital regulation on 16 March 2020. Weekly time series models adjusted more quickly to the new circumstances and accumulated less error by the end of 2020.

Figure 2

Cumulative mean absolute error in 2019 and 2020 by machine learning and time series models (number of admissions). XGB = Gradient boosting with trees, SVM = Support vector machines, ETS = Exponential smoothing state space models, TBATS = Exponential smoothing state space models with Box-Cox transformation, ARMA errors and trend and seasonal components, PROPHET = Additive models with non-linear trends fitted by seasonal effects.

The forecasting models showed variation in performance between study sites. Figure 3 shows differences in percentage errors between study sites per week, derived from the overall best performing weekly machine learning and time series models (see Table 1), respectively. Both models performed similarly in 2019. However, the elastic net produced fewer error peaks, for instance on Easter Monday and during the Christmas period, because it included these holidays as features. In contrast, the TBATS model adjusted more quickly to the Corona regulations and tracked the new level of admission numbers during the rest of 2020 better than the elastic net.

Figure 3

Variation of percentage error between study sites. TBATS = Exponential smoothing state space models with Box-Cox transformation, ARMA errors and trend and seasonal components, IQR = Interquartile range.

Figure 4 shows the top 25 feature variables ordered by their importance in forecasting the number of admissions with the elastic net, which was the best performing machine learning algorithm in our comparison. Variable importance represents the influence of each feature on the forecast performance relative to the other variables31. The strongest influence on forecast performance was found for calendrical variables.

Figure 4

Variable importance of the top 25 features in the machine learning models. Positive and negative effects represent increases and decreases in the number of admissions, respectively. Dec = December. Bridge day = day between a public holiday and the weekend, or vice versa. Lag (14) = the number of admissions fourteen days before the predicted day. The Fourier series accounted for a yearly and a weekly seasonality with sine (S) and cosine (C) waves with a maximum order of 2 (weekly) and 5 (yearly).

Discussion

Key findings

We aimed to forecast the number of admissions to psychiatric hospitals before and during the COVID-19 pandemic, and we compared the performance of machine learning models and time series models. Such forecasts could eventually support timely resource allocation for the optimal treatment of patients. Model performance did not vary much between different modelling approaches before the COVID-19 pandemic. The established forecasts were substantially better than seasonal naïve forecasts. The most important features were calendrical variables that did not require short term adjustments in the weekly and monthly models. However, the weekly time series models adjusted more quickly to the COVID-19 related shock effects than the monthly and yearly models and the machine learning models. This is to be expected from the theory and mechanics underlying the different modelling approaches: longer forecasting horizons made the models less flexible and slower to adapt to radical changes, and the machine learning models, in contrast to the time series models, based their predictions on many data points from the past, which likewise slowed their adaptation to radical changes.

Strength and weaknesses

A strength of our study was the data covering four years from nine hospitals, representing about half of all inpatient psychiatric admissions in Hesse, Germany. This allowed us both to give a representative picture of inpatient psychiatric care in Germany and to show how the forecasting approaches work at different study sites. Furthermore, the commencement of the Corona hospital regulation in March 2020 made it possible to analyse the effect of sudden changes in hospital admissions on the performance of different modelling approaches.

A limitation of our study was the lack of data to differentiate between causes of reduced hospital admissions after the Corona regulation came into effect in March 2020. The reduced admissions could have resulted from different supply-side and demand-side effects, such as the avoidance of elective admissions, reduced capacities due to isolation and quarantine requirements, and the unwillingness of patients to enter hospitals during the Corona crisis. Another limitation of our study was its restriction to one large German provider of inpatient mental health care, which requires caution when translating our findings to different health care systems or clinical settings.

Comparison to previous research

Previous studies in the field of forecasting hospital admissions often focused on emergency departments32, and there were no previous studies that analysed the forecasting of psychiatric hospital admissions comparable to our study in scale and scope.

Vollmer et al.33 predicted admission numbers in the emergency departments of two hospitals in London with data from 2011 to 2018. They compared machine learning models to more traditional time series models to forecast admissions one, three and seven days in advance. The forecasts of the different time horizons performed very similarly. This is comparable to our finding of relatively similar results between weekly, monthly and yearly predictions, although at a different scale. In contrast to our study, lagged admissions from previous weeks were among the strongest predictors, probably because admission levels rose and fell more strongly during their study period than in ours. As in our study, Vollmer et al. also found that calendrical variables were among the features with the strongest influence on forecasting performance, while weather and climate data and Google search data had a relatively low influence.

Similar results were found by Boutsioli et al.34, who used a simple OLS regression to forecast admissions to the emergency departments of ten public hospitals in Greece. Using only the calendrical variables weekend, summer holiday, public holiday and participation in emergency care, their model explained a relatively high share of the variance in hospital admissions of up to 88%.

Jones et al.35 forecasted the admission numbers at three emergency departments in the USA one, seven, fourteen, twenty-one and thirty days in advance. They used autoregressive integrated moving average (ARIMA) models, time series regression, exponential smoothing and artificial neural network models to predict admissions per day. They also found that admissions were characterised by yearly and weekly seasonality (compare our Fig. 1). As in our study, they found a relatively low improvement in forecasting performance for shorter forecasting horizons compared with longer ones. Similar to our study and the study of Vollmer et al.33, weather and climate had a relatively low influence on forecasting performance.

McCoy et al.36 forecasted hospital discharge numbers at two academic medical centers in the USA. They compared the performance of a PROPHET model with a seasonal ARIMA model and a one-step naïve seasonal forecast, and compared monthly models to yearly models. The best performance was achieved by a PROPHET model. Comparable to our study, they found little to no improvement in forecasting accuracy from refitting their models monthly rather than yearly.

A main similarity between our study and previous studies was the relatively strong influence of calendrical variables compared to other potential forecasting features, such as weather and Google Trends data. This finding is most likely related to the strong dependence of health care service patterns on the workday/weekend difference on the one hand, and the relatively weak influence of other factors on the actual number of admissions on the other.

Conclusions

Accurate forecasting of hospital admissions can support the efficient organisation of hospital care and the adjustment of resources to sudden changes in patient volumes. We found a substantial improvement in forecasting accuracy in comparison to a seasonally adjusted naïve baseline forecast. Model performance did not vary much between different modelling approaches and different forecasting horizons before the COVID-19 pandemic. However, weekly time series models adjusted more quickly to the COVID-19 related shock effects. In practice, multiple forecast horizons could be used simultaneously, such as a yearly model to provide early forecasts for a long planning period and weekly models to provide more precise forecasts that adjust more quickly to sudden changes.