Insight into model mechanisms through automatic parameter fitting: a new methodological framework for model development

Tøndel, Kristin; Niederer, Steven A; Land, Sander; Smith, Nicolas P

doi:10.1186/1752-0509-8-59

Research article
Open access
Published: 20 May 2014

Kristin Tøndel¹,², Steven A Niederer¹, Sander Land¹ & Nicolas P Smith¹

BMC Systems Biology volume 8, Article number: 59 (2014)

Abstract

Background

Striking a balance between the degree of model complexity and parameter identifiability, while still producing biologically feasible simulations, is a major challenge in computational biology. While these two elements of model development are closely coupled, parameter fitting from measured data and analysis of model mechanisms have traditionally been performed separately and sequentially. This process produces potential mismatches between model and data complexities that can compromise the ability of computational frameworks to reveal mechanistic insights or predict new behaviour. In this study we address this issue by presenting a generic framework for combined model parameterisation, comparison of model alternatives and analysis of model mechanisms.

Results

The presented methodology is based on a combination of multivariate metamodelling (statistical approximation of the input–output relationships of deterministic models) and a systematic zooming into biologically feasible regions of the parameter space by iterative generation of new experimental designs and look-up of simulations in the proximity of the measured data. The parameter fitting pipeline includes an implicit sensitivity analysis and analysis of parameter identifiability, making it suitable for testing hypotheses for model reduction. Using this approach, under-constrained model parameters, as well as the coupling between parameters within the model are identified. The methodology is demonstrated by refitting the parameters of a published model of cardiac cellular mechanics using a combination of measured data and synthetic data from an alternative model of the same system. Using this approach, reduced models with simplified expressions for the tropomyosin/crossbridge kinetics were found by identification of model components that can be omitted without affecting the fit to the parameterising data. Our analysis revealed that model parameters could be constrained to a standard deviation of on average 15% of the mean values over the succeeding parameter sets.

Conclusions

Our results indicate that the presented approach is effective for comparing model alternatives and reducing models to the minimum complexity replicating measured data. We therefore believe that this approach has significant potential for reparameterising existing frameworks, for identification of redundant model components of large biophysical models and to increase their predictive capacity.

Background

Models in computational biology are becoming increasingly complex, as in-silico frameworks are expanded to account for our rapidly increasing knowledge of physiological mechanisms [1]. This poses considerable challenges for uniquely linking model parameters to experimental data. The desire to capture this complexity to simulate physiological function increasingly results in models where the identifiability of parameters from available experimental data is relatively low. This situation is exacerbated by the lack of consensus on the optimal method for fitting model parameters to data, taking into account the often poor signal-to-noise ratio in these measurements. Furthermore, in many cases the model structure is such that the inverse problem of parameter fitting is ill-posed due to multiple parameter values producing the same model output. Finally, measured data in the literature are often incomplete, and state-of-the-art models are consequently based on a synthesis of data measured at different temperatures, from different laboratories and often from different species [2, 3].

The reuse, combination and extension of existing models are necessary components of the Physiome approach [4]. In particular, as new datasets become available, and as models are applied to address new hypotheses and understand physiological situations, model developers are likely to need to augment or extend models or model components. This implies a requirement for a methodology for comparing model predictions with experimental data in a robust and automated fashion, efficiently incorporating new knowledge to better constrain the model parameters, systematically searching for the perturbation of the system that highlights parameter sensitivities and constrains the system, as well as reducing models to the minimal applicable version (as few parameters and equations as possible).

We believe that reduction in model complexity is important in that it typically increases the sensitivity of model outputs to the various parameters and hence the consequences of introducing changes to the model become more transparent. It also improves the likelihood that the models will be predictive outside the regime of the parameterising data. Specifically, if the identifiability of model parameters can be increased, this will enhance the ability to find the most relevant experimental measurements to use in order to constrain parameters within a given model framework, decreasing the uncertainty in parameter estimates.

In this study we address the issue of ill-posed inverse problems through the development of a generic framework for combined model parameterisation, comparison of model alternatives and analysis of model mechanisms. The fitting of model parameters from measured data is based on a combination of inverse metamodelling [5–9] (prediction of the input parameters as functions of the model outputs using regression) and iterative cost-function-based identification (look-up) of the simulations resulting in values of the output metrics in close proximity of the measured values, and subsequent zooming into relevant regions of the parameter space. In contrast to conventional nonlinear fitting or minimisation algorithms that only estimate parameter values, this method provides an overview of the parameter space and identifies regions in the parameter space where model outputs match measured data. The variation in possible solutions thereby provides an estimate of the uncertainty in the parameter values. Moreover, the inverse metamodelling component of the fitting pipeline provides an implicit sensitivity analysis and a quantification of the identifiability of model parameters from measured data.

In the look-up component of our proposed pipeline, the output spaces of model alternatives are analysed using Principal Component Analysis (PCA) [10, 11], providing effective visualisation of the consequences of introducing changes to models and allowing identification of redundant model components. Hence, this modelling framework represents a combined parameter fitting and systematic analysis of model behaviour and model mechanisms for possible model reduction. This has the clear advantage that it provides a transparent link between parameter values and experimental data in comparison to alternative methods such as simplex optimisation [12], simulated annealing [13] and Levenberg-Marquardt optimisation [14], which only provide parameter value estimates without increasing the understanding of the underpinning model mechanisms.

We demonstrate our proposed approach by applying our parameter fitting pipeline to re-parameterise the cardiac cell contraction model developed by Niederer et al. [15], which was originally fitted to rat experimental data at room temperature, so that it represents mouse functionality at 37°C. This is done by iteratively matching the output from the Niederer-model to a combination of measured data and the outputs of the Land-model [16] (which was parameterised for mouse at 37°C). The lack of a complete and self-consistent data set of all output metrics of interest from a single species, temperature and laboratory motivated the use of simulated outputs from one model as a substitute for measured data in the parameter fitting. Using in silico data also provides the opportunity to analyse how the parameter identifiability can be increased by introducing additional output metrics for which measured values are not available in the literature, guiding future measurements.

Following re-parameterisation of the Niederer-model, we apply the same methodology for finding reduced model versions through the identification of redundant model components. Specifically, we demonstrate how our methodology can be used for systematically comparing model versions, analysing the sensitivity of the model outputs to the input parameters, and choosing the most reduced version giving outputs matching measured data.

Methods

Application system

As outlined above we demonstrate our methodology by applying it to two models of cardiac cell contraction, consisting of differential equations describing contractile force, including length-dependence and velocity-dependence. The choice of application system was motivated by the high degree of maturity of cardiac models; the heart is arguably the most advanced example of a multi-scale framework for biology. Both these models represent cardiac muscle cells which consist of many contractile sub-units, sarcomeres, each again organised into thin and thick filaments [17, 18]. The thick filaments contain myosin crossbridges that bind to the thin actin filament to generate force. Electrical activation results in an increase in cytosolic calcium (Ca), and binding of calcium to the regulatory calcium binding site on troponin C (TnC) within the sarcomeres. This causes a conformational change in the associated tropomyosin complex that unblocks the thin filament actin sites for binding to the thick filament myosin crossbridges. In a crossbridge cycle, a myosin crossbridge on the thick filament attaches to the actin thin filament, performs a power stroke to generate force, and then detaches using Adenosine Triphosphate (ATP). Both models applied in this study consist of equations representing the influence of the muscle’s length on the tension it generates (length-dependence; more force is generated as a muscle is stretched), and the sensitivity of the generated force to the rate at which the muscle is stretched (velocity-dependence). The velocity-dependence parts of the two models are based on the same mathematical formulation, which is therefore not considered in this study (the velocity was set to zero for all simulations). 
Both models, parameterised from a range of data, are biophysically based, and represent two different frameworks for simulating the generation of contractile force in cardiac cells as a consequence of calcium binding (a central component of heart physiology). A description of the two contraction models including the differential equations is given in Additional file 1.

Both the Land-model and the Niederer-model were developed specifically for use with organ-scale simulations, and therefore have a relatively low level of detail compared to many other contraction models. Specifically, they do not include many sub-states for the attachment of ATP and the position of crossbridges. However, both of these models do include enough detail to enable the direct linking of parameters to biological data and exploration of different mechanistically based hypotheses.

The Niederer-model was originally parameterised using data for rats at 25°C. The calcium/TnC dynamics are modelled by a simple molecular binding model, and the tropomyosin/crossbridge dynamics are represented by the transient changes in the proportion of available actin sites, while the binding sensitivity is length-dependent. With the default parameter values, the Niederer-model is unable to capture the fast relaxation kinetics of mouse cardiac muscle at higher pacing frequencies.

The Land-model uses a standard cooperative binding equation, which has a Hill curve as its steady state solution, to represent troponin binding, where the calcium half activation of maximal steady state tension generation is length-dependent. This is combined with a modified version of the crossbridge dynamics component from the model developed by Rice et al. [19], which uses a 4-state Markov model. The Land-model uses only 2 of these states, the so-called non-permissive and permissive (crossbridge cycling) states.

Evidence of the velocity-dependence of tension generation and the dynamic response to step changes remains controversial in the experimental literature. The fading memory model (FMM) [20] provides a succinct representation of these dynamics without being tied to a specific underlying mechanism, and is exploited by both models. The FMM represents the velocity response as several strain-rate dependent variables which all decay with time. An advantage of this model is that it is independent of the contraction model, and can be added after modelling isometric tension and length-dependence.

Our analysis of the two contraction models consists of the following steps:

1) Sensitivity analysis and parameter identifiability analysis based on statistically designed model simulations and metamodelling. This was carried out to test whether the Niederer-model parameters could be predicted directly from the Land-model outputs using regression, and to identify redundant model components for both models. This is illustrated in Figure 1.
2) Due to the relatively low prediction accuracy of the resulting inverse metamodel for several of the Niederer-model parameters, the inverse metamodelling approach was combined with a cost-function based look-up of simulations resulting in model outputs in close proximity to the target values. This was carried out iteratively as shown in Figure 2, resulting in a zooming into the region of the parameter space where the target outputs were replicated by the simulations.
3) Model reduction by repetition of step 2 using reduced model versions. The reduced model versions were made based on the results from the parameter identifiability analysis, which was done for both models.

Sensitivity analysis and parameter identifiability analysis of the Niederer-model

In order to obtain an overview of the relationships between input parameters and dynamic outputs of the model, an experimental design of the Niederer-model parameters using relatively wide parameter ranges was made using a Latin Hypercube design (LHD) [21]. LHD is a semi-random sampling procedure that is especially suitable for use on high-dimensional data, since it separates the data into several hypercubes, and samples randomly within each hypercube. This ensures that all regions of the parameter space are sampled. Within our implementation, the parameter ranges in Table 1 were used to generate an LHD of 500 parameter value combinations, and simulations were run with the Niederer-model using cell lengths of 90, 100 and 110% of resting sarcomere length. An input Ca-transient measured for mouse at 37°C (Figure 3) [22] was used in all simulations. All simulations and subsequent analyses were done in MATLAB® version R2012b [23].
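The sampling idea can be illustrated with a minimal sketch in Python (the study itself used MATLAB, and the authors' implementation is not shown in the text). The function name `latin_hypercube` and the three example ranges are hypothetical and are not the values in Table 1. The sketch uses the common stratified form of Latin Hypercube sampling, in which each parameter range is divided into as many equal strata as there are samples and each stratum is sampled exactly once.

```python
import random

def latin_hypercube(ranges, n_samples, seed=0):
    """Return n_samples points; each parameter range is cut into n_samples
    equal strata and every stratum is sampled exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in ranges:
        width = (high - low) / n_samples
        # One uniformly random point inside each stratum of this parameter.
        column = [low + (k + rng.random()) * width for k in range(n_samples)]
        rng.shuffle(column)  # decouple the strata across parameters
        columns.append(column)
    # Transpose so that each row is one parameter value combination.
    return [list(row) for row in zip(*columns)]

# Hypothetical ranges for three parameters (NOT the values in Table 1).
example_ranges = [(0.5, 2.0), (10.0, 100.0), (0.0, 1.0)]
design = latin_hypercube(example_ranges, n_samples=500)
```

Each row of `design` would then be passed to the contraction model once per cell length.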

Table 1 Description and initial ranges for the varied Niederer-model parameters

Output metrics used to represent the model behaviour

Tension transients were simulated using both the Land and Niederer contraction models, and described by routinely experimentally measured descriptors of the transient morphology. A list of the descriptors and their recorded experimental values for mouse at 37°C is shown in Table 2. Tension transients were simulated at three cell lengths (90, 100 and 110% of resting sarcomere length) activated by the experimentally measured Ca-transient in Figure 3.

Table 2 Metrics used to describe the tension transients and measured data for mouse at 37°C

Preliminary analyses of the results achieved by fitting the model parameters to the metrics in Table 2, using data obtained by simulations using the experimentally measured Ca-transient scaled by 90, 100 and 110%, showed that the model outputs were highly sensitive to the calcium concentration. In order to take this into account we also matched the force-pCa (F-pCa) relationships of the two models, using metrics from simulations run with fixed Ca-concentrations as additional model characteristics to constrain parameters. The Ca-concentrations used were a logarithmically spaced series of 82 different concentrations from 0.15 to 1 μM together with the concentration 10 μM. The resulting steady state tensions were normalised by the maximal simulated tension value.

Model and experimental steady state force-calcium curves are routinely approximated by a Hill-curve that can be logarithmically transformed to be linear. The relationship between pCa and log(F/(1-F)) was therefore fitted to a straight line using Ordinary Least Squares (OLS) Regression [24] (values of (1-F) < 10^-3 were removed in order to avoid numerical errors), and the metrics given in Table 3 were calculated to represent the properties of the force-pCa relationship. The F-pCa curves were simulated for 90, 100 and 110% of resting sarcomere length, and the resulting F-pCa metrics used as additional output constraints (together with the tension transient characteristics resulting from simulations with the experimental Ca-transient) to fit the parameters of the Niederer-model. Similarly, the final set of target output measures included both the metrics in Table 2 and those in Table 3, all calculated from simulations with 90, 100 and 110% of resting sarcomere length for the Land-model.
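As a hedged illustration of the linearisation described above, the following Python sketch builds the calcium series (82 logarithmically spaced concentrations from 0.15 to 1 μM plus 10 μM), normalises the force by its maximum, removes points where (1-F) < 10^-3 and fits a straight line by OLS. The synthetic data come from an ideal Hill curve with an assumed Hill coefficient of 4 and an assumed Ca50 of 0.5 μM. The base-10 logarithm and the two returned quantities (Hill coefficient and pCa50) are illustrative choices and are not necessarily the metrics listed in Table 3.

```python
import math

def log_spaced(start, stop, n):
    """n logarithmically spaced values from start to stop (inclusive)."""
    step = (math.log10(stop) - math.log10(start)) / (n - 1)
    return [10 ** (math.log10(start) + k * step) for k in range(n)]

def ols_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_hill(ca_uM, force):
    """Estimate the Hill coefficient and pCa50 from an F-pCa curve."""
    f_max = max(force)
    x, y = [], []
    for ca, f in zip(ca_uM, force):
        fn = f / f_max                       # normalise by maximal tension
        if fn <= 0 or (1 - fn) < 1e-3:       # drop points that break the logit
            continue
        x.append(-math.log10(ca * 1e-6))     # pCa = -log10([Ca] in mol/l)
        y.append(math.log10(fn / (1 - fn)))
    slope, intercept = ols_line(x, y)
    n_hill = -slope                          # line is y = -nH*pCa + nH*pCa50
    return n_hill, intercept / n_hill

# Synthetic steady-state data: assumed nH = 4 and Ca50 = 0.5 uM.
ca = log_spaced(0.15, 1.0, 82) + [10.0]
force = [c ** 4 / (c ** 4 + 0.5 ** 4) for c in ca]
n_hill, pca50 = fit_hill(ca, force)
```

For this synthetic input the fit recovers a Hill coefficient close to 4 and a pCa50 close to 6.3, which is the expected behaviour for an exact Hill curve.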

Table 3 Description of the output metrics used to describe the force-pCa relationship

Sensitivity analysis by classical metamodelling

Partial Least Squares Regression (PLSR) [25–28] was then used for regression-based sensitivity analysis. PLSR is a subspace-based regression method based on decomposing the data into a subspace representing the main features of covariance between the regressors (here input parameters) and the response variables (here model output metrics). This subspace is represented by latent variables called score- and loading vectors. PLSR can be seen as a regression analogue to PCA, and can handle numerous input and output variables simultaneously. Unlike for example OLS Regression [24], PLSR does not require the regressor variables to be linearly independent. Coupling between parameters can be revealed using PLSR-based classical metamodelling through analysis of the regression coefficients for cross-terms between the parameters. In addition, the correlations (Pearson’s R) between all input parameters and output metrics included in the analysis were evaluated to obtain an overview of the model system.

Based on the parameter-simulated output data for the Niederer-model, a classical metamodel was first constructed to predict the output metrics as functions of the parameters using PLSR. This classical metamodel was used for sensitivity analysis, using the regression coefficients as sensitivity measures (measures of the impact of variations in each of the parameters on the output metrics), as described in [29, 30]. The metamodelling procedure is illustrated schematically in Figure 1. Here, both parameters and output metrics were centred and standardised by subtraction of the mean value and dividing by the standard deviation of each variable prior to the regression, making the regression coefficients independent of the scales of the variables and thereby easier to compare in the sensitivity analysis. Cross-terms and second order terms of the input parameters (i.e. products between combinations of variables in the regressor matrix) were included in the metamodelling to take nonlinearities into account when predicting the output metrics.
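The preprocessing described in this paragraph can be sketched as follows in Python. This is only an illustration of centring, standardising and expanding a regressor matrix with squares and pairwise products; the function names and the toy numbers are hypothetical, and the order of operations (standardise first, then form products) is an assumption, since the text does not state it. The PLSR step itself is not shown.

```python
import statistics
from itertools import combinations_with_replacement

def standardise(columns):
    """Centre each column and scale it to unit (sample) standard deviation."""
    out = []
    for col in columns:
        mean, sd = statistics.mean(col), statistics.stdev(col)
        out.append([(v - mean) / sd for v in col])
    return out

def add_second_order_terms(columns):
    """Append all squares and pairwise products of the columns."""
    expanded = list(columns)
    for i, j in combinations_with_replacement(range(len(columns)), 2):
        expanded.append([a * b for a, b in zip(columns[i], columns[j])])
    return expanded

# Three hypothetical parameters sampled in four simulations (one list each).
params = [[1.0, 2.0, 3.0, 4.0], [10.0, 30.0, 20.0, 40.0], [0.1, 0.4, 0.2, 0.3]]
regressors = add_second_order_terms(standardise(params))
```

With p original variables the expanded matrix has p + p(p+1)/2 columns, so the three toy parameters give nine regressors.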

Parameter identifiability analysis by inverse metamodelling

To evaluate whether it would be possible to get a reasonable estimate for the Niederer-model parameters by direct prediction using regression, an inverse metamodel, predicting the input parameter values from the simulated output metrics in Table 2 and Table 3, was generated using Hierarchical Cluster-based Partial Least Squares Regression (HC-PLSR) [5, 6]. HC-PLSR is a nonlinear extension of PLSR. As described above, PLSR can handle correlated regressor variables, which makes it especially useful for inverse metamodelling of large, complex dynamic models, which contain large sets of highly coupled differential equations producing correlated model outputs. HC-PLSR is a locally linear regression method based on separating the observations into groups using fuzzy C-means (FCM) clustering [31–34] on the latent variables from a global PLSR model including all observations, and making local PLSR models within each cluster. This allows prediction of highly nonlinear input–output relationships. The inverse metamodelling procedure is also schematically illustrated in Figure 1, while the HC-PLSR method used for the metamodelling is outlined in Additional file 2.

Both parameters and output metrics were centred and standardised by subtraction of the mean value and dividing by the standard deviation prior to the regression, and 8 clusters were used in the HC-PLSR. The number of clusters was chosen based on a comparison of predictive ability between different HC-PLSR metamodel complexities ranging from models using 1–20 clusters. This comparison showed that 8 clusters was the minimum number of clusters providing maximal predictive ability, and 8 clusters were therefore used to limit the metamodel complexity. Cross-terms and second order terms of the output metrics were included in the inverse metamodelling to predict the input parameters, in order to better handle nonlinearities in the input–output relationships of the model.
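The clustering step can be illustrated with the standard fuzzy C-means membership formula. The Python sketch below is not the HC-PLSR implementation (which is outlined in Additional file 2 and not reproduced here); it only shows how a point in the latent-variable space could be given a degree of membership in each cluster. The function name, the fuzzifier value m = 2 and the example coordinates are assumptions.

```python
import math

def fcm_memberships(point, centres, m=2.0):
    """Fuzzy C-means membership of one point in each cluster (sums to 1)."""
    dists = [math.dist(point, c) for c in centres]
    if any(d == 0.0 for d in dists):
        # The point coincides with a centre: full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    # u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1))
    return [1.0 / sum((d_j / d_k) ** power for d_k in dists) for d_j in dists]

# Hypothetical 2-D score-space point and two cluster centres.
memberships = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
```

In a locally linear scheme of this kind, an observation would typically be predicted by the local model of the cluster with the highest membership.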

Due to the relatively large differences between the default outputs from the Land-model and the Niederer-model, it was necessary to obtain a robust estimate of the predictive ability of the metamodel to evaluate whether it could be used to directly predict new parameter values for the Niederer-model. The inverse metamodel was therefore validated using 33% of the simulations from the experimental design of 500 simulations as a separate test set. Hence, the metamodel was calibrated using only 2/3 of the simulations, while the rest of the simulation results were kept aside for the purpose of prediction and thus not included in the parameterisation of the metamodel. This process produces a valid estimate of the ability of the metamodel to predict the parameter values from a new set of measured data.

Fitting of the Niederer-model parameters

The results from the sensitivity analysis and the parameter identifiability analysis above showed that the identifiability was relatively low for several of the Niederer-model parameters (see the Results section). We therefore combined the inverse metamodelling with an iterative generation of new experimental designs in the parameters, and identification of the simulations resulting in output metrics in close proximity to the target values. The target output metrics were found from simulations run with the Land-model using the default parameter set and otherwise the same settings as for the Niederer-model simulations. These were used as substitutes for measured data in the parameter fitting pipeline. A schematic representation of the parameter fitting pipeline is shown in Figure 2. The initial Niederer-model parameter ranges are given in Table 1, and were used to generate the initial experimental design (step 2 in Figure 2). Following simulations with the contraction model, the output metrics described above were calculated from the model outputs generated using the parameter values from the experimental design (step 3).

The next step of the fitting pipeline is to generate an inverse HC-PLSR metamodel, predicting the Niederer model parameters as functions of the output metrics in Table 2 and Table 3, based on the simulation results. This metamodel is then applied to the target outputs (from the Land-model simulations, see Table 2 and Table 3) to generate an initial estimate of the model parameters (step 4 in Figure 2). The inverse metamodelling is performed in the same way as described above under “Parameter identifiability analysis by inverse metamodelling”.

For each set of output metrics corresponding to one of the parameter sets in the experimental design, the proximity to the target is found (step 5), and the predicted parameter set from the inverse metamodelling is then combined with the 20 simulations that generated observations in the closest proximity to the experimental measurements (step 7). The predictions from the metamodelling were only included for those parameters for which the inverse metamodel could provide a prediction accuracy of >70% in a test set validation. Together, these 21 parameter sets (in the following referred to as the “guideline sets”) are used to identify the direction or localised region of the parameter space that allows the model to best match the target observations. Using the 20 simulations having the lowest distances to the measured metrics in the guideline set was considered sufficient in order to balance between zooming into the relevant parameter space region and keeping the possibility of identifying alternative regions giving feasible output metrics. This increases the likelihood that all possible regions in the parameter space that can produce physiologically feasible simulations are found during the parameter fitting. This, preferably together with constraints on the parameter values according to a priori knowledge about possible ranges, can generate robust/unique parameter estimates. The size and number of identified regions of the parameter space producing model outputs that replicate measured data give an indication of the uniqueness of the parameter estimates.
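A minimal Python sketch of the look-up step is given below. It assumes that a distance to the target has already been computed for every simulation, and it treats the metamodel prediction as one complete parameter set, which simplifies the per-parameter accuracy filter described above. The function name and the toy values are hypothetical.

```python
def build_guideline_sets(params, distances, predicted=None, n_keep=20):
    """Return the parameter sets of the n_keep simulations closest to the
    target, optionally followed by the metamodel-predicted parameter set."""
    # Rank simulation indices by increasing distance to the target.
    ranked = sorted(range(len(distances)), key=distances.__getitem__)
    sets = [params[i] for i in ranked[:n_keep]]
    if predicted is not None:
        sets.append(predicted)
    return sets

# Toy example: four simulations with one parameter each, keep the two closest.
toy_params = [[1.0], [2.0], [3.0], [4.0]]
toy_distances = [0.9, 0.1, 0.5, 0.3]
guideline = build_guideline_sets(toy_params, toy_distances,
                                 predicted=[9.0], n_keep=2)
```

With n_keep=20 and a predicted set supplied, the function returns the 21 guideline sets referred to in the text.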

The achieved distances to the target outputs are found by PCA of the output metrics resulting from the simulations together with the target output (using centred and standardised variables), and calculation of the Root Mean Square Distances (RMSDs) of each simulation to the target in the PCA scores. The PCA scores are used to evaluate the distance to the target both in order to decrease the dimensionality of the data and to weight the metrics according to their contribution to the variation in the data. The PCA approach decomposes the data into latent variables describing the main variance directions in the data, and each score vector is a weighted sum of the original variables. Hence, the metrics having the largest contributions to the variation in the data have the highest weights for the first principal components (PCs) resulting from the PCA. The minimal number of PCs explaining 99% of the variance are included in the distance calculations in our fitting pipeline.
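The distance calculation can be sketched in Python with NumPy as shown below. This is an illustration under stated assumptions rather than the authors' MATLAB code: PCA is computed by singular value decomposition of the standardised data, the target is appended as an extra row before standardisation, and no output metric is assumed to be constant (a constant column would give a zero standard deviation). The function name is hypothetical.

```python
import numpy as np

def pca_score_distances(outputs, target, var_explained=0.99):
    """RMSD of each simulation to the target in the PCA score space."""
    data = np.vstack([outputs, target])            # target is the last row
    # Centre and standardise every output metric.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # SVD of the standardised data: scores = U * S, variance ~ S**2.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    # Smallest number of components reaching the requested variance.
    n_pc = min(int(np.searchsorted(explained, var_explained)) + 1, len(s))
    scores = u[:, :n_pc] * s[:n_pc]
    diff = scores[:-1] - scores[-1]                # simulations minus target
    return np.sqrt(np.mean(diff ** 2, axis=1)), n_pc
```

Because the scores are not rescaled, components with larger variance contribute more to the distance, which corresponds to the weighting described above.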

For each parameter, the new parameter range for the next iteration is set to the value span over the guideline sets (X1) ± an additional span defined by a variable called stepsize_new, in order to extend the design beyond the values for the guideline sets and thereby further approach the target output values (step 8 in Figure 2). The ranges for the new experimental design are calculated using Equations (1) and (2).

$$\text{Maximum value}(i) = \max_{i}(X1) + \left|\frac{\overline{X1_{i}}}{stepsize_{new}}\right| \qquad (1)$$

$$\text{Minimum value}(i) = \min_{i}(X1) - \left|\frac{\overline{X1_{i}}}{stepsize_{new}}\right| \qquad (2)$$
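Equations (1) and (2) can be written as a short Python function. The sketch assumes that the bar over X1 denotes the mean of parameter i over the guideline sets and that the maximum and minimum are likewise taken over the guideline sets; the function name and the toy numbers are hypothetical.

```python
def new_ranges(guideline_sets, stepsize_new):
    """Per-parameter (minimum, maximum) for the next experimental design."""
    ranges = []
    for values in zip(*guideline_sets):            # one tuple per parameter
        # Extra span: |mean over the guideline sets / stepsize_new|.
        margin = abs((sum(values) / len(values)) / stepsize_new)
        ranges.append((min(values) - margin, max(values) + margin))
    return ranges

# Three toy guideline sets with two parameters each.
toy_sets = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]
toy_ranges = new_ranges(toy_sets, stepsize_new=4)
# -> [(0.5, 3.5), (5.0, 35.0)]
```

A larger value of the step size variable gives a smaller margin, so the design is drawn more tightly around the guideline sets.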

The variable stepsize_new was introduced to allow adjustment of the spread in parameter values according to the degree of proximity to the target outputs. Initially, the value of stepsize_new is 4 in order to analyse a large part of the parameter space. In each following iteration, the minimum achieved RMSD in the PCA score space is compared to that for the previous iteration, and stepsize_new is increased by 2 if the value has decreased, until it reaches a maximum value of 20. Hence, the value of stepsize_new is increased as the distance from the target decreases, strengthening the zooming effect. If stepsize_new reaches the value 20 before the results are sufficiently close to the target metric values, stepsize_new is decreased by 2 for the next iteration design.

In each iteration, a new experimental design of parameter value combinations is generated using LHD in the region of the parameter space defined by the new parameter ranges. The number of simulations for each iteration is given as input to the fitting pipeline. Here, an LHD size of 500 simulations was regarded as sufficient to sample the parameter space relatively densely, while limiting the computational time used in each iteration. This procedure (steps 2–8 in Figure 2) is repeated iteratively until the success criterion is met (evaluated in step 6 in Figure 2).
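The update rule for the step size variable can be sketched in Python as follows. The text does not say what happens when the minimum RMSD fails to decrease, so leaving the value unchanged in that case is an assumption, and the function name is hypothetical.

```python
def update_stepsize(stepsize_new, min_rmsd, previous_min_rmsd,
                    step=2, maximum=20):
    """Return the step size variable to use in the next iteration."""
    if stepsize_new >= maximum:
        # Maximum reached without success: back off to widen the design.
        return stepsize_new - step
    if min_rmsd < previous_min_rmsd:
        # Closer to the target than before: zoom in further.
        return min(stepsize_new + step, maximum)
    return stepsize_new  # assumption: keep the value if there is no improvement

# Starting value used in the study.
stepsize = 4
stepsize = update_stepsize(stepsize, min_rmsd=0.5, previous_min_rmsd=1.0)
```

The function would be called once per iteration, after the distances for the new design have been evaluated and before the next parameter ranges are computed.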

For our specific application, the criterion for success of the parameter fitting was defined as follows:

1) For resting sarcomere length simulations: The tension transient metrics should be within the error bars for the measurements in Table 2.
2) For 110% of resting sarcomere length simulations: The peak tension should be between 73 and 90 kPa (based on experimental measurements of relative changes in maximum twitch force generation [16]) and the minimum tension should be less than 1 kPa.
3) For 90% of resting sarcomere length simulations: The peak tension should be between 12 and 20 kPa (±20% of the baseline value from the Land-model).
4) For the force-pCa curve simulations: The RMSD between the simulated force for the Niederer-model and the target Land-model force (in standardised variables) should be less than 15%.
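The four criteria above can be collected into a single check, sketched here in Python. The numerical thresholds are taken from the list; the function name, the dictionary layout, the metric name `peak_tension_kPa` and its values are hypothetical placeholders (the measured values in Table 2 are not reproduced here), and expressing the 15% threshold as 0.15 is an assumption.

```python
def fit_succeeded(resting_metrics, measured, peak_110, min_110,
                  peak_90, fpca_rmsd):
    """Evaluate the four success criteria for one simulated parameter set.

    measured maps a metric name to (mean, error_bar); resting_metrics maps
    the same names to the values simulated at resting sarcomere length.
    Tensions are in kPa.
    """
    within_error_bars = all(
        abs(resting_metrics[name] - mean) <= err
        for name, (mean, err) in measured.items()
    )
    return (within_error_bars
            and 73.0 <= peak_110 <= 90.0 and min_110 < 1.0
            and 12.0 <= peak_90 <= 20.0
            and fpca_rmsd < 0.15)

# Hypothetical measurement and simulation values, for illustration only.
measured = {"peak_tension_kPa": (40.0, 5.0)}
ok = fit_succeeded({"peak_tension_kPa": 42.0}, measured,
                   peak_110=80.0, min_110=0.5, peak_90=15.0, fpca_rmsd=0.10)
```

The iteration loop would stop as soon as at least one simulation in the current design passes this check.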

The test set prediction accuracy of the inverse metamodel was relatively low for several of the parameters (see the Results section), so the metamodel was used only in the first iteration to provide an initial indicator of the direction in the parameter space to move (by adding extra extensions to the ranges of some of the parameters based on the prediction). The constraints given in Table 4 were set on the parameters based on the variation in measured values in the literature.

Table 4 Constraints used on some of the Niederer-model parameters used in the parameter fitting

The fitting pipeline was written in MATLAB® version R2012b [23] as both a parallelised and a non-parallelised version, and both can be obtained from the authors upon request.

Reduction of model complexity for the Niederer-model and the Land-model

Parameter identifiability analysis for the Land-model

In the same way as for the Niederer-model, the possibility of reducing the Land-model was tested using a similar inverse HC-PLSR metamodel with test set validation. The metamodel was made using data from simulations in an LHD of 500 observations within the ranges given in Table 5, using the output metrics in Tables 2 and 3 to predict the Land-model parameters. All variables were centred and standardised prior to the regression, and cross-terms and second order terms of the output metrics were included.

Table 5 Description and ranges for the Land-model parameters used for parameter identifiability analysis

Model reduction

Reduction of the Niederer-model

The parameter fitting procedure described above was repeated with parts of the Niederer-model omitted in order to see whether the model could be reduced while keeping the replication of the simulated output data from the Land-model. The choice of model parts to omit was based on the results from the sensitivity analysis and the parameter identifiability analysis, which indicated very low sensitivity to the parameters n_r, α_r2 and K_z. These parameters were assumed to have minor effects on the model outputs, and were therefore omitted by making the model independent of these model parts. This omission was achieved by giving the parameter α_r2 the value zero, which also makes the model independent of n_r and K_z.

The initial parameter ranges in the fitting pipeline were the same as in the previous parameter fitting (given in Table 1), and all output metrics in Tables 2 and 3 were included to fit the model parameters.

Reduction of the Land-model

Based on the parameter identifiability analysis of the Land-model, k_TRPN, n_xb and k_xb were successively set equal to 1 (keeping the other parameters at their default values), in order to analyse the effects of variations in these parameters on the model outputs. The simulations were run as described above, and all output metrics in Tables 2 and 3 were included in the analysis.

Results

As described in the Methods section, we have analysed the sensitivity of the model outputs to variations in the input parameters, verified parameter identifiability, and compared and matched the model outputs for the two cardiac contraction models. The analyses were based both on simulations run using a measured Ca-transient for mouse at 37°C to generate dynamic tension transients, and on simulations at fixed, individual Ca-concentrations to produce the steady state F-pCa curve. The Niederer-model was re-parameterised using the parameter fitting pipeline presented in Figure 2, with a combination of measured data and synthetic data from Land-model simulations. Reduced versions of both models were identified based on the parameter identifiability analysis and on comparison of the outputs from reduced model versions with the Land-model default outputs. The results are detailed below.

Sensitivity analysis and parameter identifiability analysis of the Niederer-model

Sensitivity analysis by classical metamodelling

Sensitivity analysis based on a classical PLSR metamodel indicated that physiological simulations using the Niederer-model had low sensitivity to the parameters n_r, γ, α_r2 and K_z, while they were most sensitive to Ca_50ref, k_refoff, β₀, β₁, nH and T_ref. The regression coefficients from the PLSR, which show the sensitivity of the output metrics to the input parameters, are shown in Figure 4. These results indicate that parts of the Niederer-model tropomyosin kinetics component can be simplified by omitting the low sensitivity parameters. The model equations in Additional file 1 show that setting α_r2 to zero would also make n_r and K_z redundant, significantly reducing the model complexity. This possibility is therefore analysed further below.

The sensitivity patterns described above were confirmed by the plot of the correlations (Pearson’s R) between all input parameters and model output metrics included in this analysis, shown in Figure 5. As expected from the sampling procedure used to generate the experimental design of parameter sets, Figure 5 shows no strong correlations between the input parameters in the model. However, there are several strong correlations between the output metrics. This was also expected, since they are metrics representing curve shapes. The results also show correlations between the metrics representing the tension transients and those representing the force-pCa relationship.
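The structure of such a correlation analysis can be sketched in a few lines of Python. The example is illustrative only: the three parameters, the two output metrics and their functional forms are hypothetical and are not taken from the Niederer-model, and plain random sampling is used in place of the Latin hypercube design.

```python
import numpy as np

rng = np.random.default_rng(1)
n_runs = 500

# Hypothetical design: three independently sampled parameters in [0, 1)
params = rng.random((n_runs, 3))

# Hypothetical outputs: strong dependence on p0, weak on p1, none on p2
out_a = 3.0 * params[:, 0] + 0.2 * params[:, 1]
out_b = out_a ** 2 + 0.05 * rng.standard_normal(n_runs)

data = np.column_stack([params, out_a, out_b])
labels = ["p0", "p1", "p2", "out_a", "out_b"]

# Pearson's R between every pair of columns (parameters and output metrics)
corr = np.corrcoef(data, rowvar=False)

input_block = corr[:3, :3]    # parameter vs parameter: near zero off the diagonal
cross_block = corr[:3, 3:]    # parameter vs output metric: sensitivity pattern
output_block = corr[3:, 3:]   # output metric vs output metric

for name, row in zip(labels[:3], cross_block):
    print(name, np.round(row, 2))
```

In this toy setting the parameter block is close to the identity matrix because the parameters are sampled independently, while the two output metrics are strongly correlated because one is a function of the other. This mirrors the qualitative pattern reported for Figure 5, without reproducing any of its values.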

Parameter identifiability analysis by inverse metamodelling

The parameter prediction accuracies from the inverse HC-PLSR metamodel are shown in Figure 6, represented by the correlation coefficients (R²-values) between the simulated and the predicted parameters from a test set prediction. The test set consisted of 33% of the simulations from the LHD of 500 simulations. These simulations were not included in the calibration of the metamodel and therefore represent new data, so the resulting predictive ability is what we can expect from a prediction using new measured data (or the output from simulations with the Land-model). As Figure 6 shows, the inverse metamodel was not able to predict all parameters accurately, but could give useful information about the parameters β₁, β₀ and T_ref. Some of the parameters to which the sensitivity analysis indicated that the model is sensitive were nevertheless predicted poorly by the inverse metamodel. The probable reason is that the model is sloppy, meaning that many combinations of parameter values can generate the same output metric values. This characterises most dynamic models [35]. A sloppy model can still be sensitive to variations in these parameters, but it is difficult to predict their values from the output metrics. However, our results demonstrate the value of using inverse metamodelling to indicate the best direction in which to move in the parameter space in order to approach reasonable estimates for the three parameters for which the prediction accuracy was relatively high. For the other parameters, the inverse metamodel alone will not provide an estimate that can be trusted. The fitting procedure therefore had to be extended by including the look-up approach to guide new simulations.
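The effect of sloppiness on test set validation can be illustrated with a small Python sketch. It rests on several assumptions that are not from the paper: the toy simulator and its three parameters are hypothetical, ordinary least squares stands in for HC-PLSR, and R² is computed as the coefficient of determination on the held-out runs.

```python
import numpy as np

rng = np.random.default_rng(2)
n_runs = 500
params = rng.uniform(0.5, 2.0, size=(n_runs, 3))   # hypothetical parameters a, b, c

# Toy simulator: a and b enter only through their product, so many (a, b)
# pairs give the same outputs (sloppy), whereas c is well determined.
metrics = np.column_stack([params[:, 0] * params[:, 1], 2.0 * params[:, 2]])
metrics += 0.01 * rng.standard_normal(metrics.shape)

# Hold out 33% of the runs as a test set that is not used for calibration
order = rng.permutation(n_runs)
n_test = int(round(0.33 * n_runs))
test, train = order[:n_test], order[n_test:]

def with_intercept(m):
    """Prepend a column of ones so the regression includes an offset."""
    return np.column_stack([np.ones(len(m)), m])

# Inverse metamodel: predict parameters from output metrics
# (ordinary least squares used here as a simple stand-in for HC-PLSR)
coef, *_ = np.linalg.lstsq(with_intercept(metrics[train]), params[train], rcond=None)
predicted = with_intercept(metrics[test]) @ coef

def r_squared(true, pred):
    """Coefficient of determination per parameter (column)."""
    ss_res = ((true - pred) ** 2).sum(axis=0)
    ss_tot = ((true - true.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

r2 = r_squared(params[test], predicted)
print(np.round(r2, 2))   # one test set R² per parameter: a, b, c
```

In this toy case the outputs are sensitive to all three parameters, yet only `c` is predicted well on the test set, because `a` and `b` can compensate for each other. The same distinction between sensitivity and identifiability underlies the interpretation of Figure 6 given above.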