Elsevier

Ecological Modelling

Volume 142, Issue 3, 15 August 2001, Pages 285-294

Simultaneous equations, error-in-variable models, and model integration in systems ecology

https://doi.org/10.1016/S0304-3800(01)00326-X

Abstract

Numerous dynamic ecological models at varied temporal and spatial scales exist in systems ecology. In general, small-scale models are more accurate, more capable of reflecting small local variations in ecological processes, and more sensitive to outside disturbances than large-scale models. Large-scale models, on the other hand, are more comprehensive and usually describe the ecosystem's average properties. There has been increased interest in how to integrate accurate small-scale models with comprehensive large-scale models. Two-stage or three-stage least squares regression is the classic parameter estimation method for such purposes. In this study, a two-stage error-in-variable method is introduced to estimate the parameters for model integration. It is proved theoretically that when the restriction is exactly identifiable, two-stage least squares regression and the two-stage error-in-variable model produce the same estimates. If the restriction is over-identifiable, both methods have solutions, but the estimates are not necessarily identical. For under-identifiable systems, the estimate from the error-in-variable model still exists, but the estimate from two-stage least squares regression is no longer valid. An example is provided to demonstrate, step by step, how to use the two-stage error-in-variable model.

Introduction

To study how an ecological system functions, it is common practice to first break the system down into sub-systems. The outputs from some sub-systems are treated as the inputs to others, and all sub-systems are coupled through these outputs and inputs to form a complete system (see the simplest example in Fig. 1). In forestry, this practice is often referred to as scaling (Jarvis, 1995), linking (Somers and Nepal, 1994), aggregation (O'Neill and Rust, 1979), disaggregation (Zhang et al., 1993), or ‘integration’ (Daniels and Burkhart, 1988; Tang, 1991). Problems analogous to the Fig. 1 example also occur in systems ecology as the problem of variable aggregation in ecological simulation models (Luckyanov et al., 1983). Luckyanov (1983/1984) studied the problem of linear aggregation and separability in linear models of ecological systems and proved a theorem on the existence of a general solution. Attempts were also made on the problem of aggregation in nonlinear ecological system models (Logofet and Svirezhev, 1986; Iwasa et al., 1987). These studies are theoretically oriented: the authors assumed that the parameters in the systems are given or known, and concentrated on finding the conditions under which a system can be aggregated. The problems in forestry are usually the opposite: it is known that the variables in the system can be aggregated, but the parameters in the system are generally unknown and have to be estimated from sample data. In this study, we concentrate on parameter estimation methods for integrated ecological models.

The system in the figure can be represented by the equations $dx_1/dt = f_1(t, x_1, u_1, a_1)$, $dx_2/dt = f_2(t, x_2, u_2, a_2)$, and $dx_3/dt = f_3(t, x_1, x_2, x_3, a_3)$, where $t$ is time, $x_i$ are the states of the system, $u_i$ are the inputs to the system, and $a_i$ are the parameters to be estimated. As a typical simultaneous estimation problem, the solution to the equations can be expressed as $x_1 = F_1(t, u_1, a_1)$, $x_2 = F_2(t, u_2, a_2)$, and $x_3 = F_3(t, x_1, x_2, a_3)$. There are at least two approaches available to estimate the parameters in the system. First, the three sub-systems could be fitted separately from $u_1$ and $u_2$ and $X_1$, $X_2$, and $X_3$ to obtain the estimated parameters $\hat{a}_1$, $\hat{a}_2$, and $\hat{a}_3$; hence, the estimated states should satisfy the equations:

$$X_1 = F_1(t, u_1, \hat{a}_1), \tag{1}$$

$$X_2 = F_2(t, u_2, \hat{a}_2), \tag{2}$$

$$X_3 = F_3(t, X_1, X_2, \hat{a}_3). \tag{3}$$

However, the parameters can also be estimated by integrating the three sub-systems into a single model such as:

$$x_3 = F_3(t, F_1(t, u_1, a_1), F_2(t, u_2, a_2), a_3). \tag{4}$$

The parameter estimates $\bar{a}_1$, $\bar{a}_2$, and $\bar{a}_3$ obtained through Eq. (4) are usually different from $\hat{a}_1$, $\hat{a}_2$, and $\hat{a}_3$. In terms of prediction, the result from Eq. (4) with $u_1$ and $u_2$ as input should be more accurate than the prediction through Eq. (3) with $X_1$ and $X_2$ as input, which are themselves outputs of the first two sub-models with $u_1$ and $u_2$ as input (George et al., 1982). If the correlation between the variables in Eq. (1) or Eq. (2) is low and the chain of serial linking is long, the estimation errors will be large, and the errors propagated through such indirect prediction are not necessarily random. In fact, if the output from the sub-model of the previous stage is used as the input to the model in the next stage, then the variables are endogenous and the estimates are biased. If the model chain is very long, the accumulation of such biases will make the final estimate problematic (see Chapter 14, George et al., 1982).

If the equations of the system are linear, then econometricians have defined the system to be linear simultaneous equations with $X_1$, $X_2$, and $X_3$ as endogenous variables. Two-stage or three-stage least squares regression (George et al., 1982) is the classic parameter estimation method for such equations. This method is capable of eliminating the error propagation, and the estimated parameters are asymptotically unbiased. In forestry, Borders (1989) discussed the applicability of linear simultaneous equations in forest growth-and-yield modeling. Estimation procedures are well documented. However, the procedures work only for identifiable equations without restrictions between the equations. In systems ecology, unidentifiable simultaneous equations are frequently found, and very often these equations are nonlinear. The objective of this study is to introduce a two-stage regression method based on the error-in-variable model to solve such problems.
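The bias from feeding an error-laden intermediate variable into the next stage, and its removal by a two-stage fit, can be illustrated with a minimal sketch. This is not the authors' procedure or data: the two-equation linear chain, the parameter values (`a1`, `a3`), the noise level, and all function names are assumptions chosen for illustration. The observed `X1` equals the true state plus error, so regressing `X3` directly on `X1` attenuates the slope, whereas first regressing `X1` on the exogenous input `u` and then regressing `X3` on the fitted values recovers it.

```python
import random


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def simulate_chain(n, a1, a3, noise_sd, seed):
    """Hypothetical linear chain: x1 = a1*u (true state), x3 = a3*x1.

    Both X1 and X3 are observed with additive error; u is error-free.
    """
    rng = random.Random(seed)
    u = [rng.uniform(0.0, 10.0) for _ in range(n)]
    x1_true = [a1 * ui for ui in u]
    X1 = [v + rng.gauss(0.0, noise_sd) for v in x1_true]
    X3 = [a3 * v + rng.gauss(0.0, noise_sd) for v in x1_true]
    return u, X1, X3


def two_stage_least_squares(u, X1, X3):
    """Stage 1: regress X1 on u. Stage 2: regress X3 on the fitted X1."""
    slope1, intercept1 = ols_slope_intercept(u, X1)
    X1_hat = [intercept1 + slope1 * ui for ui in u]
    return ols_slope_intercept(X1_hat, X3)


# Assumed illustrative values, not taken from the article.
u, X1, X3 = simulate_chain(n=5000, a1=1.0, a3=2.0, noise_sd=3.0, seed=1)
naive_slope, _ = ols_slope_intercept(X1, X3)      # biased toward zero
tsls_slope, _ = two_stage_least_squares(u, X1, X3)  # close to a3
print(f"naive OLS slope: {naive_slope:.3f}, two-stage slope: {tsls_slope:.3f}")
```

With these assumed settings the direct regression yields a slope well below the true value of 2, while the two-stage estimate lies close to it. The gap widens as the error variance of the intermediate variable grows relative to the variance of its true values.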

Section snippets

Linear simultaneous equations and two-stage regression

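For simplicity, a general conclusion about simultaneous equations (George et al., 1982) is introduced. Suppose the observations on $p$ endogenous variables $y_1, y_2, \ldots, y_p$ are $Y_{ti}$ ($1 \le t \le T$, $1 \le i \le p$), and the observations on $q$ exogenous variables $x_1, x_2, \ldots, x_q$ are $x_{tj}$ ($1 \le t \le T$, $1 \le j \le q$), then the general form of simultaneous equations becomes:

$$
\begin{aligned}
Y_{t1} b_{11} + \cdots + Y_{tp} b_{p1} + x_{t1} a_{11} + \cdots + x_{tq} a_{q1} + e_{t1} &= 0 \\
&\;\;\vdots \\
Y_{t1} b_{1p} + \cdots + Y_{tp} b_{pp} + x_{t1} a_{1p} + \cdots + x_{tq} a_{qp} + e_{tp} &= 0,
\end{aligned}
\tag{5}
$$

where $e_{tj}$ are random errors. Let $B = (b_{ij})_{p \times p}$, $A = (a_{ij})_{q \times p}$, $Y_t = (Y_{t1}\ Y_{t2}\ \cdots\ Y_{tp})$, $x_t = (x_{t1}\ x_{t2}\ \cdots\ x_{tq})$, and $e_t = -(e_{t1}$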

Error-in-variable model

In fact, the simultaneous equations can also be regarded as an error-in-variable model, in which explanatory variables contain measurement errors (Fuller, 1987). In forestry, such a situation often appears in forest mensuration. For example, Curtis et al. (1974) and Smith and Watts (1987) discussed the applicability of the error-in-variable model in the field of forest growth and yield. To be presented in the form of an error-in-variable model, Eq. (5) has to be re-written as:

$$y_t B + x_t A = 0, \qquad Y_t = y_t + \delta_t, \qquad 1 \le t \le T,$$

and there

The TSEM generalized to simultaneous nonlinear equations

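The TSEM method can be extended to solve simultaneous nonlinear equations of the form

$$
\begin{aligned}
f_1(Y_{t1}, \ldots, Y_{tp}, x_{t1}, \ldots, x_{tq}, c) &= e_{t1} \\
&\;\;\vdots \\
f_p(Y_{t1}, \ldots, Y_{tp}, x_{t1}, \ldots, x_{tq}, c) &= e_{tp},
\end{aligned}
$$

where $Y_{ti}$ and $x_{tj}$ ($1 \le i \le p$, $1 \le j \le q$, and $1 \le t \le T$) are observations, $e_{tp}$ is random error, and $c$ is a parameter vector. If the equations are treated as the error-in-variable model

$$
\begin{aligned}
f_1(y_{t1}, \ldots, y_{tp}, x_{t1}, \ldots, x_{tq}, c) &= 0 \\
&\;\;\vdots \\
f_p(y_{t1}, \ldots, y_{tp}, x_{t1}, \ldots, x_{tq}, c) &= 0,
\end{aligned}
$$

with

$$Y_{ti} = y_{ti} + \delta_{ti},$$

then parameter $c$ can be estimated by the TSEM method. Suppose a unique solution can be found from solving Eq. (17)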

The TSEM parameter estimation by a simulation study

A data set (Table 1) of forest stand age (year), stand mean DBH (diameter at breast height, cm), and stand volume (m³/ha) was simulated with the following nonlinear system:

$$y_1 = \frac{b_1 x}{b_2 + x} + e_1,$$

$$y_2 = b_3 + b_4 \ln(y_1) + e_2,$$

where $x$ is the age, $y_1$ is the DBH, $y_2$ is the volume, and $b_1$, $b_2$, $b_3$, and $b_4$ are parameters. The values of the four parameters used in the simulation are fixed beforehand (Table 2). Errors $e_1$ and $e_2$ are random variables with $e_1 = 8(u - 0.5)$ and $e_2 = 0.1(u - 0.5)$, where $u$ is uniformly distributed on $[0, 1]$.
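A data set of this kind could be generated along the following lines. This is only a sketch under stated assumptions: the first equation is read as the saturating form y1 = b1·x/(b2 + x) + e1, the parameter values `B1` to `B4` and the age grid are hypothetical placeholders (the Table 2 values are not reproduced in this excerpt), and independent uniform draws are assumed for the two error terms.

```python
import math
import random

# Hypothetical parameter values; the article's Table 2 values are not shown here.
B1, B2, B3, B4 = 40.0, 30.0, -2.0, 2.5


def simulate_stand(ages, b1, b2, b3, b4, seed=0):
    """Simulate (age, DBH, volume) rows from the two-equation system.

    Assumes y1 = b1*x/(b2 + x) + e1 and y2 = b3 + b4*ln(y1) + e2,
    with e1 = 8(u - 0.5), e2 = 0.1(u - 0.5), u uniform on [0, 1],
    and a separate draw of u for each error term (an assumption).
    """
    rng = random.Random(seed)
    rows = []
    for x in ages:
        e1 = 8.0 * (rng.random() - 0.5)
        e2 = 0.1 * (rng.random() - 0.5)
        y1 = b1 * x / (b2 + x) + e1
        y2 = b3 + b4 * math.log(y1)  + e2  # needs y1 > 0 for the chosen values
        rows.append((x, y1, y2))
    return rows


ages = list(range(10, 61, 5))  # hypothetical age grid, 10 to 60 years
rows = simulate_stand(ages, B1, B2, B3, B4, seed=0)
for age, dbh, volume in rows:
    print(f"{age:3d}  {dbh:7.2f}  {volume:7.3f}")
```

Because the second equation takes the noisy `y1` as its input, the DBH error carries into the volume equation, which is the situation the two-stage methods are meant to handle.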

The

Discussion and conclusions

Although the TSLS model can be used to estimate the parameters in Eq. (5), one must first make sure that the model is identifiable and find suitable restrictions on matrices A and B. However, if the parameters in linear simultaneous equations are estimated through the TSEM by treating endogenous variables as subject to measurement error and exogenous variables as error-free, there is no need to verify whether the model is identifiable. For example, if there is an

Acknowledgements

The authors are grateful to the National Science Foundation of China (NSFC Grant No. 39670609) for financial support. They also thank Brenda Laishley and Ian Corns of the Northern Forestry Center, Canadian Forest Service, for editing the manuscript.

