Multiple multivariate regression and global sequence optimization: An application to large-scale models of radiation intensity
Introduction
The motivation behind the research presented in this paper was the development of a system for the fast and accurate computation of radiation intensity profiles observed through a column of absorbing gas. This problem is of interest to the aeronautical industry and arises in a wide range of domains, such as infrared monitoring and turboreactor design. A method exists for the precise resolution of this thermodynamical problem [3], but it is too computationally intensive to be used for interactive simulations or real-time applications. A number of approximate physical models exist and have been implemented to compute relatively accurate profiles in reasonable computational time [3, 6].
To implement such approximate models, however, it is necessary to determine a large number of application-dependent parameters, to which the models are extremely sensitive. This is usually an expensive and lengthy task that demands great experience on the part of model constructors. Part of the difficulty is the nonlinear nature of the problem: physicists must carefully transform and partition the data so that local linear regression models work.
In this implementation we utilize nonlinear multi-layer perceptrons (MLPs) to build accurate models which are constructed automatically, without the lengthy hand-tuning of traditional models. We will see that, while the automatically constructed MLP models cannot compete in accuracy with hand-crafted local linear models, they can be made sufficiently accurate and offer an alternative for applications in which the cost of hand-crafting them is prohibitive. This is not possible with traditional linear models. Because of the large number of regressors needed, this is in fact not possible for traditional neural network techniques either.
In the process of developing the thermodynamical application, we have obtained several results which we believe will be of general interest to developers of neural network applications. For example, a typical problem of nonlinear regression is the large number of parameters needed in comparison to linear models. We show that it is possible to construct MLP architectures with as many or even fewer parameters than the original linear models. Specifically, we show how the integration of several correlated regressors into a single neural network may bring a drastic reduction in the number of network parameters while improving its generalization performance and reducing the computational costs of its use and development.
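The parameter saving described above can be illustrated with simple counting. The sketch below is not taken from the paper: the sizes (`n_in`, `n_functions`, and the hidden-layer widths) are hypothetical, and a single hidden layer with bias terms is assumed. It compares one linear regressor per function, one small MLP per function, and a single MLP whose hidden layer is shared by all outputs.

```python
def mlp_param_count(n_in, n_hidden, n_out):
    """Weights plus biases of an MLP with a single hidden layer."""
    return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out


# Hypothetical sizes, chosen only for illustration.
n_in = 10           # inputs describing one segment
n_functions = 1000  # number of functions to regress

# One linear regressor (weights plus intercept) per function.
linear = n_functions * (n_in + 1)

# One independent MLP with 3 hidden units per function.
independent = n_functions * mlp_param_count(n_in, 3, 1)

# A single MLP with 8 shared hidden units and one output per function.
shared = mlp_param_count(n_in, 8, n_functions)

print(f"linear: {linear}, independent MLPs: {independent}, shared MLP: {shared}")
```

Under these assumed sizes the shared network needs fewer parameters than the set of linear models, because each additional output costs only one weight per hidden unit plus a bias. Whether so small a hidden layer is accurate enough is an empirical question that this count does not address.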
In the standard model, a large number of linear regressors (several thousand, in our application) are constructed to independently approximate small regions of the signal space. These regions correspond to sequences of points (segment descriptions) in typical gas columns. The regressors are then combined into a complex nonlinear model. In the second part of our work we propose a method of global sequence optimization, which goes beyond the standard thermodynamical model. We set out to train our models globally, taking into account the entire sequence of points in a column, with respect to the desired profile. This allows us to improve the generalization performance on a family of input sequences, biasing or "specializing" the model in a natural and easy-to-implement manner.
In Section 2 we give a brief overview of the thermodynamical model on which the application is based. We will see that the main problem to be solved is fitting a large number of functions which form a parametric base for the global solution of the problem, in order to integrate a function of these parameters over the signal sequence. In Section 3 we discuss how ordinary least-squares regression and neural networks (NNs) may be used to fit these functions. We will see that a straightforward substitution of linear models by NNs brings a drastic improvement in accuracy, but at a very high computational cost. In Section 4 we discuss a method to reduce this cost while improving the generalization capacity of the NN model. This method consists of using a single NN to simultaneously approximate a family of parameter functions. Finally, in Section 5, we propose a two-stage optimization procedure which directly optimizes the global sequence instead of independently fitting the different parameter functions. This improves generalization and has the added benefit of providing a subtle and easily implementable way to specialize the models after normal training is completed.
The physical problem
Below, we briefly introduce the physical problem without going into details; these are not important for the comprehension of this work and are beyond the scope of the paper. We have provided a more thorough description of the physical model in Appendix A. For a detailed description the reader may refer to [3, 6].
Spectral radiation profiles offer a concise representation of a radiation source over a certain region of the spectrum. The radiation models make it possible to compute these profiles at any point
Independent regression framework
We have at our disposal a set of construction points, obtained by the LBL model, from which the transmissivity parameter functions are to be fitted. Formally, we have a set of points which assign, to each segment description x, the values y of the parameter functions at that segment. NN terminology denotes x and y as the input and desired patterns, respectively, while in statistics they are called predictor and response measurement vectors.
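As a minimal sketch of the independent regression setting, the code below fits one ordinary least-squares regressor per parameter function. The data are synthetic stand-ins for the LBL construction points, and all sizes and names (`n_points`, `n_inputs`, `n_functions`, `true_W`) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; synthetic data replace the LBL construction points.
n_points, n_inputs, n_functions = 200, 4, 3

X = rng.normal(size=(n_points, n_inputs))    # segment descriptions x
X1 = np.hstack([X, np.ones((n_points, 1))])  # append an intercept column
true_W = rng.normal(size=(n_inputs + 1, n_functions))
Y = X1 @ true_W + 0.01 * rng.normal(size=(n_points, n_functions))  # responses y

# One independent least-squares regressor per parameter function.
W_independent = np.column_stack([
    np.linalg.lstsq(X1, Y[:, k], rcond=None)[0]
    for k in range(n_functions)
])

# Solving for all response columns at once yields the same coefficients,
# because the least-squares criterion decouples across columns.
W_joint = np.linalg.lstsq(X1, Y, rcond=None)[0]

max_gap = np.max(np.abs(W_independent - W_joint))
print(f"largest difference between the two solutions: {max_gap:.2e}")
```

The agreement between the two solutions shows that plain least squares never shares information between the parameter functions, even when all of them are solved in a single call.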
A simple way
Multiple function regression
Several ideas lead naturally to the implementation of multiple function regressors for our application. The combination of regressors, when correlated, may improve generalization. In the case of linear regression it can be shown that a weighted sum of a set of regressors gives a better (or at worst equal) solution than the independent regressors. Appropriate weights may be estimated from data [1]. To our knowledge, no similar results exist in the case of non-linear function
Two-stage sequence optimization
Whether we employed linear or nonlinear regression methods, the models of Sections 3 and 4 were concerned with the approximation of the parameter functions, without regard for the role each function plays in the spectral radiation model. However, we can see from , that not all the parameter functions are equally important for the intensity calculations. Depending on the column characteristics, some parameters will have a negligible effect.
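The idea can be illustrated with a toy two-stage procedure. Everything in the sketch below is an assumption made for illustration: the "profile" is a hypothetical weighted sum of the parameter functions over the segments of a column (not the radiation model of the paper), the regressors are linear, and the importance weights `c` are invented. Stage 1 fits the parameter functions independently; stage 2 starts from that solution and applies gradient descent to the error of the global profile.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: columns, segments per column, inputs, parameter functions.
n_cols, n_seg, d, K = 50, 6, 3, 4

# Segments of one column share a common base, so they are correlated.
base = rng.normal(size=(n_cols, 1, d))
X = base + 0.3 * rng.normal(size=(n_cols, n_seg, d))

A = rng.normal(size=(d, K))
theta_true = np.tanh(X @ A)              # toy nonlinear parameter functions
c = np.array([1.0, 0.5, 0.01, 0.0])      # invented importance of each function
I_true = (theta_true @ c).sum(axis=1)    # toy profile value per column


def profile(W):
    """Toy profile: importance-weighted sum of linear predictions over segments."""
    return ((X @ W) @ c).sum(axis=1)


def global_loss(W):
    return np.mean((profile(W) - I_true) ** 2)


# Stage 1: fit every parameter function independently by least squares.
W = np.linalg.lstsq(X.reshape(-1, d), theta_true.reshape(-1, K), rcond=None)[0]
W_stage1 = W.copy()
loss_stage1 = global_loss(W)

# Stage 2: gradient descent on the global profile error, starting from stage 1.
Z = X.sum(axis=1)                        # profile(W) equals Z @ W @ c
lam = np.linalg.eigvalsh(Z.T @ Z / n_cols).max()
lr = 1.0 / (2.0 * (c @ c) * lam)         # step size 1/L for this quadratic loss
for _ in range(200):
    r = profile(W) - I_true
    grad = 2.0 * np.outer(Z.T @ r / n_cols, c)
    W -= lr * grad
loss_stage2 = global_loss(W)

print(f"global loss after stage 1: {loss_stage1:.4f}, after stage 2: {loss_stage2:.4f}")
```

In this toy setting the update of each parameter function is scaled by its importance weight, so a function with zero weight is left exactly as stage 1 fitted it, which mirrors the remark that some parameters have a negligible effect on the intensity.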
Conclusions
We have presented an effective model of spectral luminance computation based on neural networks. This model transforms a sequence of column measurements into a spectral radiation intensity profile. The main characteristic of the problem is that several thousand regressors have to be estimated simultaneously. This precludes the use of traditional estimation methods such as cross-validation or regularization. We have shown how to make the computations tractable by taking advantage of the
Acknowledgements
This material is based on work supported in part by the French Aeronautical Society SNECMA.
References (8)
- et al., Correlated-k and fictitious gas methods for H2O near 2.7 μm, J. Quant. Spectrosc. Radiat. Transfer (1992)
- L. Breiman, J.H. Friedman, Predicting multivariate responses in multiple linear regression, Royal Statist. Soc., in...
- Learning many related tasks at the same time with backpropagation, Adv. Neural Inform. Systems (1994)
- et al., Atmospheric Radiation (1989)