A general algorithm to solve linear and nonlinear inverse problems

Lemes, N. H. T.; Borges, E.; Braga, J. P.

doi:10.1590/S0103-50532007000700008

Abstracts

A general algorithm to solve linear and nonlinear inverse problems, based on recursive neural networks, is discussed in this work. The procedure will be applied to physical chemical problems modeled by integral, differential and eigenvalue equations. Representative applications discussed are in positron lifetime spectroscopy, chemical kinetics and vibrational spectroscopy. The method is robust with respect to errors in the initial condition or in the experimental data. The present approach is simple, numerically stable and has a broad range of applicability.

inverse problems; neural networks; positron lifetime spectroscopy; chemical kinetics; vibrational spectroscopy

Um algoritmo geral para resolver problemas inversos lineares e não lineares, baseado em redes neurais recursivas, é discutido neste trabalho. O procedimento será aplicado a problemas físico-químicos modelados por equações integrais, diferenciais e de autovalor. As aplicações são discutidas em espectroscopia de aniquilação de pósitrons, cinética química e espectroscopia vibracional. O método é robusto com relação a erros nas condições iniciais ou em dados experimentais. A presente abordagem é simples, numericamente estável e tem uma grande aplicabilidade.

ARTICLE

A general algorithm to solve linear and nonlinear inverse problems

N. H. T. Lemes; E. Borges; J. P. Braga^* * e-mail: jpbraga@ufmg.br

Departamento de Química, Universidade Federal de Minas Gerais, 31270-901 Belo Horizonte-MG, Brazil

ABSTRACT

A general algorithm to solve linear and nonlinear inverse problems, based on recursive neural networks, is discussed in this work. The procedure will be applied to physical chemical problems modeled by integral, differential and eigenvalue equations. Representative applications discussed are in positron lifetime spectroscopy, chemical kinetics and vibrational spectroscopy. The method is robust with respect to errors in the initial condition or in the experimental data. The present approach is simple, numerically stable and has a broad range of applicability.

Keywords: inverse problems, neural networks, positron lifetime spectroscopy, chemical kinetics, vibrational spectroscopy

RESUMO

Um algoritmo geral para resolver problemas inversos lineares e não lineares, baseado em redes neurais recursivas, é discutido neste trabalho. O procedimento será aplicado a problemas físico-químicos modelados por equações integrais, diferenciais e de autovalor. As aplicações são discutidas em espectroscopia de aniquilação de pósitrons, cinética química e espectroscopia vibracional. O método é robusto com relação a erros nas condições iniciais ou em dados experimentais. A presente abordagem é simples, numericamente estável e tem uma grande aplicabilidade.

Introduction

Extracting chemical and physical information from experimental data is an important problem in science. This procedure, which characterizes an inverse problem,¹ is usually ill-posed in the sense that one of three conditions: (i) existence, (ii) uniqueness and (iii) continuity is not satisfied. When dealing with real data one has to handle ill-posed problems and this can be better clarified if a particular case is considered. For example, in an integral formulation of a physical problem, to be discussed along the paper, the solution of Kf=g has to be found. In such cases the matrix K is ill-conditioned, and consequently, its inverse will have very large values. Experimental errors will be amplified by the inversion procedure and condition (iii) is not fulfilled. Multiple solutions and existence conditions are also understood by analyzing the properties of the matrix K, as discussed in reference.¹ For every inverse problem there is a direct problem that is considerably easier to solve. For example, in a chemical kinetics process, if initial conditions, rate constants and kinetic law are given, the concentration of the species can be calculated along the time. For a given set of concentration, measured at different observation times, one may ask what are the rate constants.² Obtaining force fields from experimental data is another important inverse problem.³ Calculation of vibrational frequencies, from a given set of force constants, will define the direct problem. On the other hand, force constants retrieved from the experimental frequencies will characterize the inverse problem.

Numerical methods to handle linear inverse problems are well established. For example, techniques such as Tikhonov regularization,¹ decomposition into subspaces⁴ and dynamical neural network approach^5,6 have been used. A comparison between Tikhonov regularization and decomposition into subspaces has been discussed,⁷ but the neural network framework has been shown to be more stable, simple and fast when applied to inverse problems.⁸ The dynamical neural network methodology was first suggested in reference⁵ and applied to linear inversion problems by Vemuri and Jang.⁹ Several linear ill-posed problems were handled by this recurrent neural network technique, such as in: linear thermodynamics problem,¹⁰ positron lifetime spectroscopy,¹¹ magnetic resonance multiple sclerosis diagnostic.¹² All these applications have in common the integral form of the operator representing the direct problem. The original algorithm⁹ has been extended to solve nonlinear integral problems⁸ and to handle eigenvalue force field inverse problems.¹³

In the present work a general methodology to deal with inverse problems is presented. This algorithm will be able to solve linear integral inverse problem, nonlinear integral inverse problem, inverse problems in dynamic systems and identification problems. Some models will be used to exemplify the present approach. As a first example, inversion of positron annihilation lifetime spectrum represented by a Fredholm integral equation, will be taken. In a second example, chemical kinetics will be treated to illustrate inverse problems in differential form. The rate constants will be obtained from experimental concentration data. The final example will treat the eigenvalue inverse problem in which force constants are obtained from experimental frequencies.

Theoretical method

A Hopfield neural network is constructed by a recurrent layer, consisting of neurons fully connected. For example, in a Hopfield network formed by ten neurons, each neuron will have ten inputs, one for its own previous value and nine for the remaining ones. In this way, the output of a neuron u is a function of the input information, which is converted to another state by a activation function, f(u). The activation function propagates the required information and this occurs by following an energy-descent pathway. Therefore, the Hopfield neural network can be applied to solve optimization problems if an energy function is defined, which for practical applications, can be the error function for the physical problem. The learning property of the neural network is satisfied if f(u) is an increasing function of u, a condition to be fulfilled by functions of the kind (1 + tanh(u)) and f(u) = tanh(u). ⁵Both of these activation functions are used in the examples studied here. It is more appropriate to use the activation function (1 + tanh(u)) if inverted results assume only positive values, as in the positron annihilation lifetime spectrum. This restriction will prevent the inverted function to have negative values. For the same reason, this activation function was also used to invert data in inverse kinetics problem. In the force field inverse problem, the function f(u) = tanh(u) was used, for the inverted results represent force constants which can be negative. A more clarifying discussion about the recurrent neural network dynamics can be made by considering the error between measured and calculated properties. If the experimental and calculated properties are denoted, respectively by P^EXP, P^CAL and with o_j= P_j^CAL P_j^EXPthe total error can be defined as,

for m experimental data. Derivation with respect to the learning time, t, gives,

in which n is the quantity of neurons involved in the process. This is also equal to the number of variables to be inverted from experimental data. The total error will always decrease with respect to the learning time if two steps are taken. First, the condition

is imposed, transforming (2) into,

Secondly, the relation 0 is applied, which will imply in,

Therefore, the present algorithm will have the property of always decreasing the error with respect to the learning time. Due to the nature of the ill-posed problems, multiply solutions will always be presented. One has to choose, based on physical and chemical grounds, among the several possibilities. Nevertheless, the present approach has the property of always decreasing the error between the inverted and experimental data. Therefore, among the infinite solutions, the one obtained in the present work is the solution which best reproduces the experimental property. The properties P^EXPand P^CAL are not necessarily the system variables, but can be a function of them. For example, recovering potential energy function from second virial coefficient will require solving an integral equation over the coordinates.⁸ In chemical kinetics, the measured property can be represented by a combination of the concentration, as in the absorbance measurement.² For the force field inverse problem the property is the eigenvalue of a matrix.

The differential equations (3) were integrated by using a fourth order Runge-Kutta method with variable stepsize,¹⁴ until the learning process, which defines the stopping condition, has finished. For the three examples to be discussed in this work, this stopping condition occurs if the error function reaches the minimum value, i.e., if there is no further decreasing in the network energy. Retrieval of the required quantities, that is, solution of the inverse problem, is obtained according to the following algorithm: (i) an initial guess to the required quantities is made and used to propagate the differential equations; (ii) since error will decrease, the time derivative of the neuron state will reach a constant value with zero derivative. This will be represent by , defining also the final learning time; (iii) the converged values of the neuron states are activated to obtain the desired result. Several inversion problems can be solved by these steps. Artificial errors were introduced into the simulated and experimental data to test the robustness of the algorithm. These errors were randomly generated with negative and positive values having the same weight. Activating the states u will have an effect on the quality of the inverted results. For example, if the function to be calculated happens to be only positively, this can be easily incorporated into the present methodology by imposing

Results and Discussion

Positron lifetime spectroscopy-inversion in integral equations

The inverse problem to determine s(y) in the Fredholm integral equation of first kind,, from a given k(x,y) and g(x), will be tested for the present algorithm. As a representative example, the inversion of positron annihilation lifetime spectrum⁶ will be discussed. This problem was previously discussed by using a singular value decomposition method, in a previous work.¹¹The density function of the positron annihilation rate spectrum,

was taken as experimental result for the present analysis. In equation (6), a₁=0.42250, a₁=11.6951, s₁=0.57710, a₂=1.57270, a₂=12.0688 and s₂=0.93530, parameters for lysozyme in water system.¹⁰The theoretical data for this problem, calculated from , were generated by using a rectangular basis, n=128. Therefore, the error function in this example, will be . Random numbers uniformly distributed were used to simulate the experimental error. Inverted results, with initial condition equal to zero and random errors of ± 0.3% added to the experimental data (the experimental error in this kind of measurement is about ± 0.3%) are compared with exact results in Figure 1. These results show the stability of the algorithm for this problem. Moreover, the algorithm discussed here overcomes the previous framework applied to linear inverse problems,⁶ in which it was necessary to construct two matrices, derived from theavailable data and the linear transformation between input and output. For the algorithm proposed here, these matrices are unnecessary. This is an improvement in the inverse problem methodology since linear and nonlinear ill-posed problems can be treated on equal grounds.

Chemical kinetics-inversion in differential equations

The analysis of the fission and formation of C-C bonds in chemical reactions are often used as model systems in molecular kinetics. Experimental data consistent with the recombination-dissociation rate constants of several reactions are available in the literature. For example, trifluoromethyl radicals recombination, CF₃+CF₃® C₂F₆, has been studied experimentally¹⁵ and theoretically to test other inverse algorithms.¹⁶The ruby laser radiation has been used to produce photolysis of CF₃NO.¹⁵ In this technique, which is effective to produce CF₃radicals, the CF₃ and NO species are obtained, due to the photolyzing pulse of the laser. The CF₃NO concentration was monitored by absorption in the visible spectral region and the reaction cell filled with a mixture of CF₃NO, NO and a buffer gas. Therefore, concentration measurements of the CF₃NO component were used to determine the rate constant k₂ for this recombination process. With experimental errors in the range 2% to 7% this rate constant was established to be (3.9 ± 1.3) × 10^-12cm³ s^-1. Using the notation x₁=CF₃, x₂=[NO], x₃=[( CF₃)₂NO], x₄=[(CF₃)₂NO₂], x₅=[CF₃NO] and x₆=[C₂F₆], the mechanism for the recombination process is described by the set of six differential equations,

Thus, the inversion problem to be solved here will be: for given values of the rate constants k₁, k₃, k₄ and values of x^EXP along the time, what will be the rate constant k₂?The rate constants¹⁶ will be taken as: k₁ = 1.3×10^-11 cm³ s^-1, k₃ = 1.4 × 10^-13 cm³ s^-1 and k₄ = 2.0×10^-11 cm³ s^-1. The CF₃NO concentration was simulated in the interval 0.2 to1.0s, with step of 0.2s with 3.9× 10^-11 cm³ s^-2 for the k₂rate constant. In this case, integration of equation (3) will provide the temporal evolution of the neurons which represent the inverted rate constant k₂. With this rate constant it is possible to calculate the concentration x^CAL₅in an iterative process to minimize the error function , according to equation (5). The experimental error was taken to be, at maximum, of 7%.¹⁵These results were taken as experimental data from which inversion is to be performed. Integration of the differential equations (7) was performed with x₁(0) = x₂(0) = 1×10¹¹molecules cm^-3 and x₃(0) = x₄(0) = x₅(0) = x₆(0) = 0 as initial conditions. Retrieved results are presented in Table 1. Even with the largest experimental error tested in the concentrations, the inverted rate constants are in agreement with the experimental value. The less favorable inverted rate constant, 4.40×10^-12cm³ s^-1occurs at an experimental random error of 7% in the concentrations. Also in this situation, the inverted rate constant is within of the experimental value, (3.9±1.3)×10^-12 cm³ s^-1. The algorithm was very stable in relation to different initial conditions. Even with an initial guess two times the correct value, rate constants shown in Table 1 are obtained.

Thumbnail

Vibrational spectroscopy-inversion in eigenvalue problems

The eigenvalue inverse problem is based on the general equation AU = L, in which A is a matrix related to the desired information, U are the eigenvectors and L is the diagonal matrix of eigenvalues. Although the present algorithm has been applied to this kind of problem,¹³ it will also be applied here to another system. This additional example will exemplify the general applicability of the method for this problem. There are three main equations to formulate the problem: (i) the eigenvalue equation proposed by Wilson et.al.,¹⁷GFL = LL in which G is a matrix related to mass and geometry of the molecule, F the matrix representing the forces constants, La diagonal matrix with vibrational frequencies and L the normalized vibrational mode amplitudes matrix; (ii) equation (1) with the error function and (iii) equation (5), that guarantees a minimum error function, providing the calculation of reliable force constants. Therefore, the G and Lmatrices are known in formulating the problem. On the other hand, the F matrix is to be established. The error function in this problem is , are the eigenvalues of the matrix GF calculated with the inverted force constants and L_EXP are the experimental frequencies. The fluoroform molecule in its equilibrium configuration belongs to symmetry point group C₃_vand this is considered to obtain the matrix G. The force constant regularized matrix,³ with deviations up to 30%, was used as priori physical information, defining the initial condition for the network. The optimized results are presented in Table 2 and are in agreement with the regularized force field found in reference.³The errors in the calculated frequencies, with respect to the experimental data, can not be attributed to the method itself, but to the harmonic force field model¹⁷considered. This specific model was used to carry on a comparison with previous results in the literature.³Moreover, for the experimental data used, the errors in the calculated vibrational frequencies are in agreement with the experimental error, as presented in Table 3. Table 4 shows the inverted results are stable and convergent for large experimental errors. Experimental deviations in the fundamental frequencies¹⁸ which are about ±0.2cm^-1, corresponding to 0.03%, do not perturb the inverted results. The sensitivity of the inverted results with respect to different initial guesses is presented in Table 5. Calculation of retrieved force field presents low average error, up to 20% of deviation in relation to the exact initial guess.

Thumbnail

Conclusions

A general technique to handle inverse problems is presented in this work. The algorithm can be applied to a problem in which there is not an explicit function to represent the data. For example, in the inverse chemical kinetics problem an analytical expression for concentrations was not available, but the inverted problem was still possible to be carried out. Experimental errors were also considered in the present approach. For the three examples studied, the corresponding experimental errors were taking into consideration. Errors in the initial conditions were used to test the robustness of the methodology. In both cases, either with experimental errors or initial conditions deviations, the present approach was stable, giving accurate physical results. Also, the method is simple and represents an extension of the previous used algorithm,⁸ incorporating linear and non linear cases. The same approach can equally be applied to an integral, differential or eigenvalue inverse problem, for a general case of the measured property.

Acknowledgments

This work was supported by CNPq and FAPEMIG.

Received: January 15, 2007

Web Release Date: October 31, 2007

1. Tikhonov, A. N.; Arsénine,V.; Méthodes de Résolution de Problèmes Mal Posés, Mir, Moscow, 1974.
2. Laidler, K. J.; Chemical Kinetics, Benjamin Cummings: San Francisco, 1977.
3. Yagola, A. G.; Kochikov, I. V.; Kuramshina, G. M.; Pentin, Yu. A.; Inverse Problems of Vibrational Spectroscopy, Brill Academic Publishers: Utrecht, 1999.
4. Golub,G. H.; Van Loan, C. F.; Matrix Computations, John Hopkins University Press: Baltimore, 1989.
5. Hopfield, J. J.; Proc. Natl. Acad. Sci. 1982, 79, 2554.
6. Viterbo, V. C.; Braga, J. P.; Braga, A. P.; Almeida, M. B.; J. Chem. Inf. Comput. Sci. 2001, 41, 309.
7. Braga, J. P.; J. Math. Chem. 2001, 29, 151.
8. Sebastião, R. C. O.; Lemes, N. H. T.; Virtuoso, L. S.; Braga, J. P.; Chem. Phys. Lett. 2003, 378, 406.
9. Vemuri, V.; Jang, G. S.; J. Franklin Inst. 1992, 329, 241.
10. Neves, J. L.; Braga, J. P.; Braga, A. P.; de Almeida, M. B.; Inverse Probl. Eng. 2002, 10, 153.
11. Viterbo, V. C.; Sebastião, R. C. O.; Monteiro, R. P. G.; Magalhães, W. F.; Braga, J. P.; J. Braz. Chem. Soc. 2005, 16, 93.
12. Sebastião, R. C. O.; Braga, J. P.; J. Magn. Reson. 2005, 177, 146.
13. Borges, E.; Lemes, N. H. T.; Braga, J. P.; Chem. Phys. Lett. 2006, 423, 357.
14. Forsythe, G. E.; Malcolm, M. A.; Moler, C. B.; Computer Methods for Mathematical Computations, Prentice-Hall: New Jersey, 1977.
15. Vakhtin, A. B.; Int. J. Chem. Kinet. 1996, 28, 443.
16. Tang, W.; Zhang, L.; Linninger, A. A.; Tranter, R. S.; Brezinsky,; Ind. Eng. Chem. Res. 2005, 44, 3626.
17. Wilson, E. B.; Decius, J. C.; Cross, P. C.; Molecular Vibrations, The Theory of Infrared and Raman Vibrational Spectra,McGraw-Hill Book Company: New York, 1955.
18. Chambers, H. F.; Kirk, R. W.; Thompson, J. H.; Warner, M. J.; Willt, P. M.; J. Mol. Spectrosc. 1975, 58, 76.K.

*

e-mail:

jpbraga@ufmg.br

Publication Dates

Publication in this collection
08 Jan 2008
Date of issue
2007

History

Accepted
31 Oct 2007
Received
15 Jan 2007

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

[1] 1. Tikhonov, A. N.; Arsénine,V.; Méthodes de Résolution de Problèmes Mal Posés, Mir, Moscow, 1974.

[2] 2. Laidler, K. J.; Chemical Kinetics, Benjamin Cummings: San Francisco, 1977.

[3] 3. Yagola, A. G.; Kochikov, I. V.; Kuramshina, G. M.; Pentin, Yu. A.; Inverse Problems of Vibrational Spectroscopy, Brill Academic Publishers: Utrecht, 1999.

[4] 4. Golub,G. H.; Van Loan, C. F.; Matrix Computations, John Hopkins University Press: Baltimore, 1989.

[5] 5. Hopfield, J. J.; Proc. Natl. Acad. Sci. 1982, 79, 2554.

[6] 6. Viterbo, V. C.; Braga, J. P.; Braga, A. P.; Almeida, M. B.; J. Chem. Inf. Comput. Sci. 2001, 41, 309.

[7] 7. Braga, J. P.; J. Math. Chem. 2001, 29, 151.

[8] 8. Sebastião, R. C. O.; Lemes, N. H. T.; Virtuoso, L. S.; Braga, J. P.; Chem. Phys. Lett. 2003, 378, 406.

[9] 9. Vemuri, V.; Jang, G. S.; J. Franklin Inst. 1992, 329, 241.

[10] 10. Neves, J. L.; Braga, J. P.; Braga, A. P.; de Almeida, M. B.; Inverse Probl. Eng. 2002, 10, 153.

[11] 11. Viterbo, V. C.; Sebastião, R. C. O.; Monteiro, R. P. G.; Magalhães, W. F.; Braga, J. P.; J. Braz. Chem. Soc. 2005, 16, 93.

[12] 12. Sebastião, R. C. O.; Braga, J. P.; J. Magn. Reson. 2005, 177, 146.

[13] 13. Borges, E.; Lemes, N. H. T.; Braga, J. P.; Chem. Phys. Lett. 2006, 423, 357.

[14] 14. Forsythe, G. E.; Malcolm, M. A.; Moler, C. B.; Computer Methods for Mathematical Computations, Prentice-Hall: New Jersey, 1977.

[15] 15. Vakhtin, A. B.; Int. J. Chem. Kinet. 1996, 28, 443.

[16] 16. Tang, W.; Zhang, L.; Linninger, A. A.; Tranter, R. S.; Brezinsky,; Ind. Eng. Chem. Res. 2005, 44, 3626.

[17] 17. Wilson, E. B.; Decius, J. C.; Cross, P. C.; Molecular Vibrations, The Theory of Infrared and Raman Vibrational Spectra,McGraw-Hill Book Company: New York, 1955.

[18] 18. Chambers, H. F.; Kirk, R. W.; Thompson, J. H.; Warner, M. J.; Willt, P. M.; J. Mol. Spectrosc. 1975, 58, 76.K.

Brasil

Brasil

A general algorithm to solve linear and nonlinear inverse problems

Abstracts

Publication Dates

History