doi:10.1016/S0925-2312(01)00708-1
Copyright © 2002 Elsevier Science B.V. All rights reserved.
Partial least-squares algorithm for weights initialization of backpropagation network
a Institute of Biomedical Engineering, National Yang-Ming University, No. 155, Sec. 2, Li-Nung St., Taipei 112, Taiwan, ROC
b Institute of Biomedical Engineering, College of Medicine and Engineering, National Taiwan University, Taipei 100, Taiwan, ROC
Received 8 March 2001; accepted 23 November 2001; available online 13 January 2002.
Abstract
This paper proposes a hybrid scheme that initializes the weights and sets the optimal number of hidden nodes of the backpropagation network (BPN) from the loading weights and the number of factors of the partial least-squares (PLS) algorithm. The joint PLS and BPN method (PLSBPN) starts with a small residual error, modifies the latent weight matrices, and obtains a near-global minimum in the calibration phase. The performance of the BPN, PLS, and PLSBPN was compared for the near-infrared spectroscopic analysis of glucose concentrations in aqueous matrices. The results showed that the PLSBPN had the smallest root mean square error. The PLSBPN approach addresses several conventional problems of the BPN method by providing good initial weights, reducing the calibration time, obtaining an optimal solution, and making the number of hidden nodes easy to determine.
Author Keywords: Weights initialization; Backpropagation network; Partial least-squares; Feedforward neural networks
Fig. 1. Schematic diagram of the PLS. The matrix Y is used as the temporary U. The loading weight P is calculated by the least-squares method and scaled to length 1. The score U is estimated via matrices X and P. The other score V is calculated via U (V=U in this paper). The loading weight Q is estimated by the least-squares method.
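The steps listed in the Fig. 1 caption can be sketched for a single PLS factor as follows. This is a minimal illustration and not the authors' implementation: the function name `pls_one_factor`, the synthetic data, and the restriction to a single response vector y are assumptions made here, and the deflation needed to extract further factors is omitted.

```python
import numpy as np

def pls_one_factor(X, y):
    """One PLS factor, following the order of steps in the Fig. 1 caption.

    Hypothetical helper for illustration; X is (n, d), y is (n,).
    """
    u_tmp = y                              # Y is used as the temporary score U
    p = X.T @ u_tmp / (u_tmp @ u_tmp)      # least-squares loading weight P
    p = p / np.linalg.norm(p)              # scale P to length 1
    u = X @ p                              # score U estimated from X and P
    v = u                                  # V = U, as stated in the caption
    q = (v @ y) / (v @ v)                  # least-squares loading weight Q
    return p, v, q

# Synthetic data, assumed only for this example.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = X @ np.array([1.0, -0.5, 0.0, 2.0, 0.3])

p, v, q = pls_one_factor(X, y)
y_hat = v * q                              # one-factor estimate of y
```

Because q is a least-squares coefficient, the residual `y - y_hat` is orthogonal to the score `v`; each additional factor would be fitted to what remains after removing this one.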
Fig. 2. The PLSBPN can be represented as a three-layered feedforward neural network. X: input, V: hidden, Y: output nodes, P and Q: weights.
Fig. 3. The PLSBPN training procedure: PLS weights initialization, network processing, cost function decision, and training algorithm for weights modification. GDR=generalized delta rule.
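The network of Fig. 2 and the weight-modification step of Fig. 3 can be sketched as below. This is a hedged illustration only: linear hidden and output units, the half mean-squared-error cost, the learning rate, and the names `forward` and `gradient_step` are all assumptions made here, and the random P and Q merely stand in for loading weights that a PLS fit would supply.

```python
import numpy as np

def forward(X, P, Q):
    """Forward pass X -> V -> Y of a three-layer network (linear units assumed)."""
    V = X @ P            # hidden-node values, one column per hidden node
    Y_hat = V @ Q        # output-node values
    return V, Y_hat

def gradient_step(X, Y, P, Q, lr=1e-3):
    """One gradient-descent update of P and Q on the half mean squared error."""
    n = X.shape[0]
    V, Y_hat = forward(X, P, Q)
    E = Y_hat - Y                      # output error, shape (n, m)
    grad_Q = V.T @ E / n               # gradient with respect to Q
    grad_P = X.T @ (E @ Q.T) / n       # error backpropagated through Q
    return P - lr * grad_P, Q - lr * grad_Q

# Placeholder data and weights, assumed only for this example.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))           # 20 samples, 5 input nodes
Y = rng.normal(size=(20, 1))           # 1 output node
P = 0.1 * rng.normal(size=(5, 3))      # input-to-hidden weights, 3 hidden nodes
Q = 0.1 * rng.normal(size=(3, 1))      # hidden-to-output weights

V, Y_hat = forward(X, P, Q)
P_new, Q_new = gradient_step(X, Y, P, Q)
```

With linear units, the number of columns of P equals the number of hidden nodes, which is how the number of PLS factors can fix the hidden-layer size in this sketch.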
Fig. 4. The NIR absorption spectra of different glucose concentrations in deionized water after subtracting the absorption spectrum of the water.
Fig. 5. Training epochs in the calibration phase of the BPN and PLSBPN using 1 (♦), 5 (▪), 10 ( ), and 15 (×) hidden nodes, respectively. Each BPN curve represents a 10-times-averaged calibration result, and the error bar represents one standard deviation.
Fig. 6. The root mean square error (RMSE) of the BPN, PLS, and PLSBPN in the calibration phase at 500 training epochs.
Fig. 7. The RMSE of the BPN, PLS, and PLSBPN in the cross-validation phase at 500 training epochs.