Signal Processing, Volume 144, March 2018, Pages 226-237

Parameter estimation in wireless sensor networks with faulty transducers: A distributed EM approach

https://doi.org/10.1016/j.sigpro.2017.10.012

Highlights

  • A diffusion-averaging distributed Expectation-Maximization algorithm is proposed.

  • It estimates a parameter vector with a wireless sensor network with faulty nodes.

  • Only local information exchange among the (possibly faulty) nodes is required.

  • A vanishing step-size is used to switch from a diffusion stage to an averaging one.

  • Convergent points of the centralized EM iteration are locally asymptotically convergent points of the proposed method.

Abstract

We address the problem of distributed estimation of a vector-valued parameter performed by a wireless sensor network in the presence of noisy observations which may be unreliable due to faulty transducers. The proposed distributed estimator is based on the Expectation-Maximization (EM) algorithm and combines consensus and diffusion techniques: a term for information diffusion is gradually turned off, while a term for updated information averaging is turned on so that all nodes in the network approach the same value of the estimate. The proposed method requires only local exchanges of information among network nodes and, in contrast with previous approaches, it does not assume knowledge of the a priori probability of transducer failures or the noise variance. A convergence analysis is provided, showing that the convergent points of the centralized EM iteration are locally asymptotically convergent points of the proposed distributed scheme. Numerical examples show that the distributed algorithm asymptotically attains the performance of the centralized EM method.

Introduction

Wireless sensor networks (WSNs) consist of many small, spatially distributed autonomous nodes, equipped with one or more on-board sensors to collect information from the surrounding environment, and which collaborate to jointly perform a variety of inference and information processing tasks. Applications include environmental and healthcare monitoring, event detection, target classification, and industrial automation [1], [2]. Distributed processing, by which computations are carried out within the network in order to avoid raw data transmission to a fusion center, is a desirable feature of WSNs since it usually results in energy savings and improved robustness [3], [4]. In particular, distributed estimation of unknown parameters in WSNs is an important problem which has been extensively considered over the past few years [5], [6], [7], [8], [9], [10], [11].

In practice, estimation performance may be severely degraded when the information collected by the nodes becomes unreliable due to sensor malfunction [12], [13], [14], [15], and therefore it is important to efficiently identify faulty nodes [16], [17]. Given that nodes are typically deployed in outdoor, potentially harsh environments, sensor malfunction effects should not be lightly dismissed. We consider the problem of distributed estimation of a vector-valued parameter from the observations collected by a WSN where some nodes may be subject to random transducer faults, so that their reports contain only noise [13], [18]. In the presence of such unreliable observations, one possibility is to run a node classification stage prior to the estimation stage [19]; however, this entails increased computational complexity and communication cost. In contrast with algorithms based on prior detection of faulty nodes, the Mixed Detection and Estimation (MDE) scheme in [18] performs the node classification and estimation tasks jointly and in a distributed manner. However, since MDE classifies nodes based on hard decisions, it is prone to decision errors whenever the signal-to-noise ratio (SNR) is not sufficiently high. To avoid this problem, we adopt an approach in which a soft classification of the data is performed by means of the expectation-maximization (EM) algorithm, a well-known method for computing the maximum likelihood (ML) estimate in the presence of hidden variables [20], [21]. The EM algorithm implicitly and iteratively produces estimates of the class probabilities, alternating between an expectation step (E-step), where access to the whole network dataset is required, and a maximization step (M-step), where updated estimates are obtained.

Distributed implementations of the EM algorithm for Gaussian mixture density estimation and clustering have been previously proposed. For example, in incremental approaches [22], [23], [24], [25], computations involving global network information at the E-step are addressed via aggregation strategies, assigning routing paths or junction trees within the network. This problem is avoided in [26], [27], [29], which apply full-blown gossip- or consensus-based schemes at each E-step so that all nodes arrive at an agreement about every intermediate estimate. The main drawback of these methods, however, is the need to exchange a large amount of information among neighbor nodes, with the consequent penalty in energy efficiency. In [28], a distributed EM algorithm based on the alternating direction method of multipliers (ADMM) is proposed for clustering. In this scheme the communication overhead is reduced, but at the expense of a significantly higher computational cost, since each node has to solve a convex optimization problem (via, e.g., interior point methods) at each iteration. A potential way to overcome these problems is the use of diffusion strategies [11], by which nodes exchange local information only once per EM iteration and perform averaging over the values in their neighborhoods [30], [31], [32] (see [33] for an extension to general mixture models). Convergence analyses of these schemes either assume that an infinite amount of data is available at each node [30], [32], or adopt a stochastic framework under an independence assumption [31].

The algorithm proposed in this paper is based on a different diffusion-based approach [34], [35], in which the propagation of information throughout the network is embedded in the iterative parameter update. This is done by appropriately combining two terms for information diffusion and information averaging (consensus) in the update equations. The resulting iteration, termed diffusion-averaging distributed Expectation-Maximization (DA-DEM), is reminiscent of so-called consensus+innovations (C+I) algorithms for distributed estimation in linear models [36], whose updates combine a consensus term and a local innovation term; nevertheless, several important differences should be highlighted. First, the model underlying C+I schemes is linear, but in our setting this property does not apply due to the potential presence of faulty nodes. Second, C+I schemes are usually designed for on-line adaptation, i.e., sensors keep acquiring new observations as time progresses, whereas the DA-DEM algorithm is of batch type, with a single measurement available to each sensor. Thus, in our setting, the “innovation” provided by the diffusion term does not correspond to information provided by new measurements, but rather to that provided by the iterative refinement of the estimates. Third, in contrast with [18], [34], [35], [36], where the diffusion and averaging terms have different asymptotic decay rates, thus leading to mixed time-scale recursions, in DA-DEM both terms have the same rate. In contrast with [30], [31], [32], this feature allows for the development of a local convergence analysis under a deterministic setting with a finite amount of data, showing that any convergent point of the centralized EM iteration, and therefore a (possibly local) maximum of the likelihood function, must be an asymptotically convergent point of DA-DEM. Numerical examples show that the DA-DEM estimator asymptotically attains the performance of centralized EM in terms of mean square error (MSE). In addition to the aforementioned convergence analysis, further contributions with respect to [35] include dispensing with knowledge of the a priori probability of a sensor fault and the consideration of a vector-valued parameter. In contrast with incremental strategies, DA-DEM does not require the computation and management of routing paths through the network, resulting in a sizable reduction in convergence time and thus leading to energy savings.

The paper is organized as follows. Section 2 describes the signal model, and Section 3 presents the centralized EM-based estimator, the starting point for the development of the distributed implementation in Section 4. The convergence analysis of DA-DEM is developed in Section 5. Finally, simulation results and conclusions are presented in Sections 6 and 7 respectively.

Notation: We use lowercase, bold lowercase, and bold uppercase symbols to respectively denote scalars, vectors and matrices. The transpose and inverse of a matrix $\mathbf{A}$ are denoted by $\mathbf{A}^T$ and $\mathbf{A}^{-1}$, respectively. The 2-norm of a vector $\mathbf{v}$ is denoted by $\|\mathbf{v}\|$, whereas for a matrix $\mathbf{A}$, $\|\mathbf{A}\|_F$ denotes its Frobenius norm, $\|\mathbf{A}\|$ its spectral norm (i.e., its largest singular value) and, for $\mathbf{A}$ square, $\rho(\mathbf{A})$ is its spectral radius (the largest of the moduli of its eigenvalues). For an $n \times n$ symmetric matrix $\mathbf{S}$, $\mathrm{vec}\{\mathbf{S}\}$ is a vector of size $n(n+1)/2$ obtained by stacking the entries of the upper triangular part of $\mathbf{S}$. The composition of two functions $f$ and $g$ is denoted by $f \circ g$, so that $(f \circ g)(x) = f(g(x))$, and $\mathbb{E}\{\cdot\}$ denotes statistical expectation.
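
As an illustration of the $\mathrm{vec}\{\cdot\}$ convention, here is a minimal sketch in Python/NumPy; the helper name and the row-by-row stacking order are our assumptions, since the notation above does not fix an ordering.

```python
import numpy as np

def vec_upper(S):
    """Stack the upper-triangular entries (diagonal included) of a symmetric
    n x n matrix S into a vector of length n*(n+1)/2, row by row."""
    rows, cols = np.triu_indices(S.shape[0])
    return S[rows, cols]

S = np.array([[1., 2., 3.],
              [2., 4., 5.],
              [3., 5., 6.]])
print(vec_upper(S))   # 6 entries, since n*(n+1)/2 = 6 for n = 3
```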

Section snippets

Problem statement

We consider the problem of estimating a parameter vector $\mathbf{x} \in \mathbb{R}^{L \times 1}$ based on a set of $N \gg L$ independent observations given by $y_i = a_i \mathbf{h}_i^T \mathbf{x} + w_i$, $i = 1, \ldots, N$, where $\mathbf{h}_i = [h_i(1) \cdots h_i(L)]^T$ are assumed known $\forall i$, $\{w_i, \forall i\}$ are independent, identically distributed (i.i.d.) zero-mean Gaussian random variables with variance $\sigma^2$, modeling the observation noise, and $\{a_i, \forall i\}$ are i.i.d. Bernoulli random variables with $\Pr(a_i = 1) = p$, independent of $w_j$, $\forall \{i, j\}$. A value of $a_i = 1$ indicates that node $i$ has actually sensed the
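
As a minimal sketch of this observation model (Python/NumPy; variable names and numerical values are illustrative assumptions, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 100, 3                    # number of nodes and parameter dimension
p, sigma = 0.9, 0.5              # Pr(a_i = 1) and noise standard deviation

x = rng.standard_normal(L)
x /= np.linalg.norm(x)           # unit-norm parameter vector

H = rng.standard_normal((N, L))  # known regressors h_i (rows of H)
a = rng.binomial(1, p, size=N)   # a_i = 1: transducer i worked; a_i = 0: faulty
w = sigma * rng.standard_normal(N)
y = a * (H @ x) + w              # y_i = a_i * h_i^T x + w_i
```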

Centralized EM estimator

Starting from an initial estimate, the EM algorithm alternates between an E-step, where the expected log-likelihood function (LLF) of the observations is computed using the current estimates, and an M-step, where the parameters maximizing the expected LLF are obtained; under mild conditions, the EM will converge to a maximum, possibly local, of the LLF [20], [21]. Consider the observation vector in (2) with pdf given by (6). We regard y as the incomplete observation and {y, a} as the complete
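
Since the snippet above is truncated, the following is only a plausible sketch of one centralized EM iteration for this model, assuming the E-step computes the posterior probability that each transducer worked and the M-step solves a weighted least-squares problem while re-estimating $p$ and $\sigma^2$; the paper's actual update equations (13)–(17) may differ in their details.

```python
import numpy as np

def em_step(y, H, x_hat, p_hat, var_hat):
    """One hypothetical centralized EM iteration for y_i = a_i h_i^T x + w_i."""
    # E-step: posterior probability q_i = Pr(a_i = 1 | y_i) under the current estimates.
    # The Gaussian normalization constants cancel since both hypotheses share sigma^2.
    r = H @ x_hat
    log_ratio = (y ** 2 - (y - r) ** 2) / (2.0 * var_hat)       # log-likelihood ratio a_i=1 vs a_i=0
    q = 1.0 / (1.0 + (1.0 - p_hat) / p_hat * np.exp(-log_ratio))

    # M-step: weighted least squares for x, plus updates of p and sigma^2.
    x_new = np.linalg.solve((H * q[:, None]).T @ H, (H * q[:, None]).T @ y)
    p_new = q.mean()
    var_new = np.mean(q * (y - H @ x_new) ** 2 + (1.0 - q) * y ** 2)
    return x_new, p_new, var_new
```

Iterating em_step until the estimates stabilize would give the centralized benchmark that the distributed scheme of Section 4 aims to match.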

A diffusion-averaging distributed EM estimator

The proposed distributed implementation of the EM estimator hinges on the fact that in the centralized version the information from the different nodes is aggregated by means of averages, as can be seen in (13)–(17). This property is similar to that used in [38] for distributed computation of a Least Squares estimate. However, in contrast with [38], in our estimation problem not all of the quantities to be averaged are available at the nodes from the very beginning; rather, they depend on the
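
This snippet is also truncated; the code below is only a schematic illustration of the diffusion-averaging mechanism described in the Introduction, assuming each node keeps running estimates of network-wide averaged statistics, mixes a neighborhood average of them (weight $1-\alpha_k$) with a freshly recomputed local statistic (weight $\alpha_k$), and then re-solves its local problem. It is a sketch under these assumptions, not the paper's exact DA-DEM recursion.

```python
import numpy as np

def da_dem_step(S, b, W, H, y, q, alpha):
    """One schematic diffusion-averaging step on per-node statistics.

    S[i] : node i's estimate of the average of q_j * h_j h_j^T  (L x L)
    b[i] : node i's estimate of the average of q_j * y_j * h_j  (L,)
    W    : symmetric doubly-stochastic combination matrix (e.g., Metropolis)
    q    : current per-node posterior fault probabilities (local E-step)
    alpha: vanishing step size in (0, 1] controlling the diffusion term
    """
    N, L = H.shape
    S_loc = q[:, None, None] * np.einsum('il,im->ilm', H, H)   # q_i h_i h_i^T
    b_loc = q[:, None] * y[:, None] * H                        # q_i y_i h_i

    # consensus/averaging over neighbors plus diffusion of fresh local information
    S_new = (1.0 - alpha) * np.einsum('ij,jlm->ilm', W, S) + alpha * S_loc
    b_new = (1.0 - alpha) * (W @ b) + alpha * b_loc

    # each node updates its own parameter estimate from its current statistics
    eye = 1e-9 * np.eye(L)                                     # numerical safeguard
    x_new = np.stack([np.linalg.solve(S_new[i] + eye, b_new[i]) for i in range(N)])
    return S_new, b_new, x_new
```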

Local convergence analysis

We analyze now the convergence properties of the DA-DEM algorithm derived in Section 4. Recall from (33) that the step-size sequence $\alpha_k$ governs the diffusion/consensus process, gradually switching from one to the other as long as this sequence converges to zero. The use of vanishing step-sizes is common in stochastic approximation [46] and it is found also in consensus applications with noisy signals [41], [42]. In particular, we consider the following choice: $\alpha_k = \frac{\rho}{k + \rho - 1}$, $\rho > 0$, $k = 1, 2, \ldots$ Note that $\alpha_1 = 1$
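
For reference, with this choice the first step size equals one and the sequence decays like $1/k$, so it satisfies the vanishing-step-size conditions commonly invoked in stochastic approximation (whether these exact conditions are the ones used in the proof is our reading, not a statement from the snippet):

```latex
\alpha_k = \frac{\rho}{k+\rho-1}, \qquad
\alpha_1 = \frac{\rho}{1+\rho-1} = 1, \qquad
\sum_{k \ge 1} \alpha_k = \infty, \qquad
\sum_{k \ge 1} \alpha_k^2 < \infty .
```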

Simulation results

The theoretical results from Section 5 are supported here with computer simulations of a network composed of $N = 100$ nodes randomly deployed over a unit square with connectivity radius $r_c = 0.18$. The nodes sense a unit-norm parameter vector $\mathbf{x} \in \mathbb{R}^{L \times 1}$ with $L = 3$, randomly generated and fixed throughout the simulation. Each node has access to one measurement $y_i = a_i \mathbf{h}_i^T \mathbf{x} + w_i$, with $w_i \sim \mathcal{N}(0, \sigma^2)$; $\mathbf{x}$ is assumed sensed with probability $p = \{0.7, 0.9\}$ and $\mathbf{W}$ is taken as a Metropolis weight matrix [38]. In each run, the
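
A minimal sketch of this simulation setup (Python/NumPy), using the values quoted above for $N$ and $r_c$; the random-geometric-graph construction and the Metropolis rule $w_{ij} = 1/(1 + \max(d_i, d_j))$ follow the usual definitions and are our assumptions about the unspecified details:

```python
import numpy as np

rng = np.random.default_rng(1)
N, rc = 100, 0.18                               # number of nodes and connectivity radius

pos = rng.random((N, 2))                        # random deployment on the unit square
dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
adj = (dist <= rc) & ~np.eye(N, dtype=bool)     # undirected geometric graph (connectivity not checked)
deg = adj.sum(axis=1)

# Metropolis weights: w_ij = 1 / (1 + max(d_i, d_j)) for neighbors i != j,
# and w_ii = 1 - sum_j w_ij, giving a symmetric, doubly-stochastic matrix W.
W = np.where(adj, 1.0 / (1.0 + np.maximum(deg[:, None], deg[None, :])), 0.0)
W[np.diag_indices(N)] = 1.0 - W.sum(axis=1)
```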

Conclusion

We have proposed a diffusion-averaging distributed EM algorithm for estimation of a vector-valued parameter with a wireless sensor network in the presence of noisy observations and potentially faulty transducers. The DA-DEM recursion includes an initial period during which the information diffusion process is gradually switched off while an information averaging process is gradually switched on. The switching mechanism is controlled by a proper choice of vanishing step-size

Acknowledgments

This work was supported by the Ministerio de Economía y Competitividad of the Spanish Government, ERDF funds [TEC2013-41315-R, TEC2015-69648-REDC, TEC2016-75067-C4-2-R, TEC2013-47020-C2-1-R, TEC2016-76409-C2-2-R]; and the Galician Government [Agrupación Estratéxica Consolidada de Galicia accreditation 2016–2019, Red Temática RedTEIC 2017-2018].

References (46)

  • A.S. Willsky, A survey of design methods for failure detection in dynamic systems, Automatica (1976).
  • F. Zhao, L. Guibas, Wireless sensor networks: an information processing approach, Morgan Kaufmann, San Mateo, CA,...
  • I.F. Akyildiz et al., Wireless Sensor Networks (2010).
  • G. Dimakis et al., Gossip algorithms for distributed signal processing, Proc. IEEE (2010).
  • R. Olfati-Saber et al., Consensus and cooperation in networked multi-agent systems, Proc. IEEE (2007).
  • J.-J. Xiao et al., Distributed compression-estimation using wireless sensor networks, IEEE Signal Process. Mag. (2006).
  • S. Boyd et al., Randomized gossip algorithms, IEEE Trans. Inf. Theory (2006).
  • S. Barbarossa et al., Decentralized maximum-likelihood estimation for sensor networks composed of nonlinearly coupled dynamical systems, IEEE Trans. Signal Process. (2007).
  • T. Zhao et al., Information-driven distributed maximum likelihood estimation based on Gauss-Newton method in wireless sensor networks, IEEE Trans. Signal Process. (2007).
  • I.D. Schizas et al., Consensus in ad hoc WSNs with noisy links - part I: distributed estimation of deterministic signals, IEEE Trans. Signal Process. (2008).
  • S.S. Stanković et al., Decentralized parameter estimation by consensus based stochastic approximation, IEEE Trans. Autom. Control (2011).
  • A.H. Sayed, Diffusion adaptation over networks.
  • Y. Zhang et al., Detection and diagnosis of sensor and actuator failures using IMM estimator, IEEE Trans. Aerosp. Electron. Syst. (1998).
  • K. Ni, Sensor network data fault types, ACM Trans. Sensor Netw. (2009).
  • T.-Y. Wang et al., A collaborative sensor-fault detection scheme for robust distributed estimation in sensor networks, IEEE Trans. Commun. (2009).
  • A. Mahapatro et al., Fault diagnosis in wireless sensor networks: a survey, IEEE Commun. Surv. Tuts. (2013).
  • W. Li et al., Defective sensor identification for WSNs involving generic local outlier detection tests, IEEE Trans. Signal Inf. Process. Netw. (2016).
  • Q. Zhou et al., Distributed estimation in sensor networks with imperfect model information: an adaptive learning-based approach, Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (2012).
  • G. Bianchin et al., Distributed fault detection in sensor networks via clustering and consensus, Proc. IEEE Conf. Decis. Control (2015).
  • A.P. Dempster et al., Maximum likelihood from incomplete data via the EM algorithm, J. R. Stat. Soc. Ser. B (1977).
  • T.K. Moon, The expectation-maximization algorithm, IEEE Signal Process. Mag. (1996).
  • R.D. Nowak, Distributed EM algorithms for density estimation and clustering in sensor networks, IEEE Trans. Signal Process. (2003).
  • J. Wolfe et al., Fully distributed EM for very large datasets, Proc. Int. Conf. Machine Learning (2008).