Deconvolution of pulse trains with the L0 penalty
Highlights
- Deconvolution of pulse trains is performed using penalized regression.
- We propose the use of an L0 penalty and compare it with the more common L2 and L1 penalties.
- The model is extended with a smooth component to handle drifting baselines.
Introduction
Many instruments produce signals that consist of a series of pulses. Examples are electrophoretic DNA sequencers, chromatographs, and spectrometers. Some biological signals have the same characteristics; an example is hormone release in the human body. The pulses have (more or less) equal shapes but different heights, and they may overlap. These output signals are the convolution of a series of true (input) spikes or diracs and the impulse response function. The task is to deduce the heights and positions of the spikes from the output signal.
Essentially there are two ways to approach this issue. The first is to search for local maxima to find peak positions, followed by summarizing the signal in their neighborhoods to estimate peak heights. Examples of this approach are found in many places in the literature; we mention only a small selection. Yasui et al. [27] search for zeros of the first derivative, while Mariscotti [14] uses the second derivative. When computing derivatives, it is essential that proper noise filtering is applied first. Wavelets have been proposed as a filtering tool in this setting by Coombes et al. [6], but other filters are also possible. Du et al. [7] use a wavelet spectrum to locate peaks. It is also possible to apply a discrete Markov chain, as done by Silagadze [22] and Morháč [16]; these approaches result in a probability distribution for the locations of the peaks.
The second approach is to model signals as a convolution of a series of sharp input spikes and a constant impulse response. The task then is to estimate the input from the observed output signal. This is the deconvolution problem that has been studied in many fields of science. It is a so-called inverse problem, and it is generally very badly conditioned, which means that small changes in the observed signal or the impulse response lead to large changes in the estimated input. Conversely, many very different inputs are compatible with the observed output.
To address the ill-conditioning, various deconvolution algorithms have been proposed. An early solution is the van Cittert algorithm (see e.g. [11]), which was later improved in the form of the Gold algorithm (see e.g. [1]). Other iterative approaches are often based on the Richardson–Lucy algorithm or on the general class of expectation maximisation (EM) algorithms (see e.g. [13], [24]). The EM algorithm iteratively redistributes the observed output, proportionally to the current estimate of the input. Averaging gives an improved estimate of the input, to be used in the next iteration.
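The multiplicative redistribution idea behind the Richardson–Lucy/EM iteration can be sketched in a few lines of NumPy. This is a minimal illustration on toy data, with our own function names and settings, not the implementation of any cited paper; it assumes a known, non-negative impulse response:

```python
import numpy as np

def richardson_lucy(y, c, n_iter=200):
    """Minimal Richardson-Lucy deconvolution sketch.

    y : observed non-negative output signal
    c : non-negative impulse response (normalized internally)
    Returns an estimate of the input x, the same length as y.
    """
    c = c / c.sum()                        # normalize the impulse response
    c_mirror = c[::-1]                     # flipped kernel for the backward step
    x = np.full_like(y, y.mean())          # flat, positive initial estimate
    for _ in range(n_iter):
        blurred = np.convolve(x, c, mode="same")
        ratio = y / np.maximum(blurred, 1e-12)   # redistribute observed output
        x = x * np.convolve(ratio, c_mirror, mode="same")
    return x

# Toy example: two spikes blurred by a bell-shaped impulse response
t = np.arange(-10, 11)
c = np.exp(-0.5 * (t / 2.0) ** 2)
x_true = np.zeros(100)
x_true[30], x_true[60] = 5.0, 3.0
y = np.convolve(x_true, c / c.sum(), mode="same")
x_hat = richardson_lucy(y, c)
```

Because every update is multiplicative, a non-negative starting estimate stays non-negative, which suits pulse trains; the iteration gradually concentrates the blurred mass back onto the spike locations.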
The class of deconvolution algorithms also contains a branch of boosting algorithms. Cardot et al. [3] propose to use boosting to find an optimal set of input spikes. First, the locations of the peaks are estimated; these are then refined in an updating step, and in a third stage peaks that lie too close to each other are merged into one. Recently Morháč and Matoušek [18] proposed a boosted version of the Gold and Richardson–Lucy algorithms.
A general approach to ill-conditioned problems is the use of regularization: some form of penalty is imposed on the parameters of the model. We already referred to Li and Speed [13]. A familiar example in the chemometric literature is ridge regression [10], where the penalty is on the sum of the squares of the regression coefficients. This is called the L2 norm; generally the sum of absolute values (of the elements of a vector) to the power p is called the Lp norm. In recent years the Lasso [23], a penalty based on the L1 norm, the sum of absolute values, has become popular in many applications. References can be found in the next section.
Penalties with a norm based on p < 1 have received little attention. A main theoretical obstacle has been the fact that such penalties lead to a non-convex optimization problem, in contrast to penalties with p ≥ 1. Hence one cannot be sure of having found a global minimum. Another drawback is the lack of good practical algorithms. In this paper we propose regularized deconvolution using the L0 penalty, and we show very good results using an algorithm based on repeated weighted regression. Apparently, in the limited context of pulse train deconvolution, a non-convex objective function is not a real problem.
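The repeated-weighted-regression idea can be illustrated as follows. This is our own minimal sketch, not the paper's implementation: the function names and the values of λ and β are illustrative. The L0 penalty is approximated by a weighted L2 (ridge) penalty whose weights w_j = 1/(x_j² + β²) are refreshed from the previous iterate, so that λ Σ w_j x_j² approaches λ‖x‖₀ as β → 0:

```python
import numpy as np

def convmat(c, m):
    """m x m causal convolution matrix: column j holds c starting at row j."""
    C = np.zeros((m, m))
    for j in range(m):
        k = min(len(c), m - j)
        C[j:j + k, j] = c[:k]
    return C

def l0_deconvolve(y, c, lam=1.0, beta=1e-3, n_iter=30):
    """Approximate L0-penalized deconvolution by repeated weighted regression."""
    m = len(y)
    C = convmat(c, m)
    CtC, Cty = C.T @ C, C.T @ y
    x = np.linalg.solve(CtC + lam * np.eye(m), Cty)   # plain ridge start
    for _ in range(n_iter):
        w = 1.0 / (x ** 2 + beta ** 2)                # adaptive weights
        x = np.linalg.solve(CtC + lam * np.diag(w), Cty)
    return x

# Toy example: a noiseless pulse train with two spikes
m = 100
c = np.exp(-0.5 * ((np.arange(15) - 7) / 2.0) ** 2)   # bell-shaped response
x_true = np.zeros(m)
x_true[30], x_true[60] = 5.0, 3.0
y = convmat(c, m) @ x_true
x_hat = l0_deconvolve(y, c)
```

Each iteration is an ordinary (weighted) ridge regression, so no special non-convex solver is needed; coefficients that drift toward zero receive ever larger weights and are driven to (numerical) zero, while large coefficients are barely shrunk.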
In the next section we introduce the deconvolution framework, and we show the effects of regularization with different norms. There we assume that the impulse response is known. In practice only an approximation will be available, so we also consider “blind deconvolution”: the estimation of both input and impulse response from one signal. Drifting baselines are quite common; we present two ways to handle them.
In Section 3 we present three applications to experimental data. Two of them are instrumental (electrophoretic DNA sequencing and gas chromatography); the third is a series of high-frequency measurements of concentrations of luteinizing hormone in human blood, which show strong pulsatile behavior.
In the final section we discuss possible extensions and refinements.
Convolution and deconvolution
Consider a (causal) discrete linear system with an input signal x and an impulse response (or spread function) c, which incorporates blurring and filtering effects. Actually observed is the output signal y of length m, plus noise e. Using the superposition principle, y can be described by

y_i = ∑_j c_j x_{i−j} + e_i.

Details can be found in many books on linear system theory; see e.g. Brown [2]. Interpreting i as indexing time, this shows that y_i is a weighted sum of the present and all previous inputs.
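The superposition principle is easy to verify numerically: each input spike contributes a scaled, shifted copy of the impulse response to the output. The spike positions and the impulse response below are illustrative toy values:

```python
import numpy as np

# Forward model: the observed output y is the convolution of the input
# spike train x with the impulse response c (noise e omitted here):
#     y_i = sum_j c_j * x_{i-j}
x = np.zeros(50)
x[10], x[25] = 2.0, 1.0                    # two input spikes
c = np.array([0.1, 0.5, 1.0, 0.5, 0.1])    # illustrative impulse response
y = np.convolve(x, c)[:len(x)]             # causal output of length m = 50

# By superposition, y[10:15] equals 2.0 * c and y[25:30] equals 1.0 * c,
# because the two (non-overlapping) pulses add independently.
```

When the spikes lie closer together than the width of c, the pulses overlap and simply add, which is exactly what makes reading off heights and positions from y non-trivial.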
Applications
In this section the algorithm is applied to three different datasets. The first example comes from DNA sequencing by electrophoresis. The original data consists of four channels, one for each of the four nucleotides; here we analyse only one channel. In the upper panel of Fig. 8 the signal is plotted. The middle panel shows a number of detected peaks, all of which coincide with the largest peaks present in the data; only a few smaller bumps are not marked as peaks. We used blind deconvolution. In the lower …
Discussion
Penalized regression is a popular choice to resolve identification issues in the case of ill-posed inverse problems. We have shown that for deconvolution of sparse input signals the L2 penalty is not useful. The L1 penalty gives decent results, but not as sparse as we would like to see, and deviations from the true peak height occur. The L0 penalty shows superior performance: a very sparse result, close to the true input values.
The iterative weighted least squares algorithm works well, on …
Acknowledgements
We thank Claire Boucon (Unilever Research, Vlaardingen) for kindly providing the chromatogram data, as well as Daniël Vis (University of Amsterdam) and Hanno Pijl (Leiden University Medical Center) for kindly providing the hormone data.
References
- et al., Study of the van Cittert and Gold iterative methods of deconvolution and their application in the deconvolution of experimental spectra of positron annihilation, Nuclear Instruments and Methods in Physics Research A (1997)
- A method for automatic identification of peaks in the presence of background and its application to spectrum analysis, Nuclear Instruments and Methods (1967)
- Multidimensional peak searching algorithm for low-statistics nuclear spectra, Nuclear Instruments and Methods in Physics Research A (2007)
- An algorithm for determination of peak regions and baseline elimination in spectroscopic data, Nuclear Instruments and Methods in Physics Research A (2009)
- et al., Baseline subtraction using robust local regression estimation, Journal of Quantitative Spectroscopy and Radiative Transfer (2001)
- et al., SNIP, a statistics-sensitive background treatment for the quantitative analysis of pixel spectra in geoscience applications, Nuclear Instruments and Methods in Physics Research Section B: Beam Interactions with Materials and Atoms (1988)
- A new algorithm for automatic photopeak searches, Nuclear Instruments and Methods in Physics Research A (1996)
- Engineering System Dynamics (2006)
- et al., Boosting diracs for electrophoresis, Journal of Computational and Graphical Statistics (2004)
- et al., Atomic decomposition by basis pursuit, SIAM Journal on Scientific Computing (1998)
- Quality control and peak finding for proteomics data collected from nipple aspirate fluid by surface-enhanced laser desorption and ionization, Clinical Chemistry
- Improved peak detection and quantification of mass spectrometry data acquired from surface-enhanced laser desorption and ionization by denoising spectra with the undecimated discrete wavelet transform, Proteomics
- Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching, Bioinformatics