Signal Processing

Volume 108, March 2015, Pages 576-588

Slope estimation in noisy piecewise linear functions

https://doi.org/10.1016/j.sigpro.2014.10.003

Highlights

  • A hidden Markov model formulation for estimating slopes in a noisy piecewise linear function is proposed.

  • A computationally tractable dynamic programming routine on a linearly growing trellis for fast maximum a posteriori estimation is developed.

  • Optimality analysis of this routine via simulations and a comparison with reasonable upper and lower bounds is presented.

  • An alternating maximization algorithm to estimate unknown model parameters from data is also derived.

  • Results show that the algorithm is general enough to encompass different applications.

Abstract

This paper discusses the development of a slope estimation algorithm called MAPSlope for piecewise linear data that is corrupted by Gaussian noise. The number and locations of slope change points (also known as breakpoints) are assumed to be unknown a priori, though the possible range of slope values is assumed to lie within known bounds. A stochastic hidden Markov model that is general enough to encompass real world sources of piecewise linear data is used to model the transitions between slope values, and the problem of slope estimation is addressed using a Bayesian maximum a posteriori approach. The set of possible slope values is discretized, enabling the design of a dynamic programming algorithm for posterior density maximization. Numerical simulations are used to justify the choice of a reasonable number of quantization levels and to analyze the mean squared error performance of the proposed algorithm. An alternating maximization algorithm is proposed for estimating the unknown model parameters, and a convergence result for the method is provided. Finally, results using data from political science, finance and medical imaging applications are presented to demonstrate the practical utility of this procedure.

Introduction

The need for piecewise linear regression arises in many different fields, as diverse as biology, geology, and the social sciences. This paper addresses the problem of direct estimation of slopes from piecewise linear data. An important application of interest for this paper is ultrasound shear wave elastography, where ultrasonic echoes are used to track the motion of an externally generated mechanical shear wave pulse traveling through multiple tissue interfaces [19]. The time of arrival of this shear wave pulse is recorded as a function of spatial coordinates in the ultrasound imaging plane and the reciprocal of the slope of this function gives an estimate of the speed of the wave. Breakpoints (where the slope changes) indicate tissue interfaces. These estimates are useful from a clinical perspective because they provide a way to quantify mechanical properties of tissue, thereby adding value to subjective judgments about the location and size of cancerous tumors.

A similar issue arises on larger spatial scales in seismology, where the time of arrival of seismic waves is tracked at different locations relative to the epicenter of an earthquake. The velocity of these waves provides information about the mechanical properties of the geological medium. Piecewise linear data also occurs in the study of the flow of soil through water streams, where it is referred to as bedload data [20].

Assume that the piecewise linear data is generated by the following discrete time hidden Markov model (HMM). The underlying (unknown) function takes on values $Z_n$ at each discrete index $1 \le n \le N$. This function value is obtained by accumulating the slope values $S_k$ up to the time index $n$. Zero mean Gaussian noise with variance $\sigma^2$ is added to each running sum, resulting in the observed function value $X_n$. Also, suppose that for any $n$, the probability of maintaining the previous slope value is $p$ and the probability of transitioning into a new slope value is $1-p$. These relations can be written mathematically as follows:
$$Z_0 = 0 \text{ with probability } 1, \qquad Z_n = Z_{n-1} + S_n, \qquad X_n = Z_n + w_n \quad \text{for } n = 1, \ldots, N,$$
where $w_n \stackrel{\text{iid}}{\sim} \mathcal{N}(0, \sigma^2)$. A Markov structure is imposed on the slope values as follows:
$$S_n = \begin{cases} S_{n-1} & \text{with probability } p \\ U_n & \text{with probability } 1-p \end{cases} \qquad \text{for } n = 2, \ldots, N,$$
where $U_n \sim \mathcal{U}\left(\{0, 1/(M-1), \ldots, (M-2)/(M-1), 1\} \setminus \{S_{n-1}\}\right)$ denotes a discrete uniform random variable taking on one of the $M-1$ remaining slope values, and the initial slope value is drawn uniformly as $S_1 \sim \mathcal{U}(\{0, 1/(M-1), \ldots, (M-2)/(M-1), 1\})$.
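For concreteness, the generative model above can be simulated in a few lines; the following Python sketch is illustrative only (the function and parameter names simulate_piecewise_linear, num_samples and num_levels are not from the paper, whose reference implementation is the MATLAB code cited in the footnote).

```python
import numpy as np

def simulate_piecewise_linear(num_samples, num_levels, p, sigma, seed=None):
    """Draw one realization (x, s) from the slope HMM described above.

    Slopes live on the grid {0, 1/(M-1), ..., 1}; with probability p the
    previous slope is kept, otherwise a new slope is drawn uniformly from
    the remaining M-1 grid values. Observations are the running sum of the
    slopes plus i.i.d. Gaussian noise of standard deviation sigma.
    """
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, num_levels)           # admissible slope values
    s = np.empty(num_samples)
    s[0] = rng.choice(grid)                            # S_1 uniform on the grid
    for n in range(1, num_samples):
        if rng.random() < p:
            s[n] = s[n - 1]                            # keep the previous slope
        else:
            s[n] = rng.choice(grid[grid != s[n - 1]])  # U_n uniform on the rest
    z = np.cumsum(s)                                   # Z_n = Z_{n-1} + S_n, Z_0 = 0
    x = z + rng.normal(0.0, sigma, num_samples)        # X_n = Z_n + w_n
    return x, s

# Example: 200 samples, M = 11 slope levels, persistence p = 0.95, noise 0.5.
x, s = simulate_piecewise_linear(200, 11, 0.95, 0.5, seed=0)
```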

Another implicit assumption is that the slopes take on values in a closed bounded interval $[s_l, s_u]$ with upper and lower limits $0 < s_l < s_u < \infty$ known a priori. For instance, in the ultrasound-based wave tracking application, the values of $s_l$ and $s_u$ can be obtained from the underlying physics, which dictates that such mechanical waves travel with speeds between 0.5 and 10 m/s in homogeneous tissue. With the knowledge of $s_l$ and $s_u$, the given data vector can be translated and rescaled so that all slope values lie in the interval $[0,1]$. Hence, without loss of generality, it suffices to design a slope estimation algorithm that operates with a finite set of slopes $S = \{0, 1/(M-1), \ldots, (M-2)/(M-1), 1\}$. Intuitively, this quantization step is justified because in the presence of noise it is impossible to detect the difference between slope values that differ only slightly.
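A minimal sketch of this translate-and-rescale step, assuming unit sample spacing and using hypothetical helper names (the first sample is used here as a crude estimate of the unknown offset):

```python
import numpy as np

def rescale_to_unit_slopes(x, s_lower, s_upper):
    """Shift and scale data whose local slopes lie in [s_lower, s_upper]
    (unit sample spacing) so that the slopes of the result lie in [0, 1]."""
    n = np.arange(len(x))
    # Subtract a ramp of slope s_lower, then divide by the slope range.
    return (x - x[0] - s_lower * n) / (s_upper - s_lower)

def slopes_to_physical_units(s_unit, s_lower, s_upper):
    """Map slope estimates from the [0, 1] scale back to physical units."""
    return s_lower + (s_upper - s_lower) * np.asarray(s_unit)
```

Any affine normalization that maps the slope range $[s_l, s_u]$ onto $[0,1]$ serves the same purpose; the inverse map recovers slope estimates in physical units after the algorithm has run.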

The main contributions of this paper are as follows:

  • (a)

    a hidden Markov model formulation of the slope estimation problem that is general enough to encompass different applications;

  • (b)

    a procedure for maximum a posteriori (MAP) estimation of slopes from this Markov model;

  • (c)

    a dynamic programming routine on a linearly growing trellis for fast MAP estimation;

  • (d)

    a mean squared error (MSE) optimality analysis of this routine via simulations and a comparison with reasonable upper and lower bounds;

  • (e)

    an alternating maximization algorithm that alternately maximizes an objective function with respect to the unknown sequence of slope values and unknown model parameters to jointly estimate both of them from data;

  • (f)

    a comparison of the performance of this algorithm with other methods in literature applied to real world data.

In many real world applications, the local slope values of an observed noisy function have interesting physical interpretations. Most of the existing methods do not directly address slope estimation; rather, they attempt to fit a model to the data. For instance, standard regression or spline-based methods can be used to fit a smooth function to the data and local slopes can be estimated from this fit. However, even if the function-fitting algorithm generates optimal fits (according to a cost function such as the minimum MSE), there is no guarantee that the local slope estimates obtained from this fit are themselves optimal. This paper bypasses the need for such post-processing by directly estimating the slopes and breakpoints. This is particularly useful when the slopes correspond directly to the variables of interest and the breakpoints correspond to where those variables change.

The topic of slope estimation from noisy data is quite old; an early paper can be traced back to 1964, when the popular Savitzky–Golay differentiator [10] was introduced. The main idea is to use a locally windowed least squares fit to estimate the slope at each data sample, where the window coefficients are chosen to satisfy a certain frequency response that mimics a high pass filter together with some level of noise averaging. Another similar technique used in statistics is locally weighted least squares regression (LOWESS) [22]. However, these methods undesirably smooth out the breakpoint locations when the data has sharp transitions or jumps. In contrast, the algorithm in the present paper avoids blurring the transitions by explicitly allowing for sharp slope changes through a Markov model.
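As a point of comparison (not part of the paper's method), a Savitzky–Golay derivative baseline can be obtained with a single library call; the window length and polynomial order below are arbitrary illustrative choices, and the resulting slope estimates are visibly smeared across the breakpoint.

```python
import numpy as np
from scipy.signal import savgol_filter

# Noisy piecewise linear signal with one breakpoint at n = 100 (unit spacing).
rng = np.random.default_rng(0)
n = np.arange(200)
z = np.where(n < 100, 0.2 * n, 0.2 * 100 + 0.8 * (n - 100))
x = z + rng.normal(0.0, 0.5, n.size)

# deriv=1 with delta=1.0 returns a local slope estimate at each sample; the
# 21-sample least squares window blurs the slope change over roughly 21 samples.
slope_sg = savgol_filter(x, window_length=21, polyorder=2, deriv=1, delta=1.0)
```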

In some situations, the raw data can be transformed by a preprocessing step so that it becomes piecewise linear. The simplest example is the case of piecewise constant data: the running sum (discrete integral) of such data yields a piecewise linear function. Ratkovic and Eng [21] discuss a statistical spline fitting approach combined with the Bayesian information criterion (BIC) to detect abrupt transitions in political approval ratings. Data from their paper is used in Section 6.1. As a special case, their method can be applied when function values stay almost constant over long intervals and occasionally shift to a new value. In another application, Bai and Perron [16] use statistical regression techniques to detect multiple regime shifts in interest rate data. The algorithm developed in the present paper provides numerical performance comparable to that of the Bai–Perron algorithm, as shown in Section 6.2.

Closely related problems of piecewise linear regression for noisy data have been addressed over the years. For example, an early paper by Hudson [1] focuses on a technique to obtain piecewise linear fits in a least-squares sense with only two segments. The breakpoint location is included as a parameter in the least squares optimization problem. The method is extended to multiple breakpoint locations on a case-by-case basis, which becomes combinatorially intractable as the number of breakpoints increases. Bellman [2] suggests a dynamic programming approach for the case when the number of breakpoints is unknown. However, this method requires a large number of grid points for accurate results. Gallant and Fuller [3] generalize this to the fitting of arbitrary polynomials with unknown breakpoints while requiring the function to be composed of segments with continuous first derivatives. They apply a nonlinear optimization routine (Gauss–Newton minimization) to fine-tune breakpoint locations while minimizing the squared error relative to the data. Another non-parametric approach involves the use of edge-preserving penalized optimization such as total variation minimization [4]. Denison et al. [5] use a Markov chain Monte Carlo approach to fit piecewise polynomials with different numbers and locations of knot points. Tishler and Zang [6] discuss a maximum likelihood approach to fit a convex piecewise linear function expressed as the point-wise maximum of a collection of affine functions with unknown coefficients. Maximum likelihood estimates are obtained by running a constrained optimization routine on a smoothed approximation of an MSE cost function to bypass non-differentiability issues. A similar data model coupled with data clustering heuristics is utilized in a more recent paper by Magnani and Boyd [7] on fitting convex piecewise linear functions.

The use of adaptive methods is an attractive way of handling the issue of an unknown number of breakpoints. One of the first algorithms using this technique was proposed by Friedman [8] under the name "multivariate adaptive regression splines (MARS)." Recursive partitioning is used to obtain better partitions of the set of data points at each iteration. Either a goodness of fit criterion or generalized cross validation is used to estimate the number of partitions. Along similar lines, Kolaczyk and Nowak [9] apply the method of recursive dyadic partitioning and fit a smooth function in each partition using maximum likelihood estimation. A penalty term for the number of partitions is introduced to trade off model complexity and quality of fit. In more recent work, Saucier and Audet [11] propose a different class of adaptively constructed basis functions that can capture the transition points in otherwise piecewise smooth functions.

In [15] Bai and Perron discuss the problem of detecting structural changes in data without requiring the estimated function to be piecewise linear or even continuous. Their related paper [16] discusses a dynamic programming approach to obtain a least sum of squares fit. The model order is determined by using the Akaike information criterion (AIC) [18] and they impose a minimum limit on the “run length” of each segment in the piecewise model. In contrast, the present paper proposes a dynamic program that generates optimum MAP estimates based on a stochastic finite state HMM.

In the signal processing literature, two kinds of paradigms have been applied to this problem: Bayesian estimation and pattern recognition approaches. Punskaya et al. [23] model the function using the number and locations of the breakpoints as free parameters with certain prior distributions. The posterior density of the parameters conditioned on the noisy data is estimated through Monte Carlo techniques. In response to this method, Fearnhead [24] proposes a direct method for estimating the parameters of the same model without resorting to Monte Carlo simulations, instead exploiting a Markov property of the model that allows calculation of the probability of future data points conditioned on the most recent breakpoint location. In the present paper, a Markov structure is imposed on the underlying slope values, which are chosen from a finite set, and the MAP algorithm estimates these slopes at each data sample.

The rest of this paper is organized as follows. The problem statement is discussed further in Section 2. A computationally tractable algorithm that uses the principle of dynamic programming is presented in Section 3. The issue of automatic selection of model parameters from data is addressed in Section 4. The problem of choosing the right number of quantization levels is analyzed through simulations and MSE distortion bounds in Section 5. A series of diverse applications are presented in Section 6, followed by some closing remarks in Section 7.

The notation $v_{i:j}$ is hereafter used to denote a vector $(v_i, v_{i+1}, \ldots, v_j)$ and the model parameters are denoted by $\theta = (\sigma^2, p)$.

Section snippets

Problem statement and stochastic formulation

A pictorial representation of the relationships between the various random variables of the piecewise linear data model for this paper is shown in Fig. 1. It can be seen that these relationships lead to an HMM with two hidden layers. Moreover, the cardinality of the state space of each of the random variables $Z_n$ increases with $n$. The next two subsections show why standard inference and parameter learning algorithms for HMMs cannot be directly applied to this problem.

Maximum a posteriori estimation of slope values

The maximum a posteriori slope sequence can be obtained by maximizing (4) as a function of the slope sequence, which is equivalent to the following optimization problem:
$$\hat{s}_{1:N} = \arg\max_{s_{1:N}} \left( -\frac{1}{2\sigma^2} \sum_{n=1}^{N} (x_n - z_n)^2 + \sum_{n=2}^{N} \log\left[ p\,\chi_{\{0\}}(s_n - s_{n-1}) + \frac{1-p}{M-1}\,\chi_{\{0\}^c}(s_n - s_{n-1}) \right] \right). \tag{5}$$
The term containing $p(x_{1:N} \mid \theta)$ is dropped because it does not depend on $s_{1:N}$. Besides producing the MAP optimal solution, the objective function has certain intuitively appealing characteristics. The first summation on the right-hand side of (5) is
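For illustration, the objective in (5) can be evaluated directly for any candidate slope sequence; the sketch below (with the hypothetical name map_objective) only computes this score and does not perform the trellis maximization described in Section 3.

```python
import numpy as np

def map_objective(s, x, p, sigma2, num_levels):
    """Evaluate the log-posterior objective of (5), up to additive constants,
    for a candidate slope sequence s on the M-level grid, given data x."""
    s = np.asarray(s, dtype=float)
    x = np.asarray(x, dtype=float)
    z = np.cumsum(s)                                   # implied noiseless signal z_n
    data_term = -np.sum((x - z) ** 2) / (2.0 * sigma2)
    stay = s[1:] == s[:-1]                             # indicator of s_n == s_{n-1}
    prior_term = np.sum(np.where(stay,
                                 np.log(p),
                                 np.log((1.0 - p) / (num_levels - 1))))
    return data_term + prior_term
```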

Alternating maximization algorithm for estimating model parameters

The following result provides a useful alternative re-estimation procedure that bypasses the computational issues of the full EM approach. It nevertheless retains a key advantage of an EM algorithm in that it increases the likelihood at each iteration. Under an additional minor technical assumption, it can also be shown that the procedure converges.

Theorem 2

Let $s_{1:N}$ be the current slope sequence estimate and $s'_{1:N}$ be the new sequence estimate obtained by running FastTrellis with the current model parameter
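The flavor of the alternating updates can be sketched as follows. Given a fixed slope sequence, maximizing the joint objective over $\theta = (\sigma^2, p)$ yields closed-form updates (the mean squared residual for $\sigma^2$, the fraction of slope-preserving transitions for $p$); the code below is an assumption-laden illustration, and fast_trellis_map is a placeholder for the MAP decoder of Section 3 rather than the paper's actual FastTrellis routine.

```python
import numpy as np

def reestimate_parameters(s, x):
    """Closed-form parameter update for a fixed slope sequence s and data x:
    sigma2 is the mean squared residual of the implied fit, and p is the
    fraction of transitions that keep the previous slope value."""
    s = np.asarray(s, dtype=float)
    z = np.cumsum(s)
    sigma2 = np.mean((np.asarray(x, dtype=float) - z) ** 2)
    p = np.clip(np.mean(s[1:] == s[:-1]), 1e-3, 1.0 - 1e-3)  # guard against log(0)
    return sigma2, p

def alternating_maximization(x, num_levels, fast_trellis_map,
                             sigma2=1.0, p=0.9, num_iters=20):
    """Alternate between MAP slope decoding (parameters fixed) and
    closed-form parameter re-estimation (slope sequence fixed)."""
    for _ in range(num_iters):
        s = fast_trellis_map(x, num_levels, p, sigma2)   # placeholder decoder
        sigma2, p = reestimate_parameters(s, x)
    return s, (sigma2, p)
```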

Mean squared error optimality analysis and model order selection

The aim of this section is to propose an empirical method for selecting the right number of quantization levels M for the set of slope values. Simple arguments from source coding theory are used to obtain upper and lower bounds for the MSE performance of the algorithm. This method can be applied directly to the sequence of estimated slope values.

Assume that the slope values originate from a continuous amplitude Markov source with amplitude levels in the interval [0,1]. The goal is to
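As a generic illustration of the source-coding-style argument (not necessarily the exact bounds derived in the paper), the distortion introduced by quantizing continuous slopes in $[0,1]$ to an $M$-level grid can be compared against the classical uniform-quantizer figure $\Delta^2/12$ with $\Delta = 1/(M-1)$:

```python
import numpy as np

def quantize_slopes(s_true, num_levels):
    """Round continuous slopes in [0, 1] to the nearest of the M grid values."""
    grid = np.linspace(0.0, 1.0, num_levels)
    idx = np.argmin(np.abs(np.asarray(s_true)[:, None] - grid), axis=1)
    return grid[idx]

# Quantization error alone (no observation noise) for a uniform slope source.
rng = np.random.default_rng(0)
s_true = rng.uniform(0.0, 1.0, 100_000)
for M in (5, 9, 17, 33):
    delta = 1.0 / (M - 1)
    mse = np.mean((quantize_slopes(s_true, M) - s_true) ** 2)
    print(f"M={M:3d}  empirical MSE={mse:.2e}  delta^2/12={delta**2 / 12:.2e}")
```

Halving the grid spacing reduces this distortion floor by roughly a factor of four; trading this off against the noise-induced estimation error is what governs the choice of $M$.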

Presidential approval ratings

United States presidential job approval ratings have been published by various agencies over the years. These ratings are usually quoted as percentage values that are computed from the results of opinion polls administered to a sample of the country's population. An analysis of the approval ratings for President George W. Bush is presented by Ratkovic and Eng [21]. The result of applying the MAPSlope algorithm to a snippet from the same dataset is shown in Fig. 6. A sudden transition in the

Conclusion

This paper presented the MAPSlope method for estimating slopes from noisy piecewise linear data which applies techniques from Bayesian estimation. An alternating maximization routine was presented for estimating unknown model parameters, and its convergence properties were analyzed theoretically. Further, MSE performance of this algorithm was tested using simulated data for different parameter values. The MSE performance was compared with suitable upper and lower bounds, and it was shown that

References (44)

  • A. Tishler et al., A new maximum likelihood algorithm for piecewise regression, J. Am. Stat. Assoc. (1981)
  • A. Magnani et al., Convex piecewise-linear fitting, Optim. Eng. (2009)
  • J. Friedman, Multivariate adaptive regression splines, Ann. Stat. (1991)
  • E. Kolaczyk et al., Multiscale generalized linear models for nonparametric function estimation, Biometrika (2005)
  • A. Savitzky et al., Smoothing and differentiation of data by simplified least squares procedures, Anal. Chem. (1964)
  • L. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proc. IEEE (1989)
  • R. Garcia et al., An analysis of the real interest rate under regime shifts, Rev. Econ. Stat. (1996)
  • J. Hamilton, A new approach to the economic analysis of nonstationary time series and the business cycle, Econometrica (1989)
  • J. Bai et al., Estimating and testing linear models with multiple structural changes, Econometrica (1998)
  • J. Bai et al., Computation and analysis of multiple structural change models, J. Appl. Econ. (2003)
  • T.-J. Yen, A majorization–minimization approach to variable selection using spike and slab priors, Ann. Stat. (2011)
  • H. Akaike, A new look at the statistical model identification, IEEE Trans. Autom. Control (1974)

This work was supported in part by NIH-NCI Grants R01CA112192-S103 and R01CA112192-05.

1. Sample MATLAB code is available on the author's website at http://homepages.cae.wisc.edu/~ingle/mapslope.html.
