doi:10.1016/j.csda.2006.11.037
Copyright © 2006 Elsevier B.V. All rights reserved.
Visualization and inference based on wavelet coefficients, SiZer and SiNos
Cheolwoo Park (a), Fred Godtliebsen (b), Murad Taqqu (c), Stilian Stoev (d) and J.S. Marron (e)
(a) Department of Statistics, University of Georgia, Athens, GA 30602-1952, USA
(b) Department of Mathematics and Statistics, University of Tromsø, N-9037 Tromsø, Norway
(c) Department of Mathematics and Statistics, Boston University, Boston, MA 02215, USA
(d) Department of Statistics, University of Michigan, Ann Arbor, MI 48109-1107, USA
(e) Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599-3260, USA
Received 10 January 2006;
revised 18 November 2006;
accepted 26 November 2006.
Available online 19 December 2006.
Abstract
SiZer (SIgnificant ZERo crossing of the derivatives) and SiNos (SIgnificant NOn-Stationarities) are scale-space based visualization tools for statistical inference. They are used to discover meaningful structure in data through exploratory analysis involving statistical smoothing techniques. Wavelet methods have been successfully used to analyze various types of time series. In this paper, we propose a new time series analysis approach that combines wavelet analysis with the visualization tools SiZer and SiNos. We use certain functions of wavelet coefficients at different scales as inputs, and then apply SiZer or SiNos to highlight potential non-stationarities. We show that this new methodology can reveal hidden local non-stationary behavior of time series that is otherwise difficult to detect.
Keywords: Internet traffic; Long-range dependence; Non-stationarity; Scale-space method; SiNos; SiZer; Time series; Wavelet coefficients
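As a rough illustration of the input step described in the abstract, the sketch below computes wavelet detail coefficients at several scales and raises them to the second power, giving one series per scale that could then be passed to a smoothing-based tool such as SiZer. It is only a minimal sketch under stated assumptions: the Haar wavelet, the use of NumPy, and the function names `haar_detail_coefficients` and `squared_coefficients` are all choices made here for illustration and are not taken from the paper. The SiZer and SiNos steps themselves are not shown.

```python
import numpy as np


def haar_detail_coefficients(x, max_level):
    """Return {j: detail coefficients at scale j} from a Haar pyramid.

    Assumption: a plain Haar wavelet is used for simplicity; the paper
    does not state its wavelet on this page.
    """
    approx = np.asarray(x, dtype=float)
    details = {}
    for j in range(1, max_level + 1):
        n = len(approx) - len(approx) % 2  # drop a trailing odd sample
        if n < 2:
            break  # not enough samples left for another level
        even, odd = approx[0:n:2], approx[1:n:2]
        details[j] = (even - odd) / np.sqrt(2.0)  # detail at scale j
        approx = (even + odd) / np.sqrt(2.0)      # coarser approximation
    return details


def squared_coefficients(details):
    """Second power of the detail coefficients, scale by scale."""
    return {j: d ** 2 for j, d in details.items()}


# Hypothetical usage: white noise with a short burst of higher variance.
rng = np.random.default_rng(0)
series = rng.normal(size=1024)
series[400:450] *= 5.0
inputs = squared_coefficients(haar_detail_coefficients(series, max_level=4))
```

Each entry of `inputs` is a shorter series (length roughly halves per scale), so a local change in variance shows up as a local rise in the squared coefficients at the scales it affects.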
Fig. 1. (a) Time series plot of packet counts measured every 10 ms at the link of the University of North Carolina, Chapel Hill (UNC) on Thursday, April 11, 2002, from 1 to 3 p.m. (Thu1300). (b) The wavelet spectrum of the Thu1300 time series. The vertical segments on this plot are the estimated 95% confidence intervals of the log-mean-energy statistics of the wavelet spectrum. (c) and (d) display the dependent SiZer of the Thu1300 time series. In (d), the dotted curves show effective window widths for each bandwidth, as intervals representing ±2h.
Fig. 2. Wavelet SiZer plot of a simulated FGN with H=0.9. At all scales, the shapes of the family of smooths look similar.
Fig. 3. Wavelet SiZer plot of the Thu1300 time series. The top panel shows the family of smooths of the original time series. The second row displays the family of smooths and the SiZer map of the second power wavelet coefficients at the scales j=1 and 2. In the same way, the other rows display SiZer plots of the second power wavelet coefficients from j=3 to 10.
Fig. 4. The upper panel is a family plot of Variance SiNos for scale j=9 of the wavelet coefficients obtained from the Thu1300 data. The curves show the estimated variance of the wavelet coefficients for various levels of smoothing. In the lower panel, a feature map showing significant changes in the variance of the wavelet coefficients is given.
Fig. 5. (a) Time series plot of the Thu1300 data. The four windows highlighted by the vertical lines correspond to the significant spikes of the wavelet SiZer in Fig. 3. (b)–(e) show the subtraces corresponding to the four windows in (a), respectively. The units of the x-axis are seconds. (f)–(i) show the corresponding wavelet spectra when each window is excluded from the full time series and the remaining two parts are concatenated.
Fig. 6. Wavelet SiZer plot of packet counts measured every 10 ms at the link of the University of North Carolina, Chapel Hill (UNC) on Saturday, April 12, 2003, from 3 to 4 p.m. (Sat1500). The left of the top panel displays the original time series and the right displays its family of smooths.
Fig. 7. (a) Time series plot of the Sat1500 data. The first window corresponds to the spike in the wavelet SiZer in Fig. 6. (b) shows the subtrace corresponding to the first window in (a). (c) shows the SiZer plot of scale j=1 of the fourth power wavelet coefficients. (d) shows the subtrace corresponding to the second window in (a).
Fig. 8. The wavelet SiZer of the simulated data. The top-left panel displays the simulated time series consisting of a deterministic curve plus noise. The deterministic curve is overlaid in white. It consists of a sine curve with increasing variations and two bursts. Its fluctuations are not visible in the figure. The added noise is FGN with H=0.8.
Fig. 9. The upper panel is a family plot of the Thu1300 time series. The lower panel is a feature map summary for scales j=1,2,…,10 and time points.