doi:10.1016/j.peva.2004.07.009
Copyright © 2004 Elsevier B.V. All rights reserved.
Time-domain analysis of Web cache filter effects
Guangwei Bai and Carey Williamson
Department of Computer Science, University of Calgary, 2500 University Drive NW, Calgary, Canada T2N 1N4
Available online 11 September 2004.
Abstract
This paper uses trace-driven simulation to study the traffic arrival process for Web workloads in a simple Web proxy caching hierarchy. Both empirical and synthetic Web proxy workloads are used in the study.
The simulation results show that a Web cache reduces both the peak and the mean request arrival rate for Web traffic workloads, while the variance-to-mean ratio of the filtered traffic typically increases, depending on the input arrival process and the configuration of the cache. If the input traffic is self-similar, then the filtered request traffic remains self-similar, with the same Hurst parameter, though with a reduced mean. Finally, we find that a Gamma distribution provides a flexible and robust means of modeling aggregate workloads in hierarchical Web caching architectures, for a broad range of workload characteristics and Web proxy cache sizes. To demonstrate the generality and effectiveness of the modeling approach, we present a detailed example of filter effects and traffic superposition in a two-level Web caching hierarchy with heterogeneous input workloads. The Gamma modeling results match well with the results from trace-driven simulations.
Keywords: Internet and WWW technology; Web proxy caching; Web traffic simulation; Workload and traffic characterization
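The filter effect summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' simulator: it assumes an LRU cache sized in objects (the paper's figures refer to LFU and byte-sized caches), and it fits a two-parameter Gamma by moment matching, which may differ from the fitting procedure used in the paper. The function names `lru_filter`, `counts_per_interval`, and `gamma_moments` are hypothetical.

```python
from collections import OrderedDict
from statistics import mean, pvariance


def lru_filter(requests, capacity):
    """Return the (timestamp, key) requests that miss an LRU cache.

    Assumption: capacity is a number of objects, not bytes.
    """
    cache = OrderedDict()
    misses = []
    for ts, key in requests:
        if key in cache:
            cache.move_to_end(key)          # hit: refresh recency
        else:
            misses.append((ts, key))        # miss: request goes upstream
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)   # evict least recently used
    return misses


def counts_per_interval(requests, interval, horizon):
    """Count request arrivals in consecutive bins of length `interval`."""
    n_bins = int(horizon // interval)
    counts = [0] * n_bins
    for ts, _ in requests:
        idx = int(ts // interval)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


def gamma_moments(counts):
    """Moment-matched Gamma parameters (shape, scale) for arrival counts.

    With this matching, the scale equals the variance-to-mean ratio.
    """
    m = mean(counts)
    v = pvariance(counts)
    if m <= 0 or v <= 0:
        raise ValueError("need positive mean and variance")
    return m * m / v, v / m


# Toy trace of (timestamp in seconds, object id) pairs.
trace = [(0.0, "a"), (0.5, "b"), (1.2, "a"),
         (1.7, "c"), (2.1, "a"), (2.9, "b")]
filtered = lru_filter(trace, capacity=2)
print(counts_per_interval(trace, 1.0, 3.0))     # input arrivals per second
print(counts_per_interval(filtered, 1.0, 3.0))  # miss stream per second
```

On a real trace, comparing `gamma_moments` for the input counts and for the miss-stream counts gives a rough view of how the mean and the variance-to-mean ratio change across the cache.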
Fig. 1. Conceptual illustration of Web cache filtering effect.
Fig. 2. Time series plot of request arrival process for empirical Web proxy workload: (a) interval size: 1 s; (b) interval size: 1 h.
Fig. 3. Evidence of self-similar request arrival process for empirical Web proxy workload: (a) time series plot; (b) autocorrelation function; (c) variance–time plot; (d) R/S Pox plot.
Fig. 4. Characteristics of the request arrival process for empirical Web proxy workload: (a) PDF; (b) CDF; (c) LLCD.
Fig. 5. Illustration of the proxy cache filter effects on the empirical Web proxy workload: (a) full trace time series (interval: 30 min); (b) busy period time series (interval: 5 min); (c) full trace hit ratio (interval: 30 min); (d) busy period hit ratio (interval: 5 min).
Fig. 6. Evidence of self-similar request arrival process for filtered Web proxy workload: (a) time series plot; (b) autocorrelation function; (c) variance–time plot; (d) R/S Pox plot.
Fig. 7. Characteristics of the filtered arrival process as a function of cache size (empirical workload, LFU policy): (a) PDF; (b) CDF; (c) LLCD.
Fig. 8. Characteristics of the filtered arrival process as a function of cache replacement policy (empirical workload, 8MB cache).
Fig. 9. Characteristics of the filtered arrival process as a function of cache size (H = 0.75, LFU policy).
Fig. 10. Gamma probability density function.
Fig. 11. Gamma distribution model for input request arrival count distribution (empirical workload, γ = 2.67, β = 7.60).
Fig. 12. Gamma distribution model for filtered request arrival count distribution (empirical workload, γ = 1.63, β = 7.22).
Fig. 13. Gamma distribution models for filtered request arrival count distribution (synthetic workload: 20 requests/s, H = 0.75, Z = 0.80, LFU cache, Gamma model (γ, β, μ)).
Fig. 14. Example of a two-level Web proxy caching hierarchy.
Fig. 15. Synthetic self-similar workload traces used in simulations: (a) trace 1: H = 0.70, Z = 0.75; (b) trace 2: H = 0.80, Z = 0.80.
Fig. 16. Time series plots of request arrival processes for filtered workloads λ′1 and λ′2: (a) λ′1: interval size = 1 s; (b) λ′1: interval size = 1 h; (c) λ′2: interval size = 1 s; (d) λ′2: interval size = 1 h.
Fig. 17. Characteristics of filtered workload λ′1: (a) PDF; (b) CDF; (c) LLCD.
Fig. 18. Evidence of self-similarity in filtered workload λ′1: (a) autocorrelation function; (b) variance–time plot; (c) R/S Pox plot.
Fig. 19. Characteristics of aggregate request arrival process λ3: (a) PDF; (b) CDF; (c) LLCD.
Fig. 20. Evidence of self-similarity for aggregate request arrival process λ3 (H ≈ 0.76): (a) time series; (b) autocorrelation function; (c) variance–time plot; (d) R/S Pox plot.
Fig. 21. PDF for filtered workloads λ′1 and λ′2.
Fig. 22. Modeling of aggregate workload λ3: (a) PDF; (b) CDF; (c) LLCD.
Table 1. Characteristics of empirical Web proxy workload (U of S proxy)
Table 2. Experimental factors and levels for studying cache filter effects
Table 3. Simulation results for different cache sizes (empirical workload, LFU policy)
Table 4. Simulation results for different cache replacement policies (empirical workload, 8MB cache)
Table 5. Characteristics of synthetic Web proxy workloads
Table 6. Simulation results for different cache sizes (synthetic workload, H = 0.9, Z = 0.8, LFU policy)
Table 7. Simulation results for different replacement policies (synthetic workload, H = 0.9, Z = 0.8, 8MB cache)


Corresponding author. Tel.: +1 403 220 6780; fax: +1 403 284 4707.