Copyright © 2007 Elsevier Ltd All rights reserved.
CPU demand for web serving: Measurement analysis and dynamic estimation
Received 3 December 2007;
Abstract
Managing the resources of a large Web serving system requires knowledge of the resource needs of the various types of service requests. To investigate the properties of Web traffic and its demand, we collected measurements of throughput and CPU utilization and analyzed the data. First, we present our findings on the time-varying nature of the traffic, the skewness of traffic intensity among the various types of requests, the correlation among traffic streams, and other system-related phenomena. Then, given these characteristics of Web traffic, we devise and implement an on-line method for the dynamic estimation of CPU demand.
Resource needs are commonly assessed using techniques such as off-line profiling, application instrumentation, and kernel-based instrumentation. Little attention has been given to the dynamic estimation of time-varying resource needs that relies only on external, high-level measurements such as overall resource utilization and request rates. We consider the problem of dynamically estimating the time-varying CPU demands of multiple kinds of requests using CPU utilization and throughput measurements. We formulate the problem as a multivariate linear regression problem and obtain its basic solution. However, as our measurement data analysis indicates, one faces issues such as insignificant flows, collinear flows, spatial and temporal variations, and background noise. To deal with these issues, we present several mechanisms: data aging, flow rejection, flow combining, noise reduction, and smoothing. We implemented these techniques in a Work Profiler component that we delivered as part of a broader system management product. We present experimental results from using this component in scenarios inspired by real-world usage of that product.
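As a rough illustration of the regression formulation, the sketch below estimates per-request CPU demand from throughput and utilization samples by solving the least-squares normal equations. It is not the Work Profiler implementation. The function names, the units (requests per second, utilization as a fraction of one CPU, demand in CPU-seconds per request), and the exponential `aging` weight standing in for data aging are all assumptions made for this example.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so row operations act on both sides.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            # A (near-)zero pivot means the flows cannot be told apart.
            raise ValueError("singular system: flows may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def estimate_cpu_demand(
    throughput: Sequence[Sequence[float]],
    utilization: Sequence[float],
    aging: float = 1.0,
) -> List[float]:
    """Estimate CPU-seconds per request for each request type.

    throughput[t][i] is the request rate of type i in interval t and
    utilization[t] is the CPU busy fraction in that interval. The assumed
    model is utilization[t] ~= sum_i demand[i] * throughput[t][i].
    aging in (0, 1] down-weights older samples (hypothetical aging scheme);
    aging = 1.0 gives ordinary least squares.
    """
    n_samples = len(throughput)
    n_flows = len(throughput[0])
    xtx = [[0.0] * n_flows for _ in range(n_flows)]
    xty = [0.0] * n_flows
    for t, (row, u) in enumerate(zip(throughput, utilization)):
        weight = aging ** (n_samples - 1 - t)  # newest sample has weight 1
        for i in range(n_flows):
            xty[i] += weight * row[i] * u
            for j in range(n_flows):
                xtx[i][j] += weight * row[i] * row[j]
    return solve_linear_system(xtx, xty)


if __name__ == "__main__":
    # Synthetic, noise-free samples generated with demands 0.01 and 0.02.
    rates = [[10, 5], [20, 5], [10, 15], [30, 10]]
    busy = [0.2, 0.3, 0.4, 0.5]
    print(estimate_cpu_demand(rates, busy))
```

In this sketch, perfectly proportional request rates make the normal equations singular, so the solver raises an error instead of returning arbitrary demands. That is a simplified view of why collinear flows need special handling such as flow combining.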
Keywords: Workload profiling; Linear regression; Web workload
Article Outline
- 1. Introduction
- 2. Prior work
- 3. Problem description
- 4. Analysis
- 4.1. Linear regression problem
- 4.2. Solution goodness measure
- 4.3. Correlation matrix
- 4.4. Data aging
- 5. Measurements
- 5.1. Measured data
- 5.2. Traffic skewness and variations
- 5.3. CPU utilization
- 5.4. Traffic correlation
- 6. Practical considerations
- 6.1. Insignificant flows
- 6.2. Utilization discounting
- 6.3. Low contribution flows
- 6.4. Dynamic variations
- 6.5. Collinear flows
- 6.6. Machine and process CPU
- 6.7. Degrees of freedom and responsiveness
- 6.8. Background noise
- 7. Experimental results
- 7.1. Validation of the linear model
- 7.2. Setup
- 7.3. Baseline
- 7.4. Introducing per-process CPU readings
- 7.5. Collinearity
- 8. Conclusion
- References
- Vitae





