Elsevier

Future Generation Computer Systems

Volume 54, January 2016, Pages 260-273

Simulation of SLA-based VM-scaling algorithms for cloud-distributed applications

https://doi.org/10.1016/j.future.2015.01.015

Abstract

Cloud Computing has evolved to become an enabler for delivering access to large-scale distributed applications running on managed network-connected computing systems. This makes it possible to host Distributed Enterprise Information Systems (dEISs) in cloud environments, while enforcing strict performance and quality of service requirements, defined using Service Level Agreements (SLAs). SLAs define the performance boundaries of distributed applications and are enforced by a cloud management system (CMS) dynamically allocating the available computing resources to the cloud services. We present two novel VM-scaling algorithms focused on dEIS systems, which detect the most appropriate scaling conditions using performance models of distributed applications derived from constant-workload benchmarks, together with SLA-specified performance constraints. We simulate the VM-scaling algorithms in a cloud simulator and evaluate them against trace-based performance models of dEISs. We compare a total of three SLA-based VM-scaling algorithms (one using prediction mechanisms) based on a real-world application scenario involving a large, variable number of users. Our results show that it is beneficial to use autoregressive predictive SLA-driven scaling algorithms in cloud management systems for guaranteeing performance invariants of distributed cloud applications, as opposed to using only reactive SLA-based VM-scaling algorithms.

Introduction

Cloud Computing [1] has evolved to become an enabler for delivering access to large-scale distributed applications [2] running inside managed environments composed of network-connected computing systems. This makes it possible to host Distributed Enterprise Information Systems (dEISs) in cloud environments, while enforcing strict performance and quality of service requirements, defined using Service Level Agreements (SLAs).

SLAs are contracts defining the performance and quality of service (QoS) boundaries of distributed applications. A cloud management system (CMS) enforces SLAs by dynamically allocating available computing resources to cloud services. A CMS monitors both the software cloud resources as well as the underlying physical network and computing resources. It uses this information for deciding the actions to be taken, such as increasing the number of VMs (scaling-out), decreasing (scaling-in), or migrating software components in order to maintain the conditions defined in the SLAs and for maximising provider-specific metrics (e.g. energy efficiency).

It is often the case that cloud applications exhibit predictable and repeatable patterns in their resource utilisation levels, caused by the execution of repeatable workloads (e.g. with hourly, daily, weekly patterns). A CMS can benefit from detecting such repeatable patterns by combining this information with prediction mechanisms in order to estimate the near-term utilisation level of both software and physical resources, and then to optimise the allocation of resources based on the SLAs.

Also, the specific way of packing cloud applications in Virtual Machines (VMs) allows a CMS to scale cloud-distributed applications by means of “horizontal”-scaling, where the number of VMs allocated to application-services is increased or decreased according to variations in the external workload. Therefore, using SLAs for specifying the performance of cloud applications could enable the CMS to better perform VM-scaling by correlating the SLA guarantees with the actual number of VMs allocated to cloud applications, their QoS metrics and the size of the distributed workload.
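As an illustration of the horizontal-scaling idea (our sketch, not the paper's algorithm), the number of VMs required to absorb a given external workload can be derived from an assumed per-VM processing capacity:

```python
import math

def required_vms(arrival_rate, per_vm_capacity, min_vms=1):
    """VMs needed so that aggregate capacity covers the external arrival rate.

    arrival_rate: requests/s arriving at the service tier
    per_vm_capacity: requests/s one VM can sustain within its SLA bounds
    (both names are illustrative assumptions, not the paper's notation)
    """
    if per_vm_capacity <= 0:
        raise ValueError("per_vm_capacity must be positive")
    return max(min_vms, math.ceil(arrival_rate / per_vm_capacity))
```

Correlating the SLA with the VM count then reduces to keeping the allocated number of VMs at or above this bound as the workload fluctuates.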

We define the research question as: “How can a CMS dynamically scale the number of VMs allocated to cloud services, so that the SLA-defined performance constraints are maintained under variable workload conditions such as fluctuating number of users?”.

We present an approach for designing and testing SLA scaling algorithms for dEIS systems by using performance-models of cloud-distributed applications (built with the help of constant-workload benchmarks) and then simulating the scaling algorithms in a cloud simulator against the performance models. We extend the work in  [3], [4] by presenting and evaluating two new SLA-based VM-scaling algorithms. In total, we compare three SLA-based VM-scaling algorithms (one using prediction mechanisms) based on (1) a real-world application scenario involving a large variable number of users, and (2) pre-recorded monitoring traces of an actual distributed enterprise application.

Our results show that it is valuable to use a predictive SLA-driven VM-scaling algorithm in a cloud management system for guaranteeing performance SLA invariants of distributed cloud applications.

Our main contributions can be summarised as follows. We present an approach for analysing the performance boundaries of a distributed application using batches of benchmarks. We then show how Little’s Law can be combined with the benchmark results and SLA-defined performance conditions in order to identify optimal scaling conditions for the distributed application. We also show how multi-step linear regression can be used to efficiently predict application workloads, and then we integrate the prediction mechanism into an SLA-based VM-scaling algorithm. In total, we analyse three SLA-based VM-scaling algorithms.

The rest of our paper is organised as follows. Section 2 presents the related work in the fields of distributed enterprise applications, cloud computing simulators, prediction models, and SLA-based scaling of cloud services. Section 3 introduces the problem of predicting time series. Section 4 introduces an algorithm for multi-step prediction using linear regression models. Section 5 presents a benchmarking methodology based on Little’s Law for exploring the relations between the system’s workload, occupancy (concurrency) and the average execution time. We then use these relations for finding the maximum processing capacity of the corresponding VM instances based on SLA-defined performance conditions. Section 6 introduces two SLA-based VM-scaling algorithms that use the mechanisms presented in Sections 4 and 5. Section 7 discusses the results of evaluating the SLA-based VM-scaling algorithms using a simulation of a real-world multi-user workload in a cloud simulator. Finally, Section 8 draws conclusions.

Section snippets

Related work

We split the related work section into four subsections, as follows: (1) distributed enterprise information systems, (2) cloud computing simulators, (3) time series prediction mechanisms, and (4) SLA-based scaling of cloud services.

Time series prediction

We define the SLA Cloud Management Optimization (SLA-CMO) problem [14], [18], [19], [20] as: improving the efficiency of allocating a datacenter’s computing resources by dynamically changing the number of VMs allocated to cloud services, so that SLA-defined performance requirements are met under variable workload conditions.
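A purely reactive solution to the SLA-CMO problem can be pictured as a monitor–decide loop. The following minimal Python sketch (our illustration with hypothetical thresholds, not the paper's calibrated algorithm) adjusts the VM count by comparing the measured average execution time W against the SLA bound:

```python
def cms_step(w_avg, w_sla, num_vms, scale_in_factor=0.5):
    """One reactive CMS iteration: compare measured W with the SLA bound.

    Scale out on an SLA violation; scale in when the system runs well
    under the bound and more than one VM is allocated.
    (scale_in_factor is an assumed, illustrative threshold.)
    """
    if w_avg > w_sla:
        return num_vms + 1
    if w_avg < scale_in_factor * w_sla and num_vms > 1:
        return num_vms - 1
    return num_vms
```

The weakness of such a loop, which motivates the predictive variant discussed later, is that it only reacts after the SLA has already been violated.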

Solving the SLA-CMO problem depends directly on having a reliable source of monitoring information reflecting the state (e.g. number of allocated VMs, system’s throughput,

Multi-step prediction using linear autoregression

We investigate autoregression for predicting multiple future values of an independent variable. As an underlying example we will use the time series shown in Fig. 1, containing a window of data representing the arrival rate of requests of an ERP system. Because we want to use the data for predicting future values, it will be processed in a streaming fashion, as it becomes available.
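To make the idea concrete, the following Python sketch illustrates multi-step linear autoregression in general (not the exact algorithm of this section): AR coefficients are fitted by least squares over lagged windows, and multi-step forecasts are produced by feeding each prediction back into the lag window:

```python
import numpy as np

def fit_ar(series, order):
    """Fit AR(order) coefficients plus an intercept by ordinary least squares."""
    X = np.array([series[i:i + order] for i in range(len(series) - order)])
    y = np.array(series[order:])
    X = np.hstack([X, np.ones((len(X), 1))])  # column of ones = intercept
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # `order` lag weights followed by the intercept

def predict_multi_step(series, coef, steps):
    """Forecast `steps` ahead by recursively reusing predicted values."""
    order = len(coef) - 1
    window = list(series[-order:])
    out = []
    for _ in range(steps):
        nxt = float(np.dot(coef[:-1], window) + coef[-1])
        out.append(nxt)
        window = window[1:] + [nxt]  # slide the lag window forward
    return out
```

On a streaming arrival-rate series, refitting on the most recent window before each multi-step forecast keeps the model aligned with the current workload pattern.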

Two important properties of the prediction algorithm are the following: (1) immunity to small

Performance profiling of cloud-distributed applications

In this section we present a performance profiling analysis of a cloud-distributed Enterprise Information System (dEIS)  [3], [4], [24], [17], [14], [19] based on Little’s law  [21]. The purpose of this analysis is to determine the dependencies between the average arrival rate of requests to a distributed system, the system’s average throughput, the average number of concurrent requests executed by the system and the average execution time.
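Little's law ties these quantities together: in steady state, the average occupancy L equals the average arrival rate λ times the average execution time W, i.e. L = λ·W. A minimal sketch (illustrative names, not the paper's notation) of how an SLA bound on W translates into a maximum sustainable arrival rate:

```python
def occupancy(arrival_rate, avg_exec_time):
    """Little's law: L = lambda * W in steady state."""
    return arrival_rate * avg_exec_time

def max_arrival_rate(l_critical, w_sla):
    """Highest lambda keeping occupancy below a critical level l_critical
    while the SLA execution-time bound w_sla holds: lambda = L / W."""
    return l_critical / w_sla
```

Given a benchmark-derived critical occupancy and an SLA-defined execution-time bound, this directly yields the maximum processing capacity of a VM instance.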

Once these relations are known, we will use them to

SLA-based VM-scaling algorithms for CMS

As we have seen in the previous section, the processing capacity of dEIS applications can be easily saturated if the system’s occupancy (L) approaches a critical region. Once the system enters into this hazardous region, the average execution time (W) will quickly increase from below one second to tens of seconds, lowering the quality of experience as a result of large delays in the processing of dEIS-requests.
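The danger of the critical occupancy region suggests triggering scale-out before per-VM occupancy reaches it. A hedged sketch of such a trigger (the safety and idle thresholds are our illustrative assumptions, not the paper's calibrated values):

```python
def scaling_action(l_total, l_critical, num_vms, safety=0.8, low=0.3):
    """Decide a scaling action from the occupancy per VM.

    Scale out before per-VM occupancy enters the critical region
    (safety margin); scale in when the VMs are largely idle.
    """
    per_vm = l_total / num_vms
    if per_vm > safety * l_critical:
        return "scale-out"
    if per_vm < low * l_critical and num_vms > 1:
        return "scale-in"
    return "hold"
```

Acting at a safety margin below the critical occupancy keeps W in its sub-second regime instead of reacting after it has already jumped to tens of seconds.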

In order for the CMS to prevent this behaviour where the quality of experience drops

Evaluation results

In order to evaluate the two new SLA-based VM-scaling algorithms previously presented in Sections 6.1 and 6.2, we implemented them in CloudSim [28], which allowed us to run multiple simulations against the dEIS distributed application.

Next, we describe some implementation details about the integration of the new scaling algorithms in CloudSim, and the implementation of the prediction mechanisms. We continue with comparing the λ-based and predictive λ-based VM-Scaling algorithms using

Conclusions

Cloud Computing has evolved to become an enabler for delivering access to large-scale distributed applications running on managed environments composed of network-connected computing systems. This makes it possible to host Distributed Enterprise Information Systems (dEIS) in cloud environments, while allowing Cloud Management Systems (CMS) to enforce strict performance and quality of service requirements, defined using Service Level Agreements (SLAs).

In this paper we presented two new VM-scaling


References (32)

  • J. Bozman, Cloud computing: the need for portability and interoperability, IDC Analyze the...
  • A. Leon, Enterprise Resource Planning (2008)
  • G. Lab, Cloud simulator cloudsim, 2014.
  • R. Buyya, Modeling and simulation of scalable cloud computing environments and the CloudSim toolkit: challenges and opportunities
  • S.K. Garg, NetworkCloudSim: modelling parallel applications in cloud simulations
  • A. Visan et al., Bio-inspired techniques for resources state prediction in large scale distributed systems, Int. J. Distrib. Syst. Technol. (IJDST) (2011)

    Alexandru-Florian Antonescu is a Research Associate in the department of Products & Innovation at SAP Switzerland. He is expected to receive his Ph.D. from University of Bern (Switzerland) in 2015. Previously he obtained his Master in Management of Information Technology, and Diploma in Computer Science from University “Politehnica” of Bucharest (Romania). His research interests include distributed computing, scalability of cloud systems, large-scale statistical data analysis, and mobile computing. For his Ph.D. he investigated the use of Service Level Agreements in cloud environments for scaling distributed infrastructures.

    Torsten Braun got his Ph.D. degree from University of Karlsruhe (Germany) in 1993. From 1994 to 1995 he has been a guest scientist at INRIA Sophia-Antipolis (France). From 1995 to 1997 he has been working at the IBM European Networking Centre Heidelberg (Germany) as a project leader and senior consultant. He has been a full professor of Computer Science at the University of Bern (Switzerland) and head of the research group “Communication and Distributed Systems” since 1998. He has been member of the SWITCH (Swiss education and research network) board of trustees since 2001. Since 2011, he has been vice president of the SWITCH foundation.
