ABSTRACT
Many web applications are currently deployed on cloud storage services offered by cloud service providers (CSPs). A CSP offers several types of storage, including hot, cold, and archive storage, and sets a unit price for each type; these prices vary substantially. By assigning the data files of a web application to different storage types according to their usage profiles and the CSP's pricing policy, a cloud customer can potentially achieve substantial cost savings and minimize its payment to the CSP. However, no previous research addresses this problem. To this end, we present a Markov Decision Process formulation of the cost minimization problem and then develop a reinforcement learning based approach to solve it effectively. The approach periodically changes the storage type of each data file to minimize the long-term monetary cost. We then propose a method that aggregates concurrently requested data files to further reduce the cloud storage payment for a web application. Our experiments with Wikipedia traces show that the proposed methods are effective at minimizing cloud customer cost compared with other methods.
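To make the tier-assignment trade-off concrete, the following is a minimal sketch of a greedy, single-period baseline, not the reinforcement learning approach described above. The tier prices, the `Tier` class, and the `period_cost` and `cheapest_tier` helpers are all hypothetical and chosen only for illustration; they do not reflect any real CSP's rates. The sketch ignores any cost of switching tiers and any knowledge of future request rates, both of which a long-term formulation would have to take into account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    storage_per_gb: float    # hypothetical storage price per GB per period
    read_per_request: float  # hypothetical price per read request


# Hypothetical prices for illustration only (not a real CSP's rates):
# cheaper storage comes with more expensive reads.
TIERS = [
    Tier("hot", 0.020, 0.00001),
    Tier("cold", 0.010, 0.0001),
    Tier("archive", 0.002, 0.001),
]


def period_cost(tier: Tier, size_gb: float, reads: int) -> float:
    """Cost of keeping one file in `tier` for one period."""
    return tier.storage_per_gb * size_gb + tier.read_per_request * reads


def cheapest_tier(size_gb: float, reads: int) -> Tier:
    """Greedy choice: the tier with the lowest cost for this period alone."""
    return min(TIERS, key=lambda t: period_cost(t, size_gb, reads))


if __name__ == "__main__":
    # A 10 GB file under three different request rates.
    for reads in (0, 100, 5000):
        print(reads, cheapest_tier(10.0, reads).name)
```

Under these assumed prices, a rarely read file lands in archive storage and a frequently read file lands in hot storage, which is the kind of usage-dependent assignment the abstract refers to.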