Multi-dimensional multiple query scheduling with distributed semantic caching framework

Cluster Computing

Abstract

Leveraging large amounts of distributed cache memory seamlessly is becoming increasingly important in modern large-scale systems. Several previous studies showed that traditional scheduling policies often fail to achieve a high cache hit ratio and good system load balance with large-scale distributed caching facilities. To maximize system throughput, distributed caching facilities should balance the workload and leverage cached data at the same time. In this work, we present a distributed job processing framework that yields a high cache hit ratio while achieving balanced system load. Our framework employs a scheduling policy, DEMA, that considers both cache hit ratio and system load, and it supports multiple geographically distributed job schedulers. We show that collaborative task scheduling and data migration can further improve performance by increasing the cache hit ratio while achieving good load balance. Our experiments show that the proposed job scheduling policies outperform a legacy load-based job scheduling policy in terms of job response time, load balancing, and cache hit ratio.
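The trade-off the abstract describes, weighing likely cache reuse against server load, can be illustrated with a minimal sketch. This is not the DEMA policy itself, whose details are not given here. The `Server` class, the `center` field (a hypothetical summary of the queries a server has recently cached), the `load_weight` parameter, and the scoring formula are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical back-end server description (not from the paper)."""
    name: str
    center: tuple        # assumed summary point of recently cached queries
    queue_len: int = 0   # assumed load indicator: number of waiting jobs


def pick_server(servers, query, load_weight=0.5):
    """Pick the server with the lowest combined score via a linear scan.

    The score adds the distance from the query to the server's cached
    region (a proxy for the chance of a cache miss) to a load penalty.
    """
    def score(server):
        distance = math.dist(server.center, query)
        return distance + load_weight * server.queue_len

    return min(servers, key=score)


servers = [
    Server("a", (0.0, 0.0)),
    Server("b", (10.0, 10.0)),
]
# With idle servers, the query goes to the server whose cached region is closer.
print(pick_server(servers, (1.0, 1.0)).name)
```

In this sketch a larger `load_weight` makes the scheduler favor load balance over cache affinity, and setting it to zero gives a purely locality-based choice.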

Notes

  1. In our implementation, however, we performed a linear search, which takes O(n) time, on the assumption that n is small (for example, up to 40).

Acknowledgments

This research was supported by NRF (National Research Foundation of Korea) grant NRF-2014R1A1A2058843 and MKE/KEIT (No. 10041608, Embedded System Software for New Memory based Smart Devices).

Author information

Corresponding author

Correspondence to Beomseok Nam.

About this article

Cite this article

Eom, Y., Kim, J. & Nam, B. Multi-dimensional multiple query scheduling with distributed semantic caching framework. Cluster Comput 18, 1141–1156 (2015). https://doi.org/10.1007/s10586-015-0464-6

  • DOI: https://doi.org/10.1007/s10586-015-0464-6
