Bayesian Cognitive Model in Scheduling Algorithm for Data Intensive Computing

Journal of Grid Computing

Abstract

Science is becoming increasingly data-driven. The ability of a geographically distributed community of scientists to access and analyze large amounts of data has emerged as a significant requirement for furthering science. In a data intensive computing environment with a very large number of nodes, resources are inevitably unreliable, which has a great effect on task execution and scheduling. Novel algorithms are needed to schedule jobs on trustworthy nodes, ensure high-speed communication, reduce job execution time, lower the ratio of failed executions, and improve the security of the execution environment for important data. In this paper, a trust mechanism-based task scheduling model is presented. Drawing on models of trust relationships between people in society, a trust relationship is built among computing nodes, and the trustworthiness of nodes is evaluated using a Bayesian cognitive method. By integrating the trustworthiness of nodes into the Dynamic Level Scheduling (DLS) algorithm, the Trust-Dynamic Level Scheduling (Trust-DLS) algorithm is proposed. Moreover, a benchmark is constructed that spans a range of data intensive computing characteristics to evaluate the proposed method. Theoretical analysis and simulations show that the Trust-DLS algorithm can efficiently meet the trust requirements of data intensive workloads at a small cost in time, and ensures that tasks execute securely in large-scale data intensive computing environments.
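The abstract does not give the trust formula or say how trust enters the scheduling decision, so the following is only a minimal sketch under stated assumptions, not the paper's Trust-DLS algorithm. It assumes that a node's trustworthiness is estimated as the posterior mean of a Beta-Bernoulli model over past task outcomes, and that the scheduler picks the node with the highest dynamic level among nodes whose trust meets a threshold. The names `NodeHistory`, `pick_node`, and `min_trust` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NodeHistory:
    """Outcomes of past task executions on one node (hypothetical record type)."""
    successes: int = 0
    failures: int = 0

    def record(self, succeeded: bool) -> None:
        """Add one observed task outcome."""
        if succeeded:
            self.successes += 1
        else:
            self.failures += 1

    def trust(self, prior_a: float = 1.0, prior_b: float = 1.0) -> float:
        """Posterior mean of a Beta(prior_a, prior_b) prior updated with the outcomes.

        Assumption: each task outcome is an independent Bernoulli trial.
        """
        a = prior_a + self.successes
        b = prior_b + self.failures
        return a / (a + b)


def pick_node(
    dynamic_levels: Dict[str, float],
    histories: Dict[str, NodeHistory],
    min_trust: float = 0.5,
) -> Optional[str]:
    """Return the node with the highest dynamic level among sufficiently trusted nodes.

    Assumption: trust acts as a filter before the level comparison; the paper may
    combine the two quantities differently.
    """
    trusted = {
        node: level
        for node, level in dynamic_levels.items()
        if histories[node].trust() >= min_trust
    }
    if not trusted:
        return None
    return max(trusted, key=trusted.get)


if __name__ == "__main__":
    histories = {"a": NodeHistory(8, 2), "b": NodeHistory(1, 4)}
    levels = {"a": 3.0, "b": 5.0}  # illustrative dynamic levels, not measured data
    print(pick_node(levels, histories))
```

In this sketch node `b` has the higher dynamic level but is excluded because its estimated trust falls below the threshold, which illustrates the trade-off between speed and reliability that the abstract describes.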

Corresponding author

Correspondence to Wei Wang.

About this article

Cite this article

Wang, W., Zeng, G. Bayesian Cognitive Model in Scheduling Algorithm for Data Intensive Computing. J Grid Computing 10, 173–184 (2012). https://doi.org/10.1007/s10723-012-9205-8

  • DOI: https://doi.org/10.1007/s10723-012-9205-8
