A dynamic data replication strategy using access-weights in data grids

The Journal of Supercomputing

Abstract

Data grids routinely handle huge amounts of data, and ensuring efficient access to such widely distributed data sets is a fundamental challenge. Using a data replication strategy to create replicas at suitable sites can improve system performance: it shortens data access time and reduces bandwidth consumption. In this paper, a dynamic data replication mechanism called Latest Access Largest Weight (LALW) is proposed. LALW selects a popular file for replication and calculates a suitable number of copies and suitable grid sites for replication. Associating a different weight with each historical data access record differentiates the importance of each record. A more recent data access record has a larger weight, indicating that it is more pertinent to the current data access situation. A grid simulator, OptorSim, is used to evaluate the performance of this dynamic replication strategy. The simulation results show that LALW successfully increases the effective network usage. This means that the LALW replication strategy can identify a popular file and replicate it to a suitable site without increasing the network burden too much.
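
The idea of weighting recent access records more heavily can be illustrated with a small Python sketch. This is not the paper's implementation: the abstract does not state the weighting function, so the halving of the weight per older time interval (`decay=0.5`), the function names, and the sample access history are all assumptions made for illustration.

```python
from collections import defaultdict


def weighted_popularity(history, decay=0.5):
    """Score files by access count, weighting recent intervals more heavily.

    history: list of time intervals, oldest first; each interval is a list
             of the file names accessed during that interval.
    decay:   assumed per-interval decay factor (hypothetical choice); the
             newest interval has weight 1, the one before it `decay`, etc.
    """
    newest = len(history) - 1
    scores = defaultdict(float)
    for t, accesses in enumerate(history):
        weight = decay ** (newest - t)  # older interval -> smaller weight
        for name in accesses:
            scores[name] += weight
    return dict(scores)


def most_popular(history, decay=0.5):
    """Return the file with the largest weighted access score."""
    scores = weighted_popularity(history, decay)
    return max(scores, key=scores.get)


# Hypothetical access history: file "a" was popular earlier,
# file "b" is popular now.
history = [
    ["a", "a", "a", "a"],  # oldest interval
    ["a", "b"],
    ["b", "b", "b"],       # newest interval
]

scores = weighted_popularity(history)
popular = most_popular(history)
```

In this made-up history, file `a` has more accesses in total (5 versus 4), yet `b` receives the higher weighted score because its accesses are more recent. That is the behaviour the abstract describes: a more recent record is treated as more pertinent to the current access pattern.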

Author information

Corresponding author

Correspondence to Ruay-Shiung Chang.

About this article

Cite this article

Chang, RS., Chang, HP. A dynamic data replication strategy using access-weights in data grids. J Supercomput 45, 277–295 (2008). https://doi.org/10.1007/s11227-008-0172-6

  • DOI: https://doi.org/10.1007/s11227-008-0172-6
