Abstract
Highly parallel applications often use either highly parallel file systems or large numbers of independent disks. Either approach can provide the high data rates necessary for parallel applications. However, the failure of a single disk or server can render the data useless. Conventional techniques, such as those based on applying erasure-correcting codes to each file write, are prohibitively expensive for massively parallel scientific applications because of the fine granularity of access at which the codes are applied. In this paper we demonstrate a scalable method for recovering from single disk failures that is optimized for typical scientific data sets. This approach exploits coarser-grained (but precise) semantics to reduce the overhead of constructing recovery data, and it uses parallel computation (proportional to the data size and independent of the number of processors) to construct that data. Experiments are presented showing the efficiency of this approach on a cluster with independent disks, and a technique is described for hiding the creation of redundant data within the MPI-IO implementation.
This work was supported by the Mathematical, Information, and Computational Sciences Division subprogram of the Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy, under Contract W-31-109-Eng-38.
© 2004 Springer-Verlag Berlin Heidelberg
Gropp, W.D., Ross, R., Miller, N. (2004). Providing Efficient I/O Redundancy in MPI Environments. In: Kranzlmüller, D., Kacsuk, P., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2004. Lecture Notes in Computer Science, vol 3241. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30218-6_17
Print ISBN: 978-3-540-23163-9
Online ISBN: 978-3-540-30218-6