Combating I-O bottleneck using prefetching: model, algorithms, and ramifications

Verma, Akshat; Sen, Sandeep

doi:10.1007/s11227-007-0170-0

Published: 16 January 2008

The Journal of Supercomputing, volume 45, pages 205–235 (2008)

Akshat Verma¹ &
Sandeep Sen²

Abstract

Multiple memory models have been proposed to capture the effects of the memory hierarchy, culminating in the I-O model of Aggarwal and Vitter (Commun. ACM 31(9):1116–1127, 1988). More than a decade of architectural advances has led to new features that the I-O model does not capture, most notably the prefetching capability. We propose a relatively simple Prefetch model that incorporates data prefetching into the traditional I-O models and show how to design optimal algorithms that can attain close to peak memory bandwidth. Unlike (the inverse of) memory latency, memory bandwidth is much closer to the processing speed, so intelligent use of prefetching can considerably mitigate the I-O bottleneck. For some fundamental problems, our algorithms attain running times approaching those of idealized random access machines under reasonable assumptions. Our work also explains more precisely the significantly superior performance of I-O efficient algorithms in systems that support prefetching compared to ones that do not.
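
The latency-versus-bandwidth argument can be illustrated with a toy cost calculation. The sketch below is not the paper's Prefetch model. It is a simplified pipeline estimate that assumes the access pattern is known in advance, that fetches can be issued back to back at full bandwidth, and that processing one block overlaps with the transfer of the next. The function names and the parameters `latency`, `transfer` and `compute` (all in the same arbitrary time unit) are hypothetical.

```python
def scan_time_blocking(num_blocks, latency, transfer, compute):
    """Time to scan when every block fetch stalls the processor.

    Each block pays the full latency, then the transfer, then the compute.
    """
    return num_blocks * (latency + transfer + compute)


def scan_time_prefetching(num_blocks, latency, transfer, compute):
    """Time to scan when block i+1 is fetched while block i is processed.

    Assumption (illustrative only): fetches are pipelined, so the latency
    is paid once and the steady-state cost per block is the slower of the
    transfer and the compute.
    """
    if num_blocks == 0:
        return 0
    first_arrival = latency + transfer
    steady_state = (num_blocks - 1) * max(transfer, compute)
    return first_arrival + steady_state + compute


if __name__ == "__main__":
    # Hypothetical numbers: latency dominates, transfer is close to compute.
    n, lat, xfer, comp = 1000, 100, 2, 1
    print(scan_time_blocking(n, lat, xfer, comp))
    print(scan_time_prefetching(n, lat, xfer, comp))
```

With these made-up numbers the blocking scan is dominated by the per-block latency, whereas the prefetching scan is bounded by the transfer time, which is the effect the abstract attributes to bandwidth being much closer to processing speed than latency is.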

References

Adiga NR et al (2002) An overview of the BlueGene/L supercomputer. In: Proceedings of supercomputing (SC)
Aggarwal A, Vitter J (1988) The input/output complexity of sorting and related problems. Commun ACM 31(9):1116–1127
Aggarwal A, Alpern B, Chandra A, Snir M (1987) A model for hierarchical memory. In: Proceedings of ACM symposium on theory of computing
Aggarwal A, Chandra A, Snir M (1987) Hierarchical memory with block transfer. In: Proceedings of IEEE foundations of computer science, pp 204–216
Alpern B, Carter L, Feig E, Selker T (1994) The uniform memory hierarchy model of computation. Algorithmica 12(2):72–109
Brodal GS, Fagerberg R (2003) On the limits of cache-obliviousness. In: Proceedings of STOC, pp 307–315
Chaudhry G, Cormen TH (2002) Getting more for out-of-core columnsort. In: Proceedings of ALENEX
Chen T, Baer J (1995) Effective hardware-based data prefetching for high-performance processors. IEEE Trans Comput 44(5):609–623
Cormen TH, Sundquist T, Wisniewski LF (1999) Asymptotically tight bounds for performing BMMC permutations on parallel disk systems. SIAM J Comput 28(1):105–136
Dementiev R, Sanders P (2003) Asynchronous parallel disk sorting. In: Proceedings of SPAA
Floyd R (1972) Permuting information in idealized two-level storage. In: Complexity of computer computations, pp 105–109
Frigo M, Leiserson CE, Prokop H, Ramachandran S (1999) Cache-oblivious algorithms. In: Proceedings of FOCS
Hong J-W, Kung HT (1981) I/O complexity: the red–blue pebble game. In: Proceedings of the 13th symposium on the theory of computing, May 1981
Iyer S, Druschel P (2001) Anticipatory scheduling: a disk scheduling framework to overcome deceptive idleness in synchronous I/O. In: Proceedings of SOSP
Kallahalla M, Varman PJ (1999) Optimal read-once parallel disk scheduling. In: Proceedings of IOPADS, pp 68–77
Lund K, Goebel V (2003) Adaptive disk scheduling in a multimedia DBMS. In: Proceedings of ACM multimedia
Meyer U, Zeh N (2003) I-O efficient undirected shortest paths. In: Proceedings of ESA, pp 434–445
Nesbit KJ, Smith JE (2004) Data cache prefetching using a global history buffer. In: Proceedings of HPCA, pp 96–105
Sanders P (1999) Accessing multiple sequences through set associative caches. In: Proceedings of ICALP. A more recent version by Mehlhorn and Sanders was communicated to the authors in Dec 1999
Sen S, Chatterjee S, Dumir N (2002) Towards a theory of cache-efficient algorithms. J ACM
Vishkin U (1996) Can parallel algorithms enhance serial implementation? Commun ACM
Vitter J, Shriver E (1994) Algorithms for parallel memory I: two-level memories. Algorithmica 12(2):110–147
Worthington B, Ganger G, Patt Y. The DiskSim simulation environment (version 2.0). Available at http://www.ece.cmu.edu/ganger/disksim/

Author information

Authors and Affiliations

IBM India Research Lab., IBM, Plot No. 4, Block C, Institutional Area, Vasant Kunj, New Delhi, 110070, India
Akshat Verma
Department of Computer Science and Engineering, IIT Delhi, New Delhi, 110016, India
Sandeep Sen

Corresponding author

Correspondence to Akshat Verma.

About this article

Cite this article

Verma, A., Sen, S. Combating I-O bottleneck using prefetching: model, algorithms, and ramifications. J Supercomput 45, 205–235 (2008). https://doi.org/10.1007/s11227-007-0170-0

Received: 13 January 2007
Accepted: 27 December 2007
Published: 16 January 2008
Issue Date: August 2008
DOI: https://doi.org/10.1007/s11227-007-0170-0
