Do Trace Cache, Value Prediction and Prefetching Improve SMT Throughput?

Cher, Chen-Yong; Park, Il; VijayKumar, T. N.

doi:10.1007/11682127_17

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3894)

Included in the following conference series:

International Conference on Architecture of Computing Systems

Abstract

While trace cache, value prediction, and prefetching have been shown to be effective in the single-threaded superscalar, there has been no analysis of these techniques in a Simultaneously Multithreaded (SMT) processor. SMT brings new factors both for and against these techniques, and it is not known how these techniques would fare in SMT. We evaluate these techniques in an SMT to provide recommendations for future SMT designs. Our key contributions are: (1) we identify a fundamental interaction between the techniques and SMT’s sharing of resources among multiple threads, and (2) we quantify the impact of this interaction on SMT throughput. SMT’s sharing of the instruction storage (i.e., trace cache or i-cache), physical registers, and issue queue impacts the effectiveness of trace cache, value prediction, and prefetching, respectively.

Author information

Authors and Affiliations

ECE, Purdue University, IN, 47907, USA
Chen-Yong Cher, Il Park & T. N. VijayKumar

Editor information

Editors and Affiliations

University of Passau, Innstr. 33, 94032, Passau, Germany
Werner Grass
Faculty of Computer Science and Mathematics – Institute of Computer Architectures, University of Passau, Innstrasse 33, 94032, Passau, Germany
Bernhard Sick
University of Frankfurt/Main, Robert-Mayer-Str. 11-15, 60325, Frankfurt, Germany
Klaus Waldschmidt

About this paper

Cite this paper

Cher, CY., Park, I., VijayKumar, T.N. (2006). Do Trace Cache, Value Prediction and Prefetching Improve SMT Throughput?. In: Grass, W., Sick, B., Waldschmidt, K. (eds) Architecture of Computing Systems - ARCS 2006. ARCS 2006. Lecture Notes in Computer Science, vol 3894. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11682127_17

DOI: https://doi.org/10.1007/11682127_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32765-3
Online ISBN: 978-3-540-32766-0
eBook Packages: Computer Science, Computer Science (R0)
