SCMP: A Single-Chip Message-Passing Parallel Computer

The Journal of Supercomputing

Abstract

As technology improves and transistor feature sizes continue to shrink, the effects of on-chip interconnect wire latencies on processor clock speeds will become more important. In addition, as we reach the limits of instruction-level parallelism that can be extracted from application programs, there will be an increased emphasis on thread-level parallelism. To continue to improve performance, computer architects will need to focus on architectures that can efficiently support thread-level parallelism while minimizing the length of on-chip interconnect wires. The SCMP (Single-Chip Message-Passing) parallel computer system is one such architecture. The SCMP system includes up to 64 processors on a single chip, connected in a 2-D mesh with nearest neighbor connections. Memory is included on-chip with the processors and the architecture includes hardware support for communication and the execution of parallel threads. Since there are no global signals or shared resources between the processors, the length of the interconnect wires will be determined by the size of the individual processors, not the size of the entire chip. Avoiding long interconnect wires will allow the use of very high clock frequencies, which, when coupled with the use of multiple processors, will offer tremendous computational power.
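
The mesh organization described in the abstract can be illustrated with a short sketch. It models only the topology: 64 nodes in a 2-D mesh where each node links to its nearest neighbors. The 8 x 8 layout, the row-major node numbering, the function names, and the column-then-row route are assumptions made for illustration; the abstract does not specify the mesh dimensions, the addressing scheme, or the routing algorithm.

```python
from typing import List, Tuple

# Assumption: the 64 processors are laid out as an 8 x 8 grid and
# numbered in row-major order. The abstract states only "up to 64
# processors" in a 2-D mesh.
MESH_WIDTH = 8
MESH_HEIGHT = 8


def node_coords(node_id: int) -> Tuple[int, int]:
    """Return the (row, col) position of a node in the mesh."""
    if not 0 <= node_id < MESH_WIDTH * MESH_HEIGHT:
        raise ValueError(f"node id out of range: {node_id}")
    return divmod(node_id, MESH_WIDTH)


def node_id_at(row: int, col: int) -> int:
    """Return the node id at a (row, col) position."""
    return row * MESH_WIDTH + col


def neighbors(node_id: int) -> List[int]:
    """Return the ids of the nodes directly wired to node_id.

    Interior nodes have four neighbors, edge nodes three, and corner
    nodes two, because the mesh has no wraparound links.
    """
    row, col = node_coords(node_id)
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [
        node_id_at(r, c)
        for r, c in candidates
        if 0 <= r < MESH_HEIGHT and 0 <= c < MESH_WIDTH
    ]


def hop_count(src: int, dst: int) -> int:
    """Minimum number of nearest-neighbor links between two nodes."""
    src_row, src_col = node_coords(src)
    dst_row, dst_col = node_coords(dst)
    return abs(src_row - dst_row) + abs(src_col - dst_col)


def xy_route(src: int, dst: int) -> List[int]:
    """List the nodes visited when moving along the row first, then the column.

    Hypothetical dimension-order route, used here only to show that a
    message crosses one short link per hop.
    """
    row, col = node_coords(src)
    dst_row, dst_col = node_coords(dst)
    path = [src]
    while col != dst_col:
        col += 1 if dst_col > col else -1
        path.append(node_id_at(row, col))
    while row != dst_row:
        row += 1 if dst_row > row else -1
        path.append(node_id_at(row, col))
    return path
```

Under these assumptions, `hop_count(0, 63)` is 14, the corner-to-corner distance of an 8 x 8 mesh, and every step in `xy_route(0, 63)` moves between adjacent nodes. That is the property the abstract relies on: no link spans more than one processor, so wire length depends on processor size and not on chip size.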

Baker, J.M., Gold, B., Bucciero, M. et al. SCMP: A Single-Chip Message-Passing Parallel Computer. The Journal of Supercomputing 30, 133–149 (2004). https://doi.org/10.1023/B:SUPE.0000040612.33760.8a

  • DOI: https://doi.org/10.1023/B:SUPE.0000040612.33760.8a
