BLITZEN: A highly integrated massively parallel machine

https://doi.org/10.1016/0743-7315(90)90089-8

Abstract

The goal of the BLITZEN project is to construct a physically small, massively parallel, high-performance machine. This paper presents the architecture, organization, and feature set of a highly integrated SIMD array processing chip which has been custom designed and fabricated for this purpose at the Microelectronics Center of North Carolina. The chip has 128 processing elements (PEs), each with 1K bits of static RAM. Unique local control features include the ability to modify the global memory address with data local to each PE, and complementary operations based on a condition register. With a 16K PE system (only 128 custom chips are needed for this), operating at 20 MHz, data I/O can take place at 10,240 megabytes per second through a new method using a 4-bit bus for each set of 16 PEs. A 16K PE system can perform IEEE standard 32-bit floating-point multiplication at a rate greater than 450 megaflops. Fixed-point addition on 32-bit data exceeds the rate of three billion operations per second. Since the processors are bit-serial devices, performance rates improve with shorter word lengths. The BLITZEN chip is one of the first to incorporate over 1.1 million transistors on a single die. It was designed with 1.25-µm, two-level metal, CMOS design rules on an 11.0 by 11.7-mm die.
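The system-level figures quoted in the abstract follow directly from the stated parameters. A minimal back-of-the-envelope sketch, assuming one 4-bit bus transfer per clock cycle per group of 16 PEs (an illustrative model of the I/O scheme, not a detail confirmed by the abstract):

```python
# Sanity-check of BLITZEN system figures from the abstract.
CHIPS = 128            # custom chips in a full system
PES_PER_CHIP = 128     # processing elements per chip
CLOCK_HZ = 20_000_000  # 20 MHz clock

pes = CHIPS * PES_PER_CHIP          # total PEs: 128 * 128 = 16,384 (a "16K PE system")
bus_groups = pes // 16              # one 4-bit bus per set of 16 PEs
bits_per_sec = bus_groups * 4 * CLOCK_HZ        # assumed: 4 bits moved per cycle per bus
megabytes_per_sec = bits_per_sec / 8 / 1_000_000  # decimal megabytes

print(pes)                # 16384
print(megabytes_per_sec)  # 10240.0, matching the quoted I/O rate
```

The fixed-point figure is consistent with bit-serial operation: at 16,384 PEs and 20 MHz, exceeding three billion 32-bit additions per second implies each PE completes a 32-bit add in roughly 100 cycles or fewer.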



This work was supported in part by NASA Goddard Space Flight Center under Contract Number NAG-5-966 to the Microelectronics Center of North Carolina. Professor Reif is also supported by Contracts ONR N00014-80-C-0647, Air Force AFOSR-87-0386, ONR N00014-87-K0310, NSF CCR-8696134, DARPA/ARO DAAL03-88-K-0195, and DARPA/ISTO N00014-88-K-0458.
