A reliability-driven placement procedure based on thermal-force model

https://doi.org/10.1016/j.microrel.2005.04.015

Abstract

This paper deals with placing chips on an MCM substrate in chip array style so as to minimize the system failure rate. The placement procedure begins by constructing an initial placement based on cooling considerations. Then, a thermal-force model is presented that transforms the reliability-driven placement problem into a set of simultaneous nonlinear equations, whose solution gives the thermal-force-equilibrium locations of the chips. A modified Newton–Raphson method is used to solve this system of equations. Finally, a chip assignment procedure transforms the thermal-force-equilibrium placement into an array style placement with minimum thermal distortion. Two assignment methods are developed and compared with each other. Experiments on three industrial MCMs designed by IBM show that the obtained placements significantly improve system reliability over the original designs. Additionally, a simulated annealing approach is presented to verify the performance of the proposed method.

Introduction

A multichip module (MCM), as considered in this paper, is a package that combines multiple chips into a single system-level unit. The resulting module is capable of handling an entire function. MCMs provide a very high level of system integration, with hundreds of bare chips that can be placed very close to each other on a substrate. Therefore, systems based on MCM architectures can achieve much denser circuits and much shorter interconnect distances among the chips than those in which chips are packaged in single chip modules and placed on PCBs. However, this denser integration results in higher heat flux densities at the substrate and creates a very challenging thermal management problem. For example, IBM’s S/390 servers, which have 35 chips mounted on a 12.7 cm × 12.7 cm substrate, dissipate 1274 W per module [1]. If the dissipated heat is not properly removed, higher operating temperatures can occur. A higher temperature not only affects circuit performance directly by slowing down the transistors on the chips, but also decreases their reliability. As a result, supporting high heat fluxes while maintaining relatively low chip temperatures is one of the major challenges facing today’s MCM system designers [2], [3].

The MCM placement problem is to assign the exact locations of chips on a substrate subject to timing, thermal, and routability constraints [4], [5]. Most of the previous placement methods used for MCMs are extensions of well-known methods from the VLSI domain or the PCB area [6], [7], which are mainly focused on routability. However, temperature distribution on an MCM substrate is the most important reliability factor. It is conceivable that a placement tool without thermal considerations could place some chips with high heat dissipation close together. This would result in hot spots on the substrate, even though the total power consumption is constrained. To overcome the problem of overheating, it is essential to develop good chip placement techniques for optimizing the system reliability; this is usually referred to as the thermal placement problem.

There are two main types of chip placement in MCM design, namely, full custom style and chip array style. In the full custom style placement, the active substrate is treated as a continuous plane on which chips of varying sizes and shapes are free to reside anywhere, as shown in Fig. 1(a). In the chip array style placement, on the other hand, the active substrate is partitioned into a matrix of identical chip sites into which the chips are placed, as shown in Fig. 1(b). Note that the pitch (i.e. center-to-center spacing) of the chip sites in the x-direction can be different from the pitch in the y-direction.
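As a minimal sketch of the chip array style, the centers of the chip sites can be generated from the two pitches. The function name `site_centers`, the choice of the first site as the origin, and the example dimensions are assumptions made for illustration only; they are not taken from the paper.

```python
def site_centers(rows, cols, pitch_x, pitch_y):
    """Return the (x, y) centers of a rows x cols chip-site array, row by row.

    The first site is placed at the origin (an assumption for illustration).
    pitch_x and pitch_y are the center-to-center spacings and may differ.
    """
    return [(c * pitch_x, r * pitch_y) for r in range(rows) for c in range(cols)]


# Hypothetical 2 x 3 array with different pitches in x and y.
sites = site_centers(rows=2, cols=3, pitch_x=20.0, pitch_y=15.0)
print(sites)
# [(0.0, 0.0), (20.0, 0.0), (40.0, 0.0), (0.0, 15.0), (20.0, 15.0), (40.0, 15.0)]
```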

Basically, placements of different styles need different placement algorithms. Previous studies on the thermal placement problem of MCM thus fall into two major categories: iterative-based approaches for chip array style placements and force-directed algorithms for full custom style placements. The iterative-based approaches consist of simulated annealing approaches [8], [9], [10] and hybrid genetic algorithms [11], [12]. Force-directed algorithms include the fuzzy-force [13] and the thermal-force [14] algorithms.

In this study, an extension of the previous thermal-force algorithm in [14] is developed to cover the chip array style thermal placement problem. This method generates excellent solutions both effectively and efficiently. Another important merit is that the proposed thermal-force model is easily combined with other force models developed for the objectives of routability and performance [15], [16], [17], [18]. Thus, a multiobjective optimal placement problem can be modeled by a hybrid force model that combines different force models, and solved by the same technique presented in this paper. In addition, a simulated annealing approach is presented to verify the performance of the proposed method.

The rest of this paper is organized as follows. Preliminaries, such as the problem description, reliability evaluation, packaging structure, and temperature calculation, are provided in Section 2. The thermal placement algorithm based on a modified thermal-force model is presented in Section 3. A simulated annealing approach is presented in Section 4. Examples with computational results are given in Section 5. Conclusions are drawn in Section 6.

Section snippets

Problem description

The chip array style thermal placement problem can be stated briefly as follows: given a set of chips C = {c_i ∣ 1 ≤ i ≤ m} with heat dissipations Q = {q_i ∣ 1 ≤ i ≤ m} and a set of chip sites S = {s_j ∣ 1 ≤ j ≤ n, n ≥ m} on a two-dimensional substrate as shown in Fig. 1(b), assign each chip to one of the chip sites such that the system failure rate is minimized. In most practical cases, some chips may have been pre-assigned to certain chip sites for timing or cooling considerations. These chips are called fixed
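The feasibility conditions implied by this statement can be sketched in a few lines: every chip gets exactly one site, no two chips share a site, and pre-assigned chips stay where they were put. The representation (a list mapping chip index to site index, and a dictionary for pre-assigned chips) and the name `is_valid_assignment` are assumptions for illustration; the failure-rate objective itself is not shown because its model is not given in this excerpt.

```python
def is_valid_assignment(assignment, num_chips, num_sites, fixed=None):
    """Check a chip-to-site assignment against the problem statement.

    assignment[i] is the site index given to chip i (hypothetical encoding).
    fixed maps a pre-assigned chip index to the site it must occupy.
    """
    fixed = fixed or {}
    # There must be one entry per chip and at least as many sites as chips.
    if len(assignment) != num_chips or num_sites < num_chips:
        return False
    # Every site index must exist on the substrate.
    if any(not 0 <= s < num_sites for s in assignment):
        return False
    # No two chips may occupy the same site.
    if len(set(assignment)) != num_chips:
        return False
    # Pre-assigned chips must remain on their sites.
    return all(assignment[chip] == site for chip, site in fixed.items())


# Hypothetical instance: 3 chips, 4 sites, chip 0 pre-assigned to site 2.
ok = is_valid_assignment([2, 0, 3], num_chips=3, num_sites=4, fixed={0: 2})
clash = is_valid_assignment([0, 0, 1], num_chips=3, num_sites=4)
print(ok, clash)  # True False
```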

Thermal-force placement algorithm

The complete placement approach, named thermal-force placement (TFP) algorithm, consists of three phases: generating a ‘good’ initial placement in phase 1, solving the system of thermal-force equations for obtaining a thermal-force-equilibrium (TFE) placement in phase 2, and transforming the TFE placement to a chip array style placement in phase 3.
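To make phase 3 concrete, the sketch below maps continuous equilibrium positions onto discrete chip sites. It uses a simple greedy rule (shortest chip-to-site moves are committed first) with Euclidean distance as a stand-in for distortion. This is not one of the paper's two assignment methods, which are not described in this excerpt; the function name `snap_to_sites` and the example coordinates are hypothetical.

```python
import math


def snap_to_sites(positions, sites):
    """Greedily assign each chip to a distinct site, shortest moves first.

    positions: continuous (x, y) chip locations, e.g. from an equilibrium step.
    sites: (x, y) centers of the chip sites; len(sites) >= len(positions).
    Returns a list whose entry i is the site index chosen for chip i.
    """
    # All chip/site pairs, ordered by how far the chip would have to move.
    pairs = sorted(
        (math.dist(p, s), i, j)
        for i, p in enumerate(positions)
        for j, s in enumerate(sites)
    )
    assignment, used = {}, set()
    for _, chip, site in pairs:
        if chip not in assignment and site not in used:
            assignment[chip] = site
            used.add(site)
    return [assignment[i] for i in range(len(positions))]


# Hypothetical 2 x 2 site array with a pitch of 10 in both directions.
grid = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
equilibrium = [(1.0, 1.0), (3.0, 1.0), (9.0, 8.0)]
snapped = snap_to_sites(equilibrium, grid)
print(snapped)  # [0, 1, 3]
```

A greedy rule like this is fast but not guaranteed to minimize the total displacement; an optimal assignment method would be a natural alternative when the site count is small enough.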

Simulated annealing approach

Simulated annealing (SA) is a general-purpose combinatorial optimization technique that is analogous to the process of metallurgical annealing, in which a system is heated and then cooled gradually until the material achieves certain desired metallurgical properties [25]. It has been shown to produce good-quality placements for routability [26], [27]. A simulated annealing approach is therefore also proposed here for comparison with, and verification of, the TFP algorithm.
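A generic annealing loop for array style placement might look like the sketch below. It swaps the contents of two sites per move and accepts uphill moves with the Metropolis probability. The cost function `hotspot_cost` is a hypothetical proxy (pairwise power product over squared distance) standing in for the paper's failure-rate model, which is not given in this excerpt. The cooling schedule, parameter values, and all names are likewise assumptions.

```python
import math
import random


def hotspot_cost(perm, power, sites):
    """Hypothetical proxy cost: sum of q_i * q_j / d_ij^2 over all chip pairs."""
    total = 0.0
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            (xi, yi), (xj, yj) = sites[perm[i]], sites[perm[j]]
            total += power[i] * power[j] / ((xi - xj) ** 2 + (yi - yj) ** 2)
    return total


def anneal(power, sites, t0=10.0, alpha=0.9, moves_per_temp=100, t_min=1e-3, seed=0):
    """Return (chip-to-site list, cost) found by swap-based simulated annealing."""
    rng = random.Random(seed)
    # Pad with zero-power dummy chips so that empty sites can take part in swaps.
    padded = list(power) + [0.0] * (len(sites) - len(power))
    current = list(range(len(sites)))  # chip i starts on site i
    cost = hotspot_cost(current, padded, sites)
    best, best_cost = current[:], cost
    t = t0
    while t > t_min:
        for _ in range(moves_per_temp):
            a, b = rng.sample(range(len(sites)), 2)
            current[a], current[b] = current[b], current[a]
            new_cost = hotspot_cost(current, padded, sites)
            delta = new_cost - cost
            # Metropolis rule: always accept improvements, sometimes accept worse.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = current[:], cost
            else:
                current[a], current[b] = current[b], current[a]  # undo the swap
        t *= alpha  # geometric cooling
    return best[:len(power)], best_cost


# Hypothetical 3 x 3 array (pitch 10) with five chips of unequal power.
sa_sites = [(10.0 * c, 10.0 * r) for r in range(3) for c in range(3)]
sa_power = [10.0, 10.0, 10.0, 1.0, 1.0]
initial_cost = hotspot_cost(list(range(9)), sa_power + [0.0] * 4, sa_sites)
placement, final_cost = anneal(sa_power, sa_sites)
print(placement, final_cost <= initial_cost)
```

Because the best placement seen so far is tracked separately from the current one, the returned cost can never exceed the cost of the starting placement.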

Examples and computational results

The algorithms presented here have been implemented in C and run on a 2.8 GHz Pentium IV personal computer. Three industrial MCMs designed by IBM are used to test the proposed algorithms.

Conclusion

This paper deals with placing chips in array style on an MCM substrate to minimize the system failure rate. A TFP algorithm and a simulated annealing approach are presented for this problem. Three industrial MCMs designed by IBM are examined using the proposed methods. The TFP algorithm generates excellent solutions both effectively and efficiently when compared with the simulated annealing approach and the IBM designs. Since the thermal placement problem is an NP-hard combinatorial optimization problem,

Acknowledgement

This work was supported by the National Science Council, Republic of China under contract no. NSC91-2215-E-218-005. I am pleased to thank Professor Jung-Hua Chou for his valuable comments and suggestions concerning this paper.

References (33)

  • Y.J. Huang et al. Reliability and routability consideration for MCM placement. Microelectron Reliab (2002)
  • Katopis GA. The evolution of ceramic packages for S/390 servers. In: Proceedings of pacific Rim/ASME international...
  • T. Kam et al. EDA challenges facing future microprocessor design. IEEE Trans Computer-Aided Des (2000)
  • S.V. Garimella et al. Thermal challenges in next generation electronic system-summary of panel presentations and discussions. IEEE Trans Comput Packag Technol (2002)
  • L.L. Moresco. Electronic system packaging: the search for manufacturing the optimum in a sea of constraints. IEEE Trans Comput Hybr Manufact Technol (1990)
  • P.A. Sandborn et al. Conceptual design of multichip modules and systems (1994)
  • N.A. Sherwani et al. Introduction to multichip modules (1995)
  • N.A. Sherwani. Algorithms for VLSI physical design automation (1999)
  • K.Y. Chao et al. Thermal placement for high-performance multi-chip modules. Int Conf Comput Des (1995)
  • Lampaert K, Gielen G, Sansen W. Thermally constrained placement of small-power IC’s and multi-chip modules. In:...
  • C.H. Tsai et al. Cell-level placement for improving substrate thermal distribution. IEEE Trans Computer-Aided Des (2000)
  • M.C. Tang et al. Consideration of thermal constraints during multichip module placement. Electron Lett (1997)
  • Beebe C, Carothers JD, Ortega A. Object-oriented thermal placement using an accurate heat model. In: Proceedings of...
  • J. Lee. Thermal placement algorithm based on heat conduction analogy. IEEE Trans Comput Packag Technol (2003)
  • N. Quinn et al. A forced directed component placement procedure for printed circuit boards. IEEE Trans Circuits Syst (1979)
  • M.D. Osterman et al. Placement for reliability and routability of convectively cooled PWB’s. IEEE Trans Computer-Aided Des (1990)