Abstract
A new formalism is given for read-modify-write (RMW) synchronization operations. This formalism is used to extend the memory reference combining mechanism introduced in the NYU Ultracomputer to arbitrary RMW operations. A formal correctness proof of this combining mechanism is given. General requirements for the practicality of combining are discussed. Combining is shown to be practical for many useful memory access operations. These include memory updates of the form mem_val := mem_val op val, where op need not be associative, and a variety of synchronization primitives. The computation involved is shown to be closely related to parallel prefix evaluation.
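The idea of combining can be illustrated with a small sketch. It is not the paper's formalism or its hardware mechanism, only a toy model under stated assumptions: each RMW request is modeled as a function from the old memory value to the new one, two requests are combined by composing their functions, and the reply to each request is the value that request would have read in a serial execution. The names `combine`, `composite`, `decombine`, `serial_rmw`, and `combined_rmw` are hypothetical and chosen for illustration.

```python
from typing import Callable, List, Tuple

# Assumption: an RMW request is modeled as a function old_value -> new_value.
Update = Callable[[int], int]


def combine(f: Update, g: Update) -> Update:
    """Return one update equivalent to applying f first, then g."""
    return lambda v: g(f(v))


def composite(updates: List[Update]) -> Update:
    """Combine a non-empty list of updates pairwise, as a tree of switches might."""
    if len(updates) == 1:
        return updates[0]
    mid = len(updates) // 2
    return combine(composite(updates[:mid]), composite(updates[mid:]))


def decombine(updates: List[Update], old: int) -> List[int]:
    """Split one reply (the old value) into a reply for every combined request."""
    if len(updates) == 1:
        return [old]
    mid = len(updates) // 2
    left, right = updates[:mid], updates[mid:]
    # The right half reads the value left behind by the whole left half.
    return decombine(left, old) + decombine(right, composite(left)(old))


def serial_rmw(mem_val: int, updates: List[Update]) -> Tuple[int, List[int]]:
    """Reference behaviour: execute the requests one at a time, in order."""
    replies = []
    for f in updates:
        replies.append(mem_val)  # each request is answered with the old value
        mem_val = f(mem_val)
    return mem_val, replies


def combined_rmw(mem_val: int, updates: List[Update]) -> Tuple[int, List[int]]:
    """Memory performs a single update; replies are reconstructed on the way back."""
    if not updates:
        return mem_val, []
    new_val = composite(updates)(mem_val)
    return new_val, decombine(updates, mem_val)


# Mixed requests on one location: add, subtract (not associative), store, multiply.
requests = [lambda v: v + 5, lambda v: v - 3, lambda v: 10, lambda v: v * 2]
print(serial_rmw(1, requests))    # (20, [1, 6, 3, 10])
print(combined_rmw(1, requests))  # (20, [1, 6, 3, 10])
```

The replies are the exclusive prefixes of the request sequence applied to the initial memory value, which is one way to see the connection to parallel prefix evaluation mentioned in the abstract. Function composition is associative even when op itself is not, which is why the subtraction request causes no difficulty in this model.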
Index Terms
- Efficient synchronization of multiprocessors with shared memory