A frame of reference for the performance evaluation of asynchronous, distributed decision-making algorithms
Introduction
The use of centralized algorithms to control systems has been well documented in the literature. In this paradigm, data from one or more sources are collected at a central site, where a single processor uses them to compute the system-wide decisions through sequential execution. Centralized decision-making algorithms range from the battlefield (Lee and Ghosh, 1996), the scheduling of trains in a railway network (Coll et al., 1990), inventory management (Gross, 1963), and highway management (Goldstein, 1994) to distributed federated databases (Linn and Howarth, 1994). However, with increasing system complexity, the computational burden on the central processor continues to grow, eventually leading to lower throughput and poor efficiency. In contrast, distributed algorithms promise higher throughput and efficiency by sharing the overall computational task among multiple, concurrent processors. Markas et al. (1990) report a distributed implementation of fault simulation of digital systems and note throughput improvements over the traditional, centralized approach. Distributed algorithms may be classified into two principal categories – synchronous and asynchronous. The synchronous distributed approach (Ghosh and Yu, 1995) is characterized by the presence of a single control processor that schedules the executions of all of the remaining processors. The presence of this sequential control node theoretically limits the performance advantage of the synchronous approach: as the number of processors is increased, the synchronization requirement imposed by the control node effectively counteracts the potential advantages of the multiple concurrent processors.
This paper observes that many large-scale, complex, real-world systems may be characterized as ones wherein the constituent components are geographically dispersed, interact among themselves asynchronously, i.e., at irregular intervals of time, and are permitted autonomy in local decision-making. An example of such a system is the routing of messages in a data network, where the messages interact asynchronously with the geographically dispersed routing nodes which, in turn, autonomously synthesize decisions to propagate the messages towards their respective destinations. Other examples include the management of inventory at the retail stores of a large department store, a modern battlefield, and the highway system. Asynchronous, distributed decision-making (ADDM) algorithms constitute an emerging class of algorithms that may naturally serve as the underlying control of such complex systems. Examples of successful ADDM algorithm designs include the routing algorithm in the Internet, PNNI (ATM Forum Technical Committee, 1996), NOVADIB (Ghosh, 1993), DARYN (Iyer and Ghosh, 1995) and DICAF (Utamaphethai and Ghosh, 1998). ADDM algorithms offer the potential of exploiting the maximal parallelism inherent in the system, given the absence of any forced synchronization.
The concept of performance is well understood and enjoys widespread use. A general, common-sense definition is presented by Ferrari (1978), who defines performance as an indication of how well a system, already assumed to be correct, works. This paper extends Ferrari's definition to include issues that relate to the ultimate objectives and outcomes of the system. Tron and colleagues (Tron et al., 1993) argue that in parallel systems, the performance metric must reflect the program, the machine, and the implementation strategies, for it depends on each one of them. Ronngren et al. (1996) cite the need for a common benchmark suite to evaluate parallel algorithms. Gupta and Kumar (1993a) note that while the parallel execution time, speed-up and efficiency serve as well-known performance metrics, the key issue is the identification of the parallel processing overheads, which set a limit on the speed-up for a given architecture, problem size and algorithm. For their study of optimal overheads, they propose (Gupta and Kumar, 1993b) the minimization of the parallel execution time as the principal criterion. Brehm et al. (1995) propose to expand the list of metrics with computation and communication times. Kushwaha (1993) proposes the use of expected user response time, system throughput and average server utilization as metrics towards evaluating the performance of multi-client multi-server distributed systems. Braddock et al. (1992) claim system availability and system throughput as two key metrics to characterize operational distributed systems. While Lecuivre and Song (1995) suggest response time and resource load as the performance parameters, Kumar et al. (1994) propose throughput, subject to minimal total cost for execution and interprocess communication. Arrouye (1996), Clement and Quinn (1994), and Kremien (1995) stress scalability as a performance criterion for parallel systems.
However, while Arrouye (1996) does not specify the exact type of parallel system to which the proposed criterion applies, the efforts of Clement and Quinn (1994) are confined to data-parallel distributed algorithms, i.e., Single Instruction Multiple Data (SIMD) type programs. Furthermore, unlike speed-up, which is quantitative, scalability is a quasi-quantitative measure that refers to the continued applicability of the core algorithm as the system grows in size. Performance scalability may be measured semi-quantitatively for asynchronous, distributed algorithms, as reported in Iyer and Ghosh (1995), by tracking a specific performance measure while all of the entities in the system are doubled, quadrupled, etc.
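The semi-quantitative scalability measurement described above, tracking a chosen performance measure while every entity in the system is doubled, quadrupled, and so on, can be sketched as follows. This is a minimal illustration, not code from the cited systems; `run_system` is a purely hypothetical stand-in for an actual experiment executed at a given scale factor.

```python
def scalability_profile(run_system, factors=(1, 2, 4, 8)):
    """Record a chosen performance measure as every entity count in
    the system is scaled by each factor (doubled, quadrupled, etc.).
    A roughly flat profile suggests that the core algorithm continues
    to apply as the system grows in size."""
    return {f: run_system(f) for f in factors}
```

For example, `scalability_profile(lambda f: simulate(n_nodes=100 * f))` would tabulate the measure at 100, 200, 400, and 800 nodes, where `simulate` is whatever experiment yields the performance measure of interest.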
Speed-up has been proposed in the literature as a metric for estimating the performance of parallel and distributed algorithms (Ghosh and Yu, 1995, Ghosh, 1995, Gupta and Kumar, 1993a), parallel processor architectures (Manwaring et al., 1994), and networks (Celenk and Yang, 1994). The most common definition of speed-up is the ratio of the execution time of the best known serial algorithm to the execution time of the parallel algorithm on a given number of concurrent processors. The execution time corresponding to a parallel execution is the longest of the CPU times required by any of the concurrent processors. For stochastic analysis of a distributed system (Westphal and Popovic, 1994), a number of serial and parallel executions are carried out, one for each stochastic input set, and for the speed-up computation, the numerator and denominator are determined from the arithmetic means of the corresponding individual execution times. Frequently, for a given problem of constant size, a speed-up graph is obtained by first partitioning the problem for a number of different sets of concurrent processors, executing them, and then computing the ratios of the serial execution time to the execution time from each parallel execution. The speed-up graph may be utilized to study the overheads for a given problem and to project the maximum number of processors that may be exploited advantageously. Ertel (1994) complains that the use of the mean execution time during stochastic simulation of parallel systems causes the loss of variance information relative to speed-up, and suggests the use of functional speed-up. Barr and Hickman (1992) claim that the effectiveness of the speed-up factor is eroded by testing biases, machine influences, and the effects of the tuning parameters. Wieland et al. (1992) show empirical evidence of bias in speed-up and report its magnitude at 3.1 for concurrent, theater-level, battlefield simulation.
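The speed-up computation just described can be captured in a few lines. The sketch below assumes only that each parallel run records the CPU time consumed by every concurrent processor; the function names are illustrative and do not come from the cited works.

```python
from statistics import mean

def parallel_exec_time(cpu_times):
    """Execution time of one parallel run: the longest of the CPU
    times required by any of the concurrent processors."""
    return max(cpu_times)

def stochastic_speedup(serial_times, parallel_runs):
    """Speed-up for stochastic executions: the ratio of the arithmetic
    mean of the serial execution times to the arithmetic mean of the
    parallel execution times, one pair per stochastic input set."""
    return mean(serial_times) / mean(parallel_exec_time(r) for r in parallel_runs)
```

Repeating `stochastic_speedup` for several processor counts, with the problem re-partitioned each time, yields the points of the speed-up graph mentioned above.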
Yan and Listgarten (1993) state that the presence of software instrumentation to measure performance invariably exerts an adverse influence on the measurements.
The efforts reported in Tron et al. (1993), Ronngren et al. (1996), Gupta and Kumar (1993a), Gupta and Kumar (1993b), Brehm et al. (1995), Kushwaha (1993), Braddock et al. (1992), Lecuivre and Song (1995), Kumar et al. (1994), Arrouye (1996), Clement and Quinn (1994), Kremien (1995), Manwaring et al. (1994), Celenk and Yang (1994), Westphal and Popovic (1994), Ertel (1994), Barr and Hickman (1992), Wieland et al. (1992), Yan and Listgarten (1993), Culler et al. (1996), and Bilardi et al. (1996) are generally applicable to data-parallel, i.e., SIMD, and synchronous-iterative distributed programs (Lalgudi et al., 1994), and may not apply to ADDM algorithms. In addition, all of these efforts except Culler et al. (1996) and Bilardi et al. (1996) are based on either purely theoretical assumptions or limited-scale implementations, and are unable to address the unique characteristics of large-scale systems under ADDM algorithm control. Capon (1992) observes that understanding the behavior of asynchronous parallel programs is extremely difficult, and that this difficulty is chiefly responsible for the limited parallelism reported in the literature. This paper presents a frame of reference for the performance evaluation of ADDM algorithms. While the paper recognizes that the diversity among systems may require the development of novel and unique performance criteria, the frame of reference refers to the ideal or absolute performance standard that must be determined for every such criterion.
The development of the frame of reference encapsulates the authors' actual experiences in the design and development of several large-scale ADDM algorithms.
The remainder of the paper is organized as follows. Section 2 critically reviews the conventional performance metrics from the perspective of ADDM algorithms and motivates the development of a new frame of reference. Section 3 presents the frame of reference for evaluating the performance of ADDM algorithms and illustrates the computation of the absolute standards for a select few real-world systems. Finally, section 4 presents some conclusions.
Section snippets
Inadequacies of the conventional performance metrics for ADDM algorithms
The conventional performance metrics for distributed algorithms include speed-up, scaled speed-up (Gustafson et al., 1988), and isoefficiency (Kumar and Rao, 1987). In addition, Tsitsiklis and Stamoulis (1995) propose communication complexity as a performance metric for asynchronous, distributed systems, which they define as ones where each processor stores in memory a local variable while estimates of the value of the local variable are maintained by each of its
A frame of reference for performance evaluation of ADDM algorithms
ADDM algorithms constitute the underlying, fully distributed control of systems that, in turn, are characterized as large, complex, real-world systems wherein the principal constituents – the sub-components – are geographically dispersed, interact among themselves asynchronously, and are permitted autonomy in local decision-making. Under ADDM control, the overall computational and control tasks of the system are distributed among its sub-components. Tsitsiklis and Stamoulis (1995) cite the
Conclusions
This paper has critically analyzed the role of speed-up from the perspective of an emerging class of distributed algorithms, termed ADDM algorithms. ADDM algorithms constitute the underlying control of many real-world systems, wherein the constituent sub-components are geographically dispersed, interact among themselves asynchronously, and are permitted autonomy in local decision-making. The term real-world implies that such systems are subject to computer control, they relate to
Acknowledgements
The first author gratefully acknowledges the support of the US BMDO and the Army Research Office under grants DAAL03-91-G-0158, DAAH04-93-G-0126, and DAAH04-95-1-0101, and of the National Library of Medicine under grant N01-LM-3-3525.
References (52)

- Scope: an extensible interactive environment for the performance evaluation of parallel systems. Microprocessing and Microprogramming (1996)
- A distributed algorithm for fault simulation of combinatorial and asynchronous sequential digital designs, utilizing circuit partitioning, on loosely-coupled parallel processors. Microelectronics and Reliability – An International Journal (1995)
- et al. Performance properties of large scale parallel systems. Journal of Parallel and Distributed Computing (1993)
- Methodology for predicting performance of distributed and parallel systems. Performance Evaluation (1993)
- Private Network–Network Interface Specification Version 1.0 (PNNI 1.0), ATM Forum, ...
- Barr, R.S., Hickman, B.L., 1992. On reporting the speedup of parallel algorithms: a survey of issues and experts. ...
- et al. Data Networks (1992)
- Bilardi, G., Herley, K., Pietrecaprina, A., Pucci, G., Spirakis, P., 1996. BSP Vs. LogP. In: Proceedings of the 1996 ...
- Braddock, R.L., Claunch, M.R., Rainbolt, J.W., Corwin, B.N., 1992. Operational performance metrics in a distributed ...
- Brehm, J., Madhukar, M., Smirni, E., Dowdy, L., 1995. PerPreT – a performance prediction tool for massively parallel ...
- Modeling and distributed simulation of a broadband-ISDN network on a network of Sun workstations configured as a loosely-coupled parallel processor system. IEEE Computer
- Performance analysis of fast distributed link restoration algorithms. International Journal of Communication Systems
- The communications system architecture of the North American Advanced Train Control System. IEEE Transactions on Vehicular Technology
- LogP: a practical model of parallel computation. Communications of the ACM
- Computer Systems Performance Evaluation
- NOVADIB: a novel architecture for asynchronous distributed real-time banking modeled on loosely-coupled parallel processors. IEEE Transactions on Systems, Man and Cybernetics
- NODIFS – simulating faults fast: asynchronous, distributed, circuit-partitioning based algorithm enables fast fault simulation of digital designs on parallel processors. IEEE Circuits and Devices
- An asynchronous distributed approach for the simulation and verification of behavior-level models on parallel processors. IEEE Transactions on Parallel and Distributed Systems
- Superlinear speedup for parallel implementation of biologically motivated spin glass optimization algorithm. International Journal of Modern Physics C
- Centralized inventory control in multi-location supply systems
Tony S. Lee received the combined Sc.B. and A.B. degrees from Brown University, Providence, Rhode Island, in Computer Engineering and Political Science, and the Sc.M. and Ph.D. degrees in Electrical Engineering, also from Brown University. He is currently a member of technical staff at Vitria Technology in Sunnyvale, and serves in a consulting role for the Networking and Distributed Algorithms Laboratory at Arizona State University, where he continues his research in the areas of network protocols, distributed system design, modeling, and simulation. Dr. Lee has published over a dozen peer-reviewed journal and conference papers in fields ranging from transportation systems and banking systems to military command and control. He is also a co-author, with Dr. Ghosh, of Intelligent Transportation Systems: New Principles and Architectures (CRC Press, 2000). Prior to joining Vitria Technology, Dr. Lee held a position with NASA at Ames Research Center, Moffett Field, California. At NASA, he was involved with several high-speed network research projects, including Network Quality of Service and the Information Power Grid. Dr. Lee is also the author of the ATMSIM software package, a high-fidelity, distributed network simulator which enables network researchers to realistically model new protocols and algorithms and identify complex interactions.
Sumit Ghosh currently serves as the associate chair for research and graduate programs in the Computer Science and Engineering Department at Arizona State University. Prior to ASU, he had been on the faculty at Brown University, Rhode Island, and before that a member of technical staff (principal investigator) at Bell Laboratories Research in Holmdel, New Jersey. Sumit Ghosh received his B.Tech. degree from the Indian Institute of Technology at Kanpur, India, and his M.S. and Ph.D. degrees from Stanford University, California. His industrial experience includes Silvar-Lisco in Menlo Park, CA, Fairchild Advanced Research and Development, and Schlumberger Palo Alto Research Center, in addition to Bell Labs Research. His research interests include fundamental yet practical problems from the disciplines of asynchronous distributed algorithms, stability of complex asynchronous algorithms, networking, deep space networking, impact of topology on network performance, network security attacks in ATM networks, modeling and distributed simulation of complex systems, distributed resource allocation, hardware description languages, continuity of care in medicine, mobile computing, intelligent transportation, geographically distributed visualization, synthetic creativity, qualitative metrics for evaluating advanced graduate courses, issues in the Ph.D. process, and the physics of computer science problems. He is the author of three original monograph books: Hardware Description Languages: Concepts and Principles (IEEE Press); Modelling and Asynchronous Distributed Simulation of Complex Systems (IEEE Press); and Intelligent Transportation Systems: New Principles and Architectures (CRC Press). Five more books are under review/development. He serves on the editorial board of the IEEE Press Book Series on Microelectronic Systems Principles and Practice. Sumit is a US citizen.
Seong-Soon Joo received his B.S. degree from Hanyang University, Seoul, Korea, in 1980 and his M.S. and Ph.D. degrees from Seoul National University in 1982 and 1989, respectively, all in electrical engineering. Since 1983 he has been a member of technical staff with the Electronics and Telecommunications Research Institute in Korea, where he is currently the project leader for high-speed routers. From 1996 to 1997 he was a visiting scientist in the Networking and Distributed Algorithms Laboratory in the Computer Science & Engineering department at Arizona State University. His current research interests include intelligent control of high-speed packet networks, active networks, and generalized networks. He is an editor of the Institute of Electronics Engineers of Korea Reviews.