The Internet Backplane Protocol: a study in resource sharing

https://doi.org/10.1016/S0167-739X(03)00033-5

Abstract

In this work we present the Internet Backplane Protocol (IBP), a middleware created to allow the sharing of storage resources implemented as part of the network fabric. IBP allows an application to control intermediate data staging operations explicitly. Because IBP follows a very simple philosophy, closely modeled on the Internet Protocol, its semantics may be too weak for some applications; we therefore introduce the exNode, a data structure that aggregates storage allocations on the Internet.

Introduction

It is commonly observed that the continued exponential growth in the capacity of fundamental computing resources—processing power, communication bandwidth, and storage—is working a revolution in the capabilities and practices of the research community. It has become increasingly evident that the most revolutionary applications of this superabundance use resource sharing to enable new possibilities for collaboration and mutual benefit. Over the past 30 years, two basic models of resource sharing with different design goals have emerged. The differences between these two approaches, which we distinguish as the Computer Center and the Internet models, tend to generate divergent opportunity spaces, and it therefore becomes important to explore the alternative choices they present as we plan for and develop an information infrastructure for the scientific community in the next decade.

Interoperability and scalability are necessary design goals for distributed systems based on resource sharing, but the two models we consider differ in the positions they take along the continuum between total control and complete openness. The difference affects the tradeoffs they tend to make in fulfilling their other design goals. The Computer Center model, which came to maturity with the NSF Supercomputer Centers in the 1980s and 1990s, was developed in order to allow scarce and extremely valuable resources to be shared by a select community in an environment where security and accountability are major concerns. The form of sharing it implements is necessarily highly controlled—authentication and access control are its characteristic design issues. In the last few years this approach has given rise to a resource sharing paradigm known as information technology “Grids”. Grids are designed to flexibly aggregate various types of highly distributed resources into unified platforms on which a wide range of “virtual organizations” can build [1]. By contrast, the Internet paradigm, which was developed over the same 30-year period, seeks to share network bandwidth for the purpose of universal communication among an international community of indefinite size. It uses lightweight allocation of network links via packet routing in a public infrastructure to create a system that is designed to be open and easy to use, both in the sense of giving easy access to a basic set of network services and of allowing easy addition of privately provisioned resources to the public infrastructure. While admission and accounting policies are difficult to implement in this model, the power of the universality and generality of the resource sharing it implements is undeniable.

Though experience with the Internet suggests that the transformative power of information technology is at its highest when the ease and openness of resource sharing are at their greatest, the Computer Center model is experiencing a rebirth in the Grid while the Internet paradigm has yet to be applied to any resource other than communication bandwidth. But we believe that the Internet model can be applied to other kinds of resources, and that, with the current Internet and the Web as a foundation, such an application can lead to similarly powerful results. The storage technology we have developed, called the Internet Backplane Protocol (IBP), is designed to test this hypothesis and explore its implications. IBP is our primary tool in the study of logistical networking, a field motivated by viewing data transmission and storage within a unified framework. In this paper we explain the way in which IBP applies the Internet model to storage, describe the current API and the software that implements it, lay out the design issues which we are working to address, and finally characterize the future directions of the work.

Section snippets

Background: the Internet Protocol and IBP

IBP is a mechanism developed for the purpose of sharing storage resources across networks ranging from rack-mounted clusters in a single machine room to global networks [2], [3], [4]. To approximate the openness of the Internet paradigm for the case of storage, the design of IBP parallels key aspects of the design of IP, in particular IP datagram delivery. This service is based on packet delivery at the link level, but with more powerful and abstract features that allow it to scale globally.

The IBP service, client API, protocol, and current software

IBP storage resources are managed by “depots”, or servers, on which clients perform remote storage operations. As shown in Table 1, the IBP client calls fall into three different groups.

IBP_allocate is used to allocate a byte array at an IBP depot, specifying the size, duration, reliability, whether the allocation can be revoked by the depot before the end of its lifetime, and the read/write semantics: append mode, truncate on write, FIFO or unsynchronized circular queue. A successful …
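
As a concrete illustration of the call pattern this API implies, the listing below walks through the allocate/store/load sequence in C. It is a minimal, self-contained sketch only: the types and the names sketch_allocate, sketch_store and sketch_load are invented stand-ins that simulate a depot in local memory, and the authoritative structures and signatures are those of the IBP API 1.0 technical report (Bassi et al.), not the simplified ones shown here.

/*
 * Minimal sketch of the IBP allocate / store / load call pattern.  All types
 * and functions are simplified stand-ins that simulate a depot in local
 * memory; the real C bindings are defined in the IBP API 1.0 report.
 */
#include <stdio.h>
#include <string.h>

typedef struct { int duration_s; int revocable; } ibp_attributes_t;   /* lifetime, volatility */
typedef struct { char read[64], write[64], manage[64]; } ibp_caps_t;  /* capability strings   */

static char depot_storage[1024];            /* stand-in for one byte array on a depot */

/* Stand-in for IBP_allocate: reserve a byte array, return its three capabilities. */
static ibp_caps_t sketch_allocate(size_t size, ibp_attributes_t attr)
{
    (void)size; (void)attr;                 /* a real depot would enforce both */
    ibp_caps_t caps;
    strcpy(caps.read,   "ibp://depot.example.org/0#rd");
    strcpy(caps.write,  "ibp://depot.example.org/0#wr");
    strcpy(caps.manage, "ibp://depot.example.org/0#mg");
    return caps;
}

/* Stand-in for IBP_store: write data through the write capability. */
static size_t sketch_store(const char *write_cap, const char *data, size_t len)
{
    (void)write_cap;
    memcpy(depot_storage, data, len);
    return len;
}

/* Stand-in for IBP_load: read data back through the read capability. */
static size_t sketch_load(const char *read_cap, char *buf, size_t len, size_t off)
{
    (void)read_cap;
    memcpy(buf, depot_storage + off, len);
    return len;
}

int main(void)
{
    ibp_attributes_t attr = { 3600, 1 };               /* 1 h lifetime, revocable     */
    ibp_caps_t caps = sketch_allocate(1 << 20, attr);  /* group 1: storage management */

    const char *msg = "logistical networking";
    sketch_store(caps.write, msg, strlen(msg) + 1);    /* group 2: data transfer      */

    char buf[64];
    sketch_load(caps.read, buf, strlen(msg) + 1, 0);
    printf("read back: %s\n", buf);
    return 0;
}

The point of the sketch is the separation of concerns: allocation yields distinct read, write and manage capabilities, and all subsequent data-transfer and management calls name a capability rather than a file or a depot account.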

IBP design issues

A design for shared network storage technology that draws on the strengths of the Internet model must also cope with its weaknesses and limitations. In this section we sketch the key design issues that we have encountered in developing IBP.

Data caching in NetSolve

NetSolve is a widely known project whose aim is to provide remote access to computational resources, both hardware and software. In a distributed computation over a wide area network, data can be produced at any location and consumed at any other, and it can be difficult to find the ideal locations for the producer of the data, its consumer, and the buffer where the data are stored.

To implement a system where globally distributed caches cooperate to move data near consuming …
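
The listing below sketches, in C, the kind of staging decision this implies: among a set of candidate IBP depots, choose the one predicted to be closest to the consumer and cache the intermediate result there. It is an illustrative sketch only, not NetSolve code; the depot names, latency figures and the pick_staging_depot helper are hypothetical, and a real deployment would obtain latency estimates from a forecasting service (for example the Network Weather Service) rather than from hard-coded values.

/*
 * Illustrative sketch (not NetSolve code) of the staging decision discussed
 * above: among candidate IBP depots, pick the one closest to the consumer so
 * that intermediate results are cached near where they will be used.  Depot
 * names and latencies are invented for the example.
 */
#include <stdio.h>

typedef struct {
    const char *host;            /* depot address                                  */
    double rtt_to_consumer_ms;   /* predicted latency, e.g. from a forecaster such */
                                 /* as the Network Weather Service                 */
} depot_candidate_t;

/* Return the index of the candidate with the lowest predicted latency. */
static int pick_staging_depot(const depot_candidate_t *c, int n)
{
    int best = 0;
    for (int i = 1; i < n; i++)
        if (c[i].rtt_to_consumer_ms < c[best].rtt_to_consumer_ms)
            best = i;
    return best;
}

int main(void)
{
    depot_candidate_t candidates[] = {
        { "depot-a.example.org", 41.0 },
        { "depot-b.example.org",  7.5 },   /* nearest to the consumer */
        { "depot-c.example.org", 88.0 },
    };
    int n = (int)(sizeof(candidates) / sizeof(candidates[0]));
    int best = pick_staging_depot(candidates, n);

    /* The producer would then allocate storage on this depot (IBP_allocate)
       and store its intermediate result there for the consumer to load later. */
    printf("stage intermediate data at %s\n", candidates[best].host);
    return 0;
}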

Related work

IBP occupies an architectural niche similar to that of network file systems such as AFS [12] and network attached storage appliances [13], but its model of storage is more primitive, making it similar in some ways to storage area networking (SAN) technologies developed for local networks. In the Grid community, projects such as GASS [14] and the SDSC storage resource broker [15] are file system overlays that implement a uniform file access interface and also impose uniform directory, …

Conclusions

While some ways of engineering for resource sharing, such as the Computer Center model, focus on optimizing the use of scarce resources within selected communities, the exponential growth in all areas of computing resources has created the opportunity to explore a different problem, viz. designing new architectures that can take more meaningful advantage of this bounty. The approach presented in this paper is based on the Internet model of resource sharing and represents one general way of …

Acknowledgements

This work is supported by the National Science Foundation Next Generation Software Program under grant no. EIA-9975015, the Department of Energy Scientific Discovery through Advanced Computing Program under grant no. DE-FC02-01ER25465, and by the National Science Foundation Internet Technologies Program under grant no. ANI-9980203. The infrastructure used in this work was supported by the NSF CISE Research Infrastructure Program under grant no. EIA-9972889. The authors would like to acknowledge …

References (16)

  • I. Foster, C. Kesselman, S. Tuecke, The anatomy of the grid: enabling scalable virtual organizations, Int. J....
  • J. Plank, M. Beck, W. Elwasif, T. Moore, M. Swany, R. Wolski, The Internet Backplane Protocol: storage in the network,...
  • M. Beck, T. Moore, J. Plank, M. Swany, Logistical networking: sharing more than the wires, in: S. Hariri, C. Lee, C....
  • A. Bassi, M. Beck, J. Plank, R. Wolski, Internet Backplane Protocol: API 1.0, CS Technical Report, ut-cs-01-455,...
  • M. Beck, T. Moore, J.S. Plank, Exposed vs. encapsulated approaches to grid service architecture, in: Proceedings of the...
  • D.C. Arnold et al., On the convergence of computational and data grids, Parallel Proc. Lett. (2001)
  • D.C. Arnold, D. Bachmannd, J. Dongarra, Request sequencing: optimizing communication for the grid, in: A. Bode, T....
  • H. Casanova, A. Legrand, D. Zagorodnov, F. Berman, Heuristics for scheduling parameter sweep applications in grid...

Cited by (34)

  • Position paper: Open web-distributed integrated geographic modelling and simulation to enable broader participation and applications

    2020, Earth-Science Reviews
    Citation Excerpt:

    From this perspective, technologies related to network security (e.g., local network, private cloud), usage category assignment (e.g., free use, commercial use, or private use) and illegal usage control (e.g., cracking and decompilation) could be employed as references. Encouraging resource owners to make their resources available to communities is another challenge in resource sharing processes (Bartol and Srivastava, 2002; Bassi et al., 2003; Chard et al., 2012) and involves at least two points. The first point is related to community building - forming teams that include resource owners, users and related stakeholders.

  • A high performance suite of data services for grids

    2010, Future Generation Computer Systems
    Citation Excerpt:

    Alongside with these two basic, most-popular alternatives, there are many others aimed to provide more sophisticated data grid features. Internet Backplane Protocol (IBP) [31,32], Kangaroo [33] and Storage Resource Broker (SRB) [34] are examples of them. The first one, IBP, is an end-to-end data movement mechanism that tries to optimize the data movement from a node to another one by storing information at intermediate locations.

  • Accelerating tropical cyclone analysis using LambdaRAM, a distributed data cache over wide-area ultra-fast networks

    2009, Future Generation Computer Systems
    Citation Excerpt:

    In fact, a future goal in LambdaRAM is to use it as a caching layer for Parallel Filesystems and provide access to multi-dimensional data stored in these filesystems over wide-area networks. Distributed file systems, including, Storage Resource Broker [19], Storage Resource Manager [20] and IBP [21], and P2P-based storage systems, including Sector [22], provide efficient access to files over wide-area networks. While these filesystems work on file-level granularity, LambdaRAM works at the granularity of multi-dimensional scientific datasets spread over multiple files, and can harness the memory of multiple intermediate clusters.

  • DDD: Distributed Dataset DNS

    2022, Cluster Computing
  • Analyzing Scientific Data Sharing Patterns for In-network Data Caching

    2021, SNTA 2021 - Proceedings of the 2021 Systems and Network Telemetry and Analytics, co-located with HPDC 2021

Alessandro Bassi Despite being tempted by other fields requiring the same mix of creativity and scientific rigour, Alessandro Bassi attended the Information Science Faculty at the University of Studies in Milan, earning his Laurea degree in 1994. His research interests slowly shifted from Artificial Intelligence and Fuzzy Logic to Software Engineering, and finally to Networking and Distributed Computing; he currently holds a position as Senior Research Associate at the University of Tennessee.

Micah Beck is an active contributor to research ranging from Parallel and Distributed Systems to Languages and Compilers to Advanced Internetworking and Content Distribution. He began his career doing research in distributed operating systems at Bell Laboratories and received his PhD in Computer Science from Cornell University (1992) in the area of parallelizing compilers. He then joined the faculty of the Computer Science Department at the University of Tennessee, where he is currently an Associate Professor working in distributed high performance computing, networking and storage.

An active participant in the Internet2 project since 1997, Dr. Beck chairs its Network Storage Working Group. In 2000 he joined with Internet2 participants drawn from industry and academia to found Lokomo Systems, a company in the area of advanced content distribution, where he serves as Chief Scientist. In 2002 he joined with colleagues within the University of Tennessee’s Computer Science Department to found the Logistical Computing and Internetworking Laboratory, which he now co-directs and which comprises 17 faculty, students and research staff.

Terry Moore received his BA in Philosophy from Purdue University in 1972 and his PhD in Philosophy from the University of North Carolina, Chapel Hill in 1993. He is currently Associate Director of the Logistical Computing and Internetworking Laboratory and the Center for Information Technology Research at the Computer Science Department at University of Tennessee. His research interests include network storage and logistical networking. He is a member of the Association for Computing Machinery.

James S. Plank received his BS from Yale University in 1988 and his PhD from Princeton University in 1993. He is currently an associate professor in the Computer Science department at the University of Tennessee. His research interests are in network storage, fault-tolerance and Grid computing. He is a member of the IEEE Computer Society.

Martin Swany received his BA and MS from the University of Tennessee in 1992 and 1998, respectively. He is currently a PhD candidate in the Computer Science Department of the University of California at Santa Barbara. His research interests include grid and distributed computing.

Rich Wolski received his BS in Computer Science from the California Polytechnic State University at San Luis Obispo, and his MS and PhD degrees from the University of California Davis, LLNL campus. He is currently an associate professor of Computer Science at the University of California, Santa Barbara. His research interests include Computational Grid computing, distributed computing, scheduling, and resource allocation. He leads the Network Weather Service project which focuses on on-line prediction of resource performance. He has developed EveryWare—a set of tools for portable Grid programming. He is also leading the G-commerce project studying computational economies for the Grid.

Graham Fagg received his BSc in Computer Science and Cybernetics from the University of Reading (UK) in 1991 and a PhD in Computer Science in 1998. From 1991 to 1993, he worked on CASE tools for interconnecting array processors and Transputer MIMD systems. From 1994 to 1995 he was a research assistant at the Cluster Computing Laboratory at the University of Reading, working on code generation tools for group communications. From 1996 to 2001 he worked as a senior research associate and then a Research Assistant Professor at the University of Tennessee. From 2001 to 2002 he was a visiting guest scientist at the High Performance Computing Center Stuttgart (HLRS). Currently he is a Research Associate Professor at the University of Tennessee. His current research interests include distributed scheduling, resource management, performance prediction, benchmarking, cluster management tools, parallel and distributed I/O, and high-speed networking. He is currently involved in the development of a number of metacomputing and Grid middleware systems, including SNIPE, MPI_Connect, HARNESS, and a fault-tolerant MPI implementation (FT-MPI).
