Abstract
High-performance interconnects play an essential role in the performance and functionality of modern large-scale computational systems, including datacenters and high-performance computing (HPC) architectures. Commercial datacenter applications require that many small, independent tasks be performed rapidly in parallel, with upper bounds on individual task delays. Because of this emphasis on large numbers of tasks with limited intertask dependencies, such computing is known as capacity computing. The term high-performance computing, by contrast, traditionally refers to large-scale applications running exclusively on a large system; this type of computing is referred to as capability computing. It requires that the system coordinate a small number of applications over a large number of resources (e.g., nodes) and is generally most useful for computation-intensive scientific applications. Cloud computing has begun to move HPC applications from dedicated machines into the cloud, where they can take advantage of commercial datacenter infrastructure.
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
Acknowledgments
This work was supported in part by the Natural Sciences and Engineering Research Council of Canada Grant #RGPIN/238964-2011; Canada Foundation for Innovation and Ontario Innovation Trust Grant #7154; U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research, under Contract DE-AC02-06CH11357; and the National Science Foundation Grant #0702182.
Copyright information
© 2015 Springer Science+Business Media New York
About this chapter
Cite this chapter
Grant, R., Rashti, M., Balaji, P., Afsahi, A. (2015). Scalable Network Communication Using Unreliable RDMA. In: Khan, S., Zomaya, A. (eds) Handbook on Data Centers. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2092-1_12
DOI: https://doi.org/10.1007/978-1-4939-2092-1_12
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-2091-4
Online ISBN: 978-1-4939-2092-1
eBook Packages: Computer Science (R0)