ABSTRACT
Supercomputers and clouds both strive to make a large number of computing cores available for computation. More recently, shared objectives such as low power consumption, manageability at scale, and low cost of ownership have been driving their hardware and software toward convergence. Challenges remain, however; one is that current cloud infrastructure does not deliver the performance that many scientific applications require. One source of this performance loss is virtualization, and virtualization of the network in particular. This paper introduces and analyzes a hybrid supercomputer software infrastructure that gives the components that need it direct access to the communication hardware while providing the standard elastic cloud infrastructure for other components.
Index Terms
- Providing a cloud network infrastructure on a supercomputer