Abstract
A number of streaming data analysis systems are currently in research or commercial operation. These are generally large-scale distributed systems, but each operates in isolation, under the control of a single administrative authority. We are developing middleware that permits autonomous or semi-autonomous streaming analysis systems (called “sites”) to interoperate, giving them opportunities for data access, performance improvement, and reliability far exceeding those available in a single system. Unique characteristics of our system include an architecture that manages multiple cooperation paradigms, chosen according to the degree of trust and the dependencies among the participating sites; a multisite planner that converts user-specified declarative queries into specifications of distributed jobs; and a mechanism for automatic recovery from site failures by redispatching the failed pieces of a distributed job. We evaluate our architecture through experiments on a running prototype. The results demonstrate the advantages of multisite cooperation: collaborative jobs that share resources, even across only a few sites, can produce results 50% faster than independent execution, and jobs on failed sites can be recovered within a few seconds.
Copyright information
© 2007 IFIP International Federation for Information Processing
About this paper
Cite this paper
Branson, M., Douglis, F., Fawcett, B., Liu, Z., Riabov, A., Ye, F. (2007). CLASP: Collaborating, Autonomous Stream Processing Systems. In: Cerqueira, R., Campbell, R.H. (eds) Middleware 2007. Lecture Notes in Computer Science, vol 4834. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76778-7_18
DOI: https://doi.org/10.1007/978-3-540-76778-7_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76777-0
Online ISBN: 978-3-540-76778-7
eBook Packages: Computer Science (R0)