Abstract
Operating system data provenance has a range of applications, such as security monitoring, debugging heterogeneous runtime environments, and profiling complex applications. However, fine-grained collection of provenance over extended periods of time can result in large amounts of metadata. Xie et al. describe an algorithm that leverages the subgraph similarity and locality of reference in provenance graphs to perform batch compression. We build on their effort to construct an online version that can perform streaming compression in SPADE. Our optimizations provide both performance and compression improvements over their baseline.
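The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea it names: exploiting similarity between nearby adjacency lists (reference encoding against a sliding window of recent vertices) together with gap encoding of what remains, applied one vertex at a time as records stream in. It is not the authors' implementation and not SPADE's API. The class names, the `window` parameter, and the record layout `(offset, copy_mask, gaps)` are all assumptions made for illustration, and vertex identifiers are assumed to be non-negative integers.

```python
from collections import deque


def to_gaps(values):
    """Gap-encode a sorted list of integers (first value, then differences)."""
    gaps, prev = [], 0
    for v in values:
        gaps.append(v - prev)
        prev = v
    return gaps


def from_gaps(gaps):
    """Invert to_gaps by taking running sums."""
    values, total = [], 0
    for g in gaps:
        total += g
        values.append(total)
    return values


class StreamingAdjacencyCompressor:
    """Hypothetical encoder: one record per incoming vertex."""

    def __init__(self, window=8):
        # Only the most recent `window` adjacency lists are kept (locality).
        self.window = deque(maxlen=window)

    def encode(self, vertex_id, neighbors):
        neighbors = sorted(set(neighbors))
        best_offset, best_shared = 0, set()
        # Look for the recent vertex whose list overlaps most (similarity).
        for offset, (_, ref) in enumerate(reversed(self.window), start=1):
            shared = set(ref) & set(neighbors)
            if len(shared) > len(best_shared):
                best_offset, best_shared = offset, shared
        copy_mask = []
        if best_offset:
            ref = self.window[-best_offset][1]
            copy_mask = [n in best_shared for n in ref]
        # Neighbors not covered by the reference are stored as gaps.
        residual = [n for n in neighbors if n not in best_shared]
        self.window.append((vertex_id, tuple(neighbors)))
        return (best_offset, copy_mask, to_gaps(residual))


class StreamingAdjacencyDecompressor:
    """Hypothetical decoder: mirrors the encoder's window to rebuild lists."""

    def __init__(self, window=8):
        self.window = deque(maxlen=window)

    def decode(self, vertex_id, record):
        offset, copy_mask, gaps = record
        copied = []
        if offset:
            ref = self.window[-offset][1]
            copied = [n for n, keep in zip(ref, copy_mask) if keep]
        neighbors = sorted(copied + from_gaps(gaps))
        self.window.append((vertex_id, tuple(neighbors)))
        return neighbors


# Usage: three vertices arriving in order, each with a list of ancestor ids.
stream = [(10, [1, 2, 3]), (11, [1, 2, 3, 7]), (12, [2, 3, 9])]
encoder = StreamingAdjacencyCompressor(window=4)
decoder = StreamingAdjacencyDecompressor(window=4)
records = [encoder.encode(v, nbrs) for v, nbrs in stream]
decoded = [decoder.decode(v, rec) for (v, _), rec in zip(stream, records)]
```

Because the encoder and decoder maintain identical windows, each record can be decoded as soon as it arrives, which is the property that distinguishes a streaming scheme from a batch one that needs the whole graph up front. The window size trades memory and search time against the chance of finding a good reference list.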
M. Bru: While visiting SRI.
References
Chapman, A., Jagadish, H., Ramanan, P.: Efficient provenance storage. In: 34th ACM International Conference on Management of Data (SIGMOD) (2008)
Gehani, A., Kazmi, H., Irshad, H.: Scaling SPADE to “big provenance”. In: 8th USENIX Workshop on Theory and Practice of Provenance (TaPP) (2016)
Jeannot, E., Knutsson, B., Bjorkman, M.: Adaptive online data compression. In: 11th IEEE International Symposium on High Performance Distributed Computing (HPDC) (2002)
Li, X., Xu, X., Malik, T.: Interactive provenance summaries for reproducible science. In: 12th IEEE Conference on e-Science (2016)
Xie, Y., Muniswamy-Reddy, K.-K., Feng, D., Li, Y., Long, D.: Evaluation of a hybrid approach for efficient provenance storage. ACM Trans. Storage (TOS), 9(4) (2013)
Acknowledgements
This material is based upon work supported by the National Science Foundation under Grant ACI-1547467. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Ahmad, R., Bru, M., Gehani, A. (2018). Streaming Provenance Compression. In: Belhajjame, K., Gehani, A., Alper, P. (eds) Provenance and Annotation of Data and Processes. IPAW 2018. Lecture Notes in Computer Science(), vol 11017. Springer, Cham. https://doi.org/10.1007/978-3-319-98379-0_27
DOI: https://doi.org/10.1007/978-3-319-98379-0_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98378-3
Online ISBN: 978-3-319-98379-0
eBook Packages: Computer Science, Computer Science (R0)