Abstract
As many domains employ ever more complex systems of systems, capturing provenance among component systems is increasingly important. Applications such as intrusion detection, load balancing, traffic routing, and insider threat detection all involve monitoring and analyzing data provenance. Implicit in these applications is the assumption that “good” provenance is captured (e.g., complete provenance graphs, or one full path). When attempting to provide “good” provenance for a complex system of systems, it is necessary to know “how hard” provenance-enabling will be and what quality of provenance is likely to be produced. In this work, we provide analytical results and simulation tools to assist in scoping the provenance-enabling process. We provide use cases of complex systems of systems within which users wish to capture provenance. We describe the parameters that must be taken into account when provenance-enabling a system of systems. We provide a tool that models the interactions and types of capture agents involved in a complex system of systems, including the sets of known and unknown systems in the environment. The tool estimates the quantity and type of capture agents that will need to be deployed to provenance-enable a complex system that is not completely known.
Approved for Public Release #16-0858. The authors’ affiliation with The MITRE Corporation is provided for identification purposes only, and is not intended to convey or imply MITRE’s concurrence with, or support for, the positions, opinions or viewpoints expressed by the author.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Gammack, D., Scott, S., Chapman, A.P. (2016). Modelling Provenance Collection Points and Their Impact on Provenance Graphs. In: Mattoso, M., Glavic, B. (eds) Provenance and Annotation of Data and Processes. IPAW 2016. Lecture Notes in Computer Science, vol 9672. Springer, Cham. https://doi.org/10.1007/978-3-319-40593-3_12
DOI: https://doi.org/10.1007/978-3-319-40593-3_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-40592-6
Online ISBN: 978-3-319-40593-3
eBook Packages: Computer Science, Computer Science (R0)