Abstract
Data quality has become a major concern for organisations. The rapid growth in the size and technology of databases and data warehouses has brought significant advantages in accessing, storing, and retrieving information. At the same time, rapid data throughput and heterogeneous access make it challenging to maintain high data quality. Yet, despite its importance, the literature has usually reduced data quality to detecting and correcting poor data such as outliers and incomplete or inaccurate values. As a result, organisations are unable to assess data quality efficiently and effectively. An accurate and proper data quality assessment method will enable users to benchmark their systems and monitor their improvement. This paper introduces a granule mining method for measuring the degree of randomness of erroneous data, which will enable decision makers to conduct accurate quality assessment and locate the most severely affected data, thereby providing an accurate estimate of the human and financial resources needed for quality improvement tasks.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Alkharboush, N., Li, Y. (2012). A Decision Table Method for Randomness Measurement. In: Watada, J., Watanabe, T., Phillips-Wren, G., Howlett, R., Jain, L. (eds) Intelligent Decision Technologies. Smart Innovation, Systems and Technologies, vol 15. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29977-3_2
DOI: https://doi.org/10.1007/978-3-642-29977-3_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29976-6
Online ISBN: 978-3-642-29977-3
eBook Packages: Engineering, Engineering (R0)