Introduction

Hadzic, Fedja; Tan, Henry; Dillon, Tharam S.

doi:10.1007/978-3-642-17557-2_1

Part of the book series: Studies in Computational Intelligence (SCI, volume 333)

Abstract

Large amounts of data are collected and stored by government, industrial, commercial, and scientific organizations. As the complexity and volume of the data continue to increase, classifying new, unseen data and extracting useful knowledge from it is becoming practically impossible for humans. Automatic knowledge acquisition is therefore not merely advantageous over manual knowledge acquisition but a necessity, since the probability that a human observer will detect something new and useful is very low given the overwhelming complexity and volume of the information.

About this chapter

Cite this chapter

Hadzic, F., Tan, H., Dillon, T.S. (2011). Introduction. In: Mining of Data with Complex Structures. Studies in Computational Intelligence, vol 333. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17557-2_1

DOI: https://doi.org/10.1007/978-3-642-17557-2_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17556-5
Online ISBN: 978-3-642-17557-2
eBook Packages: Engineering, Engineering (R0)
