Large-Scale Graph Classification Based on Evolutionary Computation with MapReduce

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9313)

Abstract

Discriminative subgraph mining from a large collection of graph objects is a crucial problem for graph classification. Several main-memory approaches have been proposed to mine discriminative subgraphs, but they scale poorly and are not suitable for large-scale graph databases. Based on the MapReduce model, we propose an efficient method, MRGAGC, for discriminative subgraph mining. MRGAGC employs an iterative MapReduce framework: each map step applies evolutionary computation with three evolutionary strategies to generate a set of locally optimal discriminative subgraphs, and the reduce step aggregates all the discriminative subgraphs and outputs the result. The iteration terminates once the stopping-condition threshold is met. Finally, we employ subgraph coverage rules to build graph classifiers from the discriminative subgraphs mined by MRGAGC. Extensive experiments on both real and synthetic datasets show that MRGAGC outperforms the other approaches in terms of both classification accuracy and runtime efficiency.
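
The loop described in the abstract (evolutionary map steps, an aggregating reduce step, a stopping threshold, then coverage-based classification) can be illustrated with a minimal single-process sketch. It is not the MRGAGC implementation. Everything in it is an assumption made for illustration: graphs are reduced to sets of labeled edges, "pattern occurs in graph" is approximated by edge-set inclusion rather than subgraph isomorphism, a single edge-extension mutation stands in for the paper's three evolutionary strategies (which the abstract does not specify), and the names `score`, `map_step`, `reduce_step`, `mine`, and `classify` are hypothetical. Each partition is assumed to hold at least one positive and one negative graph.

```python
import random

# Assumption: a graph is a frozenset of labeled edges, and "pattern p occurs
# in graph g" is approximated by p <= g (no subgraph-isomorphism test).

def score(pattern, pos, neg):
    """Discrimination score: positive support minus negative support."""
    pos_sup = sum(pattern <= g for g in pos) / len(pos)
    neg_sup = sum(pattern <= g for g in neg) / len(neg)
    return pos_sup - neg_sup

def rank(patterns, pos, neg):
    """Sort patterns by descending score, breaking ties deterministically."""
    return sorted(patterns, key=lambda p: (-score(p, pos, neg), sorted(p)))

def map_step(partition, seeds, rng, generations=5, keep=3):
    """Evolve candidates on one partition; return its locally best patterns."""
    pos, neg = partition
    edges = sorted(set().union(*pos))
    # Start from the previous round's patterns, or from single-edge patterns.
    population = set(seeds) or {frozenset([e]) for e in edges}
    for _ in range(generations):
        # Stand-in mutation: extend each candidate by one random edge.
        children = {p | {rng.choice(edges)}
                    for p in sorted(population, key=sorted)}
        population = set(rank(population | children, pos, neg)[:keep])
    return population

def reduce_step(candidate_sets, pos, neg, keep=3):
    """Merge all local candidates and keep the globally best ones."""
    return rank(set().union(*candidate_sets), pos, neg)[:keep]

def mine(partitions, max_iters=10, min_gain=1e-9, seed=0):
    """Iterate map and reduce until the best score stops improving."""
    rng = random.Random(seed)
    pos = [g for p, _ in partitions for g in p]
    neg = [g for _, n in partitions for g in n]
    best, best_score = [], float("-inf")
    for _ in range(max_iters):
        local = [map_step(part, best, rng) for part in partitions]
        merged = reduce_step(local + [set(best)], pos, neg)
        top = score(merged[0], pos, neg)
        if top - best_score < min_gain:  # stopping-condition threshold
            break
        best, best_score = merged, top
    return best

def classify(graph, patterns):
    """Coverage rule: label positive if any mined pattern occurs in the graph."""
    return any(p <= graph for p in patterns)

def graph(*edges):
    return frozenset(edges)

# Toy data: two partitions, each a (positive graphs, negative graphs) pair.
partitions = [
    ([graph(("C", "N"), ("N", "O"), ("C", "C")), graph(("C", "N"), ("N", "O"))],
     [graph(("C", "C"), ("C", "O")), graph(("C", "N"), ("C", "C"))]),
    ([graph(("C", "N"), ("N", "O"), ("C", "O"))],
     [graph(("C", "O"), ("C", "C")), graph(("N", "O"), ("C", "C"))]),
]
all_pos = [g for p, _ in partitions for g in p]
all_neg = [g for _, n in partitions for g in n]
patterns = mine(partitions)
print(patterns[0], score(patterns[0], all_pos, all_neg))
```

In a real MapReduce deployment each `map_step` call would run on a separate worker over its own partition, and only the small candidate sets would be shuffled to the reducer; the list comprehension in `mine` merely simulates that here.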


© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Wang, Z., Zhao, Y., Wang, G., Cheng, Y. (2015). Large-Scale Graph Classification Based on Evolutionary Computation with MapReduce. In: Cheng, R., Cui, B., Zhang, Z., Cai, R., Xu, J. (eds) Web Technologies and Applications. APWeb 2015. Lecture Notes in Computer Science, vol 9313. Springer, Cham. https://doi.org/10.1007/978-3-319-25255-1_19

  • DOI: https://doi.org/10.1007/978-3-319-25255-1_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25254-4

  • Online ISBN: 978-3-319-25255-1

  • eBook Packages: Computer Science, Computer Science (R0)
