From Alternative Clustering to Robust Clustering and Its Application to Gene Expression Data

  • Conference paper
Intelligent Data Engineering and Automated Learning - IDEAL 2011 (IDEAL 2011)

Abstract

The main contribution of this paper is a parameter-free clustering approach that assigns the given data instances to the most appropriate clusters. The approach proceeds in several steps. First, we apply a multi-objective genetic algorithm to find alternative clustering solutions that constitute the Pareto front. The result is a pool of the clusters reported by all the solutions. Then, we determine the homogeneity of each cluster in the pool and keep the most homogeneous clusters. These may not all come from a single solution, because the solution favored most under the multiple objectives might contain some clusters that are less homogeneous than the best clusters in other solutions. Finally, because a given data instance may belong to more than one cluster in the solution set, we reduce its membership to the single cluster whose centroid is closest to the instance. Many applications, such as gene expression data analysis, need such a parameter-free approach because the correctness of the post-processing is directly affected by the outcome of the clustering process. We demonstrate the applicability and effectiveness of the proposed clustering approach through experiments on two benchmark data sets.
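The pooling, selection, and membership-reduction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the genetic search is skipped and the alternative solutions are simply passed in as label lists, the homogeneity measure (mean distance to the centroid) is an assumption, and so is the greedy rule that keeps a cluster only if it covers an instance not yet covered. All function names are hypothetical.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def pool_clusters(solutions):
    """Collect the clusters of every alternative solution into one pool.

    Each solution is a list of labels, one per instance. A pooled cluster is
    a frozenset of instance indices, so identical clusters are merged.
    """
    pool = set()
    for labels in solutions:
        members = {}
        for idx, label in enumerate(labels):
            members.setdefault(label, set()).add(idx)
        pool.update(frozenset(m) for m in members.values())
    return pool


def homogeneity(data, cluster):
    """Assumed measure: mean distance to the centroid (lower is tighter)."""
    pts = [data[i] for i in cluster]
    c = centroid(pts)
    return sum(dist(p, c) for p in pts) / len(pts)


def select_and_resolve(data, solutions):
    """Keep the tightest pooled clusters, then resolve overlapping membership."""
    # Rank pooled clusters from most to least homogeneous; ties are broken
    # by member indices so the result is deterministic.
    ranked = sorted(pool_clusters(solutions),
                    key=lambda cl: (homogeneity(data, cl), sorted(cl)))

    # Assumed selection rule: keep a cluster only if it covers at least one
    # instance that no tighter kept cluster already covers.
    kept, covered = [], set()
    for cl in ranked:
        if not cl <= covered:
            kept.append(cl)
            covered |= cl

    # An instance may sit in several kept clusters; assign it to the one
    # whose centroid is closest.
    centroids = [centroid([data[i] for i in cl]) for cl in kept]
    assignment = []
    for i, point in enumerate(data):
        candidates = [k for k, cl in enumerate(kept) if i in cl]
        assignment.append(min(candidates, key=lambda k: dist(point, centroids[k])))
    return kept, assignment


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (5, 5), (10, 10), (10, 11)]
    solutions = [[0, 0, 0, 1, 1], [0, 0, 1, 1, 1]]  # two alternative solutions
    kept, assignment = select_and_resolve(data, solutions)
    print(kept, assignment)
```

Note that under this assumed measure a single-instance cluster has homogeneity zero and is always ranked first, so a practical version would need a measure or a rule that handles very small clusters.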

This study was supported by the Scientific and Technical Research Council of Turkey (Grant number TÜBITAK EEEAG 109E241).

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Peng, P. et al. (2011). From Alternative Clustering to Robust Clustering and Its Application to Gene Expression Data. In: Yin, H., Wang, W., Rayward-Smith, V. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2011. IDEAL 2011. Lecture Notes in Computer Science, vol 6936. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23878-9_50

  • DOI: https://doi.org/10.1007/978-3-642-23878-9_50

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23877-2

  • Online ISBN: 978-3-642-23878-9

  • eBook Packages: Computer Science, Computer Science (R0)
