Elsevier

Information Processing Letters

Volume 93, Issue 3, 14 February 2005, Pages 143-147
An optimal hierarchical clustering algorithm for gene expression data

https://doi.org/10.1016/j.ipl.2004.11.001

Abstract

Microarrays measure the expression levels of thousands of genes simultaneously. Clustering algorithms are applied to gene expression data to find co-regulated genes. A commonly used strategy is the hierarchical clustering algorithm based on the Pearson correlation coefficient presented in [Proc. Nat. Acad. Sci. 95 (25) (1998) 14863–14868], which takes O(N^3) time. We note that this running time can be reduced to O(N^2) by applying known hierarchical clustering algorithms [Proc. 9th Annual ACM-SIAM Symposium on Discrete Algorithms, 1998, pp. 619–628] to this problem. In this paper, we present an algorithm that runs in O(N log N) time using a geometrical reduction and show that it is optimal.
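
As a rough illustration of the setting only, and not the algorithm of this paper, the sketch below shows two things in Python. First, a naive agglomerative loop that rescans all pairs at each of the N - 1 merges, which is where a cubic running time comes from. Second, a well-known identity that may be the kind of geometric view the abstract alludes to (this is an assumption, since the abstract does not describe the reduction): once each profile is centered and scaled to unit norm, the squared Euclidean distance equals 2(1 - r), so the most correlated pair is also the closest pair. The function names, the size-weighted mean used to represent a merged cluster, and the sample profiles are all hypothetical choices made for this sketch.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def standardize(x):
    """Center a profile to zero mean and scale it to unit Euclidean norm."""
    mx = sum(x) / len(x)
    centered = [a - mx for a in x]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered]


def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def naive_cluster(profiles):
    """Repeatedly merge the most correlated pair until one cluster remains.

    Assumption of this sketch: a merged cluster is represented by the
    size-weighted mean of its members. Every merge rescans all remaining
    pairs, so N - 1 merges cost on the order of N^3 correlation evaluations.
    Returns the list of merged id pairs; new clusters get ids N, N + 1, ...
    """
    clusters = {i: (list(p), 1) for i, p in enumerate(profiles)}
    merges = []
    next_id = len(profiles)
    while len(clusters) > 1:
        a, b = max(
            combinations(clusters, 2),
            key=lambda pair: pearson(clusters[pair[0]][0], clusters[pair[1]][0]),
        )
        (va, na), (vb, nb) = clusters.pop(a), clusters.pop(b)
        merged = [(na * p + nb * q) / (na + nb) for p, q in zip(va, vb)]
        clusters[next_id] = (merged, na + nb)
        merges.append((a, b))
        next_id += 1
    return merges


# Hypothetical expression profiles (four genes, four conditions each).
genes = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.5, 8.0],
    [4.0, 3.0, 1.0, 0.5],
    [1.0, 3.0, 2.0, 5.0],
]

r = pearson(genes[0], genes[1])
d2 = squared_distance(standardize(genes[0]), standardize(genes[1]))
merges = naive_cluster(genes)
```

Here `d2` agrees with `2 * (1 - r)` up to floating-point error, and `merges` records the order in which clusters were joined. Faster algorithms avoid rescanning every pair after each merge; how the paper achieves its stated bound is not given in the abstract.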

