Abstract
Constrained clustering is becoming an increasingly popular approach in data mining. It offers a balance between supervised methods, which require the complex task of formally defining thematic classes, and unsupervised approaches, which ignore expert knowledge and intuition. Nevertheless, the application of constrained clustering to time-series analysis remains relatively unexplored. This is partly because the Euclidean distance metric, which is typically used in data mining, is unsuitable for time-series data. This article addresses this gap by presenting an exhaustive review of constrained clustering algorithms and by modifying publicly available implementations to use a more appropriate distance measure, dynamic time warping. It presents a comparative study in which their performance is evaluated when applied to time-series. It is found that k-means based algorithms become computationally expensive and unstable under these modifications. Spectral approaches are easily applied and offer state-of-the-art performance, whereas declarative approaches are also easily applied and guarantee constraint satisfaction. An analysis of the results identifies several factors that influence an algorithm’s performance when constraints are introduced.
References
Aghabozorgi S, Shirkhorshidi A, Wah T (2015) Time-series clustering: a decade review. Inf Syst 53:16–38
Al-Razgan M, Domeniconi C (2009) Clustering ensembles with active constraints. Springer, Berlin, pp 175–189
Aloise D, Deshpande A, Hansen P, Popat P (2009) NP-hardness of Euclidean sum-of-squares clustering. Mach Learn 75(2):245–248
Aloise D, Hansen P, Liberti L (2012) An improved column generation algorithm for minimum sum-of-squares clustering. Math Program 131(1–2):195–220
Alzate C, Suykens J (2009) A regularized formulation for spectral clustering with pairwise constraints. In: Proceedings of the international joint conference on neural networks, pp 141–148
Anand R, Reddy C (2011) Graph-based clustering with constraints. In: Proceedings of the Pacific-Asia conference on knowledge discovery and data mining, pp 51–62
Anand S, Bell D, Hughes J (1995) The role of domain knowledge in data mining. In: Proceedings of the international conference on information and knowledge management, pp 37–43
Antunes C, Oliveira A (2001) Temporal data mining: an overview. In: KDD workshop on temporal data mining, pp 1–13
Babaki B (2017) MIPKmeans. https://github.com/Behrouz-Babaki/MIPKmeans. Accessed 01 May 2017
Babaki B, Guns T, Nijssen S (2014) Constrained clustering using column generation. In: Proceedings of the international conference on AI and OR techniques in constraint programming for combinatorial optimization problems, pp 438–454
Bagnall A, Lines J, Bostrom A, Large J, Keogh E (2017) The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Min Knowl Discov 31(3):606–660
Banerjee A, Ghosh J (2006) Scalable clustering algorithms with balancing constraints. Data Min Knowl Discov 13(3):365–395
Bar-Hillel A, Hertz T, Shental N, Weinshall D (2003) Learning distance functions using equivalence relations. In: Proceedings of the international conference on machine learning, pp 11–18
Bar-Hillel A, Hertz T, Shental M, Weinshall D (2005) Learning a Mahalanobis metric from equivalence constraints. J Mach Learn Res 6:937–965
Basu S, Banerjee A, Mooney R (2002) Semi-supervised clustering by seeding. In: Proceedings of the international conference on machine learning, pp 19–26
Basu S, Banerjee A, Mooney R (2004) Active semi-supervision for pairwise constrained clustering. In: Proceedings of the SIAM international conference on data mining, pp 333–344
Basu S, Bilenko M, Mooney R (2004) A probabilistic framework for semi-supervised clustering. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, pp 59–68
Basu S, Davidson I, Wagstaff K (2008) Constrained clustering: advances in algorithms, theory, and applications, 1st edn. Chapman & Hall, London
Bellet A, Habrard A, Sebban M (2015) Metric learning. Morgan & Claypool Publishers, Los Altos
Bilenko M, Mooney R (2003) Adaptive duplicate detection using learnable string similarity measures. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, pp 39–48
Bilenko M, Basu S, Mooney R (2004) Integrating constraints and metric learning in semi-supervised clustering. In: Proceedings of the international conference on machine learning, pp 11–18
Bradley P, Bennett K, Demiriz A (2000) Constrained k-means clustering. Technical Report MSR-TR-2000-65, Microsoft Research
Chen W, Feng G (2012) Spectral clustering: a semi-supervised approach. Neurocomputing 77(1):229–242
Chen Y, Keogh E, Hu B, Begum N, Bagnall A, Mueen A, Batista G (2015) The UCR time series classification archive. www.cs.ucr.edu/~eamonn/time_series_data/. Accessed 01 May 2017
Cheng H, Hua K, Vu K (2008) Constrained locally weighted clustering. Proc VLDB Endow 1(1):90–101
Cohn D, Caruana R, Mccallum A (2003) Semi-supervised clustering with user feedback. Technical Report TR2003-1892, Department of Computer Science, Cornell University
Cucuringu M, Koutis I, Chawla S, Miller G, Peng R (2016) Simple and scalable constrained clustering: a generalized spectral method. In: Proceedings of the international conference on artificial intelligence and statistics, pp 445–454
Dao TBH, Duong KC, Vrain C (2013) A declarative framework for constrained clustering. In: Proceedings of the European conference on machine learning and principles and practice of knowledge discovery in databases, pp 419–434
Dao TBH, Vrain C, Duong KC, Davidson I (2016) A framework for actionable clustering using constraint programming. In: Proceedings of the European conference on artificial intelligence, pp 453–461
Dao TBH, Duong KC, Vrain C (2017) Constrained clustering by constraint programming. Artif Intell 244:70–94
Davidson I, Basu S (2007) A survey of clustering with instance level constraints. ACM Trans Knowl Discov Data 77(1):1–41
Davidson I, Ravi S (2005) Clustering with constraints: Feasibility issues and the k-means algorithm. In: Proceedings of the SIAM international conference on data mining, pp 307–314
Davidson I, Ravi S (2006) Identifying and generating easy sets of constraints for clustering. In: Proceedings of the AAAI conference on artificial intelligence, pp 336–341
Davidson I, Ravi S (2007) Intractability and clustering with constraints. In: Proceedings of the international conference on machine learning, pp 201–208
Davidson I, Wagstaff K, Basu S (2006) Measuring constraint-set utility for partitional clustering algorithms. In: European conference on principles of data mining and knowledge discovery, pp 115–126
Davidson I, Ravi S, Shamis L (2010) A SAT-based framework for efficient constrained clustering. In: Proceedings of the SIAM international conference on data mining, pp 94–105
Delattre M, Hansen P (1980) Bicriterion cluster analysis. IEEE Trans Pattern Anal Mach Intell PAMI-2(4):277–291
Demiriz A, Bennett K, Embrechts M (1999) Semi-supervised clustering using genetic algorithms. In: Proceedings of the conference on artificial neural networks in engineering, pp 809–814
Demiriz A, Bennett K, Bradley P (2008) Chap 9: Using assignment constraints to avoid empty clusters in k-means clustering. In: Basu S, Davidson I, Wagstaff K (eds) Constrained clustering: advances in algorithms, theory, and applications, 1st edn. Chapman & Hall, London, pp 201–220
Dimitriadou E, Weingessel A, Hornik K (2002) A mixed ensemble approach for the semi-supervised problem. In: Proceedings of the international conference on artificial neural networks, pp 571–576
Ding H, Trajcevski G, Scheuermann P, Wang X, Keogh E (2008) Querying and mining of time series data: experimental comparison of representations and distance measures. In: Proceedings of the international conference on very large data bases
Ding S, Qi B, Jia H, Zhu H, Zhang L (2013) Research of semi-supervised spectral clustering based on constraints expansion. Neural Comput Appl 22:405–410
Domeniconi C, Al-Razgan M (2008) Penta-training: clustering ensembles with bootstrapping of constraints. In: Proceedings of workshop on supervised and unsupervised ensemble methods and their applications, pp 47–51
Domeniconi C, Gunopulos D, Ma S, Yan B, Al-Razgan M, Papadopoulos D (2007) Locally adaptive metrics for clustering high dimensional data. Data Min Knowl Discov 14(1):63–97
Fisher D (1987) Knowledge acquisition via incremental conceptual clustering. Mach Learn 2(2):139–172
Forestier G, Gançarski P, Wemmert C (2010) Collaborative clustering with background knowledge. Data Knowl Eng 69(2):211–228
Forestier G, Wemmert C, Gançarski P (2010) Towards conflict resolution in collaborative clustering. In: IEEE International conference on intelligent systems, pp 361–366
Fred ALN, Jain AK (2002) Data clustering using evidence accumulation. In: Proceedings of the IEEE international conference on pattern recognition, pp 276–280
Gançarski P, Wemmert C (2007) Collaborative multi-step mono-level multi-strategy classification. J Multimed Tools Appl 35(1):1–27
Ganji M, Bailey J, Stuckey P (2016) Lagrangian constrained clustering. In: Proceedings of the SIAM international conference on data mining, pp 288–296
Ge R, Ester M, Jin W, Davidson I (2007) Constraint-driven clustering. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, pp 320–329
Grira N, Crucianu M, Boujemaa N (2006) Fuzzy clustering with pairwise constraints for knowledge-driven image categorization. IEE Proc Vis Image Signal Process 153(3):299–304
Guns T, Dao TBH, Vrain C, Duong KC (2016) Repetitive branch-and-bound using constraint programming for constrained minimum sum-of-squares clustering. In: Proceedings of the European conference on artificial intelligence, pp 462–470
Hadjitodorov ST, Kuncheva LI (2007) Selecting diversifying heuristics for cluster ensembles. In: Proceedings of the international workshop on multiple classifier systems, pp 200–209
Handl J, Knowles J (2006) On semi-supervised clustering via multiobjective optimization. In: Proceedings of the annual conference on genetic and evolutionary computation, pp 1465–1472
Hansen P, Delattre M (1978) Complete-link cluster analysis by graph coloring. J Am Stat Assoc 73(362):397–403
Hansen P, Jaumard B (1997) Cluster analysis and mathematical programming. Math Program 79(1–3):191–215
Hiep T, Duc N, Trung B (2016) Local search approach for the pairwise constrained clustering problem. In: Proceedings of the symposium on information and communication technology, pp 115–122
Hoi S, Jin R, Lyu M (2007) Learning nonparametric kernel matrices from pairwise constraints. In: International conference on machine learning, pp 361–368
Hoi S, Liu W, Chang SF (2008) Semi-supervised distance metric learning for collaborative image retrieval. In: Proceedings of the IEEE international conference on computer vision and pattern recognition
Hoi S, Liu W, Chang SF (2010) Semi-supervised distance metric learning for collaborative image retrieval and clustering. ACM Trans Multimed Comput Commun Appl 6(3):18
Huang H, Cheng Y, Zhao R (2008) A semi-supervised clustering algorithm based on must-link set. In: Proceedings of the international conference on advanced data mining and applications, pp 492–499
Hubert L, Arabie P (1985) Comparing partitions. J Classif 2(1):193–218
Iqbal A, Moh’d A, Zhan Z (2012) Semi-supervised clustering ensemble by voting. In: Proceedings of the international conference on information and communication systems, pp 1–5
Kamvar S, Klein D, Manning C (2003) Spectral learning. In: Proceedings of the international joint conference on artificial intelligence, pp 561–566
Kavitha V, Punithavalli M (2010) Clustering time series data stream–a literature survey. Int J Comput Sci Inf Secur 8(1):289–294
Keogh E, Kasetty S (2003) On the need for time series data mining benchmarks: a survey and empirical demonstration. Data Min Knowl Discov 7(4):349–371
Keogh E, Lin J (2005) Clustering of time-series subsequences is meaningless: implications for previous and future research. Knowl Inf Syst 8(2):154–177
Kittler J (1998) On combining classifiers. IEEE Trans Pattern Anal Mach Intell 20(3):226–239
Klein D, Kamvar S, Manning C (2002) From instance-level constraints to space-level constraints: making the most of prior knowledge in data clustering. In: Proceedings of the international conference on machine learning, pp 307–314
Kruskal J (1964) Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29(1):1–27
Kuhn H, Tucker A (1951) Nonlinear programming. In: Proceedings of the Berkeley symposium, pp 481–492
Kulis B, Basu S, Dhillon I, Mooney R (2005) Semi-supervised graph clustering: a kernel approach. In: Proceedings of the international conference on machine learning, pp 457–464
Kulis B, Basu S, Dhillon I, Mooney R (2009) Semi-supervised graph clustering: a kernel approach. Mach Learn 74(1):1–22
Laxman S, Sastry P (2006) A survey of temporal data mining. Sadhana 31(2):173–198
Li T, Ding C (2008) Weighted consensus clustering. In: Proceedings of the SIAM international conference on data mining, pp 798–809
Li T, Ding C, Jordan M (2007) Solving consensus and semi-supervised clustering problems using nonnegative matrix factorization. In: Proceedings of the IEEE international conference on data mining, pp 577–582
Li Z, Liu J (2009) Constrained clustering by spectral kernel learning. In: IEEE international conference on computer vision, pp 421–427
Li Z, Liu J, Tang X (2008) Pairwise constraint propagation by semidefinite programming for semi-supervised classification. In: Proceedings of the international conference on machine learning, pp 576–583
Li Z, Liu J, Tang X (2009) Constrained clustering via spectral regularization. In: Proceedings of the international conference on computer vision and pattern recognition, pp 421–428
Liao TW (2005) Clustering of time series data: a survey. Pattern Recognit 38(11):1857–1874
Lines J, Bagnall A (2015) Time series classification with ensembles of elastic distance measures. Data Min Knowl Discov 29(3):565–592
Lu Z, Carreira-Perpiñán M (2008) Constrained spectral clustering through affinity propagation. In: IEEE conference on computer vision and pattern recognition, pp 1–8
Lu Z, Ip H (2010) Constrained spectral clustering via exhaustive and efficient constraint propagation. In: Proceedings of the European conference on computer vision, pp 1–14
Lu Z, Leen T (2005) Semi-supervised learning with penalized probabilistic clustering. In: Proceedings of the advances in neural information processing systems
Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416
du Merle O, Hansen P, Jaumard B, Mladenović N (1999) An interior point algorithm for minimum sum-of-squares clustering. SIAM J Sci Comput 21(4):1485–1505
Monti S, Tamayo P, Mesirov J, Golub T (2003) Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data. Mach Learn 52(1):91–118
Mueller M, Kramer S (2010) Integer linear programming models for constrained clustering. In: Proceedings of the international conference on discovery science, pp 159–173
Ng A, Jordan M, Weiss Y (2001) On spectral clustering: analysis and an algorithm. In: Proceedings of the international conference on neural information processing systems, pp 849–856
Ng M (2000) A note on constrained k-means algorithms. Pattern Recognit 33(3):515–519
Ouali A, Loudni S, Lebbah Y, Boizumault P, Zimmermann A, Loukil L (2016) Efficiently finding conceptual clustering models with integer linear programming. In: Proceedings of the international joint conference on artificial intelligence, pp 647–654
Pedrycz W (2002) Collaborative fuzzy clustering. Pattern Recognit Lett 23(14):1675–1686
Pelleg D, Baras D (2007) K-means with large and noisy constraint sets. In: Proceedings of the European conference on machine learning, pp 674–682
Petitjean F, Ketterlin A, Gançarski P (2011) A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognit 44(3):678–693
Rangapuram S, Hein M (2012) Constrained 1-spectral clustering. In: Proceedings of the international conference on artificial intelligence and statistics, pp 1143–1151
Rani S, Sikka G (2012) Recent techniques of clustering of time series data: a survey. Int J Comput Appl 52(15):1–9
Rossi F, van Beek P, Walsh T (eds) (2006) Handbook of constraint programming. Foundations of artificial intelligence. Elsevier, Amsterdam
Rousseeuw P (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Comput Appl Math 20:53–65
Rutayisire T, Yang Y, Lin C, Zhang J (2011) A modified COP-KMeans algorithm based on sequenced cannot-link set. In: Proceedings of the International Conference on Rough Sets and Knowledge Technology, pp 217–225
Sakoe H, Chiba S (1971) A dynamic programming approach to continuous speech recognition. In: Proceedings of the international congress on acoustics, vol 3, pp 65–69
Sakoe H, Chiba S (1978) Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans Acoust Speech Signal Process 26(1):43–49
Shental N, Bar-Hillel A, Hertz T, Weinshall D (2013) Computing Gaussian mixture models with EM using equivalence constraints. In: International conference on neural information processing systems, pp 465–472
Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905
Strehl A, Ghosh J (2002) Cluster ensembles: a knowledge reuse framework for combining multiple partitions. J Mach Learn Res 3:583–617
Tan W, Yang Y, Li T (2010) An improved COP-KMeans algorithm for solving constraint violation. In: Proceedings of the international FLINS conference on foundations and applications of computational intelligence, pp 690–696
Tang W, Xiong H, Zhong S, Wu J (2007) Enhancing semi-supervised clustering: a feature projection perspective. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, pp 707–716
Wagstaff K, Cardie C (2000) Clustering with instance-level constraints. In: Proceedings of the international conference on machine learning, pp 1103–1110
Wagstaff K, Cardie C, Rogers S, Schroedl S (2001) Constrained k-means clustering with background knowledge. In: Proceedings of the international conference on machine learning, pp 577–584
Wagstaff K, Basu S, Davidson I (2006) When is constrained clustering beneficial, and why? In: Proceedings of the national conference on artificial intelligence and the eighteenth innovative applications of artificial intelligence conference
Wang J, Wu S, Vu H, Li G (2010) Text document clustering with metric learning. In: International ACM SIGIR conference on research and development in information retrieval, pp 783–784
Wang X, Davidson I (2010) Active spectral clustering. In: Proceedings of the IEEE international conference on data mining, pp 561–568
Wang X, Davidson I (2010) Flexible constrained spectral clustering. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, pp 563–572
Wang X, Mueen A, Ding H, Trajcevski G, Scheuermann P, Keogh E (2013) Experimental comparison of representation methods and distance measures for time series data. Data Min Knowl Discov 26(2):275–309
Wang X, Qian B, Davidson I (2014) On constrained spectral clustering and its applications. Data Min Knowl Discov 28(1):1–30
Wemmert C, Gançarski P, Korczak J (2000) A collaborative approach to combine multiple learning methods. Int J Artif Intell Tools 9(1):59–78
Xiao W, Yang Y, Wang H, Li T, Xing H (2016) Semi-supervised hierarchical clustering ensemble and its application. Neurocomputing 173(3):1362–1376
Xing E, Ng A, Jordan M, Russell S (2002) Distance metric learning, with application to clustering with side-information. In: Proceedings of the advances in neural information processing systems, pp 521–528
Yang F, Li T, Zhou Q, Xiao H (2017) Cluster ensemble selection with constraints. Neurocomputing 235:59–70
Yang Y, Tan W, Li T, Ruan D (2012) Consensus clustering based on constrained self-organizing map and improved Cop-Kmeans ensemble in intelligent decision support systems. Knowl Based Syst 32:101–115
Yi J, Jin R, Jain A, Yang T, Jain S (2012) Semi-crowdsourced clustering: generalizing crowd labeling by robust distance metric learning. In: Proceedings of the advances in neural information processing systems, pp 1772–1780
Yu Z, Wong HS, You J, Yang Q, Liao H (2011) Knowledge based cluster ensemble for cancer discovery from biomolecular data. IEEE Trans NanoBioscience 10(2):76–85
Zha H, He X, Ding CHQ, Gu M, Simon HD (2001) Spectral relaxation for k-means clustering. In: Proceedings of the international conference on neural information processing systems, pp 1057–1064
Zhang T, Ando R (2006) Analysis of spectral kernel design based semi-supervised learning. In: Proceedings of the international conference on neural information processing systems, pp 1601–1608
Zhi W, Wang X, Qian B, Butler P, Ramakrishnan N, Davidson I (2013) Clustering with complex constraints-algorithms and applications. In: Proceedings of the conference on artificial intelligence, pp 1056–1062
Zhu X, Loy C, Gong S (2016) Constrained clustering with imperfect oracles. IEEE Trans Neural Netw Learn Syst 27(6):1345–1357
Additional information
Communicated by Jian Pei.
Funding: CNES/Unistra R&T research Grant Number 2016-033.
Appendices
Appendix A: Full metric scores
See Tables 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 and 18.
Appendix B: Constraint coherence
As described in Davidson et al. (2006): “We consider all constraint pairs composed of an ML and a CL constraint (pairs composed of the same constraint type cannot be contradictory). To determine the coherence of two constraints, a and b, we compute the projected overlap of each constraint on the other”.
Let \(\mathbf {a}\) and \(\mathbf {b}\) be vectors connecting the points constrained by a, i.e. \((a_1,a_2)\), and b, i.e. \((b_1,b_2)\), respectively. We first project the points bound by constraint a onto the line that is defined by the points bound by constraint b, such that
where
The points \(a'_1\), \(a'_2\), \(b_1\), and \(b_2\) now all exist in the 1D space described by the basis vector \(\mathbf {e}\), and as such are projected into this 1D space, such that
The 1D points of each constraint are then sorted such that \(a''_1 \le a''_2\) and \(b''_1 \le b''_2\). With this assumption satisfied, the overlap of constraint a on constraint b becomes
Two constraints are coherent if there is no overlap between them, such that
and the coherence of a set of constraints is defined to be the fraction of coherent constraints within the set, such that
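The steps above can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: it takes \(\mathbf {e} = (b_2 - b_1)/\Vert b_2 - b_1\Vert \), measures 1D coordinates from \(b_1\) along \(\mathbf {e}\), defines the overlap as the length of the intersection of the two sorted 1D intervals, and treats a pair as coherent only when the overlap is zero in both directions. The function names (`overlap`, `coherent`, `coherence`) are hypothetical.

```python
from math import sqrt


def _sub(u, v):
    """Component-wise difference u - v."""
    return tuple(x - y for x, y in zip(u, v))


def _dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def overlap(a, b):
    """Assumed overlap of constraint a on constraint b.

    a and b are pairs of points. The points of a are projected onto the
    line through b's points; the result is the length of the intersection
    of the two 1D intervals (0.0 when they are disjoint or only touch).
    """
    (a1, a2), (b1, b2) = a, b
    d = _sub(b2, b1)
    norm = sqrt(_dot(d, d))
    if norm == 0:
        raise ValueError("constraint b joins two identical points")
    e = tuple(x / norm for x in d)  # unit basis vector of the line

    # 1D coordinates along e, measured from b1 (assumed origin).
    ta = sorted((_dot(_sub(a1, b1), e), _dot(_sub(a2, b1), e)))
    tb = sorted((0.0, norm))

    return max(0.0, min(ta[1], tb[1]) - max(ta[0], tb[0]))


def coherent(a, b):
    """True if neither constraint overlaps the other (assumed symmetric test)."""
    return overlap(a, b) == 0.0 and overlap(b, a) == 0.0


def coherence(ml, cl):
    """Fraction of (ML, CL) constraint pairs that are coherent."""
    pairs = [(m, c) for m in ml for c in cl]
    if not pairs:
        raise ValueError("need at least one ML and one CL constraint")
    return sum(coherent(m, c) for m, c in pairs) / len(pairs)


# Toy 2D example: one must-link and two cannot-link constraints.
m = ((0.0, 0.0), (1.0, 0.0))
c1 = ((3.0, 0.0), (4.0, 0.0))   # collinear with m but disjoint
c2 = ((0.5, 1.0), (2.0, 1.0))   # parallel to m, projections overlap
print(coherence([m], [c1, c2]))  # 0.5
```

Only whether an overlap is zero affects the coherence value, so any overlap definition that vanishes exactly when the projected intervals do not intersect would give the same fraction.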
Cite this article
Lampert, T., Dao, TBH., Lafabregue, B. et al. Constrained distance based clustering for time-series: a comparative and experimental study. Data Min Knowl Disc 32, 1663–1707 (2018). https://doi.org/10.1007/s10618-018-0573-y
DOI: https://doi.org/10.1007/s10618-018-0573-y