Abstract
Given a set of entities, Cluster Analysis aims at finding subsets, called clusters, which are homogeneous and/or well separated. As many types of clustering and criteria for homogeneity or separation are of interest, this is a vast field. A survey is given from a mathematical programming viewpoint. Steps of a clustering study, types of clustering and criteria are discussed. Then algorithms for hierarchical, partitioning, sequential, and additive clustering are studied. Emphasis is on solution methods, i.e., dynamic programming, graph theoretical algorithms, branch-and-bound, cutting planes, column generation and heuristics.
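As a small illustration of what "homogeneous" and "well separated" can mean in practice, the sketch below evaluates one partition with two commonly used criteria: the diameter of a cluster (largest dissimilarity inside it) and its split (smallest dissimilarity between an entity inside and one outside). The choice of these two criteria, the function names, and the one-dimensional data are assumptions made for illustration only; they are not taken from the abstract.

```python
from itertools import combinations


def diameter(cluster, d):
    """Largest dissimilarity between two entities of the same cluster (0 for singletons)."""
    return max((d[i][j] for i, j in combinations(cluster, 2)), default=0.0)


def split(cluster, entities, d):
    """Smallest dissimilarity between an entity inside the cluster and one outside it."""
    outside = [k for k in entities if k not in cluster]
    return min((d[i][k] for i in cluster for k in outside), default=float("inf"))


# Hypothetical 1-D data: two tight groups that lie far apart.
points = [0.0, 1.0, 2.0, 10.0, 11.0]
d = [[abs(p - q) for q in points] for p in points]  # dissimilarity matrix
entities = range(len(points))
partition = [[0, 1, 2], [3, 4]]

# Homogeneity: the worst (largest) diameter over all clusters; smaller is better.
max_diameter = max(diameter(c, d) for c in partition)
# Separation: the worst (smallest) split over all clusters; larger is better.
min_split = min(split(c, entities, d) for c in partition)

print(max_diameter, min_split)
```

A partition with a small maximum diameter is homogeneous, and one with a large minimum split is well separated; the two objectives generally conflict, which is one reason so many criteria are in use.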
Additional information
Research supported by ONR grant N00014-95-1-0917, FCAR grant 95-ER-1048 and NSERC grants GP0105574 and GP0036426. The authors thank Olivier Gascuel and an anonymous referee for insightful remarks.
Cite this article
Hansen, P., Jaumard, B. Cluster analysis and mathematical programming. Mathematical Programming 79, 191–215 (1997). https://doi.org/10.1007/BF02614317