Cluster analysis and mathematical programming

Hansen, Pierre; Jaumard, Brigitte

doi:10.1007/BF02614317

Published: October 1997

Mathematical Programming, Volume 79, pages 191–215 (1997)

Abstract

Given a set of entities, Cluster Analysis aims at finding subsets, called clusters, which are homogeneous and/or well separated. As many types of clustering and criteria for homogeneity or separation are of interest, this is a vast field. A survey is given from a mathematical programming viewpoint. Steps of a clustering study, types of clustering and criteria are discussed. Then algorithms for hierarchical, partitioning, sequential, and additive clustering are studied. Emphasis is on solution methods, i.e., dynamic programming, graph theoretical algorithms, branch-and-bound, cutting planes, column generation and heuristics.
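
The notions of homogeneity and separation can be made concrete with a small sketch. The abstract does not name specific criteria, so the choice here is an assumption made for illustration only: homogeneity is measured by the diameter of a cluster (largest dissimilarity between two of its entities) and separation by its split (smallest dissimilarity between one of its entities and one outside it). The function names and the toy data are hypothetical.

```python
from itertools import combinations


def diameter(cluster, dist):
    """Largest dissimilarity within a cluster (0.0 for a singleton)."""
    return max((dist(a, b) for a, b in combinations(cluster, 2)), default=0.0)


def split(cluster, outside, dist):
    """Smallest dissimilarity between a cluster member and an outside entity."""
    return min(dist(a, b) for a in cluster for b in outside)


def evaluate(partition, dist):
    """Return (largest diameter, smallest split) of a partition.

    `partition` is a list of clusters (lists of entities) and must
    contain at least two clusters, otherwise the split is undefined.
    """
    max_diameter = max(diameter(c, dist) for c in partition)
    min_split = min(
        split(c, [x for d in partition if d is not c for x in d], dist)
        for c in partition
    )
    return max_diameter, min_split


# Toy example: five points on a line, absolute difference as dissimilarity.
abs_dist = lambda a, b: abs(a - b)
partition = [[0.0, 1.0, 2.0], [10.0, 11.0]]
result = evaluate(partition, abs_dist)
print(result)  # (2.0, 8.0)
```

A partition with a small largest diameter is homogeneous and one with a large smallest split is well separated; the two objectives can conflict, which is one reason many criteria coexist.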

Author information

Authors and Affiliations

GERAD and École des Hautes Études Commerciales, Montréal, Canada
Pierre Hansen
GERAD and École Polytechnique de Montréal, Montréal, Canada
Brigitte Jaumard

Corresponding author

Correspondence to Pierre Hansen.

Additional information

Research supported by ONR grant N00014-95-1-0917, FCAR grant 95-ER-1048 and NSERC grants GP0105574 and GP0036426. The authors thank Olivier Gascuel and an anonymous referee for insightful remarks.

About this article

Cite this article

Hansen, P., Jaumard, B. Cluster analysis and mathematical programming. Mathematical Programming 79, 191–215 (1997). https://doi.org/10.1007/BF02614317

Received: 03 March 1997
Accepted: 26 April 1997
Issue Date: October 1997
DOI: https://doi.org/10.1007/BF02614317
