ABSTRACT
Clustering has become an increasingly important task in modern application domains where the data are originally located at different sites. To create a central clustering, all clients have to transmit their data to a central server. Due to technical limitations and security concerns, often only vague object descriptions are available at the central site. The server then has to carry out the clustering based on vague and uncertain data. A recent paper proposed an approach for clustering uncertain data based on the concept of medoid clusterings. The idea of this approach is to first create several sample clusterings. Then, based on suitable distance functions between clusterings, the most average clustering, i.e. the medoid clustering, is determined. In this paper, we extend this approach for partitioning clustering algorithms and propose to compute a centroid clustering based on these input sample clusterings. These centroid clusterings are new artificial clusterings that minimize the distance to all the sample clusterings.
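The medoid step described above can be illustrated with a minimal sketch. It assumes each sample clustering is given as a list of cluster labels over the same objects, and it uses a pair-counting distance (one minus the Rand index) as a stand-in for the "suitable distance functions" mentioned in the abstract, which are not specified here. The names `pair_disagreement` and `medoid_clustering` are hypothetical, and the sketch does not cover the centroid clustering.

```python
from itertools import combinations


def pair_disagreement(a, b):
    """Assumed distance: fraction of object pairs that are grouped together
    in one clustering but separated in the other (1 - Rand index).
    Requires at least two objects."""
    pairs = list(combinations(range(len(a)), 2))
    disagree = sum((a[i] == a[j]) != (b[i] == b[j]) for i, j in pairs)
    return disagree / len(pairs)


def medoid_clustering(samples, dist=pair_disagreement):
    """Return the index of the sample clustering whose summed distance
    to all sample clusterings is smallest (the 'most average' one)."""
    totals = [sum(dist(s, t) for t in samples) for s in samples]
    return min(range(len(samples)), key=totals.__getitem__)


# Three sample clusterings of six objects, given as cluster labels.
samples = [
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
]
medoid_index = medoid_clustering(samples)
print(medoid_index, samples[medoid_index])
```

In this toy input the middle clustering lies between the other two, so it has the smallest total distance and is selected as the medoid. Any other distance between clusterings can be passed through the `dist` parameter.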
Index Terms
- Efficient and effective server-sided distributed clustering