Abstract
With the rapid growth of networks and of the information available on the web, fast database access and data mining have become very important. A column store offers fast reads, reduces disk I/O, and allows data to be read without decompression, which helps in extracting knowledge from massive data. This paper therefore introduces column-store technology into the traditional data mining module. Both the information base and the knowledge base are stored and accessed through the column store, and an access interface is provided between the storage module and the module above it. The data clustering module then computes the Minkowski distance and applies the k-medoids method. Building on the access advantages of the column store and on the k-medoids method, the system can improve both the speed and the quality of clustering. The contributions are the application of a column store to the clustering system, together with a complete data access interface, and the use of the k-medoids method, which can detect clusters of arbitrary shape. Computing the Minkowski distance can make the evaluation of object dissimilarity more efficient and speed up clustering.
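To make the clustering step concrete, the following is a minimal sketch of a Minkowski distance and a simple alternating k-medoids loop over column-oriented data. The abstract does not state the order p of the distance, the k-medoids variant, the initialisation, or the storage interface, so all of these are assumptions made for illustration: columns are plain Python lists, p defaults to 2, the first k objects serve as initial medoids, and the function names `minkowski` and `k_medoids` are hypothetical.

```python
def minkowski(a, b, p=2):
    """Minkowski distance of order p between two equal-length sequences."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def assign(dist, medoids):
    """Label each object with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]


def k_medoids(columns, k, p=2, max_iter=100):
    """Alternating k-medoids on column-oriented data (illustrative sketch).

    columns: list of attribute columns, one value per object in each column.
    Returns (medoid object indices, cluster label per object).
    """
    # Rebuild the objects (rows) from the attribute columns.
    rows = list(zip(*columns))
    n = len(rows)
    # Precompute the pairwise dissimilarity matrix once.
    dist = [[minkowski(rows[i], rows[j], p) for j in range(n)]
            for i in range(n)]
    # Assumption: deterministic initialisation with the first k objects.
    medoids = list(range(k))
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the old medoid if its cluster ended up empty.
                new_medoids.append(medoids[c])
                continue
            # The new medoid minimises total distance to its cluster members.
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    # Final labels are always computed against the returned medoids.
    return medoids, assign(dist, medoids)


# Hypothetical two-attribute data set, stored column by column.
x_column = [1.0, 1.5, 1.2, 8.0, 8.5, 9.0]
y_column = [1.0, 1.2, 0.8, 8.0, 8.2, 7.9]
medoids, labels = k_medoids([x_column, y_column], k=2)
```

Because each medoid is an actual object in the data set, the update step only needs the precomputed dissimilarity matrix, which is one reason a cheap distance such as the Minkowski family matters for overall clustering speed.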
Copyright information
© 2014 Springer Science+Business Media New York
About this paper
Cite this paper
Shen, L., Zhang, T., Song, J., Chen, P., Wang, J. (2014). Research and Design of the Clustering System Based on the Column-Store. In: Zhong, S. (eds) Proceedings of the 2012 International Conference on Cybernetics and Informatics. Lecture Notes in Electrical Engineering, vol 163. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-3872-4_223
DOI: https://doi.org/10.1007/978-1-4614-3872-4_223
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-3871-7
Online ISBN: 978-1-4614-3872-4
eBook Packages: Engineering, Engineering (R0)