SCLUSTREAM: AN EFFICIENT ALGORITHM FOR TRACKING CLUSTERS OVER SLIDING WINDOW IN BIG DATA STREAMING

Document Type: Original Article

Authors

1 Computer Science, Faculty of Computer and Information, Ain Shams

2 Information Systems Department, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt

3 Department of Computer Science, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt

Abstract

Mining data streams has been an active research topic in recent years. A main challenge in data stream mining is extracting knowledge in real time from a massive, dynamic data stream in a single scan. Clustering plays an important role in data stream processing. This paper proposes SCluStream, an algorithm for tracking clusters over a sliding window, to address these challenges. The algorithm is an enhancement of CluStream, which does not use the sliding window concept. In the sliding window model, only the most recent data is used and old data is discarded, which allows for faster execution. An improved clustering technique is also incorporated, which contributes to higher accuracy. The proposed algorithm was tested on an intrusion detection dataset, and the results show that SCluStream is more efficient than CluStream for online cluster generation in big data streaming with respect to accuracy as well as time and memory usage.
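The sliding window model described in the abstract can be illustrated with a minimal sketch. This is not the SCluStream algorithm itself: it only shows the general idea of keeping the most recent points and discarding expired ones while updating a summary incrementally in a single pass. The class name `SlidingWindowSummary`, the count-based window, and the running per-dimension sum are assumptions made for illustration.

```python
from collections import deque


class SlidingWindowSummary:
    """Hypothetical count-based sliding window with a running sum per dimension."""

    def __init__(self, window_size, dims):
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        self.window_size = window_size
        self.points = deque()            # points currently inside the window
        self.linear_sum = [0.0] * dims   # per-dimension sum of those points

    def add(self, point):
        # Add the newest point to the window and to the running sum.
        self.points.append(point)
        for i, value in enumerate(point):
            self.linear_sum[i] += value
        # Evict the oldest point once the window is over capacity.
        if len(self.points) > self.window_size:
            expired = self.points.popleft()
            for i, value in enumerate(expired):
                self.linear_sum[i] -= value

    def centroid(self):
        # Mean of the points currently in the window.
        n = len(self.points)
        if n == 0:
            raise ValueError("window is empty")
        return [s / n for s in self.linear_sum]


window = SlidingWindowSummary(window_size=3, dims=2)
for p in [(0.0, 0.0), (2.0, 2.0), (4.0, 4.0), (6.0, 6.0)]:
    window.add(p)
print(window.centroid())  # [4.0, 4.0]; the first point has been evicted
```

Because the summary is updated by adding the new point and subtracting the expired one, each arrival costs a constant amount of work per dimension, which is the kind of saving in time and memory that a sliding window is meant to provide.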

Keywords