ABSTRACT
Novelty detection has been presented in the literature as a one-class problem. In this setting, new examples are classified as either belonging to the target class or not, and examples not explained by the model are assigned to a class named novelty. However, novelty detection is much more general, especially in data stream scenarios, where the number of classes may be unknown before learning and new classes can appear at any time. In this case, the novelty concept is composed of different classes. This work presents MINAS, a new algorithm for novelty detection in multi-class data stream problems. We also present a new experimental methodology to evaluate novelty detection methods in multi-class problems. The experiments use both artificial and real data sets. Experimental results show that MINAS is able to discover novelties in multi-class problems.
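To make the distinction between a single "novelty" label and a multi-class novelty concept concrete, the following is a minimal illustrative sketch. It is not the MINAS algorithm, whose details are not given in this abstract. It assumes each known class is summarised by a centroid and a radius, that examples falling outside every radius are "not explained by the model", and that unexplained examples are grouped by a simple distance threshold. The function names (`fit_known`, `classify`, `group_unknowns`), the `link_dist` parameter, and the toy data are all hypothetical.

```python
import math

def fit_known(examples_by_class):
    """Summarise each known class by its centroid and the maximum
    distance from that centroid to any training example (the radius)."""
    model = {}
    for label, points in examples_by_class.items():
        dim = len(points[0])
        centroid = tuple(sum(p[i] for p in points) / len(points) for i in range(dim))
        radius = max(math.dist(centroid, p) for p in points)
        model[label] = (centroid, radius)
    return model

def classify(model, x):
    """Return the label of the closest class whose radius covers x,
    or None when no known class explains x."""
    best = None
    for label, (centroid, radius) in model.items():
        d = math.dist(centroid, x)
        if d <= radius and (best is None or d < best[0]):
            best = (d, label)
    return best[1] if best else None

def group_unknowns(unknowns, link_dist):
    """Greedily group unexplained examples: each example joins the first
    group whose current centroid lies within link_dist, else starts a new one."""
    groups = []
    for x in unknowns:
        for g in groups:
            c = tuple(sum(p[i] for p in g) / len(g) for i in range(len(x)))
            if math.dist(c, x) <= link_dist:
                g.append(x)
                break
        else:
            groups.append([x])
    return groups

# Toy data (hypothetical): two known classes and a short stream.
known = {
    "A": [(0, 0), (0, 1), (1, 0), (1, 1)],
    "B": [(10, 10), (10, 11), (11, 10), (11, 11)],
}
stream = [(0.5, 0.6), (10.4, 10.5), (5.0, 5.0), (5.2, 4.9), (20.0, 0.0), (20.1, 0.2)]

model = fit_known(known)
labels = [classify(model, x) for x in stream]
unknowns = [x for x, lab in zip(stream, labels) if lab is None]
groups = group_unknowns(unknowns, link_dist=1.0)

print(labels)       # known-class label, or None for unexplained examples
print(len(groups))  # number of candidate novelty groups
```

A one-class view would give all four unexplained examples the same "novelty" label. Grouping them instead separates two candidate new classes, which is the multi-class situation the abstract describes.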