Abstract
When a very large network traffic dataset is sampled, only some of the packets are selected. The packets that are left out may affect the statistical characteristics of the data. This paper discusses the effect of sampling on these statistical characteristics using NSL-KDD, a well-known benchmark network traffic dataset. Three sampling techniques are used: random, systematic, and under-over sampling. The dataset attributes considered are duration, src_bytes, dst_bytes, wrong_fragment, num_compromised, num_file_creations, and srv_count. The statistical parameters used for the analysis are range, mean, and standard deviation. The results show that sampling has a considerable statistical effect on the network traffic dataset.
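The comparison described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' procedure: the function names, the synthetic `src_bytes`-like values, the class labels, and the sample sizes are all assumptions, and "under-over sampling" is interpreted here as undersampling large classes and oversampling small ones (with replacement) to a common size, which the abstract does not spell out.

```python
import random
import statistics


def random_sample(rows, n, seed=0):
    """Simple random sampling: n rows drawn without replacement."""
    return random.Random(seed).sample(rows, n)


def systematic_sample(rows, n, start=0):
    """Systematic sampling: every k-th row, with k = len(rows) // n."""
    if not 1 <= n <= len(rows):
        raise ValueError("n must be between 1 and len(rows)")
    k = len(rows) // n
    return rows[start::k][:n]


def under_over_sample(rows, label_of, per_class, seed=0):
    """Assumed reading of under-over sampling: bring every class to
    per_class rows by undersampling large classes and oversampling
    (with replacement) small ones."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    out = []
    for members in by_class.values():
        if len(members) >= per_class:
            out.extend(rng.sample(members, per_class))
        else:
            extra = rng.choices(members, k=per_class - len(members))
            out.extend(members + extra)
    return out


def summarize(values):
    """Range, mean, and (population) standard deviation of one attribute."""
    return {
        "range": max(values) - min(values),
        "mean": statistics.mean(values),
        "std": statistics.pstdev(values),
    }


# Synthetic stand-in for one numeric attribute (values are made up,
# not taken from NSL-KDD): 900 "normal" rows and 100 "attack" rows.
gen = random.Random(42)
rows = [(gen.expovariate(1 / 500), "normal") for _ in range(900)]
rows += [(gen.expovariate(1 / 5000), "attack") for _ in range(100)]

samples = {
    "full": rows,
    "random": random_sample(rows, 100),
    "systematic": systematic_sample(rows, 100),
    "under-over": under_over_sample(rows, lambda r: r[1], 100),
}
for name, sample in samples.items():
    print(name, summarize([value for value, _ in sample]))
```

Running the sketch prints the three statistics for the full data and for each sample, so the shift introduced by each technique can be read off directly. With real data, the same `summarize` call would be repeated for each of the attributes listed in the abstract.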
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Singh, R., Kumar, H., Singla, R.K. (2014). Analyzing Statistical Effect of Sampling on Network Traffic Dataset. In: Satapathy, S., Avadhani, P., Udgata, S., Lakshminarayana, S. (eds) ICT and Critical Infrastructure: Proceedings of the 48th Annual Convention of Computer Society of India- Vol I. Advances in Intelligent Systems and Computing, vol 248. Springer, Cham. https://doi.org/10.1007/978-3-319-03107-1_43
DOI: https://doi.org/10.1007/978-3-319-03107-1_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03106-4
Online ISBN: 978-3-319-03107-1
eBook Packages: Engineering, Engineering (R0)