Abstract
Machine learning (ML) methods for training computational models are among the most valuable elements of modern artificial intelligence. Tools that evaluate how well ML training algorithms find, within the training data, the information (the context) crucial to building successful models therefore remain an important topic. In this text we introduce a new method for quantitatively estimating how effectively ML training algorithms use context, based on injecting predefined context into the training data sets. The results indicate that the proposed solution can be used as a general method for analyzing differences in context processing between ML training methods.
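The idea of injecting predefined context can be illustrated with a small sketch. Everything in it is an assumption made for illustration, since the abstract does not specify the procedure: the injected attribute is a synthetic binary column that agrees with the class label with a chosen probability (`strength`), the learner is a hand-written one-attribute decision stump, and "context usage" is approximated as the gain in held-out accuracy after injection. The names `inject_context`, `train_stump`, and `context_gain` are hypothetical and do not come from the paper.

```python
import random

def inject_context(rows, labels, strength, rng):
    """Append a synthetic binary attribute that equals the label
    with probability `strength` (assumed injection scheme)."""
    injected = []
    for row, label in zip(rows, labels):
        ctx = label if rng.random() < strength else 1 - label
        injected.append(row + [ctx])
    return injected

def train_stump(rows, labels):
    """Pick the single binary attribute (and polarity) with the
    highest training accuracy; return it as a predictor."""
    best = None
    for j in range(len(rows[0])):
        for flip in (0, 1):
            hits = sum((r[j] ^ flip) == y for r, y in zip(rows, labels))
            acc = hits / len(rows)
            if best is None or acc > best[0]:
                best = (acc, j, flip)
    _, j, flip = best
    return lambda r: r[j] ^ flip

def accuracy(model, rows, labels):
    """Fraction of rows the model classifies correctly."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def context_gain(strength, n=2000, seed=0):
    """Held-out accuracy with injected context minus accuracy without it."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n)]
    # Two weak base attributes, each agreeing with the label 60% of the time.
    rows = [[y if rng.random() < 0.6 else 1 - y for _ in range(2)]
            for y in labels]
    half = n // 2  # first half for training, second half for testing

    base_model = train_stump(rows[:half], labels[:half])
    base_acc = accuracy(base_model, rows[half:], labels[half:])

    ctx_rows = inject_context(rows, labels, strength, rng)
    ctx_model = train_stump(ctx_rows[:half], labels[:half])
    ctx_acc = accuracy(ctx_model, ctx_rows[half:], labels[half:])
    return ctx_acc - base_acc

if __name__ == "__main__":
    for s in (0.5, 0.8, 1.0):
        print(f"strength={s:.1f}  accuracy gain={context_gain(s):+.3f}")
```

Under these assumptions, a learner that exploits the injected attribute shows a larger gain as `strength` grows, so comparing gains across training algorithms gives a rough, comparable signal of how well each one picks up known context.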
Notes
1. Gradient Boosting Machine training parameters: 50 trees, maximum tree depth = 5, learning rate = 0.1; implementation: H2O Flow 3.10.0.8.
2. Deep Neural Network training parameters: 100 × 100 hidden neurons, activation function: rectifier, maximum number of training epochs = 300; implementation: H2O Flow 3.10.0.8.
3. Random Forest classifier training parameters: bag size = 100, number of iterations = 100, unlimited tree size; implementation: Weka 3.8.0.
4. C4.5 tree (not pruned); implementation: C4.5 v8 by R. Quinlan.
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Huk, M. (2017). Context Injection as a Tool for Measuring Context Usage in Machine Learning. In: Nguyen, N., Tojo, S., Nguyen, L., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2017. Lecture Notes in Computer Science, vol 10191. Springer, Cham. https://doi.org/10.1007/978-3-319-54472-4_65
DOI: https://doi.org/10.1007/978-3-319-54472-4_65
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54471-7
Online ISBN: 978-3-319-54472-4
eBook Packages: Computer Science, Computer Science (R0)