A Role of (Not) Crisp Discernibility in Rough Set Approach to Numeric Feature Selection

  • Conference paper
Advanced Machine Learning Technologies and Applications (AMLTA 2012)

Abstract

We investigate the rough-set-based framework for feature selection in decision tables with numeric attributes. We compare functions that evaluate subsets of attributes with respect to their potential for determining the distinguished decision attribute, using two alternative methods: discernibility-based functions over discretized numeric data, and distance-based functions often used in fuzzy-rough approaches to feature selection. In both cases, the idea is to compare objects belonging to different decision classes, either by verifying whether they can be distinguished from each other using discretized attributes or by measuring distances between their values over the original numeric attributes. We draw a correspondence between functions evaluating subsets of numeric attributes according to both methodologies. For a subset of numeric attributes, we consider a function measuring the number of pairs of objects belonging to different decision classes that are not discerned by discretized attributes, averaged over all possible choices of binary discretization cuts over the attribute ranges. We prove that such a function can be rewritten in terms of distances between the original numeric attribute values. Namely, it is equal to the average fuzzy indiscernibility function computed using the product t-norm to combine indiscernibility degrees obtained over particular attributes.
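The correspondence stated above can be checked numerically on a toy decision table. The following is a minimal sketch, not the paper's own formulation: it assumes one binary cut per attribute, drawn uniformly from that attribute's observed range and independently across attributes, and it assumes the per-attribute indiscernibility degree is 1 - |a(x) - a(y)| / range(a). The function names, the toy data, and the use of unnormalized sums (rather than averages over pairs) are all illustrative choices.

```python
from itertools import combinations, product


def fuzzy_indiscernibility(rows, decisions, attrs):
    """Sum, over pairs with different decisions, of the product (product t-norm)
    of per-attribute degrees 1 - |a(x) - a(y)| / range(a).
    Assumes every attribute in attrs takes at least two distinct values."""
    ranges = {a: (min(r[a] for r in rows), max(r[a] for r in rows)) for a in attrs}
    total = 0.0
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] == decisions[j]:
            continue
        degree = 1.0
        for a in attrs:
            lo, hi = ranges[a]
            degree *= 1.0 - abs(rows[i][a] - rows[j][a]) / (hi - lo)
        total += degree
    return total


def averaged_crisp_indiscernibility(rows, decisions, attrs):
    """Expected number of pairs with different decisions that no attribute
    discerns, when each attribute gets one cut drawn uniformly from its range.
    Cuts falling between the same two adjacent values induce the same binary
    discretization, so each such interval is represented by its midpoint and
    weighted by its relative length."""
    per_attr = []
    for a in attrs:
        values = sorted({r[a] for r in rows})
        lo, hi = values[0], values[-1]
        per_attr.append(
            [((u + v) / 2.0, (v - u) / (hi - lo)) for u, v in zip(values, values[1:])]
        )
    total = 0.0
    for combo in product(*per_attr):
        weight = 1.0
        for _, w in combo:
            weight *= w
        cuts = [c for c, _ in combo]
        count = 0
        for i, j in combinations(range(len(rows)), 2):
            if decisions[i] == decisions[j]:
                continue
            # A pair is not discerned if both objects fall on the same side
            # of the cut for every attribute.
            if all((rows[i][a] < c) == (rows[j][a] < c) for a, c in zip(attrs, cuts)):
                count += 1
        total += weight * count
    return total


# Hypothetical toy decision table: two numeric attributes, binary decision.
rows = [(0.0, 1.0), (1.0, 3.0), (2.0, 2.0), (4.0, 5.0)]
decisions = [0, 1, 0, 1]
attrs = [0, 1]

fuzzy = fuzzy_indiscernibility(rows, decisions, attrs)
crisp = averaged_crisp_indiscernibility(rows, decisions, attrs)
print(fuzzy, crisp)
```

Under these assumptions the two values agree, because a uniform cut separates two values with probability equal to their normalized distance, and independent cuts multiply the per-attribute probabilities. Dividing both sums by the number of compared pairs would give averaged versions without affecting the equality.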

Supported by the Polish National Science Centre grant 2011/01/B/ST6/03867.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ślęzak, D., Betliński, P. (2012). A Role of (Not) Crisp Discernibility in Rough Set Approach to Numeric Feature Selection. In: Hassanien, A.E., Salem, AB.M., Ramadan, R., Kim, Th. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2012. Communications in Computer and Information Science, vol 322. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35326-0_2

  • DOI: https://doi.org/10.1007/978-3-642-35326-0_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35325-3

  • Online ISBN: 978-3-642-35326-0

  • eBook Packages: Computer Science (R0)
