Abstract
In this article, we study learning from fuzzy labels, a generalization of supervised learning in which each instance is assumed to be labeled with a fuzzy set, interpreted as an epistemic possibility distribution over the possible labels. We tackle the problem of feature selection in this setting, in the context of rough set theory (RST). More specifically, we consider RST-based feature selection as a means for data disambiguation, that is, for retrieving the most plausible precise instantiation of the imprecise training data. We define generalizations of decision tables and reducts, using tools from generalized information theory and belief function theory, and we study the computational complexity and theoretical properties of the associated computational problems.
Notes
- 1. We note that in the learning from fuzzy labels setting, the set of candidate labels (that is, the labels with a membership degree greater than 0) is given a disjunctive interpretation: only one of those labels is correct, but we do not know precisely which one, and the membership degrees represent degrees of belief. Thus, in this article, we do not consider the conjunctive interpretation, in which the membership degrees are degrees of truth (and which could thus be seen as a generalization of multi-label learning).
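The disjunctive reading of a fuzzy label can be made concrete in a few lines. The following minimal sketch (label names and variable names are ours, purely for illustration) represents a fuzzy label as a possibility distribution over classes, extracts the candidate set (support), and selects the most plausible precise label, as disambiguation would:

```python
# A fuzzy label as an epistemic possibility distribution over classes:
# membership degrees encode how plausible each candidate label is, and
# exactly one candidate is the (unknown) true label (disjunctive reading).
fuzzy_label = {"cat": 1.0, "dog": 0.6, "bird": 0.0}

# Candidate labels: those with membership degree strictly greater than 0.
candidates = {y for y, mu in fuzzy_label.items() if mu > 0}

# Disambiguation selects the most plausible precise instantiation.
most_plausible = max(candidates, key=fuzzy_label.get)
```

Under the conjunctive reading (not considered in the article), all labels with positive degree would instead hold simultaneously to some degree, as in multi-label learning.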
- 2. Here \(\sup_{\le_C}\mathcal{I}(R) = \{ I \in \mathcal{I}(R) : \not\exists\, I' \in \mathcal{I}(R) \text{ s.t. } I <_C I' \}\), i.e., the set of maximal elements of \(\mathcal{I}(R)\) under the partial order \(\le_C\).
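Since \(\le_C\) is only a partial order, \(\sup_{\le_C}\mathcal{I}(R)\) is a set of maximal elements rather than a single supremum. A generic sketch of this operation (function names are ours; the divisibility order stands in for \(\le_C\)):

```python
def maximal_elements(items, less_than):
    """Elements I of items with no strictly greater I' under the strict partial order less_than."""
    return [I for I in items if not any(less_than(I, J) for J in items)]

# Example partial order: strict divisibility on integers.
def strictly_divides(a, b):
    return a != b and b % a == 0

# On [1, 2, 3, 4], the maximal elements under divisibility are 3 and 4:
# 1 divides 2, and 2 divides 4, so both are dominated.
maximal = maximal_elements([1, 2, 3, 4], strictly_divides)
```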
© 2021 Springer Nature Switzerland AG
Campagner, A., Ciucci, D. (2021). Feature Selection and Disambiguation in Learning from Fuzzy Labels Using Rough Sets. In: Ramanna, S., Cornelis, C., Ciucci, D. (eds.) Rough Sets. IJCRS 2021. Lecture Notes in Computer Science, vol. 12872. Springer, Cham. https://doi.org/10.1007/978-3-030-87334-9_14