Chance discovery and learning minority classes

  • Special Feature
New Generation Computing

Abstract

Chance discovery views chances as events or situations with a significant impact on human decision making. In this context we are particularly interested in the subset of chances that are unexpected or that contradict common knowledge, and in the human role, which we consider essential to finding such chances. We first introduce LUPC, a method that can learn minority classes from large unbalanced datasets. With its visualization tools as well as its exclusive and inclusive constraints, LUPC allows the user to participate actively in the chance discovery process and to incorporate background knowledge into it. We then present case studies in which LUPC supports the user in discovering significant unexpected chances in stomach cancer and hepatitis databases.
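
The idea of constraining rules for a minority class can be illustrated with a minimal sketch. This is not the LUPC algorithm, whose details are not given here. It assumes that a rule is a conjunction of attribute-value pairs, that an inclusive constraint lists pairs a rule must contain, and that an exclusive constraint lists pairs a rule must not contain. The toy records, the class labels `rare` and `common`, and all function names are hypothetical.

```python
# Illustrative sketch only: NOT the LUPC algorithm.
# Assumption: a rule is a frozenset of (attribute, value) pairs that
# predicts the minority class when all of its conditions hold.

def covers(rule, record):
    """Return True if the record satisfies every condition of the rule."""
    return all(record.get(attr) == value for attr, value in rule)

def satisfies_constraints(rule, inclusive, exclusive):
    """Assumed semantics: every inclusive pair must appear in the rule,
    and no exclusive pair may appear in it."""
    return inclusive <= rule and not (exclusive & rule)

def rule_stats(rule, records, target):
    """Return (accuracy, coverage) of the rule for the target class.
    accuracy = target records covered / all records covered
    coverage = target records covered / all target records"""
    covered = [r for r in records if covers(rule, r)]
    hits = sum(1 for r in covered if r["class"] == target)
    positives = sum(1 for r in records if r["class"] == target)
    accuracy = hits / len(covered) if covered else 0.0
    coverage = hits / positives if positives else 0.0
    return accuracy, coverage

# Hypothetical unbalanced data: 2 "rare" records among 10.
records = (
    [{"a": "x", "b": "p", "class": "rare"}] * 2
    + [{"a": "x", "b": "q", "class": "common"}]
    + [{"a": "y", "b": "p", "class": "common"}]
    + [{"a": "y", "b": "q", "class": "common"}] * 6
)

candidates = [
    frozenset({("a", "x")}),
    frozenset({("b", "p")}),
    frozenset({("a", "x"), ("b", "p")}),
]

# User background knowledge, expressed as constraints (hypothetical).
inclusive = frozenset({("a", "x")})   # rules must test a = x
exclusive = frozenset({("b", "p")})   # rules must not test b = p

allowed = [r for r in candidates
           if satisfies_constraints(r, inclusive, exclusive)]
stats = {r: rule_stats(r, records, "rare") for r in allowed}
```

In this sketch the constraints narrow the candidate rules before they are scored, which is one simple way a user's background knowledge could steer the search toward or away from particular attribute values.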

Author information

Corresponding author

Correspondence to Tu Bao Ho.

Additional information

Tu Bao Ho: He received a B.Eng. degree from Hanoi University of Technology in 1978, and M.S. and Ph.D. degrees from University Paris 6 in 1984 and 1987, respectively. He is currently a professor at the School of Knowledge Science, Japan Advanced Institute of Science and Technology (JAIST). His research interests include machine learning, knowledge-based systems, and knowledge discovery and data mining.

Duc Dung Nguyen: He received a B.A. degree from Hanoi University of Pedagogy in 1993. He is currently a master's student at the School of Knowledge Science, Japan Advanced Institute of Science and Technology (JAIST). His research interests include pattern recognition, and knowledge discovery and data mining.

About this article

Cite this article

Ho, T.B., Nguyen, D.D. Chance discovery and learning minority classes. New Gener Comput 21, 149–161 (2003). https://doi.org/10.1007/BF03037632

  • DOI: https://doi.org/10.1007/BF03037632
