Abstract
Debt detection is important for improving payment accuracy in social security. Because debt detection from customer transactional data can generally be modelled as a fraud detection problem, a straightforward solution is to extract features from transaction sequences and build a sequence classifier for debts. Existing sequence classification methods based on sequential patterns consider only positive patterns. However, in our experience with a large social security application, negative patterns are very useful for accurate debt detection. In this paper, we present a successful case study of debt detection in a large social security application. The central technique is sequence classification using both positive and negative sequential patterns.
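To make the idea of using both kinds of patterns concrete, the following is a minimal sketch and not the method of the paper. It assumes a simplified, hypothetical pattern encoding: a tuple of events that must occur in order, plus an optional negated event that must not occur after the earliest match of those events. The event names `A`, `B`, `C` and the helper names are illustrative only. Each sequence is turned into a binary feature vector that could then be fed to any standard classifier.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical encoding: (positive events in order, optional negated event).
# (("A", "B"), None) reads "A followed by B"; (("A",), "B") reads "A not followed by B".
Pattern = Tuple[Tuple[str, ...], Optional[str]]


def earliest_match_end(seq: Sequence[str], events: Sequence[str]) -> int:
    """Return the index just past the earliest in-order match of events, or -1."""
    pos = 0
    for ev in events:
        try:
            pos = seq.index(ev, pos) + 1
        except ValueError:
            return -1
    return pos


def matches(seq: Sequence[str], pattern: Pattern) -> bool:
    """Check one pattern under the simplified semantics assumed above."""
    positives, negated = pattern
    end = earliest_match_end(seq, positives)
    if end < 0:
        return False
    # A negated event must be absent from the rest of the sequence.
    return negated is None or negated not in seq[end:]


def to_features(seq: Sequence[str], patterns: Sequence[Pattern]) -> List[int]:
    """Encode a sequence as one 0/1 feature per pattern."""
    return [int(matches(seq, p)) for p in patterns]


patterns: List[Pattern] = [(("A", "B"), None), (("A",), "B")]
print(to_features(["A", "C", "B"], patterns))  # [1, 0]
print(to_features(["A", "C"], patterns))       # [0, 1]
```

In this sketch the second sequence is only distinguished by the negative pattern, which is the kind of signal a positive-only feature set would miss.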
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhao, Y. et al. (2009). Debt Detection in Social Security by Sequence Classification Using Both Positive and Negative Patterns. In: Buntine, W., Grobelnik, M., Mladenić, D., Shawe-Taylor, J. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2009. Lecture Notes in Computer Science, vol 5782. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04174-7_42
DOI: https://doi.org/10.1007/978-3-642-04174-7_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04173-0
Online ISBN: 978-3-642-04174-7
eBook Packages: Computer Science, Computer Science (R0)