Local PCA Regression for Missing Data Estimation in Telecommunication Dataset

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6230)

Abstract

Customer churn heavily affects telecommunication services in particular and businesses in general. In the majority of cases, the number of potential churners is much smaller than the number of non-churners. This imbalanced distribution of samples between churners and non-churners is therefore a concern when building a churn prediction model. This paper presents a Local PCA approach that addresses the imbalanced classification problem by generating new churn samples. The experiments were carried out on a large real-world telecommunication dataset and assessed on a churn prediction task. They showed that Local PCA combined with SMOTE outperformed the linear regression and standard PCA data generation techniques.
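
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way local PCA could be used to generate minority-class samples, not the authors' method. It assumes that "local" means PCA fitted separately on clusters of churner samples, that clusters come from a plain k-means, and that new samples are drawn from a Gaussian over the leading principal components of each cluster. The function names (`kmeans`, `local_pca_oversample`), all parameters, and the toy data are hypothetical.

```python
import numpy as np


def kmeans(X, k, rng, n_iter=20):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    # Fancy indexing copies, so updating centers does not modify X.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every sample to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def local_pca_oversample(X_min, n_new, k=2, n_components=2, seed=0):
    """Generate n_new synthetic minority samples (assumed scheme, see text)."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    labels = kmeans(X_min, k, rng)

    # Keep only clusters with at least two points so a variance exists.
    clusters = [X_min[labels == j] for j in range(k)]
    clusters = [c for c in clusters if len(c) >= 2]
    sizes = np.array([len(c) for c in clusters])

    # Assign each new sample to a cluster in proportion to cluster size.
    picks = rng.choice(len(clusters), size=n_new, p=sizes / sizes.sum())

    generated = []
    for j, C in enumerate(clusters):
        count = int((picks == j).sum())
        if count == 0:
            continue
        mean = C.mean(axis=0)
        # PCA of the cluster via SVD of the centered data.
        _, s, Vt = np.linalg.svd(C - mean, full_matrices=False)
        m = min(n_components, len(s))
        # Standard deviation of the cluster along each kept component.
        std = s[:m] / np.sqrt(len(C) - 1)
        # Sample component scores, then map back to feature space.
        scores = rng.normal(size=(count, m)) * std
        generated.append(mean + scores @ Vt[:m])
    return np.vstack(generated)


# Toy minority class: two well-separated groups of 30 points in 3 features.
toy_rng = np.random.default_rng(1)
X_min = np.vstack([toy_rng.normal(0.0, 1.0, (30, 3)),
                   toy_rng.normal(10.0, 1.0, (30, 3))])
synthetic = local_pca_oversample(X_min, n_new=50)
print(synthetic.shape)  # (50, 3)
```

Fitting PCA per cluster rather than once over all churners lets each group keep its own dominant directions of variation, which is presumably the motivation for a local variant over standard PCA. The generated rows would be appended to the training set with the churn label before fitting a classifier.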

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sato, T., Huang, B.Q., Huang, Y., Kechadi, M.T. (2010). Local PCA Regression for Missing Data Estimation in Telecommunication Dataset. In: Zhang, BT., Orgun, M.A. (eds) PRICAI 2010: Trends in Artificial Intelligence. PRICAI 2010. Lecture Notes in Computer Science, vol 6230. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15246-7_67

  • DOI: https://doi.org/10.1007/978-3-642-15246-7_67

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15245-0

  • Online ISBN: 978-3-642-15246-7

  • eBook Packages: Computer Science, Computer Science (R0)
