Distribution-free bounds for relational classification

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Statistical relational learning (SRL) is a subarea of machine learning that addresses the problem of performing statistical inference on data that are correlated rather than independently and identically distributed (i.i.d.), as is generally assumed. For the traditional i.i.d. setting, distribution-free bounds such as the Hoeffding bound exist; they provide confidence bounds on the generalization error of a classification algorithm given its hold-out error on a sample of size N. No bounds of this form currently exist for the types of interactions that relational classification algorithms consider in the data. In this paper, we extend the Hoeffding bounds to the relational setting. In particular, we derive distribution-free bounds for certain classes of data generation models that do not produce i.i.d. data and that are based on the types of interactions considered by relational classification algorithms developed in SRL. We conduct empirical studies on synthetic and real data, which show that these data generation models are realistic and that the derived bounds are tight enough for practical use.
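
For reference, the classical i.i.d. baseline mentioned in the abstract can be sketched as follows. This is only the standard one-sided Hoeffding bound for a 0/1 loss, not the relational bounds derived in the paper: with probability at least 1 − δ, the generalization error is at most the hold-out error plus sqrt(ln(1/δ) / (2N)). The function name `hoeffding_upper_bound` and the parameter names are illustrative assumptions, not taken from the paper.

```python
import math


def hoeffding_upper_bound(holdout_error: float, n: int, delta: float = 0.05) -> float:
    """One-sided Hoeffding upper bound on the generalization error.

    Assumes the n hold-out examples are i.i.d. and the loss lies in [0, 1];
    this is exactly the assumption that relational data violates.
    """
    if not 0.0 <= holdout_error <= 1.0:
        raise ValueError("holdout_error must lie in [0, 1]")
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    # Deviation term: sqrt(ln(1/delta) / (2n)); shrinks as n grows.
    epsilon = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    # An error rate cannot exceed 1, so clip the bound.
    return min(1.0, holdout_error + epsilon)


if __name__ == "__main__":
    # Hypothetical numbers: 10% hold-out error on 1000 examples, 95% confidence.
    print(hoeffding_upper_bound(0.10, 1000, delta=0.05))
```

The deviation term depends only on N and δ, not on the data distribution, which is what "distribution-free" refers to; under dependence between examples the same N no longer justifies this width.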

Author information

Correspondence to Amit Dhurandhar.

About this article

Cite this article

Dhurandhar, A., Dobra, A. Distribution-free bounds for relational classification. Knowl Inf Syst 31, 55–78 (2012). https://doi.org/10.1007/s10115-011-0406-4

  • DOI: https://doi.org/10.1007/s10115-011-0406-4
