Abstract
Differential entropy, or continuous entropy, is a concept in information theory related to the classical (Shannon) entropy (Shannon, 1948). For a random variable with density f on \(\mathbb{R}^{d}\), it is defined by
\[\mathcal{E}(f) = -\int_{\mathbb{R}^{d}} f(x)\log f(x)\,\mathrm{d}x\]
when this integral exists (with the convention \(0\log 0 = 0\)). If d = 1 and f is the uniform density on [0, a], a > 0, then its differential entropy is
\[\mathcal{E}(f) = -\int_{0}^{a} \frac{1}{a}\log \frac{1}{a}\,\mathrm{d}x = \log a.\]
We see that for a < 1, \(\log a < 0\), so that \(\mathcal{E}(f)\) can be negative. The standard exponential has \(\mathcal{E}(f) = 1\), and the standard Gaussian has \(\mathcal{E}(f) =\log \sqrt{2\pi e}\), to give a few examples.
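The three values quoted above can be checked numerically. The following is a minimal sketch, not part of the chapter: it approximates \(-\int f\log f\) with a plain midpoint rule, and the helper name `differential_entropy`, the truncation limits, and the grid size are illustrative assumptions.

```python
import math


def differential_entropy(f, lo, hi, n=200_000):
    """Midpoint-rule approximation of -integral of f(x) log f(x) over [lo, hi].

    Hypothetical helper for illustration; [lo, hi] must carry
    essentially all of the mass of the density f.
    """
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        fx = f(lo + (i + 0.5) * h)
        if fx > 0.0:  # convention 0 log 0 = 0
            total -= fx * math.log(fx) * h
    return total


def uniform(a):
    """Uniform density on [0, a]."""
    return lambda x: 1.0 / a if 0.0 <= x <= a else 0.0


def exponential(x):
    """Standard exponential density."""
    return math.exp(-x) if x >= 0.0 else 0.0


def gaussian(x):
    """Standard Gaussian density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


# Truncation limits below are assumptions chosen so the neglected tails are negligible.
ent_uniform = differential_entropy(uniform(0.5), 0.0, 0.5)
ent_exp = differential_entropy(exponential, 0.0, 50.0)
ent_gauss = differential_entropy(gaussian, -12.0, 12.0)

print(ent_uniform, math.log(0.5))                          # negative, since a = 0.5 < 1
print(ent_exp, 1.0)
print(ent_gauss, math.log(math.sqrt(2.0 * math.pi * math.e)))
```

Each printed pair places the numerical approximation next to the closed-form value from the text; the uniform case with a = 0.5 illustrates that the differential entropy can be negative.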
References
H. Akaike, An approximation to the density function. Ann. Inst. Stat. Math. 6, 127–132 (1954)
A. Antos, L. Devroye, L. Györfi, Lower bounds for Bayes error estimation. IEEE Trans. Pattern Anal. Mach. Intell. 21, 643–645 (1999)
J.-Y. Audibert, A.B. Tsybakov, Fast learning rates for plug-in classifiers. Ann. Stat. 35, 608–633 (2007)
T. Bailey, A. Jain, A note on distance-weighted k-nearest neighbor rules. IEEE Trans. Syst. Man Cybern. 8, 311–313 (1978)
J. Beck, The exponential rate of convergence of error for \(k_{n}\)-NN nonparametric regression and decision. Probl. Control Inf. Theory 8, 303–311 (1979)
J. Beirlant, E.J. Dudewicz, L. Györfi, E.C. van der Meulen, Nonparametric entropy estimation: an overview. Int. J. Math. Stat. Sci. 6, 17–39 (1997)
G. Bennett, Probability inequalities for the sum of independent random variables. J. Am. Stat. Assoc. 57, 33–45 (1962)
A. Berlinet, S. Levallois, Higher order analysis at Lebesgue points, in Asymptotics in Statistics and Probability, ed. by M.L. Puri. Papers in Honor of George Gregory Roussas (VSP, Utrecht, 2000), pp. 17–32
S.N. Bernstein, The Theory of Probabilities (Gastehizdat Publishing House, Moscow, 1946)
A.C. Berry, The accuracy of the Gaussian approximation to the sum of independent variates. Trans. Am. Math. Soc. 49, 122–136 (1941)
G. Biau, F. Cérou, A. Guyader, On the rate of convergence of the bagged nearest neighbor estimate. J. Mach. Learn. Res. 11, 687–712 (2010)
G. Biau, F. Chazal, L. Devroye, D. Cohen-Steiner, C. Rodríguez, A weighted k-nearest neighbor density estimate for geometric inference. Electron. J. Stat. 5, 204–237 (2011)
G. Biau, L. Devroye, V. Dujmović, A. Krzyżak, An affine invariant k-nearest neighbor regression estimate. J. Multivar. Anal. 112, 24–34 (2012)
G. Biau, F. Cérou, A. Guyader, New insights into Approximate Bayesian Computation. Ann. Inst. Henri Poincaré (B) Probab. Stat. 51, 376–403 (2015)
P.J. Bickel, L. Breiman, Sums of functions of nearest neighbor distances, moment bounds, limit theorems and a goodness of fit test. Ann. Probab. 11, 185–214 (1983)
P.J. Bickel, Y. Ritov, Estimating integrated squared density derivatives: sharp best order of convergence estimates. Sankhyā A 50, 381–393 (1988)
P. Billingsley, Probability and Measure, 3rd edn. (Wiley, New York, 1995)
N.H. Bingham, C.M. Goldie, J.L. Teugels, Regular Variation (Cambridge University Press, Cambridge, 1987)
L. Birgé, P. Massart, Estimation of integral functionals of a density. Ann. Stat. 23, 11–29 (1995)
K. Böröczky, Jr., Finite Packing and Covering (Cambridge University Press, Cambridge, 2004)
K. Böröczky, Jr., G. Wintsche, Covering the sphere by equal balls, in Discrete and Computational Geometry: The Goodman-Pollack Festschrift, ed. by B. Aronov, S. Basu, J. Pach, M. Sharir (Springer, Berlin, 2003), pp. 235–251
S. Boucheron, G. Lugosi, P. Massart, Concentration Inequalities: A Nonasymptotic Theory of Independence (Oxford University Press, Oxford, 2013)
L. Breiman, W. Meisel, E. Purcell, Variable kernel estimates of multivariate densities. Technometrics 19, 135–144 (1977)
T. Cacoullos, Estimation of a multivariate density. Ann. Inst. Stat. Math. 18, 178–189 (1966)
F. Cérou, A. Guyader, Nearest neighbor classification in infinite dimension. ESAIM: Probab. Stat. 10, 340–355 (2006)
P.E. Cheng, Strong consistency of nearest neighbor regression function estimators. J. Multivar. Anal. 15, 63–72 (1984)
H. Chernoff, A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Stat. 23, 493–507 (1952)
G. Collomb, Estimation de la régression par la méthode des k points les plus proches avec noyau: quelques propriétés de convergence ponctuelle, in Statistique non Paramétrique Asymptotique, ed. by J.-P. Raoult. Lecture Notes in Mathematics, vol. 821 (Springer, Berlin, 1980), pp. 159–175
G. Collomb, Estimation non paramétrique de la régression: Revue bibliographique. Int. Stat. Rev. 49, 75–93 (1981)
T.M. Cover, Estimation by the nearest neighbor rule. IEEE Trans. Inf. Theory 14, 50–55 (1968)
T.M. Cover, P.E. Hart, Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 13, 21–27 (1967)
T.M. Cover, J.A. Thomas, Elements of Information Theory, 2nd edn. (Wiley, Hoboken, 2006)
T.M. Cover, J.M. Van Campenhout, On the possible orderings in the measurement selection problem. IEEE Trans. Syst. Man Cybern. 7, 657–661 (1977)
S. Csibi, Stochastic Processes with Learning Properties (Springer, Wien, 1975)
B.V. Dasarathy, Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques (IEEE Computer Society Press, Los Alamitos, 1991)
M. de Guzmán, Differentiation of Integrals in \(\mathbb{R}^{n}\). Lecture Notes in Mathematics, vol. 481 (Springer, Berlin, 1975)
P.A. Devijver, A note on ties in voting with the k-NN rule. Pattern Recogn. 10, 297–298 (1978)
P.A. Devijver, New error bounds with the nearest neighbor rule. IEEE Trans. Inf. Theory 25, 749–753 (1979)
P.A. Devijver, An overview of asymptotic properties of nearest neighbor rules, in Pattern Recognition in Practice, ed. by E.S. Gelsema, L.N. Kanal (North-Holland, Amsterdam, 1980), pp. 343–350
L. Devroye, On the almost everywhere convergence of nonparametric regression function estimates. Ann. Stat. 9, 1310–1319 (1981a)
L. Devroye, On the inequality of Cover and Hart in nearest neighbor discrimination. IEEE Trans. Pattern Anal. Mach. Intell. 3, 75–78 (1981b)
L. Devroye, On the asymptotic probability of error in nonparametric discrimination. Ann. Stat. 9, 1320–1327 (1981c)
L. Devroye, Necessary and sufficient conditions for the pointwise convergence of nearest neighbor regression function estimates. Z. Wahrscheinlichkeitstheorie Verwandte Geb. 61, 467–481 (1982)
L. Devroye, Non-Uniform Random Variate Generation (Springer, New York, 1986)
L. Devroye, A Course in Density Estimation (Birkhäuser, Boston, 1987)
L. Devroye, Automatic pattern recognition: a study of the probability of error. IEEE Trans. Pattern Anal. Mach. Intell. 10, 530–543 (1988)
L. Devroye, Exponential inequalities in nonparametric estimation, in Nonparametric Functional Estimation and Related Topics, ed. by G. Roussas (Springer, Dordrecht, 1991a), pp. 31–44
L. Devroye, A universal k-nearest neighbor procedure in discrimination, in Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques, ed. by B.V. Dasarathy (IEEE Computer Society Press, Los Alamitos, 1991b), pp. 101–106
L. Devroye, L. Györfi, Nonparametric Density Estimation: The \(L_{1}\) View (Wiley, New York, 1985)
L. Devroye, A. Krzyżak, New multivariate product density estimators. J. Multivar. Anal. 82, 88–110 (2002)
L. Devroye, G. Lugosi, Combinatorial Methods in Density Estimation (Springer, New York, 2001)
L. Devroye, T.J. Wagner, Nearest neighbor methods in discrimination, in Handbook of Statistics, vol. 2, ed. by P.R. Krishnaiah, L.N. Kanal (North-Holland, Amsterdam, 1982), pp. 193–197
L. Devroye, L. Györfi, A. Krzyżak, G. Lugosi, On the strong universal consistency of nearest neighbor regression function estimates. Ann. Stat. 22, 1371–1385 (1994)
L. Devroye, L. Györfi, G. Lugosi, A Probabilistic Theory of Pattern Recognition (Springer, New York, 1996)
L. Devroye, L. Györfi, D. Schäfer, H. Walk, The estimation problem of minimum mean squared error. Stat. Decis. 21, 15–28 (2003)
L.P. Devroye, The uniform convergence of nearest neighbor regression function estimators and their application in optimization. IEEE Trans. Inf. Theory 24, 142–151 (1978)
L.P. Devroye, T.J. Wagner, The strong uniform consistency of nearest neighbor density estimates. Ann. Stat. 5, 536–540 (1977)
W. Doeblin, Exposé de la théorie des chaînes simples constantes de Markov à un nombre fini d’états. Rev. Math. Union Interbalkanique 2, 77–105 (1937)
D. Donoho, One-sided inference about functionals of a density. Ann. Stat. 16, 1390–1420 (1988)
B. Efron, C. Stein, The jackknife estimate of variance. Ann. Stat. 9, 586–596 (1981)
C.-G. Esseen, On the Liapunoff limit of error in the theory of probability. Arkiv Matematik Astronomi Fysik A28, 1–19 (1942)
D. Evans, A.J. Jones, W.M. Schmidt, Asymptotic moments of near-neighbour distance distributions. Proc. R. Soc. A 458, 2839–2849 (2002)
C. Fefferman, E.M. Stein, Some maximal inequalities. Am. J. Math. 93, 107–115 (1971)
E. Fix, J.L. Hodges, Discriminatory analysis – Nonparametric discrimination: consistency properties. Project 21-49-004, Report Number 4 (USAF School of Aviation Medicine, Randolph Field, Texas, 1951), pp. 261–279
E. Fix, J.L. Hodges, Discriminatory analysis – Nonparametric discrimination: small sample performance. Project 21-49-004, Report Number 11 (USAF School of Aviation Medicine, Randolph Field, Texas, 1952), pp. 280–322
E. Fix, J.L. Hodges, Discriminatory analysis: nonparametric discrimination: consistency properties, in Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques, ed. by B.V. Dasarathy (IEEE Computer Society Press, Los Alamitos, 1991a), pp. 32–39
E. Fix, J.L. Hodges, Discriminatory analysis: nonparametric discrimination: small sample performance, in Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques, ed. by B.V. Dasarathy (IEEE Computer Society Press, Los Alamitos, 1991b), pp. 40–56
J. Fritz, Distribution-free exponential error bound for nearest neighbor pattern classification. IEEE Trans. Inf. Theory 21, 552–557 (1975)
S. Gadat, T. Klein, C. Marteau, Classification with the nearest neighbor rule in general finite dimensional spaces. Ann. Stat. arXiv:1411.0894 (2015)
J. Galambos, The Asymptotic Theory of Extreme Order Statistics (Wiley, New York, 1978)
M. Giaquinta, G. Modica, Mathematical Analysis: An Introduction to Functions of Several Variables (Birkhäuser, Boston, 2009)
N. Glick, Sample-based multinomial classification. Biometrics 29, 241–256 (1973)
G.R. Grimmett, D.R. Stirzaker, Probability and Random Processes, 3rd edn. (Oxford University Press, Oxford, 2001)
B. Grünbaum, Arrangements and Spreads (American Mathematical Society, Providence, 1972)
I. Guyon, A. Elisseeff, An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)
L. Györfi, An upper bound of error probabilities for multihypothesis testing and its application in adaptive pattern recognition. Probl. Control Inf. Theory 5, 449–457 (1976)
L. Györfi, On the rate of convergence of nearest neighbor rules. IEEE Trans. Inf. Theory 24, 509–512 (1978)
L. Györfi, Z. Györfi, An upper bound on the asymptotic error probability of the k-nearest neighbor rule for multiple classes. IEEE Trans. Inf. Theory 24, 512–514 (1978)
L. Györfi, M. Kohler, A. Krzyżak, H. Walk, A Distribution-Free Theory of Nonparametric Regression (Springer, New York, 2002)
T. Hagerup, C. Rüb, A guided tour of Chernoff bounds. Inf. Process. Lett. 33, 305–308 (1990)
P. Hall, On near neighbour estimates of a multivariate density. J. Multivar. Anal. 13, 24–39 (1983)
T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edn. (Springer, New York, 2009)
W. Hoeffding, Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58, 13–30 (1963)
O. Kallenberg, Foundations of Modern Probability, 2nd edn. (Springer, New York, 2002)
R.M. Karp, Probabilistic Analysis of Algorithms. Class Notes (University of California, Berkeley, 1988)
E. Kaufmann, R.-D. Reiss, On conditional distributions of nearest neighbors. J. Multivar. Anal. 42, 67–76 (1992)
J.D. Kečkić, P.M. Vasić, Some inequalities for the gamma function. Publ. Inst. Math. 11, 107–114 (1971)
J. Kiefer, Iterated logarithm analogues for sample quantiles when \(p_{n} \downarrow 0\), in Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, ed. by L.M. Le Cam, J. Neyman, E.L. Scott. Theory of Statistics, vol. 1 (University of California Press, Berkeley, 1972), pp. 227–244
B.K. Kim, J. Van Ryzin, Uniform consistency of a histogram density estimator and modal estimation. Commun. Stat. 4, 303–315 (1975)
R. Kohavi, G.H. John, Wrappers for feature subset selection. Artif. Intell. 97, 273–324 (1997)
M. Kohler, A. Krzyżak, On the rate of convergence of local averaging plug-in classification rules under a margin condition. IEEE Trans. Inf. Theory 53, 1735–1742 (2007)
M. Kohler, A. Krzyżak, H. Walk, Rates of convergence for partitioning and nearest neighbor regression estimates with unbounded data. J. Multivar. Anal. 97, 311–323 (2006)
L.F. Kozachenko, N.N. Leonenko, Sample estimate of the entropy of a random vector. Probl. Inf. Transm. 23, 95–101 (1987)
S.R. Kulkarni, S.E. Posner, Rates of convergence of nearest neighbor estimation under arbitrary sampling. IEEE Trans. Inf. Theory 41, 1028–1039 (1995)
V. Kumar, S. Minz, Feature selection: a literature review. Smart Comput. Rev. 4, 211–229 (2014)
S.L. Lai, Large Sample Properties of k-Nearest Neighbor Procedures. Ph.D. Thesis, University of California, Los Angeles, 1977
B. Laurent, Efficient estimation of integral functionals of a density. Ann. Stat. 24, 659–681 (1996)
N. Leonenko, L. Pronzato, V. Savani, A class of Rényi information estimators for multidimensional densities. Ann. Stat. 36, 2153–2182 (2008)
E. Liitiäinen, A. Lendasse, F. Corona, Non-parametric residual variance estimation in supervised learning, in Computational and Ambient Intelligence: 9th International Work-Conference on Artificial Neural Networks, ed. by F. Sandoval, A. Prieto, J. Cabestany, M. Graña (Springer, Berlin, 2007), pp. 63–71
E. Liitiäinen, A. Lendasse, F. Corona, Bounds on the mean power-weighted nearest neighbour distance. Proc. R. Soc. A 464, 2293–2301 (2008a)
E. Liitiäinen, A. Lendasse, F. Corona, On nonparametric residual variance estimation. Neural Process. Lett. 28, 155–167 (2008b)
E. Liitiäinen, F. Corona, A. Lendasse, Residual variance estimation using a nearest neighbor statistic. J. Multivar. Anal. 101, 811–823 (2010)
J.W. Lindeberg, Über das Exponentialgesetz in der Wahrscheinlichkeitsrechnung. Ann. Acad. Sci. Fenn. 16, 1–23 (1920)
D.O. Loftsgaarden, C.P. Quesenberry, A nonparametric estimate of a multivariate density function. Ann. Math. Stat. 36, 1049–1051 (1965)
Y.P. Mack, Asymptotic normality of multivariate k-NN density estimates. Sankhyā A 42, 53–63 (1980)
Y.P. Mack, Local properties of k-NN regression estimates. SIAM J. Algorithms Discret. Meth. 2, 311–323 (1981)
Y.P. Mack, Rate of strong uniform convergence of k-NN density estimates. J. Stat. Plann. Inference 8, 185–192 (1983)
Y.P. Mack, M. Rosenblatt, Multivariate k-nearest neighbor density estimates. J. Multivar. Anal. 9, 1–15 (1979)
J. Marcinkiewicz, A. Zygmund, Sur les fonctions indépendantes. Fundam. Math. 29, 60–90 (1937)
A.W. Marshall, I. Olkin, Inequalities: Theory of Majorization and Its Applications (Academic Press, New York, 1979)
P. Massart, Concentration Inequalities and Model Selection (Springer, Berlin, 2007)
P. Massart, E. Nédélec, Risk bounds for statistical learning. Ann. Stat. 34, 2326–2366 (2006)
J. Matoušek, Lectures on Discrete Geometry (Springer, New York, 2002)
C. McDiarmid, On the method of bounded differences, in Surveys in Combinatorics, ed. by J. Siemons. London Mathematical Society Lecture Note Series, vol. 141 (Cambridge University Press, Cambridge, 1989), pp. 148–188
J.V. Michalowicz, J.M. Nichols, F. Bucholtz, Handbook of Differential Entropy (CRC, Boca Raton, 2014)
K.S. Miller, Multidimensional Gaussian Distributions (Wiley, New York, 1964)
J.W. Milnor, On the Betti numbers of real algebraic varieties. Proc. Am. Math. Soc. 15, 275–280 (1964)
D.S. Moore, E.G. Henrichon, Uniform consistency of some estimates of a density function. Ann. Math. Stat. 40, 1499–1502 (1969)
D.S. Moore, J.W. Yackel, Large sample properties of nearest neighbor density function estimators, in Statistical Decision Theory and Related Topics II: Proceedings of a Symposium Held at Purdue University, May 17–19, 1976, ed. by S.S. Gupta, D.S. Moore (Academic Press, New York, 1977a), pp. 269–279
D.S. Moore, J.W. Yackel, Consistency properties of nearest neighbor density function estimators. Ann. Stat. 5, 143–154 (1977b)
C. Mortici, C.-P. Chen, New sharp double inequalities for bounding the gamma and digamma function. Analele Universităţii de Vest din Timişoara, Seria Matematică-Informatică 49, 69–75 (2011)
E.A. Nadaraya, On estimating regression. Theory Probab. Appl. 9, 141–142 (1964)
E.A. Nadaraya, On nonparametric estimates of density functions and regression curves. Theory Probab. Appl. 10, 186–190 (1965)
R. Olshen, Discussion on a paper by C.J. Stone. Ann. Stat. 5, 632–633 (1977)
E. Parzen, On the estimation of a probability density function and the mode. Ann. Math. Stat. 33, 1065–1076 (1962)
M.D. Penrose, J.E. Yukich, Laws of large numbers and nearest neighbor distances, in Advances in Directional and Linear Statistics: A Festschrift for Sreenivasa Rao Jammalamadaka, ed. by M.T. Wells, A. SenGupta (Physica, Heidelberg, 2011), pp. 189–199
V.V. Petrov, Sums of Independent Random Variables (Springer, Berlin, 1975)
I.G. Petrovskiĭ, O.A. Oleĭnik, On the topology of real algebraic surfaces. Am. Math. Soc. Translat. 70 (1952)
R. Pollack, M.-F. Roy, On the number of cells defined by a set of polynomials. Comp. R. Acad. Sci. Sér. 1: Math. 316, 573–577 (1993)
S.T. Rachev, L. Rüschendorf, Mass Transportation Problems. Volume I: Theory (Springer, New York, 1998)
B.L.S. Prakasa Rao, Nonparametric Functional Estimation (Academic Press, Orlando, 1983)
A. Rényi, On measures of entropy and information, in Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability. Contributions to the Theory of Statistics, vol. 1 (University of California Press, Berkeley, 1961), pp. 547–561
A. Rényi, Probability Theory (North-Holland, Amsterdam, 1970)
C. Rodríguez, J. Van Ryzin, Large sample properties of maximum entropy histograms. IEEE Trans. Inf. Theory 32, 751–759 (1986)
C.C. Rodríguez, On a new class of density estimators. Technical Report (Department of Mathematics and Statistics, University at Albany, Albany, 1986)
C.C. Rodríguez, Optimal recovery of local truth, in Bayesian Inference and Maximum Entropy Methods in Science and Engineering: 19th International Workshop, vol. 567, ed. by J.T. Rychert, G.J. Erickson, C.R. Smith (American Institute of Physics Conference Proceedings, Melville, 2001), pp. 89–115
C.C. Rodríguez, J. Van Ryzin, Maximum entropy histograms. Stat. Probab. Lett. 3, 117–120 (1985)
M. Rosenblatt, Remarks on some nonparametric estimates of a density function. Ann. Math. Stat. 27, 832–837 (1956)
R.M. Royall, A class of non-parametric estimates of a smooth regression function. Technical Report No. 14 (Department of Statistics, Stanford University, Stanford, 1966)
R.J. Samworth, Optimal weighted nearest neighbour classifiers. Ann. Stat. 40, 2733–2763 (2012)
D.W. Scott, Multivariate Density Estimation: Theory, Practice, and Visualization (Wiley, New York, 1992)
C.E. Shannon, A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423 (1948)
B.W. Silverman, Density Estimation for Statistics and Data Analysis (Chapman & Hall, London, 1986)
J.M. Steele, An Efron-Stein inequality for nonparametric statistics. Ann. Stat. 14, 753–758 (1986)
E.M. Stein, Singular Integrals and Differentiability Properties of Functions (Princeton University Press, Princeton, 1970)
C.J. Stone, Consistent nonparametric regression (with discussion). Ann. Stat. 5, 595–645 (1977)
C.J. Stone, Optimal rates of convergence for nonparametric estimators. Ann. Stat. 8, 1348–1360 (1980)
C.J. Stone, Optimal global rates of convergence for nonparametric regression. Ann. Stat. 10, 1040–1053 (1982)
W. Stute, Asymptotic normality of nearest neighbor regression function estimates. Ann. Stat. 12, 917–926 (1984)
G.R. Terrell, Mathematical Statistics: A Unified Introduction (Springer, New York, 1999)
R. Thom, On the homology of real algebraic varieties, in Differential and Combinatorial Topology, ed. by S.S. Cairns (Princeton University Press, Princeton, 1965, in French)
Y.L. Tong, Probability Inequalities in Multivariate Distributions (Academic Press, New York, 1980)
A.B. Tsybakov, Optimal aggregation of classifiers in statistical learning. Ann. Stat. 32, 135–166 (2004)
A.B. Tsybakov, Introduction to Nonparametric Estimation (Springer, New York, 2008)
A.B. Tsybakov, E.C. van der Meulen, Root-n consistent estimators of entropy for densities with unbounded support. Scand. J. Stat. 23, 75–83 (1996)
L.R. Turner, Inverse of the Vandermonde matrix with applications. NASA Technical Note D-3547 (Washington, 1966)
A.W. van der Vaart, Asymptotic Statistics (Cambridge University Press, Cambridge, 1998)
J. Van Ryzin, Bayes risk consistency of classification procedures using density estimation. Sankhyā A 28, 161–170 (1966)
V.N. Vapnik, A.Y. Chervonenkis, On the uniform convergence of relative frequencies of events to their probabilities. Theory Probab. Appl. 16, 264–280 (1971)
T.J. Wagner, Strong consistency of a nonparametric estimate of a density function. IEEE Trans. Syst. Man Cybern. 3, 289–290 (1973)
H. Walk, A universal strong law of large numbers for conditional expectations via nearest neighbors. J. Multivar. Anal. 99, 1035–1050 (2008)
H.E. Warren, Lower bounds for approximation by nonlinear manifolds. Trans. Am. Math. Soc. 133, 167–178 (1968)
G.S. Watson, Smooth regression analysis. Sankhyā A 26, 359–372 (1964)
G.S. Watson, M.R. Leadbetter, On the estimation of the probability density. Ann. Math. Stat. 34, 480–491 (1963)
R.L. Wheeden, A. Zygmund, Measure and Integral: An Introduction to Real Analysis (Marcel Dekker, New York, 1977)
P. Whittle, On the smoothing of probability density functions. J. R. Stat. Soc. B 20, 334–343 (1958)
C.T. Wolverton, T.J. Wagner, Asymptotically optimal discriminant functions for pattern classification. IEEE Trans. Inf. Theory 15, 258–265 (1969)
L.C. Zhao, Exponential bounds of mean error for the nearest neighbor estimates of regression functions. J. Multivar. Anal. 21, 168–178 (1987)
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Biau, G., Devroye, L. (2015). Entropy estimation. In: Lectures on the Nearest Neighbor Method. Springer Series in the Data Sciences. Springer, Cham. https://doi.org/10.1007/978-3-319-25388-6_7
DOI: https://doi.org/10.1007/978-3-319-25388-6_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25386-2
Online ISBN: 978-3-319-25388-6
eBook Packages: Mathematics and Statistics (R0)