Information Sciences

Volume 179, Issue 14, 27 June 2009, Pages 2426–2433

Some properties of Rényi entropy and Rényi entropy rate

https://doi.org/10.1016/j.ins.2009.03.002

Abstract

In this paper, we define the conditional Rényi entropy and show that the so-called chain rule holds for the Rényi entropy. Then, we introduce a relation for the rate of Rényi entropy and use it to derive the rate of the Rényi entropy for an irreducible-aperiodic Markov chain. We also show that the bound for the Rényi entropy rate is simply the Shannon entropy rate.

Introduction

In 1948, Shannon [30] introduced what is now called Shannon entropy on the basis of a set of axioms. Subsequently, other axiom systems were proposed by Khinchin [16] and others [10]. In 1961, Rényi [27] generalized Shannon entropy to a one-parameter family of entropies by defining the entropy of order α, which is called the Rényi entropy. Some authors have modified the axioms associated with the Rényi entropy [10]. Recently, Jizba and Arimitsu [14] built on the existing axioms for the Shannon and Rényi entropies to introduce a new set of axioms from which both entropies can be derived.

The concept of Rényi entropy has a number of applications in coding theory [5], [9], statistical mechanics [7], [17], statistics and related fields [1], [13], [22], [34] and other areas (see for instance [3] and the references therein).

We know that in the case of Shannon entropy, a conditional entropy can be derived for random variables. Furthermore, there is a relation between the conditional Shannon entropy and the joint Shannon entropy of random variables; this relation is called the chain rule [6]. For the conditional Rényi entropy of random variables, however, there is no established definition yet. Cachin [4] has given a definition modeled on the conditional Shannon entropy, but the chain rule does not hold for it. One can, however, use the axioms introduced by Jizba and Arimitsu [14] to derive a conditional Rényi entropy and thereby prove the validity of the chain rule.

The application of the conditional Rényi entropy can be found in many areas such as quantum systems [32], biomedical engineering [20], cryptography [4], fields related to statistics [15], economics [2] and other areas [8], [23].

With the introduction of entropy into probability theory, entropy became linked with stochastic processes, and the entropy rate was defined for such processes. For a discrete-time process $X = (X_n)_{n \geqslant 1}$, the entropy at time $n$ is defined as the entropy of the $n$-dimensional random vector $(X_1, \dots, X_n)$, and the entropy rate is defined as the limit of the entropy at time $n$ divided by $n$, when the limit exists. For a stationary stochastic process with finite state space, Shannon [30] proved that the Shannon entropy rate exists. He also obtained the entropy rate for an ergodic Markov chain in the form
$$\bar{H}_1(X) = -\sum_{i,j} \pi_i \, p_{ij} \log p_{ij}, \tag{1.1}$$
where $p_{ij}$, $i, j = 1, 2, \dots, n$, are the transition probabilities and $\Pi = (\pi_i)$, $i = 1, 2, \dots, n$, is the stationary distribution of the chain. Here, $\Pi = \Pi P$, where $P = (p_{ij})$ and $\sum_{i=1}^{n} \pi_i = 1$.
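To make (1.1) concrete, here is a minimal numerical sketch (ours, not the paper's) in Python; it assumes NumPy and an arbitrary two-state transition matrix P chosen purely for illustration:

```python
# Illustrative only: a hypothetical two-state ergodic chain, not an
# example from the paper.
import numpy as np

P = np.array([[0.9, 0.1],
              [0.4, 0.6]])  # transition probabilities p_ij

# Stationary distribution: left eigenvector of P for eigenvalue 1,
# normalized so that sum_i pi_i = 1 (here pi = (0.8, 0.2)).
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
pi = pi / pi.sum()

# Shannon entropy rate (1.1): H_1 = -sum_{i,j} pi_i p_ij log p_ij,
# with the convention 0 log 0 = 0.
terms = np.where(P > 0, P * np.log(np.where(P > 0, P, 1.0)), 0.0)
H1 = -np.sum(pi[:, None] * terms)
print(H1)  # ~0.3947 nats
```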

The existence of the Shannon entropy rate for an irreducible Markov chain with infinite state space was proved by Klimko and Sucheston [18]. It can be shown that (1.1) is valid for the rate of Shannon entropy of an irreducible Markov chain with infinite state space.

The Rényi entropy rate was first defined by Rached et al. [25] for an ergodic Markov chain with a finite state space. The entropy rate of this process is expressed as
$$\bar{H}_\alpha(X) = \frac{1}{1-\alpha} \log \lambda, \qquad \alpha > 0,\ \alpha \neq 1,$$
where $\lambda$ is the largest positive eigenvalue of the matrix $(p_{ij}^{\alpha})_{i,j=1,2,\dots,n}$ and $p_{ij}$ are the transition probabilities of the chain. Recently, it has been shown in Ref. [24] that the Rényi entropy rate for an irreducible-aperiodic Markov chain with infinite state space is
$$\bar{H}_\alpha(X) = \frac{1}{1-\alpha} \log R^{-1}, \qquad \alpha > 0,\ \alpha \neq 1,$$
where $R$ is the convergence radius of the matrix $(p_{ij}^{\alpha})$, $i, j = 1, 2, \dots$, and $p_{ij}$ are the transition probabilities of the chain.
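As an illustration of the finite-state formula (again ours, not the paper's; renyi_rate is a hypothetical helper, and P is the same arbitrary example as in the previous sketch), the following computes the Perron root of $(p_{ij}^{\alpha})$ with NumPy and returns $\frac{1}{1-\alpha}\log\lambda$:

```python
import numpy as np

def renyi_rate(P: np.ndarray, alpha: float) -> float:
    """Rényi entropy rate of a finite ergodic chain: (1/(1-alpha)) log lambda,
    where lambda is the Perron root of the elementwise power matrix (p_ij^alpha)."""
    assert alpha > 0 and alpha != 1
    lam = np.max(np.real(np.linalg.eigvals(P ** alpha)))
    return float(np.log(lam) / (1.0 - alpha))

P = np.array([[0.9, 0.1],
              [0.4, 0.6]])  # same illustrative chain as above
print(renyi_rate(P, 0.5))   # ~0.5511 nats
print(renyi_rate(P, 2.0))   # ~0.2064 nats
```

One can also check numerically that renyi_rate(P, alpha) approaches the Shannon rate of the previous sketch as alpha tends to 1.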

The Rényi entropy rate has several operational characterizations in coding theory and in the study of error exponents [5], [25], [26], [31]. The Rényi entropy rate for general stochastic processes is treated in [12]; for fields related to statistics, see [11].

This paper is organized as follows. In Section 2, the conditional Rényi entropy is obtained and some properties of the Rényi entropy are presented. In Section 3, we show that the chain rule holds for the Rényi entropy, and introduce a relation for obtaining the rate of Rényi entropy. Then, using this relation, the Rényi entropy rate for an irreducible-aperiodic Markov chain, with finite and infinite state spaces, is derived. Finally, in Section 4, we show that for an irreducible-aperiodic Markov chain the bound for the Rényi entropy rate is simply the Shannon entropy rate.


Rényi entropy

We know that the Rényi entropy is a generalization of the Shannon entropy. For the Shannon entropy, the conditional entropy has already been derived and its properties are known. In the case of the Rényi entropy, however, the conditional entropy does not yet have an established definition. Cachin [4] has given a definition similar to that of the conditional Shannon entropy, but the chain rule obtained for the Shannon entropy does not hold for it. In this section, we present a suitable …

The rate of Rényi entropy

Let $(X_n)_{n \geqslant 1}$ be a discrete-time process and assume that the random vector $(X_1, \dots, X_n)$ has the probability distribution
$$p(i_1, \dots, i_n) = P(X_1 = i_1, \dots, X_n = i_n). \tag{3.1}$$
Then, by the following theorem, we show that the chain rule holds for the Rényi entropy.

Theorem 3.1 (Chain rule)

Let $(X_1, \dots, X_n)$ be a random vector with the probability distribution $p(i_1, \dots, i_n)$ and let $H_\alpha(X_1, \dots, X_n)$ be its Rényi entropy. Then:
$$H_\alpha(X_1, \dots, X_n) = \sum_{i=1}^{n} H_\alpha(X_i \mid X_1, \dots, X_{i-1}).$$

Proof

For the random vector $(X_1, \dots, X_n)$, we have by (2.4) and (3.1):
$$H_\alpha(X_1, \dots, X_n) = \frac{1}{1-\alpha} \log \sum_{i_1, \dots, i_n} p^{\alpha}(i_1, \dots, i_n).$$
We can write $\sum_{i_1, \dots, i_n} p^{\alpha} \dots$
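As a numerical sanity check on Theorem 3.1 (this example is ours and not part of the paper), one can verify the chain rule for $n = 2$, assuming the conditional Rényi entropy takes the ratio form $H_\alpha(Y \mid X) = \frac{1}{1-\alpha} \log\big(\sum_{x,y} p^{\alpha}(x,y) \big/ \sum_{x} p^{\alpha}(x)\big)$, which is the form under which the proof telescopes; the joint distribution pxy below is arbitrary:

```python
import numpy as np

alpha = 0.7
pxy = np.array([[0.10, 0.30],
                [0.25, 0.35]])  # arbitrary joint distribution p(x, y)
px = pxy.sum(axis=1)            # marginal p(x)

H_joint = np.log(np.sum(pxy ** alpha)) / (1 - alpha)  # H_a(X, Y)
H_x = np.log(np.sum(px ** alpha)) / (1 - alpha)       # H_a(X)
# Assumed ratio form of the conditional Rényi entropy, under which
# the chain rule telescopes:
H_y_given_x = np.log(np.sum(pxy ** alpha) / np.sum(px ** alpha)) / (1 - alpha)

# Chain rule: H_a(X, Y) = H_a(X) + H_a(Y | X)
assert np.isclose(H_joint, H_x + H_y_given_x)
```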

Bounds for the Rényi entropy rate

Using the fact that the Rényi entropy is a decreasing function of $\alpha$ (Remark 2.1), we have the following inequalities:
$$\text{for } \alpha < 1: \quad H_1(\cdot) < H_\alpha(\cdot), \tag{4.1}$$
$$\text{for } \alpha > 1: \quad H_\alpha(\cdot) < H_1(\cdot), \tag{4.2}$$
where $H_1$ is the Shannon entropy.

Now, we obtain the bounds for the Rényi entropy rate of an irreducible-aperiodic Markov chain by using (4.1), (4.2).

For a random vector $(X_1, \dots, X_n)$, inequality (4.1) becomes
$$H_1(X_1, \dots, X_n) < H_\alpha(X_1, \dots, X_n),$$
and therefore
$$\frac{1}{n} H_1(X_1, \dots, X_n) < \frac{1}{n} H_\alpha(X_1, \dots, X_n).$$
Taking the limit $n \to \infty$ of the entropy and considering that the rate of Rényi …
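For the illustrative two-state chain used in the earlier sketches (reusing the hypothetical P, H1, and renyi_rate defined there), the bounds (4.1) and (4.2) can be checked numerically at the level of entropy rates:

```python
# Reuses P, H1 and renyi_rate from the earlier illustrative sketches.
assert H1 < renyi_rate(P, 0.5)   # alpha < 1: Shannon rate is a lower bound (4.1)
assert renyi_rate(P, 2.0) < H1   # alpha > 1: Shannon rate is an upper bound (4.2)
```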

Conclusion

In this paper, we introduced a new definition of the conditional Rényi entropy, based on the axioms introduced by Jizba and Arimitsu, and demonstrated that the chain rule holds for this definition. We then derived a relation for obtaining the rate of the Rényi entropy and used it to obtain the Rényi entropy rate for an irreducible-aperiodic Markov chain. Furthermore, we showed that the Shannon entropy rate is a bound for the Rényi entropy rate for the aforementioned processes.

References (34)

  • I. Csiszár, Generalized cutoff rates and Rényi’s information measures, IEEE Transactions on Information Theory (1995).
  • T.M. Cover et al., Elements of Information Theory (1991).
  • S. Guiasu, Information Theory with Applications (1977).
  • N.J.A. Harvey, K. Onak, J. Nelson, Streaming algorithms for estimating entropy, in: IEEE Information Theory Workshop, ...
  • F. Kanaya et al., The asymptotics of posterior entropy and error probability for Bayesian estimation, IEEE Transactions on Information Theory (1995).
  • A.I. Khinchin, Mathematical Foundations of Information Theory (1957).
  • V.S. Kirchanov, Using the Rényi entropy to describe quantum dissipative systems in statistical mechanics, Theoretical and Mathematical Physics (2008).