Abstract

We study the central limit theorem for a class of coloured graphs. This means that we investigate the limit behavior of certain random variables whose values are combinatorial parameters associated with these graphs. The techniques used to arrive at this result comprise combinatorics, generating functions, and conditional expectations.

1. Introduction

In this paper we verify the central limit theorem (CLT) in the context of a combinatorial problem for coloured hard dimer configurations, which form a certain class of labeled graphs on . We consider two combinatorial parameters that characterize our hard dimers and therefore investigate a bivariate mass function. The problem is interesting insofar as it is difficult to reformulate the task in such a manner that a general CLT becomes applicable. The proof given here is based on explicit knowledge of the bivariate mass function and generalizes the original De Moivre-Laplace theorem. Incidentally, coloured hard dimers are applied in the framework of causally triangulated -dimensional quantum gravity. We believe that our result could be employed for further study of the asymptotics of the one-step propagator, as defined in [1] by means of an inversion formula.

Let us describe the objects we are interested in. Given a sequence of length of consecutive blue and red vertices on the one-dimensional lattice , one defines a dimer to be an edge connecting two nearest sites of the same colour. The dimer’s colour is given by the colour of its boundary vertices. A coloured hard dimer configuration (CHDC) is defined to be a sequence together with a sequence of dimers on it, which must not intersect. We also include the empty CHDC, that is, the configuration in which no dimers are present. An example of a CHDC is shown in Figure 1.
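The definition above can be explored by brute force for short sequences. The following Python sketch is only an illustration under our reading of the definition: a dimer joins a site to the nearest later site of the same colour, the sites strictly between them count as inner vertices, and two dimers may neither overlap nor share an endpoint. The function names `count_dimer_sets` and `count_chdc` are hypothetical helpers introduced here.

```python
from itertools import product


def count_dimer_sets(colours):
    """Count admissible dimer sets on one fixed colour sequence."""
    length = len(colours)

    def rec(p):
        # Number of ways to complete a configuration from site p onwards.
        if p >= length:
            return 1
        total = rec(p + 1)  # site p stays a single point
        for q in range(p + 1, length):
            if colours[q] == colours[p]:
                # Dimer from p to its nearest same-coloured site q; the sites
                # strictly between p and q become inner vertices.
                total += rec(q + 1)
                break
        return total

    return rec(0)


def count_chdc(length):
    """Total number of CHDCs on `length` sites, over all colour sequences."""
    return sum(count_dimer_sets(c) for c in product("BR", repeat=length))
```

Under these assumptions, sequences of length 1, 2, and 3 carry 2, 6, and 18 configurations, respectively (the empty configuration is included for every colour sequence).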

In graph-theoretic language, CHDCs form a subclass of labeled graphs whose vertices and edges carry one of two possible labels (here termed colours). For a given CHDC , let denote the numbers of blue and red dimers and the number of inner vertices, that is, vertices inside dimers that are not boundary points. Below we shall consider the numbers as random variables (r.v.s) and investigate their joint probability generating function. Due to the symmetry w.r.t. and , this will lead us to the joint mass function of the r.v.s , where and denote the numbers of blue and red vertices which are not occupied by dimers (“single points”).

The paper is organized as follows. In Section 2, we give an explicit formula for the combinatorial generating function and define the probability mass function associated with CHDCs, with an exact expression for its normalizing constant, that is, . Moreover, we find the right probability distribution for the r.v. . In Section 3, we calculate the first two moments of the r.v. . In Section 4, we prove a CLT for the pair of r.v.s . The limit distribution is a bivariate Gaussian distribution with correlation coefficient equal to .

2. Coloured Hard Dimers and Probability Distributions

With the definitions made in the introduction, the following constraint holds. In the example above (Figure 1),

First, we want to find the combinatorial generating function of the variables . This is given as where It is useful to define further variables and , where corresponds to the number of sites occupied by dimers and to the number of coloured hard dimers

In the next lemma, we prove an exact formula for , using combinatorial arguments.

Lemma 2.1. The combinatorial generating function has the following explicit expression, for any : where denotes the integer part.

Proof of Lemma 2.1. Consider a CHDC on any with fixed but arbitrary . We set , , , , and . By (2.1) and (2.5) the following equalities hold.
First, we fix the number of blue and red dimers and that of single points and . Note that by (2.7) is also fixed. The dimers of the same colour are considered indistinguishable. Then, we count all possible permutations of blue dimers, red dimers, blue single points, and red single points, that is, Now, for any nontrivial given permutation of coloured dimers and single points, that is, (the empty CHDCs contribute a factor ), we have to count the ways in which we can distribute the given inner vertices over the given dimers. The number of all these combinations is Therefore, all contributions are summed to Since and , we have In obtaining (2.11), we have multiplied and divided the generic term of the previous sum by . The binomial formula yields , and (2.11) becomes Now by multiplying and dividing the generic term of the sum by , we get Performing the changes of variables and , with and as above, we get In the last sum, we again applied the binomial formula Therefore, we get formula (2.6) where only the indices and appear. This proves the lemma.
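The two counting steps of the proof (permutations of dimers and single points, then distribution of inner vertices over the dimers) can be written out as a short Python sketch. It rests on our reading of the argument: the permutations are counted by a multinomial coefficient and the inner vertices are distributed by a stars-and-bars count. The names `nb`, `nr`, `sb`, `sr`, `ni` (blue dimers, red dimers, blue single points, red single points, inner vertices) and both function names are illustrative, not notation taken from the text.

```python
from math import comb, factorial


def configs_with_counts(nb, nr, sb, sr, ni):
    """Number of CHDCs with the given numbers of dimers, single points, inner vertices."""
    n = nb + nr
    if n == 0:
        # No dimers: only the colour pattern of the single points matters,
        # and inner vertices cannot occur.
        return comb(sb + sr, sb) if ni == 0 else 0
    # Orderings of the four kinds of objects along the lattice.
    orderings = factorial(n + sb + sr) // (
        factorial(nb) * factorial(nr) * factorial(sb) * factorial(sr)
    )
    # Ways to distribute ni indistinguishable inner vertices over n dimers.
    placements = comb(ni + n - 1, n - 1)
    return orderings * placements


def total_configs(length):
    """Sum over all admissible counts with 2*(nb+nr) + ni + sb + sr = length."""
    total = 0
    for nb in range(length // 2 + 1):
        for nr in range((length - 2 * nb) // 2 + 1):
            ends = 2 * (nb + nr)  # sites used as dimer endpoints
            for ni in range(length - ends + 1):
                singles = length - ends - ni
                for sb in range(singles + 1):
                    total += configs_with_counts(nb, nr, sb, singles - sb, ni)
    return total
```

For example, `configs_with_counts(1, 0, 0, 0, 1)` counts the single configuration consisting of one blue dimer with one (necessarily red) inner vertex.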

If we want to understand the appearance of CHDCs in probabilistic terms, it is natural to assign each CHDC the same probability, so that the combinatorial frequency of particular configurations will be proportional to their probability. For this purpose, let us define a family of probability spaces . We choose , to be the set of all different CHDCs. The σ-algebra is the power set of , and for we take the probability measure having uniform distribution on . Normalizing the function by then just gives the joint probability generating function of the random variables , defined on , which count, for each hard dimer configuration, the numbers of blue and red dimers and the number of inner vertices, respectively.

The main result of this section is an explicit formula for the normalizing constant of the probability measure , that holds for any , derived by evaluating the combinatorial generating function at the point . Considering the change of variable , we have

Throughout the text, we use the convention that , whenever , or .

Remark 2.2. Note that, upon normalization, the factors in (2.16) yield a hypergeometric distribution with respect to the variable . Here, the sample size is and (resp., ) is the total number of successes (resp., failures). Therefore, summing over , we get the binomial coefficient .

In this way, we have also found the joint mass function related to the r.v.s and , more precisely, is given by

Remark 2.3. From (2.18), we deduce that the r.v. is binomial with parameters and . Since , it follows that the r.v. is also binomial with parameters and .
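The binomial law in Remark 2.3 can be checked numerically with exact rational arithmetic. The sketch below is an assumption-laden illustration: we write `N` for the sequence length, `k` for the number of dimers plus single points, and `n` for the number of dimers, and we take the number of configurations with given `(k, n)` to be `2**k * C(k, n) * C(N-k-1, n-1)`, with normalizing constant `2 * 3**(N-1)`. These expressions are our own reconstruction from the counting argument, not formulas quoted from the text.

```python
from fractions import Fraction
from math import comb


def joint_count(N, k, n):
    """Assumed number of CHDCs on N sites with k objects, n of which are dimers."""
    ni = N - k - n  # inner vertices left after placing k objects
    if n < 0 or k < n or ni < 0:
        return 0
    if n == 0:
        return 2 ** k if ni == 0 else 0  # dimer-free configurations
    return 2 ** k * comb(k, n) * comb(ni + n - 1, n - 1)


def joint_pmf(N):
    """Uniform measure on CHDCs, pushed forward to the pair (k, n)."""
    Z = 2 * 3 ** (N - 1)  # assumed normalizing constant
    return {
        (k, n): Fraction(joint_count(N, k, n), Z)
        for k in range(1, N + 1)
        for n in range(0, N // 2 + 1)
        if joint_count(N, k, n)
    }


def marginal_k(N):
    """Marginal distribution of k obtained by summing over n."""
    out = {}
    for (k, _), p in joint_pmf(N).items():
        out[k] = out.get(k, 0) + p
    return out
```

With these assumptions, `marginal_k(N)[k]` agrees with the binomial probability of `k - 1` successes in `N - 1` trials with success probability 2/3.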

3. Number of Dimers: The First Two Moments

When proving a CLT, we have to rescale the r.v.s by subtracting the means and dividing by the standard deviations in question. In the previous section, we have seen that is binomial, whose moments are known. Although the distribution of the r.v. is not of a common type, we are able to compute its mean and variance. For this, we rely on the fact that the conditional distribution of , given , is hypergeometric by Remark 2.2. We start with the computation of the mean.

Proposition 3.1. For any , the following formula holds.

By , we indicate the mean with respect to the probability measure .

Proof. By the properties of the conditional expectation and taking into account Remark 2.2, we have In fact, the expectation of our hypergeometric distribution is . Hence, In the last sum, we have used the decomposition which gives the first and zeroth moment of the binomial distribution with parameters and .

Remark 3.2. By identity (2.1), Remark 2.3, and from (3.1), we are able to calculate the single point number’s mean. In fact Note that, as , that is, for the present model, the expected number of single points is asymptotically twice the expected number of dimers. Moreover, fixing the number of single points, the conditional probability distribution of (resp., ) is binomial and symmetric.
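The conditioning argument used for the mean can be mimicked in a few lines of Python. This is a sketch under our own reading of Remarks 2.2 and 2.3: with `N` the sequence length and `k` the number of dimers plus single points, we assume that `k - 1` is binomial with parameters `N - 1` and 2/3, and that the number of dimers given `k` is hypergeometric with population size `N - 1`, `k` successes, and `N - k` draws. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def binom_pmf(trials, j):
    """Binomial probability of j successes with success probability 2/3."""
    return comb(trials, j) * Fraction(2, 3) ** j * Fraction(1, 3) ** (trials - j)


def cond_mean_dimers(N, k):
    """Hypergeometric mean: draws * successes / population (requires N >= 2)."""
    return Fraction((N - k) * k, N - 1)


def mean_dimers(N):
    """Tower property: average the conditional mean over the law of k."""
    return sum(
        binom_pmf(N - 1, k - 1) * cond_mean_dimers(N, k) for k in range(1, N + 1)
    )
```

Under these assumptions the result equals `(2N - 1) / 9` for every `N >= 2`, which is about half of the corresponding mean number of single points for large `N`, in line with the remark above.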

In order to find the variance of , we apply the law of total variance, which involves the conditional expectation and the conditional variance (see, e.g., [2]) The symbol indicates the variance w.r.t. . We recall that the conditional variance of a r.v. given a r.v. is defined as Alternatively, one can define as that function of whose value at is given by In the next two lemmas, we find the exact expressions of the two terms in (3.6).

Lemma 3.3. For all , one has

Proof. As above, we use the fact that the variance of the hypergeometric distribution is known. In our case, This entails that
In fact, we can write , so that the factorial moments of the binomial distribution with parameters and appear. It is easy to check that its second factorial moment is .

Lemma 3.4. For all , it holds that

Proof. Using again the properties of the conditional expectation, we have From Proposition 3.1, we know that , so that it remains to compute the first term in (3.13). The latter is given by As in Lemma 3.3, it is useful to use the factorial moments of the binomial distribution. To this end, we simply write . According to this decomposition, the sum (3.14) splits into two terms, which we denote by and , respectively. The first term is In the above computations, we have used the equality , so that the factorial moments come into play. From (3.13) and (3.15), it is easy to get (3.12).

Finally, we are able to state the next proposition for the variance of .

Proposition 3.5. For all , one has

Proof. This follows immediately from (3.6) and Lemmas 3.3 and 3.4.
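The decomposition of the variance into the expected conditional variance and the variance of the conditional mean can be verified exactly for small lengths. The following sketch assumes the same reconstructed joint law of `(k, n)` as our reading of Section 2 suggests (`k` dimers plus single points, `n` dimers, weights `2**k * C(k, n) * C(N-k-1, n-1)`, normalizing constant `2 * 3**(N-1)`); all names are illustrative.

```python
from fractions import Fraction
from math import comb


def joint_pmf(N):
    """Assumed joint law of (k, n) under the uniform measure on CHDCs."""
    Z = 2 * 3 ** (N - 1)
    pmf = {(N, 0): Fraction(2 ** N, Z)}  # dimer-free configurations
    for k in range(1, N):
        for n in range(1, min(k, N - k) + 1):
            count = 2 ** k * comb(k, n) * comb(N - k - 1, n - 1)
            pmf[(k, n)] = Fraction(count, Z)
    return pmf


def variance_decomposition(N):
    """Return (Var n, E[Var(n | k)], Var(E[n | k])) as exact fractions."""
    pmf = joint_pmf(N)
    pk, m1, m2 = {}, {}, {}
    for (k, n), p in pmf.items():
        pk[k] = pk.get(k, 0) + p          # marginal law of k
        m1[k] = m1.get(k, 0) + p * n      # unnormalized first moment given k
        m2[k] = m2.get(k, 0) + p * n * n  # unnormalized second moment given k
    cond_mean = {k: m1[k] / pk[k] for k in pk}
    cond_var = {k: m2[k] / pk[k] - cond_mean[k] ** 2 for k in pk}
    mean = sum(pk[k] * cond_mean[k] for k in pk)
    expected_cond_var = sum(pk[k] * cond_var[k] for k in pk)
    var_of_cond_mean = sum(pk[k] * (cond_mean[k] - mean) ** 2 for k in pk)
    total_var = sum(p * (n - mean) ** 2 for (_, n), p in pmf.items())
    return total_var, expected_cond_var, var_of_cond_mean
```

For `N = 4`, for instance, the three values are 26/81, 4/81, and 22/81 under these assumptions, and the first is the sum of the other two.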

Remark 3.6. From the formulas for the mean and the variance of ((3.1) and (3.16)), one can deduce that the distribution of is asymptotically not binomial. In fact, and . In the next section, we prove that it is asymptotically Gaussian, for large , that is, a CLT holds.

4. Central Limit Theorem for the Dimers’ Number

In the present section, we study the asymptotic distribution of the dimers’ number; in particular, we prove a CLT for the total number of dimers plus single points , analyzed in Section 2, and the number of dimers . The limit distribution is a bivariate Gaussian distribution with correlation coefficient equal to .

The following proof is a generalization of De Moivre-Laplace's Theorem.

Theorem 4.1. A central limit theorem holds for the joint probability distribution. This means that for any and , one has where

Remark 4.2. In (4.2) above, we consider only the first order of the expectations and the variances with respect to . It is easy to see that the result remains the same when including terms of zeroth order, as they do not contribute to the asymptotics.
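The statement that lower-order terms do not affect the asymptotics can be illustrated numerically. The sketch below computes the exact means, variances, and correlation of the pair `(k, n)` (dimers plus single points, and dimers) from the joint law as we reconstruct it, so the weights and the normalizing constant `2 * 3**(N-1)` are assumptions of this example. In our own computation the first-order values are `2N/3` and `2N/9` for the means, `2N/9` and `2N/27` for the variances, and a correlation close to `-1/sqrt(3)`; these numbers are not quoted from the text.

```python
from math import comb, sqrt


def joint_pmf(N):
    """Assumed joint law of (k, n), returned as floating-point probabilities."""
    Z = 2 * 3 ** (N - 1)
    pmf = {(N, 0): 2 ** N / Z}  # dimer-free configurations
    for k in range(1, N):
        for n in range(1, min(k, N - k) + 1):
            pmf[(k, n)] = 2 ** k * comb(k, n) * comb(N - k - 1, n - 1) / Z
    return pmf


def moments(N):
    """Exact means, variances, and correlation of (k, n) for length N."""
    pmf = joint_pmf(N)
    ek = sum(p * k for (k, _), p in pmf.items())
    en = sum(p * n for (_, n), p in pmf.items())
    vk = sum(p * (k - ek) ** 2 for (k, _), p in pmf.items())
    vn = sum(p * (n - en) ** 2 for (_, n), p in pmf.items())
    cov = sum(p * (k - ek) * (n - en) for (k, n), p in pmf.items())
    return ek, en, vk, vn, cov / sqrt(vk * vn)
```

Dividing the first four return values by `N` for a moderately large `N` (say 200) shows how quickly the zeroth-order corrections become negligible.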

Proof. We have the following.
Step 1. We shall first verify a local version of the CLT, that is, where are given by (4.2). Moreover, , uniformly with respect to and , belonging to finite intervals and , respectively.
By the remark above and the fact that the indices and are both of order , we can forget the constants in (2.18), that is, and , as .
As in De Moivre-Laplace’s Theorem, we apply Stirling's formula to the binomial coefficients in (2.18). In the present model, we have two binomial coefficients instead of one, so that the computation becomes more involved than in De Moivre-Laplace’s Theorem. We write where for every .
Taking into account (4.2), we consider the first factor in (4.4) with respect to the variables and . It is easy to see that with , as , uniformly with respect to and . In fact, since belong to bounded intervals, one can estimate uniformly from above and from below with respect to and .
For the factor in (4.4), the following asymptotic holds as . In fact, from (4.2), we estimate with , as , uniformly with respect to and .
Analogously, with , as , uniformly with respect to and .
By (4.7), we see that the correlation coefficient can be . Its sign will be determined later.
Finally, we consider the logarithm of the last factor in (4.4), Now we express each term of the sum in (4.10) , , in terms of and , defined in (4.2). We start with , Since the last logarithm above is of the form , with , we can expand it around , , as . The same is true for each logarithm function present in any , . So becomes, as , Analogously, for the other , , we find From (4.12)-(4.13), we find the main contribution We have thus proven formula (4.3).

Remark 4.3. Note that the last term in (4.14) is of the form with . We get thus a bivariate Gaussian distribution with correlation coefficient equal to , that is, the r.v.s and are negatively correlated.
Step 2. Now, we want to show formula (4.1), adapting the steps from the one-dimensional case, see [3]. For finite and , this follows from (4.3), which implies that the l.h.s. of (4.1) is just a Riemann sum approximation to the r.h.s. To handle infinite boundaries, we consider, for example, the particular case where and . The other cases can be treated similarly. Let be the joint density function of the integral in (4.1). Since there exist for all finite constants such that Moreover, for all  : for all From (4.17) and (4.18), we deduce that for large enough. Without loss of generality, we now assume that . Then, it remains to verify that, for all , for large enough. In order to show this, one has to express the double sum and integral of (4.20) by means of the larger domains and by subtraction of appropriate terms. The resulting sums and integrals can be split over domains which may be finite or infinite. The sums over the finite domains are again Riemann approximations to the corresponding integrals. The sums and integrals over the infinite domains can be made arbitrarily small by employing estimates (4.17) and (4.19), which completes the proof.

Choosing in Theorem 4.1, we get the following result.

Corollary 4.4. A CLT holds for the r.v. , that is, for any , one has
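A local version of the corollary can be observed numerically by comparing the exact mass function of the number of dimers with a Gaussian density. The sketch below is only indicative: it uses our reconstructed weights `2**k * C(k, n) * C(N-k-1, n-1)` with normalizing constant `2 * 3**(N-1)`, and it centres and scales with the first-order values `2N/9` (mean) and `2N/27` (variance) from our own computation. None of these expressions is quoted from the text, and the function names are hypothetical.

```python
from math import comb, exp, pi, sqrt


def dimer_pmf(N):
    """Assumed marginal mass function of the number of dimers n, for n = 0..N//2."""
    Z = 2 * 3 ** (N - 1)
    pmf = [0.0] * (N // 2 + 1)
    pmf[0] = 2 ** N / Z  # dimer-free configurations
    for k in range(1, N):
        for n in range(1, min(k, N - k) + 1):
            pmf[n] += 2 ** k * comb(k, n) * comb(N - k - 1, n - 1) / Z
    return pmf


def max_local_error(N):
    """Largest gap between the rescaled mass function and the normal density."""
    mu = 2 * N / 9
    sigma = sqrt(2 * N / 27)
    return max(
        abs(sigma * p - exp(-(((n - mu) / sigma) ** 2) / 2) / sqrt(2 * pi))
        for n, p in enumerate(dimer_pmf(N))
    )
```

Evaluating `max_local_error` for increasing `N` gives a rough impression of the rate at which the rescaled mass function approaches the standard normal density.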

Acknowledgment

H. Thaler is grateful for the financial support through the program “Rientro dei Cervelli” of the Italian MIUR.