

A numerical study of the Kullback-Leibler distance in functional magnetic resonance imaging

Brenno Caetano Troca Cabella; Márcio Júnior Sturzbecher; Walfred Tedeschi; Oswaldo Baffa Filho; Dráulio Barros de Araújo; Ubiraci Pereira da Costa Neves

Departamento de Física e Matemática, Faculdade de Filosofia Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Av. Bandeirantes, 3900, 14.040-901, Ribeirão Preto, SP, Brazil brenno@pg.ffclrp.usp.br

ABSTRACT

The Kullback-Leibler distance (or relative entropy) is applied in the analysis of functional magnetic resonance imaging (fMRI) data series. Our study is designed for event-related (ER) experiments, where a brief stimulus is presented and followed by a long period of rest. In particular, this relative entropy is used as a measure of the "distance" between the probability distributions p1 and p2 of the signal levels related to stimulus and non-stimulus. In order to avoid undesirable divergences of the Kullback-Leibler distance, a small positive parameter δ is introduced in the definition of the probability functions in such a way that it does not bias the comparison between the two distributions. Numerical simulations are performed so as to determine the probability densities of the mean Kullback-Leibler distance $\overline{D}$ (throughout the N epochs of the whole experiment). For small values of N (N < 30), such probability densities $f(\overline{D})$ are found to be fitted very well by Gamma distributions (χ² < 0.0009). The sensitivity and specificity of the method are evaluated by construction of receiver operating characteristic (ROC) curves for some values of the signal-to-noise ratio (SNR). The functional maps corresponding to real data series from an asymptomatic volunteer submitted to an ER motor stimulus are obtained by using the proposed technique. The maps present activation in primary and secondary motor brain areas. Both simulated and real data analyses indicate that the relative entropy can be useful for fMRI analysis in the information measure scenario.

Keywords: fMRI Analysis; Kullback-Leibler Distance; Hypothesis Testing

I. INTRODUCTION

Functional magnetic resonance imaging (fMRI) is one of the main techniques for mapping human brain activity in a noninvasive way [1]. This technique makes it possible to assess cognitive functions with high spatial and temporal resolution. Neural activity due to stimulation causes an increase of the blood flow and oxygenation in the local vasculature. The blood oxygenation level dependent (BOLD) contrast of the image is induced by changes in the relative concentrations of oxy- and deoxy-hemoglobin [2-4]. The BOLD signal is the response to the application of an experimental protocol (known as a paradigm), which is composed of periodic changes between stimulation and non-stimulation of different brain areas. The result of such an experiment is a time series corresponding to the signal of each voxel. In the processing of fMRI time series, several methods that separate the physiologically induced signals (corresponding to active voxels) from noise (non-active voxels) have been employed. Deterministic methods quantify the similarity between the time series of each voxel and a hemodynamic response function (HRF) model. On the other hand, statistical methods can infer how significant the difference between the signals corresponding to periods of stimulation and non-stimulation is, but do not need a priori knowledge of the form of the HRF. In recent years, methods based on information measures, such as the Shannon and Tsallis entropies and the generalized mutual information, have been employed as alternatives to the conventional analysis of fMRI data [5-9]. These informational methods also present the advantage of not requiring an HRF model.

In this context, we now propose to use the Kullback-Leibler distance (or relative entropy) as an information measure to analyse fMRI data series. Our study is designed for event-related (ER) paradigms, where stimulation is presented in a short period of time and followed by a long period of rest. Like other informational methods, the proposed technique only considers the general structure of the signal and needs no HRF model. Indeed, the relative entropy is able to quantify the difference between the probability distributions p1 and p2 of the states of the BOLD signal acquired during periods of stimulation and rest, respectively. The sample average of the Kullback-Leibler distance is chosen as the statistic for testing the null hypothesis that a voxel is non-active (noise). The probability distributions of the test statistic are numerically determined by Monte Carlo simulations and found to be fitted very well by Gamma distributions. The sensitivity and specificity of the method are computed for different signal-to-noise ratios (SNRs). Finally, the proposed technique is applied to real data series (from an asymptomatic volunteer submitted to an ER motor stimulus) in order to determine the corresponding activation map. Our results from both simulated and real data indicate that the technique can in fact be a very good option for fMRI analysis.

II. RELATIVE ENTROPY

In information theory, the relative entropy is a measure of the "distance" between the probability functions p1 and p2 of two discrete random variables X1 and X2, respectively. This information measure is also known as the Kullback-Leibler distance (D) and is defined as [10]

$D(p_1 \| p_2) = \sum_{j=1}^{L} p_{1j} \log_2 \frac{p_{1j}}{p_{2j}}$     (1)

where pij is the probability that Xi assumes the j-th value from its set of L elements. The logarithm is taken to base 2 so that the entropy is measured in bits.

Relative entropy is not symmetric and is always non-negative; it is zero if and only if p1 = p2. It should also be noted that, by convention, $0 \log_2(0/p_{2j}) = 0$ whereas $p_{1j} \log_2(p_{1j}/0) = \infty$.
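As a concrete illustration, the definition in eq. (1) together with the two conventions above can be sketched in a few lines of Python (the function name `kl_distance` is ours, not from the paper):

```python
import math

def kl_distance(p1, p2):
    """Kullback-Leibler distance D(p1 || p2) in bits (logarithm to base 2)."""
    d = 0.0
    for a, b in zip(p1, p2):
        if a == 0.0:
            continue          # convention: 0 * log2(0/b) = 0
        if b == 0.0:
            return math.inf   # divergence when p1j != 0 but p2j = 0
        d += a * math.log2(a / b)
    return d
```

For identical distributions the distance is zero; a single level with p2j = 0 and p1j > 0 makes it infinite, which is precisely the divergence that the parameter δ introduced below is designed to avoid.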

The relative entropy of the time course s(t) of a signal (from real or even simulated data) is calculated as follows. Consider one single event-related trial (also named epoch). During that epoch the time course s(t) of the (simulated) BOLD signal of an active voxel exhibits a peak and then returns to a baseline level (Fig. 1). As for the calculation of Shannon entropy in a previous work [7], an entire epoch (window W) of the experiment is divided into two time intervals: half windows W1 (related to the signal increase) and W2 (corresponding mainly to baseline values). The probability distribution of signal levels (p1) corresponding to the BOLD response in the first time period is clearly expected to differ from the one (p2) corresponding to the baseline signal values, so the relative entropy within this epoch is calculated as a measure of the "distance" between p1 and p2. In this way we consider the discrete sets S1 = {s(tk), k = 1, 2, ..., n1} and S2 = {s(tl), l = n1+1, n1+2, ..., n1+n2} of signal amplitudes measured at n1 instants ($t_1 < t_2 < ... < t_{n_1}$) within period W1 and n2 instants ($t_{n_1+1} < t_{n_1+2} < ... < t_{n_1+n_2}$) within period W2, respectively. In order to compute the probability functions p1 and p2, the accessible states of the system are then defined in terms of subdivisions of the interval of variation of the signal amplitude [5]. Consider the whole set S = S1 ∪ S2 and let s0 = min[S] and sL = max[S] be the inferior and superior limits of S, respectively. An equipartition of S is defined by the amplitude values s0, s1 = s0 + Δs, s2 = s0 + 2Δs, ..., sL = s0 + LΔs, where L is the number of subdivisions (levels) and Δs = (sL − s0)/L. It is assumed that each subdivision corresponds to a possible state of the system. Each integer j from the label set {1, 2, ..., L} refers to the j-th disjoint interval Ij, defined as [sj−1, sj) for j ≤ L − 1 and [sL−1, sL] for j = L.


As already noted, the relative entropy might diverge in the limit p2j → 0 (for p1j ≠ 0). From a practical point of view, this undesirable divergence would be liable to occur if the probability function p2 were allowed to assume null values. In order to avoid such divergence, we introduce a small positive parameter δ (0 < δ < 1) in the definition of the probabilities of the signal levels,

$p_{ij} = \frac{n_{ij} + \delta}{n_i + L\delta}$     (2)

where nij = |Si ∩ Ij| is the number of points within window Wi (i = 1, 2) and level Ij (j = 1, 2, ..., L) and ni = |Si| is the whole number of points in window Wi. In the above definition, the condition $\sum_{j=1}^{L} p_{ij} = 1$ is clearly satisfied, whereas if δ → 0⁺ then $p_{ij} \to n_{ij}/n_i$, so that one recovers the usual meaning that pij is the probability of occupation of level Ij (within Wi), i.e., the relative frequency of points on Ij. Furthermore, eq. 2 may be interpreted as the relative frequency of points on level Ij (within Wi) given that a fractional point (of weight δ) is equally placed on each level Ij of both windows W1 and W2. In this sense, the introduction of parameter δ does not bias the comparison between distributions p1 and p2, since it still remains balanced.
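The level partition and the δ-regularized probabilities of eq. 2 can be sketched as follows (a minimal Python illustration; the helper name `level_probabilities` and the example values are ours):

```python
def level_probabilities(s1, s2, L=2, delta=0.1):
    """Estimate p1 and p2 over L amplitude levels using the regularized
    frequencies p_ij = (n_ij + delta) / (n_i + L*delta) of eq. (2).
    s1 and s2 hold the signal samples of half windows W1 and W2."""
    s = list(s1) + list(s2)
    s0, sL = min(s), max(s)
    ds = (sL - s0) / L           # equipartition width Delta-s
    def level(x):
        # map an amplitude to a level index 1..L; the top boundary sL
        # belongs to the last (closed) interval [s_{L-1}, s_L]
        if ds == 0.0:
            return 1
        return min(int((x - s0) / ds) + 1, L)
    probs = []
    for si in (s1, s2):
        counts = [0] * L
        for x in si:
            counts[level(x) - 1] += 1
        n = len(si)
        probs.append([(c + delta) / (n + L * delta) for c in counts])
    return probs
```

Note that every p2j stays strictly positive for δ > 0, so eq. 1 can never diverge.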

III. SIMULATED DATA ANALYSIS

The simulated time series s(t) is composed of a hemodynamic response function h(t) and Gaussian noise η, i.e., s(t) = h(t) + η. The exact hemodynamic response function that follows neuronal activity in an ER-fMRI paradigm depends on many variables [11], but in general the BOLD response reaches a maximum soon after a brief stimulation (4 to 5 s) and then decays to the baseline level (at about 20 s after the stimulus). Pre-undershoot and post-undershoot signals are sometimes observed [12]. In our simulations, we use a physiologically reasonable model for the HRF, proposed by Friston [13], which consists of the difference between two gamma functions,

$h(t) = \left(\frac{t}{d}\right)^{a} e^{-(t-d)/b} - c \left(\frac{t}{d'}\right)^{a'} e^{-(t-d')/b'}$     (3)

where d = ab is the time corresponding to the peak and d′ = a′b′ is the time for the undershoot, with a = 6, a′ = 12, b = b′ = 0.9 and c = 0.35.
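A sketch of this difference-of-gammas model, assuming the parametrization $h(t) = (t/d)^a e^{-(t-d)/b} - c\,(t/d')^{a'} e^{-(t-d')/b'}$ commonly used with Friston's HRF (the function name `hrf` is ours):

```python
import math

def hrf(t, a=6.0, b=0.9, a2=12.0, b2=0.9, c=0.35):
    """Difference of two gamma functions (Friston et al. [13]).
    The peak occurs at d = a*b (~5.4 s) and the undershoot at
    d' = a2*b2 (~10.8 s) with the parameters quoted in the text."""
    d, d2 = a * b, a2 * b2
    return ((t / d) ** a * math.exp(-(t - d) / b)
            - c * (t / d2) ** a2 * math.exp(-(t - d2) / b2))
```

The curve starts at zero, peaks near 5 s and dips below baseline (the post-undershoot) before returning to rest, reproducing the qualitative behaviour described above.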

Noise η is sampled from a Gaussian distribution with mean zero and a given variance $\sigma_\eta^2$. The variance $\sigma_h^2$ of the signal h(t), on the other hand, is evaluated once over time and is the same for all simulations. Time courses s(t) are simulated for different values of the SNR (signal-to-noise ratio), which is usually defined as

$\mathrm{SNR} = 10 \log_{10} \frac{\sigma_h^2}{\sigma_\eta^2}$     (4)

and measured in decibels (dB).

Following the same parameters used in the real ER-fMRI experiment (see section IV), each simulated time course s(t) consists of N = 24 epochs. For each epoch, n1 = n2 = 7 signal amplitudes are simulated within windows W1 and W2. Some time series s(t) simulated for different signal-to-noise ratios (-4dB, 0dB, 4dB and 8dB) are plotted in Fig. 2.
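The simulation scheme can be sketched in Python as follows. We assume one HRF per epoch sampled at the TR of the real experiment (1.68 s, section IV) and 14 points per epoch (n1 + n2); the noise variance is set from the requested SNR via eq. 4. The function names are ours:

```python
import math, random

def hrf(t, a=6.0, b=0.9, a2=12.0, b2=0.9, c=0.35):
    # Friston difference-of-gammas model of eq. (3)
    d, d2 = a * b, a2 * b2
    return ((t / d) ** a * math.exp(-(t - d) / b)
            - c * (t / d2) ** a2 * math.exp(-(t - d2) / b2))

def simulate_series(snr_db, n_epochs=24, pts_per_epoch=14, tr=1.68, seed=0):
    """Simulate s(t) = h(t) + eta over n_epochs epochs, with Gaussian
    noise whose variance follows from the SNR (in dB) of eq. (4)."""
    rng = random.Random(seed)
    h = [hrf(k * tr) for k in range(pts_per_epoch)]
    mean_h = sum(h) / len(h)
    var_h = sum((x - mean_h) ** 2 for x in h) / len(h)
    var_noise = var_h / (10.0 ** (snr_db / 10.0))  # SNR = 10 log10(var_h/var_eta)
    sigma = math.sqrt(var_noise)
    return [x + rng.gauss(0.0, sigma) for _ in range(n_epochs) for x in h]
```

With N = 24 epochs of 14 points each, the simulated series has 336 points, matching the real acquisition of section IV.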


To determine the probability functions p1 and p2, we have chosen the parameters L = 2 and δ = 0.1 for the evaluation of eq. 2. As we shall see, this choice of δ leads to optimum results. The Kullback-Leibler distance D must be computed within each epoch. The mean Kullback-Leibler distance $\overline{D}$ is the sample average of the N distances computed along the whole time course. We perform Nexp = 10^6 simulations of the whole time series. Results from simulations show that the Kullback-Leibler distance D exhibits a discrete probability distribution with periodic peaks. On the other hand, the probability density of the mean distance, $f(\overline{D})$, is observed to approach a continuous distribution for large values of N. According to the central limit theorem, $f(\overline{D})$ tends to a Gaussian distribution as N → ∞. However, for small values of N (N < 30), we have found that the probability densities $f(\overline{D})$ can be fitted very well by Gamma distributions,

$f(\overline{D}) = \frac{\overline{D}^{\,\alpha-1} e^{-\overline{D}/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}$     (5)

where α and β are parameters of the distribution, related to its mean µ = αβ and variance σ² = αβ². Fig. 3 presents the probability densities $f(\overline{D})$ corresponding to simulations for pure noise (SNR → −∞) and some finite values of SNR (−8 dB, −4 dB, 0 dB and 4 dB); the continuous curves correspond to the Gamma fittings (χ² < 0.0009) and overlap the simulated points very well.
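The paper does not state its fitting procedure; a simple method-of-moments sketch, consistent with µ = αβ and σ² = αβ² above, would be (the function name is ours):

```python
def gamma_moments(samples):
    """Method-of-moments estimates of the Gamma parameters (alpha, beta):
    from mu = alpha*beta and var = alpha*beta**2 it follows that
    beta = var/mu and alpha = mu/beta.  Samples must be positive."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    beta = var / mu
    alpha = mu / beta
    return alpha, beta
```

Applied to the Nexp simulated values of $\overline{D}$, this yields the α and β of the fitted density for each SNR.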


The mean Kullback-Leibler distance $\overline{D}$ is chosen as the statistic for testing the null hypothesis H0 that the time course s(t) in a particular voxel is pure noise (non-active voxel) against the alternative hypothesis H1 that s(t) is composed of both BOLD signal and noise with a given signal-to-noise ratio (active voxel).

A signal detection scheme can usually be appraised by the construction of receiver operating characteristic (ROC) curves [14, 15]. In the present analysis, the outcome of the signal detection scheme is the mean distance $\overline{D}$. Such an outcome is classified as signal (+) if the observed $\overline{D}$ belongs to a given critical region (above some threshold $\overline{D}_c$); otherwise it is classified as noise (−). If $f(\overline{D}|-)$ and $f(\overline{D}|+)$ are the probability densities obtained from simulations of pure noise and signal (with a given SNR), respectively, then the sensitivity is defined as the probability of a true positive,

$P(+|+) = \int_{\overline{D}_c}^{\infty} f(\overline{D}|+)\, d\overline{D}$     (6)

whereas the specificity is defined as the probability of a true negative,

$P(-|-) = \int_{0}^{\overline{D}_c} f(\overline{D}|-)\, d\overline{D}$     (7)

(the probability densities are null for $\overline{D}$ < 0). Fig. 4 illustrates the areas under the probability distributions $f(\overline{D}|-)$ (for noise only) and $f(\overline{D}|+)$ (for signal with SNR = −4 dB), which correspond to specificity and sensitivity, respectively. In hypothesis testing, a type I error consists of rejecting the null hypothesis H0 when it is true. The significance level αPT per test is the probability of a type I error, i.e., αPT = P(+|−) = 1 − P(−|−) is the probability of a false positive (or 1 − specificity).
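Empirically, the two integrals of eqs. 6 and 7 reduce to counting outcomes on either side of the threshold. A minimal sketch (function name ours), given samples of $\overline{D}$ simulated under noise and under signal:

```python
def sensitivity_specificity(noise_samples, signal_samples, threshold):
    """Empirical sensitivity P(+|+) and specificity P(-|-) at a given
    threshold on the mean Kullback-Leibler distance: a sample is
    classified as signal (+) when it lies at or above the threshold."""
    sens = sum(x >= threshold for x in signal_samples) / len(signal_samples)
    spec = sum(x < threshold for x in noise_samples) / len(noise_samples)
    return sens, spec
```

Sweeping the threshold from 0 upward traces out the ROC curve discussed next.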


The ROC curve is the plot of sensitivity (probability of a true positive) versus 1 − specificity (probability of a false positive), resulting from the variation of the threshold $\overline{D}_c$. The ROC curves for several SNRs are shown in Fig. 5.


The performance of the method is evaluated by its ability to detect active brain areas with high sensitivity and specificity. Thus, the higher the concentration of points in the upper left corner of the ROC curve graph, the better the performance. The area below the ROC curve is a way to quantify this performance: the discriminating power of the test increases as the ROC area varies from 0.5 (lowest value) to 1 (optimum performance). Our simulations show that the ROC area (and so the discriminating power) grows as the SNR increases, according to Fig. 6. Finally, we may investigate the dependence of the power of the test on the choice of parameter δ. Fig. 7 shows that the ROC area varies smoothly with δ for various SNRs and that δ = 0.1 seems to be the best choice for all studied cases.
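One standard way to compute the area under the ROC curve without explicitly tracing it is the Mann-Whitney statistic: the area equals the probability that a signal outcome exceeds a noise outcome. A sketch (the function name is ours; the paper does not specify how its ROC areas were computed):

```python
def roc_area(noise_samples, signal_samples):
    """Area under the ROC curve via the Mann-Whitney statistic:
    the fraction of (signal, noise) pairs in which the signal outcome
    is the larger one, counting ties as 1/2."""
    wins = 0.0
    for s in signal_samples:
        for n in noise_samples:
            if s > n:
                wins += 1.0
            elif s == n:
                wins += 0.5
    return wins / (len(signal_samples) * len(noise_samples))
```

Perfectly separated distributions give an area of 1, while identical distributions give 0.5, matching the two limits quoted above.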



IV. REAL DATA ANALYSIS

An asymptomatic volunteer was submitted to an event-related motor stimulus. The fMRI data were acquired in a 1.5 T scanner (Siemens, Magnetom Vision) with a commercially available quadrature transmit/receive head RF coil with circular polarization. The study protocol consisted of 24 repeated epochs with two states: left- and right-handed finger tapping with a duration of 3-5 seconds followed by 17-21 seconds of rest. Three axial slices (Fig. 8) 4 mm thick (located on portions of the primary motor cortex) were acquired at 14 temporal points throughout each epoch, three during the activity states and eleven during the period of rest, totaling 336 images per slice. All images were acquired with an echo-planar imaging (EPI) protocol with a repetition time (TR) of 1680 ms, an echo time (TE) of 118 ms and a flip angle of 90º. Each slice was acquired with a field of view (FOV) of 210 mm and a 128 × 128 matrix, resulting in a resolution of 1.64 × 1.64 mm. Preprocessing was carried out using the Brain Voyager™ QX (Brain Innovation Inc., The Netherlands) software. The data analysis included several steps, such as motion correction, time series image realignment with the first slice as reference, spatial smoothing with a FWHM of 4 mm and temporal filtering with a high-pass filter of 3 cycles/s.


For each voxel of the image, the mean Kullback-Leibler distance $\overline{D}$ of the corresponding real data series was computed (with parameters L = 2 and δ = 0.1). One should remember that this processing requires no model for the HRF. The result of this analysis is the set of activation maps (overlaid onto the corresponding anatomical images) for three axial slices, as shown in Fig. 9. The maps present activation in primary and secondary motor areas due to a motor stimulus. In order to avoid numerous spurious activated voxels (false positives) in the functional maps, one should consider the Šidák-Bonferroni correction [16],

$\alpha_{PF} = 1 - (1 - \alpha_{PT})^{C}$     (8)

where αPF is the significance level per family of tests, defined as the probability of making at least one type I error for the whole family of tests (i.e., for all voxels); αPT is the usual significance level (per test) and C is the number of tests (voxels). For a significance level αPF = 0.05 per family of tests, the threshold for voxel activation is $\overline{D}_c$ = 0.9337, so that voxels with $\overline{D} < \overline{D}_c$ are considered non-active.
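In practice one fixes the per-family level and inverts the Šidák-Bonferroni relation to obtain the per-test level: αPT = 1 − (1 − αPF)^(1/C). A one-line sketch (function name ours):

```python
def sidak_per_test_alpha(alpha_pf, n_tests):
    """Per-test significance level from the per-family level, inverting
    alpha_PF = 1 - (1 - alpha_PT)**C  =>  alpha_PT = 1 - (1 - alpha_PF)**(1/C)."""
    return 1.0 - (1.0 - alpha_pf) ** (1.0 / n_tests)
```

The threshold $\overline{D}_c$ then follows from the fitted noise distribution as the value whose upper tail area equals this corrected αPT.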

V. CONCLUSION

In summary, we have proposed the use of the Kullback-Leibler distance (or relative entropy) as an alternative method to analyse fMRI data series. The probability distributions of the mean Kullback-Leibler distance $\overline{D}$ (throughout the N epochs of the event-related experiment) were determined by numerical simulations. The sensitivity and specificity of the method were evaluated by construction of ROC curves for different SNRs. The functional map corresponding to real data series from an asymptomatic volunteer submitted to an ER motor stimulus was obtained by using the proposed technique. Our results from both simulated and real data indicate that the technique can be a very good option for fMRI analysis. As a further study, the use of the generalized Kullback-Leibler distance in fMRI analysis might be considered.

Acknowledgments

The authors acknowledge the Brazilian agencies CNPq and CAPES for financial support.

Received on 01 December, 2007

  • [1] D. Lebihan, Diffusion and Perfusion Magnetic Resonance Imaging Applications to Functional MRI, Raven Press, New York, 1995.
  • [2] S. Ogawa, T. Lee, A. Nayak, P. Glynn, Magnetic Resonance in Medicine 14, 68 (1990).
  • [3] S. Ogawa, T. Lee, A. Kay, D. Tank, Proceedings of the National Academy of Sciences USA 87, 9868 (1990).
  • [4] P. van Zijl, S. Eleff, J. Ulatowski, J. Oja, A. Ulug, R. Traystman, R. Kauppinen, Nature Medicine 4, 159 (1998).
  • [5] A. Capurro, L. Diambra, D. Lorenzo, O. Macadar, M. T. Martin, C. Mostaccio, A. Plastino, J. Pérez, E. Rofman, M. E. Torres, J. Velluti, Physica A 265, 235 (1999).
  • [6] T. Yamano, Phys. Rev. A 63, 046105 (2001).
  • [7] D. B. de Araujo, W. Tedeschi, A. C. Santos, J. Elias Jr., U. P. C. Neves, O. Baffa, Neuroimage 20, 311 (2003).
  • [8] W. Tedeschi, D. B. de Araujo, H.-P. Müller, A. C. Santos, U. P. C. Neves, S. N. Erné, O. Baffa, Physica A 344, 705 (2004).
  • [9] W. Tedeschi, H.-P. Müller, D. B. de Araujo, A. C. Santos, U. P. C. Neves, S. N. Erné, O. Baffa, Physica A 352, 629 (2005).
  • [10] T. M. Cover, J. A. Thomas, Elements of information theory, John Wiley & Sons, New York, (1991).
  • [11] F. M. Miezin, L. Maccotta, J. M. Ollinger, S. E. Petersen and R. L. Buckner, NeuroImage 11, 735 (2000).
  • [12] D. Malonek, A. Grinvald, Science 272, 551 (1996).
  • [13] K. J. Friston, P. Fletcher, O. Josephs, A. Holmes, M. D. Rugg, R. Turner, NeuroImage 7, 30 (1998).
  • [14] J. Hanley, B. McNeil, Radiology 148, 839 (1983).
  • [15] D. Goodenough, K. Rossmann, L. Lusted, Radiology 110, 89 (1974).
  • [16] N. Salkind, Encyclopedia of Measurement and Statistics, Sage Publications, Inc, 2006.

Publication Dates

  • Publication in this collection
    27 Mar 2008
  • Date of issue
    Mar 2008

Sociedade Brasileira de Física, Caixa Postal 66328, 05315-970, São Paulo, SP, Brazil. Tel.: +55 11 3091-6922; Fax: +55 11 3816-2063. E-mail: sbfisica@sbfisica.org.br