This paper is concerned with the reliable inference of optimal tree approximations to the dependency structure of an unknown distribution generating data. The traditional approach to the problem measures the strength of dependency between random variables by mutual information. In this paper, reliability is achieved through Walley's imprecise Dirichlet model, which generalizes Bayesian learning with Dirichlet priors. Adopting the imprecise Dirichlet model yields a posterior interval expectation for mutual information and a set of plausible trees consistent with the data. Reliable inference about the actual tree is achieved by focusing on the substructure common to all the plausible trees. We develop an exact algorithm that infers this substructure in time O(m^4), where m is the number of random variables. The new algorithm is applied to a set of data sampled from a known distribution. The method is shown to reliably infer edges of the actual tree even when the data are very scarce, unlike the traditional approach. Finally, we provide lower and upper credibility limits for mutual information under the imprecise Dirichlet model. These enable the previous developments to be extended to a full inferential method for trees.
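As a rough illustration of the traditional approach mentioned in the abstract, the sketch below estimates mutual information from empirical frequencies and builds a maximum-weight spanning tree with Kruskal's algorithm, in the spirit of Chow and Liu. It is not the paper's O(m^4) algorithm. The function names (`mutual_information`, `chow_liu_tree`, `idm_interval`), the data layout, and the default `s=1.0` are assumptions made for this example. The last helper shows the interval of posterior expectations that the imprecise Dirichlet model assigns to a single cell chance, given a count out of n observations and a hyperparameter s.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) of two discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # Sum over observed cells of p(x, y) * log(p(x, y) / (p(x) * p(y))).
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def chow_liu_tree(columns):
    """Maximum-weight spanning tree over the variables (Kruskal's algorithm).

    `columns` (a hypothetical layout) maps a variable name to its list of
    observed values; all lists have the same length.
    """
    names = list(columns)
    # Candidate edges, heaviest (most dependent) first.
    edges = sorted(
        ((mutual_information(columns[a], columns[b]), a, b)
         for a, b in combinations(names, 2)),
        key=lambda e: e[0],
        reverse=True,
    )
    parent = {v: v for v in names}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for _, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:  # the edge joins two components, so it creates no cycle
            parent[ra] = rb
            tree.append((a, b))
    return tree


def idm_interval(count, n, s=1.0):
    """Bounds on the posterior expectation of one chance under the IDM."""
    return count / (n + s), (count + s) / (n + s)
```

With scarce data the interval returned by `idm_interval` is wide, which is the reason a single point estimate of mutual information can select edges that the data do not really support.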
References
M. Abramowitz and I.A. Stegun, eds., Handbook of Mathematical Functions (Dover, 1974).
I.D. Aron and P. Van Hentenryck, On the complexity of the robust spanning tree problem with interval data, Operations Research Letters 32 (2004) 36–40.
J.-M. Bernard, Non-parametric inference about an unknown mean using the imprecise Dirichlet model, in: ISIPTA'01, eds. G. de Cooman, T. Fine and T. Seidenfeld (The Netherlands, 2001) pp. 40–50.
J.-M. Bernard, An introduction to the imprecise Dirichlet model for multinomial data, International Journal of Approximate Reasoning 39(2–3) (2005) 123–150.
C.K. Chow and C.N. Liu, Approximating discrete probability distributions with dependence trees, IEEE Transactions on Information Theory, IT-14(3) (1968) 462–468.
N. Friedman, D. Geiger and M. Goldszmidt, Bayesian network classifiers, Machine Learning 29(2/3) (1997) 131–163.
A. Gelman, J.B. Carlin, H.S. Stern and D.B. Rubin, Bayesian Data Analysis (Chapman, 1995).
J.B.S. Haldane, The precision of observed values of small frequencies, Biometrika 35 (1948) 297–300.
M. Hutter, Distribution of mutual information, in: Proceedings of NIPS*2001, eds. T.G. Dietterich, S. Becker and Z. Ghahramani (Cambridge, MA, 2001).
M. Hutter, Robust estimators under the imprecise Dirichlet model, in: Proc. 3rd International Symposium on Imprecise Probabilities and Their Applications (ISIPTA-2003), Proceedings in Informatics Vol. 18 (Canada, 2003) pp. 274–289.
M. Hutter and M. Zaffalon, Distribution of mutual information from complete and incomplete data, Computational Statistics & Data Analysis 48(3) (2005) 633–657.
H. Jeffreys, An invariant form for the prior probability in estimation problems, Proceedings of the Royal Society of London A 186 (1946) 453–461.
M.G. Kendall and A. Stuart, The Advanced Theory of Statistics, 2nd edition. (Griffin, London, 1967).
G.D. Kleiter, The posterior probability of Bayes nets with strong dependences, Soft Computing 3 (1999) 162–173.
J.B. Kruskal Jr., On the shortest spanning subtree of a graph and the traveling salesman problem, Proceedings of the American Mathematical Society 7 (1956) 48–50.
S. Kullback, Information Theory and Statistics (Dover, 1968).
S. Kullback and R.A. Leibler, On information and sufficiency, Annals of Mathematical Statistics 22 (1951) 79–86.
C. Manski, Partial Identification of Probability Distributions (Department of Economics, Northwestern University, USA: Draft book, 2002).
R. Montemanni, A Benders decomposition approach for the robust spanning tree problem with interval data, European Journal of Operational Research. Forthcoming.
C.H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity (Prentice Hall, New York, 1982).
J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (Morgan Kaufmann, San Mateo, 1988).
W. Perks, Some observations on inverse probability, Journal of the Institute of Actuaries 73 (1947) 285–312.
M. Ramoni and P. Sebastiani, Robust learning with missing data, Machine Learning 45(2) (2001) 147–170.
T. Verma and J. Pearl, Equivalence and synthesis of causal models, in: UAI'90, eds. P.P. Bonissone, M. Henrion, L.N. Kanal and J.F. Lemmer (New York, 1990) pp. 220–227.
P. Walley, Statistical Reasoning with Imprecise Probabilities (Chapman and Hall, New York, 1991).
P. Walley, Inferences from multinomial data: learning about a bag of marbles, Journal of the Royal Statistical Society B 58(1) (1996) 3–57.
D.H. Wolpert and D.R. Wolf, Estimating functions of distributions from a finite set of samples, Physical Review E 52(6) (1995) 6841–6854.
H. Yaman, O.E. Karaşan and M.C. Pinar, The robust spanning tree problem with interval data, Operations Research Letters 29 (2001) 31–40.
M. Zaffalon, Exact credal treatment of missing data, Journal of Statistical Planning and Inference 105(1) (2002) 105–122.
Cite this article
Zaffalon, M., Hutter, M. Robust inference of trees. Ann Math Artif Intell 45, 215–239 (2005). https://doi.org/10.1007/s10472-005-9007-9
DOI: https://doi.org/10.1007/s10472-005-9007-9
Keywords
- robust inference
- spanning trees
- intervals
- dependence
- graphical models
- mutual information
- imprecise probabilities
- imprecise Dirichlet model