Abstract
Given a discrete-valued sample X_1,…,X_n, we wish to decide whether it was generated by a distribution belonging to a family H_0 or by a distribution belonging to a family H_1. In this work we assume that all distributions are stationary ergodic and make no further assumptions (in particular, no independence or mixing-rate assumptions). We find some necessary and some sufficient conditions, formulated in terms of the topological properties of H_0 and H_1, for the existence of a consistent test. For the case where H_1 is the complement of H_0 (in the set of all stationary ergodic processes), these necessary and sufficient conditions coincide, thereby providing a complete characterization of the families of processes for which membership can be consistently tested, against their complement, based on sampling. This criterion includes as special cases several known and some new results on testing for membership in various parametric families, as well as testing identity, independence, and other hypotheses.
Communicated by Domingo Morales.
Cite this article
Ryabko, D. Testing composite hypotheses about discrete ergodic processes. TEST 21, 317–329 (2012). https://doi.org/10.1007/s11749-011-0245-3
DOI: https://doi.org/10.1007/s11749-011-0245-3