Copyright © 1995 Published by Elsevier Science B.V.
On the exponential value of labeled samples
Received 14 July 1994.
Abstract
Consider the problem of classifying a sample $X_0$ into one of two classes, using a training set $Q$. Let $Q$ be composed of $l$ labeled samples $\{(X_1, \theta_1), \ldots, (X_l, \theta_l)\}$ and $u$ unlabeled samples $\{X'_1, \ldots, X'_u\}$, where the labels $\theta_i$ are i.i.d. Bernoulli($\eta$) random variables over the set $\{1, 2\}$, the observations $\{X_i\}_{i=1}^{l}$ are distributed according to $f_{\theta_i}(\cdot)$, and the unlabeled observations $\{X'_j\}_{j=1}^{u}$ are independently distributed according to the mixture density $f_{X'}(\cdot) = \eta f_1(\cdot) + (1-\eta) f_2(\cdot)$. We assume that $f_1(\cdot)$, $f_2(\cdot)$ and $\eta$ are all unknown. Let $f_1(\cdot)$ and $f_2(\cdot)$ belong to a known family of densities, and assume that the mixtures of elements of this family are identifiable. Even when the number of unlabeled samples is infinite, so that the decision regions can be identified, one still needs labeled samples to assign the correct class to each decision region. Letting $R(l, u)$ denote the optimal probability of error for $l$ labeled and $u$ unlabeled samples, and assuming that the pairwise mixtures of this family are identifiable, we obtain the obvious statements
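
The role of the labeled samples when the mixture itself is already known can be illustrated with a small simulation. The following is only a sketch under assumptions that do not come from the abstract: the two class densities are taken to be unit-variance Gaussians with means -1 and +1, the prior is set to 0.3, and the mixture is treated as known up to a permutation of the class labels (standing in for an infinite number of unlabeled samples). The function names `sample_labeled`, `choose_labeling`, `classify` and `estimate_error` are hypothetical, and the labeling rule shown (pick the permutation with the larger likelihood on the labeled samples, guess at random on a tie) is one natural choice, not necessarily the rule analyzed in the paper.

```python
import math
import random


def sample_labeled(l, eta, means, rng):
    """Draw l pairs (x, theta): theta = 1 with probability eta, else 2,
    and x ~ N(means[theta], 1). The Gaussian family is an assumption."""
    pairs = []
    for _ in range(l):
        theta = 1 if rng.random() < eta else 2
        pairs.append((rng.gauss(means[theta], 1.0), theta))
    return pairs


def log_density(x, mean):
    """Log of the unit-variance Gaussian density at x."""
    return -0.5 * (x - mean) ** 2 - 0.5 * math.log(2.0 * math.pi)


def choose_labeling(labeled, eta, means, rng):
    """Decide which of the two labelings of the known mixture components
    fits the labeled samples better. Returns True for the identity
    labeling and False for the swapped one; ties are broken at random."""
    prior = {1: eta, 2: 1.0 - eta}
    ll_identity = sum(math.log(prior[t]) + log_density(x, means[t])
                      for x, t in labeled)
    ll_swapped = sum(math.log(prior[3 - t]) + log_density(x, means[3 - t])
                     for x, t in labeled)
    if ll_identity == ll_swapped:  # always the case when l == 0
        return rng.random() < 0.5
    return ll_identity > ll_swapped


def classify(x, eta, means, identity):
    """Assign x to the mixture component with the larger posterior weight,
    then name that component according to the chosen labeling."""
    score1 = math.log(eta) + log_density(x, means[1])
    score2 = math.log(1.0 - eta) + log_density(x, means[2])
    component = 1 if score1 >= score2 else 2
    return component if identity else 3 - component


def estimate_error(l, eta=0.3, means=None, trials=20000, seed=0):
    """Monte Carlo estimate of the error probability of the rule above
    when l labeled samples are available and the mixture is known."""
    means = means or {1: -1.0, 2: 1.0}
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        labeled = sample_labeled(l, eta, means, rng)
        identity = choose_labeling(labeled, eta, means, rng)
        x0, theta0 = sample_labeled(1, eta, means, rng)[0]
        errors += classify(x0, eta, means, identity) != theta0
    return errors / trials


if __name__ == "__main__":
    for l in (0, 1, 5):
        print(l, estimate_error(l))
```

With `l = 0` this rule can only guess the labeling, so its estimated error should sit near 1/2 however well the decision regions are known. A handful of labeled samples is already enough to bring the estimate down sharply in this toy setting.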













