Pattern Recognition Letters
Volume 16, Issue 1, January 1995, Pages 105-111
 
doi:10.1016/0167-8655(94)00074-D
Copyright © 1995 Published by Elsevier Science B.V.

On the exponential value of labeled samples

Vittorio Castelli and Thomas M. Cover

Information Systems Laboratory, Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA

Received 14 July 1994; revised 8 August 1994. Available online 22 December 1999.


Abstract

Consider the problem of classifying a sample $X_0$ into one of two classes, using a training set $Q$. Let $Q$ be composed of $l$ labeled samples $\{(X_1, \theta_1), \ldots, (X_l, \theta_l)\}$ and $u$ unlabeled samples $\{X_1, \ldots, X_u\}$, where the labels $\theta_i$ are i.i.d. Bernoulli($\eta$) random variables over the set $\{1, 2\}$, the observations $\{X_i\}_{i=1}^{l}$ are distributed according to $f_{\theta_i}(\cdot)$, and the unlabeled observations $\{X_j\}_{j=1}^{u}$ are independently distributed according to the mixture density $f_X(\cdot) = \eta f_1(\cdot) + (1-\eta) f_2(\cdot)$. We assume that $f_1(\cdot)$, $f_2(\cdot)$ and $\eta$ are all unknown. Let $f_1(\cdot)$ and $f_2(\cdot)$ belong to a known family $\mathcal{F}$, and assume that the mixtures of elements of $\mathcal{F}$ are identifiable. Even when the number of unlabeled samples is infinite and the decision regions can therefore be identified, one still needs labeled samples to label the decision regions with the correct classification. Letting $R(l, u)$ denote the optimal probability of error for $l$ labeled and $u$ unlabeled samples, and assuming that the pairwise mixtures of $\mathcal{F}$ are identifiable, we obtain the obvious statements [formula image missing from source], and then prove $R(1, \infty) = 2R^*(1-R^*)$, where $R^*$ is the Bayes probability of error, and $R(l, \infty) = R^* + \exp\{-\alpha l + o(l)\}$, where the exponent $\alpha$ is given by [formula image missing from source]. Thus the first labeled sample reduces the risk from 1/2 to $2R^*(1-R^*)$, and subsequent labeled samples in the training set reduce the probability of error exponentially fast to the Bayes risk.
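
The behaviour described in the abstract can be checked numerically on a toy problem. The sketch below is not taken from the paper: it assumes two unit-variance Gaussian class densities with means -mu and +mu, a class-1 probability eta = 0.3, and a classifier that uses the labeled samples only to choose between the two possible labelings of the known decision regions (by a likelihood ratio, with ties broken by a fair coin). Under those assumptions, picking the wrong labeling turns the Bayes rule into its mirror image, so the risk is R* + P(wrong labeling) · (1 − 2R*). All function names and parameter values are illustrative.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bayes_risk(eta, mu):
    """Bayes error R* for f1 = N(-mu, 1), f2 = N(+mu, 1), P(class 1) = eta."""
    # Bayes rule: decide class 1 when eta*f1(x) > (1-eta)*f2(x), i.e. x < x_star.
    x_star = math.log(eta / (1.0 - eta)) / (2.0 * mu)
    miss_class1 = 1.0 - normal_cdf(x_star + mu)  # class 1 sample falls above x_star
    miss_class2 = normal_cdf(x_star - mu)        # class 2 sample falls below x_star
    return eta * miss_class1 + (1.0 - eta) * miss_class2


def mislabel_probability(l, eta, mu, trials, rng):
    """Monte Carlo estimate of P(l labeled samples select the swapped labeling)."""
    log_prior_ratio = math.log(eta / (1.0 - eta))
    wrong = 0
    for _ in range(trials):
        llr = 0.0
        for _ in range(l):
            if rng.random() < eta:
                x, sign = rng.gauss(-mu, 1.0), 1.0   # sample from class 1
            else:
                x, sign = rng.gauss(mu, 1.0), -1.0   # sample from class 2
            # log [likelihood under true labeling / likelihood under swapped labeling]
            llr += sign * (log_prior_ratio - 2.0 * mu * x)
        if llr < 0.0:
            wrong += 1
        elif llr == 0.0 and rng.random() < 0.5:
            wrong += 1  # no evidence either way (e.g. l = 0): fair coin
    return wrong / trials


def risk_with_infinite_unlabeled(l, eta=0.3, mu=1.0, trials=20000, seed=0):
    """Estimated R(l, infinity) for the toy Gaussian problem (assumed setup)."""
    rng = random.Random(seed)
    r_star = bayes_risk(eta, mu)
    p_wrong = mislabel_probability(l, eta, mu, trials, rng)
    # Correct labeling gives error R*; swapped labeling gives error 1 - R*.
    return r_star + p_wrong * (1.0 - 2.0 * r_star)


if __name__ == "__main__":
    r_star = bayes_risk(0.3, 1.0)
    print("R* =", r_star, " 2R*(1-R*) =", 2 * r_star * (1 - r_star))
    for l in (0, 1, 2, 4, 8, 16):
        print(l, risk_with_infinite_unlabeled(l))
```

With no labeled samples the estimate should sit near 1/2, with one labeled sample near 2R*(1−R*), and for larger l it should approach R*, in line with the statements in the abstract. In this toy setup a single labeled sample picks the wrong labeling exactly when the Bayes rule would misclassify that sample, which is why the l = 1 case reduces to R* + R*(1 − 2R*) = 2R*(1−R*).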
