Copyright © 1999 Published by Elsevier Science B.V. All rights reserved.
KPS: a Web information mining algorithm
Available online 3 May 2000.
Abstract
Most information on the Web is semi-structured. It is, however, not easy to search for and extract the structural data hidden in a Web page. Current practice addresses this problem through (1) syntax analysis (i.e. of HTML tags) or (2) wrappers or user-defined declarative languages. The former is suitable only for highly structured Web sites; the latter is time-consuming and scales poorly: wrappers can handle tens, but certainly not thousands, of information sources. In this paper, we present KPS, a novel algorithm for mining semi-structured information on the Web. KPS employs keywords, patterns and/or samples to mine the desired information. Experimental results show that KPS is more efficient than existing Web extraction methods.
Author Keywords: Information extraction; Information retrieval; Web query; Web databases
*E-mail: guan@cs.uregina.ca
¹E-mail: kfwong@se.cuhk.edu.hk











