Data & Knowledge Engineering
Volume 52, Issue 2, February 2005, Pages 249-271
XML schema and data management
 
doi:10.1016/j.datak.2004.05.008
Copyright © 2004 Elsevier B.V. All rights reserved.

Bio2X: a rule-based approach for semi-automatic transformation of semi-structured biological data to XML

Song Yang^a, Sourav S. Bhowmick^a (corresponding author) and Sanjay Madria^b

a School of Computer Engineering, Nanyang Technological University, Singapore 639798, Singapore
b Department of Computer Science, University of Missouri-Rolla, Rolla 65409, USA

Received 21 May 2004; revised 21 May 2004; accepted 21 May 2004. Available online 1 July 2004.


Abstract

Data integration of geographically dispersed, heterogeneous, complex biological databases is a key research area. A successful data integration system needs a simple, self-describing data exchange format. However, many biological databases provide their data as flat files, which are a poor data exchange format. XML, by contrast, can serve as both a powerful data model and a better data exchange format. In this paper, we present the Bio2X system, which transforms flat file data into highly hierarchical XML data using a rule-based machine learning technique. Bio2X has been fully implemented in Java. Our experiments on transforming real-world biological data demonstrate the effectiveness of the Bio2X approach.

Author Keywords: Flat files; Rule base; Machine learning; XML; Transformer
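
The abstract does not show Bio2X's rules or input formats, so the following is only a rough illustration of the general idea of rule-driven flat-file-to-XML conversion. It assumes a hypothetical flat file in which each line starts with a two-letter code, and it uses a small hand-written rule table (RULES, flat_to_xml, and the sample record are all invented for this sketch). Bio2X itself is described as learning its rules and as being implemented in Java; this Python sketch does neither.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, hand-written rules: line code -> (XML tag, optional value splitter).
# Bio2X induces its rules by machine learning; these are fixed for illustration only.
RULES = {
    "ID": ("identifier", None),
    "DE": ("description", None),
    "OS": ("organism", None),
    # A value can carry nested structure: split it into child elements.
    "KW": ("keywords", ("keyword", re.compile(r";\s*"))),
}


def flat_to_xml(text: str) -> ET.Element:
    """Convert one flat-file record into an XML element tree using RULES."""
    root = ET.Element("entry")
    for line in text.splitlines():
        code, _, value = line.partition(" ")
        rule = RULES.get(code)
        if rule is None:
            continue  # no rule for this line code (or blank line): skip it
        tag, split = rule
        node = ET.SubElement(root, tag)
        value = value.strip()
        if split is None:
            node.text = value
        else:
            child_tag, pattern = split
            # Drop the trailing period, then emit one child per list item.
            for part in pattern.split(value.rstrip(".")):
                if part:
                    ET.SubElement(node, child_tag).text = part
    return root


# Invented sample record; not taken from any real database.
SAMPLE = """ID   EXAMPLE_1
DE   Hypothetical example protein.
OS   Homo sapiens.
KW   Kinase; Transferase.
"""

if __name__ == "__main__":
    print(ET.tostring(flat_to_xml(SAMPLE), encoding="unicode"))
```

In this sketch the "KW" rule shows how a single flat value can be expanded into a small hierarchy, which is the kind of structure the outline's Section 3.3 title ("Extracting hierarchical structure from values") suggests; the actual rule language used in the paper may differ substantially.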

Article Outline

1. Introduction
1.1. Motivation
1.2. XML as data exchange format
1.3. Overview of Bio2X
2. Structure of biological data
3. Design of extraction rules
3.1. Overview
3.2. Extracting children of the root
3.3. Extracting hierarchical structure from values
3.4. Disjunctive rules
4. Rule induction system
4.1. Algorithm learnrule
4.2. A case study
5. Experimental results
6. Related work
7. Conclusions and future work
References
Vitae