Abstract
Informatics has helped launch molecular biology into the genomic era, and it appears certain that informatics will remain a major contributor to molecular biology in the post-genome era. We discuss here data integration and data mining in bioinformatics, as well as the role that database theory has played in these topics. We also describe laboratory information management systems (LIMS) as a third key topic in bioinformatics where advances in database systems and theory can be very relevant.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, J., See-Kiong, N., Wong, L. (2003). Bioinformatics Adventures in Database Research. In: Calvanese, D., Lenzerini, M., Motwani, R. (eds) Database Theory — ICDT 2003. ICDT 2003. Lecture Notes in Computer Science, vol 2572. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36285-1_3
DOI: https://doi.org/10.1007/3-540-36285-1_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00323-6
Online ISBN: 978-3-540-36285-2
eBook Packages: Springer Book Archive