Clustering techniques in biological sequence analysis

Manning, A. M.; Keane, J. A.; Brass, A.; Goble, C. A.

doi:10.1007/3-540-63223-9_130


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1263)

Included in the following conference series:

European Symposium on Principles of Data Mining and Knowledge Discovery


Abstract

In biological sequence analysis, many DNA and RNA sequences discovered in laboratory experiments are not properly identified. The focus here is on using clustering algorithms to give structure to the data. The approach is interdisciplinary, using domain knowledge to identify such sequences. The enormous volume and high dimensionality of unidentified biological sequence data present a challenge. Nonetheless, useful and interesting results have been obtained, both directly and indirectly, by applying clustering to the data.
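The abstract does not say which features or clustering algorithms the chapter uses, so the following is only an illustrative sketch of the general idea of giving structure to unlabelled sequences. It assumes each sequence is represented by its normalised 2-mer frequencies and grouped with a simple single-linkage agglomerative procedure; the function names (`kmer_vector`, `single_linkage`) and the toy sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product
import math


def kmer_vector(seq, k=2, alphabet="ACGT"):
    """Return normalised k-mer frequencies for one sequence (assumed feature choice)."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(vectors, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Cluster distance is the smallest distance between any pair of members.
    Returns clusters as sorted lists of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(
                    euclidean(vectors[i], vectors[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))  # b > a, so index a stays valid
    return [sorted(c) for c in clusters]


# Made-up toy sequences: two AT-rich and two GC-rich.
sequences = ["ATATATATAT", "TATATATATA", "GCGCGCGCGC", "CGCGCGCGCG"]
vectors = [kmer_vector(s) for s in sequences]
clusters = single_linkage(vectors, n_clusters=2)
print(clusters)
```

With k-mers of length k over a four-letter alphabet, each sequence becomes a vector of 4^k features, which hints at how quickly the dimensionality mentioned in the abstract can grow. The pairwise search here is quadratic per merge and only suits small examples.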

Work supported by UK EPSRC MRes studentship and ESPRIT HPCN Project No 22693.



Author information

Authors and Affiliations

Department of Computation, UMIST, M60 1QD, Manchester, UK
A. M. Manning & J. A. Keane
School of Biological Sciences, University of Manchester, M13 9PL, UK
A. Brass
Department of Computer Science, University of Manchester, M13 9PL, UK
C. A. Goble


Editor information

Jan Komorowski, Jan Zytkow


Cite this paper

Manning, A.M., Keane, J.A., Brass, A., Goble, C.A. (1997). Clustering techniques in biological sequence analysis. In: Komorowski, J., Zytkow, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1997. Lecture Notes in Computer Science, vol 1263. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63223-9_130


DOI: https://doi.org/10.1007/3-540-63223-9_130
Published: 06 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63223-8
Online ISBN: 978-3-540-69236-2
eBook Packages: Springer Book Archive
