Abstract
Although the number of completely sequenced genomes has grown quickly in recent years, methods for predicting protein function by homology across those genomes remain underused. Constructing OPCs (Orthologous Protein Clusters) from the best reciprocal BLAST hits among multiple complete genomes has proven a successful technique, but building the OPCs by hand is time-consuming. Here we propose an automatic method that clusters OPs (Orthologous Proteins) from multiple complete genomes. The method extends INPARANOID, an automatic program that detects OPs between two complete genomes. We also prove all possible clustering cases mathematically.
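To make the idea of best reciprocal hits concrete, the following is a minimal sketch and not the method of this paper. It assumes that pairwise BLAST bit scores are already available as a dictionary, that each protein is identified by a hypothetical (genome, name) tuple, and that reciprocal pairs are merged by simple single-linkage grouping. The function names `best_hits`, `reciprocal_pairs`, and `cluster` are illustrative only.

```python
from collections import defaultdict


def best_hits(scores):
    """For each query protein, keep its highest-scoring hit in each other genome.

    scores maps (query, subject) -> bit score; proteins are (genome, name) tuples
    (an assumed input format, not one defined in the paper).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query[0] == subject[0]:
            continue  # ignore hits within the same genome
        key = (query, subject[0])
        if key not in best or score > best[key][1]:
            best[key] = (subject, score)
    return {key: hit for key, (hit, _) in best.items()}


def reciprocal_pairs(scores):
    """Return protein pairs that are each other's best hit across two genomes."""
    best = best_hits(scores)
    pairs = set()
    for (query, _genome), subject in best.items():
        if best.get((subject, query[0])) == query:
            pairs.add(frozenset((query, subject)))
    return pairs


def cluster(pairs):
    """Group proteins linked by reciprocal pairs (single linkage, union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for pair in pairs:
        a, b = tuple(pair)
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for protein in list(parent):
        groups[find(protein)].add(protein)
    return sorted(sorted(group) for group in groups.values())
```

Single linkage is the simplest way to combine pairwise results from more than two genomes; it can chain unrelated proteins together, which is one reason a more careful multi-genome procedure is needed.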
This work was supported by the Regional Research Centers Program of the Ministry of Education & Human Resources Development in South Korea.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, S., Jung, K.S., Ryu, K.H. (2006). Automatic Orthologous-Protein-Clustering from Multiple Complete-Genomes by the Best Reciprocal BLAST Hits. In: Li, J., Yang, Q., Tan, AH. (eds) Data Mining for Biomedical Applications. BioDM 2006. Lecture Notes in Computer Science, vol 3916. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11691730_7
DOI: https://doi.org/10.1007/11691730_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33104-9
Online ISBN: 978-3-540-33105-6
eBook Packages: Computer Science, Computer Science (R0)