Journal of Computer and System Sciences
Volume 65, Issue 3, November 2002, Pages 570-586
 
doi:10.1016/S0022-0000(02)00010-7
Copyright © 2002 Published by Elsevier Science (USA).

Efficient algorithms for locating the length-constrained heaviest segments with applications to biomolecular sequence analysis

Yaw-Ling Lin (a), Tao Jiang (b, corresponding author) and Kun-Mao Chao (c)

(a) Department of Computer Science and Information Management, Providence University, 200 Chung Chi Road, Shalu, Taichung County, 433, Taiwan
(b) Department of Computer Science, University of California Riverside, Riverside, CA 92521-0144, USA
(c) Department of Life Science, National Yang-Ming University, Taipei, 112, Taiwan

Received 20 January 2002; revised 28 February 2002. Available online 17 January 2003.

Abstract

We study two fundamental problems concerning the search for interesting regions in sequences: (i) given a sequence of real numbers of length n and an upper bound U, find a consecutive subsequence of length at most U with the maximum sum and (ii) given a sequence of real numbers of length n and a lower bound L, find a consecutive subsequence of length at least L with the maximum average. We present an O(n)-time algorithm for the first problem and an O(n log L)-time algorithm for the second. The algorithms have potential applications in several areas of biomolecular sequence analysis including locating GC-rich regions in a genomic DNA sequence, post-processing sequence alignments, annotating multiple sequence alignments, and computing length-constrained ungapped local alignment. Our preliminary tests on both simulated and real data demonstrate that the algorithms are very efficient and able to locate useful (such as GC-rich) regions.
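To make the two problem statements concrete, here is a minimal Python sketch of each. It is illustrative only and is not taken from the article: the function names `max_sum_at_most` and `max_avg_at_least` are hypothetical, the first function uses a standard sliding-window minimum over prefix sums (one common way to reach O(n) time, not necessarily the authors' construction), and the second is a simple O(nL) baseline rather than the O(n log L) algorithm the abstract announces. The baseline relies on the observation that some optimal segment has length below 2L, since any longer segment splits into two pieces of length at least L, one of which has an average no smaller than the whole.

```python
from collections import deque
from typing import List, Sequence, Tuple


def _prefix_sums(a: Sequence[float]) -> List[float]:
    """prefix[i] is the sum of a[0:i], so a[s:e] sums to prefix[e] - prefix[s]."""
    prefix = [0.0]
    for x in a:
        prefix.append(prefix[-1] + x)
    return prefix


def max_sum_at_most(a: Sequence[float], U: int) -> Tuple[float, int, int]:
    """Problem (i): heaviest non-empty segment a[start:end] with end - start <= U.

    Sketch using a monotone deque of candidate start indices (assumed technique).
    """
    if U < 1 or not a:
        raise ValueError("need a non-empty sequence and U >= 1")
    prefix = _prefix_sums(a)
    best = (float("-inf"), 0, 0)
    window = deque()  # start indices whose prefix values are strictly increasing
    for end in range(1, len(a) + 1):
        start = end - 1
        # A start with a larger-or-equal prefix value can never beat the new one.
        while window and prefix[window[-1]] >= prefix[start]:
            window.pop()
        window.append(start)
        # Discard starts that would make the segment longer than U.
        while window[0] < end - U:
            window.popleft()
        total = prefix[end] - prefix[window[0]]
        if total > best[0]:
            best = (total, window[0], end)
    return best


def max_avg_at_least(a: Sequence[float], L: int) -> Tuple[float, int, int]:
    """Problem (ii): segment a[start:end] with end - start >= L and maximum average.

    Simple O(n * L) baseline: only lengths L .. 2L - 1 need to be examined.
    """
    n = len(a)
    if L < 1 or L > n:
        raise ValueError("need 1 <= L <= len(a)")
    prefix = _prefix_sums(a)
    best = (float("-inf"), 0, 0)
    for start in range(n - L + 1):
        last_end = min(n, start + 2 * L - 1)
        for end in range(start + L, last_end + 1):
            avg = (prefix[end] - prefix[start]) / (end - start)
            if avg > best[0]:
                best = (avg, start, end)
    return best


# Usage: score G/C as 1 and A/T as 0, so the average of a segment is its GC content.
dna = "ATGCGCGATAT"
gc_scores = [1 if base in "GC" else 0 for base in dna]
gc_rich = max_avg_at_least(gc_scores, 4)
```

Both functions return a triple `(value, start, end)` describing the half-open slice `a[start:end]`. In the usage lines, the GC-rich application named in the abstract is reduced to problem (ii) by a 0/1 encoding of the bases; the toy string `dna` is made up for illustration.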

Author Keywords: Algorithm; Efficiency; Maximum consecutive subsequence; Length constraint; Biomolecular sequence analysis; Ungapped local alignment

Article Outline

1. Introduction
2. Applications to biomolecular sequence analysis
2.1. Locating GC-rich regions
2.2. Post-processing sequence alignments
2.3. Annotating multiple sequence alignments
2.4. Computing ungapped local alignments with length constraints
3. Maximum sum consecutive subsequence with length constraints
4. Maximum average consecutive subsequence with length constraints
5. Implementation and preliminary experiments
6. Concluding remarks
Acknowledgements
References










 