DOI: 10.1145/2500863.2500871
research-article

Mining spatially cohesive itemsets in protein molecular structures

Published: 11 August 2013

ABSTRACT

In this paper we present a cohesive structural itemset miner that discovers interesting patterns in a set of data objects within a multidimensional spatial structure by combining the cohesion and the support of each pattern. We demonstrate the usefulness of the algorithm by applying it to find interesting patterns of amino acids in spatial proximity within a set of proteins, based on their atomic coordinates in the protein molecular structure. The experiments show that several patterns found by the cohesive structural itemset miner contain amino acids that frequently co-occur in the spatial structure, even if they are distant in the primary protein sequence and are only brought together by protein folding. Furthermore, there are various indications that some of the discovered patterns represent common underlying support structures within the proteins.


  • Published in

    BioKDD '13: Proceedings of the 12th International Workshop on Data Mining in Bioinformatics
    August 2013
    64 pages
    ISBN: 9781450323277
    DOI: 10.1145/2500863
    General Chairs: Jake Chen, Mohammed Zaki
    Program Chairs: Gaurav Pandey, Huzefa Rangwala, George Karypis

    Copyright © 2013 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Acceptance Rates

    BioKDD '13 paper acceptance rate: 7 of 16 submissions, 44%. Overall acceptance rate: 7 of 16 submissions, 44%.