The Construction and Use of Log-Odds Substitution Scores for Multiple Sequence Alignment

Figure 3

Large insertions in the central loop region of Api-AP2 domains.

As a consequence of asymmetric gap costs, Programs 1 and 2 reported several positive Api-AP2 candidates that have long insertions but show high-scoring matches to the canonical pattern in the other parts of the domain. Here, the sequence of T. gondii protein TGME49_06420, which has a 45-amino-acid insertion in the central loop region, is shown aligned with the two most closely matching domains of typical length. Program 2, run with a Dirichlet mixture prior and default parameters, assigned the insertion to the central loop location shown, which avoided the more conserved columns of the secondary structural elements indicated above the sequences. In contrast, Program 1 placed the same inserted residues in three separate locations, two of which would disrupt secondary structure. Moreover, an established HMM search method [80] (http://hmmer.janelia.org/) found only the right-end alignment of this TGME49_06420 domain, and assigned it a negative score well below the rejection threshold. Structural assignments E (beta-strand) and H (alpha-helix) are based on homologous experimental structures [121], [122] (PDB codes 2gcc, 3gcc, 3igm).

doi: https://doi.org/10.1371/journal.pcbi.1000852.g003