Regular article
An algorithm for the prediction of proteasomal cleavages ¹

https://doi.org/10.1006/jmbi.2000.3683

Abstract

Proteasomes, major proteolytic sites in eukaryotic cells, play an important part in major histocompatibility complex class I (MHC I) ligand generation and thus in the regulation of specific immune responses. Their cleavage specificity is of outstanding interest for this process.

In order to generalize previously determined cleavage motifs of 20 S proteasomes, we developed network-based model proteasomes trained by an evolutionary algorithm with experimental cleavage data of yeast and human 20 S proteasomes. A window of ten flanking amino acid residues proved sufficient for the model proteasomes to reproduce the experimental results with 98–100 % accuracy. Actual experimental data were reproduced significantly better than randomly selected cleavage sites, suggesting that our model proteasomes were able to extract rules inherent to proteasomal cleavage data. The affinity parameters of the model, which decide for or against cleavage, correspond with the cleavage motifs determined experimentally. The predictive power of the model was verified for unknown (to the program) test conditions: the prediction of cleavage numbers in proteins and the generation of MHC I ligands from short peptides.

In summary, our model proteasomes reproduce and predict proteasomal cleavages with a high degree of accuracy. They present a promising approach for predicting proteasomal cleavage products and, in combination with existing algorithms for MHC I ligand prediction, will be tested for their ability to improve cytotoxic T lymphocyte epitope prediction.

Introduction

Proteasomes are cytosolic multisubunit proteases which are involved in cell cycle control, transcription factor activation and the generation of peptide ligands for MHC I molecules (for reviews, see Baumeister et al., 1998; Rock & Goldberg, 1999; Uebel & Tampe, 1999). They exist in several forms: the proteolytically active core complexes, or 20 S proteasomes, and, when associated with the ATP-dependent 19 S cap complexes, the larger 26 S proteasomes that are able to recognize proteins marked by ubiquitin for proteasomal degradation (Jentsch & Schlenker, 1995; Hershko & Ciechanover, 1998). Another protein complex known to associate with the 20 S core particle is PA28, the 11 S regulator (Ahn et al., 1995), which was shown to improve the yield of antigenic peptides (Groettrup et al., 1996; Dick et al., 1996).

Eukaryotic 20 S proteasomes consist of four stacked rings (overall stoichiometry α7β7β7α7), each consisting of seven different subunits (Groll et al., 1997). Each of the two inner β-rings carries three catalytically active sites on its inner surface. Their proteolytic specificities have been described as chymotrypsin-like (cleaving after large, hydrophobic amino acids (AAs)), trypsin-like (cleaving after basic AAs) and peptidyl-glutamyl-peptide-hydrolyzing (cleaving after acidic AAs) (for a review, see Uebel & Tampe, 1999). Strings of unfolded proteins are thought to be inserted into the cylinder and to be cut into pieces by the active sites; the resulting peptide fragments are then released into the cytosol. Functionally, proteasomal protein degradation is believed to proceed from one substrate end to the other (“processively”), without the release of large degradation intermediates (Akopian et al., 1997; Nussbaum et al., 1998; Kisselev et al., 1999a).
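As a rough illustration of the three classically described specificities, the following Python sketch assigns the residue preceding a peptide bond (the P1 residue) to one of the three activities. The residue groupings and the function names `p1_activity` and `candidate_cuts` are assumptions made for illustration only; they are not taken from the experimental data, and the real active sites are reported to have overlapping specificities.

```python
# Illustrative residue groupings (an assumption, not taken from the article).
LARGE_HYDROPHOBIC = set("FYWLIM")
BASIC = set("KR")
ACIDIC = set("DE")


def p1_activity(residue: str) -> str:
    """Name the classical activity that would cleave after this P1 residue."""
    r = residue.upper()
    if r in LARGE_HYDROPHOBIC:
        return "chymotrypsin-like"
    if r in BASIC:
        return "trypsin-like"
    if r in ACIDIC:
        return "peptidyl-glutamyl-peptide-hydrolyzing"
    return "unassigned"


def candidate_cuts(sequence: str):
    """List (position, P1 residue, activity) for every peptide bond.

    Position i (1-based) denotes the bond between residues i and i + 1.
    """
    return [(i + 1, aa, p1_activity(aa)) for i, aa in enumerate(sequence[:-1])]
```

Such a P1-only lookup is deliberately naive: it ignores the flanking residues that the cleavage motifs take into account.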

In vertebrate cells, some of the proteolytic fragments produced by the proteasome are fed into the antigen processing machinery. Since peptide presentation by MHC I molecules at the cell surface is an intrinsic requirement for the ability of the immune system to eradicate virus-infected or transformed cells (Rammensee et al., 1993; Pamer & Cresswell, 1998), it is of general interest to know exactly how the proteasome is involved in this process. Proteasomal cleavage specificity has been assessed by in vitro digestion experiments using either tri- or tetrapeptides with fluorogenic leaving groups (Kuckelkorn et al., 1995; Heinemeyer et al., 1997; Arendt & Hochstrasser, 1997), peptides of 15–40 AAs (Boes et al., 1994; Niedermann et al., 1995, 1996; Dick et al., 1998), or denatured proteins (Dick et al., 1991, 1994; Kisselev et al., 1998, 1999a) as substrates. We analyzed the cleavage preferences of yeast wild-type and mutant proteasomes in a non-modified protein (Nussbaum et al., 1998) †. Using statistical analysis of cut sites, it was possible for the first time to determine so-called cleavage motifs, i.e. the preferred sequences around cleavage sites, for the three active β-subunits of yeast proteasomes.

In order to apply this cleavage site information to any possible proteasome substrate, an automated prediction device is needed. Such devices already exist for the binding of peptides to MHC I molecules (Rammensee et al., 1997) and have been described for peptide transport by the transporter associated with antigen processing (TAP) (Daniel et al., 1998). However, devices for the prediction of proteasomal cleavages are only at the beginning of their development. Recently, published peptide cleavage data were used to develop a prediction algorithm (Holzhutter et al., 1999), which reproduced its training data with 93 % accuracy and predicted non-training cleavages in one peptide substrate with 80 % accuracy. For different AAs in the P1 position ‡, cleavage motifs spanning up to 13 AAs were calculated by the algorithm. However, it has recently been shown that the three different proteolytic activities of eukaryotic proteasomes exhibit overlapping specificities (Dick et al., 1998; Nussbaum et al., 1998). We therefore planned to generate a prediction device that does not rely on various motifs for different P1-AAs, but reflects the combined cutting of the three active sites. This should mirror the experimental situation more closely, since observed cleavages arise from a mixture of three different, partly overlapping proteasomal cleavage specificities. More importantly, we wanted our approach to be based on a more homogeneous set of training data, generated under identical conditions where possible. A device for proteasomal cleavage prediction would take us one step further in predicting the selection of CTL epitopes presented on MHC I by the three well-described “funnels of specificity”: proteasome cleavages, TAP transport, MHC I binding.

To this end, we developed a network-based model for proteasome cleavages, trained by an evolutionary algorithm on cleavages in protein and some peptide substrates (Nussbaum et al., 1998; Niedermann et al., 1995, 1996, 1997; our unpublished results). Our program performed significantly better for experimental data than for randomly positioned cuts, suggesting that it extracts rules inherent to proteasomal cleavages from training data. In addition, it reproduced the training data with a very high level of accuracy (98–100 %). The parameters of the proteasome model largely reflect the roles of particular AAs in the cleavage motifs determined experimentally. Prediction of non-training cleavages was tested for some peptide substrates containing known MHC I ligands. The results indicate that our approach represents a promising starting point for refined algorithms for proteasomal cleavage and fragment prediction.

Section snippets

The experimental training data

Before designing a prediction tool, it was necessary to inspect carefully the experimental training data §. In brief, the experimental setup was as follows: purified 20 S proteasomes were co-incubated with a protein substrate (enolase, 436 AAs), enolase fragments were generated by the proteolytic action of the proteasome, and these fragments were identified by biochemical methods, allowing us to locate proteasomal cleavage sites. Thus, the experimental data consist of fragments and cleavage
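To make the step from identified fragments to cleavage sites concrete, here is a minimal Python sketch. It assumes that each fragment is recorded as a hypothetical `(start, end)` pair of 1-based, inclusive residue positions in the substrate, and that a cleavage site is named by the residue on its N-terminal side; the article's own data representation is not given in this excerpt.

```python
def cleavage_sites(fragments, substrate_length):
    """Derive cleavage positions from fragment boundaries.

    fragments: iterable of (start, end), 1-based and inclusive (assumed format).
    Returns sorted positions p, each meaning a cut between residues p and p + 1.
    The substrate's own N- and C-termini are not counted as cuts.
    """
    sites = set()
    for start, end in fragments:
        if not 1 <= start <= end <= substrate_length:
            raise ValueError(f"fragment ({start}, {end}) lies outside the substrate")
        if start > 1:
            sites.add(start - 1)  # cut that produced the fragment's N-terminus
        if end < substrate_length:
            sites.add(end)        # cut that produced the fragment's C-terminus
    return sorted(sites)
```

For a 436-residue substrate such as enolase, the made-up fragments `(1, 10)`, `(11, 25)` and `(30, 436)` would yield cuts after residues 10, 25 and 29.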

Reproduction performance

Our method was applied to experimental cleavage data generated by different 20 S proteasomes (yeast wild-type, single mutants, double mutant, human wild-type) in the substrate enolase (Nussbaum et al., 1998; our unpublished results), yielding different model proteasome “species” (1–8; Table 1).

Initially, the double mutant data were utilized because of their “cleavage homogeneity”. The decision to cut was based on information from the octameric interval P4,..., P4′, i.e. k = m = 4. An initial
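The octameric interval P4,..., P4′ (k = m = 4) can be pictured as a window of k residues on the N-terminal side and m residues on the C-terminal side of a candidate bond. The Python sketch below extracts such a window; the function name `cleavage_window` and the padding of windows near the substrate ends with `-` are assumptions for illustration, since the excerpt does not say how terminal positions were handled.

```python
def cleavage_window(sequence, site, k=4, m=4, pad="-"):
    """Return residues Pk..P1 followed by P1'..Pm' around a candidate cut.

    site: 1-based position of the P1 residue; the bond lies between
    residues site and site + 1. Windows running past either end of the
    sequence are padded (an assumed convention).
    """
    if not 1 <= site < len(sequence):
        raise ValueError("site must name a peptide bond inside the sequence")
    left = sequence[max(0, site - k):site]   # Pk..P1
    right = sequence[site:site + m]          # P1'..Pm'
    return left.rjust(k, pad) + right.ljust(m, pad)
```

With the toy sequence `"ABCDEFGHIJ"`, a cut after residue 5 gives the window `"BCDEFGHI"`, and a cut after residue 2 gives `"--ABCDEF"`.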

Discussion

Here we present a simple (one-layer) network capable of generating proteasomal cleavages in any given AA sequence. For training the network, goal functions counting missing and superfluous cuts (and cleavage probability in case of overlapping fragments in a refined model) were used, together with a stochastic, hill-climbing optimization process. Thus, parameter sets were found which reproduced the training data (cuts performed by 20 S proteasomes in enolase plus some cuts in ovalbumin peptides)
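The training idea described here (a one-layer network, a goal function counting missing and superfluous cuts, and stochastic hill-climbing) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the additive per-position affinity table, the fixed threshold, the equal weighting of both error types and all names (`predict_cuts`, `goal`, `hill_climb`) are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def predict_cuts(sequence, weights, threshold, k=4, m=4):
    """Cut after residue `site` (1-based) if the summed affinities exceed threshold."""
    cuts = set()
    for site in range(1, len(sequence)):
        score = 0.0
        for offset in range(-k, m):      # window positions Pk..P1, P1'..Pm'
            idx = site + offset          # 0-based index; offset -1 is P1
            if 0 <= idx < len(sequence):
                score += weights[offset + k].get(sequence[idx], 0.0)
        if score > threshold:
            cuts.add(site)
    return cuts


def goal(predicted, observed):
    """Count missing plus superfluous cuts (equal weighting is an assumption)."""
    return len(observed - predicted) + len(predicted - observed)


def hill_climb(sequence, observed, k=4, m=4, steps=2000, step_size=0.5, seed=0):
    """Stochastic hill-climbing: perturb one affinity, keep it if not worse."""
    rng = random.Random(seed)
    weights = [{aa: 0.0 for aa in AMINO_ACIDS} for _ in range(k + m)]
    threshold = 0.0
    best = goal(predict_cuts(sequence, weights, threshold, k, m), observed)
    for _ in range(steps):
        if best == 0:
            break
        pos, aa = rng.randrange(k + m), rng.choice(AMINO_ACIDS)
        old = weights[pos][aa]
        weights[pos][aa] = old + rng.uniform(-step_size, step_size)
        score = goal(predict_cuts(sequence, weights, threshold, k, m), observed)
        if score <= best:
            best = score                 # accept improvements and neutral moves
        else:
            weights[pos][aa] = old       # reject and restore the old value
    return weights, threshold, best
```

Accepting neutral moves lets the search drift across plateaus of the goal function; a stricter variant would accept only improvements.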

Acknowledgements

This work was supported by grants awarded by the Deutsche Forschungsgemeinschaft (Leibnizprogramm to H.-G.R. (Ra369/4–1); Schi 301/2–1 and Sonderforschungsbereich 510, C1 to H.S.) and the European Union (Biotech 95–1627). We thank Lynne Yakes for critically reading the manuscript.

References (43)

  • A.F. Kisselev et al. Proteasome active sites allosterically regulate each other, suggesting a cyclical bite-chew mechanism for protein breakdown. Mol. Cell (1999)
  • G. Niedermann et al. Contribution of proteasome-mediated proteolysis to the hierarchy of epitopes presented by major histocompatibility complex class I molecules. Immunity (1995)
  • P. Paz et al. Discrete proteolytic intermediates in the MHC class I antigen processing pathway and MHC I-dependent peptide trimming in the ER. Immunity (1999)
  • I. Schechter et al. On the size of the active site in proteases. I. Papain. Biochem. Biophys. Res. Commun. (1967)
  • N. Shimbara et al. Contribution of proline residue for efficient production of MHC class I ligands by proteasomes. J. Biol. Chem. (1998)
  • T.B. Thompson et al. Neural network prediction of the HIV-1 protease cleavage sites. J. Theoret. Biol. (1995)
  • S. Uebel et al. Specificity of the proteasome and the TAP transporter. Curr. Opin. Immunol. (1999)
  • J.Y. Ahn et al. Primary structures of two homologous subunits of PA28, a gamma-interferon-inducible protein activator of the 20 S proteasome. FEBS Letters (1995)
  • C.S. Arendt et al. Identification of the yeast 20 S proteasome catalytic centers and subunit interactions required for active-site formation. Proc. Natl Acad. Sci. USA (1997)
  • B. Boes et al. Interferon gamma stimulation modulates the proteolytic activity and cleavage site preference of 20 S mouse proteasomes. J. Exp. Med. (1994)
  • K.C. Chou et al. Predicting human immunodeficiency virus protease cleavage sites in proteins by a discriminant function method. Proteins: Struct. Funct. Genet. (1996)
1 Edited by R. Huber

2 These authors contributed equally to the work.

3 Present address: T. P. Dick, Section of Immunobiology, Howard Hughes Medical Institute, Yale University School of Medicine, 310 Cedar Street, PO Box 208011, New Haven, CT 06520-8011, USA.
