Conformational and oligomeric states of SPOP from small-angle X-ray scattering and molecular dynamics simulations

Version of Record

Accepted for publication after peer review and revision.


Version of Record published: March 9, 2023
Accepted Manuscript published: March 1, 2023
Accepted: February 20, 2023
Received: October 12, 2022
Preprint posted: October 8, 2022

Abstract

Speckle-type POZ protein (SPOP) is a substrate adaptor in the ubiquitin proteasome system, and plays important roles in cell-cycle control, development, and cancer pathogenesis. SPOP forms linear higher-order oligomers following an isodesmic self-association model. Oligomerization is essential for SPOP’s multivalent interactions with substrates, which facilitate phase separation and localization to biomolecular condensates. Structural characterization of SPOP in its oligomeric state and in solution is, however, challenging due to the inherent conformational and compositional heterogeneity of the oligomeric species. Here, we develop an approach to simultaneously and self-consistently characterize the conformational ensemble and the distribution of oligomeric states of SPOP by combining small-angle X-ray scattering (SAXS) and molecular dynamics (MD) simulations. We build initial conformational ensembles of SPOP oligomers using coarse-grained molecular dynamics simulations, and use a Bayesian/maximum entropy approach to refine the ensembles, along with the distribution of oligomeric states, against a concentration series of SAXS experiments. Our results suggest that SPOP oligomers behave as rigid, helical structures in solution, and that a flexible linker region allows SPOP’s substrate-binding domains to extend away from the core of the oligomers. Additionally, our results are in good agreement with previous characterization of the isodesmic self-association of SPOP. In the future, the approach presented here can be extended to other systems to simultaneously characterize structural heterogeneity and self-assembly.

Editor's evaluation

In this important paper, the authors have developed an approach for simultaneously optimizing the conformational ensemble and degrees of oligomerization, and this has been tested by applying it to a specific protein (SPOP). Comparison of the quality of fits with different models also provides valuable insights into structural features important to the assembly of oligomers. The approach, presented with compelling experimental support, is potentially applicable to other systems as well.

https://doi.org/10.7554/eLife.84147.sa0

Introduction

Protein self-association is fundamental for many processes in biology (Ali and Imperiali, 2005; Marsh and Teichmann, 2015), and it has been estimated that around half of all proteins form dimers or higher-order complexes (Lynch, 2012). One such protein is Speckle-type POZ protein (SPOP), a substrate adaptor in the ubiquitin proteasome system, which recruits substrates for the Cullin3-RING ubiquitin ligase (CRL3) (Hernández-Muñoz et al., 2005; Kent et al., 2006; Kwon et al., 2006). SPOP targets a range of substrates for degradation, including proteins involved in hormonal signalling, epigenetic modification, and cell-cycle control, such as the androgen receptor (AR) (An et al., 2014) and death-associated protein 6 (DAXX) (Kwon et al., 2006; Cuneo and Mittag, 2019). SPOP is thus an important regulator of cellular signalling, and mutations in SPOP are associated with a variety of cancers (Le Gallo et al., 2012; Kim et al., 2013; Cuneo and Mittag, 2019).

The 374-residue SPOP monomer consists of three domains. From N- to C-terminus, these are the MATH domain (i.e., the meprin and TRAF-C homology domain), the BTB domain (i.e. the broad-complex, tramtrack, and bric-a-brac domain), and the BACK domain (i.e. the BTB and C-terminal Kelch domain). MATH is the substrate binding domain, while the BTB domain mediates interaction with CRL3 (Zhuang et al., 2009; Bosu and Kipreos, 2008). Both the BTB and BACK domains can homodimerize, resulting in the formation of polydisperse, linear higher-order SPOP oligomers with alternating BTB-BTB and BACK-BACK interfaces (Errington et al., 2012; van Geersdaele et al., 2013; Marzahn et al., 2016). The BTB-mediated dimer is formed with nanomolar affinity, and this dimer acts as the unit of higher-order oligomerization, which occurs through micromolar affinity BACK dimerization. Thus, only even-numbered SPOP oligomers are substantially populated (Marzahn et al., 2016; Figure 1).

Figure 1 (with 1 supplement)

SPOP forms higher-order oligomers through isodesmic self-association.

(a) The SPOP BTB-BTB homodimer forms with nanomolar affinity, and is the unit of higher-order oligomerization through BACK-BACK homodimerization. Higher-order SPOP oligomerization follows an isodesmic model, where the equilibrium between oligomers $i$ and $i+1$ is described by a single equilibrium constant, $K_{D,isodesmic}$, which is independent of oligomer size. (b) Crystal structures of homodimers of the BACK (left, PDB: 4HS2) and MATH-BTB (right, PDB: 3HQI) domains of SPOP. Below, the structure of a SPOP^28–359 dimer constructed based on crystal structures. The cartoon model is overlaid with the coarse-grained representation used in the Martini simulations. The BACK domains of the two neighbouring subunits in the oligomer are also shown (without Martini bead overlay). (c) Left: Populations of SPOP oligomers given by the isodesmic model with $K_{D,isodesmic}$ = 1.6 µM, determined from CG-MALS, for the protein concentrations used in our SAXS experiments. Note the logarithmic scale. Right: Relative contribution of each oligomer to the average SAXS signal given by the populations in the left panel. (d) Structure of a SPOP^28–359 60-mer constructed based on structures in panel b. MATH domains are coloured orange and BTB/BACK domains are coloured blue in all panels.

Chemical crosslinking experiments have shown that SPOP oligomers form inside cells (Marzahn et al., 2016), and analysis of SPOP homologues shows sequence co-variation across both the BTB-BTB and BACK-BACK interfaces (Bouchard et al., 2018), together suggesting that self-association has physiological relevance. By presenting multiple MATH domains for substrate binding, SPOP oligomers can simultaneously bind to multiple low-affinity binding motifs in a single substrate, resulting in an overall increased affinity through avidity effects (Pierce et al., 2016). The longer lifetimes of these complexes enable effective polyubiquitination (Pierce et al., 2016). This suggests that tuning SPOP’s oligomerization state could act as a mechanism to regulate substrate degradation (Errington et al., 2012). SPOP oligomerization is also involved in phase separation. SPOP localizes to nuclear speckles in cells (Marzahn et al., 2016), and upon overexpression of certain substrates, SPOP and substrate co-localize to condensates which recruit CRL3 and display active substrate ubiquitination (Bouchard et al., 2018). This process requires both SPOP oligomerization and substrate binding (Marzahn et al., 2016; Bouchard et al., 2018), and it has been proposed that SPOP oligomers function as scaffolds that enable binding of substrates both within and between oligomers, resulting in filament-formation at low substrate concentrations and condensate formation at higher substrate concentrations (Bouchard et al., 2018; Schmit et al., 2020).

The higher-order self-association of SPOP follows the isodesmic model (Marzahn et al., 2016), in which the equilibrium between oligomers $i$ and $i+1$ is described by a single equilibrium constant independently of oligomer size (Oosawa and Kasai, 1962). In the case of SPOP, the BTB-mediated dimer acts as the protomer of higher-order self-association, and the isodesmic $K_{D}$ thus describes BACK-BACK self-association. The isodesmic model can be used to calculate the equilibrium concentration of every oligomeric species as a function of the total protomer concentration (Figure 1a and c). For SPOP, a low micromolar isodesmic $K_{D}$ has been determined from composition gradient multi-angle light scattering experiments (CG-MALS; Marzahn et al., 2016). While these insights describe the heterogeneity in oligomer sizes, the conformational heterogeneity of the higher-order oligomers has not been characterized. Previous work revealed that constitutive SPOP dimers, created via deletion of the BACK domain, have considerable conformational heterogeneity in the position of their MATH domains. The MATH domains are seen docked onto the BTB dimer in the crystal structure, but small-angle X-ray scattering (SAXS) experiments showed that they could undock from the BTB domains, enabled by a long flexible linker (Zhuang et al., 2009). This may enable binding of multivalent substrates with different spacing between SPOP binding motifs. Whether this conformational flexibility also exists in higher-order SPOP oligomers is unclear.
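The isodesmic populations described above can be illustrated with a short Python sketch. It assumes the standard textbook form of the model, in which an oligomer of $i$ protomers has concentration $K_{D} x^{i}$ with $x = [P_{1}]/K_{D}$, and the total protomer concentration is $c_{tot} = K_{D} x/(1-x)^{2}$. The function name and arguments are hypothetical, and this is not the authors' implementation. For SPOP the protomer is the BTB-mediated dimer, so a protomer count of $i$ corresponds to a $2i$-mer.

```python
import numpy as np

def isodesmic_concentrations(c_total, k_d, n_max):
    """Molar concentrations of oligomers containing 1..n_max protomers.

    Assumes the textbook isodesmic model: [P_i] = K_D * x**i, where
    x = [P_1] / K_D solves c_total / K_D = x / (1 - x)**2.
    c_total and k_d must be given in the same units.
    """
    load = c_total / k_d
    # Root of load * (1 - x)**2 = x that lies in [0, 1), written in a
    # form that avoids cancellation at low concentration.
    x = 2.0 * load / ((2.0 * load + 1.0) + np.sqrt(4.0 * load + 1.0))
    sizes = np.arange(1, n_max + 1)
    return sizes, k_d * x ** sizes
```

Summing `sizes * concentrations` over a sufficiently large `n_max` recovers `c_total`, which is a convenient sanity check on the chosen cutoff.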

Here, we aimed to determine simultaneously both the distribution of oligomeric states of SPOP and the conformational ensemble of each SPOP oligomer by combining SAXS experiments and MD simulations. SAXS can provide low-resolution information on protein structure in solution, but reports on an ensemble average, which in the case of SPOP is an average over both the different oligomeric states and the structural heterogeneity within each oligomeric state. Therefore, SAXS experiments are often combined with MD simulations to provide a full structural model of the system (Thomasen and Lindorff-Larsen, 2022). In the case of polydisperse systems, it is sometimes possible to deconvolute the information into contributions from a small number of individual species and analyse these individually (Herranz-Trillo et al., 2017; Meisburger et al., 2021). We took a different approach and aimed to explicitly model every relevant configuration of SPOP in its range of oligomeric states along with the associated thermodynamic weight of each configuration. We collected SAXS data on SPOP at a range of protein concentrations and constructed initial conformational ensembles of every substantially populated oligomeric state using coarse-grained MD simulations. We then developed an approach to simultaneously and self-consistently optimize the distribution of oligomeric states, given by the isodesmic model (Oosawa and Kasai, 1962; Shemesh et al., 2021), and refine the conformational ensemble of each oligomer against the SAXS data using Bayesian/maximum entropy (BME) reweighting (Bottaro et al., 2020; Figure 2). Our results show that SPOP forms rigid, helical oligomers in solution, and that the linker connecting the MATH and BTB domains is likely flexible, allowing for repositioning of the MATH domains during substrate binding. Our results also provide further evidence that SPOP self-association follows the isodesmic model, and we find an isodesmic $K_{D}$ in good agreement with the previously determined value (Marzahn et al., 2016). Using SAXS experiments on a cancer variant of SPOP, we also show how our approach can be used to determine changes in the level of self-association.

Figure 2

Overview of the self-consistent approach used to fit conformational ensembles of SPOP oligomers to SAXS data.

Small-angle X-ray scattering (SAXS) data on SPOP represents an average over a range of oligomeric species present in solution. Here, the distribution of oligomeric species and the conformational ensemble of each oligomer were self-consistently fitted to a concentration series of SAXS data by iteratively fitting the scale and constant background of the SAXS data and the isodesmic $K_{D}$ , followed by reweighting of the conformational ensemble of each oligomer.

Results

We collected a concentration series of SAXS data on a previously used truncated version of SPOP, SPOP^28–359 (full length is 374 residues), with total protein concentrations ranging from 5 to 40 µM. In order to build structural models to refine against the SAXS data, we first needed to decide which oligomeric species to include in our modelling. To this end, we used the isodesmic self-association model, which has previously been shown to describe SPOP oligomerization well (Marzahn et al., 2016). We fitted previously measured CG-MALS data (Marzahn et al., 2016) to obtain an isodesmic $K_{D}$ of 1.6±0.3 µM (Figure 1—figure supplement 1). Based on the isodesmic model fitted to the CG-MALS data, the population of oligomers larger than ~30-mers should be very low in the concentration range used in our SAXS experiments. However, as scattering intensity is proportional to particle size squared, larger oligomers make a considerable contribution to the SAXS signal despite their low concentrations (Figure 1c). Given the concentrations from the isodesmic model and taking into account the increased scattering of larger oligomers, we decided that constructing models of oligomers up to the 60-mer should be sufficient to capture all substantial contributions to the SAXS data.
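The size-squared weighting can be made concrete with a minimal Python sketch. It assumes, as stated above, that each oligomer contributes to the averaged signal in proportion to its molar concentration times its size squared; the function name is hypothetical and the example numbers are illustrative, not measured values.

```python
import numpy as np

def saxs_signal_fractions(concentrations, sizes):
    """Fraction of the averaged scattering signal attributed to each species.

    Assumes intensity scales as molar concentration * size**2.
    """
    concentrations = np.asarray(concentrations, dtype=float)
    sizes = np.asarray(sizes, dtype=float)
    weights = concentrations * sizes ** 2
    return weights / weights.sum()
```

For example, a species that is ten times larger but ten times less concentrated than another still accounts for 10/11 of the signal under this assumption, which is why rare large oligomers cannot be neglected when choosing the size cutoff.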

There are no crystal structures of SPOP^28–359 available, so we constructed a model of the SPOP^28–359 BTB-dimer using the crystal structure of the isolated BACK domain (4HS2) (van Geersdaele et al., 2013) and the crystal structure of a truncated construct containing only the MATH and BTB domains (3HQI) (Zhuang et al., 2009; Figure 1b). We used this model of the BTB-dimer to construct SPOP^28–359 oligomers, which we used as starting structures for MD simulations. We ran 60 µs MD simulations of oligomers ranging from the dimer to the dodecamer; we used a coarse-grained representation of SPOP modelled using the Martini 3 force field (Souza et al., 2021) further modified by increasing protein-water interactions by 6% (Thomasen et al., 2022). It would be computationally prohibitive to run simulations of large oligomers up to the 60-mer. Instead, we relied on the assumption that the dodecamer behaves similarly to a segment of an arbitrarily long oligomer, and constructed conformational ensembles of oligomers up to the 60-mer by joining together conformers from the simulations of the dodecamer at the BACK-BACK interface (Figure 1d).

We calculated SAXS intensities from our conformational ensembles and, given the relative population of each oligomer from the isodesmic model with $K_{D}$ = 1.6 µM, determined from CG-MALS, we calculated SAXS profiles averaged over all the oligomeric species. We found that the SAXS profiles calculated in this way, from the ensembles generated by MD simulations weighted by the isodesmic populations, were in good agreement with the experimental SAXS data, giving a reduced $χ^{2}$ to the concentration series of SAXS data ( $χ_{r, global}^{2}$ ) of 1.53 (Figure 3). Despite the overall good agreement, the residuals revealed some systematic deviations from the experimental SAXS profiles. These deviations could potentially arise from inaccuracies in the distribution of oligomeric states given by the isodesmic model, from inaccuracies in the modelled conformational ensembles, or from both. As a first step, we wanted to see if we could eliminate the deviations by only tuning the distribution of oligomeric states. We globally optimized the $K_{D}$ of the isodesmic model against the concentration series of SAXS data, which gave $K_{D}$ = 0.9±0.4 µM, in good agreement with $K_{D}$ = 1.6±0.3 µM determined from CG-MALS, and resulted in a $χ_{r, global}^{2}$ of 1.24 to the SAXS data (Figure 3). However, this still did not fully eliminate the systematic deviations from the experimental SAXS profiles.
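As an illustration of how agreement with a single SAXS profile can be scored, the sketch below fits a scale factor and a constant background by weighted linear least squares and returns a reduced $χ^{2}$. This is a sketch under stated assumptions rather than the authors' code: the function name is hypothetical, and normalizing by the number of points minus the two fitted parameters is one common convention that may differ from the one used here.

```python
import numpy as np

def fit_scale_offset_chi2(i_calc, i_exp, sigma):
    """Fit i_exp ~ a * i_calc + b and return (a, b, reduced chi-square).

    Weighted linear least squares with weights 1 / sigma. The reduced
    chi-square divides by (number of points - 2); this normalization
    is an assumption of the sketch.
    """
    i_calc = np.asarray(i_calc, dtype=float)
    i_exp = np.asarray(i_exp, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    w = 1.0 / sigma
    # Columns: calculated intensity (scale) and ones (constant background).
    design = np.column_stack([i_calc, np.ones_like(i_calc)]) * w[:, None]
    (a, b), *_ = np.linalg.lstsq(design, i_exp * w, rcond=None)
    residuals = (a * i_calc + b - i_exp) / sigma
    chi2_r = np.sum(residuals ** 2) / (len(i_exp) - 2)
    return a, b, chi2_r
```

A global score over a concentration series could then be built by combining the per-profile residuals, with the isodesmic $K_{D}$ varied in an outer optimization loop.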

Figure 3 (with 9 supplements)

Refining oligomer populations and conformational ensembles against SAXS data.

(a) Relative populations of oligomers for the protein concentrations used in SAXS experiments. Note the logarithmic scale. Populations are given by the isodesmic model with the $K_{D}$ noted above the plot, which is either (1) previously determined by CG-MALS or (2–3) fitted globally to the SAXS data in panel b. $χ_{r, global}^{2}$ quantifies the agreement with SAXS data in panel b for the three scenarios. (b) Agreement between experimental SAXS data and averaged SAXS data calculated from conformational ensembles of SPOP oligomers with populations given by the isodesmic model (as shown in panel a). SAXS profiles are shown for three different scenarios: (1) calculated from the conformational ensembles generated by MD simulations with the isodesmic $K_{D}$ previously determined with CG-MALS, (2) calculated from the conformational ensembles generated by MD simulations with the isodesmic $K_{D}$ fitted to the SAXS data, and (3) calculated from conformational ensembles refined against the SAXS data using Bayesian/MaxEnt reweighting, and with the isodesmic $K_{D}$ self-consistently fitted to the SAXS data. Error-normalized residuals are shown below the SAXS profiles and $χ_{r}^{2}$ to each SAXS profile is shown on the plot.

To improve the agreement with the experimental SAXS data further, we aimed to simultaneously refine the conformational ensemble of each oligomer and optimize the distribution of oligomeric states. We developed a self-consistent optimization scheme, in which the isodesmic $K_{D}$ is optimized globally to the entire concentration series of SAXS data followed by reweighting of the conformations of each oligomer against a SAXS profile deconvoluted from the experimental SAXS data (Figure 2; see Methods section for details). To reweight the ensembles, we used BME reweighting, in which the population weights of the conformational ensemble are minimally perturbed with respect to the prior ensemble (generated by the MD simulations) to improve the agreement with a set of experimental data (Bottaro et al., 2020). This approach resulted in excellent agreement with the experimental SAXS data, giving a $χ_{r, global}^{2}$ of 0.69, while only small deviations remained (Figure 3). The isodesmic $K_{D}$ was fitted to 1.3±0.5 µM, and thus also remained in good agreement with the previously determined value (Marzahn et al., 2016). To validate our approach and to examine the possibility of overfitting, we left out one SAXS profile recorded with 15 µM protein from the optimization. The optimized $K_{D}$ and ensemble weights did not substantially affect the fit to this SAXS profile, suggesting that we had avoided overfitting (Figure 3—figure supplement 1). These results show that SAXS data on SPOP can be explained well by conformational ensembles of linear oligomers with populations given by the isodesmic model, and thus provide further evidence that SPOP self-association follows a simple isodesmic mechanism (Marzahn et al., 2016).
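For orientation, the quantity minimized in BME reweighting is usually written as below (Bottaro et al., 2020). The symbols $m$, $θ$, and $w_{j}^{0}$ are introduced here for illustration, and the exact form used in this work is the one given in the Methods.

$$\mathcal{L}(w_{1},\dots,w_{n}) = \frac{m}{2}\,χ_{r}^{2}(w_{1},\dots,w_{n}) - θ\,S_{rel}(w_{1},\dots,w_{n}), \qquad S_{rel} = -\sum_{j=1}^{n} w_{j}\,\log\frac{w_{j}}{w_{j}^{0}}$$

Here $w_{j}$ are the refined weights of the $n$ conformers, $w_{j}^{0}$ are the prior weights from the MD simulations, $m$ is the number of experimental data points, and $θ$ sets the balance between fitting the data and staying close to the prior. A large $θ$ keeps the weights near the prior, while a small $θ$ allows larger changes in exchange for a lower $χ_{r}^{2}$.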

The previously published CG-MALS data on SPOP clearly precludes a simple dimer–tetramer, dimer–hexamer, dimer–octamer, or dimer–decamer equilibrium in favor of an isodesmic self-association model (Marzahn et al., 2016). To determine whether the SAXS data also favors the isodesmic model, we used our conformational ensembles to examine whether monodisperse oligomers, ranging in size between an octamer and 60-mer, as well as corresponding dimer-oligomer equilibria, would be compatible with the SAXS concentration series (Figure 3—figure supplement 2). For each dimer–oligomer equilibrium, we fitted the $K_{D}$ globally to the SAXS data. Thus, the isodesmic model and dimer–oligomer models are of comparable complexity, with only a single free parameter. The results show that the SAXS concentration series is in better agreement with an isodesmic distribution of oligomers than with any of the tested single oligomers or dimer–oligomer equilibria.

We also wished to examine whether the conformational ensembles of SPOP generated by the MD simulations described the SAXS data better than static structures. As a first comparison, we calculated SAXS profiles from the initial SPOP oligomer structures constructed based on crystal structures. To make the results comparable with our optimized ensembles, we fitted the isodesmic $K_{D}$ to the SAXS data for the static structures. This resulted in a worse agreement with the SAXS data ( $χ_{r, global}^{2}$ =4.03 with $K_{D}$ =0.43 µM) than what we obtained using the ensembles both before and after reweighting. We also investigated the agreement with the SAXS data for individual structures drawn from the ensembles of the oligomers, again fitting the isodesmic $K_{D}$ for each set of structures. We found that some of the single structures from the ensembles could fit the SAXS data as well as the entire ensembles before reweighting, but no set of static structures fit the SAXS data as well as the reweighted ensembles (Figure 3—figure supplement 3). This result highlights that, while an ensemble of multiple conformers is likely necessary to produce the best agreement with the SAXS data, SPOP oligomers have a relatively rigid structure overall, allowing for reasonable agreement with the SAXS data without modelling the conformational heterogeneity for each oligomer. The improvement in agreement with the SAXS data over the starting structures, also for individual conformers, suggests that the MD simulations contribute, not only by modelling the conformational heterogeneity, but also by simply relaxing the structure to a more accurate state.

The results described above show that the SAXS data fit well to an isodesmic model with a $K_{D}$ value close to that determined from CG-MALS. We wished to validate our approach further by comparing the SAXS-derived model of self-association with the CG-MALS data more directly. We therefore calculated the average molecular weight given by the isodesmic model with the $K_{D}$ of 1.3 µM that we obtained by fitting to the SAXS data and compared the results with the CG-MALS data (Figure 3—figure supplement 4). This analysis confirmed that the model of self-association derived from our analysis of the SAXS data is fully consistent with the independently measured CG-MALS data.
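The comparison with light scattering can be sketched in a few lines of Python, assuming that the quantity compared is the weight-average molar mass and that the populations follow the textbook isodesmic model, for which the sum has the closed form $M_{w} = M_{1}(1+x)/(1-x)$ with $x = [P_{1}]/K_{D}$. The function name and arguments are hypothetical, and the protomer mass must be supplied by the user.

```python
import numpy as np

def weight_average_mass(c_total, k_d, protomer_mass):
    """Weight-average molar mass for an isodesmic distribution.

    Uses M_w = M_1 * sum(i**2 * c_i) / sum(i * c_i) with c_i = K_D * x**i,
    which simplifies to M_1 * (1 + x) / (1 - x).
    c_total and k_d must be in the same units.
    """
    load = c_total / k_d
    # x = [P_1] / K_D, the root of load * (1 - x)**2 = x in [0, 1).
    x = 2.0 * load / ((2.0 * load + 1.0) + np.sqrt(4.0 * load + 1.0))
    return protomer_mass * (1.0 + x) / (1.0 - x)
```

As a check on the formula, a total protomer concentration of twice the $K_{D}$ gives $x = 0.5$ and therefore a weight-average mass of three protomer masses.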

Having generated a conformational ensemble of each SPOP oligomer in agreement with the SAXS data, we proceeded to analyze the structures. Reweighting resulted in an increase in the radius of gyration ( $R_{g}$ ) for almost all oligomeric species, suggesting that slightly more expanded conformations than those sampled with our modified version of Martini are more consistent with the SAXS data (Figure 4a–b and Figure 4—figure supplement 1). This expansion can be attributed both to a slight increase in the end-to-end distance for most oligomers (Figure 4c–d and Figure 4—figure supplement 2), as well as a slight increase in the average distance between the MATH and BTB/BACK domains for all oligomers upon reweighting (Figure 4e–f).

Figure 4 (with 3 supplements)

SPOP forms rigid, linear oligomers with flexible MATH domains in solution.

(a) Probability distribution of the radius of gyration ( $R_{g}$ ) calculated from ensembles of six representative SPOP oligomers before and after reweighting (see Figure 4—figure supplement 1 for $R_{g}$ distributions for all oligomers). Dashed lines show the average values. (b) The fold-change in average $R_{g}$ after reweighting for all SPOP oligomers. (c) The average end-to-end distance calculated from ensembles of SPOP oligomers before and after reweighting (see Figure 4—figure supplement 2 for distributions for all oligomers). Solid lines show the fit of a power law: $R_{E-E} = R_{0} N^{ν}$ , where $R_{E-E}$ is the average end-to-end distance, $R_{0}$ is the subunit segment size, $N$ is the number of subunits in the oligomer, and $ν$ is a scaling exponent. The fit gave $R_{0}$ = 3.16 nm, $ν$ = 0.99 before reweighting and $R_{0}$ = 3.11 nm, $ν$ = 0.99 after reweighting. (d) The fold-change in average end-to-end distance after reweighting for all SPOP oligomers. (e) Normalized histogram of distances between the center-of-mass (COM) of the MATH domain and the COM of the BTB/BACK domains in the same subunit before and after reweighting. The histogram contains the distances from every conformation of every subunit in every oligomer. (f) The fold-change in average MATH-BTB/BACK COM distance after reweighting for all SPOP oligomers. (g) Normalized histogram of COM distances between MATH substrate binding sites in neighbouring subunits (blue and red). The histogram contains the distances from every conformation of every subunit in every oligomer. In black, distances between neighbouring SPOP binding sites in seven SPOP substrate IDRs calculated from CALVADOS simulations. (h) The fold-change in average COM distance between neighbouring MATH substrate binding sites after reweighting for all SPOP oligomers. (i) Overlay of conformational ensembles corresponding to the three populations in panel e. The structures are from all non-terminal subunits of the SPOP dodecamer and are superposed on the BTB/BACK domains. (j) Overlay of 151 randomly selected frames from the conformational ensemble of the SPOP 60-mer with atoms represented as spheres. Structures were superposed on the BTB/BACK domains in the four middle subunits. MATH domains are shown in orange and BTB/BACK domains are shown in blue.

In order to investigate the global flexibility and compaction of SPOP oligomers, we fitted a power law to the average end-to-end distance ( $R_{E-E}$ ) as a function of the number of subunits in the oligomer ( $N$ ), $R_{E-E} = R_{0} N^{ν}$ , where $R_{0}$ is the subunit segment size and $ν$ is a scaling exponent (Figure 4c). The fit gave $R_{0}$ ≈ 3.1 nm and $ν$ = 0.99, showing a linear growth of the end-to-end distance with the number of subunits. This result is consistent with no significant curvature or compaction of the oligomers and, along with the narrow distribution of end-to-end distances for each oligomer (Figure 4—figure supplement 2), suggests that the SAXS data is compatible with a distribution of straight and relatively rigid SPOP oligomers, at least on length scales up to the ~180 nm of the 60-mer. The helical structure of larger oligomers, with ~16 subunits per turn, is evident as small periodic deviations from the fit (Figure 4c).
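A power law of this form can be fitted by linear regression in log-log space, as in the minimal Python sketch below. The function name is hypothetical, and the authors may have used a different fitting procedure (for example non-linear least squares), so this only illustrates the relationship $R_{E-E} = R_{0} N^{ν}$.

```python
import numpy as np

def fit_power_law(n_subunits, distances):
    """Fit distances = r0 * n_subunits**nu by a straight line in log-log space.

    Returns (r0, nu). Taking logarithms gives
    log(distance) = log(r0) + nu * log(n), so nu is the slope.
    """
    log_n = np.log(np.asarray(n_subunits, dtype=float))
    log_r = np.log(np.asarray(distances, dtype=float))
    nu, log_r0 = np.polyfit(log_n, log_r, 1)
    return np.exp(log_r0), nu
```

Inserting the values reported after reweighting, $3.11 \times 60^{0.99}$ gives about 179 nm, consistent with the ~180 nm quoted for the 60-mer.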

The MATH and BTB domains are connected through a ~20 residue long linker region (Figure 1b). We hypothesized that this linker may be flexible, allowing for reconfiguration of the MATH domains with respect to the crystal structure (Zhuang et al., 2009). We calculated the distances between the center-of-mass (COM) of the MATH domain and the COM of the BTB/BACK domains in the ensembles for every subunit of every oligomer. The distribution of these MATH-BTB/BACK distances reveals two populations overlapping with the two crystal structure configurations (Figures 4e, 4i, and 5c), where the MATH domains are in close proximity to the BTB/BACK domains (Zhuang et al., 2009). However, there is also a third population in which the MATH domains are extended away from the BTB/BACK domains, suggesting that the MATH-BTB linker is flexible and allows for movement out of the configurations observed in the crystal structure. Reweighting slightly increased the population of this extended state (Figure 4e–f). This flexibility in the configuration of the MATH domains gives rise to a broad distribution of distances between the substrate binding sites in neighbouring MATH domains, which is also slightly increased upon reweighting for all oligomers (Figure 4g–h). Both the overall rigidity of the oligomers and the flexibility of the MATH domains are also evident from visual inspection of the conformational ensemble of the 60-mer (Figure 4j).

Figure 5 (with 6 supplements)

Unrestrained MATH domains give better agreement with SAXS data.

Comparison of conformational ensembles with MATH domains either unrestrained (blue) or restrained to BTB/BACK domains based on the configuration in the crystal structure (orange). (a) Relative populations of oligomers for the protein concentrations used in SAXS experiments. Note the logarithmic scale. Populations are given by the isodesmic model with the $K_{D}$ noted above the plot. $K_{D}$ was fitted globally to the SAXS data in panel (b). $χ_{r, global}^{2}$ quantifies the agreement with SAXS data in panel b for the two setups. (b) Agreement between experimental SAXS data and averaged SAXS data calculated from conformational ensembles of SPOP oligomers generated with the two setups. Oligomer populations are given by the isodesmic model (as shown in panel a). Error-normalized residuals are shown below the SAXS profiles and $χ_{r}^{2}$ to each SAXS profile is shown on the plot. (c) Histogram of center-of-mass distances between MATH and BTB/BACK domains in the same subunit calculated from all conformations of all subunits of all oligomers. Average values are shown as dashed lines.

To examine further whether the SAXS data support the observed flexibility of the MATH-BTB linker and repositioning of the MATH domains, we generated new ensembles of SPOP oligomers following the same protocol as above, but this time restraining the MATH domains to the BTB-BACK domains based on the configuration in the crystal structure using the elastic network model implemented in Martini. We calculated SAXS data from the generated ensembles and again fitted the isodesmic $K_{D}$ globally to the SAXS data, resulting in $K_{D}$ =0.2±0.2 µM. The agreement with the SAXS data was substantially worse than for the original ensembles with the MATH domains unrestrained ( $χ_{r, global}^{2}$ =4.38 and $χ_{r, global}^{2}$ =1.24 respectively), and the systematic deviations from the experimental SAXS profiles were clearly exacerbated (Figure 5). These results suggest that, first, the resolution of the SAXS data is high enough to distinguish between different configurations of the MATH domains and, second, that the SAXS data are indeed in better agreement with a model where the MATH-BTB linker is flexible. Taken together, our results support a model where, in solution, SPOP oligomers behave as rigid, helical structures with flexible MATH domains that can extend away from the BTB/BACK domains.

The comparison between the conformational ensembles generated with the MATH domains free or restrained also suggests that accurate conformational ensembles are necessary for accurate determination of the isodesmic $K_{D}$ . Fitting the SAXS data using ensembles with the MATH domains restrained resulted in a lower isodesmic $K_{D}$ , and calculating the agreement with SAXS for a range of isodesmic $K_{D}$ values revealed that there was no clear minimum in the $χ_{r, global}^{2}$ for $K_{D}$ values > 0, which is also reflected in the large error range for the fitted $K_{D}$ (Figure 5—figure supplement 1). In line with this observation, validation with SAXS data at 15 µM protein revealed that improving the accuracy of the ensembles by reweighting also improved the accuracy of the fitted isodesmic $K_{D}$ independently of the fitted ensemble weights (Figure 3—figure supplement 1d).

To explore further how changes in the conformational ensemble would affect the agreement with the SAXS data, we used subsampling to generate ensembles with specific properties based on the ensembles with unrestrained MATH domains. To keep the comparison of ensembles unbiased by our previous fitting to the SAXS data, we used the isodesmic $K_{D}$ =1.6 µM from CG-MALS for all comparisons with SAXS. First, we selected frames with lower average MATH-BTB/BACK COM distances (Figure 5—figure supplement 3). In line with the results from simulations with restrained MATH domains, this worsened the agreement with the SAXS data. In contrast, selecting frames with higher average MATH-BTB/BACK COM distance slightly improved the agreement with the SAXS data, in line with the results from reweighting (Figure 5—figure supplement 4). We also wished to test how sensitive the agreement with the SAXS data was to the overall shape of the oligomers. However, the conformational space that we could explore by subsampling the ensembles was limited by the rigidity of the oligomers. Despite this limitation, we subsampled ensembles with slightly higher and lower end-to-end distances than the original ensembles, corresponding to oligomers that are more or less extended than the original ensembles. Again consistent with the results from reweighting, ensembles with lower end-to-end distance resulted in slightly worse agreement with the SAXS data (Figure 5—figure supplement 5), while ensembles with higher end-to-end distance did not substantially change the agreement with the SAXS data (Figure 5—figure supplement 6). This result suggests that the SAXS data is less consistent with more compact SPOP oligomers, at least within the local part of conformational space explored here.

Self-association allows SPOP to bind disordered substrates that contain multiple SPOP binding motifs through multivalent interactions (Pierce et al., 2016). We hypothesized that the spacing between MATH domains in SPOP oligomers could be related to the spacing between SPOP binding motifs in disordered substrates. To investigate this, we selected five SPOP substrates with multiple SPOP binding motifs located in IDRs (SETD2 [Zhu et al., 2017], SCAF1 [Theurillat et al., 2014], SRC3 [Li et al., 2011; Geng et al., 2013; Janouskova et al., 2017], Gli2, and Gli3 [Zhang et al., 2006; Zhang et al., 2009]) and ran coarse-grained simulations of their IDRs (seven IDRs in total) using CALVADOS, a one-bead-per-residue implicit solvent model that has been optimized to reproduce accurate global dimensions and transient interactions in IDPs (Tesei et al., 2021). We calculated the distances between neighbouring SPOP binding motifs in the simulations, and compared these with the distances between substrate-binding sites in neighbouring MATH domains given by our ensembles of SPOP oligomers (Figure 4g). This revealed substantial overlap between the two distributions, with a similar average distance between neighbouring binding sites in SPOP and in substrates, suggesting that the spacing of SPOP binding motifs in substrates may be evolutionarily optimized for multivalent binding to MATH domains.

Having analyzed the conformational properties of wild type SPOP and shown that the SAXS data are sensitive to the degree of self-association, we next wished to test whether our approach could capture the effects of mutations on SPOP self-association. We collected a concentration series of SAXS data on the SPOP mutant R221C, which has been identified in melanoma (Krauthammer et al., 2012) and colorectal cancer (Giannakis et al., 2016). R221C is located in the BTB-BTB interface, so we hypothesized that it may affect SPOP’s propensity to self-associate. We used the same approach as for wild type to fit the isodesmic $K_{D}$ globally to the SAXS data, but without reweighting the conformational ensembles. For R221C, the isodesmic $K_{D}$ was fitted to 8.2±2.3 µM, which resulted in a reasonable fit to the SAXS data with $χ_{global}^{2}$ =1.79 (Figure 3—figure supplement 7), suggesting that the mutation results in a decreased propensity to self-associate compared with wild type ( $K_{D}$ =0.9 µM using a comparable approach or 1.3 µM when also reweighting the ensemble). Because R221C is located at the BTB-BTB interface, the 6–9 fold increase of the isodesmic $K_{D}$ (which relates to BACK-BACK dimerization) is perhaps surprising. While a long-range effect of R221C cannot be ruled out, an alternative mechanism may involve shifting the equilibrium of the BTB-BTB dimer, thus effectively decreasing the concentration of dimeric species available for self-association.

Discussion

The ability of SPOP, a cancer-associated substrate adaptor in the ubiquitination machinery, to self-associate is important for its role in biology and disease. Characterizing the conformational ensemble of flexible and self-associating proteins such as SPOP from ensemble-averaged experiments is, however, difficult due to conformational and compositional heterogeneity. One approach is to decompose SAXS data of mixtures into contributions from individual components, which may then be analysed separately (Herranz-Trillo et al., 2017; Meisburger et al., 2021). Here, we have developed an alternative ‘forward modelling’ approach to characterize proteins that undergo polydisperse oligomerization by self-consistently and globally fitting the distribution of oligomeric species and reweighting the conformational ensembles of the oligomers against SAXS data. A similar idea has recently been applied to study the self-association of tubulin using static structures as input (Shemesh et al., 2021). We recorded a concentration series of SAXS data on SPOP, which is known to form linear higher-order oligomers, and combined MD simulations with our approach to simultaneously refine conformational ensembles of thirty oligomeric states of SPOP along with the relative populations.

Our results suggest that SPOP oligomers are rigid, helical structures in solution and that the MATH-BTB linker is flexible, allowing for the extension of MATH domains away from the oligomer core. This is consistent with SPOP’s proposed role in phase separation, as reconfiguration of the MATH domains could facilitate binding of substrates across multiple MATH domains and between different SPOP oligomers (Pierce et al., 2016; Bouchard et al., 2018). Indeed, the spacing of the MATH domains in our model of SPOP oligomers is consistent with the distances between motifs in ensembles of disordered SPOP substrates, based on coarse-grained simulations of disordered SPOP substrates. It has been suggested previously that rigidity could play an important role in the phase separation of SPOP oligomers by ensuring a low conformational entropy penalty upon stacking linear oligomers with cross-bound substrates in the dense phase (Schmit et al., 2020). This is also consistent with the rigid structural model of SPOP oligomers proposed here. Our results also provide orthogonal evidence that SPOP self-association is described well by the isodesmic model, and that the isodesmic $K_{D}$ for BACK-BACK mediated self-association is in the low micromolar range, in agreement with previous measurements by CG-MALS (Marzahn et al., 2016). We also collected SAXS data and fitted the isodesmic $K_{D}$ for the SPOP mutant R221C. Our results suggest that SPOP R221C has a six- to ninefold decreased propensity to self-associate.

While the analysis of the SAXS data presented here does not strictly exclude the possibility that SPOP forms branched or otherwise non-linear oligomers, our results show that linear oligomers based only on the self-association interfaces known from existing crystal structures are consistent with SAXS data. Thus, linear oligomers seem to be the most plausible model based on this and other existing experimental evidence, for example that removal or mutation of either the BACK-BACK or BTB-BTB interface results in abolishment of higher-order self-association and that higher-order oligomers are formed through the self-association of SPOP dimers with every step of subunit-addition populated (Marzahn et al., 2016). Finally, as shown here, linear isodesmic self-association with the same $K_{D}$ provides a good fit to both SAXS and CG-MALS data (Marzahn et al., 2016).

The approach presented here to study SPOP can be extended to other polydisperse systems to characterize the distribution of oligomeric states and their conformational properties. However, there are a few limitations to be aware of. SAXS is a low-resolution technique and may not be able to distinguish between all relevant conformations, a problem that is likely exacerbated here, as the contribution of many species to the SAXS signal may average out distinct features in the profile. One way to mitigate this problem is to construct multiple structural models, and test whether they show any difference in the agreement with the SAXS data. In the case of SPOP we used this approach to examine the flexibility of the MATH domain in SPOP^28–359.

Another limitation of the approach is the correlation between the fitted distribution of oligomeric states and the conformational properties of the oligomers. Here, we observed that a low isodesmic $K_{D}$ with large uncertainty was fitted when using more compact structures (MATH domains restrained), which suggests that the model can compensate for the underestimated dimensions of the proteins by increasing the populations of larger oligomers. Therefore, it is important to use prior conformational ensembles that are as accurate as possible. Additionally, it is important to include all the oligomeric species that make a substantial contribution to the SAXS data in the modelling. In the future, it might be relevant to include independent data reporting on the distribution of oligomeric species, such as from CG-MALS, when fitting SAXS data.

In the case of SPOP, we described the distribution of oligomers using the isodesmic self-association model, but this can be replaced by any model that describes the populations of the species in solution — with the caveat that there should not be too many free parameters to fit to the SAXS data. Similarly, the approach to generate prior conformational ensembles is not limited to MD simulations, and can be varied based on the system at hand. This flexibility in the modelling approach will make it useful to study other polydisperse systems in the future.

Methods

Protein expression and purification

The SPOP gene encoding residues 28–359 (His-SUMO-SPOP^28–359) was expressed and purified as previously described (Bouchard et al., 2018). Briefly, His-SUMO-SPOP^28–359 was transformed into BL21-RIPL cells and expressed in auto-induction media (Studier, 2005). Cells were harvested, lysed, and cell debris was pelleted by centrifugation. The clarified supernatant was applied to a gravity Ni Sepharose resin equilibrated in resuspension buffer (30 mM imidazole, 1 M NaCl, pH 7.8). After washing with wash buffer (75 mM imidazole, 1 M NaCl, pH 7.8), the protein was eluted with a buffer containing 300 mM imidazole, 1 M NaCl, pH 7.8. One milligram of TEV protease was added to the eluted protein and the reaction was left to dialyze into 20 mM Tris pH 7.8, 300 mM NaCl, and 5 mM DTT at 4 °C overnight. The cleaved protein was then further purified using a Superdex S200 size-exclusion chromatography column equilibrated with 20 mM Tris pH 7.8, 300 mM NaCl, and 5 mM DTT.

Small-angle X-ray scattering

SAXS experiments were performed at the LIX-beamline (16-ID) of the National Synchrotron Light Source II (Upton, NY) (DiFabio et al., 2016). Data were collected at a wavelength of 1.0 Å, yielding an accessible scattering angle range of 0.006 $< q <$ 3.2 Å⁻¹, where $q$ is the momentum transfer, defined as $q = 4 π \sin (θ) / λ$ , where $λ$ is the X-ray wavelength and 2θ is the scattering angle. Data with $q <$ 0.4 Å⁻¹ were used for all analyses. Prior to data collection, SPOP was dialyzed into 20 mM Tris pH 7.8, 150 mM NaCl, and 5 mM DTT. Samples were loaded into a 1 mm capillary for ten 1 s X-ray exposures. Data were reduced at the beamline using the Python package py4xs.
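As a small worked illustration of the definition of $q$ given above, the following sketch converts between the scattering angle 2θ and the momentum transfer. The function names are hypothetical and are not part of py4xs or of the beamline software; the default wavelength of 1.0 Å is taken from the text.

```python
import math


def momentum_transfer(two_theta_deg, wavelength_angstrom=1.0):
    """Return q = 4*pi*sin(theta)/lambda in 1/Angstrom.

    two_theta_deg is the full scattering angle (2*theta) in degrees.
    """
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_angstrom


def scattering_angle_deg(q, wavelength_angstrom=1.0):
    """Invert the definition: return 2*theta in degrees for a given q."""
    return 2.0 * math.degrees(math.asin(q * wavelength_angstrom / (4.0 * math.pi)))


# Scattering angle corresponding to the upper q limit used in the analyses.
two_theta_at_cutoff = scattering_angle_deg(0.4)
```

Because $\sin(θ)$ is nearly linear at small angles, $q$ is almost proportional to the scattering angle over the range used in the analyses.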

Molecular dynamics simulations with Martini

We ran coarse grained molecular dynamics simulations of six SPOP^28–359 oligomers ranging from the dimer to dodecamer (in steps of dimeric protomer subunits) using a beta version (3.0.4.17) of the Martini 3 force field (https://github.com/KULL-Centre/papers/tree/main/2020/TIA1-SAS-Larsen-et-al/Martini; Souza et al., 2021) and Gromacs 2020 (Abraham et al., 2015). We built the SPOP monomer structure using Modeller (Sali and Blundell, 1993) based on the crystal structure of the MATH and BTB domains (PDB: 3HQI) (Zhuang et al., 2009) and a crystal structure of the BACK domain (PDB: 4HS2) (van Geersdaele et al., 2013). We built the dimer structure by superposing two monomer structures to the crystal structure of the BTB-BTB dimer interface in 3HQI. We then built larger oligomers by iteratively adding dimer structures to the linear oligomer. Dimers were added by superposing the terminal BACK domain of the oligomer and a terminal BACK domain of the dimer to the structure of the BACK-BACK dimer (4HS2).

The starting structures were coarse grained using the Martinize2 Python script. Elastic network restraints of 500 kJ mol^–1 nm^–2 between backbone beads within a 1.2 nm cut-off were applied with Martinize2 to keep folded domains intact and to hold oligomer subunits together. In the ‘MATH free’ model, we removed all elastic network restraints between MATH and BTB/BACK domains, between MATH and MATH domains, and in the linker region between MATH and BTB/BACK domains, while in the ‘MATH restrained’ model, we only removed elastic network restraints between MATH and MATH domains and in the linker region between MATH and BTB/BACK domains, but kept restraints between MATH and BTB/BACK domains. We added dihedral and angle potentials between side chains and backbone beads with the -scfix flag in Martinize2. Using Gromacs editconf, we placed the dimer and tetramer in a dodecahedral box. To keep the box volume small, larger oligomers were aligned with the principal axis of the system and placed in triclinic boxes that were thus elongated along the x-axis. To keep these oligomers from rotating and self-associating across the periodic boundary, we added soft harmonic position restraints of 5 kJ mol^–1 nm^–2 along the y- and z-axis to the backbone beads of the terminal BTB/BACK domains. We solvated the systems using the Insane Python script (Wassenaar et al., 2015) and added 150 mM NaCl along with Na⁺ ions to neutralize the systems. In the ‘MATH free’ system, we rescaled the $ϵ$ of the Lennard-Jones potentials between all protein and water beads by a factor 1.06 to favour extension of the MATH domains into solution (Thomasen et al., 2022), while the unmodified Martini 3 beta v.3.0.4.17 was used for the ‘MATH restrained’ model.

Energy minimization was performed using steepest descent for 10,000 steps with a 30 fs time-step. Simulations were run in the NPT ensemble at 300 K and 1 bar using the Velocity-Rescaling thermostat (Bussi et al., 2007) and Parrinello-Rahman barostat (Parrinello and Rahman, 1981). Non-bonded interactions were treated with the Verlet cut-off scheme. The cut-off for van der Waals interactions and Coulomb interactions was set to 1.1 nm. A dielectric constant of 15 was used. We equilibrated the systems for 10 ns with a 2 fs time-step and ran production simulations for 60 µs with a 20 fs time-step, saving a frame every 1 ns.

After running the simulations, molecule breaks over the periodic boundaries were treated with Gromacs trjconv using the flags -pbc mol -center. Simulations were backmapped to all-atom using a modified version of the Backward algorithm (Wassenaar et al., 2014), in which simulation runs are excluded and energy minimization is shortened to 200 steps (Larsen et al., 2020). Every fourth simulation frame was backmapped for a total of 15,000 conformers in each backmapped ensemble.

Constructing ensembles of larger SPOP oligomers

We constructed conformational ensembles of larger SPOP^28–359 oligomers with up to 60 subunits by joining together conformers from the all-atom backmapped ensembles of the SPOP dodecamer. Using ensembles of two input SPOP oligomers (SPOP 1 and SPOP 2) we started by removing the last subunit of SPOP 1 and the first subunit of SPOP 2 to ensure that the newly joined subunits were internal and not terminal. We then removed additional subunits from SPOP 2 to reach the desired length of the output oligomer. Then, we selected a random frame from SPOP 1 and SPOP 2, superposed the BTB/BACK domains of the last two subunits of SPOP 1 to the BTB/BACK domains of the first two subunits of SPOP 2, and deleted the first two subunits of SPOP 2. Next, we checked for clashes between the newly joined subunits (shortest interatomic distance <0.4 Å), and rejected the new frame if there was a clash. This approach ensured that the terminal subunits in the constructed oligomer were also the terminal subunits in the MD simulation of the dodecamer, while all internal subunits in the constructed oligomer were also internal in the MD simulation. This approach was repeated to create 15,000 structures of each larger oligomer.

Calculating SAXS intensities from conformational ensembles

We calculated SAXS intensities from each of the 15,000 conformers in each of our all-atom ensembles of SPOP oligomers using Pepsi-SAXS (Grudinin et al., 2017). To avoid overfitting to the experimental SAXS data, we used fixed values for the parameters that describe the contrast of the hydration layer, $δ ρ$ =3.34 e/nm³, and the volume of displaced solvent, r₀/r_m = 1.025, that have been shown to work well for intrinsically disordered and multidomain proteins (Pesce and Lindorff-Larsen, 2021). The forward scattering ( $I (0)$ ) was set equal to the number of subunits in the oligomer, in order to scale the SAXS intensities proportionally to the particle volume.

The isodesmic self-association model and averaging of SAXS intensities

The experimental SAXS profiles of SPOP report on the average of a polydisperse mixture of oligomeric species in solution. The concentration of each oligomer should follow the isodesmic model where the concentration of the smallest subunit, the BTB-BTB dimer, is given by:

c_{1} = \frac{2 c_{tot} K_{A} + 1 - \sqrt{4 c_{tot} K_{A} + 1}}{2 c_{tot} K_{A}^{2}}

The concentration $c_{i}$ of any larger oligomer with $i$ subunits can be calculated given $c_{1}$ and the concentration of oligomer $i-1$ , $c_{i-1}$ :

c_{i} = K_{A} c_{i - 1} c_{1}

$K_{A}$ is the isodesmic association constant and $c_{tot}$ is the total concentration of protomers. Here we assume that the SPOP BTB-BTB dimer is always fully formed (Marzahn et al., 2016) and $c_{tot}$ in Equation 1 is thus half of the total protein concentration reported for the SAXS experiments, which refers to the SPOP monomer concentration. Given the concentration $c_{i}$ of each oligomer $i$ from the isodesmic model, we can calculate the volume fraction $ϕ_{i}$ of the oligomer:

ϕ_{i} = \frac{i c_{i}}{\sum_{i}^{N} i c_{i}}

The average SAXS intensities from the mixture of oligomers ${⟨ I ⟩}_{mix}$ are then given by:

⟨ I ⟩_{mix} = \sum_{i}^{N} ⟨ I ⟩_{i, ensemble} ϕ_{i}

where ${⟨ I ⟩}_{i, ensemble}$ is the conformationally averaged SAXS intensity of oligomer $i$ . Note that the magnitude of the SAXS intensities calculated with Pepsi-SAXS was set to be proportional to the number of subunits in the oligomer, so given Equations 3 and 4 the total contribution of each oligomer to the averaged SAXS intensity is proportional to $i^{2} c_{i}$ .
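Equations 1–4 can be sketched in a few lines of code. This is a minimal illustration, not the implementation used for the results. It assumes that species are indexed by the number $i$ of dimeric building blocks (so that $c_{1}$ is the BTB-BTB dimer) and that each ensemble-averaged profile has already been scaled as described above. All function names are hypothetical.

```python
import math


def isodesmic_concentrations(c_tot, k_a, n_species):
    """Concentrations c_1..c_N from the isodesmic model (Equations 1 and 2).

    c_tot is the total concentration of building blocks and k_a the
    association constant in the inverse unit (e.g. uM and 1/uM).
    """
    if c_tot <= 0.0 or k_a <= 0.0:
        raise ValueError("c_tot and k_a must be positive")
    x = c_tot * k_a
    c1 = (2.0 * x + 1.0 - math.sqrt(4.0 * x + 1.0)) / (2.0 * c_tot * k_a ** 2)
    conc = [c1]
    for _ in range(n_species - 1):
        conc.append(k_a * conc[-1] * c1)  # c_i = K_A * c_(i-1) * c_1
    return conc


def volume_fractions(conc):
    """Volume fraction of each species (Equation 3): i*c_i / sum_i(i*c_i)."""
    weighted = [(i + 1) * c for i, c in enumerate(conc)]
    total = sum(weighted)
    return [w / total for w in weighted]


def mixture_intensity(ensemble_profiles, phi):
    """Volume-fraction-weighted average of per-oligomer profiles (Equation 4)."""
    n_q = len(ensemble_profiles[0])
    return [
        sum(p * profile[j] for p, profile in zip(phi, ensemble_profiles))
        for j in range(n_q)
    ]
```

For an untruncated series the model conserves mass, $\sum_{i} i c_{i} = c_{tot}$, which provides a simple check that enough oligomeric species have been included at a given concentration.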

Self-consistent optimization of isodesmic model parameters and conformational ensemble weights

The algorithm we developed to self-consistently optimize the isodesmic distribution of oligomer concentrations and reweight the conformational ensemble of each oligomer against SAXS data consists of three iterative steps: (1) fitting the scale and constant background of the SAXS data, (2) fitting the isodesmic $K_{A}$ , and (3) reweighting the conformational ensemble of each oligomer using BME reweighting. We used a concentration series of SAXS experiments, to which the isodesmic $K_{A}$ was fitted globally, and only subsequently transformed the $K_{A}$ to the $K_{D}$ ( $K_{D} = 1 / K_{A}$ ) for reporting our results.

Step 1: Fitting the SAXS scale and constant background

The following step was repeated for each SAXS experiment in the concentration series. The concentration of each oligomer was calculated using the isodesmic model with the given $c_{tot}$ (Equations 1 and 2). The average SAXS intensities ${⟨ I ⟩}_{mix}$ from all oligomers were then calculated using Equations 3 and 4. The scale and constant background ( $\mathrm{cst}$ ) of ${⟨ I ⟩}_{mix}$ were fitted to the experimental SAXS intensities, $I_{exp}$ , using least-squares linear regression weighted by the experimental errors (LinearRegression function in scikit-learn; Pedregosa et al., 2011):

I_{exp} = \mathrm{scale} \, {⟨ I ⟩}_{mix} + \mathrm{cst}

In practice, to avoid modifying the SAXS scale and constant background for every conformer in our ensembles, we instead performed the inverse operation on the experimental SAXS profile:

I_{exp,fit} = \frac{I_{exp} - \mathrm{cst}}{\mathrm{scale}}

and propagated the experimental errors $σ_{exp}$ accordingly:

σ_{exp,fit} = \frac{σ_{exp}}{| \mathrm{scale} |}
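Step 1 can be illustrated with a closed-form weighted least-squares fit. The fit described above used the LinearRegression function in scikit-learn. The sketch below is a standard-library stand-in that assumes weights of $1 / σ_{exp}^{2}$, a choice that is not specified in the text. Function names are hypothetical.

```python
def fit_scale_and_background(i_mix, i_exp, sigma_exp):
    """Weighted least-squares fit of I_exp = scale * <I>_mix + cst.

    Weights are taken as 1/sigma^2 (an assumption of this sketch).
    """
    w = [1.0 / s ** 2 for s in sigma_exp]
    s_w = sum(w)
    s_x = sum(wi * x for wi, x in zip(w, i_mix))
    s_y = sum(wi * y for wi, y in zip(w, i_exp))
    s_xx = sum(wi * x * x for wi, x in zip(w, i_mix))
    s_xy = sum(wi * x * y for wi, x, y in zip(w, i_mix, i_exp))
    scale = (s_w * s_xy - s_x * s_y) / (s_w * s_xx - s_x ** 2)
    cst = (s_y - scale * s_x) / s_w
    return scale, cst


def rescale_experiment(i_exp, sigma_exp, scale, cst):
    """Apply the inverse operation to the experimental profile and its errors."""
    i_fit = [(y - cst) / scale for y in i_exp]
    sigma_fit = [s / abs(scale) for s in sigma_exp]
    return i_fit, sigma_fit
```

Applying the inverse transformation to the experimental profile means that the calculated intensities of the individual conformers never have to be modified, which keeps the subsequent reweighting step simple.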

Step 2: Fitting the isodesmic model

The isodesmic $K_{A}$ was fitted globally to the concentration series of SAXS experiments using Metropolis Monte Carlo (Metropolis et al., 1953) with simulated annealing. For each Monte Carlo step, we generated a new random $K_{A}$ with a Gaussian probability distribution centered around the previous $K_{A}$ , calculated new oligomer concentrations and corresponding ${⟨ I ⟩}_{mix}$ for each SAXS experiment in the concentration series using Equations 1–4, and for each SAXS experiment calculated the reduced $χ^{2}$ , $χ_{r}^{2}$ , as:

χ_{r}^{2} = \frac{1}{m} \sum_{j}^{m} \frac{(⟨ I ⟩_{j, mix} - I_{j, exp})^{2}}{σ_{j, exp}^{2}}

where $m$ is the number of SAXS intensities $j$ in the SAXS profile. We then calculated the average of the $χ_{r}^{2}$ -values across the SAXS concentration series to get the global $χ_{r}^{2}$ , $χ_{r, global}^{2}$ , as the number of intensities was the same in each SAXS profile. Next, we evaluated the acceptance criterion by calculating:

α = \exp \left( - \frac{χ_{new,r,global}^{2} - χ_{old,r,global}^{2}}{T} \right)

where $χ_{new,r,global}^{2}$ and $χ_{old,r,global}^{2}$ are the $χ_{r, global}^{2}$ values from the current and previous Monte Carlo step, respectively, and $T$ is the simulated annealing temperature. If $α$ > 1, we accepted the new $K_{A}$ . If $α \leq 1$ , we generated a random number, $\mathrm{rand}$ , between 0 and 1, and if $α$ > $\mathrm{rand}$ , accepted the new $K_{A}$ . Otherwise, we kept the $K_{A}$ from the previous Monte Carlo step. Finally, we decreased $T$ for the next Monte Carlo step.
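The Monte Carlo procedure of Step 2 can be sketched as follows. This is a minimal illustration under stated assumptions. The callable `chi2_global` is a hypothetical stand-in for the full calculation of $χ_{r, global}^{2}$ from Equations 1–4 and 8. The default annealing parameters mirror those listed under Optimization parameters, and the fixed default seed is only for reproducibility of the example.

```python
import math
import random


def anneal_k_a(chi2_global, k_a_start, t_start=10.0, t_end=0.1,
               cooling=0.7, step_sd=0.1, rng=None):
    """Metropolis Monte Carlo with simulated annealing for a single K_A.

    chi2_global: callable mapping K_A to the global reduced chi-square
    (hypothetical helper supplied by the caller).
    """
    rng = rng or random.Random(0)  # fixed seed: example only
    k_a = k_a_start
    chi2_old = chi2_global(k_a)
    t = t_start
    while t >= t_end:
        k_new = rng.gauss(k_a, step_sd)
        if k_new <= 0.0:
            continue  # repeat the step if a non-positive K_A was generated
        chi2_new = chi2_global(k_new)
        delta = chi2_new - chi2_old
        # delta <= 0 corresponds to alpha >= 1 (always accepted);
        # otherwise accept with probability alpha = exp(-delta / T).
        if delta <= 0.0 or rng.random() < math.exp(-delta / t):
            k_a, chi2_old = k_new, chi2_new
        t *= cooling  # lower the temperature for the next step
    return k_a, chi2_old
```

At high $T$ almost every proposal is accepted, so the search explores broadly. As $T$ decreases, moves that worsen the fit are accepted with rapidly vanishing probability.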

Step 3: Reweighting the conformational ensemble

The following step was repeated for each SAXS experiment in the concentration series. We calculated the oligomer concentrations using the isodesmic model given the new $K_{A}$ determined in step 2. For each oligomer $i$ , we extracted a SAXS profile for BME reweighting from the experimental profile using the following method: we calculated the average SAXS profile from the ensembles as in Equation 4 but leaving out oligomer $i$ from the sum to get ${⟨ I ⟩}_{mix,rest}$ . Next, we determined the contribution of species $i$ to the experimental SAXS intensity as:

⟨ I ⟩_{i, extr} = \frac{I_{exp} - ⟨ I ⟩_{mix,rest}}{ϕ_{i}}

where $I_{exp}$ is the experimental SAXS intensity and $ϕ_{i}$ is the volume fraction of oligomer $i$ . We then propagated the error $σ_{i, extr}$ on $⟨ I ⟩_{i, extr}$ from both the errors on the experimental SAXS intensities and the errors on the calculated SAXS intensities, which we determined using block error analysis (Flyvbjerg and Petersen, 1989). The propagated errors were given by:

σ_{i, extr} = \frac{\sqrt{σ_{exp}^{2} + \sum_{r}^{N} (σ_{r, block} ϕ_{r})^{2}}}{ϕ_{i}}

where the sum $r$ to $N$ runs over all oligomers that contributed to ${⟨ I ⟩}_{mix,rest}$ , $σ_{exp}$ is the error on the experimental SAXS intensity, $σ_{r, block}$ is the error on the average SAXS intensity calculated from the ensemble of oligomer $r$ prior to reweighting using block error analysis (https://github.com/fpesceKU/BLOCKING; Pesce, 2023), and $ϕ_{i}$ is the volume fraction of oligomer $i$ . The conformational ensemble of oligomer $i$ was then reweighted against this extracted SAXS profile using BME reweighting (Bottaro et al., 2020), in which a set of ensemble weights $w$ are obtained by minimizing the function:

L (w_{1} \ldots w_{n}) = \frac{m}{2} χ_{r}^{2} (w_{1} \ldots w_{n}) - θ S_{rel} (w_{1} \ldots w_{n})

where $n$ is the number of ensemble conformations, $m$ is the number of experimental observables (in this case the number of SAXS intensities in the profile), $χ_{r}^{2}$ quantifies the agreement between ${⟨ I ⟩}_{ensemble}$ and $⟨ I ⟩_{extr}$ , $S_{rel}$ is the relative Shannon entropy that quantifies the deviation of the new weights from the initial weights, $w^{0}$ , and $θ$ is a scaling parameter that quantifies the confidence in the experimental data versus the prior ensemble. $χ_{r}^{2}$ is given by:

χ_{r}^{2} (w_{1} \ldots w_{n}) = \frac{1}{m} \sum_{j}^{m} \frac{\left( \sum_{k}^{n} w_{k} I_{j, k, ensemble} - ⟨ I ⟩_{j, extr} \right)^{2}}{σ_{j, extr}^{2}}

where $I_{j, k, ensemble}$ is the SAXS intensity $j$ calculated from the conformer $k$ of the ensemble. $S_{rel}$ is given by:

S_{rel} = - \sum_{k}^{n} w_{k} \log \left( \frac{w_{k}}{w_{k}^{0}} \right)

Using the ensemble weights obtained from BME reweighting, we calculated new weighted average SAXS intensities, ${⟨ I ⟩}_{i, ensemble}$ , from the ensemble of oligomer $i$ . The process of extracting a SAXS profile followed by BME reweighting was repeated for each oligomer.
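The quantities entering Step 3 can be written compactly in code. The sketch below only evaluates the extracted profile, its propagated error, and the BME objective for a given set of weights. The minimization itself was performed with the BME software (Bottaro et al., 2020) and is not reproduced here. Function names and the zero-based species index are assumptions of this illustration.

```python
import math


def extract_profile(i_exp, sigma_exp, mean_profiles, block_errors, phi, i):
    """Profile and error attributed to species i (zero-based) in the mixture.

    mean_profiles[r][j]: ensemble-averaged intensity of species r at q-point j.
    block_errors[r][j]: block-analysis error on that average.
    """
    others = [r for r in range(len(phi)) if r != i]
    i_extr, sigma_extr = [], []
    for j in range(len(i_exp)):
        rest = sum(phi[r] * mean_profiles[r][j] for r in others)
        var = sigma_exp[j] ** 2 + sum(
            (block_errors[r][j] * phi[r]) ** 2 for r in others
        )
        i_extr.append((i_exp[j] - rest) / phi[i])
        sigma_extr.append(math.sqrt(var) / phi[i])
    return i_extr, sigma_extr


def reduced_chi2(weights, conformer_profiles, i_extr, sigma_extr):
    """Reduced chi-square between the weighted ensemble average and the target."""
    m = len(i_extr)
    total = 0.0
    for j in range(m):
        avg = sum(w * prof[j] for w, prof in zip(weights, conformer_profiles))
        total += ((avg - i_extr[j]) / sigma_extr[j]) ** 2
    return total / m


def relative_entropy(weights, prior):
    """S_rel = -sum_k w_k log(w_k / w0_k); terms with w_k = 0 contribute 0."""
    return -sum(
        w * math.log(w / w0) for w, w0 in zip(weights, prior) if w > 0.0
    )


def bme_objective(weights, prior, conformer_profiles, i_extr, sigma_extr, theta):
    """L = (m/2) * chi2_r - theta * S_rel, to be minimized over the weights."""
    m = len(i_extr)
    chi2 = reduced_chi2(weights, conformer_profiles, i_extr, sigma_extr)
    return 0.5 * m * chi2 - theta * relative_entropy(weights, prior)
```

Because $S_{rel} \leq 0$, the second term penalizes any departure from the prior weights, and larger values of $θ$ therefore keep the refined ensemble closer to the simulation.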

Optimization parameters

The three steps described above were repeated iteratively to converge on self-consistent values of the SAXS scale and constant background, the isodesmic $K_{A}$ , and the ensemble weights for each oligomer species. As the SAXS profile against which the ensemble of each oligomer was reweighted is a function of the ensemble weights of all other oligomeric species, we wished to reweight the ensembles only slightly in initial iterations, and then gradually increase the degree of reweighting as the conformational weights and isodesmic $K_{A}$ converged. We achieved this by starting with a high value of $θ$ (Equation 12) and then gradually decreasing $θ$ each iteration. The fraction of effective frames, $ϕ_{eff}$ , given by $\exp (S_{rel})$ , provides a measure of the fraction of the initial ensemble that is retained after reweighting. At every iteration, we checked whether the ensemble of each oligomer had reached a $ϕ_{eff}$ below a set cut-off, after which $θ$ was no longer decreased for that specific oligomer. Thus, the overall degree of reweighting could be tuned through selection of this $ϕ_{eff}$ -cut-off.

We ran the optimization scheme for 1000 iterations starting with $θ$ =100 and decreasing $θ$ by 2% every iteration. The simulated annealing of the isodesmic $K_{A}$ was run from $T$ =10 to $T$ =0.1 every iteration, with $T$ decreased by 30% every Monte Carlo step, and with a standard deviation of 0.1 µM⁻¹ for the Gaussian probability distribution used to generate the new $K_{A}$ . The step was repeated if $K_{A} \leq$ 0 was generated.
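The schedule for $θ$ and the $ϕ_{eff}$ check can be illustrated as follows. This is a sketch of the bookkeeping only, with hypothetical function names. The default cut-off of 0.4 is the value selected in the following section, and the other defaults are the values stated above.

```python
import math


def theta_schedule(theta_start=100.0, decay=0.02, n_iter=1000):
    """Theta at each iteration when it is lowered by a fixed fraction each time."""
    return [theta_start * (1.0 - decay) ** it for it in range(n_iter)]


def effective_fraction(weights, prior):
    """phi_eff = exp(S_rel): fraction of the prior ensemble effectively retained."""
    s_rel = -sum(
        w * math.log(w / w0) for w, w0 in zip(weights, prior) if w > 0.0
    )
    return math.exp(s_rel)


def next_theta(theta, phi_eff, cutoff=0.4, decay=0.02):
    """Stop lowering theta for an oligomer once phi_eff is below the cut-off."""
    if phi_eff < cutoff:
        return theta
    return theta * (1.0 - decay)
```

With uniform prior weights, $ϕ_{eff}$ equals 1 when the weights are unchanged and approaches $1 / n$ when all weight is placed on a single conformer.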

Preventing overfitting of ensemble weights

We ran the optimization with a range of $ϕ_{eff}$ -cut-offs from 0.1 to 1. To prevent overfitting, we aimed to choose a value of $ϕ_{eff}$ that retained as much of the prior ensemble as possible (high $ϕ_{eff}$ ) while not sacrificing substantial improvement in the fit to the SAXS data (low $χ_{r, global}^{2}$ ). As an additional approach to prevent overfitting, we left out the SAXS experiment recorded with 15 µM protein from the optimization, and used it as validation for the determined weights (averaged as explained in the next section) and isodesmic $K_{A}$ at different values of the $ϕ_{eff}$ -cut-off. For each $ϕ_{eff}$ -cut-off, we fitted only the SAXS scale and constant background to the 15 µM SAXS experiment. We tested the effect of using the fitted $K_{A}$ and ensemble weights in combination, but also the effect of using only the fitted $K_{A}$ or ensemble weights independently. Although in all cases the fitted $K_{A}$ and ensemble weights combined improved the fit to the SAXS data compared with the initial weights and $K_{A}$ , $ϕ_{eff}$ -cut-off=0.4 was the lowest value of $ϕ_{eff}$ where the fit was not improved by replacing the fitted weights with uniform weights in combination with the fitted $K_{A}$ (Figure 3—figure supplement 1). Thus, we selected the conformational weights and isodesmic $K_{A}$ determined with $ϕ_{eff}$ -cut-off=0.4 to avoid overfitting the ensemble weights.

Averaging the conformational weights from different SAXS experiments

The optimization scheme outputs a set of conformational weights for each SAXS experiment in the concentration series. We combined these conformational weights to obtain a single set of weights for further analysis, under the assumption that the conformational properties of each SPOP oligomer are independent of protein concentration. The distribution of oligomeric species from the isodesmic model depends on the protein concentration. Thus, each SAXS experiment does not contain the same amount of information on every oligomer; SAXS experiments at lower concentrations have a relatively smaller contribution from large oligomers and vice versa. Therefore, we weighted the averaging of the conformational weights to reflect this mismatch in information. The average weight of conformation $k$ of oligomer $i$ was calculated as:

⟨ w ⟩_{k, i} = \sum_{l}^{o} w_{k, i, l} ρ_{i, l}

where $w_{k, i, l}$ is the weight of conformer $k$ of oligomer $i$ from reweighting against SAXS experiment $l$ and $ρ_{i, l}$ is the contribution of oligomer $i$ to SAXS experiment $l$ relative to the contribution of oligomer $i$ to the other SAXS experiments in the concentration series, given by:

ρ_{i, l} = \frac{1}{\sum_{l}^{o} \frac{i^{2} c_{i, l}}{\sum_{i}^{N} i^{2} c_{i, l}}} \frac{i^{2} c_{i, l}}{\sum_{i}^{N} i^{2} c_{i, l}}

where $c_{i, l}$ is the concentration of oligomer $i$ in SAXS experiment $l$ given by the isodesmic model. For a plot of the contributions $ρ_{i, l}$ , see Figure 3—figure supplement 9.
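The weighted averaging of conformational weights across the concentration series can be sketched as below. The function names are hypothetical, and species are again assumed to be indexed so that list position $i-1$ holds oligomer $i$.

```python
def relative_contributions(conc_by_experiment):
    """rho[l][i-1]: contribution of oligomer i to experiment l, normalized over l.

    conc_by_experiment[l][i-1] is c_(i,l) from the isodesmic model.
    """
    fractions = []
    for conc in conc_by_experiment:
        contrib = [(i + 1) ** 2 * c for i, c in enumerate(conc)]  # i^2 * c_i
        total = sum(contrib)
        fractions.append([x / total for x in contrib])
    n_species = len(fractions[0])
    norm = [sum(f[i] for f in fractions) for i in range(n_species)]
    return [[f[i] / norm[i] for i in range(n_species)] for f in fractions]


def average_weights(weights_by_experiment, rho_i):
    """Average the weights of one oligomer over experiments.

    weights_by_experiment[l][k]: weight of conformer k from experiment l.
    rho_i[l]: relative contribution of this oligomer to experiment l.
    """
    n_conf = len(weights_by_experiment[0])
    return [
        sum(rho_i[l] * weights_by_experiment[l][k] for l in range(len(rho_i)))
        for k in range(n_conf)
    ]
```

Since $ρ_{i, l}$ sums to one over the experiments, the averaged weights remain normalized whenever each input set of weights is normalized.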

Determining the error of the fitted isodesmic K_D

To determine the uncertainty of the isodesmic $K_{D}$ fitted with our optimization scheme, we scanned a range of $K_{D}$ values around the fitted $K_{D}$ and determined the $χ_{r, global}^{2}$ to the concentration series of SAXS data. We used the same ensemble weights for every value of $K_{D}$ , and only fitted the scale and constant background to the SAXS data. We then defined the error of the fitted $K_{D}$ to include all $K_{D}$ values that gave a $χ_{r, global}^{2}$ to the SAXS data within 10% of the minimum $χ_{r, global}^{2}$ .
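The error definition above amounts to a simple threshold on a scan. A minimal sketch, assuming that the $χ_{r, global}^{2}$ values have already been computed for each scanned $K_{D}$ and that 'within 10%' means at most 1.1 times the minimum (the function name is hypothetical):

```python
def kd_error_range(kd_values, chi2_values, tolerance=0.10):
    """Best K_D and the range of K_D values within `tolerance` of the minimum.

    Returns (best_kd, lowest_accepted_kd, highest_accepted_kd).
    """
    chi2_min = min(chi2_values)
    threshold = chi2_min * (1.0 + tolerance)
    accepted = [kd for kd, c in zip(kd_values, chi2_values) if c <= threshold]
    best = kd_values[chi2_values.index(chi2_min)]
    return best, min(accepted), max(accepted)
```

If the scan has no clear minimum, as described above for the ensembles with restrained MATH domains, the accepted range extends to the edge of the scanned interval, which is reflected in a large error on the fitted $K_{D}$ .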

Analysis of SPOP conformational ensembles

$R_{g}$ was calculated from ensembles using the gyrate function in Gromacs. End-to-end distances were calculated from ensembles as the distances between the center-of-mass (COM) of the BTB/BACK domains in the terminal subunits using the compute_center_of_mass function in MDTraj (McGibbon et al., 2015) and the linalg.norm function in NumPy (Harris et al., 2020). We fitted ensemble averaged end-to-end distances against oligomer size (number of subunits) with a power law: $R_{E-E} = R_{0} N^{ν}$ , where $R_{0}$ is the subunit segment size, $N$ is the number of subunits in the oligomer, and $ν$ is a scaling exponent, using the curve_fit function in SciPy (Virtanen et al., 2020). To subsample ensembles with extended or compacted oligomers, frames were selected with $R_{E-E} > max (R_{E-E}) - \frac{max (R_{E-E}) - ⟨ R_{E-E} ⟩}{2}$ or $R_{E-E} < min (R_{E-E}) + \frac{⟨ R_{E-E} ⟩ - min (R_{E-E})}{2}$ respectively, where $max (R_{E-E})$ and $min (R_{E-E})$ are the maximum and minimum over all frames of the ensemble and $⟨ R_{E - E} ⟩$ is the ensemble average. MATH-BTB/BACK COM distance was calculated from ensembles as the distance between the COM of the MATH domain and BTB/BACK domains in every subunit using the compute_center_of_mass function in MDTraj and the linalg.norm function in NumPy. The histogram of MATH-BTB/BACK COM distances shows values for all conformations of all subunits of all oligomers. To subsample ensembles with compacted or extended MATH domains, frames were selected with an average MATH-BTB/BACK COM distance over all subunits <4.4 nm or >5.2 nm, respectively. The COM distance between substrate binding sites in neighbouring MATH domains was calculated from ensembles using the distance function in Gromacs. The MATH substrate binding site was defined as residues Arg70, Tyr87, Ser119, Tyr123, and Lys129-Phe133. The histogram of MATH binding site COM distances shows values for all conformations of all subunits of all oligomers.
Structures for Figure 4i were selected by fitting three Gaussians to the histogram in Figure 4e (after reweighting) using SciPy curve_fit and for each Gaussian selecting conformers within 0.1σ of the mean. All visualizations of protein structures were made with ChimeraX (Pettersen et al., 2021). To examine the agreement of single frames drawn from the ensembles with SAXS data, we drew a random frame from the ensemble of each oligomer and scanned the isodesmic $K_{D}$ from 0.01 to 100 µM (with 10,000 log-spaced steps) to select the $K_{D}$ that gave the optimal agreement with the SAXS concentration series based on $χ_{r, global}^{2}$ . The SAXS scale and constant background were fitted for each $K_{D}$ . This procedure was repeated for 10,000 iterations. The same procedure was performed with oligomer structures constructed prior to the MD simulations.
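The frame selection used to subsample extended or compacted oligomers can be expressed directly from the two inequalities given above. The sketch assumes that the end-to-end distance of each frame has already been computed (for example with MDTraj and NumPy, as described); the function name is hypothetical.

```python
def subsample_by_end_to_end(r_ee):
    """Indices of frames in the extended and compacted sub-ensembles.

    extended:  R > max(R) - (max(R) - mean(R)) / 2
    compacted: R < min(R) + (mean(R) - min(R)) / 2
    """
    mean = sum(r_ee) / len(r_ee)
    r_max, r_min = max(r_ee), min(r_ee)
    upper = r_max - (r_max - mean) / 2.0
    lower = r_min + (mean - r_min) / 2.0
    extended = [k for k, r in enumerate(r_ee) if r > upper]
    compacted = [k for k, r in enumerate(r_ee) if r < lower]
    return extended, compacted
```

Because the thresholds lie halfway between the ensemble mean and the extremes, the selected sub-ensembles depend on the range sampled in the simulation, which is consistent with the limited conformational space noted in the Results.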

Dimer-oligomer equilibria and averaging of SAXS intensities

For dimer-oligomer equilibria, the total concentration of BTB-mediated dimer subunits (both free and in oligomers), $c_{tot,dimer}$, was assumed to be half of the total SPOP monomer concentration. We determined the equilibrium dimer concentration, $c_{dimer}$, and oligomer concentration, $c_{i}$, for a given association constant $K_{A}$ using:

$$K_{A} = \frac{c_{i}}{c_{dimer}^{(i / 2)}}$$

and the equation for conservation of mass:

$$c_{i} = \frac{c_{tot,dimer} - c_{dimer}}{i / 2}$$

where $i$ is the number of subunits in the oligomer. The averaged SAXS intensities $⟨ I ⟩_{mix}$ were then calculated as:

$$⟨ I ⟩_{mix} = ϕ_{dimer} ⟨ I ⟩_{dimer,ensemble} + ϕ_{i} ⟨ I ⟩_{i, ensemble}$$

where $⟨ I ⟩_{dimer,ensemble}$ and $⟨ I ⟩_{i, ensemble}$ are the ensemble averaged SAXS intensities for the dimer and oligomer, and $ϕ_{dimer}$ and $ϕ_{i}$ are the volume fractions of the dimer and oligomer calculated based on the concentrations and number of subunits. For each possible dimer-oligomer equilibrium, we scanned $K_{A}$ values from $10^{-12}$ to $10^{12}$ µM⁻¹ and selected the $K_{A}$ that gave the optimal agreement with the SAXS concentration series based on $χ_{r, global}^{2}$. The SAXS scale and constant background were fitted for each $K_{A}$.
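
Combining the two equations above gives $c_{dimer} + \frac{i}{2} K_{A} c_{dimer}^{i/2} = c_{tot,dimer}$, which can be solved numerically for $c_{dimer}$. The sketch below is one possible way to do this and is not taken from the published code: it uses plain bisection, and it assumes that the volume fraction of each species equals the fraction of dimer subunits it contains. All function names are hypothetical.

```python
def solve_dimer_concentration(k_a, c_tot_dimer, i, n_iter=200):
    """Solve c + (i/2) * K_A * c**(i/2) = c_tot_dimer for c by bisection."""
    n = i / 2  # number of dimer subunits per oligomer

    def excess(c):
        return c + n * k_a * c ** n - c_tot_dimer

    lo, hi = 0.0, c_tot_dimer  # excess(lo) <= 0 <= excess(hi), excess is monotonic
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def volume_fractions(k_a, c_tot_dimer, i):
    """Return (phi_dimer, phi_i) as fractions of dimer subunits (assumption)."""
    c_dimer = solve_dimer_concentration(k_a, c_tot_dimer, i)
    c_i = k_a * c_dimer ** (i / 2)
    phi_dimer = c_dimer / c_tot_dimer
    phi_i = (i / 2) * c_i / c_tot_dimer
    return phi_dimer, phi_i

def mix_intensity(k_a, c_tot_dimer, i, i_dimer, i_oligomer):
    """Volume-fraction-weighted average of two ensemble-averaged SAXS profiles."""
    phi_dimer, phi_i = volume_fractions(k_a, c_tot_dimer, i)
    return [phi_dimer * a + phi_i * b for a, b in zip(i_dimer, i_oligomer)]
```

For example, with $i = 4$, $K_{A} = 1$ and $c_{tot,dimer} = 3$ (arbitrary consistent units), the equation reduces to $2 c_{dimer}^{2} + c_{dimer} - 3 = 0$, so $c_{dimer} = 1$ and $c_{i} = 1$.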

Molecular dynamics simulations with CALVADOS

We selected five SPOP substrates with at least 8 SPOP binding motifs (Cuneo and Mittag, 2019) for simulations: SETD2 (Zhu et al., 2017), SCAF1 (Theurillat et al., 2014), SRC3 (Li et al., 2011; Geng et al., 2013; Janouskova et al., 2017), Gli2, and Gli3 (Zhang et al., 2006; Zhang et al., 2009). We selected the IDRs of these proteins based on low AlphaFold pLDDT scores and pairwise alignment errors (Jumper et al., 2021). We ran coarse-grained simulations of these IDRs with CALVADOS 2 (Tesei et al., 2021; Tesei and Lindorff-Larsen, 2023). Simulations were run at 298 K, with an ionic strength of 150 mM, and pH 7.2 for determining the partial charge of histidine side-chains. Simulations were run for $3 \times 10^{3} N_{res}^{2}$ steps, where $N_{res}$ is the number of residues, using a 10 fs time-step (Tesei and Lindorff-Larsen, 2023). Frames were saved every $3 N_{res}^{2}$ steps to obtain weakly correlated frames. We used a 2 nm cutoff for the Ashbaugh-Hatch potential and a 4 nm cutoff for the Debye-Hückel potential. All simulations were started from a linear arrangement of the protein chain, except for simulations of the two longest IDRs, SCAF1 IDR and SETD2 IDR 1, which were started from an Archimedean spiral arrangement. Simulations were performed with HOOMD-blue 2.9.3 (Anderson et al., 2020).
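
As a worked example of the run-length rule above, the short sketch below computes the number of steps, the saving interval, the resulting number of frames and the simulated time for a chain of a given length. The function name and the example chain length of 500 residues are hypothetical and do not correspond to any of the simulated IDRs.

```python
def calvados_run_length(n_res, time_step_fs=10.0):
    """Return (total steps, saving interval, frames, simulated time in microseconds)."""
    total_steps = 3000 * n_res ** 2   # 3 x 10^3 * N_res^2 steps
    save_interval = 3 * n_res ** 2    # frames saved every 3 * N_res^2 steps
    n_frames = total_steps // save_interval
    sim_time_us = total_steps * time_step_fs * 1e-9  # 1 fs = 1e-9 microseconds
    return total_steps, save_interval, n_frames, sim_time_us
```

Because both the run length and the saving interval scale with $N_{res}^{2}$, every simulation yields 1000 frames regardless of chain length. For a hypothetical 500-residue chain, this corresponds to $7.5 \times 10^{8}$ steps, or 7.5 µs of simulated time at a 10 fs time-step.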

Analysis of motif spacing in SPOP substrates

We identified SPOP binding motifs in the substrate sequences as five consecutive positions with residues 1: GAVLIMWFPC, 2: STCYNQDEHR, 3: ST, 4: STCYNQDEHR, 5: ST or 1: GAVLIMWFPC, 2: STCYNQDEHR, 3: ST, 4: ST, 5: STCYNQDEHR, where any amino acid in the given set is allowed at that position (Zhuang et al., 2009; Cuneo and Mittag, 2019). We calculated a histogram of all distances between neighbouring motifs in the SPOP substrate sequences over the CALVADOS simulations. Distances were calculated between the middle residue beads of the neighbouring motifs using the compute_contacts function in MDTraj. We also calculated the average distance, $R$, between each pair of neighbouring motifs and fitted this with a power law, $R = R_{0} N^{ν}$, where $R_{0}$ is the segment size, $N$ is the number of residues spacing the two motifs, and $ν$ is a scaling exponent, using the curve_fit function in SciPy.
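
The motif definition above translates directly into a regular expression. The sketch below is an illustration, not the script used for the analysis: the names `NONPOLAR` and `POLAR` for the first two residue sets are labels introduced here, and a zero-width lookahead is used so that overlapping motifs are also reported.

```python
import re

NONPOLAR = "GAVLIMWFPC"  # allowed at position 1 (label introduced here)
POLAR = "STCYNQDEHR"     # allowed at position 2 and at position 4 or 5

# Positions 1-3 are shared; positions 4-5 are either [POLAR][ST] or [ST][POLAR].
MOTIF = re.compile(
    rf"(?=[{NONPOLAR}][{POLAR}][ST](?:[{POLAR}][ST]|[ST][{POLAR}]))"
)

def find_motifs(sequence):
    """0-based start indices of all motif matches, including overlapping ones."""
    return [m.start() for m in MOTIF.finditer(sequence)]

def middle_residue_spacings(starts):
    """Sequence separation between the middle residues of neighbouring motifs."""
    middles = [s + 2 for s in starts]
    return [b - a for a, b in zip(middles, middles[1:])]
```

For instance, in the made-up sequence `KKVSSTSKKKKADSTDK`, `find_motifs` returns the start indices 2 (`VSSTS`) and 11 (`ADSTD`), and the middle residues of the two motifs are 9 positions apart.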

Fitting CG-MALS data

Given the concentration of each oligomer from the isodesmic model, the average molecular weight, as measured by CG-MALS, was calculated as:

$$⟨ M W ⟩ = \sqrt{\frac{1}{N \sum_{i}^{N} c_{i}} \sum_{i}^{N} (i M W_{monomer})^{2} c_{i}}$$

where $N$ is the number of oligomers, $c_{i}$ is the concentration of oligomer $i$ given by the isodesmic model and $M W_{monomer}$ is the molecular weight of the subunit of oligomerization.
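
As a minimal sketch, the expression above can be evaluated for a given list of oligomer concentrations as follows. The code transcribes the equation as printed and assumes that the index $i$ runs from 1 to $N$; the function name is hypothetical, and the concentrations are taken as given rather than computed from the isodesmic model.

```python
import math

def average_mw(concentrations, mw_monomer):
    """Average molecular weight from the expression above.

    concentrations[k] is the concentration of the oligomer with i = k + 1
    subunits (assumed indexing); mw_monomer is the subunit molecular weight.
    """
    n = len(concentrations)
    weighted = sum(
        (i * mw_monomer) ** 2 * c
        for i, c in enumerate(concentrations, start=1)
    )
    return math.sqrt(weighted / (n * sum(concentrations)))
```

With a single species, the expression returns the subunit molecular weight itself, which provides a quick sanity check of an implementation.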

We fitted the isodesmic $K_{D}$ and $M W_{monomer}$ to the CG-MALS data from Marzahn et al. (2016). The CG-MALS data consist of two merged data sets, so we allowed a different $M W_{monomer}$ for each of the two to absorb uncertainties from determination of the protein concentrations. The $K_{D}$ was fitted globally to the two merged data sets, and the error of the fit on the $K_{D}$ was set to two standard deviations. Fitting was done with the curve_fit function in SciPy.

Data availability

Code and data are available at https://github.com/KULL-Centre/_2022_Thomasen_SPOP (copy archived at swh:1:rev:be995dd615079fe8b4fbb86941160d519429ee4c). Simulation data are available at https://doi.org/10.17894/ucph.ef999f72-b5e8-45c4-835f-3e49619a0f91. Plasmids are available from Addgene (plasmid IDs 194115 and 194116).

The following data sets were generated

(2022) Electronic Research Data Archive at University of Copenhagen
Supporting data for Conformational and oligomeric states of SPOP from small-angle X-ray scattering and molecular dynamics simulations.
https://doi.org/10.17894/ucph.ef999f72-b5e8-45c4-835f-3e49619a0f91

References

Abraham MJ, Murtola T, Schulz R, Páll S, Smith JC, Hess B, Lindahl E
(2015) GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers
SoftwareX 1–2:19–25.
https://doi.org/10.1016/j.softx.2015.06.001
Ali MH, Imperiali B
(2005) Protein oligomerization: How and why
Bioorganic & Medicinal Chemistry 13:5013–5020.
https://doi.org/10.1016/j.bmc.2005.05.037
An J, Wang C, Deng Y, Yu L, Huang H
(2014) Destruction of full-length androgen receptor by wild-type Spop, but not prostate-cancer-associated mutants
Cell Reports 6:657–669.
https://doi.org/10.1016/j.celrep.2014.01.013
(2020) HOOMD-blue: A python package for high-performance molecular dynamics and hard particle monte carlo simulations
Computational Materials Science 173:109363.
https://doi.org/10.1016/j.commatsci.2019.109363
Bosu DR, Kipreos ET
(2008) Cullin-Ring ubiquitin ligases: Global regulation and activation cycles
Cell Division 3:1–13.
https://doi.org/10.1186/1747-1028-3-7
(2020) Integrating molecular simulation and experimental data: A bayesian/maximum entropy reweighting approach
Methods in Molecular Biology 2112:219–240.
https://doi.org/10.1007/978-1-0716-0270-6_15
Bouchard JJ, Otero JH, Scott DC, Szulc E, Martin EW, Sabri N, Granata D, Marzahn MR, Lindorff-Larsen K, Salvatella X, Schulman BA, Mittag T
(2018) Cancer mutations of the tumor suppressor spop disrupt the formation of active, phase-separated compartments
Molecular Cell 72:19–36.
https://doi.org/10.1016/j.molcel.2018.08.027
(2007) Canonical sampling through velocity rescaling
J Chem Phys 126:1–7.
https://doi.org/10.1063/1.2408420
Cuneo MJ, Mittag T
(2019) The ubiquitin ligase adaptor Spop in cancer
The FEBS Journal 286:3946–3958.
https://doi.org/10.1111/febs.15056
DiFabio J, Chodankar S, Pjerov S, Jakoncic J, Lucas M, Krywka C, Graziano V, Yang L
(2016) The life science x-ray scattering beamline at NSLS-II
Proceedings of the 12th international conference on synchrotron radiation instrumentation – SRI2015.
https://doi.org/10.1063/1.4952872
(2012) Adaptor protein self-assembly drives the control of a cullin-RING ubiquitin ligase
Structure 20:1141–1153.
https://doi.org/10.1016/j.str.2012.04.009
Flyvbjerg H, Petersen HG
(1989) Error estimates on averages of correlated data
J Chem Phys 91:461–466.
https://doi.org/10.1063/1.457480
Geng C, He B, Xu L, Barbieri CE, Eedunuri VK, Chew SA, Zimmermann M, Bond R, Shou J, Li C, Blattner M, Lonard DM, Demichelis F, Coarfa C, Rubin MA, Zhou P, O’Malley BW, Mitsiades N
(2013) Prostate cancer-associated mutations in speckle-type POZ protein (spop) regulate steroid receptor coactivator 3 protein turnover
PNAS 110:6997–7002.
https://doi.org/10.1073/pnas.1304502110
Giannakis M, Mu XJ, Shukla SA, Qian ZR, Cohen O, Nishihara R, Bahl S, Cao Y, Amin-Mansour A, Yamauchi M, Sukawa Y, Stewart C, Rosenberg M, Mima K, Inamura K, Nosho K, Nowak JA, Lawrence MS, Giovannucci EL, Chan AT, Ng K, Meyerhardt JA, Van Allen EM, Getz G, Gabriel SB, Lander ES, Wu CJ, Fuchs CS, Ogino S, Garraway LA
(2016) Genomic correlates of immune-cell infiltrates in colorectal carcinoma
Cell Reports 15:857–865.
https://doi.org/10.1016/j.celrep.2016.03.075
(2017) Pepsi-SAXS: An adaptive method for rapid and accurate computation of small-angle X-ray scattering profiles
Acta Crystallographica. Section D, Structural Biology 73:449–464.
https://doi.org/10.1107/S2059798317005745
Harris CR, Millman KJ, van der Walt SJ, Gommers R, Virtanen P, Cournapeau D, Wieser E, Taylor J, Berg S, Smith NJ, Kern R, Picus M, Hoyer S, van Kerkwijk MH, Brett M, Haldane A, Del Río JF, Wiebe M, Peterson P, Gérard-Marchant P, Sheppard K, Reddy T, Weckesser W, Abbasi H, Gohlke C, Oliphant TE
(2020) Array programming with numpy
Nature 585:357–362.
https://doi.org/10.1038/s41586-020-2649-2
(2005) Stable X chromosome inactivation involves the PRC1 polycomb complex and requires histone MACROH2A1 and the CULLIN3/SPOP ubiquitin E3 ligase
PNAS 102:7635–7640.
https://doi.org/10.1073/pnas.0408918102
(2017) Structural analysis of multi-component amyloid systems by chemometric saxs data decomposition
Structure 25:5–15.
https://doi.org/10.1016/j.str.2016.10.013
Janouskova H, El Tekle G, Bellini E, Udeshi ND, Rinaldi A, Ulbricht A, Bernasocchi T, Civenni G, Losa M, Svinkina T, Bielski CM, Kryukov GV, Cascione L, Napoli S, Enchev RI, Mutch DG, Carney ME, Berchuck A, Winterhoff BJN, Broaddus RR, Schraml P, Moch H, Bertoni F, Catapano CV, Peter M, Carr SA, Garraway LA, Wild PJ, Theurillat JPP
(2017) Opposing effects of cancer-type-specific SPOP mutants on BET protein degradation and sensitivity to BET inhibitors
Nature Medicine 23:1046–1054.
https://doi.org/10.1038/nm.4372
Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A, Bridgland A, Meyer C, Kohl SAA, Ballard AJ, Cowie A, Romera-Paredes B, Nikolov S, Jain R, Adler J, Back T, Petersen S, Reiman D, Clancy E, Zielinski M, Steinegger M, Pacholska M, Berghammer T, Bodenstein S, Silver D, Vinyals O, Senior AW, Kavukcuoglu K, Kohli P, Hassabis D
(2021) Highly accurate protein structure prediction with alphafold
Nature 596:583–589.
https://doi.org/10.1038/s41586-021-03819-2
Kent D, Bush EW, Hooper JE
(2006) Roadkill attenuates hedgehog responses through degradation of cubitus interruptus
Development 133:2001–2010.
https://doi.org/10.1242/dev.02370
Kim MS, Je EM, Oh JE, Yoo NJ, Lee SH
(2013) Mutational and expressional analyses of SPOP, a candidate tumor suppressor gene, in prostate, gastric and colorectal cancers
APMIS 121:626–633.
https://doi.org/10.1111/apm.12030
Krauthammer M, Kong Y, Ha BH, Evans P, Bacchiocchi A, McCusker JP, Cheng E, Davis MJ, Goh G, Choi M, Ariyan S, Narayan D, Dutton-Regester K, Capatana A, Holman EC, Bosenberg M, Sznol M, Kluger HM, Brash DE, Stern DF, Materin MA, Lo RS, Mane S, Ma S, Kidd KK, Hayward NK, Lifton RP, Schlessinger J, Boggon TJ, Halaban R
(2012) Exome sequencing identifies recurrent somatic Rac1 mutations in melanoma
Nature Genetics 44:1006–1014.
https://doi.org/10.1038/ng.2359
Kwon JE, La M, Oh KH, Oh YM, Kim GR, Seol JH, Baek SH, Chiba T, Tanaka K, Bang OS, Joe CO, Chung CH
(2006) Btb domain-containing speckle-type POZ protein (Spop) serves as an adaptor of Daxx for ubiquitination by Cul3-based ubiquitin ligase
The Journal of Biological Chemistry 281:12664–12672.
https://doi.org/10.1074/jbc.M600204200
(2020) Combining molecular dynamics simulations with small-angle X-ray and neutron scattering data to study multi-domain proteins in solution
PLOS Computational Biology 16:e1007870.
https://doi.org/10.1371/journal.pcbi.1007870
(2012) Exome sequencing of serous endometrial tumors identifies recurrent somatic mutations in chromatin-remodeling and ubiquitin ligase complex genes
Nature Genetics 44:1310–1315.
https://doi.org/10.1038/ng.2455
Li C, Ao J, Fu J, Lee D-F, Xu J, Lonard D, O’Malley BW
(2011) Tumor-Suppressor role for the Spop ubiquitin ligase in signal-dependent proteolysis of the oncogenic co-activator SRC-3/AIB1
Oncogene 30:4350–4364.
https://doi.org/10.1038/onc.2011.151
Lynch M
(2012) The evolution of multimeric protein assemblages
Molecular Biology and Evolution 29:1353–1366.
https://doi.org/10.1093/molbev/msr300
Marsh JA, Teichmann SA
(2015) Structure, dynamics, assembly, and evolution of protein complexes
Annual Review of Biochemistry 84:551–575.
https://doi.org/10.1146/annurev-biochem-060614-034142
Marzahn MR, Marada S, Lee J, Nourse A, Kenrick S, Zhao H, Ben-Nissan G, Kolaitis RM, Peters JL, Pounds S, Errington WJ, Privé GG, Taylor JP, Sharon M, Schuck P, Ogden SK, Mittag T
(2016) Higher-Order oligomerization promotes localization of spop to liquid nuclear speckles
The EMBO Journal 35:1254–1275.
https://doi.org/10.15252/embj.201593169
McGibbon RT, Beauchamp KA, Harrigan MP, Klein C, Swails JM, Hernández CX, Schwantes CR, Wang LP, Lane TJ, Pande VS
(2015) MDTraj: A modern open library for the analysis of molecular dynamics trajectories
Biophysical Journal 109:1528–1532.
https://doi.org/10.1016/j.bpj.2015.08.015
(2021) regals: A general method to deconvolve X-ray scattering data from evolving mixtures
IUCrJ 8:225–237.
https://doi.org/10.1107/S2052252521000555
(1953) Equation of state calculations by fast computing machines
J Chem Phys 21:1087–1092.
https://doi.org/10.1063/1.1699114
Oosawa F, Kasai M
(1962) A theory of linear and helical aggregations of macromolecules
Journal of Molecular Biology 4:10–21.
https://doi.org/10.1016/s0022-2836(62)80112-0
Parrinello M, Rahman A
(1981) Polymorphic transitions in single crystals: A new molecular dynamics method
Journal of Applied Physics 52:7182–7190.
https://doi.org/10.1063/1.328693
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E
(2011) Scikit-learn: Machine learning in python
Journal of Machine Learning Research 12:2825–2830.
Pesce F, Lindorff-Larsen K
(2021) Refining conformational ensembles of flexible proteins against small-angle X-ray scattering data
Biophysical Journal 120:5124–5135.
https://doi.org/10.1016/j.bpj.2021.10.003
Pesce F
(2023) BLOCKING
GitHub.
https://github.com/fpesceKU/BLOCKING
Pettersen EF, Goddard TD, Huang CC, Meng EC, Couch GS, Croll TI, Morris JH, Ferrin TE
(2021) UCSF chimerax: Structure visualization for researchers, educators, and developers
Protein Science 30:70–82.
https://doi.org/10.1002/pro.3943
Pierce WK, Grace CR, Lee J, Nourse A, Marzahn MR, Watson ER, High AA, Peng J, Schulman BA, Mittag T
(2016) Multiple weak linear motifs enhance recruitment and processivity in SPOP-mediated substrate ubiquitination
Journal of Molecular Biology 428:1256–1271.
https://doi.org/10.1016/j.jmb.2015.10.002
Sali A, Blundell TL
(1993) Comparative protein modelling by satisfaction of spatial restraints
Journal of Molecular Biology 234:779–815.
https://doi.org/10.1006/jmbi.1993.1626
(2020) Protein network structure enables switching between liquid and gel states
Journal of the American Chemical Society 142:874–883.
https://doi.org/10.1021/jacs.9b10066
(2021) Structure and energetics of GTP- and GDP-tubulin isodesmic self-association
ACS Chemical Biology 16:2212–2227.
https://doi.org/10.1021/acschembio.1c00369
Souza PCT, Alessandri R, Barnoud J, Thallmair S, Faustino I, Grünewald F, Patmanidis I, Abdizadeh H, Bruininks BMH, Wassenaar TA, Kroon PC, Melcr J, Nieto V, Corradi V, Khan HM, Domański J, Javanainen M, Martinez-Seara H, Reuter N, Best RB, Vattulainen I, Monticelli L, Periole X, Tieleman DP, de Vries AH, Marrink SJ
(2021) Martini 3: A general purpose force field for coarse-grained molecular dynamics
Nature Methods 18:382–388.
https://doi.org/10.1038/s41592-021-01098-3
Studier FW
(2005) Protein production by auto-induction in high density shaking cultures
Protein Expression and Purification 41:207–234.
https://doi.org/10.1016/j.pep.2005.01.016
(2021) Accurate model of liquid-liquid phase behavior of intrinsically disordered proteins from optimization of single-chain properties
PNAS 118:e2111696118.
https://doi.org/10.1073/pnas.2111696118
Tesei G, Lindorff-Larsen K
(2023) Improved predictions of phase behaviour of intrinsically disordered proteins by tuning the interaction range
Open Research Europe 2:94.
https://doi.org/10.12688/openreseurope.14967.2
Theurillat J-PP, Udeshi ND, Errington WJ, Svinkina T, Baca SC, Pop M, Wild PJ, Blattner M, Groner AC, Rubin MA, Moch H, Prive GG, Carr SA, Garraway LA
(2014) Prostate cancer. Ubiquitylome analysis identifies dysregulation of effector substrates in SPOP-mutant prostate cancer
Science 346:85–89.
https://doi.org/10.1126/science.1250255
Thomasen FE, Lindorff-Larsen K
(2022) Conformational ensembles of intrinsically disordered proteins and flexible multidomain proteins
Biochemical Society Transactions 50:541–554.
https://doi.org/10.1042/BST20210499
(2022) Improving martini 3 for disordered and multidomain proteins
Journal of Chemical Theory and Computation 18:2033–2041.
https://doi.org/10.1021/acs.jctc.1c01042
(2013) Structural basis of high-order oligomerization of the cullin-3 adaptor Spop
Acta Crystallographica. Section D, Biological Crystallography 69:1677–1684.
https://doi.org/10.1107/S0907444913012687
Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, Burovski E, Peterson P, Weckesser W, Bright J, van der Walt SJ, Brett M, Wilson J, Millman KJ, Mayorov N, Nelson ARJ, Jones E, Kern R, Larson E, Carey CJ, Polat İ, Feng Y, Moore EW, VanderPlas J, Laxalde D, Perktold J, Cimrman R, Henriksen I, Quintero EA, Harris CR, Archibald AM, Ribeiro AH, Pedregosa F, van Mulbregt P, SciPy 1.0 Contributors
(2020) Author correction: Scipy 1.0: Fundamental algorithms for scientific computing in python
Nature Methods 17:352.
https://doi.org/10.1038/s41592-020-0772-5
(2014) Going backward: A flexible geometric approach to reverse transformation from coarse grained to atomistic models
Journal of Chemical Theory and Computation 10:676–690.
https://doi.org/10.1021/ct400617g
(2015) Computational lipidomics with insane: A versatile tool for generating custom membranes for molecular simulations
Journal of Chemical Theory and Computation 11:2144–2155.
https://doi.org/10.1021/acs.jctc.5b00209
Zhang Q, Zhang L, Wang B, Ou CY, Chien CT, Jiang J
(2006) A Hedgehog-induced BTB protein modulates hedgehog signaling by degrading ci/gli transcription factor
Developmental Cell 10:719–729.
https://doi.org/10.1016/j.devcel.2006.05.004
Zhang Q, Shi Q, Chen Y, Yue T, Li S, Wang B, Jiang J
(2009) Multiple ser/thr-rich degrons mediate the degradation of ci/gli by the cul3-HIB/SPOP E3 ubiquitin ligase
PNAS 106:21191–21196.
https://doi.org/10.1073/pnas.0912008106
Zhu K, Lei PJ, Ju LG, Wang X, Huang K, Yang B, Shao C, Zhu Y, Wei G, Fu XD, Li L, Wu M
(2017) SPOP-containing complex regulates SETD2 stability and h3k36me3-coupled alternative splicing
Nucleic Acids Research 45:92–105.
https://doi.org/10.1093/nar/gkw814
Zhuang M, Calabrese MF, Liu J, Waddell MB, Nourse A, Hammel M, Miller DJ, Walden H, Duda DM, Seyedin SN, Hoggard T, Harper JW, White KP, Schulman BA
(2009) Structures of SPOP-substrate complexes: Insights into molecular architectures of BTB-cul3 ubiquitin ligases
Molecular Cell 36:39–50.
https://doi.org/10.1016/j.molcel.2009.09.022

Article and author information

Author details

F Emil Thomasen

Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark

Contribution
Conceptualization, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing - original draft, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-2096-4873

Matthew J Cuneo

Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, United States

Contribution
Investigation, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-1475-6656

Tanja Mittag

Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, United States

Contribution
Conceptualization, Resources, Supervision, Funding acquisition, Writing – review and editing

Competing interests
Tanja Mittag was a consultant for Faze Medicines, Inc.

ORCID iD: 0000-0002-1827-3811

Kresten Lindorff-Larsen

Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark

Contribution
Conceptualization, Resources, Supervision, Methodology, Project administration, Writing – review and editing

For correspondence
lindorff@bio.ku.dk

Competing interests
No competing interests declared

ORCID iD: 0000-0002-4750-6039

Funding

Lundbeckfonden (R155-2015-2666)

Kresten Lindorff-Larsen

Novo Nordisk Fonden (NNF18OC0033950)

Kresten Lindorff-Larsen

National Institutes of Health (R01GM112846)

Tanja Mittag

American Lebanese Syrian Associated Charities

Tanja Mittag

Novo Nordisk Fonden (NNF18OC0032608)

Kresten Lindorff-Larsen

National Institutes of Health (P30GM133893)

Tanja Mittag

DOE Office of Science's Biological and Environmental Research (KP1605010)

Tanja Mittag

National Institutes of Health (OD012331)

Tanja Mittag

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work was supported by the Lundbeck Foundation BRAINSTRUC structural biology initiative (R155-2015-2666, to K.L.-L.), the PRISM (Protein Interactions and Stability in Medicine and Genomics) centre funded by the Novo Nordisk Foundation (NNF18OC0033950, to K.L.-L.), by NIH grant R01GM112846 (to T.M.) and by the American Lebanese Syrian Associated Charities (to T.M.). We acknowledge access to computational resources from the ROBUST Resource for Biomolecular Simulations (supported by the Novo Nordisk Foundation; NNF18OC0032608), the Danish National Supercomputer for Life Sciences (Computerome), and the Biocomputing Core Facility at the Department of Biology, University of Copenhagen. We thank Melissa R Marzahn and Erik W Martin for the generation of preliminary data. We thank Shirish Chodankar for assistance with SAXS data collection and reduction. The LiX beamline is part of the Center for BioMolecular Structure (CBMS), which is primarily supported by the National Institutes of Health, National Institute of General Medical Sciences (NIGMS) through a P30 Grant (P30GM133893), and by the DOE Office of Biological and Environmental Research (KP1605010). LiX also received additional support from NIH Grant S10 OD012331. As part of NSLS-II, a national user facility at Brookhaven National Laboratory, work performed at the CBMS is supported in part by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences Program under contract number DE-SC0012704.

Version history

Preprint posted: October 8, 2022 (view preprint)
Received: October 12, 2022
Accepted: February 20, 2023
Accepted Manuscript published: March 1, 2023 (version 1)
Version of Record published: March 9, 2023 (version 2)

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.