Dynamo Catalogue: Geometrical tools and data management for particle picking in subtomogram averaging of cryo-electron tomograms

https://doi.org/10.1016/j.jsb.2016.06.005

Abstract

Cryo-electron tomography allows macromolecular complexes within vitrified, intact, thin cells or sections thereof to be visualized, and structural analysis to be performed in situ by averaging over multiple copies of the same molecules. Image processing for subtomogram averaging is specific and cumbersome, due to the large amount of data and its three-dimensional nature and anisotropic resolution. Here, we streamline data processing for subtomogram averaging by introducing an archiving system, Dynamo Catalogue. This system manages tomographic data from multiple tomograms and allows visual feedback during all processing steps, including particle picking, extraction, alignment and classification. The file structure of a processing project includes logfiles of performed operations, and can be backed up and shared between users. Command line commands, database queries and a set of GUIs give the user versatile control over the process. We further introduce a set of geometric tools that streamline particle picking from simple (filaments, spheres, tubes, vesicles) and complex geometries (arbitrary 2D surfaces, rare instances on proteins with geometric restrictions, and 2D and 3D crystals). Advanced functionality, such as manual alignment and subboxing, is useful when initial templates are generated for alignment and for project customization. Dynamo Catalogue is part of the open source package Dynamo and includes tools to ensure format compatibility with the subtomogram averaging functionalities of other packages, such as Jsubtomo, PyTom, PEET, EMAN2, XMIPP and Relion.

Introduction

Cryo-electron tomography (cryo-ET), with its unique capacity for the three-dimensional (3D) visualization of macromolecular complexes in a close-to-native state, is a rapidly developing technology (Harapin et al., 2013, Lucic et al., 2013). Subtomogram averaging (STA) recovers the structure of a given macromolecule by locating multiple noisy copies of the object of interest in one or several tomograms, and integrating them into one or multiple structures with a higher signal-to-noise ratio (SNR) (Briggs, 2013). This integration typically consists of an alignment step that accurately locates and imparts a common orientation to the initially differently oriented particles, followed by averaging of the aligned particles and classification of the particles into more homogeneous groups. The procedure has been widely used, with applications ranging from large macromolecular complexes inside the native context (Beck et al., 2007, Kudryashev et al., 2013, Pigino et al., 2011) to membrane proteins on membranes (Faini et al., 2013, Pfeffer et al., 2012) or isolated protein complexes (Dudkina et al., 2011). Currently, 11% of the structures deposited in the Electron Microscopy Data Bank (EMDB) have been solved by STA. Generally, there is a correlation between the number of particles and the resolution obtained (Kudryashev et al., 2012), suggesting that a larger number of processed particles results in a higher final resolution. Several structures determined by STA have sub-nanometer resolution, revealing the secondary structure of the proteins present (Bartesaghi et al., 2012, Schur et al., 2013, Schur et al., 2015); in each case, a large number of 3D subtomograms was required.
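The SNR gain obtained by averaging aligned copies can be illustrated with a toy calculation. The following is a minimal sketch, not part of Dynamo: it assumes a hypothetical one-dimensional "particle" (a sine wave), copies that are already perfectly aligned, and independent Gaussian noise. All function names and parameter values are chosen for illustration only.

```python
import math
import random


def make_signal(n_samples):
    """Hypothetical 1D 'particle': one period of a sine wave."""
    return [math.sin(2.0 * math.pi * i / n_samples) for i in range(n_samples)]


def noisy_copy(signal, sigma, rng):
    """Return the signal with independent Gaussian noise added to each sample."""
    return [s + rng.gauss(0.0, sigma) for s in signal]


def average(copies):
    """Sample-wise mean of equally long, already aligned copies."""
    n = len(copies)
    return [sum(values) / n for values in zip(*copies)]


def rms_error(estimate, truth):
    """Root-mean-square deviation of an estimate from the noise-free signal."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth))


rng = random.Random(0)   # fixed seed so the run is reproducible
truth = make_signal(256)
sigma = 2.0              # assumed noise level, well above the signal amplitude

single = noisy_copy(truth, sigma, rng)
averaged = average([noisy_copy(truth, sigma, rng) for _ in range(400)])

err_single = rms_error(single, truth)
err_avg = rms_error(averaged, truth)
print(f"RMS error, 1 copy: {err_single:.3f}")
print(f"RMS error, 400 copies averaged: {err_avg:.3f}")
```

For independent noise, the residual error of the average is expected to fall roughly as the square root of the number of copies. Real subtomograms additionally require alignment and compensation for anisotropic resolution, both of which this sketch omits.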

Several software packages are available for the 3D processing of multiple subtomograms. These include AV3 (Forster et al., 2005), Jsubtomo (Huiskonen et al., 2010), PyTom (Hrabe et al., 2012), EMAN2 (Galaz-Montoya et al., 2015), PEET (Nicastro et al., 2006), Relion (Bharat et al., 2015), and Dynamo (Castano-Diez et al., 2012).

Subtomogram analysis starts with the extraction of subtomograms from one or several tomographic reconstructions. These 3D subtomograms are referred to as particles in the following. The location, orientation and exact number of particles of interest in tomograms are not defined a priori. In STA, “particle picking” commonly refers to a set of actions carried out automatically or with input from the operator, with the goal of approximately determining the positions and possibly also the orientations of subtomograms, to produce a set of particles, each particle containing a unique single copy of a macromolecule of interest, accompanied by adequately formatted metadata. Compared to two-dimensional (2D) particle picking in micrographs for single particle analysis, picking 3D particles from cryo-ET volumes has two specific difficulties. First, the data are 3D volumes which usually suffer from anisotropic resolution, making their visualization more complex. Second, particle identification is more challenging for both automated methods and human operators, since tomography typically images 3D particles in their native context as opposed to isolated single particles.
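The extraction step described above can be sketched as cropping a cubic box around each picked position and storing it together with minimal metadata. This is an illustrative sketch only: the volume is a nested Python list indexed as volume[z][y][x], the function name `extract_subtomogram` and the metadata fields are hypothetical, and picks whose box would extend beyond the tomogram are simply skipped.

```python
def extract_subtomogram(volume, center, box):
    """Crop a cubic box of side `box` around `center` = (z, y, x) voxel indices.

    Returns None if the box would extend past the volume boundary.
    """
    half = box // 2
    dims = (len(volume), len(volume[0]), len(volume[0][0]))
    starts = [c - half for c in center]
    for start, dim in zip(starts, dims):
        if start < 0 or start + box > dim:
            return None
    z0, y0, x0 = starts
    return [
        [row[x0:x0 + box] for row in plane[y0:y0 + box]]
        for plane in volume[z0:z0 + box]
    ]


# Toy 8x8x8 "tomogram" whose voxel value encodes its own coordinates.
size = 8
tomogram = [
    [[100 * z + 10 * y + x for x in range(size)] for y in range(size)]
    for z in range(size)
]

picked_positions = [(4, 4, 4), (1, 4, 4)]  # the second pick is too close to the edge
particles = []
for tag, position in enumerate(picked_positions, start=1):
    sub = extract_subtomogram(tomogram, position, box=4)
    if sub is not None:
        # Hypothetical metadata record: particle tag and source position.
        particles.append({"tag": tag, "position": position, "data": sub})

print(len(particles), "particle(s) extracted")
```

Keeping the tag and the source position with each cropped volume is what allows later steps to trace a particle back to its tomogram.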

Further, in spite of the common goal, in tomography the generic term “particle picking” expresses different procedural approaches in different scenarios. Determination of particle locations might involve purely automatic methods relying on image analysis of the tomograms, such as a pattern matching-based identification, as implemented in molmatch (Frangakis et al., 2002), regular picking from a surface, e.g., a bacterial membrane (Amat et al., 2010), or picking from a tubular crystal (Bharat et al., 2012). Alternatively, particle picking might be based on visual inspection and manually targeting the structure of interest on a computer monitor. Integration of a priori information arising from geometric constraints is a frequent requirement: the particles to average might lie on cellular membranes (Kudryashev et al., 2013), on virus capsids (Huiskonen et al., 2010), on lipid vesicles (Faini et al., 2013), along the axial path of tubular structures (Nicastro et al., 2006, Pigino et al., 2011), or might be the repeating subunits that generate such structures. Sometimes, the particle coordinates might be additionally constrained by the presence of crystalline order (2D or 3D) or symmetry, like the single vertices of an icosahedral virus (Gil-Carton et al., 2015), and the symmetry assumption might need to be partially weakened to accommodate the actual behavior of the observed data (Peralta et al., 2013). Also, related geometric surfaces might be fully or partially populated with particles (Maurer et al., 2013). The various possible particle arrangements require the use of different combinations of automated and manual particle picking procedures.
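As an illustration of how a geometric constraint can supply both positions and initial orientations, the sketch below places candidate particles on an idealized spherical vesicle. It is a generic example under stated assumptions, not the algorithm used by Dynamo: the vesicle is a perfect sphere with known center and radius, points are spread with a Fibonacci (golden-angle) spiral, and the outward surface normal serves as the initial orientation. The function name `vesicle_picks` and all numeric values are hypothetical.

```python
import math


def vesicle_picks(center, radius, n_points):
    """Return (position, normal) pairs spread roughly evenly over a sphere.

    Positions and normals are (x, y, z) tuples; each normal is the unit
    outward radial direction at the corresponding position.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    picks = []
    for i in range(n_points):
        z = 1.0 - 2.0 * (i + 0.5) / n_points   # strictly inside (-1, 1)
        r = math.sqrt(1.0 - z * z)             # radius of the circle at height z
        phi = i * golden_angle
        normal = (r * math.cos(phi), r * math.sin(phi), z)
        position = tuple(c + radius * n for c, n in zip(center, normal))
        picks.append((position, normal))
    return picks


picks = vesicle_picks(center=(100.0, 120.0, 60.0), radius=40.0, n_points=200)
first_position, first_normal = picks[0]
print("first pick:", first_position, "normal:", first_normal)
```

Because each pick carries a normal, a subsequent alignment only needs to search a limited angular range around that direction, plus the free rotation about the normal.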

Here, we present an extension to the Dynamo package, called Dynamo Catalogue, which facilitates automated or semi-automated subtomogram particle picking from various geometries used in STA, such as isolated particles, filaments, vesicles, arbitrary 2D surfaces and 2D and 3D crystals. This toolbox is an integral part of the Dynamo workflow and allows scripting of repetitive tasks through the command line. Dynamo Catalogue features a graphical user interface (GUI)-enabled toolbox that assists in the management of the data and metadata, including geometry and metadata conversions, and offers pipelines for the STA workflow from particle picking to STA and classification. Dynamo Catalogue is available within the Dynamo package as open source at http://www.dynamo-em.org.

Data management system in Dynamo

In order to streamline STA projects we have implemented a database that integrates particle picking and extraction into the Dynamo workflow. The database is organized as a catalogue system with a user-accessible GUI (dynamo_catalogue_manager; here and below the Dynamo commands are stylized in courier font) and a set of command-line tools for scripting. The objects of the database are (1) links to tomograms and (2) models describing particle geometry (Fig. 1). Properties of the tomograms include

Conclusions

We present a set of semi-automated particle picking tools for subtomogram processing that are integrated in the Dynamo pipeline. These tools accelerate particle picking and at the same time provide initial orientations for the particles. This orientation information allows local searches for particle shifts and rotations to be performed, which speeds up processing and minimizes the number of misaligned particles. Knowledge of initial orientations for all or a subset of the particles ensures

Acknowledgements

We thank Nikolaus Stahlberg for support with beta-testing, Marcel Arheit, Julia Kowal, Juha Huiskonen, Alex Noble and Sai Li for useful discussions and user feedback, and Takashi Ishikawa, Bara Malkova, and Morgan Beeby for sharing tomographic data for visualization purposes. Numerical experiments were performed with resources from the Swiss National Supercomputing Center. This work was in part supported by the Swiss Initiative for Systems Biology SystemsX.ch (RTD CINA) and by the Swiss
