Chapter 7 - CellOrganizer: Image-Derived Models of Subcellular Organization and Protein Distribution

https://doi.org/10.1016/B978-0-12-388403-9.00007-2

Abstract

This chapter describes approaches for learning models of subcellular organization from images. The primary utility of these models is expected to come from their incorporation into complex simulations of cell behaviors. Most current cell simulations either do not consider the spatial organization of proteins at all or treat each organelle type as a single, idealized compartment. The ability to build generative models for all proteins in a proteome and use them for spatially accurate simulations is expected to improve the accuracy of models of cell behaviors. A second use, of potentially equal importance, is expected to be in testing and comparing software for analyzing cell images. The complexity and sophistication of algorithms used in cell-image-based screens and assays (variously referred to as high-content screening, high-content analysis, or high-throughput microscopy) are continuously increasing, and generative models can be used to produce test images for which the expected answer is known.

Introduction

As traditional reductionist paradigms of biomedical research increasingly give way to systems approaches, the need to build predictive models that synthesize large amounts of information from potentially diverse sources is becoming critical. Most current models of this kind take the form of transcriptional regulatory networks, protein–protein interaction maps, or biochemical reaction simulations. These typically do not consider the spatial organization of cells or tissues. Important advances came with systems such as MCell (Stiles et al., 1998), which allowed models to be constructed using mesh representations of cells built from electron microscope images, and the Virtual Cell (Loew and Schaff, 2001), which allowed appropriately processed images to provide surface area and volume for its compartmental models. Ontologies such as the Gene Ontology (GO) can be used to describe protein attributes, including location, primarily at the level of major organelles. Such assignments can also be used to create compartmental models (e.g., http://biologicalnetworks.net/tutorials). However, compartmental models suffer from some important limitations: they treat all molecules within each compartment as being homogeneously distributed, and they do not allow appearance, disappearance, fission, or fusion of compartments.

Given the energy expended by cells to maintain their subcellular organization, and the many defects that are associated with alterations in it, models that do not accurately reflect subcellular organization are unlikely to perform satisfactorily at predicting complex cell behaviors or how cells respond to changes in conditions. There is therefore a need for computational models that accurately represent the number, size, shape, and positions of subcellular structures, the spatial relationships between different structures, and how proteins (and other molecules) are distributed between them (Murphy, 2010, Murphy, 2011). In addition, there is a need for a mechanism for representing how all of these vary within a population of cells of a single cell type, within a single cell type under different conditions, among different cell types, and among different organisms. Such models can not only capture cell behavior but can also be an important step in understanding that behavior, since, for example, a sufficiently detailed model helps distinguish aspects that are conserved and presumably necessary from those that are highly variable and potentially not necessary.

In considering how to build such models, we can distinguish descriptive models, which allow one to recognize what state a particular cell is in, from generative models, which can also synthesize new examples of cells in particular states. We can also distinguish theoretical or conceptual models, which posit a particular structure based on a generalized understanding, from data-driven models that are learned from data and capture both general behavior and variation in that behavior.
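
The distinction between descriptive and generative use can be made concrete with a deliberately small sketch. It assumes a single hypothetical feature (nuclear area) and a one-dimensional Gaussian per cell state; the feature, the state names, the numbers, and all function names are illustrative assumptions and are not taken from the chapter or from CellOrganizer. The same data-driven model is used first to recognize which state a measurement most likely came from, and then to synthesize new examples.

```python
import math
import random
import statistics


def fit_gaussian(values):
    """Learn a 1-D Gaussian (mean, standard deviation) from measurements."""
    return statistics.mean(values), statistics.stdev(values)


def log_likelihood(x, model):
    """Log density of x under a fitted Gaussian model."""
    mean, std = model
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def recognize(x, models):
    """Descriptive use: return the state whose model best explains x."""
    return max(models, key=lambda state: log_likelihood(x, models[state]))


def synthesize(model, n, rng):
    """Generative use: draw n new feature values from the fitted model."""
    mean, std = model
    return [rng.gauss(mean, std) for _ in range(n)]


# Hypothetical nuclear areas (square micrometres) for two made-up cell states.
training = {
    "untreated": [98.0, 102.0, 100.0, 97.0, 103.0],
    "treated": [148.0, 152.0, 150.0, 147.0, 153.0],
}
models = {state: fit_gaussian(values) for state, values in training.items()}

rng = random.Random(0)
state = recognize(105.0, models)                  # descriptive: which state?
samples = synthesize(models["treated"], 3, rng)   # generative: new examples
```

Because the parameters are estimated from measurements rather than posited, the model captures both the typical value and the variation around it, which is what separates a data-driven model from a purely conceptual one.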

My focus in this chapter will be primarily on methods developed in my group that have been used to learn generative models of cell organization and protein distribution from two-dimensional and three-dimensional fluorescence microscope images (Zhao and Murphy, 2007, Rohde et al., 2008a, Rohde et al., 2008b, Peng et al., 2009, Shariff et al., 2010a, Shariff et al., 2011, Peng and Murphy, 2011). We have recently grouped these methods as part of the open source CellOrganizer project (http://cellorganizer.org), which includes collaborations with a number of investigators studying particular cell systems.

Section snippets

Components of a Model of Subcellular Organization and Protein Distribution

Although there are a number of ways to break down the tasks necessary for creating such models, we can distinguish at least three major components of a model of the distribution of proteins within cells of a given type under a given condition:

  • A model of subcellular organization, including distributions of the number, size, shape, and position of each subcellular structure, any of which may be conditional on the model(s) for other structures;

  • A model representing the probability that a cell of a

Models of Subcellular Organization

At a conceptual level, the most complete model of subcellular organization is probably the GO cellular component ontology (Ashburner et al., 2000). A significant effort has been made to capture the vast majority of terms used to describe subcellular structures. The terms in this ontology can be assigned to proteins in order to represent the results of experimental or computational analyses. The advantage of this approach is precisely its disadvantage: general terms such as “mitochondria” can be

Protein Distributions Across Subcellular Structures

The models described above capture how cellular organelles are arranged within a cell, but do not address the critical question of how the tens of thousands of proteins in each cell are distributed among these organelles. Images, especially fluorescence microscope images, can be a major source of information on the subcellular distributions of proteins, and, as mentioned above, may be used directly in cell simulations. The feasibility of using automated pattern recognition approaches to

Use of Models for Testing Algorithms

A classic problem in testing algorithms for microscope images is that the correct results are frequently not known. A generative model for a desired pattern or structure can be combined with a model of image formation in a particular microscope to generate test images (phantoms) for which the correct results from image analysis are known. The process by which an image is formed in a microscope is quite well understood, so accurate models of point-spread functions and sampling noise can be
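
As a rough illustration of the idea above, the following sketch generates a tiny two-dimensional phantom with known ground truth. It rests on simplifying assumptions that do not come from the chapter: objects are single points placed uniformly at random, the point-spread function is approximated by a small Gaussian kernel, and sampling noise is modeled as Poisson shot noise over a constant background. All function names and parameter values are hypothetical.

```python
import math
import random


def place_objects(size, count, rng):
    """Generative step: choose `count` distinct pixel positions (the ground truth)."""
    pixels = [(r, c) for r in range(size) for c in range(size)]
    return rng.sample(pixels, count)


def gaussian_psf(radius, sigma):
    """Normalized 2-D Gaussian kernel standing in for a real point-spread function."""
    kernel = {}
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            kernel[(dr, dc)] = math.exp(-(dr * dr + dc * dc) / (2 * sigma ** 2))
    total = sum(kernel.values())
    return {offset: weight / total for offset, weight in kernel.items()}


def blur(size, positions, psf, photons_per_object):
    """Image formation: spread each object's expected photons through the PSF."""
    image = [[0.0] * size for _ in range(size)]
    for r, c in positions:
        for (dr, dc), weight in psf.items():
            rr, cc = r + dr, c + dc
            if 0 <= rr < size and 0 <= cc < size:
                image[rr][cc] += photons_per_object * weight
    return image


def poisson(mean, rng):
    """Knuth's Poisson sampler; adequate for the small means used here."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def make_phantom(size=32, count=5, photons_per_object=100.0, background=2.0, seed=0):
    """Return (ground-truth positions, noisy integer image)."""
    rng = random.Random(seed)
    positions = place_objects(size, count, rng)
    expected = blur(size, positions, gaussian_psf(3, 1.0), photons_per_object)
    noisy = [[poisson(value + background, rng) for value in row] for row in expected]
    return positions, noisy


truth, image = make_phantom()
```

A detection algorithm under test would be run on `image` and its output compared against `truth`. Replacing the uniform placement with a learned model of object number and position is where an image-derived generative model would enter.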

Conclusion

In this chapter, I have described current approaches for building accurate models of cell organization directly from fluorescence microscope images. These models capture variation in cell organization at the level of the nucleus, cell membrane, and individual organelles, and can capture how particular proteins are distributed among cellular components. They represent a significant advance over the use of words (such as GO terms) as the means by which results of experiments on subcellular

Acknowledgments

I express my thanks to Michael Boland, Meel Velliste, Ting Zhao, Tao Peng, Luis Coelho, Wei Wang, and especially Gustavo Rohde for their contributions to previous collaborative work described here and for many helpful discussions with them and with Jieyue Li, Taraz Buck, Ivan Cao-Berg, Baek Hwan Cho, Klaus Palme, Hagit Shatkay, Joel Stiles, Leslie Loew, Ion Moraru, Christoph Wülfing, Eric Xing, Gaudenz Danuser, Karl Rohr, and Ivo Sbalzarini. Much of the original work reviewed here was supported

References (25)

  • G.K. Rohde et al. (2008). Deformation-based nonlinear dimension reduction: applications to nuclear morphometry. Proc. 2008 Int. Symp. Biomed. Imaging.

  • T. Peng et al. (2009). Instance-based generative biological shape modeling. Proc. 2009 Int. Symp. Biomed. Imaging.

Cited by (40)

    • Review of cell image synthesis for image processing

      2022, Biomedical Image Synthesis and Simulation: Methods and Applications
    • An Open-Source Mesh Generation Platform for Biophysical Modeling Using Realistic Cellular Geometries

      2020, Biophysical Journal
      Citation Excerpt:

      On the other hand, to gain better insight into how cellular geometry can affect the dynamics of these mechanochemical processes, using realistic geometries is necessary. Already, freely available tools such as Virtual Cell (6) and CellOrganizer (7) have paved the way for using realistic cellular geometries in simulations. With the increasing availability of high-resolution images of the cellular ultrastructure, including the size and shape of organelles and the curvature of the various cellular membranes, there is a need for computational tools and algorithms that can enable us to use these data as the geometry or domain of interest and conduct simulations using numerical methods (8).

    • Phenotypic Image Analysis Software Tools for Exploring and Understanding Big Image Data from Cell-Based Assays

      2018, Cell Systems
      Citation Excerpt:

      Generative approaches capture variation in a population and encode it as a probability distribution, or generative models, which can use this information to synthesize new examples of cells in a particular state. The CellOrganizer software package is able to generate models of individual cells by modeling the structure of subcellular compartments on data from high-resolution microscopy images (Murphy, 2012). CytoGAN is a recent approach that trains GANs to synthesize realistic cell images that are useful for exploring morphological variations within or between populations of cells (Goldsborough et al., 2017).

    • Opportunities and Challenges in Building a Spatiotemporal Multi-scale Model of the Human Pancreatic β Cell

      2018, Cell
      Citation Excerpt:

      For example, Earnest et al. (2017) extracted cell geometry from cryo-electron tomography experiments and used stochastic simulations to study the effect of cell structure on reaction network. Other studies have relied on fluorescent images to extract ultrastructure and spatial distribution of proteins for cell modeling using V-Cell or M-Cell (Murphy, 2012, 2016). These methods do not represent cellular environments, including proteins, lipids, and macromolecular assemblies, as three-dimensional structures.

    • 3D high-content screening of organoids for drug discovery

      2017, Comprehensive Medicinal Chemistry III
    • Building cell models and simulations from microscope images

      2016, Methods
      Citation Excerpt:

      Converting these representations into generative models differs greatly in the amount of training data required – learning a statistical model of the variation of two or three axis lengths requires far less data than accurately capturing the relationships between hundreds or thousands of minor surface variations. Over a number of years and contributions from a number of participants, the open source CellOrganizer system has been created as a step towards meeting the need for learning and using image-based generative cell models [14,16–23]. The basic principles of the CellOrganizer pipeline are illustrated in Fig. 1, and are generally applicable to efforts in this area.
