CellOrganizer: Image-Derived Models of Subcellular Organization and Protein Distribution

doi:10.1016/B978-0-12-388403-9.00007-2

Methods in Cell Biology

Volume 110, 2012, Pages 179-193

Abstract

This chapter describes approaches for learning models of subcellular organization from images. The primary utility of these models is expected to be from incorporation into complex simulations of cell behaviors. Most current cell simulations do not consider spatial organization of proteins at all, or treat each organelle type as a single, idealized compartment. The ability to build generative models for all proteins in a proteome and use them for spatially accurate simulations is expected to improve the accuracy of models of cell behaviors. A second use, of potentially equal importance, is expected to be in testing and comparing software for analyzing cell images. The complexity and sophistication of algorithms used in cell-image-based screens and assays (variously referred to as high-content screening, high-content analysis, or high-throughput microscopy) are continuously increasing, and generative models can be used to produce images for testing these algorithms in which the expected answer is known.

Introduction

As traditional reductionist paradigms of biomedical research increasingly give way to systems approaches, the need to build predictive models that synthesize large amounts of information from potentially diverse sources is becoming critical. Most such current models take the form of transcriptional regulatory networks, protein–protein interaction maps, or biochemical reaction simulations. These typically do not consider spatial organization of cells or tissues. Important advances came with systems such as MCell (Stiles et al., 1998), which allowed models to be constructed using mesh representations of cells built from electron microscope images, and the Virtual Cell (Loew and Schaff, 2001), which allowed appropriately processed images to provide surface area and volume for its compartmental models. Ontologies such as the Gene Ontology (GO) can be used to describe protein attributes, including location, primarily at a major organelle level. Such assignments can also be used to create compartmental models (e.g., http://biologicalnetworks.net/tutorials). However, compartmental models suffer from some important limitations, in that they treat all molecules within each compartment as being homogeneously distributed, and they do not allow appearance, disappearance, fission, or fusion of compartments.

Given the energy expended by cells to maintain their subcellular organization, and the many defects that are associated with alterations in it, models that do not accurately reflect subcellular organization are unlikely to perform satisfactorily at predicting complex cell behaviors or how they respond to changes in conditions. There is therefore a need for computational models that accurately represent the number, size, shape, and positions of subcellular structures, the spatial relationships between different structures, and how proteins (and other molecules) are distributed between them (Murphy, 2010, Murphy, 2011). In addition, there is a need for a mechanism for representing how all of these vary within a population of cells of a single cell type, within a single cell type under different conditions, among different cell types, and among different organisms. Such models can not only capture cell behavior but can also be an important step in understanding that behavior, since, for example, a sufficiently detailed model helps distinguish aspects that are conserved and presumably necessary from those that are highly variable and potentially not necessary.

In considering how to build such models, we can distinguish descriptive models, which allow one to recognize what state a particular cell is in, from generative models, which can also synthesize new examples of cells in particular states. We can also distinguish theoretical or conceptual models, which posit a particular structure based on a generalized understanding, from data-driven models that are learned from data and capture both general behavior and variation in that behavior.
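The distinction between descriptive and generative use of a data-driven model can be illustrated with a minimal sketch. It assumes, purely for illustration, that each cell is summarized by two nuclear axis lengths and that a single multivariate Gaussian is an adequate model; the function names, the feature choice, and the simulated training data are hypothetical and are not taken from CellOrganizer.

```python
import numpy as np

def fit_gaussian(features):
    """Learn a mean vector and covariance matrix from an (n_cells, n_features) array."""
    features = np.asarray(features, dtype=float)
    return features.mean(axis=0), np.cov(features, rowvar=False)

def log_likelihood(x, mean, cov):
    """Descriptive use: score how typical one cell is under the learned model."""
    diff = np.asarray(x, dtype=float) - mean
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (mean.size * np.log(2.0 * np.pi) + logdet + mahalanobis)

def sample_cells(mean, cov, n, rng):
    """Generative use: synthesize new feature vectors from the learned model."""
    return rng.multivariate_normal(mean, cov, size=n)

# Hypothetical training data: major and minor nuclear axis lengths (micrometres).
rng = np.random.default_rng(0)
training = rng.multivariate_normal([12.0, 8.0], [[1.0, 0.4], [0.4, 0.5]], size=200)

mean, cov = fit_gaussian(training)
typical_score = log_likelihood([12.0, 8.0], mean, cov)
atypical_score = log_likelihood([20.0, 3.0], mean, cov)
synthetic = sample_cells(mean, cov, 5, rng)
```

The same learned parameters serve both purposes: the likelihood function recognizes whether a cell fits the population, while the sampling function produces new examples that reflect both the average behavior and its variation.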

My focus in this chapter will be primarily on methods developed in my group that have been used to learn generative models of cell organization and protein distribution from two-dimensional and three-dimensional fluorescence microscope images (Zhao and Murphy, 2007, Rohde et al., 2008a, Rohde et al., 2008b, Peng et al., 2009, Shariff et al., 2010a, Shariff et al., 2011, Peng and Murphy, 2011). We have recently grouped these methods as part of the open source CellOrganizer project (http://cellorganizer.org), which includes collaborations with a number of investigators studying particular cell systems.

Components of a Model of Subcellular Organization and Protein Distribution

Although there are a number of ways to break down the tasks necessary for creating such models, we can distinguish at least three major components of a model of the distribution of proteins within cells of a given type under a given condition:

• A model of subcellular organization, including distributions of the number, size, shape, and position of each subcellular structure, any of which may be conditional on the model(s) for other structures;
• A model representing the probability that a cell of a
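The first component listed above, distributions of number, size, and position with some quantities conditional on other structures, can be sketched as follows. This is an illustration under stated assumptions only: a two-dimensional circular cell and nucleus, a Poisson count, log-normal object radii, and uniform placement in the cytoplasm are all hypothetical choices, as are every parameter value and name, and none of it represents the models actually used in CellOrganizer.

```python
import numpy as np

def sample_organization(rng, mean_count=20.0, log_radius_mean=-1.0, log_radius_sd=0.3):
    """Draw one synthetic 2D cell: nucleus, cell boundary, and organelle-like objects."""
    # Nuclear size is drawn first; cell size is conditional on the nucleus.
    nucleus_radius = rng.lognormal(np.log(4.0), 0.1)
    cell_radius = nucleus_radius * rng.uniform(2.0, 3.0)

    # Number and size of objects (hypothetical Poisson and log-normal models).
    n_objects = rng.poisson(mean_count)
    radii = rng.lognormal(log_radius_mean, log_radius_sd, size=n_objects)
    # Cap object size so that every object can fit between nucleus and cell edge.
    radii = np.minimum(radii, 0.25 * (cell_radius - nucleus_radius))

    # Position is conditional on both the nuclear and the cell model:
    # centres are uniform by area in the ring where the whole object fits.
    r_min = nucleus_radius + radii
    r_max = cell_radius - radii
    u = rng.uniform(size=n_objects)
    distance = np.sqrt(u * (r_max**2 - r_min**2) + r_min**2)
    angle = rng.uniform(0.0, 2.0 * np.pi, size=n_objects)
    centers = np.column_stack((distance * np.cos(angle), distance * np.sin(angle)))

    # Note: objects are placed independently, so they may overlap one another.
    return {
        "nucleus_radius": nucleus_radius,
        "cell_radius": cell_radius,
        "radii": radii,
        "centers": centers,
    }

rng = np.random.default_rng(1)
cell = sample_organization(rng)
```

Each call yields a different cell, so repeated sampling gives a population whose variation in count, size, and position follows the assumed distributions.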

Models of Subcellular Organization

At a conceptual level, the most complete model of subcellular organization is probably the GO cellular component ontology (Ashburner et al., 2000). A significant effort has been made to capture the vast majority of terms used to describe subcellular structures. The terms in this ontology can be assigned to proteins in order to represent the results of experimental or computational analyses. The advantage of this approach is precisely its disadvantage: general terms such as “mitochondria” can be

Protein distributions across subcellular structures

The models described above capture how cellular organelles are arranged within a cell, but do not address the critical question of how the tens of thousands of proteins in each cell are distributed among these organelles. Images, especially fluorescence microscope images, can be a major source of information on the subcellular distributions of proteins, and, as mentioned above, may be used directly in cell simulations. The feasibility of using automated pattern recognition approaches to

Use of Models for Testing Algorithms

A classic problem in testing algorithms for microscope images is that the correct results are frequently not known. A generative model for a desired pattern or structure can be combined with a model of image formation in a particular microscope to generate test images (phantoms) for which the correct results from image analysis are known. The process by which an image is formed in a microscope is quite well understood, so accurate models of point-spread functions and sampling noise can be
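The idea of combining a generative model with a model of image formation can be sketched in a few lines. The sketch below makes several simplifying assumptions that are not from the chapter: objects are point sources at random positions, the point-spread function is approximated by an isotropic Gaussian, and sampling noise is Poisson photon noise on top of a constant background. All names and parameter values are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel, normalized to sum to 1, truncated at 3 sigma."""
    radius = int(np.ceil(3.0 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()

def blur(image, sigma):
    """Approximate the PSF by separable Gaussian convolution along rows, then columns."""
    kernel = gaussian_kernel(sigma)
    rows = np.apply_along_axis(np.convolve, 1, image, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")

def make_phantom(rng, shape=(64, 64), n_objects=10, photons=500.0,
                 background=5.0, psf_sigma=2.0, margin=8):
    """Return a noisy test image plus the ground-truth object positions."""
    truth = np.zeros(shape)
    rows = rng.integers(margin, shape[0] - margin, size=n_objects)
    cols = rng.integers(margin, shape[1] - margin, size=n_objects)
    # Accumulate intensity so that coincident objects add rather than overwrite.
    np.add.at(truth, (rows, cols), photons)
    expected = blur(truth, psf_sigma) + background   # noise-free image
    observed = rng.poisson(expected)                 # photon-counting noise
    return observed, np.column_stack((rows, cols))

rng = np.random.default_rng(2)
observed, positions = make_phantom(rng)
```

Because `positions` is known exactly, the output of a spot-detection or segmentation algorithm run on `observed` can be scored against it, which is the situation that real microscope images rarely provide.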

Conclusion

In this chapter, I have described current approaches for building accurate models of cell organization directly from fluorescence microscope images. These models capture variation in cell organization at the level of the nucleus, cell membrane, and individual organelles, and can capture how particular proteins are distributed among cellular components. They represent a significant advance over the use of words (such as GO terms) as the means by which results of experiments on subcellular

Acknowledgments

I express my thanks to Michael Boland, Meel Velliste, Ting Zhao, Tao Peng, Luis Coelho, Wei Wang, and especially Gustavo Rohde for their contributions to previous collaborative work described here and for many helpful discussions with them and with Jieyue Li, Taraz Buck, Ivan Cao-Berg, Baek Hwan Cho, Klaus Palme, Hagit Shatkay, Joel Stiles, Leslie Loew, Ion Moraru, Christoph Wülfing, Eric Xing, Gaudenz Danuser, Karl Rohr, and Ivo Sbalzarini. Much of the original work reviewed here was supported

References (25)

L.M. Loew et al. The virtual cell: a software environment for computational cell biology. Trends Biotechnol. (2001)
H. Blum. Biological shape and visual science. I. J. Theor. Biol. (1973)
J.A. Helmuth et al. Shape reconstruction of subcellular structures from live cell fluorescence microscopy images. J. Struct. Biol. (2009)
B.L. Sprague. Mechanisms of microtubule-based kinetochore positioning in the yeast metaphase spindle. Biophys. J. (2003)
A. Shariff et al. Automated image analysis for high-content screening and analysis. J. Biomol. Screen. Off. J. Soc. Biomol. Screen. (2010)
J.R. Stiles et al. Monte Carlo simulation of neurotransmitter release using MCell, a general simulator of cellular physiological processes. (1998)
R.F. Murphy. Communicating subcellular distributions. Cytometr. Part A (2010)
R.F. Murphy. An active role for machine learning in drug development. Nature Chem. Biol. (2011)
T. Zhao et al. Automated learning of generative models for subcellular location: building blocks for systems biology. Cytometr. Part A (2007)
G.K. Rohde et al. Deformation-based nuclear morphometry: capturing nuclear shape variation in HeLa cells. Cytometr. Part A (2008)
G.K. Rohde et al. Deformation-based nonlinear dimension reduction: applications to nuclear morphometry. Proc. 2008 Int. Symp. Biomed. Imaging. (2008)
T. Peng et al. Instance-based generative biological shape modeling. Proc. 2009 Int. Symp. Biomed. Imaging. (2009)

Chapter 7 - CellOrganizer: Image-Derived Models of Subcellular Organization and Protein Distribution
