Skip to main content
Log in

Markov chain aggregation and its applications to combinatorial reaction networks

  • Published:
Journal of Mathematical Biology Aims and scope Submit manuscript

Abstract

We consider a continuous-time Markov chain (CTMC) whose state space is partitioned into aggregates, and each aggregate is assigned a probability measure. A sufficient condition for defining a CTMC over the aggregates is presented as a variant of weak lumpability, which also characterizes that the measure over the original process can be recovered from that of the aggregated one. We show how the applicability of de-aggregation depends on the initial distribution. The application section is devoted to illustrate how the developed theory aids in reducing CTMC models of biochemical systems particularly in connection to protein-protein interactions. We assume that the model is written by a biologist in form of site-graph-rewrite rules. Site-graph-rewrite rules compactly express that, often, only a local context of a protein (instead of a full molecular species) needs to be in a certain configuration in order to trigger a reaction event. This observation leads to suitable aggregate Markov chains with smaller state spaces, thereby providing sufficient reduction in computational complexity. This is further exemplified in two case studies: simple unbounded polymerization and early EGFR/insulin crosstalk.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9

Similar content being viewed by others

References

  • Anderson DF, Kurtz TG (2011) Continuous time markov chain models for chemical reaction networks. In: Koeppl H, Setti G, di Bernardo M, Densmore D (eds) Design and analysis of biomolecular circuits. Springer, Berlin

    Google Scholar 

  • Blinov M, Faeder JR, Goldstein B, Hlavacek WS (2004) Bionetgen: software for rule-based modeling of signal transduction based on the interactions of molecular domains. Bioinformatics (Oxford, England) 20(17):3289–3291

    Google Scholar 

  • Borisov NM, Markevich NI, Hoek JB, Kholodenko BN (2006) Trading the micro-world of combinatorial complexity for the macro-world of protein interaction domains. BioSystems 83:152–166

    Article  Google Scholar 

  • Buchholz P (1994) Exact and ordinary lumpability in finite Markov chains. J Appl Probab 31(1):59–75

    Article  MATH  MathSciNet  Google Scholar 

  • Buchholz P (2008) Bisimulation relations for weighted automata. Theoret Comput Sci 393(1–3):109–123

    Article  MATH  MathSciNet  Google Scholar 

  • Conzelmann H, Fey D, Gilles ED (2008) Exact model reduction of combinatorial reaction networks. BMC Syst Biol 2(78):342–351

    Google Scholar 

  • Danos V, Laneve C (2003) Core formal molecular biology. Theoret Comput Sci 325:69–110

    Article  MathSciNet  Google Scholar 

  • Danos V, Feret J, Fontana W, Harmer R, Krivine J (2010) Abstracting the differential semantics of rule-based models: exact and automated model reduction. In: Symposium on logic in computer science, pp 362–381

  • Feret J, Henzinger T, Koeppl H, Petrov T (2012) Lumpability abstractions of rule-based systems. Theoret Comput Sci 431:137–164

    Article  MATH  MathSciNet  Google Scholar 

  • Feret J, Koeppl H, Petrov T (2013) Stochastic fragments: a framework for the exact reduction of the stochastic semantics of rule-based models. Int J Softw Inf 4 (to appear)

  • Friedman N, Cai L, Sunney XX (2010) Stochasticity in gene expression as observed by single-molecule experiments in live cells. Israel J Chem 49:333–342

    Article  Google Scholar 

  • Gillespie DT (2007) Stochastic simulation of chemical kinetics. Annu Rev Phys Chem 58(1):35–55

    Google Scholar 

  • Guptasarma P (1995), Does replication-induced transcription regulate synthesis of the myriad low copy number proteins of Escherichia coli? BioEssays: news and reviews in molecular, cellular and, developmental biology 17(11):987–997

  • Gurvits L, Ledoux J (2005) Markov property for a function of a Markov chain: a linear algebra approach. Linear Algebra Appl 404:85–117

    Article  MATH  MathSciNet  Google Scholar 

  • Hardy GH, Ramanujan S (1918) Asymptotic formula in combinatory analysis. Proc Lond Math Soc S2–17(1):75–115

    Article  Google Scholar 

  • Hernández-Lerma O, Lasserre J-B (2003) Markov chains and invariant probabilities. In: Progress in mathematics, vol 211. Birkhäuser Verlag, Basel

    Book  Google Scholar 

  • Hlavacek WS, Faeder JR, Blinov ML, Perelson AS, Goldstein B (2005) The complexity of complexes in signal transduction. Biotechnol Bioeng 84:783–794

    Article  Google Scholar 

  • Ibe OC (2009) Markov processes for stochastic modeling. Elsevier, Amsterdam

    MATH  Google Scholar 

  • Kang H-W, Kurtz TG (2013) Separation of time-scales and model reduction for stochastic reaction networks. Ann Appl Probab 23(2):529–583

    Article  MATH  MathSciNet  Google Scholar 

  • Keizer J (1987) Statistical thermodynamics of nonequilibrium processes, 1st edn. Springer, Berlin

    Book  Google Scholar 

  • Kemeny J, Snell JL (1960) Finite Markov chains. Van Nostrand

  • Krivine J, Danos V, Benecke A (2009) Modelling epigenetic information maintenance: a kappa tutorial. CAV, pp 17–32

  • Ledoux J (1995) On weak lumpability of denumerable Markov chains. Statist Probab Lett 25(4):329–339

    Article  MATH  MathSciNet  Google Scholar 

  • Ledoux J (2004) Linear dynamics for the state vector of Markov chain functions. Adv Appl Probab 36(4):1198–1211

    Article  MATH  MathSciNet  Google Scholar 

  • Munsky B, Khammash M (2006) The finite state projection algorithm for the solution of the chemical master equation. J Chem Phys 124:044104

    Google Scholar 

  • Norris JR (1998) Markov chains. Cambridge university press, Cambridge

    MATH  Google Scholar 

  • Rubino G, Sericola B (1991) A finite characterization of weak lumpable Markov processes. part II: The continuous time case. Stochast Process Appl 38(2):195–204

    Google Scholar 

  • Rubino G, Sericola B (1993) A finite characterization of weak lumpable Markov processes. part I: The discrete time case. Stochast Process Appl 45(1):115–125

    Google Scholar 

  • Sokolova A, de Vink EP (2003) On relational properties of lumpability. In: Proceedings of the 4th PROGRESS

  • Tian JP, Kannan D (2006) Lumpability and commutativity of Markov processes. Stochast Anal Appl 24(3):685–702

    Article  MATH  MathSciNet  Google Scholar 

  • Walsh CT (2006) Posttranslation modification of proteins: expanding nature’s inventory. Roberts and Co Publisher, Englewood

    Google Scholar 

  • Wilkinson DJ (2006) Stochastic modelling for systems biology. Chapman & Hall, Boca Raton

    MATH  Google Scholar 

  • Yu J, Xiao J, Ren X, Lao K, Xie XS (2006) Probing gene expression in live cells, one protein molecule at a time. Science 311(5767):1600–1603

    Article  Google Scholar 

Download references

Acknowledgments

A. Ganguly and H. Koeppl acknowledge the support from the Swiss National Science Foundation, grant number PP00P2 128503/1. T. Petrov is supported by SystemsX.ch—the Swiss Inititative for Systems Biology. The authors would like to thank Prof. James Ledoux for his useful comments and for bringing to their attention some of the general work done in the area.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Arnab Ganguly.

Additional information

A. Ganguly and T. Petrov contributed equally to this work.

Appendix

Appendix

1.1 De-aggregation: simple scaffold

Assume that \(\mathcal {G}\in {\mathbb G}\) is such that \(\phi _1(\mathcal {G})=(m_{AB},m_{BC},m_{ABC})\). Let \(m_A:=n_A-m_{AB}- m_{ABC},\,m_B:=n_A-m_{AB}-m_{BC}-m_{ABC}\) and \(m_C:=n_C-m_{BC}-m_{ABC}\). If \(\mathcal {G}\in {\mathbb {A}}_i\), then \({\alpha }_{1i}(\mathcal {G})=|{\mathbb {A}}_i|^{-1}\), where

$$\begin{aligned} |{\mathbb {A}}_i| =\frac{n_A!n_B!n_C!}{m_{AB}!m_{BC}!m_{ABC}!m_A!m_B!m_C!}. \end{aligned}$$
(10)

The explanation is as follows. The \(m_A\) free nodes of type \(A,\,m_B\) free nodes of type \(B\) and \(m_C\) free nodes of type \(C\) can be chosen in \({n_A\atopwithdelims ()m_A}{n_B\atopwithdelims ()m_B}{n_C\atopwithdelims ()m_C}\) possible ways. Among the remaining nodes, \(m_{AB}\) nodes of type \(A\) and \(m_{AB}\) nodes of type \(B\) can be chosen in \({n_{A}{-}m_A\atopwithdelims ()m_{AB}}{n_{B}{-}m_B\atopwithdelims ()m_{AB}}\) ways. There are \(m_{AB}!\) different ways to establish bonds between \(m_{AB}\) identified nodes \(A\) and \(m_{AB}\) identified nodes \(B\). In the same way, we choose \(m_{BC}\) complexes of type \((BC)\) among the \(n_{B}{-}m_B{-}m_{AB}\) nodes of type \(A\), and \(n_C{-}m_C\) nodes of type \(B\). Finally, there is exactly one way to choose \(m_{ABC}\) complexes of type \((ABC)\) among the \(n_A{-}m_A{-}m_{AB},\,n_B{-}m_B{-}m_{AB}{-}m_{BC}\) and \(n_C{-}m_C{-}m_{BC}\) nodes of type \(A,\,B\) and \(C\) respectively. Connecting the bonds can be done in \((m_{ABC}!)^2\) different ways (for each node \(B^j\), there are exactly \(m_{ABC}!\) ways to choose the \(A^i\) and \(m_{ABC}!\) ways to choose \(C^k\)). The final expression follows.

Moreover, if \(\phi _2(\mathcal {G})=(m_{AB*},m_{*BC})\) and \(\mathcal {G}\in \mathbb {B}_j\), then \({\alpha }_{2j}(\mathcal {G}) = |\mathbb {B}_j|^{-1}\), where

$$\begin{aligned} |\mathbb {B}_j|= {n_A\atopwithdelims ()m_{AB*}}{n_B\atopwithdelims ()m_{AB*}}m_{AB*}! {n_C\atopwithdelims ()m_{*BC}}{n_B\atopwithdelims ()m_{*BC}}m_{*BC}!. \end{aligned}$$
(11)

We first choose the \(m_{AB*}\) nodes of type \(A\) and \(m_{AB*}\) nodes of type \(B\); There are \(m_{AB*}!\) different ways to establish the bonds; In total, it makes \({n_A\atopwithdelims ()m_{AB*}}{n_B\atopwithdelims ()m_{AB*}}m_{AB*}!\) choices. Independently, the \(m_{*BC}\) bonds between \(B\) and \(C\) can be chosen in \({n_B\atopwithdelims ()m_{*BC}}{n_C\atopwithdelims ()m_{*BC}} m_{*BC}!\) ways.

1.2 De-aggregation: two-sided polymerization

Assume that \(s\) is a site-graph such that

$$\begin{aligned} \phi _1(s)= (x_{11},\ldots , x_{1m}, x_{21},\ldots ,x_{2m}, x_{31},\ldots ,x_{3m}, x_{41},\ldots ,x_{4m}, x_{51},\ldots ,x_{5m}). \end{aligned}$$

We do not give the analytic expression for \({\alpha }_{1i}(s)\). For computing it, it is enough to use the following:

  • choosing a chain of type \((A..B)_i\) among \(m_A\) nodes \(A\) and \(m_B\) nodes \(B\) can be done in \(f_1(m_A,m_B,i)={m_A\atopwithdelims (){i}}{m_B\atopwithdelims (){i}}({i}!)^2\) ways; there are \((m_A-i)\) nodes \(A\), and \((m_B-i)\) nodes \(B\) left. The same is used for choosing a chain of type \((B..A)_i\);

  • choosing a chain of type \((A..A)_i\) among \(m_A\) nodes \(A\) and \(m_B\) nodes \(B\) can be done in \(f_2(m_A,m_B,i)={m_A\atopwithdelims (){i}}{m_B\atopwithdelims (){i-1}}i!(i-1)!\) ways; there are \((m_A-i)\) nodes \(A\), and \((m_B-(i-1))\) nodes \(B\) left. The same is used for choosing a chain of type \((B..B)_i\);

  • choosing a chain of type \((.A..B.)_i\) among \(m_A\) nodes \(A\) and \(m_B\) nodes \(B\) can be done in \(f_3(m_A,m_B,i)={m_A\atopwithdelims (){i}}{m_B\atopwithdelims (){i}}(i!)^2/i\) ways; there are \((m_A-i)\) nodes \(A\), and \((m_B-i)\) nodes \(B\) left. Division by \(i\) is done because of symmetries—every ring of type \((.A..B.)_i\) is determined by choosing \(i\) nodes of type \(A,\,i\) nodes of type \(B\), ordering nodes \(A\) in one of \(i!\) ways, ordering nodes \(B\) in one of \(i!\) ways, but every ordering \((A_{j1}-B_{k1}-A_{j2}-B_{k2}-\ldots A_{ji}-B_{ki})\) defines the same ring as \((A_{j2}-B_{k2}-A_{j3}-B_{k3}-\ldots A_{j1}-B_{k1})\) etc. (\(i\) of them in total).

Moreover, if \(s\) is such that \(\phi _2(s)=(m_{rl},m_{ba})\), then

$$\begin{aligned} {\alpha }_{2i}(s) = {n\atopwithdelims (){m_{rl}}}^2m_{rl}! {n\atopwithdelims (){m_{ba}}}^2 m_{ba}!. \end{aligned}$$

If \(s\) is such that \(\phi _2(s)=m\), then

$$\begin{aligned} {\alpha }_{3i}(s) = \sum _{i=0}^{m} {n\atopwithdelims (){i}}^2 i! {n\atopwithdelims (){m-i}}^2 (m-i)!. \end{aligned}$$

We choose \(m_{rl}\) nodes of type \(A\) among \(n\) of them, and the same number of nodes of type \(B\). There is \(m_{rl}!\) different ways to connect them. We independently choose the \(m_{ba}\) bonds in the same way.

To compute \({\alpha }_{3i}(s)\), since all of the \(m\) bonds can be either of type \(m_{rl}\) or \(m_{ba}\), we choose \(i\) bonds of type \(m_{rl}\) and \((m-i)\) bonds of type \(m_{ba}\), for \(i=0,\ldots ,m\).

1.3 Figure 3

The CTMC \(\{X_t\}\), for given one node \(A\), three nodes \(B\) and one node \(C\) contains different reaction mixtures over the set of nodes \(\{A^{1}, B^{1}, B^{2}, B^{3}, C^{1}\}\). For example, let \(\mathcal {G}\) be the reaction mixture with the set of edges \(\{\{(A^1,b), (B^3,a)\}\}\). There are three ways to apply the rule \(R_2\) on \(\mathcal {G}\): by embedding via function , or . If \(\mathcal {G}_1\) is a mixture with a set of edges \(\{\{(B^{3},a),(A^{1},b)\},\{(B^2,c),(C^1,b)\}\}\) and \(\mathcal {G}_2\) is a mixture with a set of edges \(\{\{(B^{3},a),(A^{1},b)\},\{(B^3,c),(C^1,b)\}\}\), then \(Q(\mathcal {G},\mathcal {G}_1)=Q(\mathcal {G},\mathcal {G}_2)={c}_2\).

Note that \(\phi _1(\mathcal {G})=(1,0,0),\,\phi _1(\mathcal {G}_1)=(1,1,0),\,\phi _1(\mathcal {G}_2)=(0,0,1)\). Let \(\mathcal {G}\in {\mathbb {A}}_1,\,\mathcal {G}_1\in {\mathbb {A}}_2,\,\mathcal {G}_2\in {\mathbb {A}}_3\). By applying the Equation (10), we have \({\alpha }_{11}(\mathcal {G})=(\frac{1!3!1!}{1!0!0!0!2!1!})^{-1}\)=1/3, \({\alpha }_{12}(\mathcal {G}_1)=(\frac{1!3!1!}{1!0!1!0!2!0!})^{-1}=1/3\), and \({\alpha }_{13}(\mathcal {G}_2)=(\frac{1!3!1!}{1!1!0!0!1!0!})^{-1}=1/6\).

Moreover, since \(\phi _2(\mathcal {G})=(1,0)\), and \(\phi _2(\mathcal {G}_1)=\phi _2(\mathcal {G}_2)=(1,1)\), let \(\mathbb {B}_1, \mathbb {B}_2\in {{\mathbb G}_2}\) be such that \(\mathcal {G}\in \mathbb {B}_1\) and \(\mathcal {G}_1,\mathcal {G}_2\in \mathbb {B}_2\). Then, \({\alpha }_{21}(\mathcal {G})=({1\atopwithdelims ()1} {3\atopwithdelims ()1} 1! {1\atopwithdelims ()0} {3\atopwithdelims ()0} 0!)^{-1}=1/3\) and \({\alpha }_{22}(\mathcal {G}_1)={\alpha }_{22}(s_2)=({1\atopwithdelims ()1} {3\atopwithdelims ()1} 1! {1\atopwithdelims ()1} {3\atopwithdelims ()1} 1!)^{-1}=1/9\).

Finally, observing the aggregation from \({\mathbb G}_1\) to \({\mathbb G}_2\), we have that \({\alpha }_1({\mathbb {A}}_1)=\frac{{\alpha }_{21}(\mathcal {G})}{{\alpha }_{11}(\mathcal {G})}=1,\,{\alpha }_2({\mathbb {A}}_2)=\frac{{\alpha }_{22}(\mathcal {G}_1)}{{\alpha }_{12}(\mathcal {G}_1)}=1/3\), and \({\alpha }_2({\mathbb {A}}_3) = \frac{{\alpha }_{22}(\mathcal {G}_2)}{{\alpha }_{13}(\mathcal {G}_2)}=2/3\).

1.4 Table 1

In order to illustrate how powerful the presented reduction method is in comparison to the standard, species-based models, we compare the size of the state space in the species-based model, \({\mathbb G}_1\), and in the fragment-based model, \({\mathbb G}_2\).

  • Simple scaffold. The size of \({\mathbb G}_2\) is \((n+1)^2\): there are \(n+1\) possible situations between \(A\) and \(B\) nodes with \(0,1,\ldots ,n\) bonds between them. The same holds for possible configurations between nodes of type \(B\) and \(C\). Let \(f(k)\) denote the number of states with \(k\) copies of each of the nodes \(A,\,B\) and \(C\), and with no complexes of type \((ABC)\). If there is \(0\le i\le k\) complexes of type \((AB)\), there can be \(0\le j\le (k-i)\) complexes of type \((BC)\), and we thus have \(f(k)=\sum _{i=0}^k (k-i+1)=\frac{(k+1)(k+2)}{2}\). The number of complexes of type \((ABC)\) can vary from 0 to \(n\), and thus we have the total number of states in \({\mathbb G}_1\) to be \(\sum _{k=0}^n f(k)=\frac{1}{2} \sum _{k=0}^n (k^2+3k+2) = \frac{1}{2} (\sum _{k=0}^n k^2+3\sum _{k=0}^n k + 2\sum _{k=0}^n 1) = \frac{1}{2} (n(n+1)(2n+1)/6+3n(n+1)/2 + 2(n+1))=(n+1)(n+2)(n+3)/6\).

  • Two-sided polymerization. We first estimate the size of \({\mathbb G}_2\). The value of \(m_{rl}\) varies between 0 and \(n\), and the same holds for the value of \(m_{ba}\). Each state \((i,j)\in \{0,\ldots ,n\}\times \{0,\ldots ,n\}\) is reachable, since the bonds are created independently of each-other. The size of the state space \({\mathbb G}_2\) is thus \((n+1)^2\). The size of \({\mathbb G}_2\) is \(2n+1\), because the value of \(m\) varies between 0 and \(2n\). Let \(P(n)\) denote the number of partitions of number \(n\)—number of ways of writing \(n\) as a sum of positive integers. One of the well-known asymptotics is \(P(n)\approx \frac{1}{4n\sqrt{3}} e^{\pi \sqrt{\frac{2n}{3}}}\) (Hardy and Ramanujan 1918). Consider one partition \(n=n_1+\cdots +n_k,\,n_1\le \cdots \le n_k\), and a state \(s_1\in {\mathbb G}_1\) that counts one chain of type \((A..B)_{n_1}\), one chain of type \((A..B)_{n_2}\) etc. It is in \({\mathbb G}_1\), because it has exactly \(n\) nodes \(A\) and \(n\) nodes \(B\). Therefore, the set \({\mathbb G}_1\) counts at least \(P(n)\) states. This approximation can be improved by factor three: think of the states \(\mathcal {G}_2\) and \(\mathcal {G}_3\), which are constructed of chains of type \((B..A)_i\), or \((.A..B.)_i\) instead of \((A..B)_i\).

1.5 Relation between site-graph-rewrite rules and Kappa

Since the main purpose of this paper is not to formally present the reduction procedure for a general rule-set, we described the rule-based model directly as a collection of site-graph-rewrite rules, which is a simplification with respect to standard site-graph framework of Kappa (Danos et al. 2010). The simplification arises in three aspects.

First, the site (protein domain) in Kappa may be internal, in the sense that they bear an internal state encoding, for instance, post-translational modification of protein-residues such as phosphorylation, methylation, ubiquitylation—to name a few. Moreover, one site can simultaneously serve as a binding site, and as an internal site. We omit the possibility of having internal sites, but, it can be overcome: for example, the phosphorylation of a site can be encoded by a binding reaction to a node with a new name, for example, \(Ph\). In order to mimic the standard unimolecular modification process by this bimolecular one, we need to ensure that the nodes of type \(Ph\) are always highly abundant, that is, are not rate limiting at any time. As a side remark, we point out that in reality it takes a binding event (e.g. binding of ATP) for a modification to happen. If a site is both internal and binding site, another copy of the site is created, so that one site bears an internal state, and another one is a binding state. A Kappa rule and an example of the corresponding site-graph-rewrite rule are shown in Fig. 8b.

Second, each Kappa program has a predefined signature of site types and agent types, where the agent type consists of a name, and a predetermined interface (set of sites). Each node of a ‘Kappa’ site-graph is assigned a unique name. On top of that, a type function partitions all the nodes according to their agent type. We instead embed the information about the node type (and we also abandon the use of term ‘agent’ in favor of ‘node’) directly in the name of the node: a node \(v^i,\,i\in \mathbb N\) is of type \(v\); The rules are accordingly written with these generative node names. The interface of a node type \(v\) is read from the collection of site-graph-rewrite rules, as a union of all the sites which are assigned to \(v\) along the rules. Our formalism cannot specify a rule which operates over a connected site-graph with more than one node of a certain type, but the examples which we present here do not contain such rules.

Third, we restrict to the conserved systems—only edges can be modified by the rules, while Kappa can specify agent birth or deletion.

Finally, it is worth noting that we define the notion of embedding in a non-standard way, through a combination of node-renaming function and sub-site-graph property.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Ganguly, A., Petrov, T. & Koeppl, H. Markov chain aggregation and its applications to combinatorial reaction networks. J. Math. Biol. 69, 767–797 (2014). https://doi.org/10.1007/s00285-013-0738-7

Download citation

  • Received:

  • Revised:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00285-013-0738-7

Keywords

Mathematics Subject Classification (2010)

Navigation