Economies with heterogeneous interacting learning agents

Landini, Simone; Gallegati, Mauro; Stiglitz, Joseph E.

doi:10.1007/s11403-013-0121-1


Regular Article
Published: 21 January 2014

Journal of Economic Interaction and Coordination, Volume 10, pages 91–118 (2015)

Simone Landini¹,
Mauro Gallegati² &
Joseph E. Stiglitz³


Abstract

Economic agents differ from physical atoms because of their capacity for learning and memory, which leads to strategic behaviour. Economic agents learn how to interact and behave by modifying their behaviour when the economic environment changes. We show that business fluctuations are endogenously generated by the interaction of learning agents via the phenomenon of regenerative-coordination, i.e. agents choose a learning strategy which leads to a pair of output and price which feed back on learning, possibly modifying it. Mathematically, learning is modelled as a chemical reaction of different species of elements, while inferential analysis develops a combinatorial master equation, a technique that offers an alternative approach to modelling heterogeneous interacting learning agents.


Notes

Note that business fluctuations are due to the idiosyncratic price shocks and the endogenous self organization of the market.
For the sake of simplicity this modeling has not been considered here: the reader is referred to Delli Gatti et al. (2012).
$\alpha $ can be considered as a “financial” parameter since it represents the leverage, i.e. the ratio between external and internal financial needs.
Firms may change their behavior because prices change, i.e. they take into account the Lucas critique.
As will be shown, a configuration is a string of rules’ codes ordering the production strategies according to their diffusion degree, from the most diffused to the least diffused rule.
In the ABM–DGP $\Delta =1$, that means that two adjacent dates are separated by a time-span of length $\Delta $. Since in the present paper time has no specific relevant meaning, $\Delta =1$ is only the simulation reference unit of time, it might be a quarter of a year or a year or whatever.
By analogy with chemical reactions it is a “progress variable”, or “degree of advancement”, as described in de Groot and Mazur (1984) p. 199, see also van Kampen (2007) p. 168. Therefore, the complexity degree of rule $\lambda _k \in \Lambda $ is the index $k\le K=|\Lambda |$ labelling the $k$-th rule the social atom is testing while learning.
Having set up an ABM one can certainly take note of each single step each single agent is taking while learning. But this would be very time and memory consuming, even with systems that are not especially large, like the one involved here with 1,000 firms and 1,000 periods. Moreover, in the end, it would be useless because what is needed is an inferential approach, like that of Statistical Physics: taking care of all the positions on the learning space would be like integrating the differential equations of motion for particles in a complex system, which is an almost impossible task.
Fig. 1
The learning mechanism as a sequence of interactions along the learning-period made of $K$ test-sub-periods. A predetermined sample path is highlighted, from the initial state $L_p$ to the final state $L_7$. The whole graph represents the graphs one would obtain by starting with any rule $\lambda _p \in \Lambda $, that is from the initial state $L_p $, passing through all the $K$ channels $\rho _k $ before ending in a final state $L_q $
The formulae presented in “Appendix C” have been obtained analytically by means of a suitable algebraic method: its development is far beyond the aim of the present paper; notes are available from the authors.
The reader might refer to Gardiner (1985) chapter 7, from Sects. 5 to 7, for a rigorous development of the following exposition which aims to resemble the main features of the Poisson representation technique for the many variable birth-death systems in terms of combinatorial kinetics. See also Gardiner and Chaturvedi (1977) for an early exposition of the technique.
This term is due to Gardiner and Chaturvedi (1977) to extend the field of Chemical Kinetics. The reader interested in Chemical Physics and Physical Chemistry, upon which the following development is based, is suggested to refer also to McQuarrie (1967) and Gillespie (2007), and references cited therein. To appreciate the probabilistic and combinatorial nature of these disciplines, and for extensions of the tools in other fields of applicability, an important reference is Nicolis and Prigogine (1977).
According to Gardiner (1985) combinatorial transition rates are usually not explicitly time dependent. In the present paper time dependence is maintained to take care that the configuration of the system changes $\mathbf{I}_\varsigma (t)=\mathbf{n}$ due to learning, but time is considered as a sequential parameter.
Note that $P_e ({\bullet } ;t):\chi \rightarrow [0,1]$ is the stationary solution where time is an indexing parameter as in transition rates $T_k^\pm ({\bullet } ;t)$. The interest on time indexing is essentially motivated by the fact that the present modelling is grounded on an ABM–DGP where time is an iteration counter.
See van Kampen (2007), page 168, for a geometric interpretation on this issue.
The ergodic property is here conceived very loosely: basically, estimates of $\mathbf{W}_\varsigma (t)$ have been found stable through time such that their series can be likely substituted with the time average; it has also been found the standard deviation is very small.
For a given subsystem of SF or NSF firms, a dominant configuration is a combination of behavioural rules that, at $t$, concentrates fractions of a given quantity, say $Z$, from the highest to the lowest share. As regarding the number of firms, $Z=I$, the diffusion-dominance of a certain rules’ configuration allocates the highest shares of firms into behavioural states $\lambda \in \Lambda $. If $Z=A,Q,W,\Pi $ effects-dominance of a rules’ configuration identifies what should have been chosen to get the collective optimal configuration as regarding a given quantity.
Regimes concern dominance while phases concern the state levels of aggregate quantities.
In order to fully appreciate the consequence of introducing learning in a complex system, let us concentrate on the effect of a policy, say an easing of the monetary policy, i.e. a reduction of the rate of interest. The share of SF firms will increase: resilience will be strengthened but the pace of growth could be modest; those effects themselves will depend on the S,s of the system; agents will change their behavior, according to the prescription of the Lucas critique.
As regarding the profit maximizing rule $3$, it can also be seen as function of the control (output scheduling) parameter $\alpha $.
Profit curves and their maximisation w.r.t. equity (state variable) previously developed is different from profit maximisation w.r.t. the scheduling parameter (control parameter): the former concerns the overall economic interpretation, the latter concerns the specific profit maximisation rule, which aims to set an optimal value for the control parameter.
This means that equity is conditioning profit through the scheduling parameter, that is $\Pi (\alpha |A)$.
For ease of exposition, time and the state of financial fragility are suppressed; therefore, from here on, all the quantities must be considered as time dependent in every state of financial fragility.

References

Alfarano S, Lux T, Wagner F (2005) Estimation of agent-based models: the case of an asymmetric herding model. Comput Econ 26(1):19–49
Aoki M (1996) New approaches to macroeconomic modelling. Cambridge University Press, Cambridge
Aoki M (2002) Modelling aggregate behaviour and fluctuations in economics. Cambridge University Press, Cambridge
Aoki M, Yoshikawa H (2006) Reconstructing macroeconomics. Cambridge University Press, Cambridge
Buchanan M (2007) The social atom: why the rich get richer, cheaters get caught, and your neighbour usually looks like you. Bloomsbury, London
Delli Gatti D, Gallegati M, Greenwald B, Russo A, Stiglitz JE (2010) The financial accelerator in an evolving credit network. J Econ Dyn Control 34:1627–1650
Delli Gatti D, Di Guilmi C, Gallegati M, Landini S (2012) Reconstructing aggregate dynamics in heterogeneous agents models. A Markovian Approach, Revue de l’OFCE 124(5):117–146
Delli Gatti D, Fagiolo G, Richiardi M, Russo A, Gallegati M (2014) Agent based models. A Premier (forthcoming)
de Groot SR, Mazur P (1984) Non-equilibrium thermodynamics. Dover Publication, New York
Di Guilmi C, Gallegati M, Landini S, Stiglitz JE (2011) Towards an analytic solution for agent based models: an application to a credit network economy. In: Aoki M, Binmore K, Deakin S, Gintis H (eds) Complexity and institutions: markets, norms and corporations. Palgrave Macmillan, London, IEA conference, vol. N.150-II
Feller W (1966) An introduction to probability theory and its applications. Wiley, New Jersey
Foley DK (1994) A statistical equilibrium theory of markets. J Econ Theory 62:321–345
Gardiner CW (1985) Handbook of stochastic methods. Springer, Berlin
Gardiner CW, Chaturvedi S (1977) The Poisson representation I. A new technique for chemical master equations. J Stat Phys 17(6):429–468
Gillespie DT (2007) Stochastic simulation of chemical kinetics. Annu Rev Phys Chem 58:35–55
Greenwald B, Stiglitz JE (1993) Financial markets imperfections and business cycles. Q J Econ 108(1):77–114
Godley W, Lavoie M (2007) Monetary economics. An integrated approach to credit, money, income, production and wealth. Palgrave MacMillan, Basingstoke
Kirman A (2011) Learning in agent based models. East Econ J 37(1):20–27
Kirman A (2012) Can artificial economies help us understand real economies. Revue de l’OFCE, Debates and Policies 124
McQuarrie DA (1967) Stochastic approach to chemical kinetics. J Appl Probab 4:413–478
Nicolis G, Prigogine I (1977) Self-organization in nonequilibrium systems: from dissipative structures to order through fluctuations. Wiley, New Jersey
Sargent T (1993) Bounded rationality in macroeconomics. Clarendon Press, Oxford
Stiglitz JE (1973) Taxation, corporate financial policy and the cost of capital. J Public Econ 2(1):1–34
Stiglitz JE (1975) The theory of screening, education and the distribution of income. Am Econ Rev 65(3):283–300
Stiglitz JE (1976) The efficiency wage hypothesis, surplus labour and the distribution of income in L.D.C.s. Oxford Econ Papers 28(2):185–207
Tesfatsion L, Judd KL (2006) Agent-based computational economics. In: Handbook of computational economics. Handbooks in economics series, vol 2. North Holland, Amsterdam
van Kampen NG (2007) Stochastic processes in physics and chemistry. North-Holland, Amsterdam
Weidlich W, Braun M (1992) The master equation approach to nonlinear economics. J Evol Econ 2(3): 233–265


Acknowledgments

The authors thank an anonymous referee for his remarks; Patrick Xihao Li, Corrado di Guilmi and participants to the EEA conference, NY May 2013, PRIN Bologna, June 2013, for suggestions; the support of the Institute for New Economic Thinking Grant INO1200022, and the EFP7, MATHEMACS and NESS, is gratefully acknowledged.

Author information

Authors and Affiliations

I.R.E.S. Piemonte, via Nizza 18, 10125, Turin, Italy
Simone Landini
DiSES, Università Politecnica delle Marche, Piazzale Martelli 8, 60121, Ancona, Italy
Mauro Gallegati
Columbia Business School, Columbia University, 3022 Broadway, New York, NY, USA
Joseph E. Stiglitz


Corresponding author

Correspondence to Simone Landini.

Appendices

Appendix A: Analytics inside learning

In the following, we describe the analytic aspects of the learning mechanism at the individual level and the re-configuration that follows from it.

At time $t$ the firm is endowed with $A(i,t)[h]=A_l $, where $h$ refers to the learning strategy of the previous period. The firm faces $K$ possible scheduling strategies. Profit is then a function both of the control variable, $\Pi (Q_{l,k} )$, and of the state variable given the control parameter,^{Footnote 19} $\Pi (A_l |\alpha _k )$.

Appendix B shows that profit curves are concave under every behavioural rule if a specific condition on parameters is fulfilled. Accordingly, Fig. 5 represents configurations in the case of four strategies, ordered by the value of $\alpha $. Since profit curves are concave, on the plane $(A,\Pi )$ profit maximising equity values $A_k^*$ exist and belong to the locus $E^{*}=E(A^{*},\Pi ^{*})$. For each rule, on $(Q,\Pi )$ profit maximising output values $Q_k^*$ are found associated to the corresponding $\Pi _k^*$. On $(A,Q)$ profit maximising configurations are found for each behavioural rule. By connecting optimal configuration points $S_k \in E^{*}$ on the planes, the locus of optimal configurations is then identified, and it might be thought of as the temporary equilibrium curve for the system, showing the possibility of multiple equilibria. If all the firms are found on this curve it seems there is no incentive to depart from it; nevertheless, the equilibrium locus is neither steady nor permanent but temporary and unstable through time because of the learning activity firms put forward: this takes care of the Lucas critique.

As external conditions change, mainly under the effect of the market price force-field, firms adjust their positions through learning; therefore a sequence of equilibrium loci $\hbox {E}=\{E^{*}(t)\}$ is the outcome of the re-configuration by learning. Firms are stimulated by the market price external field, but each firm also has its own internal impulse to learn how to adjust its position to improve its financial soundness.

Figure 5 explains two phenomena. The “stronger” the behavioural rule, $\alpha (\lambda _q )\succ \alpha (\lambda _p )$: (a) the less equity is needed to get the same profit, $\Pi (A_l |\alpha _q )>\Pi (A_l |\alpha _p )$; (b) the easier it is to reach the optimal feasible level of profit maximising output. There are essentially two possibilities for a firm outside the optimal locus $E^{*}$: (a) to be on the left-increasing branch of the profit curve, that is to be endowed with equity values for which $\partial \Pi /\partial A>0\Rightarrow A_l <A_k^*$; (b) to be on the right-decreasing branch of the profit curve, that is to be endowed with equity values for which $\partial \Pi /\partial A<0\Rightarrow A_l >A_k^*$.

The only way for a firm to improve its financial soundness is to increase profit which, however, is exposed to price shocks. Moreover, to increase profit it can only increase output. If equity increases a firm can maintain its previous period rule to increase output, but in the opposite case the firm needs to switch to a more augmenting rule to get a higher level of output with the same level of equity. This means jumping from a lower profit curve to a higher one. The jump has four possible outcomes, see Fig. 5ii, depending on the state of the jump: (a) from a left-increasing to a left-increasing branch, (b) from a left-increasing to a right-decreasing branch, (c) from a right-decreasing to a left-increasing branch, (d) from a right-decreasing to a right-decreasing branch.

To explain states and transitions consider Fig. 5ii with two rules. Rule $\lambda _k $ is more efficient than $\lambda _h $ if, at the same level of equity $A_l $, it gives a higher level of profit, $\Pi (A_l |\alpha _k )>\Pi (A_l |\alpha _h )$; if both give the same profit $\Pi (A_E |\alpha _k )=\Pi (A_E |\alpha _h )$ they are equivalent at $A_E $ and $E=(A_E ,\Pi _E )$ is an equivalence point. Moreover, each rule-specific profit curve is concave and admits a maximum: $M_k =(A_k^*,\Pi _k^*), M_h =(A_h^*,\Pi _h^*)$. Finally, the plane $(A,\Pi )$ can be partitioned according to maximum and equivalence points: $S_I =\{(A,\Pi ):A\le A_k^*\}, S_{II} =\{(A,\Pi ):A_k^*<A\le A_E \}$, $S_{III} =\{(A,\Pi ):A_E <A\le A_h^*\}$ and $S_{IV} =\{(A,\Pi ):A>A_h^*\}$.

Case 1

On $S_I $, if a firm is in $\Pi (A_l |\alpha _h )$ then $\lambda _h \rightarrow \lambda _k $ is rational because from a left-increasing profit state the firm will jump to a left-increasing profit state (a) with a higher level of profit. However, it can be seen that on $S_I $ a small change in equity, $A_l +a<A_k^*$ gives a high change in profit after $\lambda _h \rightarrow \lambda _k $, but only up to a certain level of equity $\hat{{A}}_{h,k} <A_k^*$. Therefore, $S_I$ can be partitioned into $S_I^1 =\{(A,\Pi ):A\le \hat{{A}}_{h,k} <A_k^*\}$ and $S_I^2 =\{(A,\Pi ):\hat{{A}}_{h,k} <A\le A_k^*\}$. Given $A_l +a$, in the limit for $a\rightarrow 0^{+}$, the effect of the small change in equity is measured by the derivatives of the profit curves, therefore rule $\lambda _k $ is found to be more efficient and attractive if $\partial \Pi (A_l |\alpha _k )/\partial A>\partial \Pi (A_l |\alpha _h )/\partial A$ which implies that $A_l <\hat{{A}}_{h,k} $. Therefore, if $A_l <\hat{{A}}_{h,k} : S_I^1 $ then $\lambda _h \rightarrow \lambda _k $ is efficient-attracting, if $\hat{{A}}_{h,k} \le A_l \le A_k^*: S_I^2 $ then $\lambda _h \rightarrow \lambda _k $ is less efficient-attracting: the same increase in equity on $S_I^2 $ gives a less than proportional increase in profit, on $S_I^1 $ the induced increase in profit is more than proportional.
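As a rough numerical illustration of the threshold $\hat{{A}}_{h,k}$, the Python sketch below locates the equity level at which the two marginal profit curves cross. It assumes SF firms ($X=0$), the profit curve of Appendix B with $\phi =\beta \delta $ and $\theta =w\gamma ^{\delta }$, and purely illustrative parameter values; the helper names are hypothetical and do not come from the original simulation code.

```python
def marginal_profit(a, alpha, price=1.0, theta=1.0, beta=0.5, delta=1.5):
    """dPi/dA for an SF firm (X = 0), assuming phi = beta * delta."""
    inner = price * alpha * a ** beta - delta * theta * alpha ** delta * a ** (beta * delta)
    return beta / a * inner


def slope_gap(a, alpha_k, alpha_h):
    """Marginal profit under rule k minus marginal profit under rule h."""
    return marginal_profit(a, alpha_k) - marginal_profit(a, alpha_h)


def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative (assumed) scheduling parameters: rule k is the "stronger" one.
alpha_h, alpha_k = 1.0, 2.0

# Equity level where the two slopes coincide, i.e. the boundary between S_I^1 and S_I^2.
a_hat = bisect(lambda a: slope_gap(a, alpha_k, alpha_h), 1e-6, 1.0)

# Profit-maximising equity under rule k (closed form for SF firms).
a_k_star = (1.0 / (1.5 * alpha_k ** 0.5)) ** 4
```

With these assumed values the threshold falls below $A_k^*$, so $S_I$ splits into a part where the jump is more attractive (slope of rule $k$ larger) and a part where it is less so.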

Case 2

On $S_{II} $, $\lambda _h \rightarrow \lambda _k $ is convenient but risky because the firm will jump to a right-decreasing profit state, (b). In case (b) on $S_{II} $, the risk arises because, if equity increases, so does output but, if prices do not allow for increasing profit as well, the firm can be trapped into $\lambda _k $, decreasing its profit up to the equivalence point $E=(A_E ,\Pi _E )$: beyond this point the firm will find $\lambda _h $ more efficient than $\lambda _k $, and so a jump $\lambda _h \leftarrow \lambda _k $ is motivated but, if the firm’s endowments are not sufficient for $\lambda _h $ to be feasible, the firm will continue decreasing its profit and next period equity, compromising the level of financial soundness it has reached. On $S_{II} $, $\partial \Pi (A_l |\alpha _k )/\partial A<0$ and $\partial \Pi (A_l |\alpha _h )/\partial A>0$; therefore $\lambda _h \rightarrow \lambda _k $ is efficient but less attracting, even less than the jump on $S_I^2 $, because of the risk for the firm of being trapped into a profit decreasing rule.

Case 3

On $S_{III} $, the efficient jump is $\lambda _h \leftarrow \lambda _k $ and it is always attractive. Indeed, from a right-decreasing profit state the firm jumps to a left-increasing profit state, case (c), and, moreover, it can be seen that $\partial \Pi (A_l |\alpha _k )/\partial A<0$ while $\partial \Pi (A_l |\alpha _h )/\partial A>0$, as it was on $S_{II} $.

Case 4

On $S_{IV} $, the jump $\lambda _h \leftarrow \lambda _k $ is efficient-attractive because $\partial \Pi (A_l |\alpha _k )/\partial A\le \partial \Pi (A_l |\alpha _h )/\partial A<0$. In this case, the best thing to do would be to jump from $\lambda _k $ and $\lambda _h $ to a third rule $\lambda ^{\prime }$, if any is feasible. For firms on $\Pi (A>A_E |\alpha _k )$ the jump $\lambda _k \rightarrow \lambda ^{\prime }$ is efficient and attracting right beyond the equivalence point $E^{\prime }$. For firms on $\Pi (A>A_{E^{{\prime }{\prime }}} |\alpha _h )$ the jump $\lambda _h \rightarrow \lambda ^{\prime }$ is efficient and attracting right beyond the equivalence point $E^{{\prime }{\prime }}$.
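The four cases can be summarised in a small Python sketch that places an equity level in one of the regions $S_I ,\ldots ,S_{IV} $ and reports the efficient jump between two rules. It is only an illustration under stated assumptions: SF firms ($X=0$), the closed form (35) of Appendix B with $\theta =w\gamma ^{\delta }$, a crossing point $A_E $ obtained by equating the two SF profit curves, and made-up parameter values. The function names are hypothetical.

```python
def stationary_equity(alpha, price=1.0, theta=1.0, beta=0.5, delta=1.5):
    """Profit-maximising equity of an SF firm, closed form (35)."""
    return (price / (delta * theta * alpha ** (delta - 1))) ** (1 / (beta * (delta - 1)))


def equivalence_equity(alpha_k, alpha_h, price=1.0, theta=1.0, beta=0.5, delta=1.5):
    """Equity A_E at which the two SF profit curves give the same profit."""
    ratio = price * (alpha_k - alpha_h) / (theta * (alpha_k ** delta - alpha_h ** delta))
    return ratio ** (1 / (beta * (delta - 1)))


def classify(a, alpha_k, alpha_h):
    """Return (region, efficient jump) for equity a, with rule k the stronger rule."""
    a_k = stationary_equity(alpha_k)
    a_h = stationary_equity(alpha_h)
    a_e = equivalence_equity(alpha_k, alpha_h)
    if a <= a_k:
        return "S_I", "h -> k"     # Case 1: left-increasing to left-increasing
    if a <= a_e:
        return "S_II", "h -> k"    # Case 2: convenient but risky
    if a <= a_h:
        return "S_III", "k -> h"   # Case 3: right-decreasing to left-increasing
    return "S_IV", "k -> h"        # Case 4: a third rule would be better, if feasible


# Assumed scheduling parameters for the two rules.
alpha_k, alpha_h = 2.0, 1.0
thresholds = (stationary_equity(alpha_k),
              equivalence_equity(alpha_k, alpha_h),
              stationary_equity(alpha_h))
regions = [classify(a, alpha_k, alpha_h) for a in (0.03, 0.07, 0.15, 0.30)]
```

In this parametrisation the thresholds are ordered as $A_k^*<A_E <A_h^*$, which is the ordering the partition of the plane $(A,\Pi )$ relies on.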

Appendix B: Profit curves and maxima

For a fixed strategy $\lambda _k \in \Lambda $ at $t$ and $\lambda _h $ at $t-1$, possibly identical, $A(i,t-1)[h]=A_l >0$ and (1) gives $Q_{l,k} =\alpha _k A_l^\beta $. Therefore, (7) gives $\Pi (Q_{l,k} )=XR_l +PQ_{l,k} -UQ_{l,k}^\delta $ which reads as $\Pi (A_l |\alpha _k )=XrA_l +P\alpha _k A_l^\beta -U\alpha _k^\delta A_l^\phi $, where $X=1$ if NSF, $R_l =rA_l >0$ and $U=(1+Xr) \theta >0$. As a simplification, assume the firm is selling at the expected market price $P=E[p]$.
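The substitution of $Q_{l,k} =\alpha _k A_l^\beta $ into the profit function can be checked numerically. The Python sketch below evaluates profit once as a function of output and once as a function of equity. It assumes $\phi =\beta \delta $, which is what the substitution implies, and uses illustrative parameter values; the function names are hypothetical.

```python
def output(alpha, equity, beta):
    """Scheduled output Q = alpha * A**beta."""
    return alpha * equity ** beta


def profit_from_output(q, equity, price, theta, delta, r, x):
    """Profit as a function of output: X*r*A + P*Q - U*Q**delta, U = (1 + X*r)*theta."""
    u = (1 + x * r) * theta
    return x * r * equity + price * q - u * q ** delta


def profit_from_equity(equity, alpha, price, theta, beta, delta, r, x):
    """Profit as a function of equity given alpha, with phi = beta * delta (assumed)."""
    u = (1 + x * r) * theta
    phi = beta * delta
    return (x * r * equity
            + price * alpha * equity ** beta
            - u * alpha ** delta * equity ** phi)


# Illustrative values only: an NSF firm (x = 1) with equity 0.1 and alpha = 2.
params = dict(price=1.0, theta=1.0, delta=1.5, r=0.05, x=1)
a, alpha, beta = 0.1, 2.0, 0.5

q = output(alpha, a, beta)
pi_q = profit_from_output(q, a, **params)
pi_a = profit_from_equity(a, alpha, beta=beta, **params)
```

The two evaluations coincide up to floating point error, which is all the rewriting of $\Pi (Q_{l,k} )$ as $\Pi (A_l |\alpha _k )$ claims.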

According to (7), and since the equity domain of profit curves is right-unbounded, profit curves must be concave paraboloids in equity in order to allow for an analytic maximum profit condition. Computing derivatives it follows that

$$\begin{aligned} \frac{\partial \Pi }{\partial A}&= Xr+\frac{\beta }{A_l }\left( {P\alpha _k A_l^\beta -\delta (1+Xr)w\gamma ^{\delta }\alpha _k^\delta A_l^\phi } \right) \end{aligned}$$

(32)

$$\begin{aligned} \frac{\partial ^{2}\Pi }{\partial A^{2}}&= -\frac{\beta }{A_l^2 }\left\{ {(1-\beta )P\alpha _k A_l^\beta +(\phi -1)\delta (1+Xr)w\gamma ^{\delta }\alpha _k^\delta A_l^\phi } \right\} \end{aligned}$$

(33)

The first order condition for a stationary point is

$$\begin{aligned} \frac{\partial \Pi }{\partial A}=0\Rightarrow Xr+\frac{\beta }{A_l }\left( {P\alpha _k A_l^\beta -\delta (1+Xr)w\gamma ^{\delta }\alpha _k^\delta A_l^\phi } \right) =0 \end{aligned}$$

(34)

which is only necessary for a maximum point since the equity domain is not a compact set of positive real numbers.

In case of SF firms, $X=0$, a closed form for the stationary point can be found

$$\begin{aligned} A_{k|SF}^*=\left( {\frac{P}{\delta w\gamma ^{\delta }\alpha _k^{\delta -1} }} \right) ^{\frac{1}{\beta (\delta -1)}}>0 \end{aligned}$$

(35)

In case of NSF firms, $X=1$, a closed form solution cannot be found but it exists if profit curves are concave paraboloids: the sign of the first derivative shows they are so

$$\begin{aligned} \frac{\partial \Pi }{\partial A}>0\Leftrightarrow A_l <A_k^*\,\quad \hbox {and}\,\quad \frac{\partial \Pi }{\partial A}<0\Leftrightarrow A_l >A_k^*\end{aligned}$$

(36)
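Since the NSF case has no closed form, the stationary point can be located numerically. The Python sketch below applies bisection to the first order condition (34), using the derivative (32) with $\phi =\beta \delta $ and $\theta =w\gamma ^{\delta }$ (both assumptions about notation), illustrative parameter values, and a search bracket chosen by hand for those values. For the SF case the result can be compared with the closed form (35).

```python
def first_derivative(a, alpha, x, price=1.0, theta=1.0, beta=0.5, delta=1.5, r=0.05):
    """Right-hand side of (32), with theta = w * gamma**delta and phi = beta * delta."""
    phi = beta * delta
    cost = delta * (1 + x * r) * theta * alpha ** delta * a ** phi
    return x * r + beta / a * (price * alpha * a ** beta - cost)


def stationary_point(alpha, x, lo=1e-9, hi=1.0, tol=1e-12):
    """Bisection on (34); assumes the derivative is positive at lo and negative at hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if first_derivative(mid, alpha, x) > 0:
            lo = mid   # still on the left-increasing branch
        else:
            hi = mid   # already on the right-decreasing branch
    return 0.5 * (lo + hi)


a_sf = stationary_point(alpha=1.0, x=0)    # SF firm, closed form available
a_nsf = stationary_point(alpha=1.0, x=1)   # NSF firm, numerical only

# Closed form (35) for the SF firm with the same illustrative parameters.
closed_form_sf = (1.0 / (1.5 * 1.0 * 1.0 ** 0.5)) ** (1 / (0.5 * 0.5))
```

The bracket matters for NSF firms: the linear term $XrA_l $ eventually dominates for very large equity, so the search is deliberately confined to the region where the sign pattern of (36) holds.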

Therefore, for SF and NSF the stationary point exists and it is a candidate to be the maximum point: writing $A_k^*$ makes explicit the dependence on the $k$-th behavioural rule through $\alpha _k $ as shown in (35); note also the effect of the market price $P$, which is a systemic observable. To characterise it definitively, the sign of the second derivative gives

$$\begin{aligned} \frac{\partial ^{2}\Pi }{\partial A^{2}}<0\Leftrightarrow 0<A_l <\bar{{A}}_k =\psi (\beta ,\delta )\left( {\frac{P}{\delta (1+Xr)w\gamma ^{\delta }\alpha _k^{\delta -1} }} \right) ^{\frac{1}{\beta (\delta -1)}} \end{aligned}$$

(37)

where

$$\begin{aligned} \psi (\beta ,\delta )=\left( {\frac{1-\beta }{1-\beta \delta }} \right) ^{\frac{1}{\beta (\delta -1)}}>0\Rightarrow \delta \in (0,1/\beta ) \end{aligned}$$

(38)

It is now worth stressing some considerations. It appears that the profit function is concave only on the restriction $0<A_l <\bar{{A}}_k <\infty $ of the equity domain. Therefore, $A_k^*$ is the maximum profit point only if $A_k^*<\bar{{A}}_k $, hence it should be $\psi (\beta ,\delta )>1$ which is fulfilled only if $\delta \in (0,1/\beta )$. Therefore, for every given behavioural rule there exists a stationary point: the sufficient condition for the stationary point to be the maximum profit point is $\delta \in (0,1/\beta )$ where $\beta \in (0,1)$.
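A quick numerical check of the concavity bound can make these conditions concrete. The Python sketch below evaluates $\psi (\beta ,\delta )$ from (38), the bound $\bar{{A}}_k $ from (37) and the second derivative (33) for an SF firm. It assumes $\phi =\beta \delta $ and $\theta =w\gamma ^{\delta }$ and uses illustrative parameters with $\delta \in (1,1/\beta )$; the function names are hypothetical.

```python
def psi(beta, delta):
    """psi(beta, delta) as in (38)."""
    return ((1 - beta) / (1 - beta * delta)) ** (1 / (beta * (delta - 1)))


def base_equity(alpha, x, price, theta, beta, delta, r):
    """Common factor of (35) and (37)."""
    denom = delta * (1 + x * r) * theta * alpha ** (delta - 1)
    return (price / denom) ** (1 / (beta * (delta - 1)))


def second_derivative(a, alpha, x, price, theta, beta, delta, r):
    """Right-hand side of (33), with phi = beta * delta (assumed)."""
    phi = beta * delta
    curly = ((1 - beta) * price * alpha * a ** beta
             + (phi - 1) * delta * (1 + x * r) * theta * alpha ** delta * a ** phi)
    return -beta / a ** 2 * curly


pars = dict(price=1.0, theta=1.0, beta=0.5, delta=1.5, r=0.05)

a_star = base_equity(alpha=1.0, x=0, **pars)                  # stationary point (35)
a_bar = psi(0.5, 1.5) * base_equity(alpha=1.0, x=0, **pars)   # concavity bound (37)

concave_inside = second_derivative(0.9 * a_bar, 1.0, 0, **pars) < 0
convex_outside = second_derivative(1.1 * a_bar, 1.0, 0, **pars) > 0
```

For these values $\psi =16$, so the stationary point lies well inside the concave region, and the second derivative changes sign at $\bar{{A}}_k $ as (37) states.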

As regarding profit, a brief note on the profit maximisation rule is here developed.^{Footnote 20} For notation convenience, reference to firms is neglected but time is needed setting $a_{t+k} =A(i,t+k), p_{t+1} =u(i,t+1)P(t+1), \alpha _t =\alpha (i,t)$ and $x_t =X(i,t)$. By assuming the optimal value of the control parameter to exist, the implicit optimisation problem is $\alpha _t^*(a_t ,p_{t+1} )=\arg \max E\{\pi (a_t ,\alpha _t ,p_{t+1} )\}$ with constraints $a_{t+1} =a(a_t ,E\{\pi (a_t ,\alpha _t ,p_{t+1} )\})\ge 0$ and $\alpha _t \ge 0$, where the individual profit is $\pi _{t+1} =\pi (a_t ,\alpha _t ,p_{t+1} )=\Pi (i,t+1)$.

The objective function (7), $O(\alpha _t )\equiv E\{\pi (a_t ,\alpha _t ,p_{t+1} )\}$ and the constraint (8), $c(\alpha _t )\equiv a_t +O(\alpha _t )$, are functions of the control parameter.^{Footnote 21} The profit function is differentiable, therefore the sufficient condition for the profit function to be concave ($\partial _\alpha ^2 O\le 0$) is always true if $\delta >1$. Since $\delta \in (0,1/\beta ): \beta \in (0,1)$, the profit curve is also concave in equity, hence the profit function is concave in the control parameter and in the state variable as well. Accordingly, $c(\alpha _t )$ is concave hence $C(\alpha _t )=-c(\alpha _t )\le 0$ is convex. Therefore, the standard form optimisation is $\max O(\alpha _t )$ with constraints $C(\alpha _t )\le 0$ and $-\alpha _t \le 0$, where the objective function is concave and the constraint is convex. By setting up the Lagrangean to deal with the Kuhn-Tucker conditions, the candidate solution is $\alpha _t^*(a_t ,x_t ,P_{t+1} )=(\delta \theta (1+x_{i,t} r)/P_{t+1} )^{1-\delta }/a_{i,t}^\beta >0$.

To be feasible, that is for a firm not to go bankrupt with certainty, the candidate solution must satisfy the constraint (8): $c(\alpha _t^*)\ge 0$ implies that $a_t \ge \tilde{a}(x_t ,P_{t+1} )=[\delta \theta (1+x_{i,t} r)/(\delta ^{1-\delta }P_{t+1}^\delta -P_{t+1}^2 )]^{1-\delta }>0$. Therefore, the optimal scheduling-output parameter $\alpha _t^*(a_t ,x_t ,P_{t+1} )$ is feasible only for the firm endowed enough to satisfy the constraint depending on the state of financial soundness $x_t $ while facing its own price expectation to be $P_{t+1} $. Finally, being $\delta \in (0,1/\beta ): \beta \in (0,1)$, if $\delta >2$ then if $\beta \in (0,1/2)$ the constraint is always fulfilled and hence the found solution is feasible to maximise profit. These conditions on parameters have been used in the simulations.

Appendix C: Interaction specific probabilities

This appendix sketches a formal development of interaction specific transition probabilities of Sect. 4 and how to involve them in specifying dynamic transition matrices $\mathbf{W}_\varsigma (t+\Delta )$. Since for an ‘effective reactant’ the probability to change its present behavioural rule to that of the ‘virtual reactant’ it is interacting with depends on the profit differential, this is the pilot-quantity for the switching probability. That is, the more the ‘virtual’ profit exceeds the ‘effective’ one, the more the effective reactant is likely to adopt the ‘virtual’ behavioural rule. To model this interaction and its probabilistic outcome it has been found convenient to represent the involved observables on the Cartesian plane as vectors defined by their modulus and trigonometric components.

Consider $\Pi (\lambda |\varsigma ; \tau _k )$ the aggregate profit of firms in state $X(\lambda |\varsigma )\in \Lambda $ and set $\Pi _+ (\varsigma ; \tau _k )=\sum _{\lambda \in \Lambda } {\left| {\Pi (\lambda |\varsigma ; \tau _k )} \right| } $. Define the profit share (with sign) $u(\lambda |\varsigma ; \tau _k )=\Pi (\lambda |\varsigma ; \tau _k )/\Pi _+ (\varsigma ; \tau _k ):=u_\lambda $.^{Footnote 22} Interaction in (9) is consistent with the following profit share interaction vector representation

$$\begin{aligned} \mathbf{u}_{hk} =(u_h ,u_k ):\left\{ {\begin{array}{l} u_h =\left\| {\mathbf{u}_{hk} } \right\| \cos \theta _{hk} : {\textit{effective}}\ {\textit{profit}}\ {\textit{share}} \\ u_k =\left\| {\mathbf{u}_{hk} } \right\| \sin \theta _{hk} : {\textit{virtual}}\ {\textit{profit}}\ {\textit{share}}\\ \end{array}} \right. \end{aligned}$$

(39)

that is, the profit vector implied by the interaction in (9) on a different scale on the Cartesian plane. According to (39) the interaction vector can be represented on the unit-circle with the following normalisation

$$\begin{aligned} \mathbf{v}_{hk} =\frac{\mathbf{u}_{hk} }{\left\| {\mathbf{u}_{hk} } \right\| }:\left\{ {\begin{array}{l} v_h =\cos \theta _{hk} : {\textit{effective}}\ {\textit{profit}}\ {\textit{indicator}} \\ v_k =\sin \theta _{hk} : {\textit{virtual}}\ {\textit{profit}}\ {\textit{indicator}}\\ \end{array}} \right. \end{aligned}$$

(40)
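The construction in (39) and (40) can be sketched in a few lines of Python. The example below computes signed profit shares, the interaction angle and the normalised vector for one pair of rules. The aggregate profits are made-up numbers, the function names are hypothetical, and the sketch assumes that total absolute profit and the vector norm are both non-zero.

```python
import math


def profit_shares(profits):
    """Signed profit shares u_lambda = Pi_lambda / sum of |Pi_lambda|."""
    total = sum(abs(p) for p in profits)
    return [p / total for p in profits]


def interaction_angle(u_h, u_k):
    """Angle theta_hk of the vector (u_h, u_k); atan2 keeps track of the quadrant."""
    return math.atan2(u_k, u_h)


def unit_vector(u_h, u_k):
    """Specific-interaction vector v_hk = u_hk / ||u_hk|| on the unit circle."""
    norm = math.hypot(u_h, u_k)
    return u_h / norm, u_k / norm


# Illustrative aggregate profits for three rules (rule index 0, 1, 2).
profits = [3.0, -1.0, 4.0]
u = profit_shares(profits)            # [0.375, -0.125, 0.5]

# Effective rule h = 0 meets virtual rule k = 2.
theta_02 = interaction_angle(u[0], u[2])
v_h, v_k = unit_vector(u[0], u[2])
```

After normalisation only the angle carries information: `v_h` and `v_k` are the cosine and sine of `theta_02`, whatever the scale of the underlying profits.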

The vector $\mathbf{v}_{hk} $ in (40) is called specific-interaction vector and the $\theta _{hk} $ is the interaction angle: this specification maps the interaction vectors on the Cartesian plane onto the unit-circle in such a way that the only meaningful quantity is the interaction angle. By applying a clockwise rotation $\vartheta =\theta -\pi /4$ to simplify calculations, together with sin/cos subtraction formulae, then

$$\begin{aligned} \mathbf{V}_{hk} :\left\{ {\begin{array}{l} V_h =\cos \vartheta _{hk} =\frac{v_h -v_k }{\sqrt{2}}\ {\textit{profit}}\ {\textit{differential}}\ {\textit{indicator}} \\ V_k =\sin \vartheta _{hk} =\frac{v_h +v_k }{\sqrt{2}} \\ \end{array}} \right. \end{aligned}$$

(41)

Due to the goniometric representation of the interaction vector $\mathbf{V}_{hk} $, the interaction-specific switching probability of (9) is found by involving the Cosine density: by definition of $\cos \vartheta _{hk} $ it involves a profit differential indicator for interaction specific probabilities to switch

$$\begin{aligned} r_{hk|k}&= A_h \int \limits _{-\pi /2}^{\vartheta _{hk} } {\left. {\frac{\cos \vartheta }{2}d\vartheta } \right| _{\vartheta _{hk} =\theta _{hk} -\pi /4} \!=\!\frac{\left( {1+\sin (\theta _{hk} -\pi /4)} \right) }{\left( {K+\sum \nolimits _{k\le K} {\sin (\theta _{hk} -\pi /4)} } \right) } }\nonumber \\&= r(\theta _{hk} ) : A_h= \frac{4}{K+\sum \nolimits _{k\le K} {\sin (\theta _{hk} -\pi /4)} } \end{aligned}$$

(42)

while maintenance probability is the complement $r_{hh|k} =1-r_{hk|k} $. Therefore, at each infra-time step, that is while the learning is developing, the following matrices are defined

$$\begin{aligned} \mathbf{W}^{s}\!=\left[ {{\begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} {r_{11|1} }&{} \cdots &{} {r_{1k|k} }&{} \cdots &{} {r_{1K|K} } \\ \vdots &{} &{} \vdots &{} &{} \vdots \\ {r_{h1|h} }&{} \cdots &{} {r_{hk|k} }&{} \cdots &{} {r_{hK|K} } \\ \vdots &{} &{} \vdots &{} &{} \vdots \\ {r_{K1|1} }&{} \cdots &{} {r_{Kk|k} }&{} \cdots &{} {r_{KK|K} } \\ \end{array} }} \right] \quad \mathbf{W}^{m}\!=\left[ {{\begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} {r_{11|1} }&{} \cdots &{} {r_{11|k} }&{} \cdots &{} {r_{11|K} } \\ \vdots &{} &{} \vdots &{} &{} \vdots \\ {r_{hh|h} }&{} \cdots &{} {r_{hh|k} }&{} \cdots &{} {r_{hh|K} } \\ \vdots &{} &{} \vdots &{} &{} \vdots \\ {r_{KK|1} }&{} \cdots &{} {r_{KK|k} }&{} \cdots &{} {r_{KK|K} } \\ \end{array} }} \right] \nonumber \\ \end{aligned}$$

(43)
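A minimal Python sketch of how the two matrices in (43) might be filled from profit shares is given below. It uses the closed form on the right of (42) directly, $r_{hk|k} =(1+\sin (\theta _{hk} -\pi /4))/(K+\sum _k \sin (\theta _{hk} -\pi /4))$, and takes the maintenance probability as the complement. The profit shares are illustrative numbers and the function names are hypothetical.

```python
import math


def switching_row(u, h):
    """Switching probabilities r_{hk|k} for all k, given signed profit shares u."""
    # Rotated interaction angle for each virtual rule k met by effective rule h.
    s = [math.sin(math.atan2(u_k, u[h]) - math.pi / 4) for u_k in u]
    denom = len(u) + sum(s)
    return [(1 + s_k) / denom for s_k in s]


def interaction_matrices(u):
    """Return (W_s, W_m): switching probabilities and their complements."""
    w_s = [switching_row(u, h) for h in range(len(u))]
    w_m = [[1 - r for r in row] for row in w_s]   # r_{hh|k} = 1 - r_{hk|k}
    return w_s, w_m


# Illustrative signed profit shares for K = 3 rules.
u = [0.375, -0.125, 0.5]
w_s, w_m = interaction_matrices(u)
```

With this normalisation each row of `w_s` sums to one, and a virtual rule with a larger profit share than the effective one receives a larger switching probability, in line with the role of the profit differential as pilot-quantity.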

Note that, by using aggregate profits from the ABM–DGP simulation for each species, (39)–(43) can be empirically evaluated. On the graph $\Gamma $ of Fig. 1 these probabilities are involved to define the transition probabilities $w(p,q)$ for the matrix $\mathbf{W}$. These probabilities can be analytically defined by means of a recursive method or graphically specified by following the paths on $\Gamma $ when the final state is fixed and the initial state is any initial (characteristic) state $L_p $.
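One possible reading of the recursive method is a forward pass over the $K$ test-sub-periods, accumulating the weight of every path on $\Gamma $. The Python sketch below is an interpretation under stated assumptions, not the authors' procedure: rules are tested in order, a firm either adopts the tested rule or keeps its current one, the "keep" branch is dropped when the current rule is the one being tested (so that no path is counted twice), and the factor $H_{p,q} $ is left out, so the weights are unnormalised. Rule indices start at 0 and the function name is hypothetical.

```python
def path_weights(p, w_s, w_m):
    """Unnormalised weights of ending in each rule, starting from rule p.

    w_s[c][k]: probability that a firm using rule c adopts the tested rule k.
    w_m[c][k]: probability that it keeps rule c when testing rule k.
    """
    n_rules = len(w_s)
    weights = [0.0] * n_rules
    weights[p] = 1.0
    for k in range(n_rules):                 # test-sub-period for rule k
        updated = [0.0] * n_rules
        for c, mass in enumerate(weights):
            if mass == 0.0:
                continue
            updated[k] += mass * w_s[c][k]   # adopt the tested rule
            if c != k:                       # keeping is a distinct path only if c != k
                updated[c] += mass * w_m[c][k]
        weights = updated
    return weights


# Illustrative 3-rule example with made-up switching probabilities.
w_s = [[0.5, 0.3, 0.2],
       [0.4, 0.4, 0.2],
       [0.1, 0.3, 0.6]]
w_m = [[1 - r for r in row] for row in w_s]
w_unnormalised = [path_weights(p, w_s, w_m) for p in range(3)]
```

Under this reading the forward pass reproduces the bracketed path sums written out for the first two final states, up to the factor $H_{p,q} $.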

These probabilities have constraints depending on the initial reactant $L_p $: constraints switch-off some paths not to double their probabilities. As the $L_p $ changes the previous formulae give the transition probabilities in $\mathbf{W}$: as $L_p $ changes then $w(p,1)$ fills the first column in $\mathbf{W}$ relative to the final state $L_q :q=1, w(p,2)$ gives the second column in $\mathbf{W}$ and so on up to $w(p,K)$ where the final state is $L_q :q=K$ being.

$$\begin{aligned}&w(p,1)=H_{p,1} \left[ {r_{p1|1} } \right] \prod _{k\ge 2} {r_{11|k} } \end{aligned}$$

(44)

$$\begin{aligned}&w(p,2)=H_{p,2} \left[ {r_{pp|1} r_{p2|2} +r_{p1|1} r_{12|2} } \right] \prod _{k\ge 3} {r_{22|k} } \nonumber \\&\qquad \qquad \qquad s.t. p=1\Rightarrow r_{p2|2} =0 \end{aligned}$$

(45)

$$\begin{aligned}&w(p,3)=H_{p,3} \left[ [r_{p1|1} r_{11|2} ]r_{13|3} +[r_{p1|1} r_{12|2} +r_{pp|1} r_{p2|2} ]r_{23|3}+[r_{pp|1} r_{pp|2} ]r_{p3|3} \right] \prod _{k\ge 4} {r_{33|k}} \nonumber \\&\qquad \qquad \qquad s.t. p=1\Rightarrow r_{p2|2} =r_{pp|2} =0\wedge p=2\Rightarrow r_{p3|3} =0 \end{aligned}$$

(46)

$$\begin{aligned}&w(p,4)=H_{p,4} \left[ {[r_{p1|1} r_{11|2} r_{11|3} ]r_{14|4} +[[r_{pp|1} r_{p2|2} +r_{p1|1} r_{12|2} ]r_{22|3} ]r_{24|4} } \right. \nonumber \\&\qquad \qquad \qquad +\,[r_{pp|1} [r_{pp|2} r_{p3|3} +r_{p2|2} r_{23|3} ]+r_{p1|1} [r_{11|2} r_{13|3} +r_{12|2} r_{23|3} ]]r_{34|4} \nonumber \\&\qquad \qquad \qquad +\,\left. {[r_{pp|1} r_{pp|2} r_{pp|3} ]r_{p4|4} } \right] \prod _{k\ge 5} {r_{44|k} } \nonumber \\&\quad p=1\Rightarrow r_{p2|2} =0\wedge p=2\Rightarrow r_{p3|3} =r_{pp|3} =0\wedge p=3\Rightarrow r_{p4|4} =0 \end{aligned}$$

(47)

$$\begin{aligned}&w(p,5)=H_{p,5} \left\{ { [r_{p1|1} r_{11|2} r_{11|3} r_{11|4} ]r_{15|5} +[r_{pp|1} r_{p2|2} +r_{p1|1} r_{12|2} ]r_{22|3} r_{22|4} r_{25|5}} \right. \nonumber \\&\qquad \qquad \qquad +\, [r_{pp|1} [r_{pp|2} r_{p3|3} +r_{p2|2} r_{23|3} ]+r_{p1|1} [r_{11|2} r_{13|3} +r_{12|2} r_{23|3} ]]r_{33|4} r_{35|5} \nonumber \\&\qquad \qquad \qquad +\,\left[ {\begin{array}{l} r_{pp|1} [r_{pp|2} [r_{pp|3} r_{p4|4} +r_{p3|3} r_{34|4} ]+r_{p2|2} [r_{22|3} r_{24|4} +r_{23|3} r_{34|4} ]] \\ +\,r_{p1|1} [r_{11|2} [r_{11|3} r_{14|4} +r_{13|3} r_{34|4} ]+r_{12|2} [r_{22|3} r_{24|4} +r_{23|3} r_{34|4} ]] \end{array}} \right] r_{45|5} \nonumber \\&\qquad \qquad \qquad +\,\left. {[r_{pp|1} r_{pp|2} r_{pp|3} r_{pp|4} ]r_{p5|5} } \right\} \prod _{k\ge 6} {r_{55|k} } \nonumber \\&\qquad \qquad \qquad \, p=1\Rightarrow r_{p2|2} =0\wedge p=2\Rightarrow r_{p3|3} =0 \nonumber \\&\qquad \qquad \qquad \, p=3\Rightarrow r_{p4|4} =r_{pp|4} =0\wedge p=4\Rightarrow r_{p5|5} =0 \end{aligned}$$

(48)

$$\begin{aligned}&w(p,6)=H_{p,6} \left\{ {[r_{p1|1} r_{11|2} r_{11|3} r_{11|4} r_{11|5} ]r_{16|6} } \right. \nonumber \\&\quad +\, [(r_{pp|1} r_{p2|2} +r_{p1|1} r_{12|2} )]r_{22|3} r_{22|4} r_{22|5} r_{26|6} \nonumber \\&\quad +\,[r_{pp|1} (r_{pp|2} r_{p3|3} +r_{p2|2} r_{23|3} )+r_{p1|1} (r_{11|2} r_{13|3} +r_{12|2} r_{23|3} )]r_{33|4} r_{33|5} r_{36|6} \nonumber \\&\quad +\,\left[ {\begin{array}{l} r_{pp|1} [r_{pp|2} (r_{pp|3} r_{p4|4} +r_{p3|3} r_{34|4} )+r_{p2|2} (r_{22|3} r_{24|4} +r_{23|3} r_{34|4} )] \\ +\,r_{p1|1} [r_{11|2} (r_{11|3} r_{14|4} +r_{13|3} r_{34|4} )+r_{12|2} (r_{22|3} r_{24|4} +r_{23|3} r_{34|4} )] \end{array}} \right] r_{44|5} r_{46|6} \nonumber \\&\quad +\,\left[ {\begin{array}{l} r_{pp|1} \left[ {\begin{array}{l} r_{pp|2} \left( {r_{pp|3} (r_{pp|4} r_{p5|5} +r_{p4|4} r_{45|5} )+r_{p3|3} (r_{33|4} r_{35|5} +r_{34|4} r_{45|5} )} \right) \\ +\,r_{p2|2} \left( {r_{22|3} (r_{22|4} r_{25|5} +r_{24|4} r_{45|5} )+r_{23|3} (r_{33|4} r_{35|5} +r_{34|4} r_{45|5} )} \right) \end{array}} \right] \\ +\,r_{p1|1} \left[ {\begin{array}{l} r_{11|2} \left( {r_{11|3} (r_{11|4} r_{15|5} +r_{14|4} r_{45|5} )+r_{13|3} (r_{33|4} r_{35|5} +r_{34|4} r_{45|5} )} \right) \\ +\,r_{12|2} \left( {r_{22|3} (r_{22|4} r_{25|5} +r_{24|4} r_{45|5} )+r_{23|3} (r_{33|4} r_{35|5} +r_{34|4} r_{45|5} )} \right) \end{array}} \right] \end{array}} \right] r_{56|6} \nonumber \\&\quad +\,\left. {[r_{pp|1} r_{pp|2} r_{pp|3} r_{pp|4} r_{pp|5} ]r_{p6|6} } \right\} r_{66|7} \nonumber \\&\quad p=1\Rightarrow r_{p2|2} =0\wedge p=2\Rightarrow r_{p3|3} =0 \nonumber \\&\quad p=3\Rightarrow r_{p4|4} =r_{pp|4} =0\wedge p=4\Rightarrow r_{p5|5} =r_{pp|5} =0 \nonumber \\&\quad p=5\Rightarrow r_{p6|6} =0 \end{aligned}$$

(49)

$$\begin{aligned}&w(p,7)=H_{p,7} \left\{ {[r_{p1|1} r_{11|2} r_{11|3} r_{11|4} r_{11|5} r_{11|6} ]r_{17|7} } \right. \nonumber \\&\quad +\,[(r_{pp|1} r_{p2|2} +r_{p1|1} r_{12|2} )]r_{22|3} r_{22|4} r_{22|5} r_{22|6} r_{27|7} \nonumber \\&\quad +\,[r_{pp|1} (r_{pp|2} r_{p3|3} +r_{p2|2} r_{23|3} )+r_{p1|1} (r_{11|2} r_{13|3} +r_{12|2} r_{23|3} )]r_{33|4} r_{33|5} r_{33|6} r_{37|7} \nonumber \\&\quad +\,\left[ {\begin{array}{l} r_{pp|1} [r_{pp|2} (r_{pp|3} r_{p4|4} +r_{p3|3} r_{34|4} )+r_{p2|2} (r_{22|3} r_{24|4} +r_{23|3} r_{34|4} )] \\ +\,r_{p1|1} [r_{11|2} (r_{11|3} r_{14|4} +r_{13|3} r_{34|4} )+r_{12|2} (r_{22|3} r_{24|4} +r_{23|3} r_{34|4} )] \end{array}} \right] r_{44|5} r_{44|6} r_{47|7} \nonumber \\&\quad +\,\left[ {\begin{array}{l} r_{pp|1} \left[ {\begin{array}{l} r_{pp|2} \left[ {r_{pp|3} (r_{pp|4} r_{p5|5} +r_{p4|4} r_{45|5}) +r_{p3|3} (r_{33|4} r_{35|5} +r_{34|4} r_{45|5} )} \right] \\ +\,r_{p2|2} \left[ {r_{22|3} (r_{22|4} r_{25|5} +r_{24|4} r_{45|5} )+r_{23|3} (r_{33|4} r_{35|5} +r_{34|4} r_{45|5} )} \right] \end{array}} \right] \\ +\,r_{p1|1} \left[ {\begin{array}{l} r_{11|2} \left[ {r_{11|3} (r_{11|4} r_{15|5} +r_{14|4} r_{45|5} )+r_{13|3} (r_{33|4} r_{35|5} +r_{34|4} r_{45|5} )} \right] \\ +\,r_{12|2} \left[ {r_{22|3} (r_{22|4} r_{25|5} +r_{24|4} r_{45|5} )+r_{23|3} (r_{33|4} r_{35|5} +r_{34|4} r_{45|5} )} \right] \end{array}} \right] \end{array}} \right] r_{55|6} r_{57|7} \nonumber \\&\quad +\,\left[ {\begin{array}{l} r_{pp|1} \left[ {\begin{array}{l} r_{pp|2} \left[ {\begin{array}{l} r_{pp|3} \left[ {r_{pp|4} (r_{pp|5} r_{p6|6} +r_{p5|5} r_{56|6} )+r_{p4|4} (r_{44|5} r_{46|6} +r_{45|5} r_{56|6} )} \right] \\ +\,r_{p3|3} \left[ {r_{33|4} (r_{33|5} r_{36|6} +r_{35|5} r_{56|6} )+r_{34|4} (r_{44|5} r_{46|6} +r_{45|5} r_{56|6} )} \right] \end{array}} \right] \\ +\,r_{p2|2} \left[ {\begin{array}{l} r_{22|3} \left[ {r_{22|4} (r_{22|5} r_{26|6} +r_{25|5} r_{56|6} )+r_{24|4} (r_{44|5} r_{46|6} +r_{45|5} r_{56|6} )} \right] \\ +\,r_{23|3} \left[ {r_{33|4} (r_{33|5} r_{36|6} +r_{35|5} r_{56|6} )+r_{34|4} (r_{44|5} r_{46|6} +r_{45|5} r_{56|6} )} \right] \end{array}} \right] \end{array}} \right] \\ +\,r_{p1|1} \left[ {\begin{array}{l} r_{11|2} \left[ {\begin{array}{l} r_{11|3} \left[ {r_{11|4} (r_{11|5} r_{16|6} +r_{15|5} r_{56|6} )+r_{14|4} (r_{44|5} r_{46|6} +r_{45|5} r_{56|6} )} \right] \\ +\,r_{13|3} \left[ {r_{33|4} (r_{33|5} r_{36|6} +r_{35|5} r_{56|6} )+r_{34|4} (r_{44|5} r_{46|6} +r_{45|5} r_{56|6} )} \right] \end{array}} \right] \\ +\,r_{12|2} \left[ {\begin{array}{l} r_{22|3} \left[ {r_{22|4} (r_{22|5} r_{26|6} +r_{25|5} r_{56|6} )+r_{24|4} (r_{44|5} r_{46|6} +r_{45|5} r_{56|6} )} \right] \\ +\,r_{23|3} \left[ {r_{33|4} (r_{33|5} r_{36|6} +r_{35|5} r_{56|6} )+r_{34|4} (r_{44|5} r_{46|6} +r_{45|5} r_{56|6} )} \right] \end{array}} \right] \end{array}} \right] \end{array}} \right] r_{67|7} \nonumber \\&\quad +\,\left. {[r_{pp|1} r_{pp|2} r_{pp|3} r_{pp|4} r_{pp|5} r_{pp|6} ]r_{p7|7} } \right\} \nonumber \\&\quad \left\{ {\begin{array}{l} p=1\Rightarrow r_{p2|2} =r_{pp|2} =0\wedge p=2\Rightarrow r_{p3|3} =r_{pp|3} =0 \\ p=3\Rightarrow r_{pp|4} =0 \wedge p=4\Rightarrow r_{pp|5} =0 \\ p=5\Rightarrow r_{pp|6} =0 \wedge p=6\Rightarrow r_{pp|6} =r_{66|6} /2 \end{array}} \right. \end{aligned}$$

(50)

All these probabilities can be specified for any state of financial fragility $\varsigma \in \Sigma $ through time, that is $\forall [t,t+\Delta )$ up to the end of the simulation time at $T$.
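The recursive construction behind (44)–(50) can be sketched as a forward recursion over the $K$ interaction steps. The Python snippet below is a simplified illustration under stated assumptions: it omits the factors $H_{p,q}$ and the path constraints listed under each formula, it assumes that an agent already in state $k$ simply stays there at step $k$, and the function name `transition_matrix` and the randomly drawn values of $r_{hk|k}$ are hypothetical.

```python
import numpy as np

def transition_matrix(r_switch):
    """Forward recursion over the K interaction steps.

    r_switch[h, k] plays the role of r_{hk|k}; the maintenance probability
    r_{hh|k} is taken as 1 - r_switch[h, k].
    Assumptions: no H_{p,q} factor, no path constraints, and an agent
    already in state k stays in k at step k.
    """
    r_switch = np.asarray(r_switch, dtype=float)
    K = r_switch.shape[0]
    W = np.zeros((K, K))
    for p in range(K):
        dist = np.zeros(K)
        dist[p] = 1.0                      # start in the initial state L_p
        for k in range(K):                 # interaction with species k
            new = np.zeros(K)
            for h in range(K):
                if dist[h] == 0.0:
                    continue
                move = 0.0 if h == k else r_switch[h, k]
                new[k] += dist[h] * move           # switch to k
                new[h] += dist[h] * (1.0 - move)   # maintain h
            dist = new
        W[p] = dist                        # row p holds w(p, q) for all q
    return W

rng = np.random.default_rng(0)
r = rng.uniform(0.1, 0.9, size=(4, 4))     # hypothetical r_{hk|k}, K = 4
W = transition_matrix(r)

# Compare with the bracket of (45) for initial species 3 and final species 2
# (zero-based indices p = 2, q = 1), times the maintenance product over k >= 3.
p, q = 2, 1
closed_form = ((1 - r[p, 0]) * r[p, 1] + r[p, 0] * r[0, 1]) \
    * (1 - r[q, 2]) * (1 - r[q, 3])
print(np.isclose(W[p, q], closed_form))
```

The recursion enumerates the same paths on $\Gamma $ that the closed forms list explicitly, so it scales to any $K$ without writing out the nested brackets; reinstating $H_{p,q}$ and the constraints would require the definitions given elsewhere in the paper.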

Cite this article

Landini, S., Gallegati, M. & Stiglitz, J.E. Economies with heterogeneous interacting learning agents. J Econ Interact Coord 10, 91–118 (2015). https://doi.org/10.1007/s11403-013-0121-1

Received: 31 March 2013
Accepted: 25 December 2013
Published: 21 January 2014
Issue Date: April 2015
DOI: https://doi.org/10.1007/s11403-013-0121-1
