Dynamic coalition formation and the core

https://doi.org/10.1016/S0167-2681(02)00015-X

Abstract

This paper presents a dynamic model of endogenous coalition formation in cooperative games with transferable utility (TU). The players are boundedly rational. At each time step, a player decides which of the existing coalitions to join, and demands a payoff. These decisions are determined by a best-reply rule, given the coalition structure and allocation in the previous period. Further, the players experiment with myopically suboptimal strategies whenever there are potential gains from trade. We establish an isomorphism between the set of absorbing states of the process and the set of core allocations, and show that the process converges to one of these states with probability one whenever the core is non-empty. These results do not require superadditivity of the characteristic function, and they carry over to the case of coalitional values depending on the coalition structure, and to non-transferable utility (NTU) games.

Introduction

Most equilibrium concepts in games, both cooperative and non-cooperative, are static by definition. For example, the core of a cooperative game with transferable utility (TU game) is the set of feasible allocations that cannot be blocked by any coalition of players. This implies that core allocations are stable in the sense that, once a core allocation is achieved, no subset of players can gain by deviating from it. However, the theory fails to explain how the players arrive at a core allocation, or at equilibrium in general. Reaching a certain allocation (core or otherwise) in a TU game requires the completion of two a priori unrelated processes on the part of the players: coalition formation, and bargaining about how to split the surplus within each of the coalitions. Once coalitions have formed, the surplus, or payoff accruing to each coalition, is determined by the characteristic function of the game. However, the concept of the core relies on allocations, i.e. individual payoffs, rather than coalitional surplus. The link between the two concepts, the characteristic function on the one hand and the core on the other, is an implicit bargaining process that determines the division of the coalitional payoff between the members of the coalition. The theory of cooperative games ignores both the issue of coalition formation and the bargaining process.
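As an illustration of the blocking condition described above, the following minimal Python sketch lists the coalitions that can block a given allocation. It uses the textbook criterion that a coalition S blocks x when v(S) exceeds the total paid to its members. The three-player game, its numbers, and all function names are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def all_coalitions(players):
    """Yield every nonempty subset of `players` as a frozenset."""
    players = list(players)
    for size in range(1, len(players) + 1):
        for members in combinations(players, size):
            yield frozenset(members)


def blocking_coalitions(v, x, tol=1e-9):
    """Return the coalitions whose worth v[S] exceeds what x pays their members."""
    return [S for S in all_coalitions(x) if v[S] > sum(x[i] for i in S) + tol]


def is_feasible(v, x, structure, tol=1e-9):
    """Check that each coalition in the partition `structure` can afford its payoffs."""
    return all(sum(x[i] for i in S) <= v[frozenset(S)] + tol for S in structure)


# Hypothetical game: singletons earn 0, pairs earn 4, the grand coalition earns 9.
players = (1, 2, 3)
worth_by_size = {1: 0.0, 2: 4.0, 3: 9.0}
v = {S: worth_by_size[len(S)] for S in all_coalitions(players)}

grand = [players]  # coalition structure consisting of the grand coalition only
equal_split = {1: 3.0, 2: 3.0, 3: 3.0}
lopsided = {1: 7.0, 2: 1.0, 3: 1.0}

print(is_feasible(v, equal_split, grand), blocking_coalitions(v, equal_split))
print(is_feasible(v, lopsided, grand), blocking_coalitions(v, lopsided))
```

In this example the equal split is feasible and unblocked, whereas the lopsided allocation is feasible but blocked by players 2 and 3, who jointly receive 2 although they could secure 4 on their own. The formal definition of the core used in the model section, which also covers non-superadditive games, may differ in detail from this simplified check.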

The present paper addresses these issues. We are interested in questions like: How do coalitions form? How do the players decide on the division of the coalitional payoff? How do coalition structures change over time? Which of the possible coalition structures will the players eventually arrive at, and what will be the resulting allocation?

Dynamic learning models provide a framework for analysing these questions. These models are based on the assumption that players are only boundedly rational, and follow simple adaptation rules which are based on myopic optimization. While dynamic learning models have been widely applied to non-cooperative games (Ellison, 1993; Kandori et al., 1993; and Young, 1993, are among the seminal works), relatively little research in this field has been done with respect to cooperative games.

This paper provides a dynamic model of endogenous coalition formation. The setup is similar to models of dynamic learning in non-cooperative games with local interaction and player mobility (e.g. Arnold (1999)). In these models, players can move freely between several locations at each time step, and interaction, i.e. the play of a game, takes place only between players inhabiting the same location. A strategy for each player thus consists of a location choice and an action for the game. Similarly, in the context considered here, a player’s strategy consists of a coalition choice and a demand for his share of the coalitional payoff. That is, at each time step, a player decides which of the existing coalitions to join, and demands a share of the payoff, which is determined by the characteristic function. A player will join (or quit) a coalition if and only if he believes it is in his own best interest to do so. Therefore, these decisions are determined by a (non-cooperative) best-reply rule: A player switches coalitions only if his expected payoff in the new coalition exceeds his current payoff, and he demands the most he can get conditional on feasibility. More precisely, the player observes the prevailing coalition structure and the demands of the other players. Expectations are adaptive in the sense that each player expects the present coalition structure and demand to prevail in the next period. The player then chooses the coalition in which he can demand the highest possible payoff, given the demands of the other members of that coalition, and subject to feasibility. As time goes to infinity, the process generated by all players’ adopting the best-reply rule converges to an absorbing state (or set of states). Under the pure best-reply process, absorbing states do not necessarily involve core allocations. However, if we allow the players to experiment, i.e. deviate from the best-reply rule with a small probability whenever there exists a potentially better outcome, all absorbing states will be identified with core allocations.
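The best-reply rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the formal process defined in the model section: the function name `best_reply`, the example game, and the state are hypothetical, the option of going alone is assumed to be always available, and ties are resolved in favour of staying put so that a player switches only on a strict gain.

```python
from itertools import combinations


def best_reply(player, structure, demands, v):
    """Return (coalition_to_join, demand) for `player` under a myopic best reply.

    structure: list of frozensets partitioning the players (last period's state)
    demands:   dict mapping each player to the demand announced last period
    v:         dict mapping each frozenset coalition to its worth
    """
    current = next(S for S in structure if player in S)
    # Candidate hosts: every existing coalition with the player removed, plus
    # the empty host (assumption: going alone is always possible).
    hosts = {S - {player} for S in structure} | {frozenset()}

    def residual(host):
        # Largest demand that stays feasible if the others keep their demands.
        return v[host | {player}] - sum(demands[j] for j in host)

    best = current - {player}  # default: stay in the current coalition
    for host in sorted(hosts, key=sorted):  # fixed order for reproducibility
        if residual(host) > residual(best):  # switch only on a strict gain
            best = host
    return best | {player}, residual(best)


# Hypothetical game: singletons earn 0, pairs earn 4, the grand coalition earns 9.
players = (1, 2, 3)
worth_by_size = {1: 0.0, 2: 4.0, 3: 9.0}
v = {frozenset(c): worth_by_size[k]
     for k in range(1, 4) for c in combinations(players, k)}

# Last period's state: players 1 and 2 share a coalition, player 3 is alone.
structure = [frozenset({1, 2}), frozenset({3})]
demands = {1: 2.0, 2: 2.0, 3: 0.0}

print(best_reply(3, structure, demands, v))
print(best_reply(1, structure, demands, v))
```

In this state player 3 would join players 1 and 2 and demand the residual 9 - 4 = 5, while player 1 would leave player 2 for player 3 and demand 4 instead of his current 2. Because expectations are adaptive, each reply is computed against last period's state, so simultaneous replies by several players need not be mutually feasible.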

Although there have been several experimental studies on coalition formation (e.g. Rapoport et al., 1979; Sauermann, 1978), only very few theoretical papers deal with the problem of coalition formation in a dynamic context. These are the works by Packel (1981), Shenoy (1979, 1980), and, most closely related to this paper, Agastya (1997, 1999) and the very recent work by Konishi and Ray (2001). Agastya (1997) presents a dynamic model of social learning where, in each period, each player observes a random sample of demand vectors drawn from a finite history, and adjusts his demand according to a best-reply rule. This rule differs from the one used in the present paper in that players maximize their expected payoffs, conditional on the probability that their demand is compatible with a feasible allocation. Agastya assumes that, whenever there exists any coalition structure for which a player’s demand is feasible, given the other players’ demands, the player receives the payoff he demands with probability one. The process of coalition formation is not modeled. Agastya derives an isomorphism between the set of absorbing states of the learning process and the core of the game. Agastya (1999) extends the model by introducing “mistakes” on the part of the players, and finds that the set of stochastically stable states is a subset of the set of core states.

Our model departs from Agastya in several respects. First, Agastya entirely abstracts from coalition formation, and focuses on allocations. The bargaining process considered by Agastya is simple: Each player announces his demand, i.e. the payoff he aspires to get. If there exists a coalition structure such that the vector of all players’ demands is feasible, then each player will get his demand with probability one. Agastya (1999) writes: “it is reasonable to assume that eventually, a maximal coalition (in terms of set inclusion) whose demands are feasible forms.” Indeed, for the class of superadditive games considered by Agastya, this assumption is reasonable. Our model, however, is not restricted to superadditive games. We allow for the case that, e.g., large organizations may operate less efficiently than the sum of their constituent parts. In this case, it is not reasonable to assume that a maximal coalition will form. Instead, we model the coalition formation process explicitly, by the players’ choosing both a demand and a coalition in each period. The coalition structure in each period is thus endogenously determined, which allows us to study how coalitions of players evolve over time. Further, letting the players choose their coalitions makes our model applicable to a wider range of economic problems, such as local public good economies, or clubs, where individuals care not only about allocations but also about the number and/or the characteristics of people in their coalition.

A model of endogenous coalition formation in a dynamic context is provided by Packel (1981). He defines a Markov process on the set of outcomes, i.e. payoff allocations. Given the individual preferences over all outcomes, the transition probability from one state to another is proportional to the number of minimal coalitions that prefer the new state to the old one. The core is then defined by the union of the absorbing states of the process. The stochastic solution of the process is the probability distribution obtained by letting time go to infinity. It follows that, whenever the core is non-empty, the stochastic solution places probability one on the set of core allocations. Moreover, Packel shows that, if the strong core (i.e. the singleton set of undominated outcomes that can be reached from every other outcome with positive probability) is non-empty, the stochastic solution places probability one on that state. The main difference between Packel’s model and our own is that Packel abstracts from behavioural rules at the individual level, while we explicitly model the players’ coalition choice and demands with respect to payoff.

Konishi and Ray (2001) also consider a stochastic dynamic process of coalition formation where the probabilities of transition between coalition structures are determined by a Markov chain. Given these transition probabilities, the players maximize the present value of their future expected payoffs (where the payoffs are determined by a characteristic function). This gives rise to a value function for each player, determining his expected discounted payoff given any initial state. An equilibrium process of coalition formation is defined as follows: The transition probability from one state (i.e. a coalition structure and an associated vector of value functions) to another is positive only if there exists a coalition whose members are all better (or at least as well) off in the new state than in the old one, and there is no strictly better alternative state for this coalition. Further, if there is a coalition structure that makes all members of a coalition better off, the probability of the system staying in the old state is 0.

The main difference from our approach is that, in Konishi and Ray’s model, the players are farsighted rather than myopic. Indeed, the players being myopic corresponds to the special case of a discount factor of 0. From the point of view of the present paper, their most interesting result is the following: For the class of deterministic processes with a unique limit state, they establish an equivalence relationship between the core and this limit state, provided that the discount factor is large enough. While the concept of the core is myopic in the sense that it requires stability against deviations by coalitions without further examining the possible consequences of these deviations, Konishi and Ray’s result sustains the core in the dynamic context by showing that it is consistent even with farsightedness on the part of the players. However, the equivalence relationship breaks down if one allows for stochastic transitions. In contrast, our model of boundedly rational play sustains the core as the unique limit state of a Markov chain based on myopic best replies and experimenting on the part of the players. This result is of independent interest, as it justifies the core on the basis of bounded rationality. What is more, a comparison of our result with that of Konishi and Ray shows that bounded rationality may even be more conducive to reaching efficient allocations than farsightedness: When players are farsighted, as in Konishi and Ray’s model, the core will be reached only if it is the unique limit state of the process. With boundedly rational players, as in our model, a core allocation will be reached even if the Markov process exhibits several inefficient absorbing states (always presuming that the core is non-empty).

For the sake of completeness, we would like to mention other strands of the literature concerned with bargaining and coalition formation, which are (albeit somewhat remotely) related to our model. These approaches come under the heading of “endogenous coalition formation”, e.g. Ray and Vohra (1997, 1999), or “non-cooperative models of coalition formation”, e.g. Perry and Reny (1994). This literature provides models of bargaining that lead to some kind of stable coalition structure. However, these models are entirely different in spirit since they focus on a one-shot game, i.e. the process ends once a stable coalition structure has emerged. In contrast, our model is inherently dynamic: Coalition choice and bargaining are carried out repeatedly according to a predetermined adaptation process, the best-reply rule, which never ends. To illustrate this difference by an example, consider the model of Perry and Reny (1994). Building on a model by Kalai et al. (1979), Perry and Reny use a continuous time model where at each point in time a player can make a proposal consisting of a coalition to which the player would like to belong and a payoff allocation for the members of that coalition. If the proposal is accepted by the members of the coalition, these players drop out of the game and the remaining players continue bargaining. In effect, Perry and Reny associate with every cooperative game with transferable utility a non-cooperative sequential game, and show that the stationary subgame perfect equilibria of this game coincide with the core allocations of the cooperative game, thus providing a non-cooperative motivation for core allocations. While models of non-cooperative foundations of the core are interesting in their own right, they are conceptually different from our dynamic approach.

The remainder of the paper is organized as follows. The next section introduces the basic model. We provide a definition of the core for non-superadditive games, and describe the adaptation process. Section 3 shows the existence of absorbing states. In Section 4, we modify the process by introducing noise. Section 5 provides our main result, namely that the modified process converges to a state involving a core allocation with probability one as time tends towards infinity, regardless of the initial state in which the process starts. Section 6 discusses possible extensions of the model, and Section 7 concludes.

Section snippets

The model

Let $N=\{1,\dots,n\}$ denote the set of players. Any subset $S \subseteq N$ is called a coalition. The set of all nonempty coalitions $2^N \setminus \{\emptyset\}$ is denoted by $\mathcal{N}$. A game in characteristic function form with transferable utility (or for short a TU game) is defined by a mapping $v: \mathcal{N} \to \mathbb{R}$, the characteristic function. This function $v$ associates with any nonempty coalition the maximal total payoff for that coalition. Note that the payoff for a coalition does not depend on the behaviour of other coalitions. A vector of payoffs x

Absorbing states

Absorbing states represent stable strategy configurations in the sense that no player wants to revise his strategy. Assumption 1 provides a sufficient condition for the existence of absorbing states.

Theorem 1

If Assumption 1 holds, the best-reply process has at least one absorbing state.

Proof

Assumption 1 implies $v(N) \geq \sum_{i \in N} v(\{i\})$. An absorbing state is constructed as follows: Assume that the grand coalition $\{N\}$ has formed and each player receives a payoff $x_i \geq v(\{i\})$ with $\sum_{i \in N} x_i = v(N)$. As payoffs are individually

Best reply with experimentation

The result that the set of absorbing states may comprise inefficient or non-equilibrium states is not new, as the literature on evolutionary models of non-cooperative games shows (e.g. Kandori et al., 1993, Young, 1993). A solution to this problem is to introduce noise to the dynamics by allowing the players to pick myopically suboptimal strategies with a small probability, which is supposed to model evolutionary mutations, or “trembles”, that are interpreted as mistakes. The limit distribution

Convergence to absorbing states

Theorem 2 states that, if the core is non-empty, each core allocation can be reached in an absorbing state, and any absorbing state can be associated with a core allocation. However, the theorem does not guarantee that a core allocation will actually be reached by the process. While the theory of Markov chains provides a result that ensures convergence towards an ergodic set, it does not guarantee that such a set is a singleton, i.e. an absorbing state (Example 2). The following theorem excludes

Extensions of the model

The characteristic function form of a game is based on the assumption that the value of a coalition is independent of the coalition structure. In many economic situations, however, this is not the case. For instance, consider a Cournot-oligopoly model where the firms have the option to form binding coalitions, i.e. cartels. Then, the payoff to each coalition depends on the entire coalition structure in the industry. By redefining the payoff functions and the concept of blocking, our model can

Conclusion

This paper proposed a dynamic process of endogenous coalition formation in cooperative games. Coalition membership and the allocation of payoffs in each period are determined by a simple adaptation rule that is based on myopic best replies on the part of the players, and players experiment with suboptimal strategies whenever there is a chance that this might lead to a preferred coalition structure. Under a very mild condition concerning the characteristic function, absorbing states are shown to

Acknowledgements

We wish to thank Andreas Blume for his suggestions and comments, and we are grateful to the participants of the workshop Coalition Formation and Applications to Economics, Bilbao, April 1999, for helpful comments and discussions. We are indebted to an anonymous referee for constructive comments.

References (27)

  • Arnold, T., 2000. A dynamic model of a local public good economy with crowding, working...
  • Ellison, G., 1993. Learning, local interaction, and coordination. Econometrica.
  • Feldman, A., 1973. Bilateral trading processes, pairwise optimality, and Pareto optimality. Review of Economic Studies.