1 Introduction

The best-response dynamic is a ubiquitous iterative game-playing process in which, at each time step, players myopically select actions that are a best-response to the actions last chosen by all other players. The literature at large has established the equilibrium convergence properties of the best-response dynamic in games with specific payoff structures; particularly in potential games (Monderer and Shapley 1996), but also in weakly acyclic games (Fabrikant et al. 2013), aggregative games (Dindoš and Mezzetti 2006), and quasi-acyclic games (Friedman and Mezzetti 2001; Takahashi and Yamamori 2002). So known results are restricted to special cases. The performance of the best-response dynamic in the class of all games remains to be established. In this paper, we consider the question of whether the best-response dynamic converges to a pure Nash equilibrium in a small or large fraction of all possible normal-form games.

To answer our question, we take a “random games” approach: we determine whether the best-response dynamic converges to a pure Nash equilibrium in a game drawn at random from among all possible games. The random games approach has a long history in game theory (since Goldman 1957; Goldberg et al. 1968, and Dresher 1970), and has been used to address questions regarding the prevalence of Nash equilibria (Powers 1990; Stanford 1995, 1996, 1997; Cohen 1998; Stanford 1999; McLennan 2005; McLennan and Berg 2005; Takahashi 2008; Kultti et al. 2011; Daskalakis et al. 2011; Quattropani and Scarsini 2020), the prevalence of rationalizable strategies (Pei and Takahashi 2019), convergence to equilibrium (Pangallo et al. 2019; Amiet et al. 2021a, b; Wiese and Heinrich 2022), and the prevalence of dominance solvable games (Alon et al. 2021).Footnote 1 A guiding principle of the approach is that, since the property of interest (e.g. existence of Nash equilibrium, convergence to Nash equilibrium, dominance solvability) does not hold in all games, one can at least determine how likely the property is to hold in the class of all games. To do so, one defines a probability distribution over all games, and computes the probability that a game drawn randomly according to this distribution has the desired property.

The playing sequence—the order in which players update their actions—has an important role in our analysis. We largely focus on two specific playing sequences in this paper. At one extreme, we consider the random playing sequence, where players take turns to play one at a time and the next player to play is chosen uniformly at random from among all players. At the other extreme, we consider a natural deterministic counterpart to the random sequence, which we refer to as the clockwork playing sequence, where players take turns to play one at a time according to a fixed cyclic order. The best-response dynamic under the random playing sequence is widely studied. It is often of interest in population and evolutionary games (Sandholm 2010), and its properties have been analyzed in a variety of games with specific payoff structures.Footnote 2 The best-response dynamic under the clockwork playing sequence appears most frequently in the algorithmic game theory literature. Its properties have inter alia been studied in auctions (Nisan et al. 2011), job scheduling (Berger et al. 2011), network formation games (Chauhan et al. 2017), and it has been used for equilibrium selection in potential games (Boucher 2017). Using the random games approach, Durand and Gaujal (2016) show that, in expectation, convergence to equilibrium in potential games is faster under the clockwork playing sequence than under any other playing sequence.

Fig. 1
figure 1

Illustration of a 3-player game with 2 actions per player (left) and its associated best-response digraph (right). The axes shown in the center give us our coordinate system: player 1 selects rows (along the depth), player 2 selects columns (along the width), and player 3 selects levels (along height). In the left-hand panel, the payoffs of players 1, 2, and 3 are listed in that order. The unique pure Nash equilibrium at the profile (1, 2, 1) is a sink of the digraph and is underlined

The playing sequence is essentially irrelevant in determining whether the best-response dynamic converges to equilibrium in potential games—which is the focus of a large part of the literature—but it is a key determinant of the dynamic’s convergence properties in non-potential games. To see this, consider the 3-player game shown in the left-hand panel of Fig. 1 and its associated best-response digraph shown in the right-hand panel. Best-response digraphs are a commonly used reduced-form representation of a game in which the vertices are the action profiles and the directed edges correspond to the players’ best-responses (e.g. see Young 1998, Chapter 7, or Pangallo et al. 2019). It is easy to show that potential games have acyclic best-response digraphs, which implies that the playing sequence plays almost no role: as long as each player with a remaining payoff-improving action has a chance to play—which is the case for both the random and the clockwork playing sequences—the dynamic must eventually end at a sink of the digraph, i.e. at a Nash equilibrium of the game.Footnote 3 In contrast, in the non-potential game shown in Fig. 1, convergence is dependent on the playing sequence: with initial profile (1, 1, 1), the random sequence best-response dynamic must eventually converge to the Nash equilibrium, whereas the clockwork sequence best-response dynamic with cyclic player order 1-2-3-1-...will remain stuck cycling on the four profiles on the front face of the cube forever.

Since we are assessing the performance of the best-response dynamic over the class of all games (including non-potential games), it is necessary for us to be explicit about the details of the playing sequence. There are, of course, many possible playing sequences,Footnote 4 but our focus on random vs. clockwork suffices for our main finding: whether the best-response dynamic converges to equilibrium in a small or large fraction of all games depends on the playing sequence in an extreme way. Broadly, we show that under a clockwork playing sequence, the fraction of all n-player games in which the best-response dynamic converges to a pure Nash equilibrium goes to 0 as the number of players and/or actions gets large. By contrast, under a random playing sequence, the fraction of all n-player games with a pure Nash equilibrium in which the best-response dynamic converges to a pure Nash equilibrium goes to 1 as the number of players and/or actions gets large (when \(n>2\)).

That the best-response dynamic converges less often under a clockwork than under a random playing sequence is perhaps unsurprising since the clockwork sequence will have more difficulty escaping best-response cycles. We therefore expect the probability of convergence to equilibrium for the clockwork sequence to be less than it is for the random sequence. However, the resulting extreme jump in the asymptotic equilibrium convergence frequency from 1 to 0 is rather striking. Since most games have digraphs that contain cycles, our contribution can be seen as quantifying the fact that a clockwork playing sequence is very likely to become trapped in such cycles, whereas the random playing sequence is very likely to escape them.

We now provide a brief technical overview of our methods and results. To generate games at random, we follow the majority of papers in the ‘random games’ literature by drawing each player’s payoff at each action profile independently according to an arbitrary atomless distribution.Footnote 5 This induces a uniform distribution over best-response digraphs, and it is in this sense that we can claim convergence in a large or small fraction of all games. The probability of convergence to a pure Nash equilibrium can be reduced to working out the probability that the best-response path initiated at a random vertex hits a sink of the randomly drawn digraph.Footnote 6

In Sect. 3.1, we show that the probability that the clockwork best-response dynamic converges to a pure Nash equilibrium in a game with \(n>2\) players and \(m_i\ge 2\) actions per player i is, up to a polynomial factor, of order \(1/\sqrt{q_{n,\textbf{m}}}\), where \(q_{n,\textbf{m}}:= \frac{\prod _{i=1}^n m_i}{\max _i m_i}\) is the minimal number of strategic environments in the game (i.e. the minimal number of combinations of actions of all but one player). The proof relies on a coupling argument that makes it possible to deal with the path-dependence of the best-response dynamic. The result has two implications. (i) For large \(q_{n,\textbf{m}}\), the probability of convergence is determined by the value of a single parameter, namely, the minimal number of possible strategic environments, so all games with an identical minimal number of strategic environments have similar asymptotic probabilities of convergence to equilibrium. This is also reflected in our simulations even for small values of \(q_{n,\textbf{m}}\). (ii) When the number of players n and/or the number of actions per player is large for at least two players (implying \(q_{n,\textbf{m}} \rightarrow \infty\)), the probability that the clockwork best-response dynamic converges to a pure Nash equilibrium goes to zero. This is in stark contrast with the convergence properties of the random sequence best-response dynamic.

In Sect. 3.2, we provide more detailed theoretical results for games with \(n=2\) players. In particular, we provide results on game duration, and we derive an exact expression for the probability that the best-response dynamic converges to a (best-response) cycle of given length at a particular time. As a special case, we obtain the exact probability that the clockwork best-response dynamic converges to a pure Nash equilibrium in 2-player games with \(m_i\) actions per player. Unlike in games with \(n>2\) players in which the clockwork and random sequences behave very differently from each other, the probability of convergence to equilibrium is the same for the random and clockwork playing sequences in 2-player games. Furthermore, when \(m_1=m_2=m\), we show that this probability is asymptotically \(\sqrt{\pi / m}\) when m is large.

Section 4 present our simulation results. We investigate the extent to which our asymptotic analytical results also hold for small numbers of players and/or actions. Additionally, we investigate the behavior of playing sequences that interpolate between the extremes of clockwork and random playing sequences.

2 Best-response dynamics in games

2.1 Games

A game with \(n \ge 2\) players and \(m_i \ge 2\) actions per player i is a tuple

$$\begin{aligned} g_{n,\textbf{m}}:= ([n],\{[m_i]\}_{i \in [n]},\{u_i\}_{i \in [n]}), \end{aligned}$$

where \(\textbf{m}:=(m_1,\ldots ,m_n)\), \([n]:= \{1,\ldots ,n\}\) is the set of players, and each player \(i \in [n]\) has a set of actions \([m_i]:=\{1,\ldots ,m_i\}\) and a payoff function \(u_i: {\mathcal {M}} \rightarrow {\mathbb {R}}\), where \({\mathcal {M}}:=\times _{i \in [n]}[m_i]\).

An action profile is a vector of actions \(\textbf{a}=(a_1,\ldots ,a_n)\in {\mathcal {M}}\) that lists the action taken by each player. An environment for player i is a vector \(\textbf{a}_{-i} \in {\mathcal {M}}_{-i}:=\times _{j \in [n]{\setminus } \{i\}}[m_j]\) that lists the action taken by each player but i. A best-response correspondence \(b_i\) for player i is a mapping from the set of environments for player i to the set of all non-empty subsets of i’s actions and is defined by

$$\begin{aligned} b_i(\textbf{a}_{-i}):= \arg \max _{a_i \in [m_i]} u_i(a_i, \textbf{a}_{-i}). \end{aligned}$$

In the rest of this paper, we consider only games in which for each player i and environment \(\textbf{a}_{-i}\), the best-response action is unique. This is the case for games in which there are no ties in payoffs.Footnote 7

An action profile \(\textbf{a} \in {\mathcal {M}}\) is a pure Nash equilibrium (PNE) if for all \(i \in [n]\) and all \(a_i \in [m_i]\), \(u_i (\textbf{a}) \ge u_i(a_i,\textbf{a}_{-i})\). Equivalently, \(\textbf{a}\) is a PNE if each player \(i \in [n]\) is playing their (assumed unique) best-response action i.e. \(a_i = b_i(\textbf{a}_{-i})\). Denote the set of PNE of the game \(g_{n,\textbf{m}}\) by \(\text {PNE}(g_{n,\textbf{m}})\) and let \(\# \text {PNE}(g_{n,\textbf{m}})\) denote the cardinality of this set.

2.2 Best-response digraphs

The best-response structure of a game \(g_{n,\textbf{m}}\) can be represented by a best-response digraph \({\mathcal {D}}(g_{n,\textbf{m}})\) whose vertex set is the set of action profiles \({\mathcal {M}}\) and whose edges are constructed as follows: for each \(i \in [n]\) and each pair of distinct vertices \(\textbf{a}=(a_i,\textbf{a}_{-i})\) and \(\textbf{a}' = (a_i',\textbf{a}_{-i})\), place a directed edge from \(\textbf{a}\) to \(\textbf{a}'\) if and only if \(a_i'\) is player i’s best-response to environment \(\textbf{a}_{-i}\), i.e. \(a_i' = b_i( \textbf{a}_{-i} )\). There are edges only between action profiles that differ in exactly one coordinate. A profile \(\textbf{a}\) is a PNE of \(g_{n,\textbf{m}}\) if and only if it is a sink of the best-response digraph \({\mathcal {D}}(g_{n,\textbf{m}})\). It is easy to show that potential games have acyclic best-response digraphs.Footnote 8

2.3 Best-response dynamics

We now consider games played over time, with each player in turn myopically best-responding to their current environment.

A playing sequence function \(s: {\mathbb {N}} \rightarrow [n]\) determines whose turn it is to play at each time \(t \in {\mathbb {N}}\), where \({\mathbb {N}}\) denotes the set of positive integers.Footnote 9 We will be interested in two specific playing sequences. The clockwork playing sequence is defined by \(s_{\texttt{c}}(t):= 1 + (t-1) \bmod n\), so player 1 plays at time 1, followed by player 2, then 3, and so on until player n, and then the sequence returns to player 1, and so on. The random playing sequence \(s_\texttt {r}\) is determined as follows: for each \(t \in {\mathbb {N}}\), draw \(s_{\texttt{r}}(t)\) uniformly at random from [n]. So, at each time, the player playing at that time is drawn uniformly at random from among all players. It is easy to see that, starting from any initial profile, the random sequence best-response dynamic must eventually converge to the PNE of the game shown in Fig. 1, but it is by no means guaranteed to converge to a PNE in all games.Footnote 10 In Sects. 2 and 3 we restrict our attention to playing sequences \(s \in \{s_{\texttt {c}},s_{\texttt {r}}\}\).

A path \(\langle {\vec {\textbf{a}}} , s \rangle\) is an infinite sequence of action profiles \({\vec {\textbf{a}}} =(\textbf{a}^0, \textbf{a}^1,\ldots )\) and an associated playing sequence function \(s: {\mathbb {N}} \rightarrow [n]\) satisfying the constraint that only one player changes her action at a time, i.e. \(\textbf{a}_{-s(t)}^t = \textbf{a}_{-s(t)}^{t-1}\) for each \(t \in {\mathbb {N}}\). So only the action of player s(t) is allowed to differ between profiles \(\textbf{a}^{t-1}\) and \(\textbf{a}^t\) along a path.

The best-response dynamic with playing sequence \(s: {\mathbb {N}} \rightarrow [n]\) on a game \(g_{n,\textbf{m}}\) initiated at the action profile \(\textbf{a}^0\) is the following process: set the initial action profile to \(\textbf{a}^0\) and, at each time \(t \in {\mathbb {N}}\), player s(t) myopically plays her best-response \(a_i^{t} = b_i(\textbf{a}_{-i}^{t-1})\) to her current environment \(\textbf{a}_{-s(t)}^{t-1}\). The best-response dynamic effectively generates a path \(\langle {\vec {\textbf{a}}} , s \rangle\) by traveling along the edges of the best-response digraph \({\mathcal {D}}(g_{n,\textbf{m}})\) in direction s(t) at step t starting from the initial profile \(\textbf{a}^0\).Footnote 11

2.4 Convergence

For any path \(\langle {\vec {\textbf{a}}} , s \rangle\) and set of action profiles \({\mathcal {A}}\subseteq {\mathcal {M}}\) the hitting time \(H_{\langle {\vec {\textbf{a}}} , s \rangle }({\mathcal {A}}):=\inf \{t \in {\mathbb {N}}: \textbf{a}^t \in {\mathcal {A}}\}\) is the first time \(t\ge 1\) at which some element of the sequence \({\vec {\textbf{a}}}\) is in, or (first) hits, the set \({\mathcal {A}}\) (\(\inf\) is the infimum operator and we use the convention that \(\inf \emptyset = \infty\)).Footnote 12 We say that the s-sequence best-response dynamic on game \(g_{n,\textbf{m}}\) initiated at \(\textbf{a}^0\) converges to a PNE if its path \(\langle {\vec {\textbf{a}}} , s \rangle\) hits \(\text {PNE}(g_{n,\textbf{m}})\) in finite time. Clearly, if a path hits a PNE at some time t, it stays there forever after.

2.5 Best-response dynamics on random games

We generate random games by drawing all payoffs at random: for each \(\textbf{a} \in {\mathcal {M}}\) and \(i\in [n]\), the payoff \(U_i(\textbf{a})\) is a random number that is drawn from an atomless distribution \({\mathbb {P}}\). The draws are independent across all \(i\in [n]\) and \(\textbf{a} \in {\mathcal {M}}\). The distribution \({\mathbb {P}}\) ensures that any ties in payoffs have zero measure, so almost surely each environment has a unique best-response for each player. A random game drawn in this way is denoted by \(G_{n,\textbf{m}}:= ( [n], \{[m_i]\}_{i \in [n]}, \{U_i\}_{i \in [n]} )\).

The best-response dynamic on random games is described by Algorithm 1. We randomly draw a game and run the best-response dynamic on the drawn game, starting from a randomly drawn initial profile \(\textbf{A}^0\).Footnote 13 Doing so induces a distribution over paths and PNE sets.

figure a

The notion of convergence given in Sect. 2.4 applies here. Namely, the s-sequence best-response dynamic on game \(G_{n,\textbf{m}}\) (and initial condition \(\textbf{A}^0\)) converges to a PNE if its path \(\langle {\vec {\textbf{A}}} , s\rangle\) (generated according to Algorithm 1) hits \(\text {PNE}(G_{n,\textbf{m}})\) in finite time.

3 Theoretical results

In this section, we present the theoretical results for best-response dynamics in random games. In Sect. 3.1 we focus on games with \(n>2\) players. In this case, we find that best-response dynamics behave very differently under clockwork vs. random playing sequences. Most of our results on the probability of convergence to equilibrium are asymptotic. In Sect. 3.2 we focus on games with \(n=2\) players. In this case, the probability of convergence to equilibrium is the same under both clockwork and random playing sequences. Furthermore, we are able to provide asymptotic as well as exact results for game duration and for the probability of convergence to equilibrium.

The quantity

$$\begin{aligned} q_{n,\textbf{m}}:= \frac{\prod _{i \in [n]} m_i}{\max _{i \in [n]} m_i} \end{aligned}$$

is central to our results and it appears frequently in the literature on random games (for example, see Dresher 1970, or Rinott and Scarsini 2000). As summarized in the proposition below, the probability that there is a pure Nash equilibrium is asymptotically \(1 - \exp \{-1\} \approx 0.63\) as \(q_{n,\textbf{m}}\) gets large.

Proposition 1

(Rinott and Scarsini 2000)

$$\begin{aligned} \lim _{q_{n,\textbf{m}} \rightarrow \infty } \Pr \left[ \# \text{PNE }(G_{n,\textbf{m}}) \ge 1\right] = 1 - \exp \{-1\}. \end{aligned}$$

Since \(q_{n,\textbf{m}} \rightarrow \infty\) if and only if \(n \rightarrow \infty\) or \(m_i \rightarrow \infty\) for at least two players i, the probability that there is a PNE in a randomly drawn game approaches \(1 - \exp \{-1\}\) when the number of players gets large or when the number of actions per player gets large for at least two players.Footnote 14

3.1 Games with \(n>2\) players

The following result shows that, in large 2-action games, the random sequence best-response dynamic converges with high probability to a PNE if there is one. Let \(\textbf{2}\) denote a n-vector of 2s.

Proposition 2

(Amiet et al. 2021)

$$\begin{aligned} \lim _{n \rightarrow \infty } \Pr \left[ s_{\texttt{r}} \text{-best-response dynamic on } G_{n,\textbf{2}} \text{ converges to a PNE} \,|\, \# \text{PNE} (G_{n,\textbf{2}}) \ge 1\right] = 1. \end{aligned}$$

Combined with Proposition 1, it follows that over the class of all 2-action games, the random sequence best-response dynamic converges to a PNE with probability about \((1 - \exp \{-1\})\), i.e. in approximately \(63\%\) of those games, when the number of players is large.

A generalization of Proposition 2 to games with more than 2 actions per player is non-trivial. There are currently no existing analytical results for such cases, so this area remains open for future research. However, we conjecture that for \(n>2\), the random sequence best-response dynamic converges to a PNE with high probability if there is one as \(q_{n,\textbf{m}} \rightarrow \infty\). Consistent with this conjecture, in the simulations of Sect. 4 we show that, provided \(n >2\), the random sequence best-response dynamic does converge to a PNE with probability close to \(1-\exp \{-1\}\) when n gets large or when the number of actions gets large for at least two players.

Our main result for the clockwork sequence best-response dynamic in games with \(n>2\) players is given below.

Theorem 1

$$\begin{aligned} \frac{1}{4 \sqrt{n}} \frac{1}{\sqrt{ q_{n,\textbf{m}} }} \le \Pr \left[ \begin{array}{c} s_{\texttt{c}} \text{-best-response dynamic}\\ \text{on } G_{n,\textbf{m}} \text{ converges to a PNE} \end{array} \right] \le \frac{6 n \sqrt{\log (q_{n,\textbf{m}})} }{ \sqrt{ q_{n,\textbf{m}} } }. \end{aligned}$$

Consequently, since the upper and lower bound both go to zero as n gets large or when the number of actions gets large for at least two players,

$$\begin{aligned} \lim _{q_{n,\textbf{m}} \rightarrow \infty }\Pr \left[ s_{\texttt{c}}\text{-best-response dynamic on }G_{n,\textbf{m}} \text{ converges to a PNE} \right] =0. \end{aligned}$$

So, with high probability, the clockwork sequence best-response dynamic does not converge to a PNE as the number of players gets large or as the number of actions for at least two players gets large. This is in sharp contrast with the asymptotic behavior of the random sequence best-response dynamic. It is intuitive that the clockwork sequence converges to a PNE less often than the random sequence because it will have more difficulty escaping cycles in a best-response digraph. That said, the extreme swing in the asymptotic probability of convergence from 1 to 0 is rather striking.

We briefly comment on Theorem 1 and its implications. (i) In Algorithm 1, drawing payoffs independently at random (from an atomless distribution) induces a uniform distribution over best-response digraphs.Footnote 15 It is in this sense that we can say that the best-response dynamic converges in a “large” or “small” fraction of all games. (ii) Our proof of Theorem 1 relies on a coupling argument (explained in the appendix) that makes it possible to deal with the path-dependence of the best-response dynamic (which arises from the fact that if a player encounters an environment that they had seen before, they must play the same action that they played when the environment was first encountered). The proof centers on bounding the time it takes for some player to re-encounter a previously seen environment along a best-response path and this time is fundamentally determined by \(q_{n,\textbf{m}}\), which is the minimal number of possible environments. (iii) In fact, Theorem 1 gives us the following corollary, which shows that the asymptotic probability of convergence to equilibrium is determined primarily by the value of the parameter \(q_{n,\textbf{m}}\).Footnote 16

Corollary 1

The asymptotic probability that the clockwork sequence best-response dynamic converges to a PNE is, up to a polynomial factor, of order \(1/\sqrt{ q_{n,\textbf{m}} }\).

3.2 Games with \(n=2\) players

For \(n=2\) players, we provide detailed results on both game duration and on the probability of convergence to equilibrium.

If the path \(\langle {\vec {\textbf{a}}} , s_{\texttt {c}} \rangle\) generated by the clockwork best-response dynamic on a 2-player game \(g_{2,\textbf{m}}\) has the property that from t onwards, the sequence of 2k possibly non-distinct action profiles \(\textbf{a}^t,\ldots ,\textbf{a}^{t+2k-1}\) repeats itself forever and t is the hitting time to \(\textbf{a}^t\), then we say that the clockwork best-response dynamic converged to a cycle of length 2k, or a 2k-cycle, at time t, where \(k \in \{1,\ldots ,m_*\}\) and \(m_*:= \min \{m_1,m_2\}\).

Theorem 2

For any \(k \in \{1,\ldots ,m_*\}\) and \(t \in \{1,\ldots ,2(m_*-k+1)\}\),Footnote 17 the probability that the \(s_{\text{c}}\)-best-response dynamic on \(G_{2,\textbf{m}}\) converges to a \(2k\)-cycle at time t is given by

$$\frac{1}{m_{s_{\texttt{c}}(t+2k-1)}} \prod _{i=1}^{t+2k-2} \left( 1 - \frac{1}{{m_{s_{\texttt{c}}(i)}}} \left\lfloor \frac{i}{2}\right\rfloor \right) .$$
(1)

Thus we have an exact expression for the probability that the clockwork sequence best-response dynamic converges to a 2k-cycle at time t.Footnote 18 Setting \(k=1\) in (1) yields the exact probability that the clockwork sequence best-response dynamic on \(G_{2,\textbf{m}}\) converges to a PNE at time t.

As a straightforward corollary of Theorem 2, the probability that the clockwork sequence best-response dynamic converges to a 2k-cycle is obtained by summing (1) over all \(t \in \{1,\ldots ,2(m_*-k+1)\}\):

Corollary 2

The probability that the \(s_{\text{c}}\)-best-response dynamic on \(G_{2,\textbf{m}}\) converges to a \(2k\)-cycle is given by

$$\sum _{t=1}^{2(m_*-k+1)} \; \frac{1}{m_{s_{\texttt{c}}(t+2k-1)}} \prod _{i=1}^{t+2k-2} \left( 1 - \frac{1}{m_{s_{\texttt{c}}(i)}} \left\lfloor \frac{i}{2}\right\rfloor \right).$$
(2)

Setting \(k=1\) in (2) yields the exact probability that the clockwork sequence best-response dynamic on \(G_{2,\textbf{m}}\) converges to a PNE.

To get a better sense of the behavior of (2), we now study its asymptotics, which are easiest to see when \(m_1=m_2=m\). We maintain this restriction in the rest of this section. Let \(\Phi (\cdot )\) denote the standard normal cumulative distribution function:

$$\begin{aligned} \Phi (x):= \frac{1}{\sqrt{2\pi }} \int _{-\infty }^x \exp \left\{ - \frac{z^2}{2} \right\} dz. \end{aligned}$$

We say that f(n) is asymptotically g(n) if \(f(n)/g(n) \rightarrow 1\) as \(n \rightarrow \infty\), and \(f(n) = o( g(n))\) denotes \(f(n)/g(n) \rightarrow 0\) as \(n \rightarrow \infty\).

Proposition 3

Set \(m_1=m_2=m\). If \(k = o(m^{2/3})\) then, as \(m \rightarrow \infty\), (2) is asymptotically

$$\begin{aligned} 2 \sqrt{\frac{\pi }{m}} \left( 1 - \Phi \left( \frac{2k -1}{\sqrt{2m}} \right) \right) . \end{aligned}$$

If \(k = o(\sqrt{m})\) then, as \(m \rightarrow \infty\), (2) is asymptotically \(\sqrt{\pi /m}\).

The asymptotics given in Proposition 3 help us to better understand the behavior of the clockwork sequence best-response dynamic in large 2-player games. (i) The probability of convergence to a PNE, which corresponds to setting \(k =1\), goes to zero when \(m \rightarrow \infty\).Footnote 19 (ii) Short cycles all have about the same probability. Indeed, for \(k = o(\sqrt{m})\) the probability is asymptotically \(\sqrt{\pi /m}\). Finally, (iii) it is very unlikely that the best-response dynamic converges to a very long cycle: if \(k/\sqrt{m} \rightarrow \infty\) then the probability that the dynamic converges to a cycle of length at least 2k tends to 0.Footnote 20

Theorem 3

Set \(m_1=m_2=m\) and fix \(x>0\). The probability that the \(s_{\texttt{c}}\)-best-response dynamic on \(G_{2,\textbf{m}}\) does not hit a cycle (of any length) until at least time step \(x\sqrt{2m}\) is asymptotically \(\exp \{-x^2/2\}\) as \(m \rightarrow \infty\).

This result shows that the clockwork sequence best-response dynamic in 2-player games is likely to converge to a 2k-cycle (for some \(k \in \{1,\ldots ,m\}\)) within \(\sqrt{2m}\) time steps when m is large.

We now compare the behavior of the clockwork sequence best-response dynamic in 2-player games with the behavior of the random sequence best-response dynamic in 2-player games. (i) The probability of convergence to a PNE is the same for clockwork and for random playing sequences in 2-player games. The reason is that, under the random playing sequence, players’ actions do not change whenever the sequence asks the same player to play several times in a row. The profiles that are therefore visited along the path are the same under both playing sequences, which induces the same probability of convergence to equilibrium. However, (ii) the expected game duration will be different since the random playing sequence introduces delays. In fact, the expected game duration for the random playing sequence should be greater than for the clockwork playing sequence by a factor of 2. The reason is that, under the clockwork playing sequence, the players alternate at the tick of each time step, whereas, under the random playing sequence, the time it takes for the playing sequence to turn to the other player is Geometric(\(\frac{1}{2}\)). Thus the random playing sequence can be considered as a slowing down of the clockwork playing sequence in which the expected time to play the next step is 2.

4 Simulation results

In Sects. 4.1 and 4.2, we run simulations of the clockwork and random sequence best-response dynamics. Our main goal is to investigate the extent to which our asymptotic results are also valid for a small number of players and actions. In these simulations, for each choice of n and \(\textbf{m}\), we randomly draw 10 batches of 1000 games. We run the best-response dynamic on each game and find the mean frequency of convergence to equilibrium in each batch, and then report the mean across the batches. The error bars in our figures are intervals of one empirical standard deviation (across the means for each batch).

In Sect. 4.3 we investigate the equilibrium convergence probability of playing sequences that interpolate between the extremes of clockwork and random playing sequences, and pay particular attention to the speed of convergence.

4.1 Simulations of clockwork best-response dynamics

The blue markers in Fig. 2 show the frequency of convergence to a PNE in our simulations for different values of numbers of players and actions. In both panels, the solid black line is the analytical probability of convergence to a PNE in 2-player m-action games, calculated using Eq. (2) with \(m_1=m_2=m\).

In the top panel, we present simulation outcomes for 2, 3, and 4-player games in which all players have the same number of actions. Up to sampling noise, our analytical result for 2-player games perfectly matches the numerical simulations. We also find that convergence frequency becomes lower for a given number of actions as the number of players increases.

Fig. 2
figure 2

Frequency of convergence to a PNE for the clockwork best-response dynamic

Fig. 3
figure 3

Frequency of convergence to a PNE for clockwork vs. random best-response dynamics

The blue markers in the bottom panel of Fig. 2 are the simulation means for different values of n and \(\textbf{m}\), all chosen to ensure that the minimal number of environments in those games match the number of environments in a 2-player m-action game. All markers line up reasonably well along the solid black line. Corollary 1 implies that the asymptotic convergence probability in games \(G_{n,\textbf{m}}\) and \(G_{n',\textbf{m}'}\) is approximately the same whenever \(q_{n,\textbf{m}}=q_{n',\textbf{m}'}\). Our results show that this relation holds even for relatively small games.

4.2 Simulations of random best-response dynamics

Figure 3 shows the frequency of convergence to a PNE under clockwork vs. random best-response dynamics in n-player games with m actions per player.Footnote 21

As argued in Sect. 3.2, when there are only \(n=2\) players, the random playing sequence has the same convergence probability as the clockwork playing sequence, which can be seen in the left panel of Fig. 3.

Looking across the panels, the frequency of convergence to a PNE is decreasing in both n and m for the clockwork playing sequence, but the random playing sequence is different because its frequency of convergence rapidly settles near \(1-1/e\) for \(n>2\). Recall, Amiet et al. (2021) proved that the random sequence best-response dynamic always converges to a PNE if there is one when \(m=2\) and \(n\rightarrow \infty\). As argued in Sect. 3.1, this gives us an unconditional probability of convergence of \(1-1/e \approx 63\%\). Our simulations show that the result of Amiet et al. (2021) also appears to hold for games with more than two actions provided \(n>2\). In fact, the random sequence best-response dynamic almost always converges to a PNE in games that have a PNE even for relatively small values of n and m.

4.3 Simulations of periodic best-response dynamics

The analytical results of Sect. 3 allowed us to compare the behavior of two extreme playing sequences: clockwork and random. We now turn our attention to intermediate cases. A playing sequence is p-periodic if it consists of a sequence of players of length \(p \ge n\) that is repeated forever, with the constraint that each player appears at least once in the repeated sequence. In other words,

$$\begin{aligned} i_1, i_2, \ldots , i_p, \; i_1, i_2, \ldots , i_p, \; i_1, i_2, \ldots , i_p, \; \ldots \end{aligned}$$

is a p-periodic playing sequence if for each \(i \in [n]\) there is some \(j \in [p]\) such that \(i_j = i\).

We generate p-periodic playing sequences at random as follows: construct a sequence of \(p-n\) integers drawn at random from [n] and append the numbers \(1,\ldots ,n\) to this sequence. This results in a sequence of p integers. Now select a random permutation of this sequence and call it \(\sigma\). Then \(\sigma , \sigma , \sigma ,\ldots\) is a p-periodic playing sequence. Clearly, if \(p=n\), we recover clockwork playing sequences. And, fixing n, we recover random playing sequences for \(p \rightarrow \infty\).

Fig. 4
figure 4

Frequency of convergence to a PNE for p-periodic best-response dynamics in \(n=3\) player games

Fig. 5
figure 5

Conditional speed of convergence to a PNE for p-periodic best-response dynamics in \(n=3\) player games

Figure 4 plots the frequency of convergence to a pure Nash equilibrium against the length of the random subsequence (namely \(p-n\)) for p-periodic best-response dynamics in \(n=3\) player games with 2 and 3 actions per player. As might be expected, the probability of convergence to a Nash equilibrium for p-periodic playing sequences is increasing in p.

Figure 5 plots, for different lengths of the random subsequence, the distribution of the number of time steps until a pure Nash equilibrium is reached conditional on converging to a pure Nash equilibrium for p-periodic best-response dynamics in \(n=3\) player games with 2 and 3 actions per player. Interestingly, conditional on converging to a Nash equilibrium, the average number of time steps to reach equilibrium is increasing in p. This relates to the findings of Durand and Gaujal (2016) who showed that, in expectation, convergence to equilibrium in potential games is faster under the clockwork playing sequence than under any other playing sequence. Here, our results indicate that, over the space of all games, the speed of convergence (conditional on converging to equilibrium) is slower for playing sequences that have a larger share of random elements (i.e. a larger \(p-n\)). The apparent trade-off between the success in finding equilibria vs. the speed of convergence to equilibria is an interesting area for future research.