Article

Stochastic Simulations of Casual Groups

by
José F. Fontanari
Instituto de Física de São Carlos, Universidade de São Paulo, P.O. Box 369, São Carlos 13560-970, SP, Brazil
Mathematics 2023, 11(9), 2152; https://doi.org/10.3390/math11092152
Submission received: 3 April 2023 / Revised: 24 April 2023 / Accepted: 2 May 2023 / Published: 4 May 2023
(This article belongs to the Section Mathematical Biology)

Abstract

Free-forming or casual groups are groups in which individuals are in face-to-face interactions and are free to maintain or terminate contact with one another, such as clusters of people at a cocktail party, play groups in a children’s playground or shopping groups in a mall. Stochastic models of casual groups assume that group sizes are the products of natural processes by which groups acquire and lose members. The size distributions predicted by these models have been the object of controversy since their derivation in the 1960s because of the neglect of fluctuations around the mean values of random variables that characterize a collection of groups. Here, we check the validity of these mean-field approximations using an exact stochastic simulation algorithm to study the processes of the acquisition and loss of group members. In addition, we consider the situation where the appeal of a group of size $i$ to isolates is proportional to $i^\alpha$. We find that, for $\alpha \le 1$, the mean-field approximation fits the equilibrium simulation results very well, even for a relatively small population size $N$. However, for $\alpha > 1$, this approximation scheme fails to provide a coherent description of the distribution of group sizes. We find a discontinuous phase transition at $\alpha_c > 1$ that separates the regime where the variance of the group size does not depend on $N$ from the regime where it grows linearly with $N$. In the latter regime, the system is composed of a single large group that coexists with a large number of isolates. Hence, the same underlying acquisition-and-loss process can explain the existence of small, temporary casual groups and of large, stable social groups.

1. Introduction

Finding and explaining patterns in the ebb and flow of people in a public gathering have been challenging tasks for the mathematically inclined social scientist [1,2]. People constantly join and leave groups, so at any moment, the gathering appears as a collection of social clusters. The stochastic models proposed in the 1960s to explain the size distribution of these free-forming or casual groups ignore any prior knowledge of the individuals present in the gathering and use only a couple of parameters to represent people’s average tendencies to join and leave groups [3,4]. Here, we evaluate the suitability of these models to reproduce the empirical size distribution of casual groups.
In fact, the equilibrium group-size distributions resulting from these null models, viz., the zero-truncated Poisson distribution and the logarithmic distribution, well describe the observed size distributions of collections of small groups, such as pedestrians on a sidewalk, playgroups in a playground and shopping groups [3]. However, these distributions were derived using a mean-field approximation to solve equations for the expected number of groups of a given size, which prompted the criticism that they were not actual outcomes of the models but artifacts of the approximation scheme [4,5,6]. Here, we attempt to settle this long-standing (and likely forgotten) issue using Gillespie’s stochastic algorithm to exactly simulate the group dynamics [7,8].
In addition, we extend the null models proposed in the 1960s to consider the situation where the appeal of a group of size $i$ to isolates (i.e., individuals who are not members of any group) is proportional to $i^\alpha$, where $\alpha \in (-\infty, \infty)$ is the attractiveness exponent. Large negative values of $\alpha$ describe situations where a predominance of couples and isolates is expected, whereas large positive values of $\alpha$ foster the formation of a single large group that coexists with isolates. In particular, the truncated Poisson distribution is obtained for $\alpha = 0$ [3], and the logarithmic distribution is obtained for $\alpha = 1$ [4], which have been the only cases studied so far. The stochastic simulations of the group dynamics indicate that the mean-field approximation yields the exact results for $\alpha \le 1$ and a large population size $N$. However, the approximation fails for $\alpha > 1$ since it violates the fixed-population constraint.
The variation in the attractiveness exponent $\alpha$ allows the modeling of collections of small, temporary groups as well as of large, stable groups. We find that these two scenarios are separated by a discontinuous phase transition at $\alpha = \alpha_c > 1$. The probability of observing large groups vanishes exponentially with increasing group size $i$ for $\alpha < \alpha_c$. For $\alpha > \alpha_c$, the probability mass function concentrates around $i = 1$ and $i = N - n_1$, where $n_1$ is the number of isolates. Hence, the acquisition-and-loss process of group dynamics does not produce a power-law decay for the probability of finding large groups observed in face-to-face interaction networks [9]. Long-tailed group-size distributions are outputs of agent-based models where the individuals are ascribed distinct degrees of attractiveness [10,11], so it seems that some knowledge of the individuals present in the gathering is necessary to produce these power-law distributions. In fact, the natural tendency of people to gravitate toward others who share similar interests or backgrounds is an important factor in explaining the formation of social groups [12,13,14]. Nevertheless, it is noteworthy that the stochastic null models can explain the empirical size distribution of collections of small groups.
A more recent and fruitful approach to the characterization of social groups or, more precisely, social networks—networks of friends or other acquaintances—is based on the complex networks framework [15]. In particular, a group of individuals (nodes) with a high density of internal links but with a comparatively lower density of external links is called a community [16,17]. Communities are ubiquitous in social and biological systems and are believed to represent real social groupings assembled by interest or background. We note that, although there are a variety of public repositories of animal social networks (see, e.g., [18,19]), the detection of communities in the real world as well as in artificial networks is a challenging computational task [20,21]. From an evolutionary perspective, the community organization of social networks is considered optimal if it boosts communication and decision making at the group level while keeping a minimum number of connections between individuals [22,23]. However, social network communities are relatively stable groups and thus are not good models for fleeting casual groups, which can be described by the less popular face-to-face interaction networks [24].
The rest of this paper is organized as follows: In Section 2, we describe the group dynamics and derive the exact equations for the expected number of groups of a given size. In Section 3, we present a brief overview of Gillespie’s algorithm and study the group dynamics using this stochastic simulation algorithm. In Section 4, we solve the equations derived in Section 2 for the equilibrium regime using the mean-field approximation and present explicit analytical expressions for the cases $\alpha \to -\infty$, $\alpha = 0$ and $\alpha = 1$, which are then compared with the simulation results. In Section 5, we study the equilibrium regime for $\alpha > 1$ using stochastic simulations and show that in the limit $N \to \infty$, there is a discontinuous phase transition separating the scenarios where the variance of the group size is finite and where it diverges linearly with increasing $N$. Finally, in Section 6, we review our main results and present some concluding remarks.

2. The Model

We consider a fixed number of individuals $N$ that organize themselves into a variable number of groups of size $i = 1, \ldots, N$ in a closed system. We denote by $n_i$ the number of groups of size $i$ at time $t$. These random variables satisfy the constraint $\sum_{i=1}^{N} i \, n_i = N$ and determine the total number of groups in the system, viz., $\sum_{i=1}^{N} n_i = M$, which is also a random variable. The processes of joining and leaving the groups are as follows.
Each individual in a group of size $i > 1$ has a probability $\mu \delta t$ of leaving the group during the time interval $\delta t$. When an individual leaves a group, it becomes an isolate, i.e., a group of size $i = 1$. An isolate has a probability $\lambda \delta t$ of joining a group (including other isolates) in the time interval $\delta t$.
We assume that the attractiveness of a group of size $i$ to isolates is proportional to $i^\alpha$. The case $\alpha = 0$ describes the situation where isolates join any group in the system at the same rate, whatever its size [3]. For $\alpha > 0$, we have a contagious scenario that favors the formation of large groups [4], whereas, for $\alpha < 0$, we have an aversion scenario that disfavors the formation of groups. In the 1960s, approximate analytical expressions for the expected values of $n_i$ were derived for the cases $\alpha = 0$ and $\alpha = 1$ only [3,4].
The understanding of the processes by which groups acquire and lose individuals is facilitated if we write down the conditional expected values of the random variables $n_i(t + \delta t)$ given that the system is in the state $\mathbf{n}(t) = (n_1(t), \ldots, n_N(t))$ at time $t$. Let us begin with the conditional expectation of the number of isolates,
$$\mathbb{E}\left[ n_1(t + \delta t) \mid \mathbf{n}(t) \right] = n_1 + \mu \delta t \, (2 n_2) + \mu \delta t \, (N - n_1) - \lambda \delta t \, n_1 \frac{N_\alpha - n_1}{N_\alpha - 1} - 2 \lambda \delta t \, \frac{n_1 (n_1 - 1)}{N_\alpha - 1}, \tag{1}$$
where we have omitted the dependence of the variables $n_i(t)$ on $t$ that appear on the right-hand side (RHS) of the equation. In addition, we have introduced the notation
$$N_\alpha(t) = 1^\alpha n_1(t) + 2^\alpha n_2(t) + \cdots + N^\alpha n_N(t), \tag{2}$$
so that $N_0(t) = M(t)$ and $N_1(t) = N$. The third term on the RHS of Equation (1) takes into account the fact that any individual who is not isolated may become an isolate with a probability of $\mu \delta t$. The second term corrects the third term by reckoning with the fact that whenever an individual leaves a group of two individuals, two isolates are created. The fourth term on the RHS of Equation (1) accounts for the event that an isolate joins any group with $i > 1$ individuals, whereas the fifth term accounts for the aggregation of two isolates.
Next, we consider the conditional expectation of the number of couples, viz.,
$$\mathbb{E}\left[ n_2(t + \delta t) \mid \mathbf{n}(t) \right] = n_2 + 3 \mu \delta t \, n_3 - 2 \mu \delta t \, n_2 - \lambda \delta t \, n_1 \frac{2^\alpha n_2}{N_\alpha - 1} + \lambda \delta t \, \frac{n_1 (n_1 - 1)}{N_\alpha - 1}. \tag{3}$$
The second and third terms on the RHS of Equation (3) account for the facts that a group of two individuals is created when an individual leaves a group of three individuals, and it is destroyed when an individual leaves a group of two individuals. The fourth term accounts for the event that isolates attracted by groups of two individuals produce groups of three individuals, and the fifth term accounts for the joining of two isolates to produce a group of two individuals.
For groups with $i = 3, \ldots, N$ individuals, we can write the general expression
$$\mathbb{E}\left[ n_i(t + \delta t) \mid \mathbf{n}(t) \right] = n_i + (i + 1) \mu \delta t \, n_{i+1} - i \mu \delta t \, n_i - \lambda \delta t \, n_1 \frac{i^\alpha n_i}{N_\alpha - 1} + \lambda \delta t \, n_1 \frac{(i - 1)^\alpha n_{i-1}}{N_\alpha - 1}, \tag{4}$$
with $n_{N+1} \equiv 0$. The interpretation of the terms in this equation follows straightforwardly from the interpretations of the terms in Equations (1) and (3).
We stress that Equations (1), (3) and (4) for the conditional expectations of $n_i(t + \delta t)$, $i = 1, \ldots, N$, are exact. Adding these equations yields the conditional expectation for the number of groups,
$$\mathbb{E}\left[ M(t + \delta t) \mid \mathbf{n}(t) \right] = M(t) + \mu \delta t \, [N - n_1(t)] - \lambda \delta t \, n_1(t), \tag{5}$$
from which we can see that the isolates play a key role in driving the casual group dynamics.
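Averaging Equations (1), (3)–(5) over the states $\mathbf{n}(t)$ and taking the limit $\delta t \to 0$ yield
$$\frac{d \langle n_1 \rangle}{dt} = 2 \mu \langle n_2 \rangle + \mu (N - \langle n_1 \rangle) - \lambda \, \mathbb{E}\left[ \frac{n_1 (N_\alpha - n_1)}{N_\alpha - 1} \right] - 2 \lambda \, \mathbb{E}\left[ \frac{n_1 (n_1 - 1)}{N_\alpha - 1} \right], \tag{6}$$
$$\frac{d \langle n_2 \rangle}{dt} = 3 \mu \langle n_3 \rangle - 2 \mu \langle n_2 \rangle - 2^\alpha \lambda \, \mathbb{E}\left[ \frac{n_1 n_2}{N_\alpha - 1} \right] + \lambda \, \mathbb{E}\left[ \frac{n_1 (n_1 - 1)}{N_\alpha - 1} \right], \tag{7}$$
$$\frac{d \langle n_i \rangle}{dt} = (i + 1) \mu \langle n_{i+1} \rangle - i \mu \langle n_i \rangle - i^\alpha \lambda \, \mathbb{E}\left[ \frac{n_1 n_i}{N_\alpha - 1} \right] + (i - 1)^\alpha \lambda \, \mathbb{E}\left[ \frac{n_1 n_{i-1}}{N_\alpha - 1} \right] \tag{8}$$
for $i = 3, \ldots, N$, and
$$\frac{d \langle M \rangle}{dt} = \mu N - (\lambda + \mu) \langle n_1 \rangle, \tag{9}$$
where $n_{N+1} \equiv 0$, and we have introduced the notation $\langle n_i(t) \rangle \equiv \mathbb{E}[n_i(t)]$. Equation (9) follows directly from Equation (5), whose RHS is linear in $n_1$. We note that by rescaling the time $\tau = \mu t$, these equations depend on the aggregation and disaggregation rates only through their ratio,
$$\kappa \equiv \frac{\lambda}{\mu}. \tag{10}$$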
Of course, Equations (6)–(8) do not form a closed set of equations since there are quantities (e.g., $\mathbb{E}[n_1^2 / (N_\alpha - 1)]$) that are left undefined. Somewhat surprisingly, however, in the equilibrium regime, i.e., $d \langle n_i \rangle / dt = 0$ for $i = 1, \ldots, N$, Equation (9) yields the exact mean number of isolates,
$$\langle n_1 \rangle_{eq} = \frac{N}{1 + \kappa}, \tag{11}$$
which does not depend on the attractiveness exponent $\alpha$.
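The step from the balance of group creation and destruction to the equilibrium number of isolates is short enough to spell out. The lines below are a reader's sketch that starts from the averaged form of Equation (5); they are not part of the original derivation.

```latex
\begin{align}
\frac{d\langle M\rangle}{dt}
  &= \mu\,\bigl(N - \langle n_1\rangle\bigr) - \lambda\,\langle n_1\rangle = 0 \\
\Rightarrow\quad \mu N &= (\lambda + \mu)\,\langle n_1\rangle_{eq} \\
\Rightarrow\quad \langle n_1\rangle_{eq} &= \frac{\mu N}{\lambda + \mu} = \frac{N}{1 + \kappa}.
\end{align}
```

No closure assumption is needed here because the RHS of Equation (5) is linear in $n_1$. For instance, $\kappa = 1$ leaves half of the population as isolates on average.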

3. The Gillespie Algorithm

Here, we offer a brief overview of Gillespie’s algorithm for simulating continuous-time stochastic models [7,8]. In the time interval $\delta t$, the probability that aggregation occurs is $\lambda n_1(t) \delta t$, and the probability that an individual leaves a group is $\mu [N - n_1(t)] \delta t$. Since these two events decrease and increase the number of groups by one, respectively, their probabilities appear on the RHS of Equation (5). Given the state $\mathbf{n}(t)$ at time $t$, the probability that the next event will occur in the infinitesimal time interval $(t + t', t + t' + \delta t)$ is $P(t') \delta t$, where $P(t')$ is the exponential distribution,
$$P(t') = \upsilon \exp(-\upsilon t'), \tag{12}$$
and
$$\upsilon = \mu N + (\lambda - \mu) n_1(t) \tag{13}$$
is the total rate of events. The event that occurs in the time interval $(t + t', t + t' + \delta t)$ is an aggregation with a probability $\lambda n_1(t) / \upsilon$ and a disaggregation with a probability $\mu [N - n_1(t)] / \upsilon$. In the case that aggregation occurs, there are two possibilities: an isolate can join a group of size $i > 1$, which is an event that happens with a probability
$$\frac{i^\alpha n_i(t)}{n_1(t) - 1 + 2^\alpha n_2(t) + \cdots + N^\alpha n_N(t)}, \tag{14}$$
or two isolates can join together to form a couple, which has a probability
$$\frac{n_1(t) - 1}{n_1(t) - 1 + 2^\alpha n_2(t) + \cdots + N^\alpha n_N(t)}. \tag{15}$$
In the case that disaggregation occurs in the time interval $(t + t', t + t' + \delta t)$, a single individual leaves a group of size $i > 1$, which is an event that happens with a probability
$$\frac{i \, n_i(t)}{2 n_2(t) + \cdots + N n_N(t)} = \frac{i \, n_i(t)}{N - n_1(t)}. \tag{16}$$
In sum, given the state of the system $\mathbf{n}(t)$ at time $t$, the stochastic simulation of the casual-group model begins with the choice of the waiting time $t'$ until the next event using the distribution in (12), followed by the choice of the type of event: aggregation with a probability $\lambda n_1(t) / \upsilon$ and disaggregation with a probability $\mu [N - n_1(t)] / \upsilon$. Finally, the specific aggregation and disaggregation events are chosen with probabilities given by Equations (14)–(16). We note that, with the exception of the determination of the time of the next event, the aggregation and disaggregation rates always appear in the form of the ratio $\lambda / \mu$. This numerical algorithm produces the exact trajectories of the states $\mathbf{n}(t)$, which, when properly averaged over many independent runs, offers the only way to verify the validity of the approximation schemes used to solve high-dimensional master equations [7,8].
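The steps summarized above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the results reported here: the function name `simulate_casual_groups`, its arguments and the all-isolates initial condition are assumptions made for the example, and it presumes $N \ge 2$, $\kappa > 0$ and moderate values of $\alpha$ (so that the weights $i^\alpha$ do not underflow to zero).

```python
import random

def simulate_casual_groups(N, kappa, alpha, t_max, mu=1.0, seed=None):
    """One trajectory of the casual-group model up to time t_max.

    Returns a list n of length N + 1, where n[i] is the number of groups
    of size i at time t_max (n[0] is unused). Hypothetical helper.
    """
    if N < 2 or kappa <= 0 or mu <= 0:
        raise ValueError("this sketch assumes N >= 2, kappa > 0 and mu > 0")
    rng = random.Random(seed)
    lam = kappa * mu                     # aggregation rate (kappa = lambda / mu)
    n = [0] * (N + 1)
    n[1] = N                             # assumed start: everyone is an isolate
    t = 0.0
    while True:
        rate_join = lam * n[1]           # some isolate joins a group
        rate_leave = mu * (N - n[1])     # some group member leaves
        total = rate_join + rate_leave   # total event rate
        t += rng.expovariate(total)      # exponential waiting time
        if t > t_max:
            return n
        if rng.random() * total < rate_join:
            # Target of size i: weight n_1 - 1 for another isolate,
            # i^alpha * n_i for groups with i > 1.
            sizes = [i for i in range(1, N) if n[i] > 0]
            weights = [(n[1] - 1) if i == 1 else (i ** alpha) * n[i]
                       for i in sizes]
            i = rng.choices(sizes, weights=weights)[0]
            n[1] -= 1                    # the joining isolate
            n[i] -= 1                    # the target (an isolate if i == 1)
            n[i + 1] += 1
        else:
            # A member leaves a group of size i > 1, chosen with weight i * n_i.
            sizes = [i for i in range(2, N + 1) if n[i] > 0]
            weights = [i * n[i] for i in sizes]
            i = rng.choices(sizes, weights=weights)[0]
            n[i] -= 1
            n[i - 1] += 1                # what remains of the group
            n[1] += 1                    # the individual who left


if __name__ == "__main__":
    final = simulate_casual_groups(N=100, kappa=1.0, alpha=0.0, t_max=10.0, seed=1)
    print("isolates:", final[1], "groups:", sum(final))
```

Averaging `n[1]` over many seeds gives an estimate of the mean number of isolates that can be compared with the equilibrium value $N / (1 + \kappa)$. Each event costs $O(N)$ here because the weight lists are rebuilt every time; a faster implementation would update them incrementally.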
Here, we use the bracket notation $\langle \cdot \rangle$ to indicate the average over independent runs. In fact, because the number of runs is very large (typically $10^6$), we can safely equate the average over independent runs to the expected values of the random variables $n_1(t), \ldots, n_N(t)$, hence the choice of the same bracket notation used in Section 2.
Figure 1 shows the time evolution of the mean density of isolates, $\langle n_1(t) \rangle / N$, and the mean number of individuals per group $m_\alpha \equiv \langle N / M(t) \rangle$ for different values of the attractiveness exponent $\alpha$. At time $t = 0$, all $N$ individuals are isolates (i.e., $n_1(0) = N$ and $n_i(0) = 0$, $i > 1$). Interestingly, although the mean density of isolates does not depend on $\alpha$ in the equilibrium regime, as shown in Equation (11), the transient regime is affected by the attractiveness exponent: the increase in $\alpha$ favors the production of isolates. This is so because $\alpha > 0$ hinders the formation of couples, which requires the annihilation of two isolates, whereas the formation of groups of size $i > 2$ requires the annihilation of only one isolate. In addition, an increase in $\alpha$ increases the transient period, as well as the mean size of the groups.
Figure 2 shows the effect of the population size $N$ on the density of isolates and on the mean group size for $\alpha = 0$. The results are qualitatively similar for other values of the attractiveness exponent. We note the remarkable unresponsiveness of the density of isolates to changes in $N$. In fact, although the decrease in $N$ results in a reduction in the rate of events $\upsilon$, given by Equation (13), the effect of a single event on the density of isolates is enhanced. These two effects compensate for each other, resulting in a system size invariance of $\langle n_1(t) \rangle / N$. However, the mean group size decreases with increasing $N$ and converges very rapidly to the infinite system size limit. For $\alpha \le 1$, this limit is described very well by the analytical approximation for the equilibrium solutions of Equations (6)–(8) that we will derive in the next section.

4. Analytical Approximation for the Equilibrium Regime

Here, we derive the controversial approximate analytical results for the expectations $\mathbb{E}[n_i(t)] \equiv \langle n_i(t) \rangle$ that motivated the present contribution [4,5,6]. In the following, we will focus only on the equilibrium regime, i.e., $d \langle n_i \rangle / dt = 0$ for $i = 1, \ldots, N$. To solve the equilibrium equations for $\langle n_i \rangle_{eq}$ with $i > 1$, we make the assumption
$$\mathbb{E}\left[ f(n_1, \ldots, n_N) \right] = f\left( \langle n_1 \rangle_{eq}, \ldots, \langle n_N \rangle_{eq} \right), \tag{17}$$
where $f$ is an arbitrary rational function. Such a strong assumption is valid if the random variables $n_i$ are self-averaging, i.e., $n_i / N \to \langle n_i \rangle_{eq} / N$ for all $i$ in the limit $N \to \infty$. This neglect of fluctuations is the basis of the popular mean-field approximation of statistical physics [25]. With this assumption, we can easily write $\langle n_i \rangle_{eq}$ for $i > 1$ in terms of $\langle n_1 \rangle_{eq}$ and $\langle N_\alpha \rangle_{eq}$,
$$\langle n_i \rangle_{eq} = \frac{[(i - 1)!]^{\alpha - 1}}{i} \left( \frac{\kappa \langle n_1 \rangle_{eq}}{\langle N_\alpha \rangle_{eq} - 1} \right)^{i - 1} \left( \langle n_1 \rangle_{eq} - 1 \right), \tag{18}$$
where $\langle N_\alpha \rangle_{eq} = \sum_{i=1}^{N} i^\alpha \langle n_i \rangle_{eq}$, and $\langle n_1 \rangle_{eq}$ is given by Equation (11). These equations must be solved self-consistently: for an arbitrary value of $\langle N_\alpha \rangle_{eq}$, we calculate $\langle n_i \rangle_{eq}$ for $i > 1$, which we then use to update the estimate of $\langle N_\alpha \rangle_{eq}$. The process is repeated until convergence. At this point, we can already see that the assumption in (17) leads to nonphysical results for $\kappa > N - 1$, where $\langle n_1 \rangle_{eq} < 1$ (see Equation (11)), since it implies $\langle n_i \rangle_{eq} < 0$ for $i > 1$. This breakdown of the mean-field approximation is expected because a necessary condition for the self-averaging property to hold is that $\langle n_1 \rangle_{eq} \gg 1$ for large $N$.
In addition, and more importantly, Equation (18) holds for $\alpha \le 1$ only. Although for finite $N$, the self-consistent strategy yields a solution for any value of $\alpha$, in which $\langle N_\alpha \rangle_{eq}$ scales with $N^\alpha$, the solution does not satisfy the constraint $\sum_{i=1}^{N} i \langle n_i \rangle_{eq} = N$ for $\alpha > 1$. The reason is that Equation (18) yields a non-negligible value for $\langle n_N \rangle_{eq}$, resulting in a net flow of individuals to nonphysical group sizes and the consequent violation of the constant-population-size constraint. This effect is negligible for $\alpha \le 1$ because $\langle n_N \rangle_{eq}$ is vanishingly small (provided that $N$ is not too small), so the flow of individuals to nonphysical regions is inconsequential. We stress that the exact simulations of the group dynamics result in negligible values of $\langle n_N \rangle_{eq}$ for any $\alpha$, as we will show in Section 5.
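The self-consistent solution of the mean-field equations can be illustrated with a short Python sketch. It is an assumption-laden example rather than the procedure used for the figures: instead of the repeated substitution described above, it locates the self-consistent value of the attractiveness sum by bisection (a swapped-in technique, chosen here only for robustness), works with logarithms to avoid overflow, and uses hypothetical function names. It is intended for $\alpha \le 1$ and $\kappa < N - 1$.

```python
import math

def mean_field_counts(N, kappa, alpha, N_alpha):
    """Mean-field <n_i>_eq for i = 1..N, given a trial value of <N_alpha>_eq."""
    n1 = N / (1.0 + kappa)                          # equilibrium isolates
    log_x = math.log(kappa * n1 / (N_alpha - 1.0))
    counts = [n1]
    for i in range(2, N + 1):
        # log of [(i-1)!]^(alpha-1) / i * x^(i-1) * (n1 - 1)
        log_ni = ((alpha - 1.0) * math.lgamma(i) - math.log(i)
                  + (i - 1) * log_x + math.log(n1 - 1.0))
        counts.append(math.exp(min(log_ni, 700.0)))  # cap avoids overflow
    return counts                                    # counts[i - 1] = <n_i>_eq

def solve_mean_field(N, kappa, alpha, rel_tol=1e-12):
    """Return (counts, N_alpha) with N_alpha = sum_i i^alpha * counts[i - 1]."""
    n1 = N / (1.0 + kappa)
    if n1 <= 1.0:
        raise ValueError("needs <n_1>_eq > 1, i.e. kappa < N - 1")

    def mismatch(N_alpha):
        counts = mean_field_counts(N, kappa, alpha, N_alpha)
        total = sum(i ** alpha * c for i, c in enumerate(counts, start=1))
        return total - N_alpha

    # The mismatch decreases monotonically with N_alpha and is positive
    # at N_alpha = n1, so a sign change brackets the unique solution.
    lo, hi = n1, float(N)
    while mismatch(hi) > 0.0:
        hi *= 2.0
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    N_alpha = 0.5 * (lo + hi)
    return mean_field_counts(N, kappa, alpha, N_alpha), N_alpha


if __name__ == "__main__":
    counts, N_alpha = solve_mean_field(N=100, kappa=1.0, alpha=0.0)
    population = sum(i * c for i, c in enumerate(counts, start=1))
    print("N_alpha:", N_alpha, "sum of i * n_i:", population)
```

Printing the sum of $i \langle n_i \rangle_{eq}$ next to $N$ gives a direct check of the fixed-population constraint discussed above for a chosen value of $\alpha$.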
A relevant quantity that is usually observed in empirical investigations [1,2,3] is the mean fraction of groups of size $i = 1, \ldots, N$ at equilibrium, defined as
$$p_i = \left\langle \frac{n_i}{M} \right\rangle_{eq} \approx \frac{\langle n_i \rangle_{eq}}{\langle M \rangle_{eq}}, \tag{19}$$
where the approximation is justified by the assumption in (17). Of course, $p_i$ can be interpreted as the probability of observing a group of size $i$. It is interesting that empirical studies typically clump together groups of the same size that are observed on many different occasions (e.g., pedestrians on a sidewalk during Spring mornings in Eugene, Oregon [3]), so they report the total number of observations of groups of a given size. Summing over the different sizes yields the total number of groups observed. Hence, the ratio between the two averages $\langle n_i \rangle_{eq} / \langle M \rangle_{eq}$ is actually the correct measure to describe the empirical results. However, in stochastic simulations, we calculate the ratios $n_i / M$ for each run and then average the results over the many independent runs, so we measure $\langle n_i / M \rangle_{eq}$.
In the following, we present explicit analytical expressions of $p_i$ for $\alpha = 0$, $\alpha = 1$ and the limit $\alpha \to -\infty$. In addition, we present the numerical solution of Equation (18) obtained with the self-consistent method for general $\alpha \le 1$.

4.1. Case α = 0

In this case, $\langle N_\alpha \rangle_{eq} = \langle M \rangle_{eq}$, and Equation (18) reduces to
$$\langle n_i \rangle_{eq} = \frac{1}{i!} \left( \frac{\kappa \langle n_1 \rangle_{eq}}{\langle M \rangle_{eq} - 1} \right)^{i - 1} \left( \langle n_1 \rangle_{eq} - 1 \right) \tag{20}$$
for $i > 1$, with
$$\langle M \rangle_{eq} = \sum_{i=1}^{N} \langle n_i \rangle_{eq} \approx 1 + \frac{1}{\kappa} \left( \langle M \rangle_{eq} - 1 \right) \left( 1 - \frac{1}{\langle n_1 \rangle_{eq}} \right) \left[ \exp\left( \frac{\kappa \langle n_1 \rangle_{eq}}{\langle M \rangle_{eq} - 1} \right) - 1 \right], \tag{21}$$
from which we obtain an explicit expression for the mean number of groups in the equilibrium regime,
$$\langle M \rangle_{eq} \approx \frac{\kappa}{1 + \kappa} \frac{N}{\ln(1 + \kappa)}. \tag{22}$$
In deriving Equation (21), we have assumed that $N \to \infty$ in order to carry out the sum over the group sizes, whereas, in deriving Equation (22), we have assumed that $\langle n_1 \rangle_{eq}$ is on the order of $N$, which means that $\kappa \ll N$. As already pointed out, these are the necessary conditions for the validity of the self-averaging property that underlies the mean-field approximation.
Hence,
$$p_i = \frac{1}{e^a - 1} \frac{a^i}{i!}, \tag{23}$$
where $a = \ln(1 + \kappa)$, which we identify as the zero-truncated Poisson distribution. Interestingly, this distribution fits a wide variety of data of small groups [3]. The mean and the variance of the group size are
$$m_0 = \sum_{i=1}^{N} i \, p_i = (1 + 1/\kappa) \ln(1 + \kappa) \tag{24}$$
and
$$\sigma_0^2 = (1 + 1/\kappa) \left[ \ln(1 + \kappa) + \ln^2(1 + \kappa) \right] - (1 + 1/\kappa)^2 \ln^2(1 + \kappa). \tag{25}$$
Figure 3 exhibits the comparison between the stochastic simulations and the truncated Poisson distribution (23) for $N = 10$ and $N = 100$. As expected, the mean-field approximation fails to describe the size distribution for $N = 10$, but for $N = 100$, its predictions are indistinguishable from the simulation results. This finding validates the use of that approximation, provided that the number of individuals is not too small.

4.2. Case α = 1

In this case, $\langle N_\alpha \rangle_{eq} = N$, and Equation (18) reduces to
$$\langle n_i \rangle_{eq} = \frac{1}{i} \left( \frac{\kappa \langle n_1 \rangle_{eq}}{N - 1} \right)^{i - 1} \left( \langle n_1 \rangle_{eq} - 1 \right) \tag{26}$$
with $\langle n_1 \rangle_{eq}$ given in Equation (11). As before, assuming that $N \gg 1$ and $\langle n_1 \rangle_{eq} \gg 1$, we obtain an explicit expression for the mean number of groups:
$$\langle M \rangle_{eq} \approx N \frac{1}{\kappa} \ln(1 + \kappa). \tag{27}$$
Thus, the fraction $p_i$ of groups of size $i > 0$ is
$$p_i = \frac{1}{\ln(1 + \kappa)} \frac{1}{i} \left( \frac{\kappa}{1 + \kappa} \right)^i, \tag{28}$$
which we identify as the logarithmic distribution used to model relative species abundance [26]. Hence, the mean group size is
$$m_1 = \frac{\kappa}{\ln(1 + \kappa)} \tag{29}$$
and the variance of the group size is
$$\sigma_1^2 = \frac{-\kappa^2 + \kappa (1 + \kappa) \ln(1 + \kappa)}{\ln^2(1 + \kappa)}. \tag{30}$$
Figure 4 exhibits the comparison between the stochastic simulations and the logarithmic distribution (28) for $N = 10$ and $N = 100$. As before, the results validate the use of the mean-field approximation if the number of individuals is not too small.
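The closed-form means and variances of Sections 4.1 and 4.2 can be checked numerically against direct sums over the two distributions. The Python sketch below does this for one value of $\kappa$; the function names and the truncation `i_max` are choices made for this illustration, and the truncation is adequate only when the tails have decayed by `i_max` (which holds comfortably for the value used here).

```python
import math

def truncated_poisson_pmf(kappa, i_max=400):
    """Zero-truncated Poisson with a = ln(1 + kappa): [p_1, ..., p_{i_max}]."""
    a = math.log1p(kappa)
    pmf, term = [], a                    # term = a^i / i!, starting at i = 1
    for i in range(1, i_max + 1):
        pmf.append(term / kappa)         # uses e^a - 1 = kappa
        term *= a / (i + 1)
    return pmf

def logarithmic_pmf(kappa, i_max=400):
    """Logarithmic distribution with ratio kappa / (1 + kappa)."""
    q = kappa / (1.0 + kappa)
    norm = math.log1p(kappa)
    return [q ** i / (i * norm) for i in range(1, i_max + 1)]

def mean_and_variance(pmf):
    """Mean and variance of a pmf listed for sizes 1, 2, 3, ..."""
    mean = sum(i * p for i, p in enumerate(pmf, start=1))
    second = sum(i * i * p for i, p in enumerate(pmf, start=1))
    return mean, second - mean ** 2


if __name__ == "__main__":
    kappa = 2.0
    L = math.log1p(kappa)
    c = 1.0 + 1.0 / kappa
    m0, v0 = mean_and_variance(truncated_poisson_pmf(kappa))
    m1, v1 = mean_and_variance(logarithmic_pmf(kappa))
    # Direct sums next to the closed-form expressions for alpha = 0 ...
    print(m0, c * L)
    print(v0, c * (L + L * L) - c * c * L * L)
    # ... and for alpha = 1.
    print(m1, kappa / L)
    print(v1, (kappa * (1.0 + kappa) * L - kappa ** 2) / (L * L))
```

Each printed pair should agree to within floating-point rounding if the closed-form expressions and the distributions are consistent with one another.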

4.3. The Limit $\alpha \to -\infty$

In the limit $\alpha \to -\infty$, we have $\langle N_{-\infty} \rangle_{eq} = \langle n_1 \rangle_{eq}$, $\langle n_2 \rangle_{eq} = \kappa \langle n_1 \rangle_{eq} / 2$ and $\langle n_i \rangle_{eq} = 0$ for $i > 2$. Hence, the fraction of groups of size $i$ is
$$p_1 = \frac{1}{1 + \kappa / 2}, \tag{31}$$
$$p_2 = \frac{\kappa / 2}{1 + \kappa / 2}, \tag{32}$$
and $p_i = 0$ for $i > 2$. Figure 5 shows the simulation results for $N = 10$ and $N = 100$. We have verified that $p_i = 0$ for $i > 2$ in the simulations. As before, although the mean-field approximation fails for $N = 10$, it yields the exact result for $N = 100$.
The mean group size is
$$m_{-\infty} = \frac{1 + \kappa}{1 + \kappa / 2} \tag{33}$$
and the variance of the group size is
$$\sigma_{-\infty}^2 = \frac{\kappa / 2}{(1 + \kappa / 2)^2}. \tag{34}$$
The limit $\alpha \to -\infty$ is instructive because we can solve Equations (6)–(8) exactly for any value of $N > 1$ using the fact that $i^\alpha \to 0$ for $i > 1$. We find
$$\langle n_1(t) \rangle = \langle n_1(0) \rangle e^{-2 (1 + \kappa) \tau} + \frac{N}{1 + \kappa} \left[ 1 - e^{-2 (1 + \kappa) \tau} \right], \tag{35}$$
where $\tau = \mu t$. From this equation, we can easily obtain $\langle n_2(t) \rangle = (N - \langle n_1(t) \rangle) / 2$ and $\langle M(t) \rangle = (N + \langle n_1(t) \rangle) / 2$. The reason that Equations (31) and (32) are only approximate and thus fail to fit the data for $N = 10$ in Figure 5 is that, although we can calculate $\langle n_1 \rangle$ and $\langle M \rangle$ exactly for all $N > 1$, we do not know how to calculate $p_1 = \langle n_1 / M \rangle$, which is the quantity measured in the simulations.
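The time-dependent solution for the mean number of isolates can be recovered in three lines. The sketch below is a reconstruction from the equations of Section 2, assuming that only isolates and couples are present, so that $N_\alpha = n_1$ and $2 n_2 = N - n_1$, and that the ratio $n_1 (n_1 - 1) / (N_\alpha - 1)$ can be replaced by $n_1$.

```latex
\begin{align}
\frac{d\langle n_1\rangle}{dt}
  &= 2\mu\langle n_2\rangle + \mu\,\bigl(N - \langle n_1\rangle\bigr) - 2\lambda\langle n_1\rangle
   = 2\mu N - 2(\lambda + \mu)\langle n_1\rangle, \\
\frac{d\langle n_1\rangle}{d\tau}
  &= 2N - 2(1 + \kappa)\langle n_1\rangle, \qquad \tau = \mu t, \\
\langle n_1(\tau)\rangle
  &= \langle n_1(0)\rangle\, e^{-2(1+\kappa)\tau}
   + \frac{N}{1+\kappa}\left[1 - e^{-2(1+\kappa)\tau}\right].
\end{align}
```

Because the first line is linear in $\langle n_1 \rangle$, no closure assumption enters, and the relaxation time toward $N / (1 + \kappa)$ is $1 / [2 (1 + \kappa)]$ in units of $1 / \mu$.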

4.4. General $\alpha \le 1$

Except for the three cases discussed before, it is not possible to obtain explicit analytical expressions for $\langle n_i \rangle_{eq}$ because we cannot carry out the summation necessary to compute $\langle N_\alpha \rangle_{eq}$ in a closed form. However, the use of the self-consistent method allows us to easily obtain these quantities numerically. Since we have already established that the mean-field approximation is very accurate, even for $N = 100$, in Figure 6, we present only the approximate theoretical results for $m_\alpha$ and $\sigma_\alpha^2$.
As expected, decreasing the value of the attractiveness exponent decreases the mean and the variance of the group sizes. We note that these quantities have explicit analytical expressions in the cases $\alpha = 0$ (Equations (24) and (25)), $\alpha = 1$ (Equations (29) and (30)) and $\alpha \to -\infty$ (Equations (33) and (34)).

5. Equilibrium Regime for α > 1

This is by far the most interesting situation because of the complete failure of the mean-field approximation. As already pointed out, the reason is that the solution of Equation (18) violates the fixed-population constraint for $\alpha > 1$. However, we can still obtain some useful analytical information by considering the limit of very large $\alpha$. In this limit, the system is composed of $\langle n_1 \rangle_{eq}$ isolates and a single group of $N - \langle n_1 \rangle_{eq}$ individuals on average. Hence, the mean number of groups is $\langle M \rangle_{eq} = \langle n_1 \rangle_{eq} + 1$, and we have
$$m_\infty \approx \frac{\langle n_1 \rangle_{eq} + (N - \langle n_1 \rangle_{eq})}{\langle n_1 \rangle_{eq} + 1} \approx 1 + \kappa \tag{36}$$
and
$$\sigma_\infty^2 \approx \frac{\langle n_1 \rangle_{eq} + (N - \langle n_1 \rangle_{eq})^2}{\langle n_1 \rangle_{eq} + 1} - m_\infty^2 \approx N \frac{\kappa^2}{1 + \kappa}, \tag{37}$$
so that the mean group size is finite, but the variance diverges in the limit $N \to \infty$. The mean size of the large group is
$$N - \langle n_1 \rangle_{eq} = N \frac{\kappa}{1 + \kappa}. \tag{38}$$
Interestingly, Figure 7 shows that this scenario (a single large group coexisting with isolates) describes the case $\alpha = 2$ very well for large $N$. Figure 8 corroborates this finding by showing that $p_i$ tends to a bimodal distribution characterized by sharp peaks at $i = 1$ and $i = N - n_1$ in the limit of large $N$. In addition, this figure shows that $p_N$ (or $\langle n_N \rangle_{eq}$) is negligibly small for $\alpha > 1$, in disagreement with the prediction (17) of the mean-field approximation.
In order to better understand the transition between the equilibrium regime characterized by a finite variance $\sigma_\alpha^2$ and the regime where $\sigma_\alpha^2$ diverges linearly with $N$ as $N \to \infty$, in Figure 9, we show the influence of the attractiveness exponent $\alpha$ on $m_\alpha$ and $\sigma_\alpha^2 / N$. The results indicate the existence of a discontinuous transition between these two regimes that takes place at a critical value $\alpha_c = \alpha_c(\kappa) > 1$. In addition, $\lim_{\alpha \to \alpha_c^+} m_\alpha < m_\infty$, so the regime of infinite variance but finite $\alpha$ is not perfectly described by the $\alpha \to \infty$ scenario. In particular, for $\kappa = 2$, we find $\alpha_c \approx 1.105$. This estimate was obtained by considering population sizes up to $N = 12800$ and noticing that, for $\alpha = 1.10$, the variance $\sigma_\alpha^2$ tends to a fixed value, whereas, for $\alpha = 1.11$, it increases with $N$. As $\kappa$ increases, we find that $\alpha_c \to 1$. We note that from the statistical physics perspective, the scaled variance $\sigma_\alpha^2 / N$ is the order parameter of the casual-group model since it is zero for $\alpha < \alpha_c$ and nonzero otherwise.
Somewhat disappointingly, the acquisition-and-loss process underlying our model does not produce a power-law decay for the probability of finding large groups. In fact, Figure 10 shows that in the vicinity of the transition point $\alpha_c$, where a scale-free behavior is more likely to be observed [27], the probability of observing large groups vanishes exponentially with increasing group size for $\alpha < \alpha_c$ or exhibits two peaks for $\alpha > \alpha_c$. It remains a challenge to find a simple acquisition-and-loss process that leads to a power-law distribution of group sizes as observed in face-to-face interaction networks [10,11].

6. Discussion

The distribution of sizes is likely the simplest quantitative information we can derive from the observation of freely forming groups. Of course, if the interrelations of the people present were known a priori, we could most certainly predict the formation and composition of some groups. However, here, we follow an alternative and more fruitful approach that ignores any prior knowledge of the individuals present and attempts to explain the observations using stochastic models characterized by a few parameters that represent people’s average tendencies to join and leave a group of a certain size [3,4]. Because the models considered here do not take into account individual idiosyncrasies, we refer to them as null models.
The null models assume that the total number of individuals N is fixed, i.e., that the system is closed, but this assumption is rarely satisfied in field studies of casual groups. For instance, the number of pedestrians on a sidewalk observed on distinct days varies greatly, and it is likely bounded by the city population. However, by taking the limit N , the fixed-population constraint becomes inconsequential. Of course, when considering this limit, we must focus only on the ratios of the number of groups, as applied in Equation (19). In fact, this was the approach used in the pioneer paper that introduced the mathematical modeling of casual groups [5]. In any event, our results indicate that even for N = 100 , the group-size distribution is practically indistinguishable from the distribution derived in the infinite population limit. Regarding the connection between the field studies and the mathematical models, it is assumed that the acquisition-and-loss process takes place at the time of the formation of the groups and that the system is at equilibrium at the moment of the observation. In addition, it is assumed that the observation happens on a time scale that is much faster than the acquisition-and-loss process, so the groups maintain their sizes during the period of observation [3].
Here, we extend previous models of casual groups by assuming that the appeal of a group of size i to isolates is proportional to i α , where α ( , ) is the attractiveness exponent. The control of the appeal of groups of different sizes to isolates, which is obtained by tuning the exponent α , allows us to consider some interesting scenarios. For instance, large negative values of α could describe people on the sidewalk walking to a dance party, where a predominance of couples is expected, which cannot be explained by the truncated Poisson distribution (see [28] for an alternative, more complex model of couples and isolates). Large positive values of α result in a bimodal distribution of group sizes, corresponding to a scenario where a single large group coexists with a number of isolates. Interestingly, in both cases, the proportion between the number of individuals in the large group or the number of individuals forming couples and the number of isolates is given by the ratio κ between the rates of aggregation and disaggregation.
Our main result is that the mean-field approximation used to derive the distribution of group sizes in the case that the attractiveness of a group does not depend on its size (i.e., α = 0 ) and in the case that it increases linearly with the group size (i.e., α = 1 ) actually yields the exact result for N . This conclusion, which is drawn from the agreement between the exact stochastic simulations of the group dynamics and the mean-field results, dismisses the suspicion of the inadequacy of the mean-field approximation to describe the equilibrium size distribution of casual groups [6]. (Of course, neither Gillespie’s algorithm [7,8] nor the computational resources to implement it were available in the 1960s to settle this issue.) In fact, for α 1 , the mean-field approximation yields very good predictions, even for a relatively small population size (e.g., N = 100 ). However, the approximation fails spectacularly for α > 1 , since it violates the fixed-population constraint. In this case, Gillespie’s stochastic simulation algorithm emerges as the only resource to study the dynamics of casual groups.
In addition, we find that varying the attractiveness exponent $\alpha$ produces scenarios where the group sizes are typically small, which is the situation addressed in the literature on casual groups [2], and scenarios where most of the population is confined to a single group. The latter scenario corresponds to the large and stable groups formed by gregarious animals, whose sizes are determined by a variety of selective pressures, such as defense against predation and foraging success [29]; the cognitive load that constrains the number of individuals with whom it is possible to maintain stable relationships [30]; competence in problem solving [31]; and individual distance preservation [32]. Remarkably, in our model, these distinct scenarios are separated by a discontinuous phase transition that takes place at $\alpha = \alpha_c > 1$, indicating that both types of aggregation behavior can be explained by the same underlying acquisition-and-loss process.
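The linear growth of the variance with the population size in the single-large-group regime can be checked with a short calculation. The sketch below assumes an idealized configuration of exactly one large group plus isolates, with the large group holding κ times as many individuals as there are isolates, and it takes the mean and variance over groups (population variance). The function names are hypothetical.

```python
def single_large_group(N, kappa):
    """Idealized configuration: N / (1 + kappa) isolates plus one large group."""
    isolates = round(N / (1 + kappa))
    return [1] * isolates + [N - isolates]


def group_size_moments(sizes):
    """Mean and (population) variance of the group size, taken over groups."""
    groups = len(sizes)
    mean = sum(sizes) / groups
    variance = sum(s * s for s in sizes) / groups - mean ** 2
    return mean, variance


for N in (300, 3000, 30000):
    mean, variance = group_size_moments(single_large_group(N, kappa=2))
    print(N, round(mean, 3), round(variance / N, 3))
```

For κ = 2, the mean approaches 1 + κ = 3 and the scaled variance approaches κ²/(1 + κ) = 4/3 as N grows, which matches the limiting values quoted in the caption of Figure 9.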
The biological and sociological implication of the success of the null models to produce the empirical distribution of sizes for small groups (i.e., the truncated Poisson distribution) is that prior knowledge of the individuals present in the gathering, as well as individual idiosyncrasies, is not necessary to explain the size distribution of casual groups.
There are at least two research avenues to pursue in order to further improve our understanding of the fleeting clusters of people observed in public gatherings. First, different forms of group attractiveness to isolates can be explored so as to fit the data available from the SocioPatterns collaboration [24], which suggests that the group-size distribution decays as a power law for large group sizes [9,10,11]. Second, the individual-based model that reproduces the SocioPatterns collaboration data [10,11] can be used to fit the small groups’ data available in the seminal works on casual groups [2,3]. If any of these pursuits is successful, one would be able to reproduce all available data on casual groups with a single model.

Funding

This research was funded by Fundação de Amparo à Pesquisa do Estado de São Paulo, grant number 2020/03041-3, and Conselho Nacional de Desenvolvimento Científico e Tecnológico, grant number 305620/2021-5.

Data Availability Statement

Not applicable.

Conflicts of Interest

The author declares no conflict of interest. The funders had no role in the design of the study; in the collection, analyses or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Coleman, J.S. Introduction to Mathematical Sociology; Free Press Glencoe: London, UK, 1964.
  2. Cohen, J.E. Casual Groups of Monkeys and Men; Harvard University Press: Cambridge, MA, USA, 1971.
  3. Coleman, J.S.; James, J. The Equilibrium Size Distribution of Freely-forming Groups. Sociometry 1961, 24, 36–45.
  4. White, H. Chance Models of Systems of Casual Groups. Sociometry 1962, 25, 153–172.
  5. Coleman, J.S. Comment on Harrison White, “Chance Models of Systems of Casual Groups”. Sociometry 1962, 25, 172–176.
  6. Goodman, L.A. Mathematical Methods for the Study of Systems of Groups. Am. J. Sociol. 1964, 70, 170–192.
  7. Gillespie, D.T. A General Method for Numerically Simulating the Stochastic Time Evolution of Coupled Chemical Reactions. J. Comp. Phys. 1976, 22, 403–434.
  8. Gillespie, D.T. Exact Stochastic Simulation of Coupled Chemical Reactions. J. Phys. Chem. 1977, 81, 2340–2361.
  9. Cattuto, C.; Van den Broeck, W.; Barrat, A.; Colizza, V.; Pinton, J.-F.; Vespignani, A. Dynamics of person-to-person interactions from distributed RFID sensor networks. PLoS ONE 2010, 5, e11596.
  10. Starnini, M.; Baronchelli, A.; Pastor-Satorras, R. Modeling Human Dynamics of Face-to-Face Interaction Networks. Phys. Rev. Lett. 2013, 110, 168701.
  11. Starnini, M.; Baronchelli, A.; Pastor-Satorras, R. Model reproduces individual, group and collective dynamics of human contact networks. Soc. Netw. 2016, 47, 130–137.
  12. Lazarsfeld, P.; Berelson, B.; Gaudet, H. The People’s Choice; Columbia University Press: New York, NY, USA, 1948.
  13. Axelrod, R. The Dissemination of Culture: A Model with Local Convergence and Global Polarization. J. Confl. Res. 1997, 41, 203–226.
  14. Reia, S.M.; Gomes, P.F.; Fontanari, J.F. Comfort-driven mobility produces spatial fragmentation in Axelrod’s model. J. Stat. Mech. 2020, 2020, 033402.
  15. Albert, R.; Barabási, A.-L. Statistical mechanics of complex networks. Rev. Mod. Phys. 2002, 74, 47–97.
  16. Girvan, M.; Newman, M.E.J. Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 2002, 99, 7821–7826.
  17. Cherifi, H.; Palla, G.; Szymanski, B.K.; Lu, X. On community structure in complex networks: Challenges and opportunities. Appl. Netw. Sci. 2019, 4, 117.
  18. The Network Data Repository with Interactive Graph Analytics and Visualization. Available online: https://networkrepository.com (accessed on 23 April 2023).
  19. Sah, P.; Méndez, J.D.; Bansal, S. A multi-species repository of social networks. Sci. Data 2019, 6, 44.
  20. Lancichinetti, A.; Fortunato, S. Community detection algorithms: A comparative analysis. Phys. Rev. E 2009, 80, 056117.
  21. Yang, Z.; Algesheimer, R.; Tessone, C. A Comparative Analysis of Community Detection Algorithms on Artificial Networks. Sci. Rep. 2016, 6, 30750.
  22. Pasquaretta, C.; Levé, M.; Claidière, N.; van de Waal, E.; Whiten, A.; MacIntosh, A.J.J.; Pelé, M.; Bergstrom, M.L.; Borgeaud, C.; Brosnan, S.F.; et al. Social networks in primates: Smart and tolerant species have more efficient networks. Sci. Rep. 2014, 4, 7600.
  23. Fontanari, J.F.; Rodrigues, F.A. Influence of network topology on cooperative problem-solving systems. Theory Biosci. 2016, 135, 101–110.
  24. SocioPatterns. Available online: http://www.sociopatterns.org (accessed on 23 April 2023).
  25. Huang, K. Statistical Mechanics; John Wiley & Sons: New York, NY, USA, 1963.
  26. Fisher, R.A.; Corbet, A.S.; Williams, C.B. The Relation Between the Number of Species and the Number of Individuals in a Random Sample of an Animal Population. J. Anim. Ecol. 1943, 12, 42–58.
  27. West, G. Scale: The Universal Laws of Life, Growth, and Death in Organisms, Cities, and Companies; Penguin Press: New York, NY, USA, 2017.
  28. Fontanari, J.F. A stochastic model for the influence of social distancing on loneliness. Phys. A 2021, 584, 126367.
  29. Wilson, E. Sociobiology; Harvard University Press: Cambridge, MA, USA, 1975.
  30. Dunbar, R.I.M. Neocortex size as a constraint on group size in primates. J. Hum. Evol. 1992, 22, 469–493.
  31. Fontanari, J.F. Imitative Learning as a Connector of Collective Brains. PLoS ONE 2014, 9, e110517.
  32. Mogilner, A.; Edelstein-Keshet, L.; Bent, L.; Spiros, A. Mutual interactions, potentials, and individual distance in a social aggregation. J. Math. Biol. 2003, 47, 353–389.
Figure 1. Stochastic simulations of casual groups with attractiveness exponents $\alpha = -1, 0, 1$ and $2$, as indicated. (Left) Mean density of isolates as a function of time $t$. The dashed horizontal line is the exact result (11) for the equilibrium regime. (Right) Mean group size as a function of time $t$. The dashed horizontal lines are the equilibrium approximate analytical results for $\alpha \le 1$ presented in Section 4. The other parameters are $\lambda = 1.5$, $\mu = 1$ and $N = 100$.
Figure 2. Stochastic simulations of casual groups for $\alpha = 0$ and population sizes $N = 5, 10, 20$ and $100$, as indicated. (Left) Mean density of isolates as a function of time $t$. The dashed horizontal line is the exact result (11) for the equilibrium regime. (Right) Mean group size as a function of time $t$. The dashed horizontal line is the approximate analytical result for the equilibrium regime presented in Section 4. The other parameters are $\lambda = 1.5$ and $\mu = 1$.
Figure 3. Probability of observing a group of size $i = 1, 2, 3$ and $4$ in the equilibrium regime for $\alpha = 0$. The population sizes are $N = 10$ and $100$, as indicated, and the disaggregation rate is $\mu = 1$. The solid curves that perfectly fit the simulation data for $N = 100$ are given by the truncated Poisson distribution (23).
Figure 4. Probability of observing a group of size $i = 1, 2, 3$ and $4$ in the equilibrium regime for $\alpha = 1$. The population sizes are $N = 10$ and $100$, as indicated, and the disaggregation rate is $\mu = 1$. The solid curves that perfectly fit the simulation data for $N = 100$ are given by the logarithmic distribution (28).
Figure 5. Probability of observing a group of size $i = 1$ and $2$ in the equilibrium regime for $\alpha = -10^{4}$. The population sizes are $N = 10$ and $100$, as indicated, and the disaggregation rate is $\mu = 1$. The solid curves that perfectly fit the simulation data for $N = 100$ are given by Equations (31) and (32).
Figure 6. Mean-field approximation for the equilibrium regime for attractiveness exponents (top to bottom) $\alpha = 1, 0.5, 0, -1$ and $\alpha \to -\infty$. (Left) Mean group size $m_{\alpha}$. (Right) Variance of the group size $\sigma_{\alpha}^{2}$.
Figure 7. Stochastic simulations in the equilibrium regime for $\alpha = 2$ and population sizes $N = 50, 100, 200$ and $400$, as indicated. (Left) Mean group size $m_{2}$. (Middle) Variance of the group size $\sigma_{2}^{2}$. (Right) Scaled variance $\sigma_{2}^{2}/N$. The solid lines are the predictions of Equations (36) and (37) for $\alpha \to \infty$. The disaggregation rate is $\mu = 1$.
Figure 8. Distribution of group sizes in the equilibrium regime for $\alpha = 2$, $\kappa = 2$, $\mu = 1$ and $N = 100, 200, 400$ and $800$, as indicated. The dashed vertical line indicates the relative size of the large group for $\alpha \to \infty$, viz., $1 - n_{1}^{eq}/N = 2/3$. The lines connecting the symbols are guides to the eye.
Figure 9. Stochastic simulations in the equilibrium regime for $\kappa = 2$ and population sizes $N = 100, 200, 400$ and $800$, as indicated. (Left) Mean group size $m_{\alpha}$. (Right) Scaled variance $\sigma_{\alpha}^{2}/N$. The predictions of Equations (36) and (37) for $\alpha \to \infty$ are $m_{\infty} = 3$ and $\sigma_{\infty}^{2}/N = 4/3$ (dashed horizontal line in the right panel). The disaggregation rate is $\mu = 1$. The lines connecting the symbols are guides to the eye.
Figure 10. Distribution of group sizes in the equilibrium regime for $\kappa = 2$ and $N = 100, 200, 400$ and $800$, as indicated. (Left) $\alpha = 1.10$. The straight line is the fitting $p_{i} = 0.003 \exp(-0.18 i)$ of the data for $N = 800$. (Right) $\alpha = 1.15$. The disaggregation rate is $\mu = 1$. The lines connecting the symbols are guides to the eye.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Fontanari, J.F. Stochastic Simulations of Casual Groups. Mathematics 2023, 11, 2152. https://doi.org/10.3390/math11092152

