Optimising SARS-CoV-2 pooled testing strategies on social networks for low-resource settings

K I Mazzitello; Y Jiang; C M Arizmendi

doi:10.1088/1751-8121/ac039b

1. Introduction

The ongoing COVID-19 pandemic has upended the world, quickly challenging many assumptions and certainties. It is caused by a new virus that transmits efficiently from person to person and produces a high level of morbidity and mortality, which increase with age and co-morbidities. The non-pharmaceutical intervention of detecting and isolating infected people is a key policy to reduce the spread of COVID-19. The aim is to slow transmission and the growth rate of infections to avoid overburdening healthcare systems, an approach widely known as flattening the curve. In order to identify the infected people, SARS-CoV-2 tests must be performed.

However, each diagnostic SARS-CoV-2 test costs 30–50 US dollars in the US [2]. Therefore, testing many people in a population regularly, as may be essential to flatten the curve, is beyond the reach of most low-income and even some middle-income countries. There are, however, more efficient ways than the naive approach of testing everyone, in which far fewer tests are needed, especially at low prevalence. It is much more efficient to pool (or combine) samples and test them together. Group testing first appeared in a paper by Dorfman in 1943 [3]. Other algorithms for pooling samples have been proposed recently [1, 4–7]. Estimating the prevalence of a virus within a community prior to widespread disease transmission may help public health officials predict when to prepare for an increase in cases. With over sixty-eight million cases in the world at the time of writing this paper [8], this sort of screening strategy is probably not necessary at this point in the pandemic. Nevertheless, these techniques are likely to be valuable at the beginning of a future outbreak to track the spread of a virus across the world over time, especially because human behaviors that perturb the human-microbial status quo may have reached a tipping point that predicts the inevitability of an acceleration of disease emergences [9].

On the other hand, this approach may be particularly helpful in settings where the number of infections is low and declining, and most test results are expected to be negative. For example, in a community where the infection seems to be under control and reopenings of schools and businesses are planned, pooled testing of employees and students could be an effective strategy.

Our goal in this work is to analyze how different strategies of surveillance testing in a low-prevalence stage, such as the testing frequency and random versus localized search for infected people, change the epidemic curve. We use the hypercube algorithm of pool testing [1], although any other pool testing algorithm could have been chosen, because we are interested not in the efficiency of the algorithm itself but in the strategy of its application. A similar analysis, monitoring whether epidemics were contained or became uncontrolled depending on the frequency of testing, was carried out with a stochastic agent-based model for SARS-CoV-2 transmission [10]. To investigate the effects of surveillance testing strategies at the population level, we likewise use simulations to monitor whether epidemics are contained or become uncontrolled. We take a network approach to simulate the evolution of the epidemic in a society in order to study not just the frequency but also the spatial distribution of testing. To study how the epidemic behaves when different pooled-testing strategies are applied, the social group affected by the epidemic is represented as families or small communities that interact with each other in a random way. We choose a sparse network to reflect the lockdown restriction. Similar structures have been proposed in [11–13] for carrying out comparative tests of different methods for community detection in complex networks. In our work, the connections between individuals are modeled as static links [14, 15], assuming that contagion is a faster process than the network evolution. Modular time-varying networks have also been proposed to study epidemic spreading [16].

The paper is laid out as follows. In section 2 we define the epidemiological network model. The main results are presented in section 3. In section 3.1, the impact of the social structure given by the network model on the spread of the disease is analyzed. This allows us to establish a frame of reference to study, in section 3.2, the optimal strategies of the pool testing based on the geometry of a hypercube, at low prevalence [1]. Finally, in section 4, we state our conclusions.

2. The epidemiological model of social networks under quarantine

In our sparse network model, we assume that the small communities (families) are composed of a few members connected to each other and also to other families through a number of external static links that trigger the spreading of the epidemic (see figure 1). Each small community or family has k_int ± Δk_int members (nodes) and is connected on average to k_ext nodes that belong to other small communities. Within a community, everybody is connected to everybody else, as shown in figure 1. The nodes of the network represent individuals who can be either susceptible, infected or recovered, subject to interactions with their neighbors (i.e. other individuals directly linked to them by either intracommunity or intercommunity connections). As a result of these interactions, susceptible individuals can become infected and spread the disease over time before they recover or are isolated through a strategy of pooled testing in areas affected by the virus. Starting with a number of outbreaks of the disease randomly located on the network, the model dynamics is defined by iterating a sequence of possibilities, as follows:

**Figure 1.** A social network under quarantine (lockdown restriction) with few individuals (N = 38). In this network there are nine small communities (families) consisting of k_int ± Δk_int = 4 ± 2 members (circles) connected to each other (gray lines) and with a mean of k_ext = 4 intercommunity connections per family (red lines). The network visualization was created with Cytoscape [17].

(1) An individual is selected at random;

(2a) If the individual is infected, he/she can transmit the virus to his/her neighbors with the infectious contact rate β of COVID-19, or can recover with probability 1/t_rec, where t_rec is the recovery time. This time is different for each infected individual and is drawn from a Gaussian distribution around the mean value ${\bar{t}}_{\text{rec}}$ (see table 1).

Table 1. Parameters and their values used in the model.

Parameter	Description	Values
$\bar{\beta }$	Mean infectious contact rate	0.25 ± 0.05 ^a (1/day)
${\bar{t}}_{\text{rec}}$	Mean recovery time	13 ± 3.5 ^b (day)
k_int	Mean number of cohabitants	4 ± 2 ^d
k_ext	Mean intercommunity connections	Variable
p	Prevalence of the disease	Variable
N_S	Number of individuals in a hypercube	Equation (3)
L	Size of the hypercubes	3 [1]
D	Dimension of the hypercubes	L^D = N_S
M	Maximum number of tests per 100 000 inhabitants per day	Variable
N	Maximum number of screened individuals per 100 000 inhabitants per day	10 × M ^c
Frequency	Frequency of testing and isolation of infected persons per day	Variable

^a 95% confidence interval to obtain β = 0.21–0.3 (1/day) [18].
^b 95% confidence interval to obtain t_rec = mean latent period + mean infectious period = 2.2–6 + 4–14 days = 6.2–20 days [18].
^c Since N is always greater than M, we consider it appropriate to set N one order of magnitude higher than M.
^d Range for most countries in Latin America [19].

(2b) If the individual is susceptible and has infected neighbors, he/she can become infected with an effective contact rate β.

(2c) If the individual is infected and is found by testing, he/she is isolated from his/her neighbors. The testing takes some time, during which he/she can still spread the virus. This time is given by the inverse of the frequency of testing (see table 1).

We consider that the infectious contact rate β is constant over time but may change with different pairs of neighbors according to a Gaussian distribution around its mean value $\bar{\beta }$ (see table 1). Indeed, each individual experiences a different number of contacts per unit time with their neighbors, proportionally reflected in β. In other words, β grows with the probability of disease transmission per unit time and also with the interactions between neighbors [20, 21].
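As an illustration of rules (1), (2a) and (2b), the following is a minimal sketch of one elementary update, not the authors' implementation. It assumes a single value of β for all pairs (the text draws β per pair from a Gaussian), treats β and 1/t_rec as per-update probabilities, and lets an infected individual attempt transmission before attempting recovery. The names `update_step`, `state`, `neighbors` and `t_rec` are hypothetical.

```python
import random

S, I, R = "S", "I", "R"  # susceptible, infected, recovered


def update_step(state, neighbors, beta, t_rec, rng):
    """Apply one elementary update and return the selected individual.

    state:     dict node -> S, I or R (modified in place)
    neighbors: dict node -> iterable of linked nodes
    beta:      contact probability per update (assumption: same for all pairs)
    t_rec:     dict node -> individual recovery time
    """
    # (1) An individual is selected at random.
    i = rng.choice(list(state))
    if state[i] == I:
        # (2a) Try to transmit to every susceptible neighbor ...
        for j in neighbors[i]:
            if state[j] == S and rng.random() < beta:
                state[j] = I
        # ... then recover with probability 1/t_rec.
        if rng.random() < 1.0 / t_rec[i]:
            state[i] = R
    elif state[i] == S:
        # (2b) A susceptible with at least one infected neighbor may be infected.
        has_infected_neighbor = any(state[j] == I for j in neighbors[i])
        if has_infected_neighbor and rng.random() < beta:
            state[i] = I
    return i


# Tiny usage example: two linked individuals, one of them infected.
rng = random.Random(0)
state = {0: I, 1: S}
neighbors = {0: [1], 1: [0]}
t_rec = {0: 13.0, 1: 13.0}
update_step(state, neighbors, beta=0.25, t_rec=t_rec, rng=rng)
```

Rule (2c), isolation after testing, is left out here because it depends on the pooled-testing procedure of section 3.2.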

The system evolves toward absorbing states, i.e. frozen configurations that are not capable of further changes, in which the number of individuals affected by the pandemic has reached its maximum. The final state, consisting of recovered individuals and of susceptible individuals who were never infected, depends on the number of outbreaks of the disease and on the community structure. Our procedure allows building fairly large networks (up to 1–4 × 10⁵ nodes) in a reasonable time.
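The network of fully connected families joined by random external links can be sketched as follows. This is an illustrative construction under stated assumptions rather than the procedure used in the paper: family sizes are drawn uniformly from k_int − Δk_int to k_int + Δk_int, k_ext is interpreted as the mean number of external links per family (as in figure 1), and the function name `build_quarantine_network` is hypothetical.

```python
import random


def build_quarantine_network(n_families, k_int=4, dk_int=2, k_ext=4, seed=0):
    """Return (neighbors, families) for a sparse family-based network.

    neighbors: dict node -> set of linked nodes
    families:  list of member lists, one per family
    Requires n_families >= 2.
    """
    rng = random.Random(seed)
    neighbors = {}
    families = []
    node = 0
    for _ in range(n_families):
        # Assumption: uniform integer family size in [k_int - dk_int, k_int + dk_int].
        size = rng.randint(k_int - dk_int, k_int + dk_int)
        members = list(range(node, node + size))
        node += size
        families.append(members)
        # Within a family everybody is linked to everybody else.
        for m in members:
            neighbors[m] = set(members) - {m}

    # Each external link touches two families, so k_ext links per family
    # on average means n_families * k_ext / 2 links in total.
    n_links = n_families * k_ext // 2
    added = 0
    while added < n_links:
        fa, fb = rng.sample(range(n_families), 2)  # two distinct families
        a = rng.choice(families[fa])
        b = rng.choice(families[fb])
        if b not in neighbors[a]:
            neighbors[a].add(b)
            neighbors[b].add(a)
            added += 1
    return neighbors, families


# Nine families with k_ext = 4, comparable in size to the network of figure 1.
neighbors, families = build_quarantine_network(9, seed=1)
print(len(neighbors), "individuals in", len(families), "families")
```

Storing the links as a dictionary of sets keeps the memory cost proportional to the number of links, which matters for sparse networks of the order of 10⁵ nodes.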

We are interested in studying the efficiency of a recently proposed search and isolation algorithm [1], based on pooled testing of individuals in affected areas, applied to the model of social networks under quarantine. The affected areas are discovered through the infected individuals who report to health centers. An affected area around an infected person is composed of a number of individuals chosen from his/her first neighbors, then second, third and so on, until this number is reached. We imposed a maximum number N of screened persons per 100 000 inhabitants per day (see table 1).
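Collecting first, second and third neighbors until a target size is reached amounts to a breadth-first search from the reported case. The sketch below is one possible reading of that description; it assumes that the reported individual is not counted in the area, and the function name `affected_area` is hypothetical.

```python
from collections import deque


def affected_area(neighbors, source, max_size):
    """Return up to max_size individuals around source, nearest shells first.

    neighbors: dict node -> iterable of linked nodes
    The source itself is excluded (assumption: it is already a known case).
    """
    visited = {source}
    area = []
    queue = deque([source])
    while queue and len(area) < max_size:
        current = queue.popleft()
        # sorted() only makes the order reproducible; any order would do.
        for nb in sorted(neighbors[current]):
            if nb not in visited:
                visited.add(nb)
                area.append(nb)
                queue.append(nb)
                if len(area) == max_size:
                    break
    return area


# Chain 0 - 1 - 2 - 3 - 4 with the reported case in the middle.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(affected_area(chain, source=2, max_size=3))  # [1, 3, 0]
```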

3. Results and discussion

3.1. Epidemic spread on social networks under quarantine

In order to set the stage for the investigation of testing effects, let us first show results for the social network model under quarantine without any epidemic control. As mentioned above, in the absence of testing, the system reaches a total number of infected individuals that depends on the parameter values of the disease and on the number of outbreaks at the beginning. Each of these outbreaks starts from a single infected individual randomly located on the network.

Figure 2 shows the mean densities, defined as the mean numbers of recovered individuals (green lines), active infected individuals (red lines) and susceptible individuals exposed to the virus (black lines) divided by the total population of 100 000 inhabitants, obtained for 1, 10 and 100 outbreaks of the virus (solid, dashed and dotted lines, respectively). The exposed susceptible individuals are defined as those persons having at least one infected neighbor. For a given number of outbreaks, the mean density curve of these individuals (black lines) reaches its maximum long before the corresponding peak of infected people (red lines in the same figure). The mean density of susceptible individuals exposed to the virus could therefore serve as a measure to estimate the probability of contagion.

Also, figure 2 shows that, as the number of outbreaks per inhabitant increases, infectiousness increases and the epidemic peak occurs earlier. This last result is expected and is also predicted by mean-field models such as the Verhulst–Pearl sigmoid [22] or SIR [23]. Such compartmental models have proven flexible, tractable, and highly informative as a general guide to the population-level behavior of diseases. Each compartment contains either susceptible, infected or recovered persons, and the probability of disease-causing contact with any member of a particular compartment is the same. This mean-field approximation leads to a fixed intensity of the infection peak, i.e. it does not change with the density of outbreaks below 10%, as shown in figure 3(a). Moreover, these results are easily collapsed by a simple translation along the horizontal axis. In figure 3(b), we shifted the curves for 10 and 100 outbreaks onto the curve for 1 outbreak, estimating the initial pandemic growth as a geometric progression of common ratio

$$r=1+\bar{\beta }-\frac{1}{{\bar{t}}_{\text{rec}}}\quad \text{per day.} \tag{1}$$

For the SIR model with a low density of outbreaks I₀, the infection curves may initially be well approximated by geometric progressions of ratio r. After a time interval Δ_n, the density of infected individuals is ${I}_{n}={I}_{0}{r}^{{{\Delta}}_{n}}$. Thus, the time translation along the horizontal axis in figure 3(b) is

$${{\Delta}}_{n}=\frac{n}{\mathrm{log}\left(r\right)},\quad \text{with}\enspace n=0,\enspace 1,\enspace 2 \tag{2}$$

for 1, 10 and 100 outbreaks, respectively. Here log is understood as the base-10 logarithm, so that each unit increase of n corresponds to a tenfold increase of the infected density, ${r}^{{{\Delta}}_{n}}={10}^{n}$. The good collapse of the curves is apparent, though a slight difference in the densities of recovered individuals is found at the beginning of the collapse because every curve starts with no recovered individuals (this difference is not visible on the scale of figure 3(b)). Therefore, for the SIR model, if the density of outbreaks is low enough and known, the pandemic is predictable over time, and the model is of no use for our goal of studying different searching and testing strategies. (More results obtained from the network model are included in appendix A.)
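As a worked example of equations (1) and (2), the short script below evaluates the common ratio and the time shifts using the mean parameter values of table 1. It assumes that the logarithm in equation (2) is taken in base 10, which is what makes a tenfold change in the number of outbreaks correspond to one step in n; the variable and function names are illustrative.

```python
import math

beta_mean = 0.25   # mean infectious contact rate (1/day), table 1
t_rec_mean = 13.0  # mean recovery time (days), table 1

# Equation (1): common ratio of the initial geometric growth, per day.
r = 1 + beta_mean - 1 / t_rec_mean


def time_shift(n, ratio):
    """Equation (2) with a base-10 logarithm (assumption): days needed
    for the infected density to grow by a factor 10**n."""
    return n / math.log10(ratio)


shifts = [time_shift(n, r) for n in (0, 1, 2)]
for n, shift in zip((0, 1, 2), shifts):
    print(f"n = {n}: shift of {shift:.1f} days, growth factor {r ** shift:.0f}")
```

With these values r is about 1.17 per day, so the curves for 10 and 100 outbreaks are shifted by roughly 14 and 29 days with respect to the curve for a single outbreak.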

**Figure 3.** (a) Mean densities of recovered (green lines), active infected (red lines) and susceptible (black lines) individuals, without epidemic control, obtained from SIR model for 1, 10 and 100 outbreaks of the virus on 100 000 inhabitants (solid, dashed and dotted lines, respectively). (b) Collapse for the same data in panel (a) obtained from ${{\Delta}}_{n}\cong n/\mathrm{log}\left(1+\bar{\beta }-1/{\bar{t}}_{\text{rec}}\right)$ and n = 0, 1, 2 for 1, 10 and 100 outbreaks, respectively. For SIR model, all susceptible persons have the same probability of contagion and there is no distinction between them.

3.2. Optimal strategies of pool testing to prevent the epidemic spread

We consider a fixed maximum number N of screened persons per day. The first tests start when a number n_I0 of infected individuals report to health centers. Each of these persons is considered an infection source, and their neighbors are scanned until N/n_I0 individuals have been collected around each infection source. Once the sample is taken, the testing method is applied. This method was introduced in [1], and the idea is to split the total sample N into subsamples of N_S individuals, pool each subsample and test it with a single test. If the test is negative, all subjects in the subsample are negative and the procedure continues with another subsample of N_S persons. If the test is positive, the hypercube algorithm is applied to determine who is infected.

The algorithm consists of placing each individual of the positive subsample on a D-dimensional hypercube lattice with L points in each direction. The hypercube has D principal directions and contains the N_S individuals of the positive subsample, so that L^D = N_S. For example, for D = 3 and L = 3, the hypercube is a simple cube with 27 individuals arranged on a 3 × 3 × 3 grid (figure 4). For N_S individuals in each subsample, the algorithm is summarized as follows:

(a) Slice the hypercube into L planar slices perpendicular to each principal axis, forming such a set of slices in each of the D directions. Thus, there are DL slices with L^(D−1) points in each of them. Test every slice; if exactly D slices are positive, there is one infected individual, who is immediately identified. Indeed, when the DL slices are tested, one infected individual results in exactly D positive tests, corresponding to the planes that intersect at the infected sample. If the number of positive slices is greater than D, there is more than one infected individual. Moreover, the number of infected subsample members may be accurately inferred at this stage (see [1] for more details). Then, an axis with the maximum number of positive slices is selected (see the slices on the right in figure 4).

(b) Take one of the positive slices selected and run the hypercube algorithm again. The purpose is to apply the method recursively to the selected positive slices; a slice through a D-dimensional hypercube is itself a hypercube of dimension D − 1. If the selected slice contains one infected individual, that individual is immediately identified. If it contains more than one, run the hypercube algorithm again on this slice at a lower dimension. In short, the algorithm is run again on the slices that tested positive and iterated, if necessary, until the infected individuals are identified.
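Step (a) can be illustrated with a small simulation of the slice tests. The sketch below covers only this first stage and is not the full recursive algorithm of [1]; it assumes ideal tests (a pooled slice is positive exactly when it contains at least one infected individual), and the names `slice_tests` and `first_stage` are hypothetical.

```python
import itertools


def slice_tests(infected, L=3, D=3):
    """Simulate the D*L pooled slice tests.

    infected: list of coordinate tuples of the infected individuals.
    Returns positives[axis][coordinate] = True if that slice tests positive.
    """
    positives = [[False] * L for _ in range(D)]
    for point in infected:
        for axis, coord in enumerate(point):
            positives[axis][coord] = True
    return positives


def first_stage(infected, L=3, D=3):
    """Interpret the slice tests as in step (a)."""
    positives = slice_tests(infected, L, D)
    n_positive = sum(sum(row) for row in positives)
    if n_positive == 0:
        return "none", []
    if n_positive == D:
        # One positive slice per axis: their intersection is a single point.
        point = tuple(row.index(True) for row in positives)
        return "identified", [point]
    # More than D positive slices: every intersection of positive slices
    # is a suspect, and step (b) is needed to resolve them.
    positive_coords = [[c for c in range(L) if row[c]] for row in positives]
    return "recurse", list(itertools.product(*positive_coords))


print(first_stage([(1, 2, 0)]))             # a single case is located directly
print(first_stage([(0, 0, 0), (1, 1, 0)]))  # two cases sharing one face
```

The second call reproduces the situation of figure 4: two infected individuals on the same face give five positive slices and four suspects, of which only two are actually infected.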

**Figure 4.** Illustration of sample pooling in the hypercube algorithm, for D = L = 3 and N_S = 27. Red circles represent infected persons and the remaining cyan circles are susceptible. Left panel: the hypercube is sliced into L slices in each of the D principal directions, and samples from N_S/L individuals are pooled into a sample for each slice. In this example, the infected individuals are on the front face and therefore five slices are positive, leading to four suspected persons: the infected individuals and their neighbors indicated by the arrows. Right panel: the x axis, which has the maximum number of positive slices, is selected. One of these slices, itself a hypercube of dimension D − 1, is taken and the hypercube algorithm is run again. The coordinates of the corresponding infected individual are then uniquely identified, and those of the second infected individual are inferred by elimination.

The effective size N_S of the subgroups is chosen to minimize the total mean number of tests per person. The number of tests required grows with the number of infected individuals in the subgroup. Therefore, the algorithm is effective if this number is low and if tests sensitive enough to cope with the dilution of the pooled subsamples are used, such as reverse-transcription polymerase chain reaction (RT–PCR) tests [24, 25]. Assuming Poisson statistics for the number of infected individuals in the subgroups and using L = 3 points in every direction of the hypercube, the optimal size to minimize the total number of tests is [1]

$${N}_{\text{S}}\simeq 0.350/p, \tag{3}$$

with p the prevalence, defined as the probability that any individual of the subgroup of size N_S is infected. The prevalence of the disease is unknown, and we roughly estimate p ≈ n_I0/N in the areas affected by the virus, considering that the infection has not yet spread at the beginning of the pandemic. Then, when the testing is completed, the infected individuals found in affected areas are isolated from their neighbors. Due to limited resources, a maximum number of tests per day M is imposed. M is chosen lower than the sample N of screened individuals per day. The testing is repeated recursively, estimating p as the number of infected individuals isolated divided by the number of samples used in the previous testing. In our network model simulations, a fixed fraction of 15% of infected persons report to health centers in every round of testing. The first tests start when 15% of the infected individuals is equal to or greater than one, and thus n_I0 depends on the sample.
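Equation (3) and the constraint L^D = N_S can be combined to pick a hypercube for a given prevalence estimate. The sketch below is only an illustration: rounding D to the nearest integer is our assumption about how a non-integer optimum is handled, the prevalence used in the example is hypothetical, and the function name `subgroup_size` is not from the paper.

```python
import math


def subgroup_size(p, L=3):
    """Return (optimal N_S from equation (3), dimension D, realised size L**D).

    p: estimated prevalence, e.g. n_I0 / N in an affected area.
    Assumption: D is the integer closest to log_L(N_S), with D >= 1.
    """
    n_opt = 0.350 / p
    D = max(1, round(math.log(n_opt) / math.log(L)))
    return n_opt, D, L ** D


# Hypothetical prevalence estimate of 1% in the screened area.
n_opt, D, n_s = subgroup_size(0.01)
print(f"optimal N_S = {n_opt:.0f}, D = {D}, hypercube of {n_s} individuals")
```

For a 1% prevalence the optimum of about 35 individuals is closest to a 3 × 3 × 3 hypercube of 27 individuals, the case illustrated in figure 4.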

Figure 5(a) shows the density of active infected and recovered individuals for a social network without epidemic control and with testing, using either the hypercube algorithm or a simple method of one test per person in affected areas. In this last method, the number N of persons screened per day equals the number M of tests that can be done per day. Thus, for this simple method, the affected area around each of the n_I0 infected individuals who report to health centers is reduced to M/n_I0 individuals. For both methods of epidemic control, the epidemic decreases, and the improvement is significantly greater for social structures with less interaction (figure 5(b)). In this case, the hypercube algorithm controls the epidemic. In fact, the density of recovered persons (dashed green line in figure 5(b)) is very low since only a few individuals were infected (dashed red line, practically not visible on the scale of figure 5(b)).

Saving time is very important for a rapidly spreading infectious disease like COVID-19. However, regular testing is limited by costs and by the operational capacity of sampling. In figures 5(a)–(c), for the hypercube algorithm, a maximum number of tests M = 200 on a maximum number of screened individuals N = 2000 for every 100 000 inhabitants per day was considered, but in part (c) the maximum numbers of tests and of screened persons are M/2 and N/2 every 12 h, doubling the frequency of testing. This allows infected individuals to be found and isolated in a shorter time in order to decrease infections. Figure 5(c) shows that, by reducing the search time, the hypercube algorithm achieves control of the epidemic for social networks with high connectivity.

The efficiency of the hypercube algorithm depends on the samples taken around the infected individuals reported to health centers. If the search for infected individuals is random, the efficiency of the algorithm is low. To check this, we take M persons at random on the network and build hypercubes. With a random search, the infection curves practically do not change for either high or low connectivity networks. The search in affected areas is essential to consistently reduce an epidemic like COVID-19. These results are summarized in figure 6, which shows the efficiency of the hypercube algorithm (a) and of the simple method of testing every person (b) as a function of the maximum number of tests per day and of the frequency of isolation of infected persons, for social networks with high connectivity. The efficiency is defined as the difference between the total number of recovered individuals without epidemic control and with epidemic control, divided by the total number of recovered individuals without epidemic control. The maximum number of scanned individuals is ten times the maximum number of tests for the hypercube algorithm (figure 6(a)), and these numbers are the same for the method of one test per person (figure 6(b)). The frequency is the number of times per day that infected persons are isolated, respecting the maximum numbers of tests and scanned individuals (see figure 5(c)). The efficiency depends on the model of social networks; however, the results shown in figure 6 are qualitatively useful. The frequency of search and isolation of infected individuals in zones where the virus has been reported is the relevant parameter to control COVID-19. Indeed, increasing the frequency produces a remarkably larger gain in the algorithm's efficiency than increasing the maximum number of tests per day.
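The verbal definition of the efficiency given above can be written compactly as follows. The symbols E, R_free and R_ctrl are our own notation, introduced only for this illustration: R_free is the total number of recovered individuals without epidemic control and R_ctrl the total number with epidemic control.

```latex
E = \frac{R_{\text{free}} - R_{\text{ctrl}}}{R_{\text{free}}},
\qquad 0 \leqslant E \leqslant 1 \quad \text{when } 0 \leqslant R_{\text{ctrl}} \leqslant R_{\text{free}} .
```

With this notation, E = 0 means that the control has no effect on the final number of recovered individuals, and E close to 1 means that almost all infections are avoided.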

**Figure 6.** Efficiency of the hypercube algorithm (a) and of the simple method of one test per person (b) as a function of the maximum number of tests per 100 000 inhabitants per day and of the frequency of isolation of infected persons, obtained from 10 outbreaks of the virus on 100 000 inhabitants and for social networks with high connectivity: k_int = 4 ± 2 cohabitants and mean intercommunity connections k_ext = 4. The hypercube algorithm (a) is much more efficient than one test/person (b). (Note that the efficiency scale in (a) is twenty times higher than in (b).)

A total population of 100 000 inhabitants was considered in the social networks of the previous figures. When the population is reduced tenfold, both the frequency and the maximum number of tests per day affect the efficiency, but the greatest increase in efficiency comes from doubling the frequency rather than doubling the maximum number of tests per day. These results are shown in table 2 for both methods. Finally, note that the method of one test per person is significantly more efficient for small populations than for large populations (see figure 6(b)).

Table 2. Efficiency of both methods obtained from 10 outbreaks of the virus on 100 000 inhabitants, for social networks with high connectivity: k_int = 4 ± 2 cohabitants and intercommunity mean connections k_ext = 4 and a total population of 10 000 inhabitants. M is the maximum number of tests per 100 000 inhabitants per day. For the hypercube algorithm, the screened individuals per 100 000 inhabitants per day, N = 10 × M and for the method of one test/person, N = M.

Method	M = 200, frequency = 1	M = 400, frequency = 1	M = 200, frequency = 2
Hypercube algorithm	0.16	0.23	0.51
One test/person	0.13	0.22	0.34
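The comparison between doubling the number of tests and doubling the frequency can be read directly from table 2. The short script below simply reproduces that arithmetic from the tabulated efficiencies; the dictionary layout and the function name `gains` are illustrative choices.

```python
# Efficiencies from table 2, in the order:
# (M = 200, frequency = 1), (M = 400, frequency = 1), (M = 200, frequency = 2)
table2 = {
    "Hypercube algorithm": (0.16, 0.23, 0.51),
    "One test/person": (0.13, 0.22, 0.34),
}


def gains(baseline, double_tests, double_frequency):
    """Absolute efficiency gains with respect to the baseline column."""
    return double_tests - baseline, double_frequency - baseline


for method, values in table2.items():
    gain_tests, gain_freq = gains(*values)
    print(f"{method}: +{gain_tests:.2f} from doubling M, "
          f"+{gain_freq:.2f} from doubling the frequency")
```

For both methods the gain from doubling the frequency (0.35 and 0.21) exceeds the gain from doubling the maximum number of tests (0.07 and 0.09).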

4. Conclusions

The COVID-19 pandemic has been faced with partial or total lockdowns in order to decrease the number of infected, ill or dead people by decreasing social interactions. We found that social networks with low connectivity between their individuals reduce contagion and can go a long way toward keeping the curves of infected persons flat. However, since most facets of economic and social life require person-to-person contact, testing, searching for and isolating infected individuals helps to reduce the epidemic and allows a sooner return to normal activity. RT-PCR tests are accurate but costly, and are a challenge particularly for developing countries. The search for infected individuals by grouping samples is considered in this work. In particular, we studied the epidemic evolution under different strategies for applying a pool testing method, based on the geometry of a hypercube, to isolate infected persons in social networks under quarantine threatened by an epidemic with high contagiousness and rapid spread, such as the coronavirus disease (COVID-19). Pool testing on social networks under quarantine is effective if the search for infected persons is carried out in zones where the virus has been reported and the isolation of these individuals is done as quickly as possible. A strategic search in zones affected by the virus combined with a high frequency of isolation can outperform massive testing. Indeed, we found that massive testing applied at random to social networks with either high or low connectivity has little impact on the reduction of contagion. In this line of research, future work may study strategies of pool testing to recognize superspreading events, which are associated with both explosive growth early in an outbreak and sustained transmission in later stages.

Data availability statement

All data that support the findings of this study are included within the article (and any supplementary files).

Appendix A.: Infection peaks in the social network model under quarantine

The height of the infection peak for the social network model under quarantine without any epidemic control is associated with the network connectivity. Indeed, a few outbreaks can become extinct without intervention in areas with few connections, and thus the intensity of the infection peak is low. Epidemic elimination may also be obtained for a higher number of outbreaks when the network connectivity is reduced. This may be clearly observed in figure A1, which shows the evolution of the mean densities of recovered individuals, active infected individuals and susceptible individuals exposed to the virus for two different social structures and a fixed number of outbreaks. Enhanced connectivity in the network promotes the spread of the disease. Therefore, social isolation is an effective tool that delays the epidemic peak and also significantly reduces the total number of infected individuals, as reflected in the number of recovered individuals (green lines of figure A1). Since social isolation has its socio-cultural and economic constraints, in this work we apply the algorithm based on the geometry of a hypercube to search for and reduce the infection in affected areas of social networks under quarantine.

**Figure A1.** Mean densities of recovered (green lines), active infected (red lines) and susceptible exposed to the virus (black lines) individuals, without epidemic control, obtained from ten outbreaks of the virus on 100 000 inhabitants and social networks under quarantine consisting of k_int = 4 ± 2 cohabitants and two different mean intercommunity connections: k_ext = 2 and 4 (solid and dashed lines, respectively). The curves were averaged over 100 simulation runs.
