Main

Food supply shock is a pressing issue that may be increasing worldwide1,2. Extreme-weather events, possibly exacerbated by climate change9,10, are a main driver of food supply shocks11. The risk of simultaneous global breadbasket failure is also probably rising12, posing a threat to systemic and catastrophic food production losses13. Geopolitical crises and policy changes are also responsible for a large proportion of shocks to different food systems1. Moreover, threats to public health can disrupt food supply chains, as shown by the coronavirus disease 2019 pandemic in several national economies around the world14. Global and national food supply chains increase exposure to shocks compared with local food supply chains4,7,15,16, but also add diversity and resilience15,17.

Network topological diversity and connectivity are key attributes of resilient social−ecological7,18 systems. Food supply chains, along with other material inflows such as water and energy, are a close analogy to an ecological food web19,20. Resilience to shock has “three R’s”21,22: resistance to changing food inflows, robustness to a wide range of hazards and recovery or reorganization time after a shock. Therefore, from ecological and resilience theory23,24, a food shock resilience model should relate the diversity and/or connectivity of the food supply chain network to explain a city’s resistance to food shocks. Supply chain diversity provides adaptive options for the city to exploit when some of its supply chains suffer shock25,26,27, thus boosting resilience to shock. The methods in this paper measure resistance to food supply shock, which is a specific subtype of resilience21.

If cities, companies and nations had access to a model estimating their ability to buffer food supply chain shocks, this model could be used in policy and management to optimize supply chains and control the risk of shocks26,28. The ideal model should be simple, quantitative, accurate, operationally useful, applicable at the scale of cities (which consume and process most food)29, ‘hazard agnostic’ for all causes of shocks, and would explain shock risk as a function of factors that a city, company or nation can control. We are skilled at modelling hazard-specific risk management, but general hazard-agnostic models of resilience are better because they can potentially handle unexpected extreme events that continue to threaten our human systems. Here, we propose a statistical−empirical model meeting these ideal criteria, explaining the resistance of USA cities to food supply shocks as a function of the topological diversity of the city’s food supplier network.

Results

Using annual timescale food inflow supply networks (crops, live animals, feed and meat) for the cities of the USA, we extracted the annual intranational food inflow subgraph of each metropolitan area for the period 2012−2015, which is the period with available data30 and when food production systems were substantially affected by drought and agricultural production shocks on the Great Plains and in the western USA31,32. In this model a shock occurs when food inflows drop by more than the intensity threshold (ranging from 3% to 15%) for a duration of a year, with the drop measured against the average inflows of the four-year time period.

Using observations of thousands of food inflows to hundreds of USA cities across four years and four types of food, we calculate the probability (or frequency) of exceeding a given shock intensity for an annual duration for each USA city. The resistance of a city to food shocks is the complement to the probability of the shock. We find that resistance relates positively to the Shannon diversity of a city’s food inflows. This assumes that all cities’ supply chains were exposed to shocks of many intensities during the study period. This is a justified assumption in the USA for the study period of 2012−2015, because in 2012 the USA Great Plains experienced an exceptionally severe drought33 and in 2012−2013 the western USA experienced a severe drought9, both with widespread losses reported in crops and livestock33,34.

The resulting model takes the operationally useful form of an intensity−duration−frequency (IDF) relationship that is widely used for risk and hazard engineering and as a basis for design codes and policies of risk35. This model provides an all-hazards or ‘hazard agnostic’ approach, because—although the empirically observed shocks underlying our analysis are mostly due to drought affecting food suppliers—the model is valid in principle for all kinds of shocks to the city’s food supply network.

Food shocks and supply chain diversity

For cities in the USA, the probability of an annual food supply shock S being greater than a shock intensity s, P(S > s) (see Methods), declines as the diversity D of a city’s food inflows supply chain increases (Fig. 1a). Our measure of supply chain diversity is calculated using the Shannon diversity of a city’s supply chain network of intranational trading partners based on a combination of five different indicators (Methods). Using data for 284 cities and 4 food sectors, the annual probabilities of food supply shocks are calculated by measuring, for each city and food sector, the maximum food supply departure from the annual average during 2012−2015 (Methods). We utilize a total of 4,884 buyer–supplier subgraphs and 1,221 time series to calculate P(S > s) and D.

Fig. 1: Relationship between the probability of shock and supply chain diversity.
figure 1

a, b, Empirical (a) and modelled (b) relationship between  the probability of food supply chain shock P(S > s) and supply chain diversity D for different shock intensities s. Food systems—each food system consists of the supply chain for one of the cities’ four food sectors—are classified into 6 bins of supply chain diversity (limits 0, 0.395, 0.497, 0.585, 0.665, 0.755 and 0.92) with 204, 202, 204, 203, 206 and 202 observations in each bin. These 6 bins each have an average (with standard error given in parentheses) supply chain diversity of 0.28 (0.007), 0.45 (0.002), 0.54 (0.002), 0.62 (0.002), 0.71 (0.002) and 0.81 (0.002), respectively. The probability of food supply chain shock in each bin is calculated as the number of food systems with shocks larger than a shock intensity s divided by the total number of food systems in the bin (Methods). For clarity, panel a shows the probabilities of food supply chain shock for \(s\in \{3,5,10,15\}\) (for other s values see Supplementary Fig. 1). For each s value, the empirical value of P(S > s) is fitted against D using a constrained exponential function (curves in a). The confidence bounds represent ±1 s.d. of the fitted curves (degrees of freedom = 4). The parameter values and goodness-of-fit results for these exponential fits are shown in Extended Data Table 1. With the exponential fit of the empirical relationship, the curves in panel b are obtained by relating the parameters from the exponential fits to the shock intensity s using linear regression (Extended Data Fig. 1a, b). The parameter values and goodness-of-fit results for these linear fits are shown in Extended Data Table 2.

Our results indicate that with greater supply chain diversity D, cities are more likely to avoid or resist shocks of increasing intensity (3%, 5%, 10% and 15%; Fig. 1a). The shock intensity is quantified as the occurrence of a food supply loss greater than a specified percentage threshold s. On average, 1 in 4 cities (probability of shock 0.25; Fig. 1a) with D of 0.20 experience a supply shock greater than 15% in any of their food sectors, while for the same threshold, cities with D values of 0.45, 0.54, 0.63, 0.71 and 0.83 experience food supply shocks 1 out of 6, 8, 14, 17 and 202 times, respectively (probabilities of 0.18, 0.12, 0.07, 0.06 and 0.004, respectively; Fig. 1a). The same trend is observed for shock intensities greater than 3%, 5% and 10% (Fig. 1a). The relationship in Fig. 1a holds when controlling for changes in demand using population or food inflows as proxy for demand (Methods). It also holds when the analysis is performed using the original food flow data with 69 cities rather than the 284 cities in Fig. 1a (Methods). In addition, although the five indicators used to calculate diversity have a varied influence on the relationship in Fig. 1a, the inclusion of all indicators in the supply chain diversity measure increases the Pearson correlation between the data in Fig. 1a (Methods).

We use the observed empirical relationship between P(S > s) and D to build a statistical IDF model of food supply shocks. Letting \({F}_{s}(D)=\,\mathrm{ln}(P(S > s))\), the model takes the following form

$${F}_{s}(D)=-{k}_{s}(D-{D}_{0,s})\propto D-{D}_{0,s},$$
(1)

where ks and D0,s are fitting parameters that depend on the shock intensity s. The parameters are estimated using nonlinear least squares. The exponential function in equation (1) provides a good fit to the data (R2 ≥ 0.73; Extended Data Table 1) and it has the desirable characteristics of being simple to implement and bounded at P(S > s) \(\in [0,1]\) and D \(\in [0,1]\).

Many different (social, economic, infrastructural and environmental) urban indicators of city functioning have been shown to scale with the city size at the metropolitan level36,37, where population is typically used to quantify city size. Hence, we evaluate whether D varies in a systematic way with the city size. For all food sectors, D shows a very weak positive trend with population (Supplementary Fig. 2). Since larger cities tend to have slightly more diverse supply chains (Supplementary Fig. 2), they are slightly less likely to experience shocks of increasing intensity.

A general model of food shocks to cities

We use the model parameters fitted using equation (1) to derive and extrapolate a family of IDF curves covering a wide range of shock intensities. The model parameters ks and D0,s are each linearly regressed against the shock intensity s to obtain parameter values for different intensities (Extended Data Fig. 1a, b). The fitted linear regressions show good performance (R2 ≥ 0.80; Extended Data Table 2). Using these regressed parameter values in equation (1), we obtain the IDF curves in Fig. 1b. The probability of shock equations for different shock intensity values are included in Extended Data Table 3. For example, assuming a design frequency of P = 0.25 per year (a ‘four-year’ shock), a city with D = 0.2 can expect to experience at least one food shock of 15% or greater. If the same city increased its food supply chain diversity to D = 0.8, the same 15% shock occurs less often, with frequency P = 0.05 (a ‘twenty-year’ shock). The curves in Fig. 1b are valid for food supply chain shock to USA cities and regions at roughly the metropolitan area scale during the period 2012−2015, but may possibly be valid for other regions, time periods and supply chain types as well.

In the standard engineering design application35, design codes or standards will set the maximum tolerable frequency and shock intensity (for example, annual P < 0.01 and s < 5%), and the city would enact policies and investments to modify D to bring the expected frequency and intensity below that level of risk. Insurance and emergency management strategies would then be developed to cover the remaining risk. These IDF curves can be directly employed by engineers, insurers, policymakers and planners to measure and control the risk of shock to the food supply chain and to design solutions that reduce the risk of shock.

Using the IDF curves to create a map of the so-called ‘100-year’ shock with a 1% annual exceedance probability (frequency F is P(S > s) = 0.01, duration D is 1 year), we find that the expected shock intensity s varies from 22% to 32% in cities and rural regions across the USA (Fig. 2). The narrow 10% range of variability is due to the relative similarity of food supply chain structure and diversity across USA communities. The cities with the highest expected 100-year shock intensities are: Grand Junction, Colorado; Corpus Christi, Texas; Beaumont, Texas; and Steamboat Springs, Colorado (Fig. 2); and the cities with the lowest expected intensities are: Florence, South Carolina; Cleveland, Ohio; Roanoke, Virginia; and Columbus, Ohio. The shock intensities are on average greater in the western USA (west of the 100th meridian) than in the eastern USA (Fig. 2), with an average shock intensity of 26.3 ± 1.9% (average ± 1 standard deviation) versus 24.8 ± 1.5%, respectively. In addition, some cities show a lower shock intensity than their neighbouring cities. For example, Los Angeles has a lower shock intensity compared to other cities in the state of California, with a shock intensity of 23.5% versus 27.2 ± 0.9%, respectively. The lower expected shock intensity of Los Angeles is due to its unusually diversified (with D = 0.69) food supply chain. Spatial patterns for shocks with other frequencies are similar.

Fig. 2: Map of expected shock intensities of an annual duration food supply chain shock with annual occurrence probability of 1%.
figure 2

The shock intensities are represented as a fraction of the average annual inflows. The annual occurrence probability of 1% is also known as a 100-year food supply shock.

Co-occurrence of food shocks

The model in equation (1) is valid if only one food sector is shocked. However, food supply shocks to multiple sectors (for example, crops, live animals) can and do co-occur1, meaning that shocks from different sectors can be simultaneously experienced by a city or region. Shock co-occurrence is due to multiple factors, such as the tendency of droughts to affect large areas and of industries to collocate and to form strong interdependencies31,32. Even though drought and extreme heat may have a more immediate impact on crop and pasture losses, those losses can quickly propagate to other food sectors, for example, by reducing the crop inputs required to produce animal feed and by inducing livestock culling31,32. The contextual details of a cascading shock are difficult to predict, making it important that the model consider both single and also co-occurring shocks.

In our dataset, shock co-occurrence is widespread, with most cities in the study period 2012−2015 experiencing at least one shock co-occurrence of 2 or more food sectors (Extended Data Fig. 2). Accounting for shock co-occurrence, we find that the probability P ′(S > s) of an annual food supply shock S exceeding intensity threshold s declines with increasing D (Fig. 3a). For example, on average 1 in 4 cities (that is, P′(S > s) = 0.25), with D of 0.20 experience food supply shocks of s > 10% in more than one food sector simultaneously. Our empirical results suggest that supply chain diversity can also boost the resistance of cities to rarer but more dangerous co-occurring shocks.

Fig. 3: Relationship between the probability of co-occurring shocks and supply chain diversity.
figure 3

a, b, Empirical (a) and modelled (b) relationship between the probability of co-occurring food supply chain shock P′(S > s) and supply chain diversity D for different shock intensities s. Food systems are classified into 6 bins of supply chain diversity (limits 0, 0.395, 0.497, 0.585, 0.665, 0.755 and 0.92) with 204, 202, 204, 203, 206 and 202 observations in each bin. These 6 bins each have an average (with standard error given in parentheses) supply chain diversity of 0.28 (0.007), 0.45 (0.002), 0.54 (0.002), 0.62 (0.002), 0.71 (0.002) and 0.81 (0.002), respectively. The probability of co-occurring food supply chain shocks in each bin is calculated as the number of cities with shocks larger than a shock intensity s to 2 or more food sectors divided by the total number of cities in the bin (Methods). For each s, the empirical value of P′(S > s) is fitted against D using a constrained exponential function (curves in panel a). The parameter values and goodness-of-fit results for these exponential fits are shown in Extended Data Table 4. With the exponential fit of the empirical relationship, the curves in panel b are obtained by relating the parameters from the exponential fits to the shock intensity s using linear regression (Extended Data Fig. 1c, d). The parameter values and goodness-of-fit results for these linear fits are shown in Extended Data Table 5. Panel b shows the curves for \(s\in \{3,5,10,15\}\); for other s values see Supplementary Fig. 3.

To consider the effect of shock co-occurrence on the IDF curves, we modify the model in equation (1) as follows:

$${F}_{s}^{{\prime} }(D)=-{k}_{s}^{{\prime} }(D-{D}_{0,s}^{{\prime} })$$
(2)

where the fitting parameters \({k}_{s}^{{\prime} }\) and \({{D}}_{0,{\boldsymbol{s}}}^{{\prime} }\) now account for the effect of co-occurrence on the probabilities of food supply shock to a city. The parameters are estimated using nonlinear least squares (Extended Data Table 4). Each of the estimated parameters is related to the shock intensity s using linear regression (Extended Data Fig. 1c, d and Extended Data Table 5), which are in turn used, together with equation (2), to create a family of IDF curves that account for co-occurrence effects (Fig. 3b).

For a given intensity, the model expects co-occurrent food supply shocks to be less frequent than single food supply shocks (Figs. 1b and 3b). For example, when neglecting co-occurrence, for an annual-duration shock intensity of 10%, a city with D = 0.2 has a frequency of shock of P = 0.45, whereas for co-occurring shocks the expected frequency decreases to P = 0.29. For example, the frequency with which San Diego (D = 0.35), California, expects a single shock of 10% intensity is P = 0.36, but for two or more co-occurrent shocks, P = 0.22.

Discussion

Analogous to biodiversity buffering ecosystems against external shocks5,6, our results show that cities with a greater diversity of food suppliers have a lower probability of suffering a food supply shock for any reason. This method aims at operationalizing ecological theory and network theory to form a valid engineering operations risk management framework for supply chains, grounded in empirical science. This study is a step towards quantifying, explaining and managing the risk faced by cities and regions. The all-hazard nature of the method is important, because it promises the ability to build resilience in the face of the unpredictable events that increasingly characterize hazards in our fast-paced and highly connected world. This method also holds promise for managing supply chain shock risk and resilience at the scales of companies, neighbourhoods and nations, as well as for other kinds of supply chains beyond food.

Approaches aimed at reducing or avoiding food supply shocks have been extensively explored at the national level15,16,38, and the method presented in this paper corroborates—partially—prior national-scale findings, while presenting a simpler answer that applies at finer and more manageable scales. Food supply shocks are most harmful to vulnerable populations within cities, so boosting food supply resilience in cities—and especially in vulnerable neighbourhoods of cities—is an important policy goal. Although food insecurity is on average low in developed countries, approximately 10–12% in the USA39, it can be high for marginalized groups in cities and in rural areas; 20–40% in the USA39. Food insecurity is positively related to food price variability across American cities40. Food supply shocks may increase food prices and price variability41. Therefore, it seems plausible that higher food supply diversity could reduce food insecurity. Because of the varied socioeconomic geography of the USA, our dataset includes cities and regions with markedly different population, density, wealth, race, culture and climate characteristics. Therefore, our results may be directly relevant for a wide range of cities, cultures, timeframes and nations.

Pimm et al.42 emphasize the need to test and utilize empirical measures to operationalize resilience concepts. Our empirical model, linking supply chain diversity to the probability of shock, contributes to that operational goal. Cities need a multifaceted and adaptive approach to manage risk from multiple forms of shocks and to build resilience. In an era of expanding urbanization and connectivity, cities have a key role in global sustainability43. Cities will need to actively manage their supply chain connections to deal with the causes and consequences of shocks to their critical lifelines, such as food. Using this method to guide the policy objective of diversification of supply chains, cities and communities can engage in demand-side policies that scientifically manage risk and build resilience into their supply chains. The IDF framing of this method makes it directly applicable using the design-code framework that is already broadly employed by policymakers to manage risk by design.

A city’s food supply chains—like most supply chains—represent the agglomeration of efforts by many independent companies and logistics operations that grow, manufacture, ship, distribute, store and retail food products. It is a complex supply chain with many parts44. Translating the high-level design framework inherent to our proposed IDF model into actionable steps and effective regulations for individual operators remains a challenge for the future. Diversity by design, at the level of communities, will take coordination, and possibly regulation, among many parties.

Food businesses (for example, grocers, restaurants and distributors) and—to a lesser extent, individuals—can voluntarily contribute to a resilient food policy by intentionally favouring a diverse supply chain where possible, and by maintaining slightly larger food inventories in locales that are known to be at higher risk of food shock (for example, southern Utah). Local, state, and federal governments and mission agencies can collect data on the diversity of their businesses’ suppliers, can set policy targets to achieve minimum supply chain diversity, and can create regulations or incentives to achieve those minimums. In the USA, this could be relevant to several government-sponsored programmes aimed at reducing food insecurity, such as the Supplementary Nutrition Assistance Program45 and the National School Lunch Program46.

Insurers can price in shock risk where appropriate to incentivize diversity and resilience, and a niche business sector could emerge to mitigate that risk by managing supply chains or building buffers. Large agribusiness can reduce the supply-side risk by diversifying its supply chains. Governments can scientifically size and locate food buffers (stockpiles) to fill the gap between the IDF food security risk metrics presented in this paper and the desired level of food security—or can prioritize emergency recovery assets to higher-risk locations. The precise and actionable statistics in this model form a basis for scientifically designed food security and risk management systems that fit neatly within the existing risk management and design paradigm already used by insurers, engineers, emergency managers and policymakers.

Methods

Dataset of food flow networks

We derive annual, intranational food flow networks for the USA using the Freight Analysis Framework version 4 (FAF4) database30. The derived networks are for different food sectors and include all metropolitan areas in the USA. The FAF4 database consists of annual commodity flows during 2012−2015 for 115 geographic areas in the USA and 43 different sectors. We focus on the following four food sectors in the FAF4 database: crops, live animals, animal feed and meat. The 115 geographic areas in the FAF4 database cover the entire contiguous USA, including 69 metropolitan statistical areas and 46 remainders of states (the remainder is the area of a state that is not part of a FAF4 metropolitan area).

To obtain food flows for all metropolitan areas in the USA, we disaggregate the FAF4 database from 115 to 329 areas (Supplementary Fig. 4), out of which 284 are metropolitan or combined statistical areas (120 metropolitan and 164 combined statistical areas). The disaggregation is performed using different socioeconomic and agricultural-related variables as attractors of supply and demand. For each food sector, a flow with origin o and destination d in the FAF4 database is disaggregated to a metropolitan-level flow with origin o′ and destination d′ using a disaggregation variable a as the best attractor of supply or demand.

The disaggregation is performed in two stages. In the first stage, the supply U of each FAF4 remainder of state is disaggregated to include all the metropolitan areas in that remainder of state as follows:

$${U}_{{o}^{{\prime} }d}^{c}=\frac{{U}_{od}^{c}}{{a}_{o}}\times {a}_{{o}^{{\prime} }},$$
(3)

where \({U}_{{o}^{{\prime} }d}^{c}\) (in tons per year) is the disaggregated supply for food sector c in origin o′ that satisfies demand at the FAF4 destination d, \({U}_{od}^{c}\) (in tons per year) is the FAF4 food flow for sector c between areas o and d, and \({a}_{{o}^{{\prime} }}\) and ao are the attractor variables for the new origin o′ and FAF4 origin o, respectively. In the second stage, \({U}_{{o}^{{\prime} }d}^{c}\) is further disaggregated into demand E using:

$${E}_{{o}^{{\prime} }{d}^{{\prime} }}^{c}=\frac{{U}_{{o}^{{\prime} }d}^{c}}{{a}_{d}}\times {a}_{{d}^{{\prime} }},$$
(4)

where \({E}_{{o}^{{\prime} }{d}^{{\prime} }}^{c}\) (in tons per year) is the demand at destination d′ for food sector c supplied by origin o′, while \({a}_{{d}^{{\prime} }}\) and \({a}_{d}\) are the attractor variables at the disaggregated destination d′ and FAF4 destination d, respectively.

The FAF4 database includes food flow data at both the state level (48 states) and metropolitan level (115 areas including 69 metropolitan areas). Prior to performing the disaggregation, we jointly use the FAF4 state data and the FAF4 metropolitan data to select the best performing attractor variables. That is, we first use equations (3) and (4) to disaggregate the FAF4 state-level data to the metropolitan-level for the metropolitan and remainder-of-state areas in FAF4. By comparing the performance of our disaggregated flow data against the empirical FAF4 metropolitan-level data, we select the best attractor variable for each food sector. The following attractor variables are considered: population47, employment48, wages48, number of establishments48 and cropland area49. These variables are selected on the basis of previous analysis and data availability50.

To assess the performance of the attractor variables, we use the Pearson correlation coefficient between the empirical FAF4 flows and the disaggregated flows for the metropolitan areas and remainder-of-state areas in FAF4 (Extended Data Fig. 3). The performance is high with correlation values greater than 0.87 and an average of 0.95. Using the best-performing disaggregation variables, we build the food flow networks employed in this study. The nodes in the networks represent metropolitan and remainder-of-state areas, and the weighted links represent annual food flows during 2012−2015 for crops, live animals, feed and meat (Supplementary Fig. 5).

The FAF4 metropolitan and remainder-of-state areas we used to select the attractor variables span a wide range of populations, cropland areas, and number of establishments, since these FAF4 areas include the largest cities in the USA and a broad range of medium-size cities. The values of the attractor variables used in the disaggregation are within the ranges implied by the FAF4 metropolitan data (Supplementary Fig. 6), indicating that the variables are reliable. The exception to this is population, which is only used to disaggregate meat demand. Population, however, has a high disaggregation performance with a correlation coefficient of 0.97 (Extended Data Fig. 3). In addition, the use of population to disaggregate meat demand is consistent with previous scaling results for metropolitan areas in the USA51 that have shown that metropolitan-level variables that are related to resource consumption scale approximately linearly with population.

Food inflows supply chain diversity

To determine annual supply chain diversity, we extract the annual food buyer–supplier subgraph of each city and food sector from the food flow networks19. We refer to each of these subgraphs as a food system. The food buyer–supplier subgraph of a city i consists of all the supply chain interactions with its trading partners or neighbours j for a specific food sector. Our measure of supply chain diversity is based on the notion of functional distance52. We compute the functional distance d between i and any of its trading partners j by combining five different indicators: physical distance, climate correlation, urban classification, economic specialization and network modularity. The indicators are described below in the ‘Functional distance indicators’ section of the Methods. We also perform statistical analyses to evaluate the influence of the attractor variables on these indicators (Methods). The indicators represent stable characteristics of cities and therefore tend to remain fairly constant during our study period.

The functional distance \({d}_{ij}^{r}\) for an indicator r between any pair of connected nodes (i,j) is calculated as

$${d}_{ij}^{r}={N}^{-1}|{r}_{k}-{r}_{i}|,$$
(5)

where the normalization constant N is determined as the maximum value of \(|{r}_{k}-{r}_{i}|\) between any node k in the network and i. In equation (5), \({d}_{ij}^{r}=0\) for functionally similar nodes and \({d}_{ij}^{r}=1\) for dissimilar nodes.

For each city’s buyer−supplier subgraph and food sector, any pair of connected nodes has 5 different functional distance indicators associated with it. To combine these distance indicators into a single measure, we calculate the average functional distance indicator \(\langle {d}_{ij}^{r}\rangle \) as the arithmetic average of the 5 functional distance indicators for any pair (i, j) of connected nodes. We use the discrete probability distribution of food inflows binned by \(\langle {d}_{ij}^{r}\rangle \) categories, together with Shannon entropy53, to calculate the supply chain diversity \({D}_{i,c}^{t}\) of node i and sector c at year t:

$${D}_{i,c}^{t}=\frac{-{\sum }_{k=1}^{K}{Y}_{i,c}^{t}(k)\mathrm{ln}\,{Y}_{i,c}^{t}(k)}{\log \,K}.$$
(6)

For sector c and year t, \({Y}_{i,c}^{t}(k)\) is the proportion of food inflows to node i within bin k. The k bin is obtained by binning all the \(\langle {d}_{ij}^{r}\rangle \) values for node i into a total number of K bins.

\({D}_{i,c}^{t}\) is sensitive to the total number of bins K. Thus, for each node in our food flow networks, we tested the sensitivity of \({D}_{i,c}^{t}\) to the total number of bins K. For K = 15, D values stabilize (Supplementary Fig. 7); therefore, we used 15 bins when performing all calculations of functional diversity.

Functional distance indicators

The average functional distance between a city and its trading partners is based on the following five indicators:

  1. (1)

    Physical distance indicator (PDI). The PDI is obtained by calculating the Euclidean distance from the centroid of each geographic area to the centroid of all other areas. The geometric centroids of all geographical areas are calculated using the GIS software ArcMap (https://desktop.arcgis.com/en/arcmap/).

  2. (2)

    Climate indicator (CI). To account for different climates in cities across the USA, the Palmer Drought Severity Index (PDSI) is used54. The monthly PDSI is obtained from the National Oceanic and Atmospheric Administration for the years 1895−2015 at the climate division geographic level. An area-weighted average is performed to aggregate the PDSI data to the metropolitan level. The CI is obtained by calculating the monthly correlation between an area and all other areas.

  3. (3)

    Urban classification indicator (UCI). To identify the urbanization level of a geographical area, the Urban-Rural Classification indicator of the National Center for Health Statistics is employed55. This indicator classifies counties using a scale from 1 to 6, where a value of 1 indicates the county is highly rural and a value of 6 means highly urban. The UCI is obtained at the metropolitan level using an area-weighted average of the county-level values.

  4. (4)

    Network modularity indicator (NMI). This indicator identifies geographical areas (network nodes) that belong to the same community. A community is a group of nodes whose strength interactions are stronger than with the rest of the network. To identify the network’s communities, we aggregate the flows from the four food sectors (crops, live animals, feed and meat) into a single-layer network. The communities are identified by maximizing the modularity measure of Newman56,57 using the greedy optimization algorithm of Blondel et al.58,59. Network nodes that lie in the same community are assigned a NMI of 1 and 0 otherwise.

  5. (5)

    Economic specialization indicator (ESI). Each geographical area is assigned a score based on its dominant economic supply sector. Supply is quantified using the FAF4 intranational commodity flows30. Areas with a dominant meat sector are assigned an ESI of 1, crops an ESI of 2, fruits and vegetables an ESI of 3, animal feed an ESI of 4, live animals an ESI of 5, milled grains an ESI of 6, and industrial products an ESI of 7.

Probabilities of food supply chain shock

The annual probability of food supply chain shock is calculated as the probability that food inflows to a city fall below a percentage of the average inflows for that city during 2012−20158. To compute this probability, we group all nodes from the 4 food flow networks (1,221 observations) into 6 diversity bins ordered from lowest to highest functional diversity D. The bin size is selected to obtain bins with similar number of observations, approximately 204 observations in each bin. For each city i and food sector c in a bin, we calculate the food supply chain shock \({S}_{i}^{c}\) as

$${S}_{i}^{c}=\left[1-\frac{min({I}_{i}^{c})}{\langle {I}_{i}^{c}\rangle }\right]\times 100,$$
(7)

where \({I}_{i}^{c}\) is the time series of total food inflows to node i for sector c during 2012−2015, and \({\rm{\min }}({I}_{i}^{c})\) and \(\langle {I}_{i}^{c}\rangle \) are the minimum and average values of the time series \({I}_{i}^{c}\), respectively.

For each diversity bin b, we count the number of observations nb that meet the criteria \({S}_{i}^{c} > s\) for \(s\in \{3,4,5,\ldots ,15\}\), with s being the shock intensity threshold. The probability of a food supply shock S being greater than s in bin b is calculated as:

$${P}_{b}(S > s)=\frac{{n}_{b}}{{N}_{b}},$$
(8)

where Nb is the total number of observations in bin b. Thus, for each shock intensity s, we obtain a set of probabilities of food supply chain shock,

$$P(S > s)={P}_{b}(S > s)\,{\rm{for}}\,b=\{1,\ldots ,6\}.$$
(9)

Furthermore, we adapt equations (8) and (9) to calculate the probability of a food supply chain shock S being greater than s, P′(S > s), under co-occurrence conditions. We define co-occurrence as any city that experiences a shock to 2 or more food sectors during 2012−2015. With this definition, P′(S > s) is calculated in a fashion similar to that described above. We bin the network’s nodes into 6 groups from lowest to highest diversity and determine the percentage of food supply chain shock with equation (7). Letting \({n}_{b}^{{\prime} }\) be the total number of cities for which \({S}_{i}^{c} > s\,\) holds for 2 or more food sectors, the probability of a food supply chain shock S being greater than the shock intensity s in bin b is now

$${P}_{b}^{{\prime} }(S > s)=\frac{{n}_{b}^{{\prime} }}{{N}_{b}^{{\prime} }},$$
(10)

where \({N}_{b}^{{\prime} }\) is the total number of cities in bin b. Thus, under co-occurrence conditions, the new set of probabilities for each shock intensity s is

$$P{\prime} (S > s)={P}_{b}^{{\prime} }(S > s)\,{\rm{for}}\,b=\{1,\ldots ,6\}.$$
(11)

Statistical analyses

We use the disaggregated food flow data to calculate both the probability of food supply chain shock and supply chain diversity. Therefore, we perform two complementary analyses to test whether the attractor variables used in the disaggregation are causing a circularity issue in the empirical relationship between the probability of food supply chain shock and supply chain diversity. For the first analysis, we determine the Pearson correlation between the functional distance indicators (PDI, CI, UCI, NMI and ESI) and the attractor variables (Supplementary Fig. 8). We find that the attractor variables are weakly correlated with the functional distance indicators (Supplementary Table 1). For the second analysis, we determine the Pearson correlation between the food supply chain shock intensities and attractor variables for the 4 food sectors (Supplementary Fig. 9). The attractor variables are also weakly correlated with the food supply chain shock intensities (Supplementary Table 2). Thus, circularity is not unduly influencing the empirical relationship between the probability of shock and supply chain diversity.

We also evaluate whether the empirical relationship between the probability of food supply chain shock and supply chain diversity is driven by the disaggregation of the original FAF4 data. For this, we recalculate the probability of shock and supply chain diversity using the FAF4 data. For the FAF4 data, the probability of shock also declines with rising supply chain diversity (Supplementary Fig. 10), similar to the reduction observed using the disaggregated food flow data (Fig. 1a), suggesting that the latter data are not driving the relationship.

Furthermore, we test whether the relationship between the probability of food supply chain shock and supply chain diversity holds for different demand levels. To control for demand, we stratify all the data into low, medium and high demand levels using population or food inflows as proxies for demand. For both stratifications, the bounds are chosen so that each level has approximately the same number of data points. Using the stratified data, we recalculate the Pearson correlation between the probability of shock at 3%, 5%, 10% and 15% shock intensities and supply chain diversity for each level of population or food inflows (Supplementary Table 3). We find that the relationship between the probability of food supply chain shock and supply chain diversity holds for these different demand levels (Supplementary Table 3). Using the same exponential model in equation (1) to fit the relationship for the stratified data (Supplementary Fig. 11), we determine the exponential model parameters ks and D0,s for each demand level (Extended Data Table 6). These parameters fall within the 95% confidence interval of the parameters of the exponential model in the main text (Extended Data Table 1), indicating that the model is robust.

We perform two different sensitivity analyses to assess the influence of the five distance indicators on the empirical relationship between the probability of food supply chain shock and supply chain diversity. The first analysis compares single-indicator diversity measures against the multi-indicator diversity measure calculated using all 5 indicators. Five different single-indicator diversity measures are compared, one measure for each of the 5 indicators: PDI, CI, UCI, NMI and ESI. For the second sensitivity analysis, we leave out one indicator at a time to calculate diversity using the 4 remaining indicators, which results in another 5 different diversity measures. The diversity measures for the sensitivity analyses are all calculated following the approach in the ‘Food inflows supply chain diversity’ section of the Methods. To perform the sensitivity analyses, we plot the empirical relationship between the probability of food supply chain shock and each diversity measure (Supplementary Figs. 12 and 13), and calculate the Pearson correlation of the data (Extended Data Table 7). The correlation coefficients are used to quantify the influence of the distance indicators on the relationship between the probability of food supply chain shock and supply chain diversity (Extended Data Table 7). The probabilities of food supply chain shock are calculated following the approach in the ‘Probabilities of food supply chain shock’ section of the Methods. We find that the five indicators have a varied influence on the relationship between the probability of food supply chain shock and supply chain diversity (Extended Data Table 7). The inclusion of all 5 indicators, however, in the supply chain diversity measure increases the Pearson correlation between the probability of food supply chain shock and supply chain diversity (Extended Data Table 7).