Introduction

Resilience modeling is applied in hydrology to assess the persistence of watershed systems in supplying life-supporting ecoservices when stressed by extreme conditions. There is a growing urgency to integrate resilience into watershed management as catastrophic weather, hydrologic, and environmental events—including extreme temperatures, floods, droughts, rising sea levels, and wildfires—become commonplace1,2. The US Army Corps of Engineers’ statement on “Climate Preparedness and Resilience” views “[m]anaging hydrologic extremes due to climate variability [to be] an essential mission of water management agencies3.” International organizations aspire to formulate early-warning hydrologic resilience systems in small agricultural watersheds where drought and floods are notoriously ill-predicted4.

Holling (1973)—author of the seminal paper introducing resilience into ecological research—called for resilience models to represent “a meaningful reality5.” Oreskes et al. (1994) went a step further in advocating that “the burden is on the modeler to demonstrate the degree of correspondence between the model and the material world it seeks to represent” when “public policy and public safety are at stake6.” Models untethered to reality lead to resource management that “leave[s] the real problem unaddressed, waste[s] resources, and impede[s] learning7.” Literature surveys indicate that past resilience work has not met this burden. The editors of a special hydrologic resilience issue in the Journal of Hydrology found that past studies “have not properly captured the essence of socio-hydrological dynamics in the coupled human-water context8.” Fraccascia et al. (2018) concluded more generally that “…the number of theories developed on resilience in each research field still lacks operationalization and testing9.”

A possible explanation is that past studies conventionally adopted a model-first approach, presupposing particular resilience behavior from a narrow range of alternatives without testing for real-world correspondence10,11. A meta-study by Unger (2018) extracted a broad definition of resilience as “the capacity of a dynamic system to adapt successfully to disturbances that threaten system function, viability, or development”10. The resilience literature recognizes that dynamic systems can respond to outside disturbance in alternative ways, including resuming normal function (restoration resilience)12,13,14,15 or abruptly shifting dynamic behavior into an alternative state (transformation resilience)10,16,17. Some studies straddle restoration resilience and transformation resilience11,18, creating confusion about whether “a resilient system resists adverse conditions, or…adapts to them”19. More recently, the literature has recognized that resilient systems may be open complex dynamic systems responding to adverse conditions that are self-generated by recurrent internal forces (complexity resilience)9,10,20.

Holling (1973) called for “meaningful reality” because these resilience alternatives “can yield very different approaches to the management of resources5.” Managers overseeing restoration-resilient systems cannot anticipate exogenous random shocks but can expect that volatile system adjustments will dampen over time. In this case, management has reason to not interfere with a system’s natural recovery process. Alternatively, managers overseeing complexity-resilient systems might anticipate recurrent behavior but cannot rely on these inherently unstable systems to self-correct. Muller (2021) draws a similar distinction between managing watersheds with easy and difficult hydrologies21. Easy hydrologies are characterized by stable, reliable, and predictable seasonal rainfall, river flows, and groundwater availability. This renders “the management of water for different purposes relatively easy, facilitating the conceptualization and operation of infrastructure as well as the establishment of the rules that govern the entitlements and obligations of water users.” Alternatively, difficult hydrologies are characterized by unstable, highly variable, and less predictable conditions. They require “stronger institutions as well as higher levels of physical investments in order to support basic activities such as agriculture or even simply to meet domestic needs.” Misdiagnosis of real-world hydrologic resilience behavior can lead to institutions and investments ill-crafted to achieve water security.

In this paper, we test for “meaningful reality” with a data-driven protocol that (1) reconstructs (reverse-engineers) real-world hydrologic dynamics from watershed records and (2) investigates how reconstructed hydrologic dynamics respond to extreme and changing regional climatic conditions. We subsequently match reconstructed responses against conventional resilience classes and a new classification that accounts more broadly for potential real-world dynamics associated with difficult hydrologies. Reconstructed hydrologic resilience dynamics provide the baseline for evaluating the performance of an AI-based early-warning system to forecast hydrologic resilience out-of-sample and alert watershed managers to extreme behavior. We apply the protocol to analyze the hydrologic resilience of the La Tejería watershed in Navarre Province, Spain.

Reconstructing hydrologic resilience dynamics from watershed records is challenging because records typically exhibit substantial variability22,23 that, in our case-study presented below, appear as irregular fluctuating patterns of qualitative behavior concealing underlying temporal dynamics (Fig. S1). The classical perspective is that irregular fluctuations result from interactions of a large number of degrees of freedom (d-o-f). High-dimensional irregularity is conventionally modeled as linear-stochastic dynamic systems in which steady oscillations shift in response to exogenous shocks24,25. However, the advent of deterministic chaos raised an alternative perspective: Irregular fluctuations may emerge from nonlinear-deterministic interactions of relatively few (even as few as three) d-o-f evolving along a nonlinear attractor—a geometric object bounded within a low-dimensional subset of state space26. If attractor dynamics of an n-dimensional nonlinear dynamic system are bounded within m << n dimensions, the problem of modeling long-term dynamics shrinks by the n–m inactive d-o-f27. This dimension-reducing property allows for long-term system dynamics to be captured with relatively few d-o-f regardless of overall system dimensionality without sacrificing essential information.

A recent literature survey indicates that past hydrologic resilience studies have not accounted for potential dimension-reduced complex hydrologic dynamics and consequently have modeled complexity-resilient behavior with high-dimensional ‘top-down’ dynamic systems or ‘bottom-up’ agent-based models8,28. However, it is reasonable to test for low-dimensional nonlinear deterministic dynamics in hydrologic resilience studies for three major reasons: First, watersheds are a classic example of a complex system driven by an intricate network of interacting nonlinear climatic, hydrologic, geologic, and soil processes. Complex systems have the capacity for self-organization, from which emerges ordered collective dynamic behavior not exhibited by individual components on their own. Potential emergent behavior includes low-dimensional nonlinear deterministic dynamics. Second, several recent hydrologic studies have uncovered this type of emergent dynamic behavior in watersheds29,30,31. Third, the inadvertent application of linear methods to analyze potentially nonlinear watershed records is statistically unreliable. In Fig. 1, we illustrate this by fitting a linear regression line (red line) to a scatterplot of precipitation in the La Tejería watershed against the Mediterranean Oscillation Index (MOI) climate teleconnection (black dots). We find that apparently randomly scattered regression residuals are serially dependent by labeling scattered observations by month (e.g., 1 = January) and connecting them in time sequence (grey lines). This produces a ‘boot-shaped’ geometric object composed of a leftward upper branch that precipitation and MOI values revisit in summer and fall months (i.e., months 5–10) and a rightward lower branch revisited in winter and early spring months (i.e., months 11, 12, 1–4). This geometric object represents a potential two-dimensional projection of a low-dimensional nonlinear-deterministic attractor. If so, the basic assumption of a linear relationship between these covariates is violated. Indiscriminate application of linear regression methods creates misspecification bias: A distorted version of nonlinear variation in the data passes through the linear model to structured residuals, which renders coefficient estimates biased and inefficient. Misspecification bias can be resolved only by moving to a nonlinear specification32. We apply empirical nonlinear dynamics25,33 (END) to formally test whether observed variability in watershed records is most likely due to linear-stochastic or nonlinear-deterministic dynamics. END reconstructs state-space dynamics from watershed records, diagnoses whether linear-stochastic or nonlinear-deterministic dynamics are the most likely source of irregular fluctuations characterizing observed records, and estimates the d-o-f required to portray reconstructed state-space dynamics. END—based on the premise that each variable encodes its internal interactions with covariates—can reconstruct the state-space dynamics of complex systems from a relatively small ensemble of covariates or even a single covariate without requiring data on every covariate.

Fig. 1: Visualization of potential low-dimensional nonlinear-deterministic structure in La Tejería watershed and climatic records.
figure 1

To demonstrate the statistical danger of not testing records for nonlinearity, a linear regression line (red line) was fit to a scatterplot of precipitation in the La Tejería watershed against the Mediterranean Oscillation Index (MOI) climate teleconnection (black dots). Scattered observations were labeled by month (e.g., 1 = January) and connected in time sequence (grey lines). This uncovered serial dependence taking the geometric form of a ‘boot-shaped’ object representing a potential two-dimensional projection of a low-dimensional nonlinear-deterministic attractor. Precipitation and MOI values revisit the leftward upper branch of the object in summer and fall months (i.e., months 5–10) and the rightward lower branch in winter and early spring months (i.e., months 11, 12, 1–4). Indiscriminate application of linear regression methods to this nonlinear object creates a misspecification bias that can be resolved only with a nonlinear specification.

Since noisy records obscure the detection of structured dynamic behavior, we follow past empirical nonlinear dynamic studies26,34 by isolating signal (structured variation) from noise (unstructured variation) in each record with singular spectrum analysis35 signal processing (Fig. 2a). We analyze unstructured noise probabilistically with extreme value statistics. This generates return-level plots used in hydrology to model the expected occurrence of extreme random hydrologic conditions, potentially including drought and flooding36. We test isolated signals for structured dynamics with END (Fig. 2b). First, we test hypothesis H1, that signals exhibit critical break points with singular spectrum transformation37. Acceptance of H1 indicates abrupt shifts in dynamic structure compatible with transformation resilience. Structural breaks violate nonlinear stationarity, requiring that the “duration of the measurement is long compared to the time scales of the systems”38. Nonstationary signals lack sufficient length to adequately sample dominant low-frequency cycles isolated by singular spectrum analysis and, consequently, are removed from further analysis. Second, we apply surrogate data analysis39,40 to statistically test hypothesis H2 concerning whether observed irregular fluctuations in stationary signals are most likely due to linear-stochastic rather than nonlinear-deterministic dynamics. Acceptance of H2 indicates that irregular fluctuations in isolated signals are most likely attributed to outside shocks disturbing otherwise stable behavior compatible with restoration resilience. Restoration resilience has often been studied empirically in hydrology with Budyko frameworks using hydrologic indices to assess whether watersheds resume normal hydrologic function in response to warming shifts41. Rejection of H2 leaves the door open to nonlinear-deterministic dynamics as an internal source of irregular behavior. Detection of high-dimensional dynamics is compatible with complexity resilience. Alternatively, the detection of dimension-reduced (low-dimensional) dynamics is compatible with a new classification that we introduce: low-dimensional complexity. Third, we follow emerging hydrologic research29,30,31 to test hypothesis H3 that watershed variables are causally interactive with conjectured hydrologic/climatic stressors with convergent cross mapping42. In particular, we test the extent to which global/regional climate teleconnections link large-scale atmospheric circulation patterns to variability in local hydrologic and meteorological covariates in the study area. Detection of low-dimensional complexity-resilience behavior and acceptance of H3 allows for novel analysis of hydrologic resilience to recurring climatic extremes along attractors constructed from interacting covariates.

Fig. 2: Protocol for reconstructing hydrologic resilience from watershed records.
figure 2

a Initial application of signal processing isolates the structural variation (signal) from the unstructured variation (noise) in each record. Unstructured noise is modeled probabilistically with extreme value statistics to generate return-level plots used in hydrology to model the expected occurrence of extreme random hydrologic conditions, potentially including drought and flooding. b Isolated signals are tested for dynamic structure with empirical nonlinear dynamics (END). A sequence of hypotheses is tested to reconstruct recurrent hydrologic resilience behavior from watershed signals. Hypothesis H1 tests whether signals exhibit critical break points compatible with transformation resilience. Abrupt shifts in dynamic structure suggest that a signal is non-stationary and prevent the application of END. Hypothesis H2 tests whether stationary signals are governed by linear-stochastic dynamics. Acceptance of H2 indicates that irregular fluctuations in isolated signals are most likely attributed to outside shocks disturbing otherwise stable behavior compatible with restoration resilience. Rejection of H2 leaves the door open to nonlinear-deterministic dynamics as an internal source of irregular behavior. Detection of dimension-reduced (low-dimensional) dynamics is compatible with a new classification that we introduce: low-dimensional complexity resilience. Hypothesis H3 tests whether watershed variables are causally interactive with conjectured hydrologic/climatic stressors. Detection of low-dimensional complexity-resilience behavior and acceptance of H3 allows for novel analysis of hydrologic resilience to recurring climatic extremes along attractors constructed from interacting covariates.

La Tejería watershed covers an area of 169 ha and has a humid sub-Mediterranean climate with an average annual precipitation of 725 mm occurring predominantly in the late winter and early spring, and an average annual temperature of 13 °C (Fig. 3)43. The watershed is located within a high-productivity winter grain farming area, with cereal crops covering more than 90% of its total area. La Tejería watershed is part of a network of experimental small agricultural watersheds established by the Government of Navarre (Spain) to assess the impact of soil erosion on water resources and to identify environmentally sound management practices. Instrumentation in each watershed collects time series data on variables including stream discharges, soil sediment concentrations, turbidity, precipitation, and temperatures. These watershed data have been used in several investigations of hydrologic behavior in the La Tejería watershed43,44,45,46.

Fig. 3: La Tejería watershed in Navarre Province, Spain.
figure 3

The La Tejería watershed in north-central Spain is part of a network of experimental small agricultural watersheds established by the Government of Navarre to examine environmentally sound watershed management practices. La Tejería covers an area of 169 ha and has a humid sub-Mediterranean climate with an average annual precipitation of 725 mm occurring predominantly in the late winter and early spring and an average annual temperature of 13 °C. Instrumentation in each watershed collects time series data on variables, including stream discharges, soil sediment concentrations, turbidity, precipitation, and temperatures.

Instrumentation in the La Tejería watershed provides data on water level, turbidity, and meteorological variables measured every 10 min. Sediment concentration is measured daily. The water level is converted to flow rate with the discharge rating curve calculated by the Government of Navarre. The water quality sampling procedure includes four samples collected each day that are mixed and stored in the same bottle before analysis. These samples are taken daily at 3, 9, 15, and 21 h, solar time (every 6 h). The daily water quality samples are analyzed by the Government of Navarre using standard methods44. We downloaded climate teleconnection indices found in past studies to influence meteorological conditions in the Iberian peninsula from web sources provided in Rodrigues et al. (2021) (Table S1)47. These indices include the North Atlantic Oscillation (NAO)48,49,50,51, the Atlantic Multidecadal Oscillation (AMO)47, the El Niño 3.4 SST index (ENSO)47, the Mediterranean Oscillation (MOI)48,49,51, the Pacific Decadal Oscillation (PDO)47, and the Western Mediterranean Oscillation (WeMOi)48,49,51. We resampled daily MOI and daily-averaged watershed data with monthly averages compatible with the monthly time step of the non-MOI teleconnections. We computed Fourier spectra before and after resampling of each variable to ensure that important daily or weekly variation was not averaged out. We filled in sporadic data gaps due to instrument error with the R(imputeTS) package. Finally, we standardized each time series by subtracting the mean from each observation and dividing by the standard deviation. Positive (negative) standardized values represent standard deviations above (below) the mean (the zero value). The period of record is December 2002 through December 2021 (229 months).

Results

Signal processing

Signal processing isolates the structural variation in records composed of trend and oscillatory components. Signal strength measures the percentage of total variation that a signal or a signal component accounts for in the corresponding record, with the residual percentage attributed to unstructured noise. We summarize signal processing results in Table 1 and show the graphical decompositions in Figs. S1 and S2. Of the watershed covariates, the stream discharge and temperature signals had moderate to strong signal strengths—accounting for the majority of variation in their respective records—with dominant annual oscillatory and weaker nonlinear-trend components. The sediment concentration signal was also moderately strong but had dominant nonlinear-trend and fainter annual-oscillatory components. The turbidity and precipitation signals were weaker—accounting for a third of the total variation in their respective records—and were dominated by annual oscillatory and weaker nonlinear-trend components. The majority of climate teleconnection signals accounted for substantial percentages of total variation. The strongest signals were associated with El Niño 3.4 (80%), AM0 (75%), WeMOI (51%), and MOI (48%); and the weakest signals with NAO (36%) and PDO (29%). Teleconnection signals were dominated by strong long-term trend cycles except for MOI, which was dominated by an annual oscillation.

Table 1 Signal processing results.

H1: Signals exhibit critical change points

We tested for critical change points in hydrologic and climate teleconnection signals with singular spectrum transformation37. It is implemented with a sliding change-point window partitioning a signal into past and future series centered around a reference time. Singular spectrum analysis is performed on the past and future series, and a change-point score (0 ≤ CP-score ≤ 1) is computed, indicating whether the singular spectrum decomposition substantially changes. Sliding the change-point window through the signal results in a curve of CP-scores across time periods. The statistical significance of the CP-scores along this curve is tested by bootstrapping upper 95% confidence levels52. CP-scores falling below confidence levels are statistically insignificant. The turbidity, El Niño 3.4, and NAO signals exhibited numerous CP-scores at the upper limit of one, indicating abrupt shifts in dynamic behavior violating required nonlinear stationarity (Fig. 4). They were dropped from further analysis. The sediment concentration, AMO, and WeMOI signals exhibited a few high CP-scores that were statistically insignificant or occurred at either the first or last few observations and, consequently, were deemed to be nonlinear stationary. Finally, the stream discharge, precipitation, temperature, and MOI signals exhibited extremely low CP-scores, indicating nonlinear stationarity.

Fig. 4: Testing signals for critical break points indicating transformation resilience.
figure 4

Singular spectrum transformation detects critical change points with a sliding change-point window partitioning a signal into past and future series centered around a reference time. Singular spectrum analysis is performed on the past and future series, and a change-point score (0 ≤ CP-score ≤ 1) is computed, indicating whether the SSA decomposition substantially changes. CP-scores below bootstrapped upper 95% confidence levels fail to indicate statistically significant breakpoints.

H2: Signals are generated by linear-stochastic dynamics

We computed pseudo-phase-space (pps) surrogate data vectors that test for noisy linear dynamics in cyclical records53. We reconstructed a shadow attractor for each stationary watershed and teleconnection signal and each surrogate data vector using time-delay embedding54. According to Takens’ Theorem, time-delay embedding provides a 1–1 mapping of system dynamics from the original real-world state-space to a reconstructed shadow state space so long as the latter has sufficient dimensions to contain the original attractor54. Since we do not directly observe the dimension of the real-world attractor, we followed convention in estimating the embedding dimension with the false nearest neighbors test55, and the embedding delay as the delay, giving the first minimum of the mutual information function55. A ‘low’ estimated embedding dimension is compatible with low-dimensional complexity resilience, indicating that nonlinear dynamics can be adequately modeled with relatively few d-o-f. Alternatively, a ‘high’ estimated embedding dimension is compatible with high-dimensional complexity resilience, indicating nonlinear dynamics requiring many d-o-f. We selected two discriminating statistics conventionally used to distinguish nonlinear-deterministic from linear-stochastic performance: nonlinear prediction skill—measured by Nash–Sutcliffe Efficiency (NSE)—and permutation entropy—measured by a modification of the Shannon H statistic. Since NSE statistics approaching one indicate higher skill, we specified an upper-tailed test of the null hypothesis that attractors reconstructed from watershed and teleconnection signals predict with no more skill than surrogate attractors reconstructed from linear-stochastic surrogate data. Since larger H statistics reflect more random behavior, we specified a lower-tailed test for permutation entropy. We applied rank-order statistics with a significance level α = 0.0540.

We summarize surrogate data results in Table 2. When permutation entropy is the discriminating statistic, we reject the null hypothesis for every watershed and climate teleconnection signal since H computed for each is below the corresponding lower-threshold value bounding from above the bottom-ranked surrogate attractors. When nonlinear prediction skill is the discriminating statistic, we reject H2 for every signal (except for stream discharge) since the NSE achieved by each surpasses the corresponding upper-threshold value exceeded by the top-ranked surrogate attractors. The nonlinear prediction test was borderline acceptance of H2 for stream discharge, but since the permutation entropy test more soundly rejected H2, we made the judgment call to reject H2 for stream discharge. Given the strength of the overall results, we confidently reject the null hypothesis of linear-stochastic dynamics for every tested signal. Nonlinear-deterministic dynamics and complexity resilience remain likely. We estimated low embedding dimensions of m = 4 for AMO and WeMOI and m = 3 for the other signals, indicating low-dimensional complexity resilience.

Table 2 Surrogate data results.

H3: Hydrologic variables are causally linked to climatic stressors

We follow emerging hydrologic research empirically testing for causality among watershed and climatic signals29,30,31. We applied convergent cross mapping42 (CCM) since it is designed to detect causality in low-dimensional nonlinear-deterministic systems. The logic behind CCM is that interacting variables belong to the same real-world dynamic system. Consequently, attractors empirically reconstructed from interacting variables map one-to-one with the same real-world attractor and thus map one-one with each other by Euclid’s axiom that two things equal to the same thing must equal each other. CCM tests for a one-to-one mapping between reconstructed attractors by measuring the skill with which one attractor can be used to cross-predict values on the other. Following Sugihara et al., we measure CCM prediction skill by the Pearson correlation coefficient—a conventional statistical tool for measuring linear correlation between two variables.

In Fig. 5, we summarize detected causal interactions in a community interaction diagram whose nodes represent interacting covariates and arrowed edges display directional interactions. The strength of each interaction is given by the correlation coefficient near each arrowed edge. The stream discharge variable was driven strongly by the temperature signal (0.75), moderately by the precipitation signal (0.53), and more weakly by the MOI (0.33) and AMO (0.21) regional climate teleconnections. Precipitation measured in the watershed was driven with moderate strength by stream discharge (0.61) and more weakly by local temperature (0.42), and MOI (0.42). These results provide empirical evidence that recurring large-scale air pressure and circulation patterns represented in the MOI and AMO climate teleconnection indices play a detectable role in systematically influencing local hydrologic and weather covariates in the La Tejería watershed. The detected bi-causal relationship between stream discharge and precipitation provides further empirical support for land-atmosphere interactions in which surface moisture systematically influences weather “through its impact on evaporation and other surface energy fluxes56.”

Fig. 5: Causal interaction among climatic and hydrologic covariates.
figure 5

We applied convergent cross mapping (CCM) to empirically test for causal interactions among watershed and climatic signals29,30,31. The nodes of the community interaction diagram represent interacting covariates, and arrowed edges display directional interactions. The strength of each interaction as computed by CCM, is given by the correlation coefficient near each arrowed edge.

Nonlinear recurrent resilience analysis

A generalized version of Takens’ Theorem permits embeddings to include multiple covariates and their delayed copies while guaranteeing invariant state-space dynamics57. We can investigate detected low-dimensional complexity resilience from multiple perspectives depending on the interactive watershed and climate teleconnection signals used to construct state-space attractors. Since the estimated embedding dimension for each signal was predominantly m = 3 (Table 2), we constructed attractors from combinations of at least three interactive signals. Moreover, since precipitation and stream discharge interacted with several drivers (blue nodes in Fig. 5), we constructed attractors to focus on the resiliency of precipitation and stream discharge to at least two of their respective drivers.

We constructed an attractor from interacting precipitation (P), temperature (T), and MOI signals (grey line) to examine the resiliency of precipitation to extremes in local temperature and large-scale climatic patterns represented in the MOI (Fig. 6a). The coordinate axes of the attractor are provided by these three signals, and each multidimensional point on the attractor scatterplots monthly values of each signal. Each point was labeled by month (e.g., 1 = January), and the points connected in time sequence. As in Fig. 2, we observe a ‘boot-shaped’ attractor composed of a rightward lower branch that signals revisited in winter and early spring months (i.e., months 11, 12, 1–4) and a leftward upper branch revisited in summer and fall months (i.e., months 5–10). The upper (lower) panel highlights points on the attractor with extreme lows (highs) in precipitation (blue dots), temperature (red dots), or MOI (green dots). Extreme lows (highs) in each variable were taken as values occurring in the lower (upper) 15th percentile. Along the attractor, precipitation extremes evolved counter-cyclically to temperature and MOI extremes: Precipitation lows (upper plot) and temperature highs (lower plot) occurred predominantly in the warm summer and fall months when the MOI was in an extreme high phase (lower plot) in this watershed. Alternatively, precipitation highs (lower plot) and temperature lows (upper plot) occurred predominantly in the cold winter and early spring months when MOI was in its extreme low phase (upper plot). This detected behavior accurately reflects the ‘wet-cold’ and ‘dry-warm’ monthly weather patterns known to characterize the Mediterranean49 and demonstrates how the MOI climate teleconnection systematically influenced local weather extremes.

Fig. 6: Resilience along climatic-hydrologic system attractors.
figure 6

a To focus on the resilience of precipitation to extremes in other climatic variables, a watershed attractor was constructed from the perspective of precipitation (P), temperature (T), and the Mediterranean Oscillation Index (MOI) climate teleconnection. The coordinate axes of the attractor are provided by these three signals, and each multidimensional point on the attractor scatterplots the monthly values of each signal. Each point is labeled by month (e.g., 1 = January), and the points are connected in time sequence. Signals revisited the rightward lower branch of the attractor in winter and early spring months (i.e., months 11, 12, 1–4) and the leftward upper branch in summer and fall months (i.e., months 5–10). The upper (lower) panel highlights points on the attractor with extreme lows (highs) in precipitation (blue dots), temperature (red dots), or MOI (green dots). Extreme lows (highs) in each variable were taken as values occurring in the lower (upper) 15th percentile. The plots show that precipitation adjusted countercyclically to extremes in temperatures and MOI. Precipitation lows (upper plot) and temperature highs (lower plot) occurred predominantly in the warm summer and fall months when the MOI was in an extremely high phase (lower plot). Alternatively, precipitation highs (lower plot) and temperature lows (upper plot) occurred predominantly in the cold winter and early spring months when MOI was in its extreme low phase (upper plot). This corresponds well to the ‘wet-cold’ and ‘dry-warm’ monthly weather patterns known to characterize the Mediterranean. b(i) Stream discharge (Q) and precipitation (P) extremes evolved countercyclically to MOI extremes with extreme lows (highs) occurring in the dry (wet) months when MOI was at an extremely high (low) phase. b(ii) Stream-discharge extremes evolved countercyclically to temperature (T) and MOI extremes: Stream discharge lows (upper plot) occurred in the dry-warm months when temperature and MOI were at extreme highs. Stream discharge highs (lower plot) occurred in the wet-cold months when temperatures and MOI were at extreme lows. c Substituting the Atlantic Multidecadal Oscillation (AMO) for MOI produces qualitatively identical dynamics.

In Fig. 6b, we focus on the resiliency of stream discharge (Q) to extremes in (i) local precipitation (P) and MOI and (ii) local temperature (T) and MOI. Similar to Fig. 6a, the attractors constructed in these two cases are boot-shaped, with points on the attractor alternating between wet months (rightward lower branch) and dry months (leftward upper branch) annually. In Fig. 6b(i), stream-discharge and precipitation extremes evolved countercyclically to MOI extremes with extreme lows (highs) occurring in the dry (wet) months when MOI was at an extremely high (low) phase. In Fig. 6b(ii), stream-discharge extremes evolved countercyclically to temperature and MOI extremes: Stream-discharge lows (upper plot) occurred in the dry-warm months when temperature and MOI were at extreme highs. Stream discharge highs (lower plot) occurred in the wet-cold months when temperatures and MOI were at extreme lows. In Fig. 6c, we observe identical qualitative dynamics when AMO was substituted for MOI.

AI-based early-warning hydrologic resilience system

We formulated an explainable-AI-based early-warning system of hydrologic resilience to climatic stress (EWS) to forecast hydrologic resilience out-of-sample and alert watershed managers to extreme behavior. By explainable AI, we mean that echo-state neural networks58 (ESNN) were used to reproduce and forecast recurrent resilience behavior reconstructed with our protocol from low-dimensional components and causal networks detected in observed watershed records. In Fig. S3a–c, we show performance plots for ESNN reproductions of the resilience attractors constructed in Fig. 6. Each plot shows the portion of the signals comprising the attractor allocated to the training set (black curve to left of box), the portion remaining in the testing set (boxed area), ESNN in-sample predictions (yellow curve in box), and ESNN two-year out-of-sample forecasts (violet curve to right of box). In each plot, ESNN predicted with almost-perfect skill (NSE > 0.97). Forecasts largely preserved oscillatory behavior observed in corresponding signals. In a demonstration of dynamic correspondence, state-space trajectories reconstructed from ESNN in-sample predictions (blue line) and out-of-sample forecasts (red line) largely rested on attractors constructed from in-sample signals (grey line) (Fig. S3d).

In Fig. 7, we display hydrologic resilience dynamics forecasted by the EWS for the La Tejería watershed in a more management-friendly format. The plots show individual signals (black line) comprising the precipitation attractor (Fig. 7a) and stream discharge/MOI attractor (Fig. 7b) analyzed in Fig. 6a, b. The plots include corresponding observed records (blue line) and two-year ESNN signal forecasts (violet area). Extreme signal values that occur outside of the upper/lower 15th percentile (dashed horizontal lines) are marked with black dots. Extreme values of corresponding observed records occurring outside of these percentiles are marked with blue dots. The plots labeled ‘forecasts’ zoom in on forecasted signals on these attractors along with forecasted extreme values. In Fig. 7a, EWS-forecasted signals continue to exhibit the recurring countervailing relationship that precipitation has with temperature and MOI. Extreme forecasted lows (highs) in precipitation occurring during the dry-warm (wet-cold) months visually correspond with extreme forecasted highs (lows) in temperatures and MOI. In Fig. 7b(1), EWS-forecasted stream-discharge and precipitation signals faithfully reproduce a countervailing relationship with MOI. In Fig. 7b(2), the EWS-forecasted stream-discharge signal continues to interact countercyclically with the temperature and MOI signals.

Fig. 7: AI-based early warning system of hydrologic response to stress.
figure 7

The plots show individual signals (black line) comprising the precipitation attractor (a) and stream discharge/MOI attractor (b). The plots include corresponding observed records (blue line), and two-year signal forecasts made with echo state neural networks (violet area). Extreme signal values occur outside of the upper/lower 15th percentile (dashed horizontal lines) are marked with black dots. Extreme values of corresponding observed records occurring outside of these percentiles are marked with blue dots. The plots labeled ‘forecasts’ zoom in on forecasted signals on these attractors along with forecasted extreme values. a Forecasted signals continue to exhibit the recurring countervailing relationship that precipitation has with temperature and MOI. Extreme forecasted lows (highs) in precipitation occurring during the dry-warm (wet-cold) months are driven by extreme forecasted highs (lows) in temperatures and MOI. b(1) Forecasted stream-discharge and precipitation signals faithfully reproduce a countervailing relationship with MOI. b(2) The forecasted stream-discharge signal continues to interact countercyclically with the temperature and MOI signals.

Probabilistic modeling of unstructured noise isolated in watershed and climatic records

We completed the hydrologic resilience analysis of the La Tejería watershed by applying extreme value statistics36 to probabilistically model the response of watershed variables to extreme events captured in the unstructured noise component isolated from observed records in signal processing. Application of extreme value statistics requires that extreme events be independent and identically distributed (iid) random variables or that artificial ad hoc restrictions be imposed to control for serial dependencies59,60. Since isolated noise is unstructured, we did not need to impose such restrictions on observed watershed records. We apply the peaks over the threshold version of EVS that computes the likelihood of extreme discrepancies exceeding a selected threshold value within a given time interval. In theory, exceedances follow a generalized Pareto (GP) distribution59. Quantile–quantile (Q–Q) plots are used to determine the fit of the GP distribution to the data. Given a reasonable fit, the estimated GP distribution can be inverted to solve for quantiles, providing a useful noise diagnostic: return-level plots.

In Figs. S1 and S2, we plot noise levels measured as the vertical difference between standardized observed and signal values in each month. Positive (negative) noise occurs when observed values are greater (less) than corresponding signal values. We defined positive noise extremes as those exceeding the upper 15th percentile threshold and the absolute value of negative noise extremes in the same manner. The return-level plot estimates the expected time (return time) before extreme noise of various magnitudes (return level) is expected. In Fig. S4a, b, we show computed return-level plots in which return levels increase at a decreasing rate with return time as generally expected. In Fig. S4c, we highlight positive and negative noise extremes expected over 1-year, 2-year, and 3-year return times. Consider, for example, positive noise extremes in the stream-discharge record. The upper 15th percentile threshold is 0.41 (standard deviations), with extreme randomly occurring stream discharges of 0.77, 1.22, and 1.51 expected within 1-year, 2-year, and 3-year intervals, respectively.

Discussion

Hydrologic resilience modeling has emerged as a popular technique to assess the ability of watershed systems to continue supplying life-supporting ecoservices under extreme conditions. Its increasing use is undeterred by literature surveys finding that current models fail to capture essential hydrologic dynamics and incorporate adequate testing. We addressed these shortcomings by formulating an empirical protocol to reconstruct hydrologic dynamics from observed watershed records and analyze the response of reconstructed dynamics to extreme regional climatic conditions. We devised an AI-based early-warning system providing skillful out-of-sample forecasts of reconstructed hydrologic resilience dynamics, reliably alerting watershed managers to future recurrent hydrologic and climatic extreme events. We applied the protocol and early-warning system to reconstruct and forecast hydrologic resilience to the extreme climate in the La Tejería (Spain) experimental watershed.

Our results provide compelling empirical evidence that irregular fluctuating patterns observed in La Tejería watershed records conceal emergent low-dimensional nonlinear-deterministic hydrologic dynamics and, consequently, that watershed dynamics are regulated endogenously by internal nonlinear hydrologic, geologic, soil, and climatic processes. Low-dimensional nonlinear dynamic systems do not fit within conventional resilience classes in the literature. Restoration-resilient systems resume normal function after outside disturbance. Complexity-resilient systems are conventionally taken to be high dimensional. In contrast, we reconstructed a low-dimensional complexity resilient system that undergoes irregular oscillations confined to a low-dimensional watershed attractor in response to internal stressors. Thanks to the Generalized Takens’ Theorem57, we could construct watershed attractors from different combinations of interacting watershed and climatic covariates and consequently analyze resilience behavior from multiple perspectives. We constructed low-dimensional nonlinear attractors to investigate the resilience of stream discharge and local precipitation to their respective detected internal stressors. In both cases, empirically reconstructed watershed resilience behavior faithfully captured monthly wet-cold and dry-warm weather patterns known to characterize the Mediterranean region. Both stream-discharge and local precipitation extremes evolved countercyclically to extreme temperature and extreme phases of the MOI climate teleconnection index. An AI-based early-warning system skillfully learned these complex resilience dynamics, indicating that the 2-year out-of-sample forecasting period was not substantially compromised by sensitivity to initial conditions.

We emphasize two final points: First, our objective was to demonstrate how hydrologic resilience dynamics could be reconstructed from records in a case-study watershed—not to demonstrate that our particular finding of nonlinear dynamics in the La Tejería watershed would hold for multiple watersheds. We cite Muller (2021) in the paper to support the idea that watershed dynamics may differ substantially depending on distinguishing characteristics. Consequently, we do not expect that empirical nonlinear dynamics would necessarily reconstruct low-dimensional attractors from all observed watershed records. Some watersheds may not evolve along a low-dimensional attractor, or available records may be too noisy or short to adequately sample irregular oscillations on a real-world attractor. Second, we do not envision empirical nonlinear dynamic modeling as a replacement for subsequent conceptual modeling based on first principles. Rather, it provides a rigorous empirical benchmark guiding model specification tethered to real-world conditions. This benchmark includes a geometric picture of resilience dynamics that conceptual models should reproduce and an estimate of the minimum model dimensionality required to do so.

Methods

Signal processing

Singular spectrum analysis is a data-adaptive signal processing method accommodating highly anharmonic oscillations in irregular records33,35. It separates structured variation (signal) composed of trend and oscillatory components from unstructured variation (noise). The strength of a component is measured by its contribution to total variation in the record. Singular spectrum analysis runs in three stages: decomposition, grouping, and reconstruction. In the decomposition stage, a time-series record is transformed into a trajectory matrix whose columns are the observed record followed by its forward-delayed copies. The dimension of the trajectory matrix depends on the window length parameter (L), typically estimated to be about half of the record length. Singular value decomposition is used to decompose the trajectory matrix into the sum of new matrices. Each matrix is the product of an eigenvalue and corresponding right and left eigenvectors. In the grouping stage, diagnostics provided by singular value decomposition are used to split the new matrices into groups corresponding to various signal and noise components of the records. Finally, in the reconstruction stage, matrix groups are reconverted to time series vectors with diagonal averaging.

Time-delay embedding

We applied time-delay embedding54 to reconstruct a shadow attractor from a single time series record. Initially, an embedded data matrix is constructed whose first column contains the observed record, and the remaining columns contain time-delayed copies of the observed signal serving as surrogates for excluded system variables. The number of columns is called the embedding dimension, and the delay length between columns is the embedding delay. The columns are coordinate axes in state space, and the rows are multidimensional points on a shadow attractor. Takens (1980) formally proved that time-delay embedding provides a 1–1 mapping of system dynamics from the original real-world state-space to the reconstructed shadow state space so long as the latter has sufficient dimensions to contain the original attractor. Since the dimension of the real-world attractor is unknown, the embedding dimension is conventionally estimated with the false nearest neighbors test55, and the embedding delay is estimated as the delay, giving the first minimum of the mutual information function55.

Surrogate data testing

We generated surrogate data vectors to test whether an apparent nonlinear structure reconstructed from a record was more likely due to a mimicking linear-stochastic dynamic process. Surrogate data vectors destroy the temporal structure of a record while preserving statistical properties compatible with linear-stochastic dynamics. We computed PPS surrogates to test for noisy linear dynamics in cyclic records53. Shadow attractors are reconstructed from each surrogate data vector and compared with the shadow attractor reconstructed from a record based on discriminating statistics representing hallmarks of nonlinear-deterministic behavior. We selected two conventional discriminating statistics: nonlinear predictive skill61 and permutation entropy62. We specified an upper-tailed hypothesis for the nonlinear prediction skill test since attractors reconstructed from watershed and teleconnection signals are expected to predict with more skill than surrogate attractors reconstructed from linear-stochastic surrogate data. We measure predictive skill with the Nash–Sutcliffe model efficiency (NSE)63, which denotes perfect skill when NSE = 1. Permutation entropy modifies the classic Shannon H measure of the information in a time series for application to finite noisy data. Since lower H statistics reflect more structured behavior, we specified a lower-tailed test for permutation entropy. We applied rank-order statistics to test for significant differences in nonlinear performance40. An ensemble of S = (k/α) − 1 surrogate is generated, where α is the probability of false rejection, and k controls the number of surrogates and the sensitivity of the test. We set \(\alpha =0.05\) and k = 20 and accepted the null hypothesis of stochastic cycling dynamics if the NSE (H) taken from the attractor reconstructed from the rill signal did not fall in the upper (lower) k corresponding values taken from the ensemble of S = 399 surrogate attractors. If we reject the null hypothesis, untested dynamic structures (including nonlinear-deterministic dynamics) remain viable.

Convergent cross mapping

We applied convergent cross mapping42 (CCM) to test whether shadow attractors reconstructed from different observed signals reconstruct the same real-world dynamic. If so, then the corresponding signals are deemed to causally interact in the same real-world dynamic system. Shadow attractors empirically reconstructed from interacting variables map one-to-one with the same real-world attractor and thus map one-to-one with each other. CCM tests for a one-to-one mapping by measuring the skill with which an attractor (Mx) reconstructed from a covariate X can be used to cross-predict values on another attractor (My) reconstructed from a second covariate Y. Predictive skill should converge as the number of points used to reconstruct Mx increases. Prediction skill is measured by the Pearson correlation coefficient. Sugihara et al. perform cross-mapping with a simplex-projection algorithm.

Echo-state neural networks

Echo state neural networks (ESNN) are an artificial recurrent neural network (RNN) method comprised of a randomly generated reservoir of neural states mapping each point on an attractor sequentially into a high-dimensional space, and a read-out providing reservoir predictions64,65. Reservoir computing is fast learning with low-training cost since the reservoir is fixed and only the readout is trained. ESNN operation is outlined in Fig. S5. To start, the embedded data matrix of a reconstructed attractor is split into training and testing sets. In the training mode (Fig. S5a), rows of the training set (i.e., points on the attractor) are progressively inputted through an input coupler into a randomly generated reservoir composed of N neurons, \({x}_{i}\left(t\right),i={{{{\mathrm{1,2}}}}},\ldots N\), where t represents an iteration. In each iteration, updated scalar activation values are collected in a neuron activation matrix, X. The training set is regressed against X to estimate the coefficient matrix, \({W}_{{{{{\rm{out}}}}}}\), used below to formulate the readout component. The figure gives the formula for the conventionally applied linear ridge regression method. In the testing mode (Fig. S5a, lower branch), rows of the testing set are progressively inputted into the reservoir, and a linear readout component generates predicted values, \(y\left(t\right)={W}_{{{{{\rm{out}}}}}}x\left(t\right)\), where \(x\left(t\right)\) is the vector of activations at iteration t. The neuron activations x(t) are updated at each iteration, and the updated predictions y(t) are collected in prediction matrix Y. The first column of Y is the ESNN prediction of a signal contained in the first column of the embedded data matrix, and the rows of Y are ESNN predicted points on the testing portion of the reconstructed attractor (i.e., rows of the embedded data matrix in the test set). To assess ESNN predictive skill, we measure the goodness-of-fit between each column of Y and its counterpart in the embedded data matrix with the Nash–Sutcliffe Efficiency Index63 commonly applied in hydrology. To operationalize ESNN, values must be selected for an array of architectural hyperparameters, including the fraction of attractor points in the training set (tau), the number of reservoir states (N), the spectral radius of the reservoir (rho.scale), the leaking rate regulating iterative updating of neural states, the Tikhonov regularization coefficient required for ridge regression (reg), and scalar intervals (a and b) setting boundaries for uniform random elements used to generate an input-to-reservoir coupler matrix and an adjacency matrix (i.e., the reservoir), respectively. These hyperparameters, their roles in ESNN computation, and reasonable ranges of values are detailed in 64. To set these hyperparameters, we automated a large grid search employing high-performance computing. We uniformly sampled the hyperparameter grid with Sobol GSA sampling methods66. We identified the ensemble of sampled ESNN architectures generating state-space attractors corresponding to an attractor reconstructed from watershed records; where our indicator of correspondence was whether embedded ESNN predictions rested on the reconstructed attractor (Fig. S3d). In generation mode (Fig. S5b), skillful parameterizations were used to forecast points on an attractor out-of-sample by feeding back in forecasted values. Hyperparameter values supporting the ESNN-reproduced attractors appearing in Fig. S3 are reported in Table S1.