1 Introduction

Natural hazards are varied: they include meteorological hazards such as hurricanes, tornadoes, thunderstorms, and downbursts; seismic hazards such as ground shaking and liquefaction; and hydrological hazards such as storm surges, tsunamis, and inland floods. Some of these hazards occur simultaneously or in sequence, for instance hurricane and storm surge, or earthquake and tsunami. The risk posed by these hazards depends on the intensity and other characteristics of the hazard, the amount of man-made exposure or infrastructure subjected to the hazard, and the vulnerability of that infrastructure (Shi et al. 2010). Figure 1 illustrates how risk lies at the intersection of hazard, exposure, and vulnerability. Whether a natural hazard becomes a disaster depends on the extent and the vulnerability of the human communities in its path.

Fig. 1 Components of disaster risk at the intersection of hazard, exposure, and vulnerability

The different components of the risk define the research themes under the umbrella of natural hazard engineering and disaster risk management. Key research challenges in quantifying risk and developing effective mitigation measures include understanding the interaction between the infrastructure inventory, multiple hazards, and their effects on infrastructure performance (that is, vulnerability), and the associated impacts on socioeconomic systems (Aitsi-Selmi et al. 2016). Assessment and management of natural disaster risk therefore require multiple layers of knowledge and expertise from different domains. Within each domain lie unique considerations when multiple cascading and correlated hazards are of interest, with a subset of these issues indicated in Fig. 1. Hazard scientists need to understand the characteristics of a hazard to assess its magnitude and potential destructive impact. This requires extensive monitoring and measurement, sometimes over very long periods, with a vast array of instrumentation (Kohler et al. 2017). For example, characterization of one or multiple hazards, such as a combined wind-surge-wave-rainfall event, requires consideration of the timing and duration of the hazards, correlations in their intensities, and evaluation of their spatial variation across an affected region (Balderrama et al. 2011). Only then can the compounding actions of these multiple hazards on infrastructure, such as bridges or residential construction, be evaluated. For that to happen, demographers and surveyors must inventory the exposure, an assessment that can involve millions of structures. Engineers need to quantify the vulnerability of the exposure in order to assess the impact of the hazard on the built environment (Wartman et al. 2018). To model the response of structures to the hazard, they need to assess the capacities of the different components of the building and infrastructure inventory, through experiments, simulations, or a combination of both. After a disaster, engineers and scientists survey the affected area and collect extensive reconnaissance data (Chen et al. 2014, 2016; Pinelli et al. 2018). Finally, social scientists need to understand a community's response to a possible hazard, including how culture, demography, wealth, and a variety of other human factors can affect the preparedness and resilience of a community (Welton-Mitchell 2018). These scientists also carry out extensive community surveys after a disaster.

Hence, for a holistic assessment of community resilience, disaster risk managers need to integrate different sources of expertise and data across the various thematic domains. This involves the management and analysis of large multidisciplinary datasets that correspond to different kinds of natural hazards (for example, wind speeds or ground acceleration records), man-made exposure (for example, building inventories or utility networks), and measures of their physical and social vulnerability. These datasets are not only multidisciplinary in content; their source locations, types, structure, and documentation also vary widely. Different domain experts must collaborate, reuse, and share data, often in the midst of emergency crises. Storing and archiving these very large datasets so that the data can be found and integrated is extremely challenging. Data models are needed to preserve the data and assign metadata, so that the data become searchable and interoperable. Moving the data between different platforms or machines is cumbersome and often impossible. A variety of tools, especially geolocation visualization tools, is needed to process and analyze the data. Finally, high performance computing (HPC) is necessary to satisfy the computational needs of the risk assessment community in an effective and timely way, at a regional scale.

In response to that challenge, the United States National Science Foundation (NSF) has sponsored the development of DesignSafe, a cloud-based cyberinfrastructure platform that provides computational tools to manage, analyze, understand, and share critical data for natural hazards research and management. DesignSafe is the heir to the Network for Earthquake Engineering Simulation (NEES), which addressed only the needs of the seismic community (Pejša et al. 2014). DesignSafe was conceived as a shared-use research infrastructure to enable transformative research in natural hazards engineering and disaster risk management. It is part of the Natural Hazards Engineering Research Infrastructure (NHERI) funded by NSF. The purpose of NHERI is to integrate seven large natural hazards experimental facilities across the United States with simulation laboratories and disaster reconnaissance efforts, to enable researchers to explore and test ground-breaking concepts to protect homes, businesses, and infrastructure lifelines from earthquakes and windstorms. This integration would enable innovations to help prevent natural hazards from becoming societal disasters.

Similar efforts at cloud-based databases include OpenTopography, which facilitates community access to high-resolution, earth science-oriented topographic data and related tools and resources (Robinson et al. 2017). ScienceBase, sponsored by the U.S. Geological Survey (USGS), is a data cataloging and collaborative data management platform. CyVerse provides life scientists with powerful computational infrastructure to handle huge datasets and complex analyses, thus enabling data-driven discovery. The Extreme Events Web Viewer, at the University of Alabama, was created to store collected disaster data in a spatial, temporal, and publicly available format (Crawford et al. 2018). The Digital Environment for Enabling Data-Driven Science (DEEDS) is a full-service platform that provides end-to-end support to scientific and engineering communities in fields such as agriculture, electrical engineering, civil engineering, health and human services, and computer science (Catlin et al. 2019).

DesignSafe differs from these other efforts in that its primary focus is to serve the NHERI community and to amplify and link the capabilities of the NHERI partners, although it welcomes participants from the world at large. Its vision is to be an integral part of natural hazards research discovery worldwide, including disaster risk management. To achieve that vision, it supports end-to-end research workflows and the full research lifecycle, including data sharing and publishing, and provides cloud-based tools that support the analysis, visualization, and integration of diverse data types. The software development team is located at the Texas Advanced Computing Center (TACC) at the University of Texas, with a team of natural hazards researchers from the University of Texas, the Florida Institute of Technology, and Rice University comprising the senior management team. Additional input is solicited from the broader natural hazards community, facilitated in part by data and simulation requirements team members with expertise in various domains of natural hazards engineering.

DesignSafe has reached a level of maturity over its nearly five years of ongoing development. The current release is version 4.6. There were 4265 users as of December 2019, spanning researchers, nongovernmental organizations (NGOs), government, and practitioners. Although most of these participants are from the United States, DesignSafe also has users from 90 different countries. DesignSafe holds over 150 TB of user data, with about 10 TB published. DesignSafe is eager to develop collaborations with foreign partners. For example, it has a strong collaboration with QuakeCORE in New Zealand, which uses DesignSafe for its data management and publishing, as well as its applications. DesignSafe also collaborates with E-Defense in Japan and with Tokyo Polytechnic University, which provides interaction with that university's wind databases (Kwon et al. 2016).

The Texas Advanced Computing Center is committed to sustaining the infrastructure beyond the life of the NSF funding under a couple of possible scenarios. First, if in the future NSF awards the funding to a different organization, TACC will sustain the infrastructure until the transition to the next awardee is complete. If NSF were to stop funding altogether, TACC would sustain the infrastructure at the release version at which the funding ends, until software technology changes cause the site to no longer function; the data, however, will remain available as long as TACC exists.

Rathje et al. (2017) offers a detailed technical description of the components of DesignSafe, while this article provides an overview of the cyberinfrastructure's role in disaster risk management. In the following sections, we review the DesignSafe cyberinfrastructure architecture and the associated design principles relevant for supporting the disaster risk management community. The potential to enable transformative, multidisciplinary disaster risk research is also discussed while highlighting some of the unique functionalities of the cyberinfrastructure. The article concludes by presenting case studies that illustrate the impact of DesignSafe on different segments of the disaster risk management community, and by inviting discussion on how to harness the resources of the cyberinfrastructure for present and future needs.

2 DesignSafe Architecture and Design Principles

The DesignSafe web portal provides access to the cyberinfrastructure capabilities and functionalities. The portal and associated features were conceived with a number of fundamental design principles in mind.

First, the cyberinfrastructure (CI) is designed to be flexible and extensible, which is essential to integrate new tools, adapt to new data types, and support alternative workflows. DesignSafe seeks to provide a venue for Internet-scale collaborative science. By embracing a cloud strategy for the big data generated by hazards engineering, researchers can take advantage of significant server-side resources for data storage, simulation, and analysis. Such large-scale data management and access to high performance computers support many of the workflows centered on resilience assessment in the face of concurrent and cascading hazards (see the case studies presented later). A second key principle in the development of DesignSafe is support for the full data and research lifecycle. The platform is not intended to be simply a repository where data are archived at the end of a research project; rather, the CI provides a comprehensive environment for data collection, analysis, collaboration, curation, and publishing. The goal is to empower researchers to work, in an organized fashion, with the data they know intimately. To this end, data are curated progressively from the beginning of and throughout the research lifecycle, after which they are made publicly available to others who can find and reuse them. For example, users in the field can send their data to DesignSafe as their connectivity allows, and curation, preprocessing, and analysis of the data can occur seamlessly. While on-demand assistance from curators and HPC specialists is available, the goal is for researchers to access services and tools within an intuitive research environment.

The original design of, and lessons learned from, the successful DesignSafe project led to the Core Experience Portal (CEP), which is the core engine for all future TACC portal projects. It provides an "out-of-the-box" framework for rapid deployment of a new portal project with all the common capabilities already in place. The portal manages massive datasets, enables the execution of highly parallelized computations, and allows users to collaborate across the United States. The portal is not just for scientists and engineers; TACC's resources can be used by technologists, industry partners, educators, humanities scholars, and others. An example of a CEP-based portal project is the Estimate Circulation and Climate of the Ocean portal, which enables the execution of highly parallelized computations and national collaboration of users.

2.1 DesignSafe Research Workbench and Data Depot

Figure 2 describes the various components of DesignSafe and their interaction with different data sources. These sources include simulation research, experimental research, hybrid research at the intersection of simulation and experimental research, field reconnaissance research, and field experimental research at the intersection of experimental and field reconnaissance research. At the heart of the DesignSafe web portal lies the Research Workbench. Through this workbench researchers access the Data Depot, which acts as a flexible data repository of the major data sources. Here, the full research lifecycle is supported—from data generation to publication—offering researchers access to private, shared, and public space for their data.

Fig. 2 The DesignSafe structure that connects data sources with the research workbench

2.1.1 Capabilities of the Data Depot

Given the complexity of disaster risk analyses, which might involve multiple hazards and interdisciplinary investigations, it is likely that data from distinct sources need to be integrated through combinations of software tools. For example, relevant multi-hazard data objects may include sensor data of structural response resulting from combined surge and wave loading; animation files of simulated tsunami waves generated following earthquakes; or pictures of bridge damage from field reconnaissance of combined rainfall and landslide events. The vision for DesignSafe is that users complete any number of data functions within the platform seamlessly. Users can store, share, curate, publish, find, and reuse relevant data published on the platform, combine it with data from external open sources, analyze it with the suite of tools available on the research workbench, and publish their own results in DesignSafe. Curation of the data with descriptive metadata can proceed progressively throughout the research process, offering flexibility to users while promoting sharing, searching, and collaboration.

Accomplishing this data lifecycle vision (DCC 2019) for purposes of advancing risk management involves several CI components. Chief among them are the data/metadata models, which are the foundation of the platform's interactive pipelines for organizing, describing, and making data publicly available from the research phases through publication. A data model represents the steps, processes, tools, and modifications that data undergo during a research process. This representation captures the provenance, function, structure, context, and domain of these data, so that when published they are findable, explainable, and reusable. DesignSafe provides data models for the main data sources in natural hazards engineering (experimental, simulation, hybrid simulation, and field research), as well as a simple model to publish reports, surveys, presentations, or datasets different from the sources mentioned (Esteva, Jansen, and Coronel 2019; Esteva, Jansen, Arduino, et al. 2019). While data derived from different types of natural hazard research are used in disaster risk management, of special importance are data generated during field research, especially post-disaster reconnaissance missions. The following section illustrates the development of data models to manage field research data; the process is similar for other types of data.

2.1.2 Field Research Data Model

The field research (FR) data model considers the ways in which data are captured in the field, with the goal of making data gathering and curation a fluid process. Field research data are georeferenced, requiring open source mapping tools for data display. Along with engineering, FR incorporates social science investigations that assess the economic and social impact of natural hazards. Increasingly, as new instruments with improved scope and resolution are introduced, datasets become very large in size and number of files. These datasets are also diverse in file formats, metadata, and data structure. Adding to the complexity, data are often mediated by proprietary hardware and software. Often there are survey forms, questionnaires, and risk rating schemas associated with individual data points or groups of data, and those need to be clearly associated with the resultant data in the publication. How researchers decide on their field deployments, how they organize themselves to work on the disaster sites, and the tools and data gathering methods they use all have an impact on how data have to be publicly represented to reflect timing, scope, precision, the diversity of instruments used, and the aims of the project. In many cases, there is urgency to make the data public promptly after a natural hazard event. To accommodate these features and requirements, the FR data model must be both flexible and precise. The model was designed in close collaboration with experts who specialize in the different types of disasters (storm surge, wind, structural, geotechnical). The DesignSafe data model developers also attended instrumentation workshops, and deployed after Hurricane Michael in 2018 to understand how reconnaissance work is conducted in situ. In these activities, they captured the workflows, tools, practices, and terms that researchers use and embedded them in the curation interface, with the idea of blending field work as much as possible with curation and publishing activities.

The FR model was designed to tell the story of each individual project. It has broad categories to which engineering and social science data and metadata belong: project, mission, collection, asset, and analysis. Table 1 below explains the meaning of these categories.

Table 1 Categories and definitions for the field research data model

Categories function as containers for information and each has rules regarding its repeatability. For example, a project may have multiple missions, and each mission more than one collection, so categories are repeatable and the relations between them are recorded to generate a tree of connections and provenance. See Fig. 3 for the case of Mitchell (2019). Figure 3a shows the project information (authors, hazard and event information, keywords, and abstract). It also shows the information for one of the missions of the project including the DOI assigned to the mission, and its different collections. Each collection in turn comprises several files. Figure 3b shows the data diagram, which summarizes the logical relationship between project, mission, and collections. Each category is described by a set of established metadata elements and expert-defined tags, and users can add their own tags as well.

Fig. 3 Example DesignSafe field research data sheet and data organization diagram: a The field research (FR) data publication interface shows the structure and metadata of a FR mission with three collections and different types of observations. Individual files can be tagged to illustrate the file contents; b Users can navigate FR projects through a tree representation. This tree facilitates a quick understanding of a project's components and relations
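
To make the repeatable category hierarchy concrete, the following is a minimal sketch of how a project/mission/collection tree could be represented in code. The Python classes, field names, and sample values are illustrative assumptions, not DesignSafe's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical classes sketching the FR category hierarchy; DesignSafe's
# real schema and metadata elements are richer than shown here.

@dataclass
class Collection:
    name: str
    instrument: str                      # e.g., "lidar scanner", "survey form"
    files: List[str] = field(default_factory=list)

@dataclass
class Mission:
    name: str
    doi: str = ""                        # each mission can receive its own DOI
    collections: List[Collection] = field(default_factory=list)

@dataclass
class Project:
    title: str
    hazard: str                          # e.g., "hurricane", "earthquake"
    missions: List[Mission] = field(default_factory=list)

    def tree(self) -> None:
        """Print the project/mission/collection tree of connections."""
        print(self.title)
        for m in self.missions:
            print(f"  mission: {m.name} (DOI: {m.doi or 'pending'})")
            for c in m.collections:
                print(f"    collection: {c.name} [{c.instrument}], {len(c.files)} files")

# A project may have multiple missions, and each mission several collections.
project = Project("Hurricane Michael field reconnaissance", "hurricane")
project.missions.append(Mission("Coastal damage survey", collections=[
    Collection("Wind damage photos", "camera", ["IMG_001.jpg", "IMG_002.jpg"]),
    Collection("Point clouds", "lidar scanner", ["scan01.las"]),
]))
project.tree()
```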

Implemented in the CI, the FR model consists of a series of interactive interfaces that guide users through the processes of organizing and describing data, relating different data components, verifying the completeness of the dataset and its metadata, and publishing with a digital object identifier (DOI) and the appropriate license. DesignSafe enumerates available licenses along with other data publication guidelines. The interactions are built to facilitate managing many files; for example, categorization and tagging can be conducted for many files at a time. Field research metadata can be transferred automatically from the site through the NHERI RAPID mobile application (see below), and entered and edited through forms or tags. At the project level, for example, users add information about the natural hazard, when it happened, and its characteristics, to provide context to the dataset. Information about how each mission was deployed, who the team members are, and the different collections gathered through different instrumentation to measure different kinds of damage informs data consumers of the activities completed and their results. From the publication, users can display and visualize different data types using open utilities installed in the visualization tray of the DesignSafe Workspace, such as HazMapper, Potree, and QGIS.

2.2 Workspace and SimCenter Research Tools

As Fig. 2 shows, the Data Depot is integrated with the Workspace, which allows cloud computing for simulation, analytics, and visualization. This space supports the generation of digital workflows; the transformation, integration, and discovery of data; and the sharing of scripts and software. The Workspace provides registered users with a desktop metaphor and a window through which to access available tools and scripts, which are expected to evolve over time to meet the needs of the disaster risk management and broader natural hazards community. Access to the wealth of HPC resources available at TACC is provided, along with open source software, for example OpenSees (McKenna 2011) and the Advanced Circulation model ADCIRC (Luettich and Westerink 2004), and commercial codes available through the "bring your own license" model, for example MATLAB. The applications are organized in thematic "trays" or tags (Fig. 2) corresponding to simulation, visualization, and data processing. Additional trays provide access to applications or databases from DesignSafe partners, and to utility applications. In addition, users can develop and install their own tools or ask for certain tools to be implemented. These tools can then be published and hosted in one of the thematic trays to facilitate future data analyses. For instance, researchers at Florida Tech gathered data during the landfall of Hurricane Matthew and developed a Hurricane Data Analysis tool (Gurram et al. 2017; Subramanian et al. 2018), which is available in the Partner Data Apps tray.

To complement the Workspace, users have access to the SimCenter Research Tools Portal, which provides access to the simulation tools developed by the NHERI SimCenter. As the needs of the natural hazards engineering and disaster risk management communities evolve and necessitate enhanced features, the SimCenter Portal provides a venue for developing and integrating new capabilities into the intentionally extensible cyberinfrastructure platform. New web applications and tools can be added, or automated workflows synthesized. For example, to analyze storage tank risks under joint wind-surge-wave storm events, an automated workflow can be generated that integrates ADCIRC + SWAN model simulations with multi-hazard fragility models of tank infrastructure to provide estimates of tank failure probability with execution of a single coupled code. Such endeavors in disaster risk management research using DesignSafe may require advanced support beyond traditional ticket-and-helpdesk services. Hence, the Extended Collaborative Support Services (ECSS) unit offers deep, responsive, and engaged user support by a designated member of the DesignSafe technical staff.
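
As an illustration of the kind of coupled workflow described above, the sketch below post-processes hypothetical hazard intensities from a surge simulation through a lognormal fragility model to estimate per-tank failure probabilities. The function names, fragility parameters, and surge depths are assumptions made for illustration; this is not the SimCenter tooling or a calibrated model.

```python
import math

# Hypothetical post-processing step: hazard intensities extracted from a
# surge simulation are passed through a lognormal fragility curve. The
# parameter values and tank inventory below are illustrative only.

def lognormal_fragility(intensity: float, median: float, beta: float) -> float:
    """P(failure | intensity) for a lognormal fragility curve."""
    if intensity <= 0.0:
        return 0.0
    z = (math.log(intensity) - math.log(median)) / beta
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Stand-in per-tank surge depths (m), e.g., sampled from an ADCIRC + SWAN
# output field at each tank location.
surge_depth_m = {"tank_A": 1.2, "tank_B": 2.8, "tank_C": 0.4}
median_m, beta = 1.8, 0.45   # assumed fragility parameters for tank failure

for tank, depth in surge_depth_m.items():
    pf = lognormal_fragility(depth, median_m, beta)
    print(f"{tank}: P(failure) = {pf:.2f}")
```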

2.3 RAPID and the Reconnaissance Portal

Recognizing the increasing need for geospatially coded and referenced data from field research, a dedicated venue for accessing and visualizing data collected during reconnaissance events, from all over the world, is further provided in the Reconnaissance Portal—known in DesignSafe as the “Recon” Portal (Fig. 2). The Recon Portal provides direct access to the projects curated with the Field Research Data Model. The field research is facilitated by the NHERI Natural Hazards Reconnaissance Facility (referred to as the “RAPID Facility”). The purpose of the facility is to enable the natural hazards and disaster research communities to conduct next-generation rapid response investigations to characterize civil infrastructure performance and community response to natural hazards, evaluate the effectiveness of design methodologies, calibrate simulation models, and develop solutions for resilient communities (Wartman et al. 2018).

The RAPID Facility owns and operates a variety of state-of-the-art data collection equipment, such as computers, laser scanners, and drones, which it lends to field researchers for post-disaster reconnaissance surveys. In particular, the facility lends researchers iPads equipped with the RAPID App, which allows users to identify, capture, aggregate, organize, store, and manage social science, engineering, and geoscience reconnaissance data in the form of typed or handwritten notes, photographs, audio and video recordings, map vectors, questionnaires, and metadata (Miles and Tanner 2018). The RAPID App is currently available on iPads loaned by the RAPID Facility; general availability in the Apple App Store is a future goal. In the future, data from the app will also be seamlessly integrated with DesignSafe from the field, in quasi-real time, for archiving in the Recon Portal.

Whether obtained through the RAPID App or through other means (see Sect. 5.4), for each disaster event tagged in the Recon Portal, data organized into projects that are published and hosted in the Data Depot are available. The Recon Portal provides a dedicated interface with the Data Depot, so researchers can browse and access reconnaissance data through a simple search, by scrolling down a list of events, or by clicking on the event location on the displayed map. The platform also provides interactive map viewing of event data through the HazMapper application, available in the Workspace visualization tray, as well as other GIS apps. Geocoded data can be manipulated in several ways, and can be mapped to give a graphic representation of hazard characteristics, damage distribution, or different societal parameters related to community response or resilience. For example, contour maps of hazard intensity measures (for instance, wind speeds or ground acceleration) can be produced and superimposed on spatial distributions of damage.
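
The following sketch illustrates the kind of map product described above: hazard intensity contours superimposed on geocoded damage observations. It uses synthetic stand-in data; the wind field, coordinates, and damage ratings are assumptions, not output of the Recon Portal or HazMapper.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-ins: a smooth peak-gust field over a coastal grid and a
# handful of geocoded damage ratings (0 = none ... 4 = destroyed).
lon, lat = np.meshgrid(np.linspace(-85.8, -85.2, 50),
                       np.linspace(29.9, 30.4, 50))
wind = 70.0 * np.exp(-((lon + 85.5) ** 2 + (lat - 30.1) ** 2) / 0.05)

rng = np.random.default_rng(0)
pts_lon = rng.uniform(-85.8, -85.2, 30)
pts_lat = rng.uniform(29.9, 30.4, 30)
damage = rng.integers(0, 5, 30)

# Superimpose damage observations on the hazard intensity contours.
cs = plt.contourf(lon, lat, wind, cmap="YlOrRd", alpha=0.6)
plt.colorbar(cs, label="peak gust (m/s)")
sc = plt.scatter(pts_lon, pts_lat, c=damage, cmap="viridis", edgecolor="k")
plt.colorbar(sc, label="damage rating")
plt.xlabel("longitude")
plt.ylabel("latitude")
plt.title("Hazard intensity contours with damage observations")
plt.show()
```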

3 Training and Outreach for the Disaster Risk Management Community

Within the NHERI community space of the portal, collaboration tools are available. For example, registered members of DesignSafe can join the DesignSafe Slack Team to leverage Slack, an online collaborative communication tool. With Slack, the community can organize itself into online forums for information sharing and discussion around research themes. Current channels are dedicated to various communities of practice, such as wind, earthquake, or multi-hazard engineering, as well as thematic events, such as recent hazard events and reconnaissance efforts. DesignSafe has more than 1600 Slack users, with more than 80,000 postings to date. Activity peaks around natural hazard events (hurricanes, earthquakes), when Slack is used for coordination and for information sharing and dissemination during field reconnaissance.

The Learning Center provides on demand training materials for the user community, as well as information about relevant workshops or webinars, outreach and student engagement activities. Instruction materials include live and archived tutorials on a range of topics of interest to the disaster risk management community, such as interacting with partner datasets in the cloud, running simulations in DesignSafe with tools like ADCIRC and OpenSees, or data transfer methods. Point of interest tutorials and guidelines are also offered throughout the portal. Members from the community can also publish or share educational materials and content via the Data Depot. This can offer a venue for the disaster risk management community to develop and disseminate much needed educational content related to disaster assessment and management, which is generally lacking in university curricula.

4 Enhancement of Disaster Risk Management Through DesignSafe

Through a blending of hardware, software, network, and human capabilities, scientific activities essential for disaster risk management are supported in the cloud via enhanced functionalities. These include the provision of end-to-end multi-hazard workflows, the facilitation of collaboration across multidisciplinary teams working on multi-hazard events, the acceleration of damage assessment workflows, and improved reproducibility and reuse of data through facilitated curation and publication. The subsections below describe these functionalities, and the case studies in Sect. 5 illustrate them.

4.1 End-to-End Multi-Hazard Workflows

A multi-hazard workflow (the set of steps that occur from the initiation to the completion of a research process) may vary in context from hazard-specific modeling (for example, stochastic simulation of joint flood and earthquake occurrence) to integrative community-level assessment (for example, evaluation of population dislocation following an earthquake-induced tsunami). DesignSafe supports this diversity of associated workflows from data generation (be it experimental, simulation, or RAPID reconnaissance data) through analysis, curation, and publication. Visualization and analytics of the data can take place within the cloud, taking advantage of data analysis tools such as MATLAB, Jupyter, tools developed in the SimCenter, or community-supplied tools and apps. As a feature of this end-to-end workflow support, users are able to combine datasets and tools or interface software in the Workspace to create and save custom workflows.

These custom workflows can be shared publicly or with specified groups, such as a project team or collaborators. Because all research tasks are conducted within one seamless cyberinfrastructure, provenance tracking can occur, in which a record of the processes applied to data is synthesized. Metadata, which can be defined at any time, provide context to the various types of data, facilitate sharing among an often multi-disciplinary, multi-hazard team, and support subsequent curation of data products. Overall, DesignSafe provides a cyberinfrastructure in which multi-hazard data are not simply stored, but studied, transformed, and shared with limited or broad access during the research process, and where data are ultimately curated, published, and discovered by the broader hazards community.

4.2 Collaboration Across Multi-Disciplinary Multi-Hazard Teams

As indicated by the scope of expertise required to address the disaster risk management research themes and challenges shown in Fig. 1, effective collaboration is essential in natural hazard engineering and disaster risk management. DesignSafe facilitates this collaboration on a number of fronts. Online collaboration of research teams is supported via the ability to share data objects, tools, and workflows with user-defined groups with a click of a button. Although still at a preliminary stage, in the future DesignSafe will be tightly integrated with the RAPID mobile app. For instance, a field reconnaissance team may import, via the Recon Portal, their data on building damage suffered from a fire-following-earthquake event (for example, field measurements, photographs, LIDAR (Light Detection and Ranging) images, videos, and sample plans). The researchers will have the ability to tie these multiple georeferenced data to their geocoded markers for visualization, analysis, and planning purposes. Shared access to the data can also be granted to structural modeling collaborators, who simulate building responses within the Workspace for model validation against the field reconnaissance data. Collaboration and information exchange with the broader disaster risk research community is facilitated not only via shared or published data, tools, models, and workflows, but also by other features of the DesignSafe web portal. The Slack online collaborative communication tool, for example, offers opportunities for community-driven forums. Participants may opt to announce the release of new tools, review research opportunities, or pose questions and discussion topics on research challenges.

4.3 Enhanced Damage Assessment Workflow

Reconnaissance teams upload, curate, and publish the wind, seismic, and coastal damage data they gather during field reconnaissance missions, so that the data are available shortly after a disaster (Kijewski-Correa et al. 2018).

Following extreme hazard events—such as earthquakes, windstorms, floods, and tsunamis—teams of engineers and scientists have traditionally deployed to capture perishable data (that is, data that will disappear quickly during the recovery effort) by documenting the performance of the built and natural environment (Lindt et al. 2007; Prevatt et al. 2012). The ultimate goal of such deployments is to discover knowledge from the post-hazard data that can advance science and practice to create more resilient communities. DesignSafe provides a range of tools that support the data collection efforts and enhance knowledge discovery from the data, while simultaneously accelerating the rate at which data are collected, curated, disseminated, and published for reuse by the scientific community. As a result, the time between data collection and the publication and preliminary analysis of the data is shortened considerably, from 1 to 5 years down to a mere 6 to 12 months. Early dissemination of post-disaster findings from reconnaissance teams is crucial to impacting the post-disaster recovery of the affected communities, as shown in Fig. 4. These data become publicly available during the recovery and reconstruction effort and can inform and influence the state of resilience of the rebuilt community.

Fig. 4 Timeline of post-disaster field reconnaissance efforts in relation to the resilience curve

4.4 Research Reproducibility, Data Curation, and Publication

Data interoperability is key to efficient risk management work, as well as to data access and long-term preservation. The DesignSafe team works closely with researchers from different teams in the United States to facilitate seamless work in the cloud, from the inception of research until final publication within the DesignSafe platform.

For example, in the case of field research, the DesignSafe and RAPID teams share the same data model, so users can send image, video, and survey data and metadata automatically, as they capture them with the RAPID mobile application, into the corresponding projects, missions, and collections in DesignSafe. This allows offloading of data from the tablet device used in the field. DesignSafe also works with the field research teams on the ground to accommodate their need to publish virtual reconnaissance reports as soon as they are generated at the outset of a disaster, and prior to publishing any field data. Field research data are not only large; file formats can also be proprietary. DesignSafe curators are knowledgeable about best practices in relation to open data formats that can be easily distributed and displayed on the platform, and about tools that facilitate and generalize the readability and mapping of georeferenced data and metadata.

The challenge of reproducing research results is a pressing issue across a range of computation- and data-based science fields (Borgman 2012). DesignSafe curators continuously observe how users publish data, and listen to their needs in order to adjust DesignSafe models and services. For example, to stimulate transparency and interoperability, users are requested to unzip their files, to convert data gathered through proprietary tools into open formats for publication using utilities provided within DesignSafe, and to conform to best metadata, citation, and licensing practices.

The DesignSafe developers include computer science and engineering technical experts, data science experts, and hazard domain experts. Constant communication between the researchers and this multi-disciplinary team of developers allows DesignSafe to meet the needs of different users (fully if possible, halfway if necessary). The goal is to avoid imposing restrictive curation conditions and instead to communicate with the researchers so that they can work with freedom while DesignSafe captures and publishes their datasets.

5 Case Studies

This section presents a series of brief case studies that illustrate the capabilities described in the previous sections and how DesignSafe can be a game changer in disaster risk science. This is not by any means an exhaustive list. Numerous other cases exist. For example, Lenjani et al. (2019) integrated a field reconnaissance data set with Google street data to improve field data collection and analysis.

5.1 Next Generation Liquefaction Project

The Next Generation Liquefaction (NGL) project involves a community database of earthquake-induced liquefaction case histories from around the world. A web interface (Brandenberg et al. 2019) permits users to upload, download, and visualize case history data. However, the web interface does not allow users to perform calculations on the data to develop models of liquefaction triggering and its consequences. Procedures for evaluating liquefaction hazard are developed by constraining data-driven inquiry with physics-based concepts in a semi-empirical manner. Although the database is still in its infancy, the amount of data is already substantial enough that users cannot reasonably be expected to download all of it for processing on a local computer. The ability to interact with the data in the cloud is therefore crucial for using the NGL database to develop new models for liquefaction risk evaluation.

The NGL relational database is replicated in DesignSafe, where it can be queried using Python scripts in Jupyter notebooks. A Jupyter notebook is a server-client application that allows editing and running notebook documents via a web browser. It combines rich text elements (equations, figures, HTML, LaTeX) with computer code executed by a Python kernel. Python libraries are available for performing queries using Structured Query Language (SQL) to extract data from the database. The notebooks also enable users to develop their own scripts to interact with the data to compute derived quantities, develop statistical regressions, or interact with the data in other ways. Several example Jupyter notebooks (Brandenberg et al. 2019) have been developed to perform example queries and to extract and process various types of data in the database. Users can use these example notebooks as a starting point for developing their own scripts to interact with the data.
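
The sketch below illustrates the notebook pattern just described: issuing an SQL query from Python and processing the result with pandas. It builds a small in-memory stand-in database, since the table and column names here are hypothetical; the actual NGL schema and the DesignSafe connection details differ.

```python
import sqlite3
import pandas as pd

# Build a tiny in-memory stand-in for the replicated NGL database; the
# table and column names are invented and the real schema differs.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE case_histories (
                    site_name TEXT, magnitude REAL,
                    cpt_tip_resistance REAL, liquefied INTEGER)""")
conn.executemany(
    "INSERT INTO case_histories VALUES (?, ?, ?, ?)",
    [("Site A", 7.0, 5.2, 1), ("Site B", 6.8, 12.4, 0), ("Site C", 6.6, 4.1, 1)],
)

# The notebook pattern: extract records with SQL, then analyze with pandas.
df = pd.read_sql_query(
    "SELECT site_name, cpt_tip_resistance, liquefied "
    "FROM case_histories WHERE magnitude >= 6.5",
    conn,
)
print(df.groupby("liquefied")["cpt_tip_resistance"].mean())
conn.close()
```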

These data will be directly used by model development teams, who will produce new liquefaction triggering and consequence models that form the basis for the hazard maps often used by disaster risk managers. These managers could also query the NGL relational database to see examples of liquefaction-induced damage to infrastructure in urban or industrial areas.

5.2 Building Seismic Vulnerability Assessments

Following the 19 September 2017 earthquake in Mexico City, various groups of practicing engineers and researchers visited the affected areas to assess damage and evaluate the safety of buildings. An overview of damaged structures indicated a high correlation of damage with peak ground velocity, and also between a building's natural vibration period and the dominant frequency of the ground shaking, considering local site effects. As part of an NSF-funded reconnaissance effort, research teams from multiple U.S. universities and industry, equipped with portable sensing equipment, instrumented mid-rise buildings to record ambient vibration data. These teams were able to capture building properties and, in some cases, soil vibration properties. Responses recorded through the strong ground shaking of the main event were also obtained for selected instrumented buildings. The collected data, including building drawings and the locations of sensors, are published in the Data Depot (Behrouzi et al. 2019).

The curated and published data are being analyzed using DesignSafe cloud-based computing, including MATLAB, to correlate damage with building vibration properties, local soil characteristics, and estimated shaking intensity to identify the main contributors to damage. Using DesignSafe, the project team was able to share and collaborate on the analysis of the large data sets collected in the field by multiple teams to arrive at regional damage assessments over wider areas and to identify vulnerabilities in building stock. All data collected are published and are available in DesignSafe for further analysis by other research teams, noting that past data from reconnaissance efforts have not been readily accessible for sharing.

5.3 Seismic Risk Prediction

Of particular interest to the disaster risk management community is the capacity to run large-scale regional hazard and loss simulations. For example, through the NHERI SimCenter Application Framework, users can study the effects of earthquakes on society at the regional scale (Deierlein et al. 2019). This framework demonstrates scientific workflows for regional earthquake damage and loss estimation that utilize either HPC resources at DesignSafe for large-scale simulations or local computational resources for testing, development, and smaller-scale simulations. To demonstrate the workflow capabilities, users are able to replicate a testbed of the M7.0 Anchorage, Alaska earthquake that occurred in November 2018 with prepared building and hazard data. In addition, researchers can build their own regional risk and loss estimation workflow and leverage the growing set of available simulation, analysis, and visualization tools on DesignSafe.
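
As a rough illustration of such a regional workflow, the sketch below loops over a small building inventory, estimates a ground motion intensity with a toy attenuation function, and evaluates a simple fragility to approximate damage probabilities. The inventory, attenuation model, and fragility form are placeholder assumptions, not the SimCenter Application Framework itself.

```python
import math
import random

# Placeholder inventory: each building has a distance to the rupture and
# an assumed capacity expressed as the PGA at which damage becomes likely.
buildings = [
    {"id": "B1", "dist_km": 10.0, "capacity_g": 0.35},
    {"id": "B2", "dist_km": 25.0, "capacity_g": 0.50},
]

def pga_at(dist_km: float) -> float:
    """Toy attenuation: median PGA (g) decaying with distance."""
    return 0.8 * math.exp(-dist_km / 30.0)

def p_damage(pga_g: float, capacity_g: float, steepness: float = 8.0) -> float:
    """Toy logistic fragility: P(damage) rises as demand exceeds capacity."""
    return 1.0 / (1.0 + math.exp(-steepness * (pga_g - capacity_g)))

random.seed(0)
n_sims = 1000
for b in buildings:
    # Monte Carlo over ground motion variability at each site.
    damaged = sum(
        random.random() < p_damage(pga_at(b["dist_km"]) * random.lognormvariate(0.0, 0.3),
                                   b["capacity_g"])
        for _ in range(n_sims)
    )
    print(f"{b['id']}: P(damage) ≈ {damaged / n_sims:.2f}")
```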

5.4 Hurricane Post-Disaster Reconnaissance Efforts

Post-disaster, rapid response research reconnaissance is one of the most powerful means to understand the effects of natural hazards on the built environment. To that end, in the United States, NSF recently created the Structural Extreme Event Reconnaissance network (StEER), whose mission is to deepen the structural natural hazards engineering community's capacity for reliable post-event reconnaissance by (1) promoting community-driven standards, best practices, and training for RAPID field work; (2) coordinating official event responses in collaboration with other stakeholders and reconnaissance groups; and (3) representing structural engineering within the wider extreme events (EE) consortium, alongside its counterparts in geotechnical engineering (GEER) and the social sciences (SSEER), to foster greater potential for truly interdisciplinary reconnaissance.

Collectively, the tools and platforms within DesignSafe enhance the ability of damage assessment teams like StEER to collect higher quality, perishable data, and more rapidly process and curate the data, ultimately resulting in publication of the data and detailed metadata into a central multi-hazard repository for use by the scientific community. Figure 5 illustrates the integrated disaster assessment workflow through the various stages of deployment made possible by DesignSafe. This workflow was first tested during the 2017 hurricane season with hurricane deployments for Harvey, Irma, and Maria (Kijewski-Correa et al. 2018). During the 2018 hurricane season, the workflow was enhanced when DesignSafe facilitated the action of StEER during reconnaissance efforts after Hurricane Michael hit Florida. This section describes this collaborative effort. It can serve as a template for deployments in other parts of the world, and it is being replicated for other hazards as well.

Fig. 5 The integrated disaster assessment workflow realized in DesignSafe

During the predeployment stage, DesignSafe provides a central repository through the Reconnaissance Portal for early reports and datasets that guide the deployment strategies of the research teams. For example, for Hurricane Michael, the Recon Portal hosted preliminary reports by advance scout teams, as well as summaries of the spatial distribution of hazard intensities (for example, wind speeds or storm surge heights) derived from observations or models. The StEER team assembled data on the event from public sources and led the authorship of a preliminary report (Alipour et al. 2018). This report, published on DesignSafe, informed the action of the field assessment teams. In addition, teams and interested stakeholders used Slack as a central communication hub to discuss early observations and deployment strategies.

During deployment, DesignSafe provided a central platform through Slack to facilitate communication between field deployment teams and central coordination and management teams (depending on the status of telecommunications). Daily briefs and preliminary data were also shared via the DesignSafe Slack channels for rapid dissemination to other deployment teams and stakeholders. Such early communications can be critical to shaping a proper engineering perspective on the hazard event while public attention is heightened (Fisher Liu 2009).

DesignSafe also supports the direct synchronization of data and metadata from certain data collection platforms, including the RAPID mobile application prototype, and the Fulcrum mobile smartphone application (Spatial Networks 2017; Pinelli et al. 2018). Using this workflow, data and metadata can be synced to a specified DesignSafe project in real-time or at regular intervals (for example, daily) as connectivity permits.

Following the completion of field deployments, the Hurricane Michael field assessment teams published an overview of the damage and their preliminary findings on DesignSafe (Roueche et al. 2018). DesignSafe also provides a suite of tools to enhance the post-deployment processing, aggregation, curation, and publication of the reconnaissance datasets with appropriate metadata. Unprocessed data are hosted on the DesignSafe Data Depot and accessed by all team members as designated by the team leader. Tools such as HazMapper and QGIS allow for rapid visualization and analysis of spatial data. Jupyter notebooks can be used to join damage assessment data with external data sources such as county parcel attributes. The DesignSafe Slack platform provides a central location for coordinating data librarians in the standardization, aggregation, and quality control of the damage assessment datasets. During this process, DesignSafe provides tools for synthesizing the variety of processed damage assessment data types (for example, densified point clouds, orthomosaics) to support the curation process. Data librarians can supplement ground-based, door-to-door observations of building damage with three-dimensional views of a building using the Potree Viewer tool (which ingests densified point clouds derived from LIDAR or photogrammetric datasets) to ensure all damage is accurately identified and quantified.
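
As a small example of the Jupyter-based join mentioned above, the sketch below merges hypothetical damage assessments with county parcel attributes using pandas. The column names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical field assessments keyed by parcel ID...
damage = pd.DataFrame({
    "parcel_id": ["P-101", "P-102", "P-103"],
    "damage_rating": [3, 1, 4],              # e.g., 0 = none ... 4 = destroyed
})

# ...and hypothetical county parcel attributes for the same parcels.
parcels = pd.DataFrame({
    "parcel_id": ["P-101", "P-102", "P-103"],
    "year_built": [1978, 2004, 1962],
    "roof_cover": ["shingle", "metal", "shingle"],
})

merged = damage.merge(parcels, on="parcel_id", how="left")
# Example question: does observed damage vary with roof cover type?
print(merged.groupby("roof_cover")["damage_rating"].mean())
```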

6 Conclusion

The quest for resilience in a multi-hazard environment—particularly one affected by concurrent correlated and uncorrelated events as well as cascading hazards—is expected to pose unique challenges as research in disaster risk management continues to evolve. Diverse datasets from a range of sources (experimental, simulation, and field reconnaissance) must be integrated into increasingly complex models that cut across disciplines ranging from the hazard sciences to engineering and the social sciences. Furthermore, the generated data continue to grow in size and scope, particularly as efforts to address multi-hazard problems expand from physics-based hazard simulation across regions, to structure and infrastructure response modeling under joint or subsequent hazards, to community-scale impact assessment. The DesignSafe cyberinfrastructure has been designed to provide functionalities that address many of the needs of the disaster risk management community and to help enable transformative research in disaster risk management as well as natural hazard engineering. Key relevant features of the CI include cloud computing with access to HPC resources; support of the full research lifecycle, including data generation and sharing, analysis, and visualization; and tools and services that enable research reproducibility, data curation, and publication. Given the flexible and extensible platform of DesignSafe, along with its ability to support and integrate user-developed tools, apps, and educational content, the cyberinfrastructure is expected to grow with the community to enable new discoveries.

The challenge for the different disciplines involved in disaster risk management, especially civil engineering and the social sciences, is to take advantage of advances in computing technology. The HPC resources, cloud storage, and integrated analytics afforded by DesignSafe are vital for bringing hazards engineering and disaster risk research into the twenty-first century and supporting next generation multi-hazard research. Overall, the cyberinfrastructure available to the multi-hazard engineering and disaster risk community has been rather limited; notable exceptions include hazard-specific endeavors, for example NEESHub (Hacker et al. 2011, 2013) and Vortex-Winds. DesignSafe supports the multiple disciplines involved in disaster risk research in one centralized yet extensible platform. It enables ingestion of data from external sources and interoperability with other CI and data repositories. Unfortunately, model and data sharing in natural hazard engineering has traditionally been approached in a rather ad hoc or informal manner, and is particularly sparse for simulation-related research. This poses a barrier to advancing research on such complex challenges as resilience modeling in a multi-hazard environment. DesignSafe, however, is intended to provide a forum for collaboration and sharing of all data types. Such effective support of collaborative research and sharing of data and tools with the broader community is essential to address pressing problems in disaster risk management. This requires, in some cases, a culture shift in what is perceived to be a research end product and in what materials researchers publish and share with the broader community. By automating some of the process (for example, metadata description or provenance tracking) and supporting digital object identifier (DOI) assignment for a range of products, the CI can help to enable this shift.

DesignSafe is already making an impact through data publication, sharing, and reuse. This article illustrates that impact through four case studies in the areas of soil liquefaction, seismic vulnerability, seismic risk prediction, and hurricane reconnaissance and recovery. In each case, the DesignSafe platform enhances collaboration and data sharing, leading to potentially faster and more efficient disaster preparation, mitigation, and management. Given the landscape of research challenges in natural hazard engineering and disaster risk management, the prospective hurdles to advancing the field, and the resources and functionalities envisioned for DesignSafe, a discussion from the community is welcomed. In particular, we invite discussion on how to best use the end-to-end collaborative environment DesignSafe has brought to the table; harness the computational resources available to tackle multi-hazard problems; most effectively stimulate data sharing and reuse across the broad range of hazard engineering and disaster risk research and data types; and take advantage of the extensible CI platform through the integration of new tools that enable disaster risk research.