Scrybe: A Secure Audit Trail for Clinical Trial Data Fusion

Authors:
Jonathan Oakley, Clemson University, USA
Carl Worley, Clemson University, USA
Lu Yu, Clemson University, USA
Richard R. Brooks, Clemson University, USA
İlker Özçelik, University of Tennessee at Chattanooga, USA
Anthony Skjellum, University of Tennessee at Chattanooga, USA
Jihad S. Obeid, Medical University of South Carolina, USA

Digital Threats: Research and Practice, Volume 4, Issue 2, 10 August 2023, Article No.: 24, pp 1–20, https://doi.org/10.1145/3491258

Published: 10 March 2022

Abstract

Clinical trials are a multi-billion-dollar industry. One of the biggest challenges facing the clinical trial research community is satisfying Part 11 of Title 21 of the Code of Federal Regulations [7] and ISO 27789 [40]. These controls provide audit requirements that guarantee the reliability of the data contained in the electronic records. Context-aware smart devices and wearable IoT devices have become increasingly common in clinical trials. Electronic Data Capture (EDC) and Clinical Data Management Systems (CDMS) do not currently address the new challenges introduced by these devices. The healthcare digital threat landscape is continually evolving, and the prevalence of sensor fusion and wearable devices compounds the growing attack surface. We propose Scrybe, a permissioned blockchain, to store proof of clinical trial data provenance. We illustrate how Scrybe addresses each control and the limitations of Ethereum-based blockchains. Finally, we provide a proof-of-concept integration with REDCap to show tamper resistance.

1 INTRODUCTION

The SARS-CoV-2 pandemic has brought new attention to the clinical trial process. As the digital threat landscape has evolved, this attention has made clinical trial data a more lucrative target for attackers. Recent ransomware attacks have focused on holding SARS-CoV-2 clinical trial data hostage [61], which exposes a critical weakness in how all clinical trial data is stored. The inherent shortcomings in current infrastructure invite attacks on any research group with promising work.

Part 11 of Title 21 of the Code of Federal Regulations defines the controls for electronic records [7] imposed by the Food and Drug Administration (FDA). Similarly, ISO 27789 [40] governs the standards for electronic health records (EHR) and audit trails. These controls provide audit requirements that guarantee the reliability of data contained in the electronic records. These codifications regulate clinical trials, which are necessary for understanding pathologies, developing new treatments, and improving health. Researchers must guarantee the authenticity, integrity, and confidentiality of data and consent forms. Electronic Data Capture (EDC) and Clinical Data Management Systems (CDMS) increase the speed and efficiency of clinical studies [15], but pose challenges for securing clinical data. Digital information can be easily changed, forged, and fabricated, raising questions about authenticity and integrity.

Recently, smart devices and Internet of Things (IoT) devices have become more common in clinical trials [42, 48, 68]. The prevalence of these devices presents a unique data fusion security challenge, as it combines context-aware computing, traditional sensor fusion, and the regulations that govern the handling and processing of clinical trial research data.

Mobile devices present electronic questionnaires in a simple format, validate data at entry, and simplify data aggregation. Other devices, like smartwatches, have been used in recent studies ranging from Parkinson’s [47] to atrial arrhythmias [43]. These devices generate a rich set of data that must be processed, stored, and analyzed according to the appropriate provenance and security regulations.

The planning phase of a clinical trial includes creating a study protocol, which specifies the goals, patient groups (or cohorts), pharmaceuticals, and tests. This plan must be approved by the institution’s Institutional Review Board (IRB) and then be approved by and registered with a government regulatory agency (the FDA in the United States). Once the study has been approved, patients are recruited and must provide consent before enrollment into the clinical trial. During the trial, collected data may include physical exam findings, laboratory test results, research questionnaires, and other research data types. After the trial period, the data is analyzed and published in a manner that preserves confidentiality.

Researchers must maintain a clear audit trail that tracks data creation, modification, and deletion. All actions need to be recorded along with the time and person responsible. REDCap (Research Electronic Data Capture) is a software toolset and database for the electronic collection and management of research and clinical trial data [38]. Since REDCap stores acquired data as records, these are the primitive data that must be secured.

The number of new clinical trials is increasing each year, compounding the economic impact of securing clinical trial data. In 2018 alone, 30,988 new clinical trials were registered with the FDA [9]. Figure 1 shows the increase in clinical trials over the past decade. As the number of clinical trials grows, so does the amount of data that needs to be secured.

Fig. 1. Studies reported to the FDA and registered on ClinicalTrials.gov [9].

Clinical trial data, and the associated data provenance, is subject to a large number of external threats [36], but insider threats are also important considerations. A meta-analysis in Science [25] was only able to reproduce 39% of published psychological studies. In a survey of biostatisticians, 31% of the statisticians surveyed, all active in medical research, reported being involved in a project that knowingly committed academic fraud [63]. Scrybe provides an immutable audit trail to satisfy Part 11 of Title 21 and provide controls against insider threat and intentional fraud.

Table 1 outlines the controls and governances required for all clinical trial electronic records. From Table 1, several general categories of controls can be inferred.


Designation	Description	Property
11.10(a)	Validation of systems to ensure accuracy, reliability, consistent intended performance, and the ability to discern invalid or altered records.	Integrity, Authentication
11.10(b)	The ability to generate accurate and complete copies of records in both human-readable and electronic form suitable for inspection, review, and copying by the agency. Persons should contact the agency if there are any questions regarding the agency’s ability to perform such review and copying of the electronic records.	Availability
11.10(c)	Protection of records to enable their accurate and ready retrieval throughout the records retention period.	Availability
11.10(d)	Limiting system access to authorized individuals.	Access Control
11.10(e)	Use of secure, computer-generated, time-stamped audit trails to independently record the date and time of operator entries and actions that create, modify, or delete electronic records. Record changes shall not obscure previously recorded information. Such audit trail documentation shall be retained for a period at least as long as that required for the subject electronic records and shall be available for agency review and copying.	Integrity, Authentication, Non-repudiation
11.10(g)	Use of authority checks to ensure that only authorized individuals can use the system, electronically sign a record, access the operation or computer system input or output device, alter a record, or perform the operation at hand.	Access Control
11.10(h)	Use of device (e.g., terminal) checks to determine, as appropriate, the validity of the source of data input or operational instruction.	Integrity
11.10(k1)	Adequate controls over the distribution of, access to, and use of documentation for system operation and maintenance.	Integrity
11.10(k2)	Revision and change control procedures to maintain an audit trail that documents time-sequenced development and modification of systems documentation.	Integrity

\(^1\)Designations 11.10(f), 11.10(i), and 11.10(j) were omitted because they refer to administrative controls outside the scope of this framework.


Table 1. Controls Outlined in Part 11.10 of Title 21, Code of Federal Regulations, for a Closed Clinical Trial Electronic Records System \(^1\) [7]


Integrity: Provenance data cannot be falsified and can only be created by authorized parties.
Availability: The provenance system must be consistently able to receive new data, and that data must be able to be consistently viewed.
Efficiency: The system should have a low overhead.
Authentication: Provenance data is correctly identified as having come from the correct source.
Non-repudiation: Once provenance data has been created, neither creator nor viewer can deny its existence.
Access control: Only authorized parties can view provenance data. Compliance with the Health Insurance Portability and Accountability Act (HIPAA) privacy and security rules [58] makes this particularly relevant.

This article proposes Scrybe, a blockchain-based secure audit trail, and discusses how it addresses the security requirements of clinical trial data outlined in Part 11 of Title 21. A Scrybe proof-of-concept is integrated with REDCap to show these properties and demonstrate the working system.

Section 2 discusses existing solutions in this problem space. Section 3 provides background information on data provenance, REDCap, and blockchains. Section 4 introduces Scrybe and outlines the architecture of our system. Section 5 describes the proof-of-concept integration with REDCap and shows how this architecture satisfies the requirements set forth in Reference [7]. Section 6 provides a discussion of our results and directions for future work.

2 RELATED WORK

Existing blockchain-based provenance solutions for EHR fall into two groups: those that propose novel blockchain technology and those that build on existing blockchain technology. Approaches that propose novel blockchains often overlook critical security details, while limitations in current blockchain technology make a healthcare blockchain built on existing platforms impractical for widespread adoption. Unfortunately, there is also a non-trivial group of proposed solutions that claim to use blockchain technology but do not provide enough information to assess the viability of the solution [22, 45, 79, 82, 83, 84].

2.1 Application-specific Blockchain Technologies

One approach is to design a consensus algorithm tailored to the application. These approaches generally address proof-of-work (PoW) efficiency concerns, but many fail to include integrity and security considerations. MediBchain [11, 46] is one such solution. The specifics of the blockchain are very high-level, and concrete details are abstracted. Further, other authors have found security issues with their implementation [81]. BBDS is a PoW-based blockchain designed to facilitate data sharing [80]. Patel designed a blockchain framework for sharing radiology data using a proof-of-stake (PoS) algorithm [59]. Lee and Yang designed a flawed blockchain specific to their work (fingernail microscopy). Peterson et al. designed a blockchain that leverages proof-of-interoperability instead of PoW [62]. While the approach is novel, based on the algorithm description, it appears that two malicious nodes could control the network.

2.2 Tested Blockchain Technologies

Many EHR blockchain solutions build on either Bitcoin or Ethereum. Bitcoin-based solutions [39, 49, 72] leverage numerous integrations with tools such as sidechains and data anchors. Ethereum-based solutions (such as MedRec [13, 27] and others [16, 17, 54, 56]) use smart contracts to provide impressive functionality that can easily interface with complicated systems. Other solutions (Healthchain [81], MedicalChain¹ [50], and Reference [41]) are based on Hyperledger Fabric—a permissioned blockchain developed by IBM. Hyperledger Fabric incorporates functionality present in Ethereum but does not require a cost or resource overhead. Both Bitcoin and Ethereum currently use PoW as their consensus algorithm. While this is currently one of the more popular consensus algorithms, it does not scale well, and it is cost-prohibitive [67].

While proof-of-concept solutions based on PoW are popular, they are not long-term solutions for securing data provenance. They result in too much overhead and too little throughput [67]. Ethereum has announced a move to a PoS consensus algorithm. While this is a step in the right direction, it still introduces issues, and any application running on the blockchain will have associated costs required to keep running the smart contracts. With the current cryptocurrency volatility, tying a technological solution to any of these blockchains would be a gamble.

Hyperledger is a promising permissioned blockchain architecture that offers pluggable consensus algorithms. There are several mainstream choices, such as Hyperledger Fabric, Hyperledger Indy, Hyperledger Iroha, and Hyperledger Sawtooth [1]. There are several popular pluggable ordering/consensus mechanisms for Hyperledger Fabric: practical Byzantine Fault Tolerance (PBFT) [24], BFT-SMaRt [18], SBFT [33], HoneyBadger BFT [51], and Kafka [44]. Kafka stands out on this list, since it is crash fault-tolerant but not Byzantine fault-tolerant. The other consensus algorithms are all in the family of PBFT algorithms, which suffer from scalability issues (usually beyond 20 nodes) [75]. SBFT improves on this, but it still suffers from the other shortcoming of PBFT algorithms: electing a centralized leader. Similarly, HoneyBadger BFT elects a set of nodes at the beginning of the algorithm, which makes it centralized [30]. Hyperledger Indy uses Redundant BFT (RBFT), which is in the same PBFT family and suffers from the same node scalability issues [75]. Hyperledger Iroha uses the Sumeragi consensus, which is based on BChain [26]. Sumeragi also suffers from node scalability issues [1]. Finally, Hyperledger Sawtooth uses Proof of Elapsed Time (PoET) as the consensus algorithm. PoET is based on a Trusted Execution Environment (TEE) [66]. While TEEs are a huge advancement in the space of efficient consensus algorithms, vulnerabilities such as Plundervolt [53] show there are ways to bypass the trusted environment.

Our EHR blockchain solution leverages Scrybe’s lightweight consensus algorithm [19]. This consensus algorithm is designed for a permissioned blockchain, so it does not require excessively wasteful computations. Scrybe’s complexity grows linearly with the number of nodes [19], which addresses the scalability concerns many popular consensus algorithms face. While the current embodiment of our design does not leverage Hyperledger Fabric, a future version may be integrated as a pluggable Scrybe consensus algorithm.

3 BACKGROUND

Theoretically, provenance provides several different benefits [32]: (1) data integrity, (2) audit trail, (3) replication, (4) attribution,² and (5) information.³ Provenance tools usually rely on inversion or annotation [69, 70]. Inversion provenance is the set of transformations required to take an empty dataset and arrive at the current state. Annotation involves supplying rich metadata about the changes. Provenance can be stored in one of three ways [31]: (1) tightly coupled, (2) loosely coupled, or (3) uncoupled. Tightly coupled provenance is stored in the data. Loosely coupled provenance is stored with the data but logically separated, usually in a different file on the same system. Uncoupled provenance is stored remotely. The Open Provenance Model (OPM) [52] and W3C PROV [29] provide standards for representing provenance.

Various tools have arisen to address the provenance challenge [35]. IPython [60] is an interactive Python interface that allows users to mix code, text, and media. Taverna [76] is a Java-based workflow tool that exports OPM-compliant models. VisTrails [6] is a Python-based workflow tool. Karma [71] is an uncoupled provenance tool that conforms to the OPM standard in a client/server relationship. Komadu [73] is a W3C PROV-based tool that captures system-level events. Kepler [5] is another environment capture tool that focuses on the execution environment. Swift [85] is a provenance capture system for parallel processing. Sumatra [2] is another Python-based provenance system focusing on numerical simulations and analyses. Provenance Aware Storage System (PASS) [23] is an environmental capture tool that captures all system metadata at execution time.

3.1 REDCap

REDCap is a software toolset and workflow methodology for electronic clinical trial data collection and management [38]. REDCap provides web-based tools for data entry, aiding correct entry using real-time validation rules with automated data type and range checks at the time of entry. The system allows the research teams to create and design online surveys and allows survey owners to engage respondents using various notification methods. REDCap data dictionaries can be distributed for reuse at multiple institutions. A library of data dictionaries is made available for standard data collection forms and validated collection interfaces [57].

These features make REDCap an ideal clinical trial collection tool. We use REDCap as our data management system, intending to secure the data provenance. The existing REDCap system already has some valuable security features. The underlying database is typically hosted in secure data centers at the host institutions with layers of redundancy, failover capability, backups, and extensive security checks. The system is inherently compliant with the Health Insurance Portability and Accountability Act (HIPAA) of 1996. It has several layers of protection, including user/group account management, “Data Access Groups,”⁴ audit trails for all changes, queries, and reports, and Secure Sockets Layer (SSL) encryption. In addition to HIPAA, it can be set up to support other regulatory requirements, including Part 11 of Title 21 of the CFR [7] and FISMA-compliant [28] environments, as needed.

Users can export data in native format for several statistical packages, including SPSS, SAS, Stata, and R, as well as comma-separated values files. REDCap has an Application Programming Interface (API), which allows interoperability with external tools, plugins, and mobile apps. REDCap provides an interface for database and case report form creation, either online via a web-based designer or offline using a “data dictionary” spreadsheet template that can be uploaded later into REDCap. This generalized method for quickly creating clinical trial infrastructure has led to REDCap being used at over 3,000 institutions [8].

REDCap is written in PHP, and it depends on other software, such as MySQL and the underlying HTTP server [74]. The developers note that REDCap’s security depends on this underlying infrastructure, which is known to be vulnerable if improperly configured or not maintained [74]. However, even if the underlying infrastructure is updated, REDCap itself still has documented vulnerabilities. According to the latest REDCap changelog [64], version 9.5.0 (released 12/05/2019) fixed an SQL injection vulnerability that would allow “tech-savvy” users to view any sensitive data they wished. REDCap is an invaluable tool for researchers, but it is not a secure provenance tool. A malicious user could exploit any of these vulnerabilities to falsify REDCap records.

3.2 Blockchains

In 2008, Satoshi Nakamoto proposed Bitcoin, the first cryptocurrency [55]. Nakamoto’s work extended the ideas presented by Haber and Stornetta [34], who presented the first cryptographically secured timestamp audit trail. Bitcoin allows digital currency to be exchanged between participants without trusting a central authority (e.g., a bank or wire service). Bitcoin has two categories of participants: miners and users. Each miner maintains a local copy of a decentralized ledger tracking the balances in all accounts, and any user transferring Bitcoin broadcasts a signed message that causes all miners to update their local ledgers.

The blockchain is the data structure each miner uses to store all of the transactions making up the local ledger. Transactions are grouped together in blocks, and each block contains a cryptographic link to the previous block (hence a “chain” of blocks). This cryptographic link is a hash: the output of a deterministic one-way function that maps an arbitrary blob of data to a value between 0 and \(2^{256} - 1\) that is effectively uniformly distributed. Since the hash is a one-way function, it is computationally hard to determine the original input given the output value. Further, each block is cryptographically signed, ensuring its authenticity and integrity.
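The hash properties described above can be observed directly. The following minimal Python sketch uses only the standard library's `hashlib`; the sample record strings are invented for illustration. It shows that a SHA-256 digest is a value in the range 0 to \(2^{256} - 1\) and that changing a single input character changes the digest in many bit positions.

```python
import hashlib

def sha256_int(data: bytes) -> int:
    """Return the SHA-256 digest of data as an integer in [0, 2**256 - 1]."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

# Two hypothetical record strings that differ in a single character.
a = sha256_int(b"record 17: systolic=120")
b = sha256_int(b"record 17: systolic=121")

# Count how many of the 256 output bits differ between the two digests.
differing_bits = bin(a ^ b).count("1")

print(f"{a:064x}")
print(f"{b:064x}")
print("bits that differ:", differing_bits)
```

Running the function twice on the same input always returns the same value, which is what allows any party to recompute and compare a hash later.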

In Bitcoin, creating a valid block requires a computationally expensive proof-of-work (PoW). This process is called mining. All miners try to guess a random value (nonce) that causes the block’s hash to have specific properties. The first miner to guess this random value broadcasts the solution. The process begins again, and the next block contains a hash of the latest solution. The group’s consensus is the longest blockchain.
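The nonce search described above can be sketched in a few lines of Python. This is a simplified illustration, not Bitcoin's actual block format: the block contents, the 8-byte nonce encoding, the single round of SHA-256, and the 12-bit difficulty are all assumptions chosen so the example finishes quickly.

```python
import hashlib

def mine(block_data: bytes, difficulty_bits: int) -> tuple[int, str]:
    """Try nonces in order until the block hash falls below the target."""
    target = 1 << (256 - difficulty_bits)  # a smaller target means a harder puzzle
    nonce = 0
    while True:
        digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
        nonce += 1

# Hypothetical block contents: the previous block's hash plus two transactions.
nonce, block_hash = mine(b"prev_hash|tx1|tx2", difficulty_bits=12)
print("nonce:", nonce)
print("hash: ", block_hash)  # the first 12 bits (three hex digits) are zero
```

Each additional difficulty bit doubles the expected number of hashes, which is the source of the inefficiency discussed in this section.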

An attacker must create a valid chain that is longer than the current consensus chain to change the consensus. The probability of successfully performing this attack decreases exponentially with each block added to the chain.⁵ Mining ensures that the blockchain is an immutable consensus among all participants, even in the presence of attackers. One caveat is that the PoW mining process is highly inefficient (computationally and economically), limiting the scalability of the PoW-based systems.
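To make the exponential decrease concrete, the sketch below uses the gambler's-ruin result from the Bitcoin whitepaper [55]: an attacker controlling a fraction \(q\) of the mining power, who is \(z\) blocks behind, ever catches up with probability \((q/p)^z\), where \(p = 1 - q\) and \(q < p\). The function name and the sample values are illustrative choices.

```python
def catch_up_probability(q: float, z: int) -> float:
    """Probability that an attacker with hash-power share q ever erases a z-block deficit."""
    p = 1.0 - q
    if q >= p:
        return 1.0  # an attacker with at least half the power eventually catches up
    return (q / p) ** z

# An attacker with 10% of the mining power, at increasing depths.
for z in (1, 6, 12):
    print(z, catch_up_probability(0.10, z))
```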

While the most popular use-case is currency transactions, blockchains can be generalized to any use-case where a distributed data store requires consensus and immutability. There are many applications and supporting technologies [77] following this trend. Blockchains have also been used to store provenance metadata [36]. The immutability property provides strong guarantees of data integrity. The requirement that each block is signed and contains the hash of the previous block ensures signatures are interleaved throughout the blockchain, and each miner’s signature validates the signatures of every previous block.

3.3 Hashes

To formally prove the fundamental properties of the Scrybe Provenance framework, we provide formal definitions for the cryptographic primitives. The SHA-256 hash function is defined in Equation (1) for message M, and it has the following relevant properties [12]:

(1)

\(H_{256}\) is a one-way function.

(2)

\(H_{256}\) is a uniform mapping from the domain to the range.

(3)

For perspective, the current Bitcoin network hash rate⁶ is \(8.5 \times 10^{19}\) hashes per second [10]. At the current rate, it would take \(3.42 \times 10^{49}\) years before the Bitcoin network found a collision for a given hash.

3.4 Digital Signatures

Public key encryption assumes two keys: \(K_{private}\) and \(K_{public}\). The private key is known only to the owner, while the public key is known to everyone. The formal definition for the public key encryption function is provided in Equation (4).

\[\begin{split} &\textrm{Enc}: \lbrace 0,1\rbrace^n \rightarrow \lbrace 0,1\rbrace^n \\ &\textrm{Dec}: \lbrace 0,1\rbrace^n \rightarrow \lbrace 0,1\rbrace^n \end{split} \tag{4}\]

The encryption and decryption functions are asymmetric, as shown in Equation (5). \(C_{public}\) is the ciphertext when the message is encrypted with the public key, and \(C_{private}\) is the ciphertext when the message is encrypted with the private key.

\[\begin{split} &\textrm{Enc}(M,K_{private}) = C_{private} \\ &\textrm{Dec}(C_{private},K_{public}) = M \\ &\textrm{Enc}(M,K_{public}) = C_{public} \\ &\textrm{Dec}(C_{public},K_{private}) = M \end{split} \tag{5}\]

From this definition, we can see that anything encrypted using a private key can be decrypted using the public key. Until recently, RSA was the standard for public-key cryptography [65]. Elliptic curve cryptography leverages different mathematical principles to reduce the overall key size (significantly) at a slight performance cost [37]. Both of these approaches are susceptible to quantum attacks [4], but those are distant concerns.
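The symmetry in Equation (5) can be checked with textbook RSA on deliberately tiny numbers. This is a toy sketch for intuition only: the primes, the exponent, and the message value are illustrative assumptions, and real deployments use large keys with padding.

```python
# Toy RSA parameters (far too small for real use).
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)
k_public = 17                # public exponent, coprime to phi
k_private = pow(k_public, -1, phi)  # modular inverse (Python 3.8+)

def enc(m: int, key: int) -> int:
    """Textbook RSA: modular exponentiation with the given key."""
    return pow(m, key, n)

dec = enc  # decryption is the same operation with the other key

M = 65  # a message encoded as an integer smaller than n

# Encrypt with the private key, decrypt with the public key.
c_private = enc(M, k_private)
assert dec(c_private, k_public) == M

# Encrypt with the public key, decrypt with the private key.
c_public = enc(M, k_public)
assert dec(c_public, k_private) == M
```

The first direction is the one that digital signatures rely on: only the private-key holder can produce a value that the public key maps back to the message.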

3.5 Integrity

Scrybe uses cryptographic signatures to ensure the integrity of data provenance. The formal definition of a cryptographic signature is shown in Equation (6).

\[S(M) = \textrm{Enc}(H_{256}(M), K_{private}) \tag{6}\]

The signature, \(S(M)\), is appended to the message, M, so anyone can verify the message’s integrity. Since the assumption is that only the signatory knows the private key, it is assumed that if the decrypted signature, \(H_{256}(M)^{\prime }\), is the same as the hash of the message, \(H_{256}(M)\), then the message originated from the owner of \(K_{private}\) and the message has not been modified. The transaction containing the hash of the changelog entry is timestamped and cryptographically signed by the researcher, and the block is timestamped and cryptographically signed by the miner. Together, these layers of cryptographic signatures satisfy the integrity property. Using the digital signature standard as documented by the National Institute of Standards and Technology [14] guarantees security and portability.
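The verification step described above can be sketched with textbook RSA as follows. This is an illustration under stated assumptions, not the NIST digital signature standard the text refers to: the key is built from two small Mersenne primes, the hash is reduced modulo the toy modulus, and the function names and sample message are hypothetical.

```python
import hashlib

# Toy key pair built from two Mersenne primes; illustrative only, not secure.
p, q = 2**61 - 1, 2**89 - 1
n, phi = p * q, (p - 1) * (q - 1)
k_public = 65537
k_private = pow(k_public, -1, phi)  # modular inverse (Python 3.8+)

def h256(message: bytes) -> int:
    # Reduced modulo n so the toy key can sign it; real schemes use padding instead.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    """Encrypt the message hash with the private key to form the signature."""
    return pow(h256(message), k_private, n)

def verify(message: bytes, signature: int) -> bool:
    """Decrypt the signature with the public key and compare it to the message hash."""
    return pow(signature, k_public, n) == h256(message)

entry = b"entry 42: weight_kg 70 -> 72"   # hypothetical changelog entry
signature = sign(entry)
print(verify(entry, signature))                              # True
print(verify(b"entry 42: weight_kg 70 -> 75", signature))    # False
```

A modified message produces a different hash, so the decrypted signature no longer matches and the tampering is detected.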

4 SCRYBE PROVENANCE FRAMEWORK

Scrybe is a blockchain-based provenance framework that can be adapted to secure clinical trial metadata. Scrybe was initially developed to secure provenance metadata, so clinical trial audit trails are a natural use case. Scrybe provides the five basic properties of a provenance system listed in Section 3. The framework we propose here uses an uncoupled inversion-based changelog secured by Scrybe’s annotation-based provenance framework.

This section will describe the data structures used in Scrybe and its mining method, keeping in mind the required properties. Scrybe uses a blockchain, since the immutability property of blockchains provides strong integrity guarantees. The replication of the blockchain state among all miners also provides strong availability guarantees. The challenge is to design a system that maintains the other properties. A publicly visible blockchain makes access control difficult, and PoW mining is extremely inefficient. Figure 2 shows the Scrybe architecture.

Fig. 2. Verifying valid REDCap data using the Scrybe provenance framework.

The main components of this architecture are the Scrybe blockchain and the changelog. We store the history of all changes made to institutional database records as entries in a changelog server. A firewall and appropriate access controls can be placed around the changelog, allowing only qualified people to access it. Since access to this server is restricted, patient privacy is protected. As long as the blockchain remains immutable, the changelog’s integrity is guaranteed, and the changelog is a secure audit trail tracking every event in the clinical trial.

The changelog entry’s cryptographic hash is stored in the blockchain to ensure that data integrity and non-repudiation are guaranteed despite a centralized changelog server. Only a hash is stored on the public blockchain, which addresses any concerns over information leakage. Because of the immutability property of the blockchain, these hashes cannot be altered. When an audit is performed, the changelog can be examined, and the changelog hashes can be compared to the hashes stored on the blockchain. If there is a mismatch, then the auditor knows tampering has occurred. In the case of an FDA investigation, the blockchain transactions matching changelog entries that describe each record modification guarantee that the changelog is a trustworthy audit trail.
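The audit comparison described above can be sketched as follows. The entry fields, the JSON canonicalization, and the `chain_hashes` lookup table are assumptions made for illustration; the article does not specify Scrybe's serialization or query interface.

```python
import hashlib
import json

def entry_hash(entry: dict) -> str:
    """Hash a changelog entry using a canonical JSON encoding (assumed format)."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def audit(changelog: list[dict], chain_hashes: dict[int, str]) -> list[int]:
    """Return the IDs of changelog entries whose hash does not match the blockchain."""
    return [e["id"] for e in changelog if chain_hashes.get(e["id"]) != entry_hash(e)]

# Hypothetical changelog and the hashes recorded on-chain when each entry was created.
changelog = [
    {"id": 1, "record": "A-001", "change": "systolic=120"},
    {"id": 2, "record": "A-001", "change": "systolic=135"},
]
chain_hashes = {e["id"]: entry_hash(e) for e in changelog}

print(audit(changelog, chain_hashes))       # no mismatches yet
changelog[1]["change"] = "systolic=118"     # someone edits the changelog afterwards
print(audit(changelog, chain_hashes))       # entry 2 is flagged
```

An entry that is missing from the blockchain is flagged in the same way as an altered one, since its lookup returns no hash.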

Each changelog entry is recorded on the blockchain as a transaction, and transactions are grouped in blocks. These blocks constitute the blockchain data structure.

4.1 Changelog Entry

A changelog entry describes a single change made to the secure REDCap database. These entries are not a part of Scrybe—they are an entirely separate primitive used to extend REDCap (or any secure database) with additional provenance functionality. Whenever a user performs any addition, deletion, or modification, an entry is created and stored in a changelog server. The changelog server is a sequence of these entries. Each entry has an associated ID that increases sequentially with each new entry added to the log. The entry contains a modification field describing the change made to the database. The entry is signed to guarantee non-repudiation. Changelog entries can be applied sequentially (up to the most recent changelog entry) to an empty database to construct the current database. The changelog is stored locally by the institution and is not a part of Scrybe. Keeping the changelog server behind an institutional firewall and access control list alongside the REDCap database used for the clinical trial will ensure HIPAA compliance, since only authorized personnel can view the data. When a changelog event occurs, the changelog server submits a transaction to the Scrybe blockchain containing a hash of the changelog entry and non-identifying metadata, such as date, time, and trial ID.
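The sequential replay described above can be sketched as follows. The `ChangelogEntry` fields and the three action names are hypothetical; the article does not give the actual entry schema, and the signature field is omitted for brevity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ChangelogEntry:
    entry_id: int    # increases sequentially with each new entry
    action: str      # "add", "modify", or "delete" (assumed vocabulary)
    record_id: str
    fields: dict     # the modification applied to the record

def replay(entries: list) -> dict:
    """Apply entries in ID order to an empty database and return the result."""
    database = {}
    ordered = sorted(entries, key=lambda e: e.entry_id)
    for expected_id, entry in enumerate(ordered, start=1):
        if entry.entry_id != expected_id:
            raise ValueError(f"missing or duplicate changelog entry near ID {expected_id}")
        if entry.action == "add":
            database[entry.record_id] = dict(entry.fields)
        elif entry.action == "modify":
            database[entry.record_id].update(entry.fields)
        elif entry.action == "delete":
            del database[entry.record_id]
        else:
            raise ValueError(f"unknown action: {entry.action}")
    return database

log = [
    ChangelogEntry(1, "add", "P-01", {"weight_kg": 70}),
    ChangelogEntry(2, "modify", "P-01", {"weight_kg": 72}),
    ChangelogEntry(3, "add", "P-02", {"weight_kg": 65}),
    ChangelogEntry(4, "delete", "P-02", {}),
]
print(replay(log))
```

Checking that the IDs form an unbroken sequence means a silently removed entry is noticed during replay rather than producing a plausible but wrong database.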

4.2 Transactions

Scrybe transactions contain whole pieces of provenance metadata. Any correlations to other pieces of metadata must be performed at a higher level outside the system. In the case of clinical trials that are performed with an institutional database, the changelog entries are secured by transactions. Whenever a change is made to a record, an entry is stored in the changelog server. Then, a transaction with the changelog hash is broadcast to the blockchain miners. This transaction contains a hash of the created entry, the time, and the entry’s ID. The hash is used instead of the actual entry to ensure HIPAA compliance.

When a transaction is created, it is cryptographically signed by its creator, the changelog server. The signature guarantees that if a transaction exists with a particular timestamp, entry hash, and valid signature, the signatory modified the database in the manner described by that particular entry. The transaction is the foundational building block of secure provenance.
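A transaction of the kind described above might be assembled as in the sketch below. The field names, the trial identifier, and the JSON encoding are assumptions. To keep the example within the standard library, an HMAC with a demo key stands in for the changelog server's public-key signature; it is a placeholder, not the signature scheme Scrybe uses.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone

SERVER_KEY = b"demo-key-not-for-production"  # hypothetical stand-in for the signing key

def make_transaction(entry: dict, trial_id: str) -> dict:
    """Build a transaction that carries only a hash and non-identifying metadata."""
    canonical = json.dumps(entry, sort_keys=True).encode("utf-8")
    transaction = {
        "entry_id": entry["id"],
        "entry_hash": hashlib.sha256(canonical).hexdigest(),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "trial_id": trial_id,
    }
    payload = json.dumps(transaction, sort_keys=True).encode("utf-8")
    # Placeholder for the changelog server's digital signature over the transaction.
    transaction["signature"] = hmac.new(SERVER_KEY, payload, hashlib.sha256).hexdigest()
    return transaction

entry = {"id": 7, "record": "P-01", "change": "weight_kg=72"}
tx = make_transaction(entry, trial_id="TRIAL-42")
print(tx)
```

Nothing from the entry body appears in the transaction, which is the point of broadcasting the hash rather than the entry itself.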

4.3 Blocks

A block contains a group of transactions, the hash of the previous block, and a record of the mining process. Since each block contains the previous block’s hash, an attacker who modifies a block must also regenerate every block that follows it. Since the probability of producing a forged block is negligible, the blockchain can be considered immutable. The block size is determined by the volume of transactions and the time required to mine a block. The time between blocks is adjusted to maintain a reasonable interval for transactions to accumulate. Each block contains a signature of the miner that generated the block. The signature provides an added layer of security for transactions on the blockchain: the block itself contains a signed timestamp, and each transaction also contains a signed timestamp. Together, these attributes provide interleaved trust, allowing us to trust the timestamp on the changelog entry. Authorized participants can view the transaction, but no one can change it or deny its existence.
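The hash linking between blocks can be illustrated with a short sketch. The block layout and JSON encoding are assumptions, and miner signatures and timestamps are left out so that only the chaining is shown.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's full contents using a canonical JSON encoding (assumed format)."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode("utf-8")).hexdigest()

def append_block(chain: list, transactions: list) -> None:
    """Append a block that stores the hash of the block before it."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "transactions": transactions})

def first_broken_link(chain: list):
    """Return the index of the first block whose prev_hash no longer matches, or None."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None

chain = []
for txs in (["hash-1", "hash-2"], ["hash-3"], ["hash-4", "hash-5"]):
    append_block(chain, txs)

print(first_broken_link(chain))            # None: the chain is intact
chain[0]["transactions"][0] = "forged"     # alter a transaction in the first block
print(first_broken_link(chain))            # 1: the next block's link no longer matches
```

To hide the change, the attacker would have to recompute the stored hash in block 1, which changes block 1's own hash and breaks block 2, and so on to the end of the chain.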

4.4 Mining

Miners generate the blocks in a blockchain and broadcast them to the rest of the miners. Traditional PoW mining is resource-intensive, requiring the constant generation of hashes until a miner solves a cryptographic puzzle. Proof-of-Stake (PoS) systems require that miners stake a sizable amount of currency that is forfeited if malfeasance is found. While PoS scales better and provides a more economical approach, it requires a native currency and poses centralization risks if a particular miner controls a majority of the currency.

Scrybe is a permissioned blockchain: only authorized miners can generate blocks. Each block is signed, and any block signed by an unauthorized miner is immediately discarded. This lightweight mining approach differs from traditional PoW and PoS consensus algorithms [20]. A miner is randomly selected from the pool of authorized miners, and that miner is delegated to produce the next block. In practice, miners consist of various companies, research institutions, and regulatory agencies. A detailed description and rigorous proof of Scrybe’s consensus algorithm are given in References [19, 78]. Since Scrybe is a dedicated provenance blockchain, there is no underlying currency that could introduce volatility or cause the underlying technology to become obsolete. Scrybe is not based on PoW or PoS, so there are no concerns with environmental impact or greedy nodes. No leader or initial group of nodes is elected as in PBFT-based algorithms, ensuring the process is truly decentralized. Scrybe leverages the advantages of a permissioned blockchain with a secure and scalable algorithm. The complexity is \(O(n)\) as the number of nodes \(n\) approaches infinity [19].
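The two ideas in this paragraph, discarding blocks from unauthorized miners and delegating block production to one member of the authorized pool, can be illustrated with the toy Python below. It is not the lightweight mining protocol of References [19, 78]: the miner identifiers are hypothetical, signature verification is reduced to a membership check on a claimed ID, and the selection rule (a hash of an assumed shared seed and the block height) is a simple substitute chosen so that every node computes the same answer.

```python
import hashlib

# Hypothetical identifiers for the authorized miner pool.
AUTHORIZED_MINERS = ("fda-node", "musc-node", "clemson-node")


def accept_block(block: dict) -> bool:
    """Discard any block whose claimed miner is not in the authorized pool.

    A real node would verify the block signature against the miner's
    registered public key; that step is omitted in this sketch.
    """
    return block.get("miner_id") in AUTHORIZED_MINERS


def select_miner(shared_seed: bytes, height: int) -> str:
    """Pick the producer of the block at `height` from a seed all nodes share."""
    digest = hashlib.sha256(shared_seed + height.to_bytes(8, "big")).digest()
    return AUTHORIZED_MINERS[int.from_bytes(digest, "big") % len(AUTHORIZED_MINERS)]


producer = select_miner(b"example-shared-seed", height=7)
accepted = accept_block({"miner_id": producer, "transactions": []})
rejected = accept_block({"miner_id": "unknown-node", "transactions": []})
```

Only the selected miner does work for a given block, which is the property that distinguishes this style of mining from PoW, where every miner hashes continuously.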

5 REDCAP SECURED WITH SCRYBE

A proof-of-concept prototype was created to test the ideas behind the Scrybe framework. The core of the prototype is Scrybe, which was implemented in C++. All data structures are stored as serialized strings in a cached database for fast access, and peer-to-peer communication happens asynchronously. We used REDCap as our institutional database for storing clinical trial data. The changelog server that secures REDCap communicates with a Scrybe client to submit transactions. The changelog server also includes tools for browsing the changelog database and verifying the integrity of the entries it contains by comparing their hashes with those stored on the blockchain.

For this application, an interface was created to input data into the system. REDCap exposes an API of HTTP POST requests that allows for data import and export [21]. A Python script was written that allows clients to input data. The data is then simultaneously imported into REDCap through its API and uploaded to the changelog, and transactions securing the entries are submitted to the Scrybe miners. This interface is currently a standalone command-line interface, but REDCap has a tool for creating data entry interfaces. Future work should consider a daemon that monitors the REDCap database and automatically creates changelog entries whenever changes occur. Automatic changelog generation would make the provenance backend invisible to the end-user, allowing for seamless integration.
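A record import through the REDCap API might look like the following sketch, which only constructs the HTTP POST request and does not send it. The URL, token, and record fields are placeholders, and the form parameters (`token`, `content`, `format`, `type`, `data`) follow the commonly documented REDCap record-import call; they should be checked against the API documentation of the REDCap instance in use. This is not the authors' interface script.

```python
import json
import urllib.parse
import urllib.request


def build_import_request(api_url: str, token: str, records: list) -> urllib.request.Request:
    """Build (but do not send) a form-encoded POST that imports records."""
    fields = {
        "token": token,         # per-user API token issued by REDCap
        "content": "record",    # operate on records
        "format": "json",       # the payload in `data` is JSON
        "type": "flat",         # one row per record
        "data": json.dumps(records),
    }
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(api_url, data=body, method="POST")


request = build_import_request(
    "https://redcap.example.org/api/",            # placeholder URL
    "PLACEHOLDER_TOKEN",                          # placeholder token
    [{"record_id": "1001", "blood_type": "O-"}],  # hypothetical fields
)
# Passing `request` to urllib.request.urlopen would perform the import.
```

In the workflow described above, the same records would also be written to the changelog, and a transaction holding the hash of each changelog entry would be submitted to the Scrybe miners.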

We present the use-cases shown below to illustrate the properties outlined in Table 1. To satisfy Reference [7], we must address integrity, authentication, availability, access control, and non-repudiation. These use-cases introduce several new concepts. The Scrybe provenance consortium is a group of independent institutions using the Scrybe provenance framework to secure their respective data. The consortium is most robust when the independent institutions each have a Scrybe node and are unlikely to collude. The auditor represents any authorized individual who wishes to validate data. The researcher is an individual authorized to store data in the Scrybe provenance framework. In these use cases, the attacker is an unauthorized individual who is making malicious changes. These use-cases require public-key cryptography, and there are existing solutions and best practices for enterprise key management.

5.1 Integrity

Consider the standard use-case, shown in Figure 2. A researcher uploads data to REDCap using the Scrybe provenance framework. The Scrybe provenance framework breaks the data into a series of incremental changes that can be applied to the REDCap database. The metadata for each change is used to create a transaction, and this transaction is signed using the researcher’s private key (TXN Signature). This transaction only contains publicly available metadata and the hash of the changelog entry. Simultaneously, each incremental change is also signed using the researcher’s private key (REDCap Signature). The Scrybe framework stores the signed changes in a changelog,⁷ allowing the current state of the database to be reconstructed. Then, the Scrybe transaction is submitted to the miners, where it is incorporated into the blockchain. Finally, the data is uploaded to REDCap.

When an auditor verifies the data stored in REDCap, the first step is to download a copy of the REDCap data and the corresponding Scrybe transactions. The transaction signature is used to verify the integrity of the metadata in the transaction. Next, the REDCap signature is compared to the REDCap signature stored in the Scrybe transaction. Once this signature is verified, it is used to verify the integrity of the REDCap entry. With this verification complete, the auditor can reconstruct the complete history using the changelog. Auditors may elect to use software tools to identify anomalies, such as conflicting changes, which may indicate intentional malfeasance on the researcher’s part.
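Reconstructing the database state from incremental changes, and flagging conflicting changes along the way, can be sketched as follows. The change format (`field`, `old`, `new`) is an assumption for illustration, signatures are left out, and `None` is used as a hypothetical marker for an absent field.

```python
def diff_record(old: dict, new: dict) -> list:
    """List the field-level changes that turn `old` into `new`."""
    changes = []
    for field in sorted(set(old) | set(new)):
        if old.get(field) != new.get(field):
            changes.append({"field": field, "old": old.get(field), "new": new.get(field)})
    return changes


def replay(changes: list) -> tuple:
    """Apply changes in order; return the final state and indices of conflicts.

    A change conflicts when its recorded `old` value does not match the
    state produced by the changes before it.
    """
    state, conflicts = {}, []
    for index, change in enumerate(changes):
        if state.get(change["field"]) != change["old"]:
            conflicts.append(index)
        if change["new"] is None:
            state.pop(change["field"], None)
        else:
            state[change["field"]] = change["new"]
    return state, conflicts


v1 = {"blood_type": "A+", "age": 54}
v2 = {"blood_type": "O-", "age": 54}
changelog = diff_record({}, v1) + diff_record(v1, v2)
state, conflicts = replay(changelog)

# A change that claims a previous value the history never contained.
suspicious = changelog + [{"field": "age", "old": 60, "new": 61}]
_, suspicious_conflicts = replay(suspicious)
```

An auditing tool built along these lines would report the conflicting index for manual review rather than decide on its own whether malfeasance occurred.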

Consider a malicious user modifying data in the REDCap database without the proper approval. This is shown in Figure 3. The attacker modifies the blood type of a patient. There are two cases: (1) the attacker does not attempt to forge the REDCap signature, and (2) the attacker is a researcher who maliciously updates the database to include false information using the Scrybe framework. The latter case can be addressed by software that scans the changelog and identifies conflicting or anomalous behavior. In the former case (shown in Figure 3), the auditor downloads a copy of the REDCap data and verifies the signature. The signature is calculated using all of the fields stored in REDCap. Section 3.5 provided a proof showing that if any of the data changes, then the change is detected with the signature. In this scenario, the auditor cannot verify the REDCap signature, showing that the data in REDCap was altered.

Fig. 3. Detecting altered REDCap data using the Scrybe provenance framework.

The final scenario, shown in Figure 4, addresses malicious changes made to the Scrybe blockchain. The prerequisite is that an attacker compromises every Scrybe node. This compromise would include Scrybe nodes hosted at private, public, and federal institutions (e.g., the FDA, DHEC, and CDC). By compromising these institutions, the attacker gains access to the private keys the Scrybe nodes use to sign blocks. The attacker must also compromise the researcher who signed the transaction they wish to modify, allowing the attacker to forge the REDCap signature and the transaction signature. With the forged transaction signature, the attacker can reconstruct the entire blockchain using forged signatures. To our knowledge, there are no security countermeasures that can handle all of the nodes being compromised.

Fig. 4. Detecting a total system compromise.

The general provenance requirements were distilled from Table 1. Scrybe satisfies each of these requirements to provide a secure framework for data provenance.

5.2 Availability

There are two aspects of availability: the original data’s availability and the audit trail’s availability. Scrybe only stores the audit trail, and storing the actual clinical trial data is outside the scope of Scrybe. Since Scrybe is a distributed ledger, there are redundant copies stored at various sites. The Scrybe use-case recommends that each institution and regulatory oversight entity host a Scrybe node. Each of these nodes contains a copy of the blockchain. Further, it is recommended that the Scrybe miners be hosted on a server with modern storage redundancy features, such as RAID, which allows data to be recovered in the event of multiple hard drive failures. With these considerations, the bare minimum availability provided by Scrybe matches the current level of availability. Beyond that minimum, these features ensure that a valid version of the provenance proof will be available in almost every failure or tampering event.

5.3 Authentication and Access Control

Scrybe is a distributed blockchain and has no central database to control. Instead, the distributed mining process determines what information is added to the blockchain. There is no need to assign privileges to modify on-chain data. Instead, each researcher has a public-private keypair that is publicly registered with the blockchain. Each researcher can only submit transactions when they are signed with a valid key, and these transactions are incorporated into the blockchain by a miner with a valid key. This signature ensures that only authorized researchers create transactions on the blockchain.

Assigning rights to modify off-chain data is handled by the institutional database itself. This access control is not affected by our approach and can be done using existing methods. Care must be taken to ensure that all off-chain data modification is linked with an on-chain record. Since only hashes of the changelog entry are stored on the blockchain, there is no need to restrict read access to the transactions. In addition to simplifying the system, this guarantees HIPAA compliance.

5.4 Efficiency

Section 2 discusses the approaches that existing solutions in this space employ. Most popular blockchains, such as Ethereum and Bitcoin, use a PoW mining algorithm. Even permissioned Ethereum blockchains still currently use PoW. Many provenance blockchains leverage existing technology, such as Bitcoin and Ethereum. These mining algorithms are designed to consume all available resources wherever they run. Using Bitcoin or Ethereum as the basis of a provenance blockchain is economically inefficient and morally irresponsible. Beyond this limitation, there are also issues with scalability and volatility. Other solutions, such as Hyperledger’s built-in consensus algorithms, face challenges with scalability and decentralization. There is also a set of solutions that employ novel blockchain designs. As discussed in Section 2, many of these solutions have scalability and security issues.

Scrybe uses a permissioned blockchain that is built with a novel lightweight mining algorithm [20]. Each miner only expends energy communicating and mining when selected through a non-deterministic algorithm [20]. By exchanging signed messages containing all the received transactions, nodes can validate the block published by the selected node. As part of the message exchange, a new node is chosen to produce the next block. The algorithm is described in detail in References [19, 78]. This algorithm was proven to have complexity \(O(n)\) as n approaches infinity [19]. Scrybe is an efficient alternative consensus algorithm that is capable of registering provenance for multiple clinical trials.

5.5 Clinical Trial Data Validation

For experimental purposes, the public-use National Longitudinal Mortality Study (NLMS) dataset was acquired from the National Institutes of Health’s Biologic Specimen and Data Repository Information Coordinating Center [3]. This large-scale dataset relates mortality to many lifestyle factors, such as age, location, or substance use. A REDCap instance at the Medical University of South Carolina was used to store the data. REDCap instruments were created for this dataset, allowing the data to be input and processed. An example REDCap input interface is shown in Figure 5.

Fig. 5. REDCap data input instrument for the NLMS dataset.

The interface script was used to submit the NLMS data in CSV format to REDCap and the changelog server. The data can be viewed by exporting individual entries in the Scrybe command-line interface. The changelog server’s pull option can be used to download the entire changelog and the entire blockchain to conduct an item-by-item comparison to audit integrity. If there is an entry in the changelog with no corresponding transaction in Scrybe, then a warning is given. An error is raised if an entry’s value does not match the hash in the corresponding transaction. Similarly, if a transaction is modified on the local blockchain database, then anyone can verify that the transaction is invalid due to the invalid “previous block” hash in the following block. If a recent transaction or set of transactions is missing, then synchronization with the rest of the miners resolves the issue.
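The item-by-item comparison described above can be sketched as a small audit routine. The data layout (a dictionary of changelog entries keyed by entry ID, and a dictionary of on-chain hashes keyed the same way) and the hashing of a canonical JSON encoding are assumptions for illustration; the actual changelog server tool may differ.

```python
import hashlib
import json


def entry_hash(entry: dict) -> str:
    """SHA-256 over a canonical JSON encoding of a changelog entry."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def audit(changelog: dict, on_chain_hashes: dict) -> dict:
    """Compare each changelog entry with the hash recorded for its ID on the chain."""
    report = {"ok": [], "warnings": [], "errors": []}
    for entry_id, entry in sorted(changelog.items()):
        recorded = on_chain_hashes.get(entry_id)
        if recorded is None:
            report["warnings"].append(entry_id)  # no transaction secures this entry
        elif recorded != entry_hash(entry):
            report["errors"].append(entry_id)    # entry does not match its recorded hash
        else:
            report["ok"].append(entry_id)
    return report


# Hypothetical entries: one intact, one mismatched, one without a transaction.
changelog = {
    "entry-1": {"field": "blood_type", "new": "O-"},
    "entry-2": {"field": "age", "new": 54},
    "entry-3": {"field": "smoker", "new": "no"},
}
on_chain = {
    "entry-1": entry_hash(changelog["entry-1"]),
    "entry-2": entry_hash({"field": "age", "new": 45}),  # simulated mismatch
}
report = audit(changelog, on_chain)
```

The warning and error lists correspond to the two outcomes named in the text: an entry with no transaction, and an entry whose value does not match the hash in its transaction.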

Figure 6 shows an example of a successful comparison between the blockchain and the changelog. Every changelog entry hash was stored in a block transaction. The blockchain was then manually corrupted so that the hash of an entry would no longer match a valid changelog entry. This corruption was identified, as shown in Figure 7, validating the integrity of the changelog.

Fig. 6. Successful comparison of changelog (datastore) and blockchain.

Fig. 7. On a corrupted local instance, the audit failed.

6 DISCUSSION

Part 11 of Title 21 of the Code of Federal Regulations [7] and ISO 27789 [40] require that researchers guarantee the authenticity, integrity, and confidentiality of data collected for clinical trials. Clinical trials are one of the most important scientific research mechanisms for advancing human health, and the FDA closely regulates them in the United States of America. The increased use of smart and wearable IoT devices in clinical trials presents a unique challenge that computerized data management in EDC and CDM systems has not yet adequately addressed.

We propose Scrybe, a permissioned blockchain, as a method of storing proof of data provenance. Scrybe uses a lightweight mining algorithm that is more efficient and economical than popular proof-of-work algorithms (e.g., those used by Ethereum and Bitcoin). Many existing solutions based on popular cryptocurrencies are subject to additional overhead and volatility. These solutions are also tied to the cryptocurrency’s future success (or failure). Scrybe is more decentralized than consensus algorithms based on PBFT, which have become popular in the provenance blockchain space. By using a distributed consensus among competitors, Scrybe ensures immutability. Considering the requirements outlined in Reference [7], we demonstrate how Scrybe addresses each of the relevant controls. A proof-of-concept integration with REDCap is used to show tamper resistance. The REDCap-Scrybe provenance framework allows researchers to track the provenance of any clinical trial data collected by smart devices.

Future work will include further integration with REDCap and trial runs on more datasets. The Scrybe transaction process will be integrated as a separate daemon that monitors the REDCap database, automatically generates changelog entries, and submits a transaction whenever changes are detected, providing seamless integration with existing EDC systems. As discussed in Section 2, Hyperledger offers pluggable consensus algorithms. Implementing Scrybe as a pluggable consensus algorithm within the Hyperledger framework would build on existing technology with a strong community. Scrybe’s application is not limited to tracking clinical trial provenance. There are other projects currently leveraging this technology. A future version of Scrybe will include smart-contract functionality to provide researchers with additional features and provenance security.

ACKNOWLEDGMENT

The authors gratefully acknowledge this support and take responsibility for the contents of this report. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the National Institutes of Health, the National Science Foundation, or the U.S. Government.

Footnotes

¹ MedicalChain also relies on the Ethereum blockchain.
² Attribution implies that changes can be attributed to a given individual or event.
³ Information implies that additional data can be derived from the provenance.
⁴ Data Access Groups allow data to be entered by multiple groups in one database with segmented user rights for entered data.
⁵ The exception to this occurs when the attacker controls more than 50% of all computing power in the network.
⁶ The network hash rate is the combined hash rate of all miners currently working to solve the next block.
⁷ The changelog is stored on a secure server operated by the institution.

REFERENCES

[1] Hyperledger Foundation. 2017. Introduction to hyperledger business blockchain design philosophy and consensus. 1 (2017).
[2] NeuralEnsemble. 2013. Sumatra.
[3] Sorlie Paul D., Backlund Eric, and Keller Jacob B. 2015. US mortality by economic, demographic, and social characteristics: the National Longitudinal Mortality Study. American Journal of Public Health 85, 7 (1995), 949–956.
[4] NSA. 2015. NSA Suite B Cryptography - NSA/CSS.
[5] Kepler. 2016. The Kepler Project.
[6] VisTrails. 2016. VisTrails Documentation.
[7] U.S. Food and Drug Administration. 2018. Code of Federal Regulations Title 21.
[8] REDCap. 2018. Project REDCap.
[9] ClinicalTrials.gov. 2018. Trends, Charts, and Maps.
[10] Blockchain. 2019. Blockchain.
[11] Omar Abdullah Al, Rahman Mohammad Shahriar, Basu Anirban, and Kiyomoto Shinsaku. 2017. MediBchain: A blockchain based privacy preserving platform for healthcare data. In International Conference on Security, Privacy and Anonymity in Computation, Communication and Storage. Springer, 534–543.
[12] Appel Andrew W. 2015. Verification of a cryptographic primitive: SHA-256. ACM Transactions on Programming Languages and Systems (TOPLAS) 37, 2 (2015), 1–31.
[13] Azaria Asaph, Ekblaw Ariel, Vieira Thiago, and Lippman Andrew. 2016. MedRec: Using blockchain for medical data access and permission management. In 2nd International Conference on Open and Big Data (OBD). IEEE, 25–30.
[14] Barker Elaine B. 2009. Digital signature standard (DSS). Technical Report. NIST.
[15] Bart Thomas. 2003. Comparison of Electronic Data Capture with Paper Data Collection-is there really an advantage. Bus Brief Pharmatech (2003), 1–4.
[16] Benchoufi Mehdi, Porcher Raphael, and Ravaud Philippe. 2017. Blockchain protocols in clinical trials: Transparency and traceability of consent. F1000Research 6 (2017).
[17] Benchoufi Mehdi and Ravaud Philippe. 2017. Blockchain technology for improving clinical research quality. Trials 18, 1 (2017), 1–5.
[18] Bessani Alysson, Sousa Joao, and Alchieri Eduardo E. P. 2014. State machine replication for the masses with BFT-SMART. In 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks. IEEE, 355–362.
[19] Bhat Nazzira, Altaranaweh Amani, Yu Lu, Skjellum Tony, and Brooks Richard R. Lightweight mining (LWM): A secure and efficient distributed ledger consensus protocol. In Preparation.
[20] Brooks Richard R., Wang K., Yu Lu, Oakley Jon, Skjellum Anthony, Obeid Jihad S., Lenert Leslie, and Worley Carl. 2018. Scrybe: A Blockchain ledger for clinical trials. In IEEE Blockchain in Clinical Trials Forum: Whiteboard Challenge Winner. Retrieved from https://blockchain.ieee.org/images/files/images/clinicaltrialsforum-2018/Clemson_WhitePaper.pdf.
[21] Burns Scott. 2013. Intro to the REDCap API. Retrieved from http://sburns.org/2013/07/22/intro-to-redcap-api.html.
[22] BurstIQ. 2017. Bringing Health to Life.
[23] Carata Lucian, Akoush Sherif, Balakrishnan Nikilesh, Bytheway Thomas, Sohan Ripduman, Seltzer Margo, and Hopper Andy. 2014. A primer on provenance. Queue 12, 3 (Mar. 2014).
[24] Castro Miguel and Liskov Barbara. 2002. Practical byzantine fault tolerance and proactive recovery. ACM Trans. Comput. Syst. 20, 4 (2002), 398–461.
[25] Open Science Collaboration et al. 2015. Estimating the reproducibility of psychological science. Science 349, 6251 (2015), aac4716.
[26] Duan Sisi, Meling Hein, Peisert Sean, and Zhang Haibin. 2014. BChain: Byzantine replication with high throughput and embedded reconfiguration. In International Conference on Principles of Distributed Systems. Springer, 91–106.
[27] Ekblaw Ariel, Azaria Asaph, Halamka John D., and Lippman Andrew. 2016. A case study for blockchain in healthcare: “MedRec” prototype for electronic health records and medical research data. In IEEE Open & Big Data Conference, Vol. 13.
[28] Gantz Stephen D. and Philpott Daniel R. 2012. FISMA and the Risk Management Framework: The New Practice of Federal Cyber Security. Newnes.
[29] Gil Yolanda, Miles Simon, Belhajjame Khalid, Deus Helena, Garijo Daniel, Klyne Graham, Missier Paolo, Soiland-Reyes Stian, and Zednik Stephen. 2012. PROV Model Primer. Technical Report. W3C. Retrieved from http://www.w3.org/TR/prov-primer/.
[30] Gilad Yossi, Hemo Rotem, Micali Silvio, Vlachos Georgios, and Zeldovich Nickolai. 2017. Algorand: Scaling byzantine agreements for cryptocurrencies. In 26th Symposium on Operating Systems Principles. 51–68.
[31] Glavic Boris and Dittrich Klaus R. 2007. Data provenance: A categorization of existing approaches. In 12th GI Conference on Datenbanksysteme in Business, Technologie und Web (BTW). 227–241. Retrieved from http://cs.iit.edu/%7edbgroup/pdfpubls/GD07.pdf.
[32] Goble Carole. 2002. Position statement: Musings on provenance, workflow and (semantic web) annotations for bioinformatics. In Workshop on Data Derivation and Provenance (Chicago).
[33] Gueta Guy Golan, Abraham Ittai, Grossman Shelly, Malkhi Dahlia, Pinkas Benny, Reiter Michael, Seredinschi Dragos-Adrian, Tamir Orr, and Tomescu Alin. 2019. SBFT: A scalable and decentralized trust infrastructure. In 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). IEEE, 568–580.
[34] Haber Stuart and Stornetta W. Scott. 1990. How to time-stamp a digital document. In Conference on the Theory and Application of Cryptography. Springer, 437–455.
[35] Hambolu Oluwakemi, Yu Lu, Oakley Jon, Brooks Richard R., Mukhopadhyay Ujan, and Skjellum Anthony. 2016. Provenance threat modeling. In 14th Annual Conference on Privacy, Security and Trust (PST). IEEE, 384–387.
[36] Hambolu Oluwakemi, Yu Lu, Oakley Jon, Brooks Richard R., Mukhopadhyay Ujan, and Skjellum Anthony. 2017. Provenance threat modeling. Retrieved from http://arxiv.org/abs/1703.03835.
[37] Hankerson Darrel and Menezes Alfred. 2011. Elliptic Curve Cryptography. Springer.
[38] Harris Paul A., Taylor Robert, Thielke Robert, Payne Jonathon, Gonzalez Nathaniel, and Conde Jose G. 2009. Research electronic data capture (REDCap): A metadata-driven methodology and workflow process for providing translational research informatics support. J. Biomed. Inform. 42, 2 (2009), 377–381.
[39] Irving Greg and Holden John. 2016. How blockchain-timestamped protocols could improve the trustworthiness of medical science. F1000Research 5 (2016).
[40] ISO/TC 215 Health Informatics. 2013. ISO 27789:2013 Health Informatics – Audit Trails for Electronic Health Records. Retrieved from https://www.iso.org/standard/44315.html.
[41] Juneja Amit and Marefat Michael. 2018. Leveraging blockchain for retraining deep learning architecture in patient-specific arrhythmia classification. In IEEE EMBS International Conference on Biomedical & Health Informatics (BHI). IEEE, 393–397.
[42] Kang Minhee, Park Eunkyoung, Cho Baek Hwan, and Lee Kyu-Sung. 2018. Recent patient health monitoring platforms incorporating internet of things-enabled smart devices. Int. Neurol. J. 22, Suppl 2 (2018), S76.
[43] Koshy Anoop N., Sajeev Jithin K., Nerlekar Nitesh, Brown Adam J., Rajakariar Kevin, Zureik Mark, Wong Michael C., Roberts Louise, Street Maryann, Cooke Jennifer, et al. 2018. Smart watches for heart rate assessment in atrial arrhythmias. Int. J. Cardiol. 266 (2018), 124–127.
[44] Kreps Jay, Narkhede Neha, Rao Jun, et al. 2011. Kafka: A distributed messaging system for log processing. In Proceedings of the NetDB, Vol. 11. 1–7.
[45] Linn Laure A. and Koo Martha B. 2016. Blockchain for health data and its potential use in health IT and health care related research. In ONC/NIST Use of Blockchain for Healthcare and Research Workshop. 1–10.
[46] Liu Xiaoxue, Ma Wenping, and Cao Hao. 2019. MBPA: A MediBchain-based privacy-preserving mutual authentication in TMIS for mobile medical cloud architecture. IEEE Access 7 (2019), 149282–149298.
[47] López-Blanco Roberto, Velasco Miguel A., Méndez-Guerrero Antonio, Romero Juan Pablo, Castillo María Dolores Del, Serrano J. Ignacio, Rocon Eduardo, and Benito-León Julián. 2019. Smartwatch for the analysis of rest tremor in patients with Parkinson’s disease. J. Neurol. Sci. 401 (2019), 37–42.
[48] Lu Tsung-Chien, Fu Chia-Ming, Ma Matthew Huei-Ming, Fang Cheng-Chung, and Turner Anne M. 2016. Healthcare applications of smart watches: A systematic review. Appl. Clin. Inform. 7, 3 (2016), 850.
[49] Mamoshina Polina, Ojomoko Lucy, Yanovich Yury, Ostrovski Alex, Botezatu Alex, Prikhodko Pavel, Izumchenko Eugene, Aliper Alexander, Romantsov Konstantin, Zhebrak Alexander, et al. 2018. Converging blockchain and next-generation artificial intelligence technologies to decentralize and accelerate biomedical research and healthcare. Oncotarget 9, 5 (2018), 5665.
[50] Medicalchain. 2018. Medicalchain: A blockchain for electronic health records.
[51] Miller Andrew, Xia Yu, Croman Kyle, Shi Elaine, and Song Dawn. 2016. The honey badger of BFT protocols. In ACM SIGSAC Conference on Computer and Communications Security. 31–42.
[52] Moreau Luc, Freire Juliana, Futrelle Joe, McGrath Robert, Myers Jim, and Paulson Patrick. 2008. The open provenance model: An overview. 323–326.
[53] Murdock Kit, Oswald David, Garcia Flavio D., Bulck Jo Van, Gruss Daniel, and Piessens Frank. 2020. Plundervolt: Software-based fault injection attacks against Intel SGX. In 41st IEEE Symposium on Security and Privacy (S&P’20).
[54] Mytis-Gkometh P., Drosatos G., Efraimidis P. S., and Kaldoudi E. 2017. Notarization of knowledge retrieval from biomedical repositories using blockchain technology. In International Conference on Biomedical and Health Informatics. Springer, 69–73.
[55] Nakamoto Satoshi. 2008. Bitcoin: A peer-to-peer electronic cash system. bitcoin.org. URL: https://bitcoin.org/bitcoin.pdf (accessed: 21.05.2019) (2008).
[56] Nugent Timothy, Upton David, and Cimpoesu Mihai. 2016. Improving data transparency in clinical trials using blockchain smart contracts. F1000Research 5 (2016).
[57] Obeid Jihad S., McGraw Catherine A., Minor Brenda L., Conde José G., Pawluk Robert, Lin Michael, Wang Janey, Banks Sean R., Hemphill Sheree A., Taylor Rob, et al. 2013. Procurement of shared data instruments for research electronic data capture (REDCap). Journal of Biomedical Informatics 46, 2 (2013), 259–265.
[58] Office of Civil Rights. 2002. Standards for Privacy of Individually Identifiable Health Information: Final Rules. 67 (157), 53182-272 pages.
[59] Patel Vishal. 2019. A framework for secure and decentralized sharing of medical imaging data via blockchain consensus. Health Informatics Journal 25, 4 (2019), 1398–1411.Google ScholarCross Ref
Reference
[60] Pérez Fernando and Granger Brian E.. 2007. IPython: A system for interactive scientific computing. Computing in Science and Engineering 9, 3 (May2007), 21–29. DOI:Google ScholarDigital Library
Reference
[61] Perlroth Nicole. [n. d.]. Clinical Trials Hit by Ransomware Attack on Health Tech Firm. New York Times. Retrieved from https://www.nytimes.com/2020/10/03/technology/clinical-trials-ransomware-attack-drugmakers.html.Google Scholar
Reference
[62] Peterson Kevin, Deeduvanu Rammohan, Kanjamala Pradip, and Boles Kelly. 2016. A blockchain-based approach to health information exchange networks. In Proc. NIST Workshop Blockchain Healthcare, Vol. 1. 1–10.Google Scholar
Reference
[63] Ranstam Jonas, Buyse Marc, George Stephen L., Evans Stephen, Geller Nancy L., Scherrer Bruno, Lesaffre Emmanuel, Murray Gordon, Edler Lutz, Hutton Jane L., et al. 2000. Fraud in medical research: An international survey of biostatisticians. Controlled clinical trials 21, 5 (2000), 415–427.Google ScholarCross Ref
Reference
[64] REDCap. 2019. REDCap Change Log. Retrieved from https://www.evms.edu/research/resources_services/redcap/redcap_change_log/.Google Scholar
Reference
[65] Rivest Ronald L., Shamir Adi, and Adleman Leonard. 1978. A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM 21, 2 (1978), 120–126.Google ScholarDigital Library
Reference
[66] Sandell Protik, Bowman Mic, and Shah Prashant. [n. d.]. Blockchain and its emerging role in healthcare related research. ([n. d.]). Retrieved from https://public-inspection.federalregister.gov/2016-16133.pdf?1467809533.Google Scholar
Reference
[67] Scherer Mattias. 2017. Performance and Scalability of Blockchain Networks and Smart Contracts.Google Scholar
Reference 1Reference 2
[68] Schobel Johannes, Pryss Rüdiger, and Reichert Manfred. 2015. Using smart mobile devices for collecting structured data in clinical trials: Results from a large-scale case study. In 2015 IEEE 28th International Symposium on Computer-Based Medical Systems. IEEE, 13–18.Google ScholarDigital Library
Reference
[69] Simmhan Yogesh L., Plale Beth, and Gannon Dennis. 2005. A survey of data provenance in e-science. SIGMOD Rec. 34, 3 (Sept.2005), 31–36. DOI:Google ScholarDigital Library
Reference
[70] Simmhan Yogesh L., Plale Beth, and Gannon Dennis. 2005. A Survey of Data Provenance Techniques. Technical Report 618. Computer Science Department, Indiana University. Retrieved from http://www.cs.indiana.edu/pub/techreports/TR618.pdf. Extended version of SIGMOD Record 2005.
[71] Simmhan Yogesh L., Plale Beth, and Gannon Dennis. 2012. Karma Provenance Collection Tool. Retrieved from http://d2i.indiana.edu/provenance_karma.
[72] Snow Paul, Deery Brian, Lu Jack, Johnston David, and Kirby Peter. [n. d.]. Factom. ([n. d.]). Retrieved from https://4454jm4bovib1sa6vrtflbew-wpengine.netdna-ssl.com/assets/docs/Factom_Whitepaper_v1.2.pdf.
[73] Suriarachchi Isuru, Zhou Quan, and Plale Beth. 2015. Komadu: A capture and visualization system for scientific data provenance. Journal of Open Research Software 3, 1 (2015).
[74] Vanderbilt University. [n. d.]. REDCap General Security Overview. Retrieved from https://www.iths.org/wp-content/uploads/About-REDCap-Vanderbilt.pdf.
[75] Vukolić Marko. 2015. The quest for scalable blockchain fabric: Proof-of-work vs. BFT replication. In International Workshop on Open Problems in Network Security. Springer, 112–125.
[76] Wolstencroft Katherine, Haines Robert, Fellows Donal, Williams Alan, Withers David, Owen Stuart, Soiland-Reyes Stian, Dunlop Ian, Nenadic Aleksandra, Fisher Paul, Bhagat Jiten, Belhajjame Khalid, Bacall Finn, Hardisty Alex, Hidalga Abraham Nieva de la, Vargas Maria P. Balcazar, Sufi Shoaib, and Goble Carole. 2013. The Taverna workflow suite: Designing and executing workflows of Web Services on the desktop, web or in the cloud. Nucleic Acids Research 41, Web Server issue (2 May 2013), gkt328–W561.
[77] Worley Carl and Skjellum Anthony. 2018. Blockchain tradeoffs and challenges for current and emerging applications: Generalization, fragmentation, sidechains, and scalability. (2018). Presented at IEEE Blockchain 2018, Halifax.
[78] Worley Carl, Yu Lu, Brooks Richard, Oakley Jon, Skjellum Anthony, Altarawneh Amani, Medury Sai, and Mukhopadhyay Ujan. 2020. Scrybe: A second-generation blockchain technology with lightweight mining for secure provenance and related. Blockchain Cybersecurity, Trust and Privacy 79 (2020), 51.
[79] Xia Qi, Sifah Emmanuel Boateng, Asamoah Kwame Omono, Gao Jianbin, Du Xiaojiang, and Guizani Mohsen. 2017. MeDShare: Trust-less medical data sharing among cloud service providers via blockchain. IEEE Access 5 (2017), 14757–14767.
[80] Xia Qi, Sifah Emmanuel Boateng, Smahi Abla, Amofa Sandro, and Zhang Xiaosong. 2017. BBDS: Blockchain-based data sharing for electronic medical records in cloud environments. Information 8, 2 (2017), 44.
[81] Xu Jie, Xue Kaiping, Li Shaohua, Tian Hangyu, Hong Jianan, Hong Peilin, and Yu Nenghai. 2019. Healthchain: A blockchain-based privacy preserving scheme for large-scale health data. IEEE Internet of Things Journal 6, 5 (2019), 8770–8781.
[82] Yang Yilong, Li Xiaoshan, Qamar Nafees, Liu Peng, Ke Wei, Shen Bingqing, and Liu Zhiming. 2018. Medshare: A novel hybrid cloud for medical resource sharing among autonomous healthcare providers. IEEE Access 6 (2018), 46949–46961.
[83] Yue Xiao, Wang Huiju, Jin Dawei, Li Mingqiang, and Jiang Wei. 2016. Healthcare data gateways: Found healthcare intelligence on blockchain with novel privacy risk control. Journal of Medical Systems 40, 10 (2016), 1–8.
[84] Zhao Huawei, Zhang Yong, Peng Yun, and Xu Ruzhi. 2017. Lightweight backup and efficient recovery scheme for health blockchain keys. In 2017 IEEE 13th International Symposium on Autonomous Decentralized System (ISADS). IEEE, 229–234.
[85] Zhao Y., Hategan M., Clifford B., Foster I., Laszewski G. von, Nefedova V., Raicu I., Stef-Praun T., and Wilde M. 2007. Swift: Fast, reliable, loosely coupled parallel computation. In Services, 2007 IEEE Congress on. 199–206.

Index Terms

Scrybe: A Secure Audit Trail for Clinical Trial Data Fusion
1. Information systems
  1. Data management systems
    1. Information integration
  2. Information storage systems

Published in
Digital Threats: Research and Practice Volume 4, Issue 2
June 2023
344 pages
EISSN:2576-5337
DOI:10.1145/3615671
Editors:
Arun Lakhotia
University of Louisiana at Lafayette and Cythereal, USA
,
Leigh Metcalf
CERT, USA
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 10 March 2022
- Online AM: 10 March 2022
- Accepted: 9 September 2021
- Revised: 12 July 2021
- Received: 20 January 2021
Author Tags
Blockchain
clinical trials
REDCap
secure audit
Title 21 CFR Part 11
ISO 27789
Qualifiers
- research-article