1 Introduction

As of July 2021, people all over the world are suffering from the novel coronavirus disease, Covid-19, and the outbreak has introduced many difficulties. Researchers need Covid-19 records to help tackle the situation, so a Decision Support System (DSS) is used to gather those records [1].

Researchers try to help through the collected data (footnote 1). They access patients' data through the DSS and make predictions about the severity and effects of Covid-19. The main concern here is that unauthorized access may lead to a privacy breach, so data privacy must be maintained (footnote 2).

A medical organization cannot afford leakage of its Covid-19 patients' data (footnote 3). Data breaches can affect patients' lives and the reputation of institutions (footnote 4). During this pandemic, data privacy is necessary for Covid-19 patients, and for that reason we propose a privacy model. Our contribution is twofold. First, we use the Blowfish algorithm to encrypt data; it preserves data privacy and takes less time to execute. Blowfish encryption is applied to identity attributes, such as name and phone number, and to quasi attributes, such as address, gender, and age. Only an authorized user holding the key can access the encrypted data, so the patient's data is stored securely in a repository [2].

A problem arises for research, however: encrypted data alone cannot be analyzed, and if medical organizations share the actual data, individual identities may be revealed. For that reason, the second part of our contribution uses the pseudonymization masking technique. It publishes masked data that protects an individual's privacy yet remains usable for research. The masked, sensitive, and non-sensitive data are then associated with the encrypted data through a reference number. In that way, the data stays secure and can still be used for research or any medical purpose. Publishing an individual's data without masking risks a data breach, so the aim is to protect the individual's privacy while the data is in use [3].

We analyzed the following encryption algorithms: Blowfish, AES (Advanced Encryption Standard), DES (Data Encryption Standard), 3DES (Triple Data Encryption Standard), IDEA (International Data Encryption Algorithm), RSA (Rivest, Shamir, Adleman), and RC6 (Rivest Cipher 6). Afterward, we proposed a hybrid model of Blowfish and the pseudonymization masking technique to protect data from malicious use. Blowfish was selected on two parameters: execution time and resistance to known attacks [4].

Our experimental evaluation, through a proof-of-concept implementation, validates that Blowfish is a suitable privacy-preserving mechanism, especially for healthcare data [5]. Blowfish takes the least execution time among the algorithms mentioned above. With the pseudonymization masking technique, we masked the name, address, and gender and then associated them with the sensitive attribute, the Covid-19 positive result. After this masking, it is hard to reveal the patient's identity. Finally, all encrypted, masked, sensitive, and non-sensitive attributes are associated with each other.

Fig. 1 Existing technologies used to achieve data privacy

1.1 Related topic papers

The Health Insurance Portability and Accountability Act (HIPAA) has been discussed by Leslie Lener et al. [6]. HIPAA rules and regulations provide guidelines for the data privacy of Covid-19 patients, but the authors do not present any model to preserve data privacy. A guideline approach has been proposed by Zwitter et al. [7]. It discusses how to deal with data to preserve privacy and provides guidelines but does not give a concrete approach to handling such data. The regulation of mobile positioning has been discussed by Iniobong Ekong et al. [8]. It limits unauthorized access and achieves data privacy, but the approach focuses only on tracing Covid-19 users through mobile tracing under Nigeria's regulations. Functional encryption for securing data using a spatio-temporal trajectory approach has been proposed by Wooil Kim et al. [9]. The spatio-temporal approach aims to achieve security through contact tracing. However, the work focuses only on privacy for contact tracing, which is not suitable when we need to ensure the privacy of Covid-19 patients' data in a DSS.

1.2 Scope

This paper discusses a privacy model to protect Covid-19 patient data from unauthorized access. Leaked information can lead to data breaches, which is why it is vital to secure patients' information. We analyzed the existing data protection solutions used in different research papers, shown in Fig. 1. Some papers rely on HIPAA rules, which only provide guidelines, and others use contact tracing approaches to preserve data privacy. Our proposed hybrid model achieves data privacy using the Blowfish encryption algorithm and data masking techniques, and it provides both privacy and efficacy for Covid-19 data used in health organizations and research. Efficacy was measured in terms of minimum execution time. By using this hybrid model, we achieved data privacy for Covid-19 patients.

Fig. 2 Overview of existing technologies used for data privacy

2 Related work

In medical setups, the DSS is used to collect and manage Covid-19 patient data. That data needs to be kept hidden so that individuals stay as protected as possible. We analyzed several technologies used to achieve privacy: authentication, encryption, auditing, and monitoring. We present them in a taxonomy diagram in Fig. 2 and discuss each of them below [10].

2.1 Authentication

The oPass protocol, proposed by Hung-Min Sun et al., is used for authentication [11]. It counters password stealing and password reuse attacks at login so that data stays safe. The model relies on one-time passwords for authentication. Anak Agung Putri Ratna et al. measured the time needed for brute-force attacks against SHA-1 (Secure Hash Algorithm 1) and MD5 (Message Digest 5) [12]. Their work focuses only on measuring brute-force resistance for these two algorithms. The one-time password (OTP) was used by Rama et al. [13]. It is an authentication protocol that prevents password stealing and other attacks, but a man-in-the-middle attack can still lead to a breach of privacy. Secure Sockets Layer (SSL) and Transport Layer Security (TLS) are used to further secure communication. The flaws and security issues of SSL and TLS are considered by Preeti Sirohi et al. [14]. Open challenges include the Logjam, BEAST, and SSL stripping attacks. SSL/TLS is used for secure communication, but it still has many vulnerabilities.

Prevention of unauthorized access and proof of intent using factor authentication (FA) was proposed by Atsushi Kogetsu et al. [15]. The authors compare the factor authentication variants, but newer technologies are not considered when determining which variant is best for which type of authentication. CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) combined with the one-time password (OTP) has been proposed by Thivanon Kansuwan et al. [16]. Email OTP is used for data protection instead of older login procedures: if the mobile phone is lost and the email account is not logged in on it, email OTP is more secure than OTP sent by text message. A Covid-19 Test Certification (CTC) approach that ensures patients' privacy was proposed by Untung Rahardja et al. [17]. The CTC is used to confirm test results in a distributed system. This method addresses data privacy while checking test results online after Covid-19 testing. It deals only with the privacy of test results, which is not enough.

2.1.1 oPass

oPass is an authentication protocol that is used for authentication and attack protection. It protects against attacks such as phishing, keylogging, password reuse, password guessing, and man-in-the-middle (MITM) attacks [11].

2.1.2 Features of SHA-1 and MD5

Secure Hash Algorithm 1 (SHA-1) and Message Digest 5 (MD5) are hash algorithms used in authentication-based security systems [12]. SHA-1 is more resistant to brute-force attacks, while MD5 is better in memory usage and processing time.
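The trade-off described above can be observed with a short Python sketch. It is only an illustration under stated assumptions, not the setup used in [12]: the helper name `digest_and_time` and the sample payload are hypothetical, and the timings it prints depend entirely on the machine. It relies on the standard `hashlib` module, which exposes both algorithms.

```python
import hashlib
import time


def digest_and_time(algorithm: str, data: bytes, runs: int = 1000):
    """Return (hex digest, average seconds per hash) for a hashlib algorithm."""
    digest = ""
    start = time.perf_counter()
    for _ in range(runs):
        digest = hashlib.new(algorithm, data).hexdigest()
    elapsed = time.perf_counter() - start
    return digest, elapsed / runs


if __name__ == "__main__":
    sample = b"patient-record-0001" * 64  # hypothetical payload
    for name in ("md5", "sha1"):
        digest, avg = digest_and_time(name, sample)
        # Each hex character encodes 4 bits: MD5 yields 128 bits, SHA-1 160 bits.
        print(f"{name}: {len(digest) * 4}-bit digest, {avg * 1e6:.2f} us per hash")
```

The longer SHA-1 digest (160 bits against 128 bits for MD5) is what enlarges the brute-force search space, at the cost of somewhat more work per hash.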

2.1.3 OTP

The HMAC-based one-time password (HOTP), the time-based one-time password (TOTP), and the challenge-response one-time password (CROTP) are used to prevent replay attacks. HOTP prevents replay attacks but takes more processing time and has a higher CPU overhead. CROTP prevents replay attacks with a high CPU overhead and a medium server response time. TOTP also prevents replay attacks, takes less time to complete the process, and has a low CPU overhead [13, 16].
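To make the difference between HOTP and TOTP concrete, the following Python sketch follows the published definitions in RFC 4226 and RFC 6238. It is a minimal illustration, not the implementation evaluated in [13, 16]; the shared secret in the usage note is the RFC test secret, and CROTP is not covered.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA-1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of the last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over a time-step counter."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)
```

With the RFC test secret `b"12345678901234567890"`, `hotp(secret, 0)` reproduces the first RFC 4226 test value. A code is valid for one counter value (HOTP) or one 30-second window (TOTP), which is what limits replay.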

2.1.4 TLS

Browsers such as Google Chrome, Firefox, and Safari support the TLS protocol for secure communication and safe browsing [14, 18]. The key features required for TLS 1.3 on any website are the certificate, key exchange, cipher strength, and protocol support (footnote 5) [19].

2.1.5 Factor authentication

Factor authentication is a method of security (footnote 6). We analyzed three variants: one-factor authentication (1FA), two-factor authentication (2FA), and three-factor authentication (3FA), covering their procedures, methods, and best use. 1FA deals with what the entity knows, for example an ID and password, and is less secure than 2FA. 2FA deals with what the entity has, using predetermined codes, a signed digital certificate, or a fingerprint; it is more secure than 1FA and less secure than 3FA. 3FA deals with what the entity is, using voice, hand, fingerprint, and retina scans, and is more secure than the others [15].

2.2 Encryption

The cryptographic algorithms Data Encryption Standard (DES), Triple Data Encryption Standard (3DES), and Advanced Encryption Standard (AES) are discussed by Hamdan O. Alanazi et al. [20]. These algorithms are compared for effectiveness, adaptability, and security in securing data [21, 22]. The work gives brief information about the different algorithms, with the main focus on finding a secure algorithm. A comparison of encryption algorithms has been presented by Aggarwal [23]. The discussion identifies the best algorithm for a given situation by comparing effectiveness; the algorithms are compared on two parameters. The encryption algorithms AES, DES, triple DES (T-DES), and RSA (Rivest, Shamir, Adleman) are discussed by Pankaj Singh et al. [24]. These algorithms were used to maintain data privacy, and the study reports a reasonable correlation between encryption and speed. However, ensuring both privacy and speed affects performance. The encryption algorithms AES, DES, RSA, RC6, 3DES, and Blowfish have been compared by Mohammed Nazeh Abdul Wahid et al. [25]. The algorithms are used to secure data from unauthorized access, and the study focuses on where each encryption algorithm performs best.

A comparison of encryption algorithms has been presented by Patel et al. [26]. The discussion is based on two parameters, execution time and memory usage, to measure performance. RSA, Blowfish, 3DES, and AES have been compared for achieving data privacy by Daniel Commey et al. [27]. The main focus is on choosing the cipher that provides more security. Encryption, auditing, and authentication solutions for data privacy using SHA-512, SHA-256, and AES have been discussed by Arielle V. Luccal et al. [28]. These algorithms provide security against unauthorized access. The encryption was applied to a prototype app used to gather data, monitor a patient's condition, and decide on discharge. This model focuses on secure condition-based monitoring. CryptoGA, a genetic cryptographic algorithm, has been discussed by Muhammad Tahir et al. [29]. CryptoGA was compared with other algorithms in terms of effective transmission rate. It focuses on providing security regardless of location. K-Medoid clustering combined with Blowfish encryption has been proposed by Dr. Sheena Hussaini et al. [30]. The K-Medoid algorithm is used for reliable clustering of data, whereas Blowfish provides more security for that data. However, the work focuses on distance-based secure encryption.

This paper discusses the weaknesses of some encryption algorithms and proposes a new approach that ensures the reliability and security of data. Different algorithms (AES, RSA, MD5, DES, Blowfish, and SHA) have been discussed with regard to their execution time and memory by Ashwini P. Parka et al. [31]. A comparative analysis is performed in that paper to choose the best algorithm in terms of performance.

2.2.1 Feature analysis

Encryption hides data from unauthorized users: if an unauthorized party tries to access the data, it is unreadable. This section analyzes the encryption algorithms [20, 23, 24, 25, 26, 27, 28, 29, 30, 31] (footnotes 7 and 8). Table 1 analyzes encryption algorithms such as Blowfish, AES, DES, 3DES, Rivest Cipher 6 (RC6), RSA, and IDEA with regard to attack prevention and the effect of these algorithms.

Table 1 An analysis of different encryption algorithms

Table 2 compares the algorithms on parameters such as block size (the number of bits processed as one group), key size (the length of the key), number of rounds (which depends on the key size), confidentiality, execution time for encryption and decryption, power consumption, and memory usage. The confidentiality of Blowfish is higher than that of all the other algorithms. Blowfish encryption and decryption take less time, less power, and less memory. AES also takes a short time for encryption and decryption, but it uses more memory than Blowfish. DES takes a long time for encryption and decryption, and it uses more memory than AES. 3DES takes more execution time than DES for encryption and decryption. IDEA, RC6, and RSA are high in all of these parameters.

Table 2 A features comparison between different encryption algorithms

2.3 Auditing and monitoring

Unified threat management (UTM) has been discussed by Yin Chao et al. [32]. Although UTM features are discussed, the issue is that UTM is used for small security solutions. Lidong Wang et al. discussed intrusion, which is typically carried out by individuals outside the organization [33]. Intrusion detection systems (IDS) and intrusion prevention systems (IPS) are discussed for dealing with intrusion; however, IDS/IPS are not useful in all situations. A detailed view of IDS/IPS technologies is given by Karim Abouelmehdi et al. [34]. IDS and IPS are used to observe, gather, and analyze system activity to detect intrusions, keep a log of each access, and adjust information. They are used for detection and prevention, but they cannot inspect encrypted data, which may be malicious [35]. The next-generation firewall (NGFW) has been discussed by Kishan Neupane et al. [36]. It is analyzed against conventional firewalls and other security solutions, and its objectives and risks are discussed. However, NGFW configuration is not easy. Security information and event management (SIEM) has been discussed by Sievierinov et al. [37]. It is used for monitoring, provides hardware, network, and application analysis, and generates logs and reports. This security solution is used for mid-sized setups.

The overall analysis of the literature leads to the conclusion that most auditing capabilities are present in NGFW, as presented in Table 3, which analyzes all the technologies discussed in this section.

Table 3 A comparison of auditing and monitoring technologies

2.3.1 IDS/IPS

An IDS passively monitors and detects intrusion activities in a system, whereas an IPS actively analyzes and prevents intrusion. The IDS variants, the host-based intrusion detection system (HIDS) and the network-based intrusion detection system (NIDS), detect anomalies on hosts and networks, respectively. An IPS likewise has variants, including host-based intrusion prevention systems (HIPS) and network intrusion prevention systems (NIPS). HIDS provides system-level protection and detects configuration, file, and registry changes. NIDS protects network resources, guards against denial-of-service attacks, and sniffs network traffic continuously; if it finds irregularities in the traffic, it raises an alarm. Applications of the IDS and IPS variants are as follows: HIDS products include ISS, Symantec, and Enterasys; HIPS products include Cisco, McAfee, and Snort; NIDS products include ISS, Cisco, Enterasys, and Symantec; and NIPS products include McAfee IntruShield, NetScreen, and TippingPoint [34].

2.3.2 NGFW features

NGFW integrates deep packet inspection, IDS and IPS, application visibility regardless of protocol and port, and access control policies. NGFW suits high-intensity traffic environments, complex tasks, and telecommunication because it provides deep packet inspection, a cohesive architecture, and access control policies [36].

2.4 HIPAA

The HIPAA rules and the CIA (confidentiality, integrity, availability) security triad have been discussed by Thapa et al. [38]. They focused on regulations because of ethical requirements; however, they discussed only rules and regulations. Guidelines for the data protection of Covid-19 patients have been discussed by Bernier et al. [39]; they focus on reforms and guidelines to protect data privacy. Patient privacy achieved through an access control model has been discussed by Prince et al. [40]. That paper classifies data into three levels, high, medium, and low confidentiality, which helps determine the protection the data needs. Data privacy guidelines for designing a system, specifically for Dubai, have been proposed by AlMarzooqi et al. [41]. We analyzed all the approaches discussed in these papers, which helped us choose an appropriate system for providing data privacy to Covid-19 patients.

Fig. 3 The hybrid model used to achieve data privacy

3 Proposed model

Covid-19 patients' data needs protection from unauthorized access. The question is how to accomplish this. We propose a hybrid approach for the privacy of Covid-19 patients. We analyzed several encryption algorithms to determine which is best for encrypting Covid-19 patients' data, and we classified the attributes of Covid-19 patients to identify which attributes need to be encrypted and masked. The classified attributes are listed after the caption of Fig. 4:

Fig. 4 How attributes are linked with each other to achieve data privacy

  • Identity Attribute: Name, Phone number, and ID card.

  • Quasi Attribute: Age, Gender, Address.

  • Sensitive Attribute: Covid-19 test result positive.

  • Non-sensitive Attribute: all attributes other than the identity, quasi, and sensitive attributes.
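The classification above can be written down as a small lookup table. The following Python sketch is illustrative only: the column names (`name`, `phone_number`, `id_card`, `covid19_result`, and so on) and the helper names are assumptions, not identifiers from the original implementation.

```python
from enum import Enum


class AttributeClass(Enum):
    IDENTITY = "identity"
    QUASI = "quasi"
    SENSITIVE = "sensitive"
    NON_SENSITIVE = "non-sensitive"


# Hypothetical column names mapped to the four classes listed above.
CLASSIFICATION = {
    "name": AttributeClass.IDENTITY,
    "phone_number": AttributeClass.IDENTITY,
    "id_card": AttributeClass.IDENTITY,
    "age": AttributeClass.QUASI,
    "gender": AttributeClass.QUASI,
    "address": AttributeClass.QUASI,
    "covid19_result": AttributeClass.SENSITIVE,
}


def classify(column: str) -> AttributeClass:
    """Any column not listed explicitly is treated as non-sensitive."""
    return CLASSIFICATION.get(column, AttributeClass.NON_SENSITIVE)


def columns_to_protect(columns):
    """Identity and quasi attributes are the ones to encrypt or mask."""
    protected = (AttributeClass.IDENTITY, AttributeClass.QUASI)
    return [c for c in columns if classify(c) in protected]
```

For example, `columns_to_protect(["name", "age", "occupation", "covid19_result"])` keeps only `name` and `age`, since the sensitive result is published unchanged and `occupation` falls through to the non-sensitive class.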

As shown in the taxonomy diagram in Fig. 3, the hybrid model is divided into two parts: Blowfish and pseudonymization. Blowfish encrypts the identity attributes (name, phone number, and ID card) and quasi attributes such as address. Pseudonymization masks identity and quasi data, replacing the name with random data, the address with a region, and the gender with "person", and then associates this masked data with the Blowfish-encrypted data through a reference number. In that way, researchers can use the data to help overcome the pandemic without privacy breaches. The masked and encrypted attributes are:

  • Identity Attribute = Blowfish Encryption + Associated with the reference number.

  • Identity Attribute = Mask (Name, Phone Number) + Associated with the reference number.

  • Quasi Attribute = Address masked with the region, age masked with intervals, gender masked with "person".

  • Sensitive Attribute = Covid-19 positive data remain the same.

The hybrid approach used for data privacy is shown in Fig. 4. Blowfish is used for encryption and decryption of identity data, such as name and phone number, and of quasi attributes, such as address. The parameters for this experiment are execution time and resistance to known attacks. The benefit of this approach is that the patient's data is stored securely in a repository. However, encrypted data cannot be used for research purposes, and if medical organizations share it, an attacker can still attempt an attack. For that reason, the second part uses the pseudonymization masking technique. We masked the identity attribute name with random data and the quasi attributes address and age with the region and intervals, respectively. Afterward, the masked, sensitive, non-sensitive, and encrypted data are associated with each other. In that way, researchers can use the data to help overcome the pandemic, with a reduced risk of privacy breaches because the patient's information is masked. Because the address is replaced with the region, researchers can find hotspots in particular regions without privacy breaches. In this section, we discussed how we achieve data privacy for Covid-19 patients and how the parts relate to each other.
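The masking step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the script used in the experiments: the field names, the `region_of` lookup table, the ten-year interval width, and the `P-` prefix for random names are all hypothetical choices. The reference number `ref_no` is the value that would link the masked record to its Blowfish-encrypted counterpart.

```python
import secrets


def age_interval(age: int, width: int = 10) -> str:
    """Replace an exact age with an interval such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def pseudonymize(record: dict, region_of: dict, ref_no: int) -> dict:
    """Return a masked copy of one patient record.

    region_of is a hypothetical lookup from a full address to its region.
    """
    return {
        "ref_no": ref_no,                          # link to the encrypted record
        "name": "P-" + secrets.token_hex(4),       # identity -> random data
        "phone_number": secrets.token_hex(5),      # identity -> random data
        "address": region_of.get(record["address"], "unknown region"),
        "age": age_interval(record["age"]),        # quasi -> interval
        "gender": "person",                        # quasi -> generic label
        "covid19_result": record["covid19_result"],  # sensitive value unchanged
    }
```

Because only the region and the age interval survive, a researcher can still count positive cases per region or age band, while the reference number is meaningful only to someone who also holds the decryption key.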

4 Experiments and results

In this section, we report experiments with the hybrid approach. The machine runs Windows 10 with 8 GB of RAM. Data masking was done with a Python script, and all encryption experiments were performed in Java. The dataset is a modified version of the publicly accessible Adult dataset (footnote 9). We added a sensitive column, the Covid-19 result, and a gender column to the Adult dataset to measure our results. All the data was masked using pseudonymization in a way that keeps it useful for medical and research purposes. After this, we performed Blowfish encryption, which took less time than the other algorithms. Figure 5 shows the execution time for Blowfish, AES, DES, 3DES, IDEA, RC6, and RSA. The parameter for this experiment is execution time. Our hybrid approach showed efficient results.

Fig. 5 The execution time of all encryption algorithms

Fig. 6 The execution time for AES encryption and decryption

The result is shown in Fig. 5. Blowfish takes the minimum time for encryption. Covid-19 patients' data is confidential: the name, address, phone number, and the sensitive Covid-19 positive attribute need to be encrypted, so we used Blowfish encryption, which took less time. The result shows that it takes only 0.72 milliseconds to encrypt and decrypt the data. We also used online tools to run these algorithms. For Blowfish, we used blowfish.js encrypt/decrypt online; for RSA, RSA Encryption Decryption; for IDEA, DES, 3DES, and RC6, the 8gwifi Crypto Tool Playground; and for AES, AES encryption and decryption online (footnotes 10, 11, 12, and 13).

Fig. 7 The DES and 3DES execution time with different modes

Eclipse and online tools were used to obtain further results and to measure the execution time of encryption. Different cipher modes were used to measure the execution time of each encryption algorithm: electronic codebook (ECB), cipher block chaining (CBC), cipher feedback (CFB), output feedback (OFB), n-bit output feedback (NOFB), counter (CTR), and propagating cipher block chaining (PCBC). The execution time is measured using those modes with the different algorithms, and the results are reported in Figs. 6, 7, 8, 9, and 10.
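The practical difference between two of these modes, ECB and CBC, can be illustrated with a small Python sketch. It deliberately uses a toy Feistel cipher built from SHA-256 as a stand-in, because Blowfish is not part of the Python standard library; the toy cipher only borrows Blowfish's 64-bit block size and 16-round Feistel structure, is not Blowfish, and must not be used to protect real data. All function names are hypothetical.

```python
import hashlib

BLOCK = 8  # bytes; a 64-bit block, the same block size as Blowfish


def _round(half: bytes, key: bytes, i: int) -> bytes:
    """Toy round function: first 4 bytes of SHA-256(key || round || half)."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:4]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_block(block: bytes, key: bytes, rounds: int = 16) -> bytes:
    left, right = block[:4], block[4:]
    for i in range(rounds):
        left, right = right, _xor(left, _round(right, key, i))
    return left + right


def decrypt_block(block: bytes, key: bytes, rounds: int = 16) -> bytes:
    left, right = block[:4], block[4:]
    for i in reversed(range(rounds)):
        left, right = _xor(right, _round(left, key, i)), left
    return left + right


def _blocks(data: bytes):
    if len(data) % BLOCK:
        raise ValueError("pad the input to a multiple of 8 bytes first")
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(data: bytes, key: bytes) -> bytes:
    """ECB: every block is encrypted independently."""
    return b"".join(encrypt_block(b, key) for b in _blocks(data))


def cbc_encrypt(data: bytes, key: bytes, iv: bytes) -> bytes:
    """CBC: each block is XORed with the previous ciphertext block first."""
    out, prev = [], iv
    for b in _blocks(data):
        prev = encrypt_block(_xor(b, prev), key)
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(data: bytes, key: bytes, iv: bytes) -> bytes:
    out, prev = [], iv
    for b in _blocks(data):
        out.append(_xor(decrypt_block(b, key), prev))
        prev = b
    return b"".join(out)
```

Encrypting a message made of identical 8-byte blocks shows the distinction: in ECB the ciphertext blocks repeat, while in CBC the chaining makes them differ. The extra XOR and the dependency between blocks are also why the modes can differ in execution time.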

Fig. 8 The RC6 and IDEA execution time of encryption and decryption

We report each result as the average execution time of the encryption algorithm. The results show execution times for different modes. Figure 6 shows the AES encryption execution time in milliseconds for modes such as ECB and CBC and for key sizes of 128, 192, and 256 bits. Figure 7 shows the encryption execution time for DES and 3DES, where the results depend on the encryption mode: ECB, CBC, CFB, OFB, and NOFB.
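Averaging over repeated runs, as described above, can be sketched with a small Python timing harness. This is an assumption about how such a measurement might be organized, not the Java code used for the reported figures; `average_time_ms` is a hypothetical helper, and `stand_in_cipher` is a placeholder transform, not a real cipher, so its timings say nothing about Blowfish or AES.

```python
import statistics
import time


def average_time_ms(fn, payload: bytes, runs: int = 100) -> float:
    """Call fn(payload) `runs` times and return the mean duration in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples)


def stand_in_cipher(payload: bytes) -> bytes:
    """Placeholder transform (byte-wise XOR); NOT a real cipher."""
    return bytes(b ^ 0x5A for b in payload)


if __name__ == "__main__":
    data = b"name,phone,address,covid19_result\n" * 100  # hypothetical payload
    print(f"{average_time_ms(stand_in_cipher, data):.3f} ms on average")
```

Passing the encryption routine of each algorithm and mode as `fn` would give comparable averages, provided the same payload and number of runs are used throughout.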

Figure 8 shows the time taken for encryption and decryption by both RC6 and IDEA.

Fig. 9 The Blowfish execution time for encryption and decryption with different modes

Figure 9 shows the Blowfish encryption execution time with the ECB, CBC, PCBC, CFB, OFB, and CTR modes. As the results show, it takes less time.

Fig. 10 RSA encryption and decryption execution time

The RSA execution time is shown in Fig. 10 for different variants: RSA, RSA ECB, RSA with padding, RSA with SHA padding, RSA ECB SHA padding, and RSA ECB padding with 256, with key sizes of 512, 1048, 2048, and 4096.

Fig. 11 The IC of the IDEA encryption algorithms

The index of coincidence (IC) is a cryptanalysis technique. In this paper, we measured the IC of every algorithm's output to assess how secure the cipher is. Eq. 1 is used to measure the IC for all encryption algorithms.

Fig. 12 IC of the RSA encryption algorithms

In Eq. 1, C is the index of coincidence, \(i_{m}\) is the number of occurrences of letter m, and T is the total number of letters.

$$C = \sum_{m=a}^{z} \frac{i_{m}(i_{m}-1)}{T(T-1)} \tag{1}$$

IC is measured using Eq. 1 for all algorithms discussed in this paper.
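The computation in Eq. 1 can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the experiments: the function name `index_of_coincidence` is hypothetical, and it assumes the cipher output has been rendered as text and that only the letters a to z are counted, as the summation limits in Eq. 1 suggest.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Index of coincidence per Eq. 1, counting only the letters a to z."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    total = len(letters)  # T in Eq. 1
    if total < 2:
        raise ValueError("need at least two letters")
    counts = Counter(letters)  # i_m for each letter m
    return sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))
```

As a worked check, the text "aabb" has two letters that each occur twice, so C = (2·1 + 2·1) / (4·3) = 1/3. With Eq. 1 as written, without any scaling factor, uniformly random letters give a value near 1/26 ≈ 0.0385, so thresholds quoted on another scale would need the matching normalization.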

Fig. 13 IC of different encryption algorithms

If the value is closer to 0.7, the text is near plain text; if it is near 0.3850, the text is near encrypted text and is more secure. We checked this for all the algorithms simulated above, and the results are shown in graphs. Figures 11, 12, and 13 show the measured IC values for AES, DES, 3DES, Blowfish, IDEA, RC6, and RSA. The results show that we achieved data privacy for Covid-19 patients using our proposed hybrid model of Blowfish encryption and pseudonymization.

5 Discussion

In this paper, we analyzed research papers on authentication, encryption, auditing, and monitoring technologies to explore their weaknesses and strengths. Our proposed hybrid model of Blowfish and data masking achieves data privacy for identity, quasi, and sensitive attributes by associating the encrypted, masked, and sensitive data. We performed experiments for this hybrid model using a Python script and Java; all encryption algorithms were also run online and in Eclipse. The results showed that Blowfish is an efficient algorithm for achieving data privacy for Covid-19 patients. Our proposed model uses Blowfish encryption because it is best against known attacks, takes less time, and consumes less memory than the other algorithms. We used the cryptanalysis technique to measure the IC value and found the attack surface for all algorithms. Repeated letters show a low IC value, and non-repeated letters show a high IC value. The results in the experiments section show that the Blowfish IC value is high. Blowfish is the best algorithm for encryption. This paper achieved data privacy for Covid-19 patients using a hybrid model of Blowfish encryption and a data masking technique.

6 Conclusion

This paper proposed a hybrid model for the data privacy of Covid-19 patients. Researchers use Covid-19 patient data, but an adversary can also access that data for malicious purposes, which may cause a privacy breach. This paper used a hybrid encryption and data masking approach to secure Covid-19 patients' data. For future work, we propose a hybrid algorithm approach combining Blowfish and AES to give a more secure framework for achieving data privacy. Future work can also address privacy for the 1:M dataset of Covid-19 patients. Data privacy for Covid-19 patients remains a novel topic for researchers to explore.