Introduction

Lassa virus (LASV) is an emerging viral pathogen belonging to the Arenaviridae family that can cause a severe viral haemorrhagic fever known as Lassa fever, with a 20% fatality rate (Charrel and De Lamballerie 2003). Lassa fever is an acute viral zoonotic disease that affects a large number of people (~ 500,000) and causes 5000 deaths annually in Western Africa (Ogbu et al. 2007). The people of Ghana (~ 10%), Côte d’Ivoire (~ 30%), Nigeria (~ 40%), Guinea (~ 50%), Sierra Leone and Liberia (~ 80%) and a few areas of Mali are assumed to be affected by Lassa fever (Fichet-Calvet and Rogers 2009; Safronetz et al. 2010). About 200 million people in West African regions (e.g., Nigeria and Senegal) are at high risk of a LASV outbreak (Charrel and De Lamballerie 2003). Moreover, it has also affected parts of Europe such as the United Kingdom (Kitching et al. 2009), the Netherlands (WHO 2000) and Germany (Haas et al. 2003). Although the virus is known to primarily target antigen-presenting cells (mainly dendritic cells and macrophages) and endothelial cells and to interfere with their complete maturation and activation, the pathogenesis of Lassa fever is not yet clearly understood (Hallam et al. 2018; Oti 2018). Given the high annual incidence and mortality rate, the development of an effective LASV vaccine is an urgent necessity.

LASV is endemic to West Africa. It is an enveloped virus with a bisegmented, ambisense, negative-sense, single-stranded RNA genome consisting of a large (L) segment and a small (S) segment (Oti 2018). The L segment encodes the 200 kDa RNA polymerase (L protein) and the small ring-finger protein (matrix protein or Z-protein, 11 kDa), which regulate replication and transcription (Cornu and de la Torre 2001; Djavani et al. 1997). The S segment encodes the nucleoprotein (NP, 63 kDa) and the surface glycoprotein precursor (GP, 75 kDa), which is proteolytically cleaved into the envelope glycoproteins GP1 and GP2 that bind to the alpha-dystroglycan receptor and mediate entry into the host cell (Cao et al. 1998; Oti 2018).

LASV is transmitted to humans through the rodent reservoir Mastomys natalensis, a common African rat that lives in village houses (Bonner et al. 2007). Recent evidence, however, indicates that other rodent species may also be LASV reservoirs, such as the African wood mouse Hylomyscus pamfi (Nigeria) and the Guinea mouse M. erythroleucus (Nigeria and Guinea) (Hallam et al. 2018). Transmission of LASV occurs when a healthy individual comes in contact with the blood, secretions, tissue or excretions of an infected person, or consumes food contaminated with the excreta of the rodent host. However, skin-to-skin contact without exchange of blood or body fluids cannot transmit the virus (Keenlyside et al. 1983). Children under ten years old are considered the most vulnerable to LASV. For instance, a study showed 15% seropositivity in the under-aged population in West Africa (Kernéis et al. 2009). Besides, Lassa fever in pregnant patients results in spontaneous abortion (Price et al. 1988). Ribavirin, an antiviral drug, is effective in the initial phase of Lassa fever and can reduce the fatality rate (Jahrling et al. 1980; McCormick et al. 1986). However, its potential toxicity and teratogenicity when used in the later stage of disease suggest that ribavirin is not a satisfactory therapeutic against Lassa fever (Fisher-Hoch et al. 1992; Kochhar 1990). Peptide vaccines are immune stimulants in which fragments of virus-derived proteins mimic natural pathogens; hence, they are more favourable in terms of safety, efficacy and specificity (Skwarczynski and Toth 2016). In 1985, the first epitope-based vaccine was developed using cholera toxin against E. coli (Jacob et al. 1985). Furthermore, peptide vaccines against many pathogenic agents (e.g., HIV, malaria, swine fever, influenza, anthrax) are currently under development (Li et al. 2014).

The present study describes the screening of the whole LASV proteome followed by the grouping of viral proteins. Each protein group was evaluated separately to identify T-cell and B-cell epitopes along with their respective MHC alleles using vaccinomics. Subsequently, a vaccine was designed using the most promising epitopes from each protein with a suitable adjuvant and linkers. The primary sequence was used for physicochemical analysis and immunogenic profiling, followed by secondary and tertiary structure prediction. The predicted three-dimensional (3D) structure was then refined and validated. Besides, disulphide engineering was performed to improve structural stability. The binding affinity and interactions between the vaccine protein and the receptor were evaluated by molecular docking and dynamics simulation, respectively. Codon optimization and in silico cloning were performed to evaluate the expression of the chimeric protein in an appropriate host. Finally, an immune simulation was performed to estimate the immunogenic potency under real-life conditions.

Methodology

Proteome Retrieval and Antigenicity Prediction

The whole proteome of the Lassa mammarenavirus was retrieved from the ViPR database (Virus Pathogen Database and Analysis Resource), an integrated and robust database covering several virus families and their respective species (Pickett et al. 2012). Initially, the proteins from the LASV proteome were isolated and classified as glycoprotein, L-protein, matrix protein, nucleocapsid protein, nucleoprotein, polymerase, ring-finger protein and Z-protein. As a measure of the immune response, the structural proteins were then subjected to antigenicity prediction using the VaxiJen v2.0 server with a threshold value of 0.5 (Doytchinova and Flower 2007). This server uses an auto and cross-covariance (ACC) transformation method and achieves 70–89% prediction accuracy. The protein with the best antigenic score was chosen from each class of structural proteins.

Prediction of Cytotoxic T-Lymphocyte (CTL) Epitopes and MHC-I Binding Alleles

The selected proteins were submitted to the NetCTL v1.2 server for the prediction of CTL epitopes (9-mer) for 12 supertypes (i.e., A1, A2, A3, A24, A26, B7, B8, B27, B39, B44, B58, and B62). This server predicts CTL epitopes based on three criteria, namely, MHC-I binding, C-terminal cleavage, and TAP transport efficiency. MHC-I binding and C-terminal cleavage are predicted using artificial neural networks, whereas TAP transport efficiency is evaluated with a weight matrix (Larsen et al. 2007). In this study, the threshold was set to 0.5, which corresponds to a sensitivity and specificity of 0.89 and 0.94, respectively. Furthermore, MHC-I binding alleles for each CTL epitope were predicted using the consensus method in the IEDB analysis tool (Moutaftsi et al. 2006). Human was selected as the source species, and a percentile rank ≤ 5 was considered, since a lower score indicates higher affinity.

Assessment of CTL Epitopes for Immunogenicity, Allergenicity and Toxicity

Firstly, the selected CTL epitopes were rechecked for antigenicity with the VaxiJen v2.0 server to ensure their ability to induce an immune response. They were also evaluated for immunogenicity with the MHC-I immunogenicity tool of the IEDB server (Calis et al. 2013). Vaccine components should not provoke allergic reactions, so the AllergenFP v1.0 server was used for allergenicity prediction. This server can recognize both allergens and non-allergens with 88% accuracy (Dimitrov et al. 2014b). Furthermore, toxic epitopes should be eliminated as they could compromise the functionality of the vaccine construct. Therefore, we used the ToxinPred server to sort out the toxic CTL epitopes (Gupta et al. 2013).

Prediction of Helper T-Lymphocyte (HTL) Epitopes and MHC-II Binding Alleles

Helper T-lymphocyte (HTL) responses play an essential role in the induction of both humoral and cellular immune responses. Therefore, HTL epitopes are likely to be a significant element of preventive and immunotherapeutic vaccines. The IEDB MHC-II binding tool was applied to predict HTL epitopes of 15 amino acids using the NN-align method (Nielsen and Lund 2009). A percentile rank was generated by comparing each peptide’s binding affinity with that of a comprehensive set of randomly selected peptides from the Swiss-Prot database. As for the CTL epitopes, a percentile rank ≤ 5 was considered for this analysis (Paul et al. 2016).
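The 9-mer CTL and 15-mer HTL candidates described above are overlapping windows over a protein sequence, which are then filtered by a score cut-off. The following minimal sketch illustrates only this enumeration and filtering step; it does not reproduce the NetCTL or IEDB predictors. The function names, the toy sequence and the example scores are hypothetical and are not taken from the study.

```python
def sliding_peptides(sequence, k):
    """Return every overlapping k-residue window of a protein sequence."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def keep_above_threshold(scored_peptides, threshold=0.5):
    """Keep peptides whose predictor score meets the cut-off.

    `scored_peptides` is a list of (peptide, score) pairs; the scores would
    come from an external predictor and are placeholders here.
    """
    return [pep for pep, score in scored_peptides if score >= threshold]


# Arbitrary 20-residue toy sequence (the 20 standard amino acids), not LASV.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"

ctl_candidates = sliding_peptides(toy_protein, 9)    # 9-mers for MHC-I
htl_candidates = sliding_peptides(toy_protein, 15)   # 15-mers for MHC-II

# A sequence of length n yields n - k + 1 windows of length k.
print(len(ctl_candidates), len(htl_candidates))
```

For a protein of length n, the number of candidate peptides is n − k + 1, so longer proteins such as the glycoprotein naturally contribute more candidates than the short Z-protein.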

Identification of Cytokine-Inducing HTL Epitopes

Helper T-cells activate the innate immune system, B-lymphocytes, cytotoxic T-cells and other immune cells, and release different types of cytokines, i.e., interferon-gamma (IFN-γ), interleukin-4 (IL-4) and interleukin-10 (IL-10). Through these cytokines, HTL epitopes can counteract the proinflammatory response and thus diminish tissue damage (Luckheeram et al. 2012). Therefore, cytokine-inducing HTL epitopes are essential in vaccine development. We used the IFNepitope server to predict IFN-γ-inducing HTL epitopes with a hybrid method (Motif and SVM) and the IFN-gamma versus non-IFN-gamma model (Dhanda et al. 2013a, b). In addition to IFN-γ, IL-4- and IL-10-inducing properties were evaluated with the IL4pred and IL10pred servers, respectively (Dhanda et al. 2013a, b; Nagpal et al. 2017).

Prediction and Assessment of Linear B-lymphocyte (LBL) Epitopes

Identification of B-cell epitopes (BCEs) is a fundamental step in epitope-based vaccine development, as B-cells produce the antibodies that provide humoral immunity. Therefore, LBL epitopes were predicted using the iBCE-EL server, which combines gradient boosting algorithms with an extremely randomized tree method (Manavalan et al. 2018). The predicted LBL epitopes were then assessed for antigenicity, allergenicity and toxicity with the VaxiJen v2.0, AllergenFP v1.0 and ToxinPred servers, respectively.

Estimation of Population Coverage

Different HLA alleles, as well as their expression, are distributed at markedly different frequencies across ethnicities (Adhikari and Rahman 2017). Hence, the distribution of HLA alleles among the world population is crucial for successful multi-epitope vaccine development. In this study, the IEDB population coverage analysis tool was used to analyse the population coverage of the potential CTL and HTL epitopes and their MHC binding alleles (Bui et al. 2006).

Designing of Multi-epitope Vaccine Construct

To construct the multi-epitope vaccine, the finally selected CTL, HTL, and LBL epitopes were linked together with AAY, GPGPG and KK linkers, respectively (Gu et al. 2017; Nain et al. 2019). Peptides used for vaccine construction are generally poorly immunogenic when used alone and therefore require adjuvants to boost the immune response (Li et al. 2014). Consequently, the amino acid sequence of the OmpA protein (GenBank: AFS89615.1) was chosen as an adjuvant and linked ahead of the first CTL epitope through an EAAAK linker (Arai et al. 2001). A linker between two epitopes is required for the effective functioning of each epitope (Nezafat et al. 2014).
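The assembly rule above can be sketched as simple string concatenation. This is an illustrative sketch only: the function name and the placeholder sequences are hypothetical, and the choice of linker at the boundary between epitope groups (here, the linker of the following group) is an assumption, since the text specifies only the linkers used within each group and after the adjuvant.

```python
def build_construct(adjuvant, ctl, htl, lbl):
    """Concatenate adjuvant and epitope groups with the linkers named in the text.

    Assumption (not stated in the text): the boundary between two epitope
    groups uses the linker of the group that follows.
    """
    ctl_block = "AAY".join(ctl)      # CTL epitopes joined by AAY
    htl_block = "GPGPG".join(htl)    # HTL epitopes joined by GPGPG
    lbl_block = "KK".join(lbl)       # LBL epitopes joined by KK
    return (
        adjuvant + "EAAAK"           # adjuvant linked ahead of the first CTL epitope
        + ctl_block
        + "GPGPG" + htl_block
        + "KK" + lbl_block
    )


# Placeholder sequences for illustration; these are not the study's epitopes.
construct = build_construct(
    adjuvant="MKK",
    ctl=["A" * 9, "C" * 9],
    htl=["D" * 15],
    lbl=["EEEE", "FFFF"],
)
print(len(construct), construct)
```

With the real adjuvant and epitope lists substituted in, the length of the returned string gives a quick check on the size of the designed construct.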

Antigenicity, Allergenicity and Physicochemical Evaluation

As a vaccine protein should be highly antigenic, antigenicity prediction is necessary. The designed vaccine construct was analysed using VaxiJen v2.0 (Doytchinova and Flower 2007) and cross-checked with the ANTIGENpro server (Magnan et al. 2010). VaxiJen is alignment-free and works on the basis of various physicochemical properties of the protein, whereas the ANTIGENpro server was developed from microarray analysis data. Screening for allergenicity is important, as it indicates the potential of a vaccine construct to cause sensitization and allergic reactions. We used the AllergenFP v1.0 (Dimitrov et al. 2014b) and AllerTOP v2.0 (Dimitrov et al. 2014a) servers to check the allergenicity of the vaccine protein. Because the purpose of vaccination is to induce an immune response after the vaccine is injected into the body, assessment of the physicochemical properties of the chimeric protein is essential. The primary protein sequence of the vaccine was used to predict these physicochemical features through the ProtParam web server (Wilkins et al. 1999). Furthermore, the solubility of the vaccine protein upon overexpression in E. coli was predicted with the SOLpro tool (Magnan et al. 2009) in the SCRATCH suite.

Secondary and Tertiary Structure Prediction of the Vaccine Construct

The PSIPRED v4.0 server was used to predict the secondary structure of the vaccine. PSIPRED is based on two feed-forward neural networks that process the output of PSI-BLAST (Position-Specific Iterated BLAST) to predict the secondary structure from an input amino acid sequence (Buchan et al. 2013). The tertiary structure of the final subunit vaccine was predicted using the RaptorX server, which is based on three steps, namely single-template threading, multiple-template threading and alignment quality prediction. The server predicts a 3D protein model and provides confidence scores to evaluate the quality of the predicted model. These scores, including the P-value, GDT, uGDT and the modelling error at each residue, indicate the relative quality of a model (Källberg et al. 2017).

Refinement and Validation of 3D Vaccine Construct

The predicted model of the vaccine protein was refined through the GalaxyRefine web server. This server first rebuilds side chains, then repacks them, and subsequently uses molecular dynamics simulation for overall structural relaxation (Heo et al. 2013). Structural validation is a process to identify potential errors in the predicted tertiary structure (Khatoon et al. 2017). Therefore, ProSA-web was used for structural validation, which provides an overall quality score for the input structure. If the calculated score falls outside the range characteristic of native proteins, the structure probably contains errors (Wiederstein and Sippl 2007). The ERRAT server was also used to analyse non-bonded atom–atom interactions (Colovos and Yeates 1993). Finally, a Ramachandran plot was obtained using the PROCHECK server. The Ramachandran plot visualizes the energetically allowed and disallowed dihedral angles psi (ψ) and phi (ϕ) of amino acids and is calculated based on the van der Waals radii of the side chains. The results from PROCHECK include the percentage and number of residues in the most favoured, additional allowed, generously allowed, and disallowed regions, which define the quality of the modelled structure (Laskowski et al. 1993).

Conformational B-cell Epitope Prediction from Vaccine Construct

A conformational B-cell epitope is a collection of amino acid residues on the three-dimensional surface of the vaccine protein that interacts directly with the immune receptor. Therefore, the ElliPro tool of the IEDB server was used to determine the presence of conformational B-cell epitopes in the validated tertiary structure. ElliPro implements three algorithms that approximate the protein shape as an ellipsoid, calculate the residue protrusion index (PI), and cluster neighbouring residues based on their PI values. For each predicted epitope, ElliPro reports a score defined as the PI value averaged over the epitope residues. The PI value of a residue is calculated from the position of its centre of mass lying outside the largest possible ellipsoid (Ponomarenko et al. 2008).

Disulphide Engineering of Final Vaccine Construct

Disulphide bonds are covalent interactions that stabilize molecular interactions and provide considerable stability by conferring precise geometric conformations. Disulphide engineering is an approach for introducing disulphide bonds into the target protein structure. Therefore, disulphide engineering was performed with the Disulfide by Design v2.12 web tool. Initially, the refined protein model was uploaded and searched for residue pairs that could be used for disulphide engineering. Potential residue pairs were then selected and mutated to cysteine residues (Craig and Dombkowski 2013).

In Silico Codon Adaptation and Cloning

Codon optimization is essential, as unadapted codons may lead to a low expression rate in the host. Codon optimization was performed using the Java Codon Adaptation Tool (JCat) server in order to improve translational efficiency in the E. coli K12 strain (Grote et al. 2005). Three additional options were selected to avoid rho-independent transcription terminators, prokaryotic ribosome binding sites, and restriction enzyme cleavage sites. The codon adaptation index (CAI) value and GC content of the adapted sequence were obtained and compared with the ideal range (Sharp and Li 1987). Subsequently, the resulting nucleotide sequence was cloned into the E. coli pET28a(+) vector using the SnapGene v4.2 tool.
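The two quantities checked after codon adaptation can be sketched as follows. GC content is the percentage of G and C bases, and the CAI is the geometric mean of the relative adaptiveness values of the codons in the sequence. This is a minimal sketch rather than the JCat implementation: the function names are hypothetical, and the small weight table is an invented toy example, not E. coli K12 codon usage.

```python
import math


def gc_content(dna):
    """Percentage of G and C bases in a nucleotide sequence."""
    dna = dna.upper()
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


def codon_adaptation_index(dna, weights):
    """Geometric mean of relative adaptiveness values over the codons.

    `weights` maps codon -> relative adaptiveness (0 < w <= 1). Codons that
    are absent from the table are skipped.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3           # ignore a trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    return math.exp(sum(logs) / len(logs))


# Toy weight table for illustration only (hypothetical values).
toy_weights = {"GCG": 1.0, "GCT": 0.5, "AAA": 1.0, "AAG": 0.25}
toy_sequence = "GCGAAAGCTAAG"

gc = gc_content(toy_sequence)
cai = codon_adaptation_index(toy_sequence, toy_weights)
print(f"GC content: {gc:.2f}%  CAI: {cai:.3f}")
```

A sequence built only from the most preferred codon of each amino acid would have a CAI of 1.0 under this definition, which is why values close to 1 are read as favourable for expression.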

Molecular Docking Between the Vaccine and TLR2 Receptor

Molecular docking is a computational method that models the interaction between a ligand molecule and a receptor molecule to give a stable complex. It also provides a calculated score as a measure of the degree of binding interaction (Lengauer and Rarey 1996). Toll-like receptor 2 (TLR2) can mediate strong proinflammatory responses against LASV infection (Hayes and Salvato 2012). Therefore, the TLR2 structure (PDB ID: 3A7B) was used as the receptor and the refined vaccine protein as the ligand (Berman et al. 2002). The binding affinity between the multi-epitope vaccine and the TLR2 receptor was calculated through the ClusPro v2.0 server (Kozakov et al. 2017). This server completes the task in three consecutive steps: rigid-body docking, clustering of the lowest-energy structures, and structural refinement by energy minimization. The best-docked complex was selected based on the lowest energy score and docking efficiency.

Molecular Dynamics Simulation

Molecular dynamics study is essential for checking the stability of a protein–protein complex in any in silico analysis. Protein stability can be determined by comparing essential protein dynamics to their normal modes (Van Aalten et al. 1997; Wüthrich et al. 1980). The iMODS server was used to describe the collective protein motion in internal coordinates through normal mode analysis (NMA) (López-Blanco et al. 2014). The server estimated the direction and extent of the intrinsic motions of the complex in terms of deformability, eigenvalues, B-factors, and covariance. The deformability of the main chain depends on whether a specified molecule can deform at each of its residues. The eigenvalue of each normal mode describes the rigidity of motion. This value is directly related to the energy required for structural deformation, so the lower the eigenvalue, the easier the deformation.

In Silico Evaluation of Immune Response

To estimate the immunogenic potential of the final vaccine, in silico immune simulations were conducted using the C-ImmSim server. This immune simulator uses a position-specific scoring matrix (PSSM) and machine learning techniques for the prediction of epitopes and immune interactions, respectively (Rapin et al. 2010). For most vaccines in current use, the minimum recommended interval between dose 1 and dose 2 is 4 weeks (Castiglione et al. 2012). All parameters were set at default, with injections at time steps 1, 84, and 168, where each time step is equal to 8 h. Therefore, three injections were given four weeks apart. Moreover, six injections of the designed vaccine were given in the same manner to simulate the repeated exposure to the antigen seen in a typical endemic area and to probe for clonal selection.
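The correspondence between simulation time steps and the four-week dosing interval follows from simple arithmetic, shown below as a short worked sketch. The variable and function names are illustrative only.

```python
HOURS_PER_STEP = 8                 # one simulation time step equals 8 h
injection_steps = [1, 84, 168]     # time steps at which doses are given


def steps_to_weeks(n_steps):
    """Convert a number of time steps into weeks."""
    return n_steps * HOURS_PER_STEP / (24 * 7)


# Interval between consecutive injections, in weeks.
intervals_weeks = [
    steps_to_weeks(later - earlier)
    for earlier, later in zip(injection_steps, injection_steps[1:])
]
print([round(w, 2) for w in intervals_weeks])
```

Eighty-four time steps of 8 h correspond to 672 h, i.e. exactly 28 days. Because the first dose is placed at step 1 rather than step 0, the first interval is 83 steps, marginally under four weeks, while the second is exactly four weeks.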

Results

Protein Retrieval and Highest Antigenic Protein Selection

For the construction of the candidate vaccine, a total of 1665 LASV protein sequences, comprising structural proteins (glycoprotein, matrix protein, nucleocapsid protein, nucleoprotein, ring-finger protein and Z-protein) and non-structural proteins (L-protein, polymerase, RNA-directed RNA polymerase), were retrieved from the ViPR database. Structural proteins enable viruses to invade the host and assemble viral particles, while non-structural proteins include enzymes that assist in viral replication and the production of structural proteins. The VaxiJen v2.0 server identified the most antigenic protein of each type; with a threshold of 0.5, only the glycoprotein, matrix protein, ring-finger protein and Z-protein showed antigenicity (Table 1).

Table 1 The most antigenic protein of each type, along with their GenBank accession ID, antigenicity score, and length

Prediction and Appraisal of CTL, HTL and LBL Epitopes

A total of 180 unique CTL epitopes (9-mer) were predicted from the four most antigenic LASV proteins using the NetCTL v1.2 server: 97, 22, 32 and 29 CTL epitopes from the glycoprotein, matrix protein, ring-finger protein and Z-protein, respectively. Among them, only 42 epitopes were found to be antigenic, immunogenic, and non-toxic (Supplementary Table 1). Of these 42 epitopes, 22 were found to be non-allergenic. The MHC-I binding alleles of these non-allergenic epitopes were then predicted using the MHC-I allele prediction tool of the IEDB server. Similarly, a total of 72 unique HTL epitopes (15-mer) and their MHC-II binding molecules were predicted using the IEDB MHC-II prediction tool. The cytokine-inducing ability (i.e., IFN-γ, IL-4 and IL-10) of these HTL epitopes was also evaluated (Supplementary Table 2). B-cell epitopes are antigenic regions of a protein that can trigger antibody formation. The iBCE-EL tool was used to predict linear B-lymphocyte (LBL) epitopes from the LASV proteins. We found a total of 101 LBL epitopes, of which only 37 unique epitopes (14, 5, 9 and 9 from the glycoprotein, matrix protein, ring-finger protein and Z-protein, respectively) were found to be non-allergenic and non-toxic and were considered for vaccine construction (Supplementary Table 3).

Construction of Multi-epitope Vaccine

For the design of the multi-epitope vaccine, we considered highly antigenic CTL epitopes from every type of protein that were immunogenic, non-allergenic and non-toxic (Table 2). For HTL epitope selection, we screened cytokine-inducing properties and found that only two epitopes from the glycoprotein were able to induce all three types of cytokine, while epitopes from the other proteins were positive for a maximum of two cytokines. Hence, we selected these two epitopes from the glycoprotein and, for the other proteins, epitopes able to induce at least two cytokines (Table 3). As we obtained numerous B-cell epitopes with high antigenicity, non-toxicity and non-allergenicity, we took the epitope with the best probability score (obtained from the iBCE-EL tool) from each type of protein (Table 4). Accordingly, 6 CTL, 8 HTL and 4 LBL epitopes were merged with AAY, GPGPG and KK linkers, respectively (Fig. 1). The OmpA agonist (GenBank ID: AFS89615.1), which is 352 amino acid residues long, was attached through an EAAAK linker as an adjuvant targeting the TLR2 receptor. The final vaccine construct comprises 642 amino acid residues.

Table 2 Finally selected cytotoxic T-lymphocyte (CTL) epitopes for multi-epitope vaccine construction
Table 3 Finally selected helper T-lymphocyte (HTL) epitopes with their cytokine inducing properties
Table 4 Finally selected linear B-lymphocyte (LBL) epitopes for multi-epitope vaccine construction
Fig. 1 Graphical presentation of the multi-epitope vaccine construct. The 642-amino-acid-long vaccine construct consists of an adjuvant (green) at the N-terminal end linked with the whole multi-epitope sequence through an EAAAK linker (red). CTL, HTL and LBL epitopes are fused with the help of AAY (yellow), GPGPG (light blue) and KK (light green) linkers, respectively (Color figure online)

Population Coverage Analysis

The distribution of HLA alleles varies between different geographical and ethnic regions around the globe. Therefore, population coverage must be taken into account during the development of an efficient vaccine. In this study, the selected CTL and HTL epitopes used to construct the vaccine and their corresponding HLA alleles (Supplementary Tables 4 and 5) were subjected to population coverage analysis both individually and in combination. Our selected CTL and HTL epitopes were found to cover 91.74% and 68.15% of the world population, respectively. Importantly, CTL and HTL epitopes showed 97.37% population coverage worldwide when used in combination. The highest population coverage, 99.16%, was found in the Chile Amerindian population of South America. In West Africa, where the virus first appeared and has caused several outbreaks, the population coverage was 88.86%. LASV has also caused cases in other countries of the world, notably England, Germany, and the Netherlands, where the population coverage was found to be 98.19%, 95.47%, and 69.80%, respectively (Supplementary Table 6). In addition, the population coverage in China, Europe, France, India, Japan, South Korea, Russia, South Asia, and the United States was found to be 77.56%, 97.49%, 94.65%, 98.22%, 94.52%, 97.58%, 88.13%, 98.49% and 98.70%, respectively (Fig. 2).
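The relationship between the individual and combined worldwide figures can be illustrated with a simple probabilistic sketch. If coverage by MHC-I and by MHC-II restricted epitopes is treated as independent (an assumption made here for illustration; the IEDB tool itself works from HLA genotypic frequencies), the combined coverage is one minus the product of the uncovered fractions. The function name below is hypothetical.

```python
def combined_coverage(*coverages):
    """Combine coverage fractions, assuming the underlying events are independent."""
    uncovered = 1.0
    for fraction in coverages:
        uncovered *= (1.0 - fraction)   # fraction of people covered by none so far
    return 1.0 - uncovered


# World coverage reported above: CTL (MHC-I) 91.74%, HTL (MHC-II) 68.15%.
world_combined = combined_coverage(0.9174, 0.6815)
print(f"{100 * world_combined:.2f}%")   # 97.37%
```

Under this independence assumption the result agrees with the combined worldwide coverage of 97.37% reported above.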

Fig. 2 Population coverage of the selected T-cell epitopes and their respective HLA alleles. Regions of particular importance were considered in this graph with their MHC-I (red), MHC-II (blue) and combined (green) coverage rate (Color figure online)

Physicochemical Analysis and Solubility Prediction

Multiple physicochemical properties were calculated with the ProtParam server by submitting the amino acid sequence of the whole vaccine construct. The molecular weight of the construct was calculated as ~ 68 kDa, and the antigenicity prediction showed that the construct has good antigenic properties. The theoretical isoelectric point (pI) was 9.31, which indicates that the vaccine construct is basic in nature. The instability index (II) was computed to be 29.05, which implies that the construct will remain stable after expression (values below 40 indicate a stable protein). The aliphatic index was calculated as 84.98, which indicates the thermostability of the construct. The grand average of hydropathicity (GRAVY) was calculated to be negative (− 0.118). This negative value indicates the hydrophilic nature of the protein; therefore, this protein tends to have better interaction with other proteins. The estimated half-life in mammalian reticulocytes (in vitro) was found to be 30 h, while in yeast and Escherichia coli the estimated half-lives (in vivo) were > 20 h and > 10 h, respectively. The immunogenic appraisal also revealed that our vaccine construct is highly antigenic and non-allergenic, and the SOLpro server predicted a high solubility. These evaluations suggest that our vaccine construct might be an ideal vaccine against LASV (Table 5).
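Two of the indices above have simple closed forms and can be sketched directly. GRAVY is the mean Kyte-Doolittle hydropathy value over all residues, and the aliphatic index is computed from the mole percentages of Ala, Val, Ile and Leu as X(Ala) + 2.9 × X(Val) + 3.9 × (X(Ile) + X(Leu)). The sketch below is an illustration and not the ProtParam implementation; the function names and the toy peptide are hypothetical.

```python
# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    sequence = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    sequence = sequence.upper()

    def mole_percent(aa):
        return 100.0 * sequence.count(aa) / len(sequence)

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


# Toy hexapeptide for illustration; not part of the vaccine construct.
toy_peptide = "AVILKD"
print(round(gravy(toy_peptide), 3), round(aliphatic_index(toy_peptide), 2))
```

A negative GRAVY, as reported for the construct, means that hydrophilic residues outweigh hydrophobic ones on this scale.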

Table 5 Antigenic, allergenic and physicochemical assessments of the primary sequence of final vaccine protein

Secondary Structural Feature Prediction

The PSIPRED v4.0 workbench was used to predict the secondary structure of the final vaccine construct. Of the 642 amino acids in the construct, 352 were predicted to form random coils, 149 to form α-helices, and 141 to form β-strands. Overall, the secondary structure prediction therefore indicates 54.83% random coil, 23.21% α-helix and 21.96% β-strand (Fig. 3).

Fig. 3 Graphical representation of secondary structure prediction of the multi-epitope vaccine. Here, the β-strands, α-helix and random coils are indicated by yellow, pink and blue color, respectively

Tertiary Structure Modelling and Refinement

The tertiary structure of the multi-epitope vaccine was modelled with the RaptorX server, which modelled all 642 amino acid residues (100%) as six domains. The protein structure with PDB ID 1bxwA was used as the best template for modelling. The relative quality of the modelled structure was evaluated by its P-value; the lower the P-value, the higher the quality of the model. The calculated P-value for the modelled structure was 5.39 × 10⁻⁶, which is very low. Furthermore, uGDT is also used as a parameter for 3D structure evaluation, and for a construct with > 100 residues, uGDT > 50 is a good indicator. The uGDT score of 354 therefore indicates that the tertiary model is acceptable for further analysis (Supplementary Table 7). The predicted tertiary (3D) structure was further refined using the GalaxyRefine server, which generated five refined models and increased the number of residues in the favoured region. The best refined model showed 93.6% of residues in the most favoured region of the Ramachandran plot, a GDT-HA score of 0.9276, RMSD of 0.490, MolProbity score of 2.052, clash score of 13.2 and poor rotamers of 0.8 (Supplementary Table 8). After comparison of all five models, these values identified it as the best-quality model, and it was selected for further studies (Fig. 4).

Fig. 4 Tertiary structure of the multi-epitope vaccine after refinement indicating α-helix (red), β-strand (yellow) and random coil (green) in colour (Color figure online)

Validation of Refined Tertiary Structure

The refined tertiary structure was validated using the PROCHECK, ProSA-web and ERRAT servers. Ramachandran plot analysis of the crude structure by the PROCHECK server revealed that 85.9% of the residues were located in the most favoured regions, 11.1% in additional allowed regions, 2.0% in generously allowed regions and 1.0% in disallowed regions. After refinement, PROCHECK gave a better result: 89.7% of residues were located in the most favoured regions, 8.6% in additional allowed regions, 1.0% in generously allowed regions and 0.8% in disallowed regions (Fig. 5a). ProSA-web and ERRAT were used to assess the quality of the 3D model and to detect potential errors in it. The selected best model after refinement had an overall quality factor of 78.2% with ERRAT, while ProSA-web gave a Z-score of − 4.23 for the input vaccine protein model, indicating that the model lies just within the range of native protein conformations (Fig. 5b).

Fig. 5 Structural validation of the tertiary structure of the vaccine construct. a Ramachandran plot of the refined model, where the most favoured, allowed and disallowed regions are 89.7%, 9.6% and 0.8%, respectively. b ProSA-web validation of the 3D structure showing the Z-score (− 4.23)

Conformational B-cell Epitope Prediction

Conformational B-cell epitopes were predicted using ElliPro, an online web server that predicts epitopes based on the tertiary structure. A total of 305 residues with scores varying from 0.566 to 0.944 were predicted to be located in eight conformational B-cell epitopes (Fig. 6, Table 6). The epitopes ranged in size from 4 to 93 residues.

Fig. 6 Graphical presentation of eight conformational B-cell epitopes of the peptide vaccine. Here the predicted epitopes are indicated by cyan colour (spheres) and the rest of the residues are in grey (cartoon) (Color figure online)

Table 6 Predicted conformational B-cell epitopes of the peptide vaccine

Disulphide Bridging for Vaccine Stability

Disulphide engineering was performed using Disulfide by Design v2.12 to stabilize the modelled structure of the final vaccine construct. In total, 54 pairs of residues that could be used for disulphide engineering were identified (Supplementary Table 9). However, only two pairs of residues were retained after evaluation of other parameters such as the energy score and χ3 angle, as only their values fall within the allowed range, i.e. the energy should be less than 2.2 kcal/mol and the χ3 angle should lie between − 87° and + 97° (Craig and Dombkowski 2013). Therefore, a total of four mutations were generated on the residue pairs. For the Ala11–Ala19 residue pair, the energy score was 0.98 kcal/mol and the χ3 angle was 85.64°, whereas for Cys325–Cys337, the χ3 angle and the energy were − 74.33° and 1.93 kcal/mol, respectively (Fig. 7).
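The selection criteria stated above (energy below 2.2 kcal/mol and χ3 between − 87° and + 97°) amount to a simple filter over the candidate residue pairs. The sketch below applies that filter to the two reported pairs plus one invented pair that fails both criteria; the class and function names and the failing example are hypothetical, and the sketch does not reproduce the Disulfide by Design calculations themselves.

```python
from dataclasses import dataclass


@dataclass
class CandidatePair:
    name: str        # residue pair label
    energy: float    # bond energy in kcal/mol
    chi3: float      # chi3 torsion angle in degrees


def is_acceptable(pair, max_energy=2.2, chi3_min=-87.0, chi3_max=97.0):
    """Apply the energy and chi3 criteria quoted in the text."""
    return pair.energy < max_energy and chi3_min <= pair.chi3 <= chi3_max


candidates = [
    CandidatePair("Ala11-Ala19", 0.98, 85.64),      # reported above
    CandidatePair("Cys325-Cys337", 1.93, -74.33),   # reported above
    CandidatePair("hypothetical pair", 3.10, 110.0),  # invented, fails both criteria
]

selected = [pair.name for pair in candidates if is_acceptable(pair)]
print(selected)
```

Both reported pairs pass the filter, consistent with their selection from the 54 candidates.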

Fig. 7

Disulphide engineering of the vaccine construct to improve stability. The two mutated pairs, selected based on their energy, χ3 value and B-factor, are shown in green and red (Color figure online)

Codon Optimization and In Silico Cloning

The primary purpose of in silico cloning was to express the LASV-derived vaccine protein in the E. coli expression system. It was therefore necessary to adapt the codons of the subunit vaccine construct to the codon usage of E. coli. The Java Codon Adaptation Tool (JCat) was used to optimize the codon usage of the vaccine construct for maximal protein expression in E. coli (strain K12). The optimized codon sequence was 1926 nucleotides long. Codon optimization also reports the GC content of the cDNA sequence and the codon adaptation index (CAI). The GC content was 53.63%, which lies within the optimum range of 30–70%, and the CAI was 0.98, which lies within the range of 0.8–1.0; together these values indicate that good expression of the vaccine candidate in the E. coli host is likely. XhoI (158) and NdeI (238) restriction sites were then introduced, and the sequence was cloned into the pET28a(+) vector using SnapGene software (Fig. 8). The total length of the clone was 7.22 kbp.
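For readers unfamiliar with the two metrics, the following sketch shows how GC content and a simplified CAI (the geometric mean of per-codon relative adaptiveness weights) can be computed. The toy sequence and the weight table are hypothetical and are not E. coli K12 values; JCat's own implementation may differ in detail.

```python
import math


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(1 for base in seq if base in "GC") / len(seq)


def cai(seq: str, weights: dict) -> float:
    """Simplified CAI: geometric mean of the weights of all codons in seq."""
    seq = seq.upper()
    if not seq or len(seq) % 3 != 0:
        raise ValueError("sequence length must be a non-zero multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    log_sum = sum(math.log(weights[codon]) for codon in codons)
    return math.exp(log_sum / len(codons))


# Hypothetical toy input, for illustration only.
toy_sequence = "ATGGCGCTGAAA"
toy_weights = {"ATG": 1.0, "GCG": 1.0, "CTG": 1.0, "AAA": 0.5}

toy_gc = gc_content(toy_sequence)
toy_cai = cai(toy_sequence, toy_weights)

# Values reported in the text, checked against the stated ranges.
reported_gc, reported_cai, reported_length = 53.63, 0.98, 1926
gc_ok = 30.0 <= reported_gc <= 70.0
cai_ok = 0.8 <= reported_cai <= 1.0
whole_codons = reported_length % 3 == 0

print(f"toy GC = {toy_gc:.2f}%, toy CAI = {toy_cai:.4f}")
print(f"reported values within range: GC {gc_ok}, CAI {cai_ok}, "
      f"length is a whole number of codons: {whole_codons}")
```

A CAI close to 1.0 means that nearly every codon is the one most frequently used by the host for that amino acid.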

Fig. 8

In silico cloning of the final vaccine construct into the pET28a(+) expression vector. The red part indicates the coding gene for the vaccine, flanked by XhoI (158) and NdeI (2091), while the vector backbone is shown as a black circle (Color figure online)

Molecular Docking of Vaccine with Immune Receptor

To assess the interaction between the refined model and the TLR2 immune receptor (PDB ID: 3A7B), molecular docking was performed using the online server ClusPro v2.0, which generated a total of 30 models (Supplementary Table 10). Among them, the model that occupied the receptor properly and had the lowest energy score was selected. Model 1 fulfilled these criteria and was therefore chosen as the best-docked complex (Fig. 9). Its energy score was − 1406, the lowest among all predicted docked complexes, suggesting the highest binding affinity.

Fig. 9

The docked complex of vaccine protein and TLR-2 (PDB ID- 3A7B) receptor. Here the receptor is represented in green colour whereas the vaccine protein is in red (Color figure online)

Molecular Dynamics Simulation of the Vaccine-TLR2 Complex

Normal mode analysis (NMA) was conducted to examine the stability of the proteins and their large-scale mobility. The analysis was performed with the iMODS server, which works in the internal coordinates of the docked complex. The deformability of the complex depends on the individual distortion of each residue, depicted by chain hinges (Fig. 10b). The eigenvalue found for the complex was 9.857553e−08 (Fig. 10a). The variance associated with each normal mode is inversely related to its eigenvalue (Kovacs et al. 2004). The B-factor values generated from the normal mode analysis were proportional to the RMS (Fig. 10c). The covariance matrix indicates the coupling between pairs of residues, with correlated, anti-correlated and uncorrelated motions represented by red, blue and white colours, respectively (Fig. 10d). The result also provided an elastic network model that identifies the pairs of atoms connected by springs (Fig. 10e). Each dot in the diagram represents one spring between the corresponding pair of atoms, coloured by its degree of stiffness; the darker the grey, the more rigid the spring.
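The inverse relation between eigenvalue and variance can be made concrete with a short sketch. It assumes that the variance of mode k is proportional to 1/eigenvalue_k; only the first eigenvalue below is the value reported above, and the remaining values are hypothetical. This is not the iMODS implementation.

```python
def variance_fractions(eigenvalues):
    """Share of total variance per mode, assuming variance_k ~ 1 / eigenvalue_k."""
    if not eigenvalues or any(value <= 0 for value in eigenvalues):
        raise ValueError("eigenvalues must be a non-empty list of positive numbers")
    inverse = [1.0 / value for value in eigenvalues]
    total = sum(inverse)
    return [value / total for value in inverse]


def cumulative(fractions):
    """Running total of the per-mode variance fractions."""
    running, result = 0.0, []
    for fraction in fractions:
        running += fraction
        result.append(running)
    return result


# First value: eigenvalue reported for the complex. Others: hypothetical.
eigenvalues = [9.857553e-08, 2.0e-07, 4.0e-07, 8.0e-07]
fractions = variance_fractions(eigenvalues)
cumulative_fractions = cumulative(fractions)

for mode, (part, total) in enumerate(zip(fractions, cumulative_fractions), start=1):
    print(f"mode {mode}: variance share {part:.3f}, cumulative {total:.3f}")
```

Under this assumption the mode with the lowest eigenvalue, which is the easiest to deform, accounts for the largest share of the variance.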

Fig. 10

Molecular dynamics simulation of the vaccine-TLR2 complex, showing a eigenvalue; b deformability; c B-factor; d covariance matrix; and e elastic network analysis

In Silico Immune Simulation

The simulated immune response was consistent with actual immune responses (Fig. 11). For instance, the secondary and tertiary responses were higher than the primary response. The primary response was characterized by a high concentration of IgM. In both the secondary and tertiary responses, the typical high levels of immunoglobulin activity (i.e., IgG1 + IgG2, IgM, and IgG + IgM antibodies) were evident, with a concomitant reduction in antigen (Fig. 11a). This indicates the emergence of immune memory and thus increased antigen clearance upon subsequent exposures (Fig. 11e). Furthermore, several long-lasting B-cell isotypes were observed, suggesting the potential for isotype switching and memory formation (Fig. 11b, c). A similarly elevated response, with the respective memory development, was seen in the TH (helper) and TC (cytotoxic) cell populations (Fig. 11d–f). During exposure, increased macrophage activity was observed, with continuously proliferating dendritic cells (Fig. 11g, h). High levels of IFN-γ and IL-2 were also evident. In addition, a lower Simpson index (D) indicates greater diversity (Fig. 11i). Moreover, a simulation of multiple exposures (n = 6), mimicking repeated encounters in endemic regions, led to an increase in the concentration of IgG1 and a decrease in IgM and IgG2 levels, while high concentrations of IFN-γ, IL-2, and TC and TH cell populations were maintained (Supplementary Fig. 1). This profile suggests the development of immune memory and, therefore, natural immune protection against the virus.
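The statement that a lower Simpson index indicates greater diversity follows from the classic definition D = Σ p_i², where p_i is the proportion of the i-th type. The sketch below assumes this definition (the exact formula used by the simulation server may differ) and uses hypothetical counts.

```python
def simpson_index(counts):
    """Classic Simpson index D = sum(p_i ** 2) over the proportions p_i."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return sum((count / total) ** 2 for count in counts)


# Hypothetical populations of four types, for illustration only.
even_population = [25, 25, 25, 25]   # evenly spread: high diversity
skewed_population = [97, 1, 1, 1]    # one dominant type: low diversity

d_even = simpson_index(even_population)
d_skewed = simpson_index(skewed_population)

print(f"D (even)   = {d_even:.4f}")
print(f"D (skewed) = {d_skewed:.4f}")
```

The evenly spread population yields the smaller D, which is why a lower value is read as greater diversity.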

Fig. 11

C-ImmSim presentation of an in silico simulation of immune response using the chimeric peptide as antigen, showing a Immunoglobulin production in response to antigen injection, b B cell population after three injections, c B cell population per state, d The evolution of T-helper cell, e T-helper cell population per state, f Cytotoxic T-cell population per state, g Macrophages population per state, h Dendritic cell population per state, and i Production of cytokine and interleukins with Simpson index (D)

Discussion

LASV has a high mortality rate and the potential to cause catastrophe in endemic regions such as West Africa. Developing a preventive measure against LASV is therefore imperative. Vaccination is currently the most dynamic approach to improving the immune system's ability to fight infectious diseases. However, the development and manufacture of live or attenuated vaccines is expensive and can take years to complete. Moreover, the excessive antigenic load of an attenuated vaccine appears not only to contribute little to the protective immune response but also to complicate matters by causing allergic reactions (Li et al. 2014). Compared to traditional vaccines, multi-epitope vaccines reduce the unwanted components that can cause pathological immune responses or adverse effects (Zhang 2018). The potential benefits of epitope-based vaccines also include increased safety, cost-effectiveness, the opportunity to rationally engineer the epitopes for increased potency and breadth, and the ability to focus immune responses on conserved epitopes (Shey et al. 2019). For many years, researchers have sought to minimize the cost, time and side-effects of vaccine development. Different immunoinformatic strategies are now readily available for designing and developing efficient new-generation epitope-based vaccines (María et al. 2017; Seib et al. 2012). Researchers have also used immunoinformatics methods to propose models of multi-epitope vaccines against Ebola virus, Hepatitis C virus, Oropouche virus, Dengue virus, etc. (Adhikari et al. 2018; Ali et al. 2017; Dash et al. 2017; Ikram et al. 2018). Given these advantages of multi-epitope vaccines, our foremost concern was to construct a vaccine capable of eliciting a robust immune response after vaccination. Although there have been a few attempts to suggest peptide vaccine candidates against LASV, this is the first approach to recommend a fully functional multi-peptide vaccine that has been evaluated by in silico approaches (Faisal et al. 2017; Hossain et al. 2018; Verma et al. 2015).

The ViPR database was used to retrieve the complete sequences of LASV and, after screening the different proteins, four protein sequences were selected for their higher antigenicity. CTL, HTL and LBL epitopes were then chosen as vaccine candidates using different servers and databases. An effective multi-epitope vaccine should include CTL, HTL and B-cell epitopes capable of inducing efficient responses against a specific tumour or virus (Zhang 2018). We were interested in incorporating B-cell epitopes because of the role of B cells in inducing antibody production and mediating effector functions (Cooper and Nemerow 1984). Over time, however, the humoral response from memory B cells can easily be overcome by the emergence of antigens, whereas cell-mediated immunity (T-cell immunity) often leads to lifelong immunity (Bacchetta et al. 2005). CTLs limit pathogen spread by identifying and destroying infected cells and by secreting specific antiviral cytokines (Garcia et al. 1999). Therefore, both B- and T-cell epitopes were predicted for the vaccine construct.

The vaccine candidates were selected from the HTL, CTL and B-cell epitopes based on their antigenicity, allergenicity, immunogenicity and toxicity. Helper T-cells that release cytokines such as interferon-gamma (IFN-γ), interleukin-4 (IL-4) and interleukin-10 (IL-10) have the potential to overcome the pro-inflammatory response and therefore reduce tissue damage. Helper T-cells also help to activate the innate immune system, B-lymphocytes, cytotoxic T cells and other immune cells. Thus, the cytokine-inducing ability (i.e., IFN-γ, IL-4 and IL-10) of specific HTL epitopes was also evaluated when choosing candidates. The vaccine was constructed by joining the CTL, HTL and B-cell epitopes with AAY, GPGPG and KK linkers, respectively. Linkers are an indispensable element in the development of a vaccine protein, as they enhance expression, folding and stabilization (Shamriz et al. 2016). Furthermore, the OmpA agonist (GenBank ID: AFS89615.1) was used as a TLR2 adjuvant and joined to the first CTL epitope using an EAAAK linker (Arai et al. 2001). When used alone, multi-epitope vaccines are poorly immunogenic and require coupling to adjuvants (Meza et al. 2017). Adjuvants are ingredients added to vaccine formulations that influence the development, stability and longevity of specific immune responses to antigens and help protect against infection (Lee and Nguyen 2015). They have also attracted great attention because they can selectively modulate humoral and cell-mediated immune responses (Bonam et al. 2017). Although the construct is long for a peptide vaccine, several studies have reported even longer vaccine constructs (Chatterjee et al. 2018; Kalita et al. 2019; Rahmani et al. 2019). We therefore do not expect its length to be a problem in terms of stability and expression. When assessing the vaccine construct with the Vaxijen server, we observed that the construct without the adjuvant showed lower antigenicity (0.624) than the construct with the adjuvant (0.7223), which indicates that the adjuvant contributes substantially to the chimera.
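The linker scheme described in this paragraph can be sketched as a simple string assembly. The placeholders in angle brackets stand in for the real adjuvant and epitope sequences, which are not reproduced here. How the CTL, HTL and B-cell blocks are joined to one another is not stated in this section, so the sketch assumes that each block is introduced by its own linker; the function name is hypothetical.

```python
def assemble_construct(adjuvant, ctl_epitopes, htl_epitopes, bcell_epitopes):
    """Join adjuvant and epitope blocks using the linkers named in the text."""
    # Adjuvant is joined to the first CTL epitope with EAAAK; CTLs use AAY.
    parts = [adjuvant, "EAAAK", "AAY".join(ctl_epitopes)]
    if htl_epitopes:
        # Assumption: the HTL block is introduced by its own GPGPG linker.
        parts += ["GPGPG", "GPGPG".join(htl_epitopes)]
    if bcell_epitopes:
        # Assumption: the B-cell block is introduced by its own KK linker.
        parts += ["KK", "KK".join(bcell_epitopes)]
    return "".join(parts)


# Placeholders only; these are not real sequences.
construct = assemble_construct(
    adjuvant="<ADJ>",
    ctl_epitopes=["<CTL1>", "<CTL2>"],
    htl_epitopes=["<HTL1>", "<HTL2>"],
    bcell_epitopes=["<LBL1>", "<LBL2>"],
)
print(construct)
```

Replacing the placeholders with actual sequences would give a linear construct in the order adjuvant, CTL, HTL and B-cell epitopes.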

The molecular weight of our vaccine candidate is ~ 68 kDa, which is an average molecular weight for a multi-epitope vaccine. Solubility of the overexpressed recombinant protein within the E. coli host is one of the fundamental requirements of many biochemical and functional analyses (Khatoon et al. 2017). The constructed vaccine protein was found to be soluble, which ensures its easy access to the host. The theoretical pI value indicates that the vaccine is basic in nature. In addition, the predicted instability index shows that the protein will remain stable after expression, further enhancing its usability. The GRAVY score and aliphatic index reflect its hydrophilicity and thermostability, respectively.

3D structure modelling provides information on the spatial arrangement of crucial protein components and is an excellent aid in investigating protein function, dynamics, and interactions with ligands and other proteins. The desirable properties of the vaccine construct improved significantly after refinement. The Ramachandran plot shows that most residues lie in the favoured and allowed regions (99.2%), with very few residues in the disallowed region, which indicates that the overall quality of the model is satisfactory. In addition, the GDT-HA, RMSD, MolProbity, clash score and poor rotamer values indicate the good quality of our designed vaccine construct. Different structure validation tools were used to identify errors in the modelled vaccine construct. The Z-score (− 4.23) and ERRAT quality factor (78.2%) showed that the overall structure of the refined vaccine is adequate.

HLA alleles govern the response to T-cell epitopes, and these alleles are highly polymorphic across different ethnic communities. A T-cell epitope should bind to more HLA alleles to achieve greater population coverage. We therefore used the selected CTL and HTL epitopes with their respective HLA alleles to predict population coverage worldwide. The findings showed that the chosen epitopes and their respective alleles cover numerous geographic regions of the globe well. The highest population coverage, 99.16%, was recorded for Chile Amerindian, and the epitopes and their respective HLA alleles cover 97.37% of the world population when used in combination. LASV epidemics have been most severe in West Africa, in particular in Nigeria, Senegal and Mali. Vaccine candidates are therefore essential to protect people in these regions against LASV infection. The population coverage was found to be 88.86% in West Africa, where the virus first appeared and has caused several outbreaks.
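Why binding to more HLA alleles raises coverage can be shown with a simplified calculation. The sketch assumes Hardy-Weinberg proportions within each locus and independence between loci, and all allele frequencies are hypothetical; the tool used in this study may compute coverage differently.

```python
from math import prod


def locus_coverage(allele_frequencies):
    """Fraction of individuals carrying at least one covered allele at a locus.

    Assumes Hardy-Weinberg proportions: an individual is uncovered only when
    both of its alleles fall outside the covered set.
    """
    covered = sum(allele_frequencies)
    if not 0.0 <= covered <= 1.0 + 1e-9:
        raise ValueError("summed allele frequencies must lie between 0 and 1")
    return 1.0 - (1.0 - covered) ** 2


def combined_coverage(per_locus):
    """Coverage across loci, assuming the loci are independent."""
    return 1.0 - prod(1.0 - coverage for coverage in per_locus)


# Hypothetical frequencies of epitope-binding alleles in one population.
locus_one = [0.25, 0.20, 0.15]
locus_two = [0.30, 0.20]

coverage_one = locus_coverage(locus_one)
coverage_two = locus_coverage(locus_two)
coverage_total = combined_coverage([coverage_one, coverage_two])

print(f"locus one: {coverage_one:.2%}, locus two: {coverage_two:.2%}, "
      f"combined: {coverage_total:.2%}")
```

Adding a further binding allele to either list can only increase the summed frequency, and hence the coverage.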

Data-driven protein-receptor docking analysis and molecular dynamics simulation were carried out to evaluate the potential immune interaction and stability between TLR2 and the vaccine protein, given the use of a TLR2 agonist as an adjuvant in the constructed chimera. Energy minimization was conducted to minimize the potential energy of the whole system for the conformational stabilization of the vaccine protein-TLR2 docked complex. Energy minimization repairs inappropriate structural geometry by repositioning individual protein atoms, thus making the structure more stable with adequate stereochemistry. The derived eigenvalue indicates the stiffness of motion and the energy required to deform the complex.

Immunoreactivity testing through serological assessment is one of the first steps in validating a candidate vaccine (Gori et al. 2013). This requires expression of the recombinant protein in a suitable host. E. coli expression systems are preferred for recombinant protein production (Chen 2012; Rosano and Ceccarelli 2014). Codon optimization was performed to achieve a high level of expression of our recombinant vaccine protein in E. coli K12. Both the codon adaptation index (0.98) and the GC content (53.63%) were favourable for high-level protein expression in bacteria. Enhancing protein stability is an indispensable objective in various biomedical and mechanical applications. In this study, we introduced disulphide bridges into the multi-epitope vaccine construct to improve protein thermostability, modify its functional features and assist in the analysis of genetic components.

The immune simulation gave results consistent with typical immune responses. There was an overall increase in immune responses following repeated exposure to the antigen. The development of memory B-cells and T-cells was evident, with memory B-cells lasting several months. Another notable finding is that the concentrations of IFN-γ and IL-2 increased after the first injection and remained at peak levels after repeated antigen exposure. This finding indicates elevated TH cell concentrations and therefore efficient Ig production, supporting a humoral response. The activity of both dendritic cells and macrophages was satisfactory in our study. In addition, components of the innate immune system such as epithelial cells were active. The Simpson index (D), used for clonal specificity analysis, suggests a possibly diverse immune response.

Conclusion

Lassa virus is an emerging viral pathogen that causes severe haemorrhagic fever with a high mortality rate and has therefore become an increasing concern. Although the antiviral drug ribavirin showed some promise earlier, excessive toxicity and teratogenicity have rendered its effectiveness questionable. Given the merits that a peptide vaccine has to offer, immunoinformatics strategies were adopted to design a multi-epitope vaccine. Both T-cell and B-cell epitopes derived from different LASV proteins were included in the vaccine to produce an effective immune response. We expect that our vaccine will generate cell-mediated and humoral immune responses. The predicted binding between the vaccine protein and the receptor was strong and stable. In addition, the immune simulation showed immune responses consistent with those seen in real life. However, further in vitro and in vivo investigations are warranted to establish its true potential against Lassa fever.