Introduction

Viruses are obligate intracellular parasites that require host cell machinery for even the most basic biological functions, such as nucleic acid replication and protein synthesis [1]. The first major step in a viral infection cycle is the attachment and entry of the virus into a suitable host cell, after which further pathogenesis occurs [2, 3]. Viruses interact with host cell surface biomolecules, such as proteins or carbohydrates, to attach to and enter host cells. These interactions are generally specific and often confer tissue- and species-specific tropisms on these viruses [2, 4, 5].

As viruses depend on host cellular machinery for their basic biological processes, many have evolved to mimic the host interactions that drive these processes [6]. These adaptations, referred to as molecular mimicry, enable the virus to hijack existing host signalling networks (Fig. 1A) [6]. Studies have shown that such mimicry can confer various advantages on the virus, including entry, immune modulation and evasion, and replication, which ultimately aid further viral infection and spread [7]. At the molecular level, mimicry can be either structural or sequence-based [6]. In this article, we review some well-characterised sequence motifs involved in host–virus interactions that play a role in the attachment and entry of the virus into host cells and that mimic physiological receptor–ligand interactions present in host cells.

Fig. 1

Viral mimicry of host SLiMs for entry into host cells. A Viruses often utilise existing host cellular networks for carrying out their own functions. One such mechanism is the mimicry of the host cell’s short linear motif-mediated physiological receptor–ligand interactions for attachment and entry of the virus into the host cell. B and C The downstream signalling triggered by such mimicry can follow the same pathway as the physiological interaction (B), or may follow a different pathway (C)

Viral mimicry of host short linear motif-mediated interactions

Short linear motifs (SLiMs), or mini motifs, are short sequences of ~3–10 amino acids that mediate specific molecular functions [6, 8]. They are generally present in relatively unstructured or disordered regions of proteins, and can easily appear or disappear during evolution because of point mutations. They also typically mediate transient, low-affinity interactions and thus contribute greatly to protein interaction networks [6, 8, 9].

Due to the properties mentioned above, mimicking SLiMs is a widespread molecular mimicry mechanism utilised by many pathogens, including viruses [6, 7]. Moreover, viral mimicry directly disrupts host signalling mechanisms by competitively binding to host surface receptors and preventing physiological interactions [7]. Furthermore, these viral interactions may trigger downstream responses that are similar to the physiological ones (Fig. 1B), or lead to an entirely different response (Fig. 1C).

In this article, we review some well-characterised SLiMs involved in host–virus interactions that play a role in the attachment and entry of the virus into host cells and that mimic physiological receptor–ligand interactions present in host cells (Table 1).
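Because the motifs reviewed here are defined by short residue patterns, each can be written as a regular expression and searched for in a protein sequence. The following is a minimal sketch under stated assumptions, not a validated discovery method: the names `MOTIF_PATTERNS`, `scan_motifs` and `TOY_SEQUENCE` are hypothetical, and the sequence is invented for illustration rather than taken from any viral or host protein. A pattern match alone does not show that a motif is surface-exposed or functional.

```python
import re

# Motif names follow the section headings of this review; "X" (any residue)
# is written as "." in regular-expression syntax.
MOTIF_PATTERNS = {
    "CX3C": r"C.{3}C",
    "GPR": r"GPR",
    "RGD": r"RGD",
    "KGD": r"KGD",
    "DGE": r"DGE",
    "(R/K)XX(R/K)": r"[RK]..[RK]",
    "LDV": r"LDV",
    "LDI": r"LDI",
    "IDA": r"IDA",
    "YGL": r"YGL",
}


def scan_motifs(sequence):
    """Return {motif name: [(1-based start, matched text), ...]} for a
    one-letter amino acid sequence. A lookahead is used so that
    overlapping occurrences are also reported."""
    sequence = sequence.upper()
    hits = {}
    for name, pattern in MOTIF_PATTERNS.items():
        lookahead = re.compile(f"(?=({pattern}))")
        found = [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence)]
        if found:
            hits[name] = found
    return hits


# Hypothetical toy sequence, invented for illustration only.
TOY_SEQUENCE = "MACWAICKRGDSLDVRRAR"
hits = scan_motifs(TOY_SEQUENCE)
for name, found in hits.items():
    print(name, found)
```

Such a scan only nominates candidate sites; whether a candidate actually mediates receptor binding has to be established experimentally, as in the studies discussed below.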

Table 1 Some well-characterised sequence motifs involved in viral molecular mimicry for attachment and entry into host cells

CX3C motif

The CX3C (Cys-Xxx-Xxx-Xxx-Cys) motif present in the chemokine fractalkine (CX3C ligand 1, CX3CL1) mediates its interaction with CX3C receptor 1 (CX3CR1) [10]. Physiologically, the interaction of fractalkine with CX3CR1 is important for leukocyte migration across endothelial barriers and chemotaxis. It triggers downstream signalling pathways that induce NF-κB activation and expression of CX3CR1 as well as inflammatory cytokines such as IL-8 and MIG (monokine induced by interferon gamma) [11, 38, 39].

One mode of entry of respiratory syncytial virus (RSV) into the host regulatory B-cells, and airway and lung epithelial cells is through the interaction of RSV glycoprotein G with CX3C receptor 1 (CX3CR1) present on the host cells, through the CX3C motif of glycoprotein G [11, 40]. This interaction mimics the physiological interaction between CX3CR1 and the CX3C motif present in the chemokine fractalkine (the physiological ligand for CX3CR1) [10]. Thus, RSV glycoprotein G competes with fractalkine to interact with CX3CR1. Meanwhile, RSV glycoprotein G interaction with CX3CR1 also triggers various cellular responses that are distinct from the physiological fractalkine–CX3CR1 interaction (Fig. 1C) [10, 38].

GPR motif

The GPR (Gly-Pro-Arg) motif present in the N-terminal domain of the α chain of fibrinogen is crucial for its interaction with αXβ2 integrin. This fibrinogen–αXβ2 integrin (CD11c/CD18) interaction plays an important role in the adhesion of tumour necrosis factor-stimulated polymorphonuclear leukocytes (PMNs) to fibrinogen/fibrin thrombi, especially during clot formation [12].

After human rotaviruses attach to the major entry receptor α2β1 integrin through the viral spike protein VP4, the interaction of the viral outer capsid protein VP7 with αXβ2 integrin in the early endosomes mediates the entry of the virus into host human kidney or intestinal epithelial cells, resulting in further infection [13]. Interestingly, mucosal PMNs and inflammation increase during rotavirus infection, suggesting a possible interaction between the virus and the PMNs [41]. Subsequent studies have shown that VP7 also contains a conserved GPR motif that is involved in its recognition by integrin [14, 33, 41].

RGD motif

The RGD (Arg-Gly-Asp) motifs present in many extracellular interacting partners of integrins (such as vitronectin, fibronectin, von Willebrand factor, fibrinogen, osteopontin, and thrombospondin) mediate their interactions with integrins, thus regulating vital cellular processes such as cell migration, proliferation, survival, and apoptosis [42,43,44,45]. For example, vitronectin interacts with αVβ1, αVβ3, αVβ5, and αIIbβ1 integrins, while fibronectin interacts with α3β1, αVβ1, α8β1, αVβ1, αVβ3, αVβ5, αVβ8, αVβ6 and αIIbβ3 integrins, and cadherin-17 and VE-cadherin can recognise α2β1 integrin in an RGD-dependent manner [15, 46]. Studies also show that many of the RGD-binding integrins are important in mediating cancer metastasis [46].

Viruses such as Kaposi’s sarcoma-associated herpesvirus (KSHV), Epstein-Barr virus (EBV), adenoviruses, herpes simplex virus (HSV), foot-and-mouth disease virus (FMDV) and coxsackievirus A9 (CAV9) exploit the RGD motif to interact with RGD-dependent integrins, which mediates their entry into host cells [16,17,18,19,20, 47]. KSHV forms a complex with integrins (αVβ3, αVβ1 and αVβ5) and CD98/xCT through glycoprotein B, which contains an RGD motif, to enter human dermal microvascular endothelial cells [16]. Antibodies against the RGD motif, fibronectin or αV integrins reduce KSHV infection by ~50%, highlighting the requirement of the RGD motif for its entry. The EBV glycoprotein BMRF-2 contains an RGD motif in its extracellular domain that is essential for entry into oral epithelial cells through interaction with the β1 and αV families of integrins [47]. Similarly, after attaching to its primary entry receptor CAR in human melanoma cell lines, adenovirus type 5 (Ad5) binds to αVβ3 and αVβ5 integrins through the RGD motif of the penton base protein [17]. Glycoprotein H of HSV interacts specifically with αVβ6 and αVβ8 integrins through an RGD motif, which promotes endocytosis of the virus in epithelial cells and keratinocytes, mediating its entry (Fig. 1B) [18]. Mutational studies also highlighted that FMDV (A12), which contains an RGD motif in the G-H loop of VP1, and CAV9, which contains an RGD motif near the C terminus of VP1, bind to αVβ3 and αVβ6, respectively, for their entry [19, 20]. It is interesting to note that several of these viruses, such as KSHV and EBV, are also generally involved in virus-induced tumours [48,49,50].

KGD motif

The KGD motif is closely related to the RGD motif [21]. Physiologically, multiple KGD (Lys-Gly-Asp) motifs present in the COL15 domain of collagen XVII are important for its interaction with α5β1 and αVβ1 integrins. These interactions mediate keratinocyte spreading and migration [21]. Similarly, the KGD motif of avian tenascin-W (present in the fibronectin type III domain) is important for its interaction with integrins, which regulates developmental patterning [51]. The KGD motif binds more specifically to the αIIbβ1 integrin [52].

KGD motifs present on the surface-exposed loop of the glycoproteins H and L of EBV are important for the membrane fusion of EBV to epithelial as well as B-cells, the primary infection sites of EBV [22, 23]. Mutation of the KGD motif or down-regulation of αV integrins (αVβ6 and αVβ8) leads to reduced EBV infection in epithelial cells. These data together suggest that EBV interaction with these integrins through the KGD motif is important for its entry [23,24,25]. Furthermore, EBV infections also cause severe thrombocytopenia (decreased platelet counts), in a manner similar to the mode of action of several viper venoms which contain disintegrins (small proteins found in viper venom with RGD or KGD motifs that disrupt platelet aggregation and integrin-dependent cell adhesion through competitive binding) [52,53,54].

DGE motif

DGE (Asp-Gly-Glu) motifs present in type I collagen are important for its interaction with α2β1 integrin present on platelets and fibroblasts [26]. This interaction is important for platelet activation and thrombosis [55].

Rotavirus spike protein VP4 is proteolytically cleaved into VP5 and VP8 [27]. VP5 is responsible for attachment to α2β1 integrins, which is essential for viral entry into intestinal cells [14, 27]. This interaction is mediated by a DGE motif present on the VP5 of rotaviruses [14]. Monoclonal antibodies against α2 integrins inhibit the VP5-α2β1 interaction, similar to the physiological collagen-α2β1 interaction, thus reducing rotavirus infection [33]. On the other hand, clinical studies suggest that acute human rotavirus infection decreases the mean platelet volume, especially in children [56], suggesting a possible interaction between the platelets and human rotaviruses.

(R/K)XX(R/K)

The (R/K)XX(R/K) (Arg/Lys-Xxx-Xxx-Arg/Lys) motif present in vascular endothelial growth factor (VEGF) mediates its interaction with neuropilin-1 [28]. The C-terminal region of VEGF-A containing the polybasic motif R/K-X-X-R/K must be proteolytically cleaved to become active; the motif is therefore named the C-end rule (CendR) motif [57]. Neuropilin-1, a physiological co-receptor for VEGF-A, is a transmembrane protein expressed on various immune cells, and is involved in regulating VEGF-A-dependent biological processes such as axon guidance, angiogenesis, and endothelial and vascular permeability, development and leakage [29, 58]. Studies have shown that certain peptides and nanoparticles containing a C-terminal arginine (or rarely lysine) in the CendR motif can interact with neuropilin-1 and be internalised into cells [57]. Due to this property, many viruses exploit the CendR motif–neuropilin-1 interaction for their entry into host cells [28, 30,31,32, 59,60,61].

After the CendR motif of EBV glycoprotein B (RRRR) becomes exposed by furin protease cleavage, it interacts with neuropilin-1, which is essential in mediating EBV entry into human nasopharyngeal epithelial cells [32]. Furthermore, this interaction activates downstream signalling pathways (such as EBV-activated EGFR/RAS/ERK and neuropilin-1-dependent receptor tyrosine kinase pathways) that further facilitate EBV infection.

Human T-cell lymphotropic virus type 1 (HTLV-1) uses neuropilin-1 as a receptor for entering CD4+ T cells and dendritic cells [28]. The surface (SU) subunit of the HTLV-1 Env protein contains a KPXR motif that is involved in its interaction with neuropilin-1 present on these host cells. This region is also highly conserved in various HTLVs (1, 2 and 3), as well as in related simian viruses.

Similarly, furin protease cleavage of the CendR site present in the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike protein S results in the formation of RRAR in the S1 subunit, which is important for enhanced fusion capacity and increased infectivity [30, 31]. The interaction of this motif with neuropilin-1, when neuropilin-1 is co-expressed with ACE-2 and TMPRSS2, increases the infectivity of the virus. Furthermore, as the VEGF-A164-neuropilin-1 interaction also regulates nociception, viral alteration of this physiological interaction has been hypothesised as a reason for VEGF-A-driven pain in SARS-CoV-2 patients [59, 62].
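The positional requirement of the CendR motif, namely that the (R/K)XX(R/K) pattern must sit at the exposed C terminus after proteolytic cleavage, can be illustrated with a short sketch. The function names `has_exposed_cendr` and `cleave_after`, the sequence `TOY_PRECURSOR` and the chosen cleavage position are all hypothetical and serve only as an illustration; the sketch models neither protease specificity nor receptor binding.

```python
import re

# (R/K)XX(R/K) anchored to the C-terminal end of the peptide ("$").
CENDR_C_TERMINAL = re.compile(r"[RK]..[RK]$")


def has_exposed_cendr(peptide):
    """True if the last four residues of the peptide match (R/K)XX(R/K)."""
    return CENDR_C_TERMINAL.search(peptide.upper()) is not None


def cleave_after(sequence, position):
    """Split a sequence after the given 1-based residue position,
    returning the N-terminal and C-terminal fragments."""
    if not 1 <= position < len(sequence):
        raise ValueError("cleavage position must lie inside the sequence")
    return sequence[:position], sequence[position:]


# Hypothetical precursor with an internal RRAR; invented for illustration.
TOY_PRECURSOR = "GGSPRRARSVGG"
n_fragment, c_fragment = cleave_after(TOY_PRECURSOR, 8)

print(has_exposed_cendr(TOY_PRECURSOR))  # motif is internal, not C-terminal
print(has_exposed_cendr(n_fragment))     # motif now ends the N-terminal fragment
```

In this toy case the motif is present in the precursor but only satisfies the C-terminal rule once cleavage has generated the N-terminal fragment, which mirrors the furin-dependent exposure described above.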

LDV and LDI motifs

Alternatively spliced domains of fibronectin contain LDV (Leu-Asp-Val) and LDI (Leu-Asp-Ile) motifs, and VCAM-1 contains an LDV motif. These motifs are essential for their interaction with α4β1, which mediates leukocyte adhesion and plays an important role in modulating inflammatory responses [34, 35, 63].

Rotavirus group A expresses the α4 integrin-binding motifs LDV and LDI in its outer capsid protein VP7 [13]. Cell-based studies showed that some rotaviruses use α4 integrins (α4β1 and α4β7) as their entry receptor or co-receptor. White spot syndrome virus (WSSV) possesses an LDV motif in its outer surface glycoproteins VP26 and VP31 [64]. Also, small peptides containing the LDV sequence, or antibodies against it, reduce α4 integrin interactions with rotaviruses and decrease their infectivity [13, 33]. Peptides containing LDV sequences partially inhibit infection by reducing the interaction of these viruses with β integrins [33, 37, 64]. Disrupting these physiological interactions may help rotaviruses to re-infect the host cells. Also, many α4 integrins are present in immune cells such as T cells and B-cells, suggesting a possible explanation for the immune modulation exhibited by rotaviruses [65]. Furthermore, the interaction of LDV motifs present in the fibronectin connecting segment-1 region with α4β1 also leads to oral cancer cell adhesion, migration and invasion [66]. Rotavirus infection shows a similar type of B-cell accumulation in mesenteric lymph nodes, leading to lymph node hypertrophy and increased cytokine levels in the oral mucosa [67].

IDA motif

Rotavirus group A expresses the α4 integrin-binding motif IDA (Ile-Asp-Ala) in its spike protein VP4 [13]. Cell-based studies showed that some rotaviruses use α4 integrins (α4β1 and α4β7) as their entry receptor or co-receptor. Physiologically, alternatively spliced domains of fibronectin contain an IDA motif that is essential for its interaction with α4β1, thus mediating leukocyte adhesion and modulating inflammatory responses [34, 35, 63].

YGL motif

The YGL (Tyr-Gly-Leu) motif present in osteopontin is important for its interaction with α4β1 and α9β1, and signals Th-1 cytokine-mediated immune responses, sometimes leading to chronic inflammation.

Rotaviruses also contain a YGL motif in their VP4 [13]. In the rotavirus infection assay, peptides containing YGL motifs tend to inhibit the binding of α4β1 to MAdCAM-1 and fibronectin, and partially inhibit the interaction with VCAM-1. WSSV possesses a YGL motif in its outer surface glycoprotein VP37 [36, 37, 64].

Structural perspectives

As SLiMs are short-peptide sequences that are generally found in relatively unstructured regions of proteins, their three-dimensional conformation may vary drastically between the host physiological ligand and the viral ligand [6]. For instance, the CX3C motif present in RSV glycoprotein G does not take the same structural conformation as the CX3C motif of the physiological ligand fractalkine, although they both recognise the same cellular receptor CX3CR1 (Fig. 2). Furthermore, due to their short nature, SLiMs have been found to appear, mutate, and disappear relatively easily during the course of normal evolution [1, 68,69,70]. Hence, these properties are hypothesised to confer some flexibility to these peptides to adapt, mutate and evolve as necessary, thus, making them indispensable during the evolution of protein interaction networks within the host, as well as in host–pathogen interactions.

Fig. 2

The structural conformation of SLiMs may differ between the host physiological ligand and the viral ligand, as SLiMs are short-peptide sequences that are generally found in relatively unstructured regions of proteins. For example, the CX3C motifs (shown in red) of RSV glycoprotein G (PDB id: 6BLH) (A) and the physiological ligand fractalkine (PDB id: 1F2L) (B) do not exhibit the same structural conformation, although they both recognise the same cellular receptor CX3CR1

Conclusion and future directions

Viruses are a unique category of pathogens that are dependent completely on their hosts for performing basic cellular processes, and have, thus, evolved to utilise existing host interaction and signalling networks for various steps during their pathogenesis [1,2,3]. One of the many ways they achieve this is by molecular mimicry, wherein the virus mimics a host sequence or structure, thereby being able to hijack the host physiological interactions for their own pathogenesis. Common sequence mimicry utilises SLiMs, short-peptide sequences that are major elements in host protein interaction networks [6]. In this article, we have reviewed some well-characterised sequence mimicry by viruses that are vital for the entry of the virus into host cells.

Furthermore, mimicking host motifs can also confer other advantages on the virus: SLiMs can evolve and mutate easily, and can be easily integrated into existing protein interaction networks. This can also lead to immune evasion, where the host immune system recognises these imposters as self and fails to act [1, 68,69,70]. Hence, various pathogens (including viruses, as well as prokaryotic and eukaryotic pathogens) have evolved to utilise SLiMs for their pathogenesis through molecular mimicry [6].

Although similar sequence motifs present on both the viruses and the physiological ligands mediate these interactions, some of the interactions between viral glycoproteins and their cognate receptors on host cells trigger the same downstream processes as the physiological ligand, while others induce different signalling cascades [7, 10, 48]. Moreover, the three-dimensional conformation of these motifs may also differ between the viral and the physiological ligand, which can be explained by properties of SLiMs such as their short length and occurrence in disordered regions of proteins [6]. The exact molecular and structural determinants involved in many of these interactions remain underexplored.

SLiMs are also known for their integral role in a wide range of physiological processes, including cell signalling. Thus, finding the whole repertoire of SLiMs and understanding the mechanisms involved in regulating these diverse processes are of utmost importance. However, their low complexity, small size and periodic mutations pose a challenge in identifying new SLiMs from the complex human proteome. In recent years, viral mimicry has been suggested as an integral part of viral infection, and researchers have been able to identify a number of novel viral SLiMs essential for viral entry into hosts. The convergent evolution of human SLiMs in viruses, together with the simpler proteomes of viruses compared with that of humans, can help us to identify novel SLiMs in humans [71, 72].
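One way to see why short, low-complexity motifs are hard to identify from sequence alone is to estimate how often a pattern would match purely by chance. The sketch below assumes, as a deliberate simplification, that all 20 amino acids are equally frequent and that positions are independent, which real proteomes do not satisfy; the function name `expected_chance_matches` and the sequence length of 1000 residues are hypothetical choices made for illustration.

```python
import math

ALPHABET_SIZE = 20  # standard amino acids, assumed equally frequent (simplification)


def expected_chance_matches(allowed_per_position, sequence_length):
    """Expected number of chance matches of a motif in a random sequence.

    allowed_per_position lists, for each motif position, how many of the
    20 residues are accepted (1 for a fixed residue, 20 for X, 2 for R/K).
    """
    motif_length = len(allowed_per_position)
    windows = max(sequence_length - motif_length + 1, 0)
    match_probability = math.prod(n / ALPHABET_SIZE for n in allowed_per_position)
    return windows * match_probability


# Fixed tripeptide such as RGD in a hypothetical 1000-residue sequence.
rgd_expected = expected_chance_matches([1, 1, 1], 1000)
# Degenerate (R/K)XX(R/K) pattern in the same hypothetical sequence.
cendr_expected = expected_chance_matches([2, 20, 20, 2], 1000)

print(round(rgd_expected, 3), round(cendr_expected, 2))
```

Under these assumptions, a fixed tripeptide is expected about 0.12 times and the degenerate four-residue pattern about 10 times per 1000 random residues, so a sequence match by itself is weak evidence and additional information, such as conservation, surface exposure or experimental validation, is needed.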

Cell surface biomolecules are exploited by viruses for attachment and entry, thus dictating host specificity and tissue tropism. Research on such motifs can help predict the cellular receptors of various viruses, as well as aid the design of targeted therapeutics such as vaccines or inhibitors that specifically modify these interactions [73]. Recent studies have shown that SLiM-based peptide inhibitors and drugs such as nutlin, venetoclax and cilengitide can reduce the infectivity of many viruses. However, the broad-spectrum antiviral capability of these therapeutic strategies remains to be explored [74, 75]. Thus, elucidating the structural and molecular basis of the interactions of viral SLiMs with host cell surface receptors can help us to design newer intervention techniques against viral infections in future.

Perspectives

  • While the research on emerging diseases has been gaining focus in the past two decades, there is renewed interest in this field due to the ongoing pandemic.

  • As entry receptors present on the host cell surface play a vital role in the viral pathogenesis process, targeting these interactions has been a standard strategy for preventing further spread of the virus.

  • Considering recent trends in emerging diseases, further research on such motifs involved in viral entry can help in the discovery of previously unknown cellular receptors utilised by viruses, as well as help in the designing of targeted therapeutics such as vaccines or inhibitors directed towards these interactions.