The ongoing coronavirus disease 2019 (COVID-19) pandemic, caused by SARS-CoV-2, had surpassed 600 million cases and resulted in > 6 million deaths (Johns Hopkins University COVID-19 Dashboard) by September 2022. The coronavirus (CoV) spike (S) protein is responsible for receptor binding and cell entry and can be processed into S1 and S2 subunits. The S1 subunit binds to the cellular receptor angiotensin-converting enzyme 2 (ACE2), followed by virus-cell membrane fusion and/or endocytic entry mediated by the S2 subunit [1]. Cleavage of the S protein at the S1/S2 and S2’ sites by host proteases is essential during this process. The presence of a furin cleavage site in the spike protein (681PRRAR685) is a typical feature of SARS-CoV-2. A mutant lacking the furin cleavage site (ΔPRRA) has been shown to replicate faster in Vero E6 cells but to be partially attenuated in an animal model when compared with the parental SARS-CoV-2 strain [2]. Deletion of five amino acids (675QTQTN679) near the furin cleavage site has also been observed during adaptation of SARS-CoV-2 to cell culture [3]. Several independent SARS-CoV-2 genomic surveillance programs in Louisiana, New Mexico, and Ohio found emerging variants carrying a Q677H mutation in late 2020 [4, 5]. In this study, we isolated 20 SARS-CoV-2 strains, including a 20G strain and its variant, OSU.20G, which carries a Q677H mutation in the spike protein as well as six other mutations in other parts of the genome, and we compared the replication efficiency of this pair of strains in Vero E6 and Calu-3 cells.

Forty-six deidentified human nasopharyngeal swab samples collected in Columbus, Ohio, in September 2020 (n = 20), January 2021 (n = 24), and February 2021 (n = 2) tested positive for SARS-CoV-2 by real-time reverse transcription PCR (RT-qPCR) with Ct values ≤ 30. These samples were used for viral isolation at the Plant and Animal Agrosecurity Research Facility at the Ohio Agriculture Research and Development Center under enhanced biosafety level 3 conditions. The samples were stored in viral transport medium at 4°C for less than a week and were centrifuged at 2000 × g for 5 min at 4°C to remove cell debris. Two hundred μL of filtered supernatant per well was used to inoculate Vero E6 (ATCC no. CRL-1586) cell monolayers in 24-well plates.

Twenty of the 46 specimens caused an obvious cytopathic effect (CPE) as early as 2 days post-inoculation (dpi) (Table 1) and were harvested as passage 0 (P0) when CPE had progressed to 50-80% and mock-infected cells remained healthy. The plates were frozen and thawed once, the cultured cells and supernatants were vortexed vigorously and centrifuged, and the supernatants were used for RNA extraction followed by next-generation sequencing (NGS) and Sanger sequencing for confirmation as described [6]. All 20 viruses had a D614G mutation in the spike protein and a few other non-synonymous nucleotide changes in the genome [4]. Using the GISAID (https://www.gisaid.org/) strain nomenclature in use at that time, samples collected in September 2020 were mainly typed as clade 20C (81.8%; 9/11) or 20B (9.1%; 1/11). Samples collected in January 2021 were mostly clade 20G (75%; 6/8), and one P.2 (Brazil) variant was isolated from a sample collected in February 2021.

Table 1 SARS-CoV-2 strains isolated in this study

Comparison of the original viral sequences in nasopharyngeal swabs with those of the cultured isolates showed that even one passage in Vero E6 cells produced culture-adapted mutations affecting the 675QTQTN679 sequence of the spike protein, as reported previously [3]. For example, 26% of the reads in P0 from the 1277 isolate contained a 675QTQTN679 deletion. Therefore, to study the Q677H mutation further, we performed plaque purification to obtain a pure preparation of OSU.20G variant clone 3 (1277C3). Similarly, the 20G base strain (1265) was plaque purified, and isolate 1265C3 was selected. Compared with 1265C3, 1277C3 had a Q677H mutation near the S1/S2 furin cleavage site of the spike protein. In addition, there were six other amino acid substitutions in 1277C3 (Fig. 1A and B). Three were conservative changes in the non-structural proteins (nsp), namely M1788I in nsp3, T204I in nsp4, and T31I in nsp14, and these were probably of no functional significance. The non-conservative changes, N1543K in nsp3, S68F in the envelope (E) protein, and H125Y in the membrane (M) protein, are markers of the OSU.20G variant that might be biologically relevant and need to be investigated in the future using reverse genetics. It has been reported that the S68F mutation in the C-terminal domain of the E protein can stabilize the E protein [7], and the H125Y mutation in M has been predicted to be a potential CD8+ T cell epitope and to cause changes in the ordered interface of the M protein [8].

Fig. 1
Identification of the OSU.20G variant. (A) Distribution of nucleotide substitutions in the genome of SARS-CoV-2 OSU.20G variant 1277C3 compared with the 20G base strain 1265C3. (B) Summary of the substitutions in 1277C3. (C) Frequency of the substitutions identified in 1277C3 from GISAID globally (upper panel) and in the USA (lower panel). (D) 3D structural models for SARS-CoV-2 S protein trimers. H677 is labeled in blue, and Q677 is labeled in red. The polybasic cleavage site (S: 681PRRAR685) is in magenta.

The Q677H mutation, which alters the 675QTQTN679 sequence that is conserved in the spike proteins of SARS-like CoVs, emerged in one or more US-based clade 20G viruses in late 2020, peaked in early 2021, and then decreased and disappeared by May 2021 in the United States, based on available sequences in GISAID (Fig. 1C) and our data analysis [4].

SWISS-MODEL (https://swissmodel.expasy.org/) was used to model the 3D structures of the S protein trimers of 1265C3 and 1277C3, using 7cn8.1.A as a template. The substitution of histidine for glutamine is predicted to make H677 more distant from the carbon backbone than Q677 (5.037 Å vs 4.878 Å) (Fig. 1D). This suggests that the Q677H mutation might make the polybasic cleavage site (681PRRAR685) more accessible to host proteases for cleavage.

We compared the replication fitness of 1277C3 (the OSU.20G variant) and 1265C3 (the 20G base strain) in Vero E6 cells and in Calu-3 cells (ATCC HTB55), a more physiologically relevant human lung epithelial cell line. 1277C3 and 1265C3 generated plaques of similar size (Fig. 2A and B). The multi-step growth kinetics of the two viruses were examined in Vero E6 cells at a multiplicity of infection (MOI) of 0.001 and in Calu-3 cells at an MOI of 0.01. The cells were washed with phosphate-buffered saline (PBS) after adsorption and cultured in maintenance medium. Supernatants were collected at 24, 48, 72, and 96 h post-inoculation (hpi) and titrated by plaque assay or microplate infectivity assay, and a TaqMan RT-qPCR assay was performed as described previously [9]. In Vero E6 cells, 1277C3 reached a significantly higher peak titer than 1265C3 (6.09 ± 0.04 log10 plaque-forming units [PFU]/mL vs. 5.42 ± 0.10 log10 PFU/mL) and did so in a shorter time (48 hpi vs. 72 hpi) (Fig. 2C). To compare replication efficiency, we quantified viral genomic RNA copies by RT-qPCR targeting the RdRp gene and calculated the number of genomic copies of viral RNA per PFU [10]. We found a significantly lower ratio for 1277C3 than for 1265C3 at 24 hpi, suggesting a higher replication efficiency for 1277C3 than for 1265C3 (Supplementary Fig. S1). However, in Calu-3 cells, the two isolates reached similar infectious titers at each time point (Fig. 2D).
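The two quantities used in this comparison, the difference between peak titers reported on a log10 scale and the ratio of genomic RNA copies to PFU, can be illustrated with a minimal Python sketch. The function names are hypothetical, and the genome copy number and PFU value passed to the second function are invented for illustration only; they are not data from this study. Only the peak titers (6.09 and 5.42 log10 PFU/mL) are taken from the text.

```python
def fold_difference(log10_titer_a: float, log10_titer_b: float) -> float:
    """Fold difference between two titers that are given on a log10 scale."""
    return 10 ** (log10_titer_a - log10_titer_b)


def genome_to_pfu_ratio(genome_copies_per_ml: float, pfu_per_ml: float) -> float:
    """Genomic RNA copies per infectious unit; a lower ratio means that
    fewer genome copies are present per plaque-forming unit."""
    if pfu_per_ml <= 0:
        raise ValueError("PFU/mL must be positive")
    return genome_copies_per_ml / pfu_per_ml


# Peak titers in Vero E6 cells as reported in the text (log10 PFU/mL).
# A difference of 0.67 log10 corresponds to a roughly 4.7-fold difference.
peak_fold = fold_difference(6.09, 5.42)

# Hypothetical values, NOT measurements from this study:
# 1e8 genome copies/mL and 1e6 PFU/mL.
example_ratio = genome_to_pfu_ratio(1e8, 1e6)

print(f"Peak titer fold difference: {peak_fold:.1f}")
print(f"Example genome copies per PFU: {example_ratio:.0f}")
```

Working on the log10 scale keeps the comparison consistent with how titers are reported; the conversion to a fold difference is included only as a reading aid and says nothing about statistical significance.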

Fig. 2
Comparison of a pair of SARS-CoV-2 isolates: 1265C3 (20G base strain) and 1277C3 (OSU.20G variant). (A) Typical plaques caused by 1265C3 and 1277C3 in Vero E6 cells overlaid with 1% methylcellulose. (B) The diameter of plaques generated by 1265C3 or 1277C3. Fifteen plaques were selected for each strain, and the data are presented as the mean ± SD. (C) and (D) Multi-step growth curves of 1277C3 and 1265C3 in Vero E6 cells using an MOI of 0.001 (C) and in Calu-3 cells using an MOI of 0.01 (D). Three replicates were performed for each virus, and the experiments were repeated once.

Since 1277C3 had a replication advantage over 1265C3 in Vero E6 cells, an African green monkey kidney epithelial cell line, we evaluated the fitness of 1277C3 relative to 1265C3 in a competition assay in Calu-3 cells, as described previously with modifications [11] (Supplementary Fig. S2A). SARS-CoV-2 strains 1265C3 (20G base strain) and 1277C3 (20G base strain+Q677H) were mixed at equal MOIs (0.005 each) and inoculated onto the cell monolayers. After a one-hour adsorption period, the cell monolayers were washed and cultured in maintenance medium. Cells and supernatants were collected at 48 and 72 hpi. NGS reads covering the 1277/1265-specific variant positions were used to determine the percentage of each viral genome in the inoculum (input) and in the culture mixtures (output) collected at 48 and 72 hpi. The results showed that the initial ratio of 1265C3 to 1277C3 was 3:7 and that it became 0.5:9.5 at both 48 hpi and 72 hpi (Supplementary Fig. S2B). Because the initial input of 1277C3 was higher than that of 1265C3, we cannot draw a conclusion about which virus replicates more efficiently in Calu-3 cells. There are two possible reasons for the unequal amounts of the two viruses in the inoculum: (1) Because we used a low combined MOI of 0.01 in order to observe the outcome of competition over multiple replication cycles, variation in the measured infectious titers and variation introduced during the dilution process could have led to such differences. (2) We prepared the inoculum based on infectious titers in PFU, but the NGS data were based on viral RNA copies. These different measures could have contributed to the difference in ratios.
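The bookkeeping behind such a competition readout can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the analysis pipeline used in this study: the function names are hypothetical, the read counts are invented, and only the 3:7 input and 0.5:9.5 output proportions are taken from the text. The caveats given above (unequal input, and PFU-based mixing versus RNA-based measurement) apply equally to any number this arithmetic produces.

```python
def variant_fraction(variant_reads: int, total_reads: int) -> float:
    """Fraction of reads at a discriminating position that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def ratio_fold_change(input_fraction: float, output_fraction: float) -> float:
    """Fold change of the variant:reference ratio from input to output.

    Both fractions must lie strictly between 0 and 1 so that the
    ratios are defined.
    """
    for f in (input_fraction, output_fraction):
        if not 0.0 < f < 1.0:
            raise ValueError("fractions must be strictly between 0 and 1")
    input_ratio = input_fraction / (1.0 - input_fraction)
    output_ratio = output_fraction / (1.0 - output_fraction)
    return output_ratio / input_ratio


# Hypothetical read counts chosen to reproduce a 70% variant input.
input_fraction = variant_fraction(700, 1000)

# Proportions reported in the text: 3:7 at input, 0.5:9.5 at 48 and 72 hpi.
change = ratio_fold_change(0.7, 0.95)

print(f"Input variant fraction: {input_fraction:.2f}")
print(f"Fold change in 1277C3:1265C3 ratio: {change:.1f}")
```

Normalizing the output ratio to the input ratio is a common way to describe such data, but it does not remove the uncertainty about what the input proportions represent in terms of infectious units.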

As SARS-CoV-2 has spread, variants have emerged in different geographic regions, and some have spread globally at a rapid rate and have been designated as “variants of concern” (i.e., Alpha, Beta, Gamma, Delta, and Omicron) or “variants of interest” (e.g., Eta, Iota, Kappa, and Lambda) by the World Health Organization (WHO). Multiple mutations identified within the S proteins of those variants are associated with increased transmissibility, infectivity, and immune evasion [12]. For example, the S protein mutation N501Y found in the Alpha, Beta, and Gamma variants leads to increased ACE2 binding affinity and significantly reduced sensitivity to the neutralizing antibodies elicited by the Moderna (mRNA-1273) or Pfizer–BioNTech (BNT162b2) vaccine [13]. In this study, we isolated 20 SARS-CoV-2 strains and identified one variant (OSU.20G) carrying a Q677H mutation in the S protein, which disrupts the sequence 675QTQTN679 upstream of the furin cleavage site. In addition to the OSU.20G variant, the Eta variant B.1.525, which emerged in Nigeria, and some variants in the United States, especially those prevailing in the Midwest and Southeast regions, have the Q677H mutation as well [5, 14]. We constructed a model of the 3D structure of the SARS-CoV-2 S protein with the Q677H mutation and found that the mutation could potentially increase proteolytic processing of the S protein by making the polybasic cleavage site (S: 681PRRAR685) more accessible to host proteases, which might alter viral infectivity and replication because of the essential role of the S protein in virus attachment, fusion, and entry into host cells. Consistent with this, our results showed that the OSU.20G variant exhibited enhanced in vitro replication compared with the 20G base strain in Vero E6 cells. Previously, Zeng et al. showed that introduction of the Q677H mutation increased the infectivity of a B.1.1.7 pseudotype virus by 2.5-fold and that of a P.1 virus by 26.3% [15]. Further studies are needed to elucidate the mechanism by which the Q677H mutation affects SARS-CoV-2 replication.

As of October 10, 2022, approximately 90.6% of people in the United States older than 18 years had received at least one dose of a COVID-19 vaccine (https://covid.cdc.gov/covid-data-tracker). Vaccination is considered an effective way to control the current pandemic. However, variants of SARS-CoV-2 have emerged with multiple mutations in the S protein, resulting in decreased protection by vaccine-induced neutralizing antibodies and in breakthrough infections [16, 17]. Hodcroft et al. have discussed the evolution and polymorphisms of residue 677 of the S protein [5]. The Q677H mutation was first identified in August 2020 and was found in three clades: 20G (B.1.2), 20A (B.1.234), and 20B (B.1.1.220 and B.1.1.222). Although Q677H is not a defining mutation of any variant of concern according to WHO, it has been detected in the Alpha (20I), Beta (20H), Gamma (20J), Delta (21I and 21J), and Omicron (21K, 21L, 22A, 22C, and 22D) variants with peak frequencies of 100% (September 20, 2021 to December 21, 2021), 81% (July 13, 2021 to July 20, 2021), 100% (November 7, 2021 to November 29, 2021), 100% (March 22, 2022 to May 30, 2022), and 40-50% (February 21, 2022 to October 1, 2022), respectively, according to the Nextstrain project (https://nextstrain.org/groups/neherlab/ncov/S.Q677?c=gt-S_677). The high frequencies of the Q677H mutation in these variants of concern suggest that it may have a positive effect on viral replication. We did not test the sensitivity of the OSU.20G variant to clinical sera in this study. However, introduction of the Q677H mutation into the backgrounds of the B.1.1.7 and P.1 variants has been shown to lead to modest neutralization resistance in a pseudotype-based neutralization assay [15]. Therefore, in addition to replication fitness, a possible role of the Q677H mutation in immune evasion may also have contributed to the emergence of the OSU.20G variant in late 2020 to early 2021.

In conclusion, we isolated 20 SARS-CoV-2 strains and compared the replication efficiency of one pair of 20G isolates: the OSU.20G variant, carrying a Q677H mutation in the S protein, and its likely parental strain (the 20G base strain). We found that the OSU.20G variant replicated more efficiently than the 20G base strain in Vero E6 cells, and this may have contributed to its emergence in December 2020 to January 2021. Our study highlights the importance of monitoring and evaluating emerging variants and mutations, which may drive the next wave of infection. Our study has some limitations. Besides virus replication fitness, many viral, host, and environmental factors, such as population immunity and seasonal variations in temperature and humidity, can also contribute to the emergence of new virus mutations [18, 19]. We demonstrated that 1277C3 bears several non-synonymous substitutions and exhibits enhanced replication capability in cell culture compared with 1265C3. This increased replication may be attributable to the combined or synergistic effects of some or all of the mutations observed in the 1277C3 genome. Therefore, studies of the effects of each individual non-conservative mutation using reverse genetics will be important.