3.1 Identification of DEGs and Hub Genes
Based on RNA-seq data from 8 paired samples, we had three groups of genes' expressing matrixes. According to the cut-off criteria (P value < 0.05&|log FC > 2|), two columns of DEG were selected as significant genes. One was making a comparison between PSCC and N, which was shown in Volcano plot1(Fig. 1A), the other was between positive LM and N, which was shown in Volcano plot2(Fig. 1B).We made an intersection from two columns of DEGs(Fig. 1C), then we got 98 genes which were significant in both groups. Which was shown in heatmap (Fig. 1D). Since the 98 genes were significant in all columns, we could make a hypothesis that the 98 genes had an impact on penile cancer's occurrence and metastasis. Among the 98 genes,we could find the SPP1 gene's expression level changed most evidently.
3.2 GO Function Analysis
The Gene Ontology (GO) is a database that unifies similar functional genes to predict gene function and trend. According to the result of the GO terms functional analysis of 98 DEG, 98DEGs associated with penile cancer's occurrence and metastasis were mainly enriched in skin development, epidermis development, keratinocyte differentiation, epidermal cell differentiation, keratinization. As we can see in the Fig. 2A, 5 functions listed above were mainly regulated by 25 genes of 98 DEGs. Then we have portrayed the congruent relationship between genes and their function in Fig. 2B and Fig. 2C.
3.3 Gene set enrichment analysis(GSEA)
To understand the underlying molecular mechanisms about occurrence and metastasis of penile cancer, We performed Gene set enrichment analysis (GSEA)based on the DEGs from two groups. One group was from the comparison between between PSCC and N, and the other was acquired from comparison between LM and N. DEGs from the PSCC group were most signifcantly enriched for “Pathways in cancer”,such as cGMP-PKG signaling pathway, Ras signaling pathway and Wnt signaling pathway(Fig. 3A). While DEGs derived from the LM group were mainly enriched for “PI3K-Akt signaling pathway” and “chemokine signaling pathway”. Besides, Tight junction and cytokine-cytokine receptor interaction were also enriched in LM group(Fig. 3B).
3.4 SPP1 expression in PSCC and some other cancer types
For our 8 paired tissues conducted RNA-seq analysis, SPP1 was expressed at high levels in PSCC tissue and LM tissue(Fig. 4A&B).To further explore the association between SPP1 protein expression and clinical features, 183 paraffin-embedded PSCC sections were subjected to IHC assays. As detected by IHC, SPP1 proteins were mainly stained in the cytoplasm. The representative immunohistochemical stain pictures are shown in Figs. 4D.The IHC scoring criteria have described in the Materials and methods section, and the frequency in each subgroup is shown in Table1. In brief, 78 tissues (42.6%) had high expression (IHC score ≥ 2), while 105 (57.4%) had low expression(IHC score ≤ 1). Chi-square tests indicated that the high expression of SPP1 was correlated with a lower G stage (p = 0.0369).Then we evaluated the correlation between SPP1 and immune cells in the TIMER database. Which shows that the expression of SPP1 was mostly higher in tumor than that in normal tissue(Fig. 4C). Besides, SPP1 was closely related with the infiltration level of immune cells, such as B cells, CD8 + T cells, CD4 + T cells, macrophages, neutrophils and Dendritic cells(Fig. 4E).
A.Comparision in expression of SPP1 between PSCC and matched normal tissues in 8 paired tissues. B. Comparasion in expression of SPP1 between LM and matched normal tissues in 8 paired tissues. C. Pan-cancer differential expression of SPP1 was mostly upregulated in cancer tissues compared to normal controls from TCGA database.*p < 0.05,**p < 0.01,***p < 0.001 and ****p < 0.0001.D.Immunohistochemical staining for SPP1 in PSCC tissues. The standard staining intensity score of SPP1 in the cytoplasm was Intensity = 0 for no staining, Intensity = 1 for weak staining, Intensity = 2 for clear staining and Intensity = 3 for strong staining. E. Gene SPP1 were significantly associated with with B cell, CD8 + T cell, CD4 + T cell, macrophage, neutrophil and dendritic cell infiltration in four types of cancers. BLCA = bladder cancer, COAD = colon adenocarcinoma, GBM = Glioblastoma, HNSC = head and neck squamous cell carcinoma.
Table 1
Relationship between SPP1 and clinicopathological features in 183 PSCC patients
TNM Staging System and Clinical stage based on the AJCC Cancer Staging Manual for Penile Cancer (8th ed, 2017); PSCC = penile squamous cell carcinoma; ENE = extranodal extension.
3.5 Survival analysis in SPP1 defined subgroups
In order to better explore the effect of SPP1 on PSCC patients’ survival, The overall survival was significantly better for patients in the high-SPP1 expression group than the low-SPP1 expression group (P = 0.005, Fig. 5A). Then patients were classified into subgroups according to T stage, G stage and age, and the survival of patients in SPP1-high expression group was still higher than that of SPP1-low expression group (Fig. 5B-D).
3.6 SPP1 related bioinformatics analysis based on RNA-seq data
In WGCNA analysis, The relationships between all genes and their corresponding modules are shown in a waterfall chart(Fig. 6A). We identifed 21 co-expression modules and analyzed their association with SPP1-defined subgroups. Including low SPP1 group in PSCC, high SPP1 group in PSCC, low SPP1 in LM and high SPP1 in LM. To select genes tightly associated with SPP1, we choosed the modules which were most positively related or negatively related with corresponding SPP1-defined subgroup. From the result showing in heatmap (Fig. 6B), we could make a conclusion that MEdarked, MElightcyan1, MEsaddlebrown, MEsienna3 and MEskyblue five modules were mostly related with SPP1. So these modules were selected for further analysis. GO enrichment analysis suggests that SPP1 is mainly related to the functions of DNA and chromosomes in the nucleus(Fig. 6C). In KEGG pathway analysis, TGF − beta signaling pathway, p53 signaling pathway, MAPK signaling pathway, Hippo signaling pathway and AMPK signaling pathway are also enriched based on the genes from five modules(Fig. 6D).
3.7 Underlying extrinsic immune landscapes of the high and lowSPP1 expression groups
To explore the relationship between SPP1 expression and immune cell infifiltration, we conducted immune-related bioinformatic analysis including CIBERSORT and ssGSEA based on the data from the GSE57955. Given that there was a PSCC sample lacking SPP1 expression, we ultimately included 38 specimens for further analysis. The cohort from GSE57955 was classifed into high-SPP1 and low-SPP1 groups based on the median value of SPP1. In CIBERSORT analysis, the results showed that relatively higher proportion of NK cells activated and a lower proportion of memory activated CD4 + T cells were found in the high SPP1 group compared with the low SPP1 group (Fig. 7A&C). To further explore the connection between SPP1 and immune environment of PSCC, the ssGSEA was performed to present the different fraction of immune cells infiltration between the high SPP1 group compared with the low SPP1 group. Enrichment of immune signatures such as APC_co_inhibition and T_cell_co-stimulation were significantly higher in patients with high SPP1 group (Fig. 7B&D).
3.8 Difference between high and low SPP1 groups regulated pathways
The GSVA analysis indicated that signaling pathways such as the WNT signaling pathway and MTOR signaling pathway had highly differential pathway scores, suggesting their positive relationship with the expression level of SPP1(Fig. 8A). More detailed information about GSVA was presented in Fig. 8C. Besides, the results of GSEA suggested that high SPP1 group was signifificantly associated with immune-related pathways such as PD-L1 expression and PD-1 checkpoint pathway in cancer and TNF signaling pathway(Fig. 8B).