Background

Breast cancer is an aggressive malignancy with high intra-tumoral heterogeneity and is one of the most common carcinomas in women worldwide(Bray et al. 2018; Kim et al. 2018). According to the IHC (immunohistochemical) of the ER (estrogen receptor), PR (progesterone receptor), HER2 status, and Ki-67, breast cancer has been classified into four primary subtypes: Luminal A, Luminal B, Triple-negative and HER2-positive (Harbeck and Gnant 2017). In fact, recent studies have identified the fifth subtype, known as normal-like breast cancer, which was very similar to luminal A, but had slightly worse prognosis than that of luminal A (Dai et al. 2015). Accumulating evidence has indicated that breast cancer with different histopathological and biological features showed huge difference in treatment responses and prognostic outcomes(Heimes et al. 2017). Therefore, distinct subtypes of breast cancer should have different and customized therapeutic strategies. Although tumorectomy, chemotherapy and radiotherapy have been applied to breast cancer patients, there is no effective response for the patients with metastatic or triple negative breast cancer(Zhou and Zhong 2004). Recently, immune checkpoint blockade therapy, such as targeting PD-1/PD-L1 and CTLA-4, has shown efficacy in treating various cancers, including breast cancer (Schmid et al. 2020a, b; Topalian et al. 2012). For example, triple-negative breast cancer (TNBC) exhibited the highest percentage of tumor-infiltrating lymphocytes (TIL) (Denkert et al. 2018; Loi et al. 2013). Anti-PD-L1 improved the pathological complete response (pCR) rate and event-free survival in TNBC patients, which was also superior to maintenance chemotherapy in metastatic TNBC (Schmid et al. 2020a, b). However, as with most tumor types, not all breast cancer patients benefited from PD-1/PD-L1 and CTLA-4 therapies(Zhou and Zhong 2004; Bassez et al. 2021). The difference in therapeutic efficacy of immune checkpoint blockade therapy has been associated with the immune cell composition and heterogeneity in tumor microenvironment. In addition, cancer cells also contributed to the resistance to immune checkpoint blockade therapy or immune evasion (Zhou et al. 2021). Thus, revealing and understanding the tumor cell heterogeneity and the immune cell composition within the breast cancer microenvironment, will help elucidate the pivotal mechanisms underlying resistance to immunotherapies.

Single-cell RNA sequencing has provided a less biased insight to explore the characterization of cellular heterogeneity, cell-cell interactions through ligand-receptor and differentiation states at the single-cell level. Thus, many studies have applied this approach to uncover the tumor microenvironment in multiple cancer types, including lung cancer (Lambrechts et al. 2018), pancreatic ductal adenocarcinoma (Elyada et al. 2019), breast cancer(Azizi et al. 2018; Chung et al. 2017; Gambardella et al. 2022) and esophageal squamous cell carcinoma (Zhang et al. 2021; Dinh et al. 2021). In addition, we have also explored the heterogeneity and microenvironment of gastrointestinal stromal tumors(Mao et al. 2021), and found that two subgroups of tumor cells with different DNA damage and metastasis ability.

In this study, we analyzed the cancer microenvironment from breast tumor and adjacent nonmalignant tissues from 4 breast cancer patients. We revealed distinct cell populations in breast cancer and identified profound cellular heterogeneity of myeloid cell lineages. In addition, we uncovered prominent diversity of tumor cells and identified a subset of EMT (Epithelial Mesenchymal Transition) positive cells in HER2+ patient, which might explain the poor response of immunotherapy in clinics. These results provided insights into breast cancer microenvironment and contributed to developing effective therapeutic intervention for breast cancer.

Methods

Sample collection and the clinical information

Four breast cancer and paired adjacent normal tissue were collected from our recent study (Mao et al. 2022), which uploaded to Genome Sequence Archive (GSA, https://ngdc.cncb.ac.cn/) with accession number HRA002051 and OMIX001111 under the project of PRJCA008495. Characteristics of patients are summarized in Fig. 1A.

scRNA-Seq data preprocessing, quality control (QC) and data integration

Raw reads were aligned to the hg38 reference genome. Alignment of 5’ end counting libraries from scRNA-seq analyses was completed utilizing the 10X Genomics Cell Ranger software (3.1.0). DoubletFinder R package was used to detect and to filter the potential doublets based on the expression proximity of each cell to artificial doublets (McGinnis et al. 2019). Next, cells with high mitochondrial content ( > = 20%) were removed and a total of 54,055 cells were retained for downstream analysis. Firstly, we used Seurat v3 (Stuart et al. 2019) anchoring integration method (based on canonical correlation (CC) analysis) to integrate the data. Finally, the function FindIntegrationAnchors was used to find alignment anchors and to define neighbor search space using 2000 highly variable genes and 20 CC dimensions.

Unsupervised cell clustering and major cell-type identification

Principal Component Analysis (PCA) was used (in Seurat package) with the number of optimal PCA dimensions, which was defined using standard deviations saturation plot for further non-linear high-dimensional reduction method. Uniform Manifold Approximation and Projection (UMAP) (https://umap-learn.readthedocs.io/en/latest/) method was used to visualize cell types and clusters. Clustering was performed based on shared-nearest-neighbor (SNN) graph clustering (Louvain community detection-based method) using FindClusters (Seurat package). To assign the major cell type to each cluster, we scored each cluster by the normalized expression of the following canonical markers: Endothelial cells (CLDN5, FLT1, RAMP2, CDH1), Epithelial cells (EPCAM, KRT7, KRT8), Fibroblasts (COL1A1, COL1A2, DCN), T cells (TRBC1, TRBC2, CD3D, CD3E), B cells (CD19, CD79A, CD79B), Myeloid cells (CD68, CD86, LYZ, FCGR3A), Mast cells (ENPP3, TPSAB1, KIT). First, we used the clustering results at resolution 2 for all 54,055 cells from both nonmalignant and tumor samples to determine the major cell types based on the above markers. Cell type scores, as the average expression of the known markers, were assigned for 20 clusters to determine the initial call of major cell types. In further sub-clustering of myeloid and epithelial cells, we used subset function in Seurat and annotated the cell types based on more specific markers.

InferCNV analysis

To identify malignant cells, we inferred the somatic alterations of large-scale chromosomal, including copy number variants, either gains or losses in each single cell based on inferCNV (https://github.com/broadinstitute/inferCNV). Here, we used the epithelial cells in normal samples as a control reference. We preformed inferCNV analysis with the default parameters in inferCNV R package.

Gene set enrichment analysis

Gene Set Enrichment Analysis was performed using clusterProfiler R package and GSEA v4.0.0. 50 hallmark gene sets were used as functional gene sets.

Expression programs of intra-tumoral heterogeneity

To identify the discrete and continuous programs of expression in each sample, we performed non-negative matrix factorization (NMF) in each sample with the number of factors k from 6 to 9 (Kinker et al. 2020). Next, we used top 50 genes (by NMF score) to define the expression programs for each k. In each sample, we merged the robust expression programs by selecting those with an overlap of at least 35 genes (70%) with a program obtained from different k value. At last, we remained the programs, which have the highest overlap with an NMF program identified in another samples. Pheatmap R package was used in this study to make the heatmap.

Results

Landscape view of cell composition in malignant and nonmalignant breast tissues

To shed light on the cellular heterogeneity in breast cancer, we re-analyzed our previous scRNA-Seq data from four breast primary tumors and four adjacent matched nonmalignant breast tissues (Mao et al. 2022). After processing and sequencing QC, a total of 34,371 cells were obtained from breast cancer cells and 19,684 cells from nonmalignant breast tissues (Fig. 1A and B). To obtain the cell groups with similar expression profiles, we performed unsupervised clustering analysis implemented in Seurat R package. Next, six major cell types were identified based on established markers for known cell types (listed in Materials and Methods), including Epithelial cells, Myeloid cells, Fibroblast cells, Endothelial, T cells and B cells (Fig. 1B and C and Supplementary Figs. 1 and 2). Similar to the results in ours and others published papers, most of the cell types from different patients clustered together (Tirosh et al. 2016; D’Angelo et al. 2019). Three cell types were constituted by predominantly tumor samples (T cells, B cells and Myeloid) (Fig. 2A), while Epithelial cells were contributed comparably by tumor and nonmalignant samples (Fig. 2A and B).

Fig. 1
figure 1

Landscape of cell composition in malignant and nonmalignant breast tissues. A. UMAP (Uniform Manifold Approximation and Projection) visualization of the clustering of 54,055 cells from all 8 nonmalignant and tumor samples, color coded by clusters. B. UMAP (Uniform Manifold Approximation and Projection) visualization of the clustering of 54,055 cells from all 8 nonmalignant and tumor samples, color coded by either major cell type (left), patient origin (middle) and sample type (right). C. Expression of representative marker genes for each cell type defined in (B left)

Fig. 2
figure 2

The frequency of each cell type in samples. A. The frequency of each cell type in nonmalignant and tumor samples. B. The frequency of each cell type in four patients

Tumor heterogeneity and diversity of malignant cells due to CNV patterns in breast cancer

From tumor and matched adjacent nonmalignant samples, we detected a total of 11,206 epithelial cells (Fig. 1B and C (EPCAM+)). Although we used other two established markers to define Epithelial/Tumor cell types (KRT7 and KRT8 in Supplementary Fig. 1A), some normal cells show high expression of these three established markers. To distinguish the normal from tumor cells, we inferred the large-scale copy number alternations in each Epithelial/Tumor cell using established methods for CNV inference, which was a R package (inferCNV) based on gene expression profiles and genomic information (https://github.com/broadinstitute/inferCNV). InferCNV analysis identified 4 distinct subgroups in tumor epithelial cells, which were clearly distinguished from nonmalignant cells (Fig. 3A). Compared with tumor epithelial cells, the normal epithelial cells showed no CNV patterns (Fig. 3A, up). High expression level of classic epithelial markers (EPCAM, KRT7, KRT8) and high copy number alternations indicated that most epithelial cells in tumor samples were malignant cells. In fact, four clusters were corresponding to the tumor samples to some extent. CNVs accumulated in all malignant samples and showed high heterogeneity among clusters. Luminal A (T1, green bar in Fig. 3A) showed a clear amplification in Chr1 while luminal B (T2, gray bar in Fig. 3A) showed a clear amplification in Chr14 and Chr16p compared with other clusters. Chr17q amplification partially existed in luminal B, which may be the subcluster in cluster 3 with different features. In addition, luminal A (T1) and luminal B (T2) showed obvious Chr11q, Chr16q and Chr22 deletion (Fig. 3A). Amplifications in Chr1q and Chr17q were clear in TNBC samples (T3), and luminal B HER2+ (T4) tumor showed Chr17q and Chr20q amplification.

To explore the molecular subtypes of malignant epithelial cells, we used PAM50 model to classify the detailed malignant epithelial cells (Supplementary Fig. 3). The subtype composition of four breast cancer samples showed great diversity. A part of the malignant epithelial cells from TNBC patient (T3) presented no expression in most of the PAM50 genes. Only a part of TNBC malignant epithelial cells expressed EGFR or PHGDH (Supplementary Fig. 3). These results indicated that breast cancer cell showed high intra-tumoral heterogeneity within the same subtype which depends on HR or Her2 expression. This suggests a need for more precise classification method while there may be additional therapeutic targets.

To identify variable expression programs in malignant epithelial cells, an established method, non-negative matrix factorization (NMF), was applied based on gene expression profiles(Kinker et al. 2020). More details were listed in Materials and Methods. After clustering of malignant epithelial cells based on top 50 genes (NMF score), we obtained 12 groups (Fig. 3B). As expected, some of the groups contained at least two subtypes of malignant epithelial cells. Six groups only consisted with one subtype, which reflected both the common and the specific characteristics among breast cancer subtypes. In addition, each subtype (TNBC, LumB and HER2+) has sub-category, which revealed that different gene expression patterns might exist and play different roles in the same subtype in tumor microenvironment.

Fig. 3
figure 3

The CNV pattern and recurrent expression programs of each cell in malignant and nonmalignant samples. A. CNV pattern of each cell in malignant and nonmalignant samples. B. The recurrent expression programs in four malignant samples

Identification of different subgroups in certain breast cancer subtypes

To explore the biological functions of each group, we applied clusterProfiler v3.14.3 to perform the functional enrichment analysis based on top 50 genes in each group (Fig. 4A and B) (Yu et al. 2012). Epithelial mesenchymal transition (EMT) is closely related to tumorigenesis and development. Here, EMT was the top 1 significant hallmark in HER2_G2 group (Fig. 4C and D), which was also identified in other groups (TNBC_G1, LumB_G1, HER2_G2 and TNBC_lumB_HER2_G1). This suggested that EMT play a critical role in the development of tumors of each subtype. Most of the leading-edge genes of EMT in each group was different (only 4 overlap genes) (Fig. 4F), indicating that the activation of EMT might have different channels in different breast cancer subtypes. Although TNBC/LumB mixed groups had similar significant hallmarks, EMT was only enriched in TNBC_lumB_G2 group (Fig. 4B). In TNBC_G1 and TNBC_lumB_HER2_G2 group, p53 pathway showed a significant enrichment. In addition, enriched hallmarks showed different patterns in LumB. LumB_G2 reflected cell cycle related hallmarks (MYC, E2F and G2M), indicating that these cells controlled the cell cycle in LumB breast tumor microenvironment. TNFA signaling is common enriched hallmark in most of the groups. In HER2+ group (HER2_G1 and HER2_G2), significant hallmarks in HER2_G2 were different with those in HER2_G1. Therefore, we identified the differentially expressed genes (DEGs) and performed the pathway enrichment analysis between these two groups. Figure 4C shows the activated hallmarks in HER2_G1 and HER2_G2, respectively. HER2_G1 cells involved in regulation of cell cycle, such as MYC and E2F signaling, while HER2_G2 cells participated in EMT (Fig. 4D) and immune-related process (Fig. 4B). In addition, mTORC1 signaling was activated in HER2_G1(Fig. 4E), also TNBC_lumB_HER2_G2, and TNBC_lumB_HER2_G3 cells (Fig. 4B). These results suggested that the identification of sub-population in breast cancer subtypes might help to understand the process of tumor metastasis and resistance. The comparison of two groups in TNBC and LumB are showed in Supplementary Figs. 4A and B.

Fig. 4
figure 4

Function of different subgroups based on recurrent expression programs in each cell in malignant and nonmalignant samples. A. Top one hallmarks in each expressed group. B. Biological functions of each group based on overlapped top 50 genes in each group. C. Activated hallmarks in HER2-G1 and HER2-G2 groups. D and E. GSEA curves of Epithelial Mesenchymal Transition and MTORC1 signaling in (C). F. The number of overlap genes in EMT

Myeloid cell heterogeneity in breast cancer

Myeloid cells are important antigen presenting cells, including dendritic cells, macrophages and neutrophils. In tumor microenvironment, neutrophils have very low expression of MHC-II, while dendritic cells and macrophages present foreign antigens to helper T-cells on MHC II molecules (Lin and Lore 2017). In addition, macrophages play crucial roles in innate and adaptive immunity with high phenotypic heterogeneity and functional diversity (Pan et al. 2020). Tumor-associated macrophage (TAM) is one of the main tumor-infiltrating immune cell types, which presents at all stages of tumor progression and enhances tumor progression to malignancy (Qian and Pollard 2010). TAMs are widely present in various cancer types (Ngambenjawong et al. 2017) and can be divided into two mainly functionally contrasting subtypes, classical activated M1 macrophages (TAM1) and alternatively activated M2 macrophages (TAM2) (Qian and Pollard 2010), respectively. TAM1 can directly mediate antibody-dependent cell-mediated cytotoxicity to kill tumor cells, while TAM2 can inhibit T cell-mediated anti-tumor immune response, promote tumor angiogenesis, and lead to the occurrence and metastasis of tumor cells (Qian and Pollard 2010).

In this study, we detected a total of 5,436 myeloid cells from nonmalignant and tumor samples (Fig. 5A). To better explore myeloid cells diversity in breast cancer, we performed unsupervised clustering analysis implemented in Seurat V3 across 4 matched malignant and nonmalignant breast cancer myeloid cells. Next, established myeloid markers, such as S100A8 and S100A9 in Monocyte, CD14, FCGR3A, CD68 and CD69 in Macrophage, CD1C in classical dendritic cells (cDC) and CLEC4C in plasmacytoid dendritic cells (pDC), were expressed across all myeloid subpopulations (Fig. 5B), confirming the diversity of myeloid cells in breast cancer samples. Next, we collected the known lineage markers in myeloid cells and showed the expression of these genes (Fig. 5C). TAM2 expressed high levels of CD68, CD163, and CSF1R, but lower level of CD69. Similar to the previous studies, the infiltration of CD68 + or CD163 + TMAs was correlated with poor outcome in various cancer types, including breast cancer (DeNardo et al. 2011; Kurahara et al. 2011; Shabo et al. 2008). CSF1R inhibitor, RG7155, has been applied to reduce tumor burden in patients with Diffuse-type Giant Cell Tumors(Ries et al. 2014), suggesting that the inhibitor might be used in breast cancer therapy. We identified cDC and pDC, both of which expressed entirely different lineage markers (FLT3, CLEC9A and CD1C in cDC; SELL, THBD and CLEC4C in pDC), consistent with recent reports in DC (Patente et al. 2018).

TAM1 expressed inflammatory cytokines IFNG and TNF, but inflammatory cytokines expressed low in TAM2 (Fig. 5C). For DCs, we identified cDC and pDC, both of which expressed entirely different lineage markers (FLT3, CLEC9A and CD1C in cDC; SELL, THBD and CLEC4C in pDC). The distribution of sub-population and samples were showed in Fig. 5F and G.

Fig. 5
figure 5

Landscape of myeloid cells in malignant and nonmalignant samples. A. UMAP (Uniform Manifold Approximation and Projection) visualization of the clustering of 5,436 myeloid cells from all 8 nonmalignant and tumor samples, color coded by clusters. B, C and D. Expression of representative marker genes for each cell type defined in (A). E. Heatmap of known markers in different cell types. F. The frequency of each myeloid cell type in nonmalignant and tumor samples. G. The frequency of each myeloid cell type in four patients

Two types of tumor-associated macrophage cells in breast cancer and adjacent nonmalignant breast tissues

As TAM2 can enhance tumor progression and lead to tumor metastasis, we next determined DEGs between tumor cells and adjacent normal cells in TAM2 populations (Fig. 6A). TAM2 from malignant cells had higher expression of type I interferon (IFN) related genes (e.g. IFI6), lipid metabolism and transport related genes (e.g. APOE, APOC1), type I transmembrane glycoprotein gene (e.g. GPNMB), and Fatty-acid metabolism related genes (e.g. FABP5). Most of these up-regulated genes have been reported in metastatic or high stage breast cancer. For example, IFI6 can directly influence the formation of migratory structure and nuclear gene expression through mitochondrial reactive oxygen species (mtROS), and then to promote the metastasis in breast cancer (Cheriyath et al. 2018). In fact, over-expression of IFI6 induced the metastasis and poor prognosis has been found in various cancer types, such as esophageal squamous cell carcinoma and gastric cancer(Liu et al. 2020a, b; Tahara et al. 2005). These results suggested that analyzing single-cell RNA expression of breast cancer samples could not only recognize known disease-related genes, but also reveal aberrant genes.

In addition, we also identified the DEGs in TAM1 cells between tumor and adjacent normal cells (Supplementary Fig. 5). One notable expression emerged from the DEGs in two TAMs was that only 8 up-regulated genes existed in TAM1 tumor cells, compared with adjacent normal cells (Supplementary Fig. 5). All these genes were also up regulated in TAM2 tumor cells, compared with TAM2 normal cells (Fig. 6B). These results indicated that the difference of breast tumor cells and adjacent normal cells in TAM1 cells was slight, and TAM2 might dominate the occurrence and progression in breast tumor microenvironment. To gain further insights into the functionality of TAM2 in tumor microenvironment, we performed pathway enrichment analysis based on the up-regulated genes in TAM2 (Fig. 6C). The most significant hallmarks were Oxidative Phosphorylation and Adipogenesis, which have been uncovered as an emerging target in cancer therapy and associated with poor prognosis of breast cancer (Ashton et al. 2018; Lin et al. 2021a, b). The identification of known results proves the reliability of our results. In addition, we found Interferon alpha (IFNα) and gamma (IFNγ) response were activated in breast cancer TAM2 at single-cell RNA level. Recent studies reported that TAM1 underwent “classical” activation by interferon-γ (IFNγ) with either lipopolysaccharide (LPS) or TNF, whereas M2 underwent “alternative” activation by IL-4(Martinez and Gordon 2014). The function of IFNα (and IFNγ) remains unclear in breast cancer TAM2. These results might underscore new immunomodulatory effects of IFN-α and IFNγ in breast cancer, which will provide more insights into molecular and cellular mechanisms underlying breast cancer microenvironment.

Fig. 6
figure 6

DEGs and enriched functions in TAM2. A. Volcano plot of DEGs between TAM2 tumor and normal samples. B. Overlap DEGs between TAM1 DEGs and TAM2 DEGs. C. Top 10 enriched hallmarks between TAM2 tumor and normal samples

Discussion

Breast cancer is an elaborate process, which would be affected by multiple genetic and environmental factor. High heterogeneity of breast cancer confer the different response efficacy of chemotherapy and immunotherapy. Emerging of single-cell sequencing opens a new era for the investigation of tumor immune microenvironment, which has been widely used for the study of tumor heterogeneity(Patel et al. 2014; Ren et al. 2018). In this study, we unbiasedly constructed a landscape view of the heterogeneous cell composition in human breast cancer from 4 paired samples, particularly myeloid cells (macrophages) and tumor cells. Transcriptome analysis of 54,055 individual cells in 5 major cell types revealed the vast heterogeneous of breast tumor cells and myeloid cells, highlighting the diversity of tumor-associated macrophages and different functions in tumor microenvironment.

In our study, luminal A (T1) and luminal B (T2) showed obvious Chr11q and Chr22q deletion (Fig. 3A), consistent with recent study (Gaynor et al. 2020). Amplifications in Chr1q and Chr17q in TNBC samples (T3) was aligned with previous reports (Jiang et al. 2019; Lin et al. 2021a, b). In addition, ERBB2, GRB7 and FGFR4 only expressed in luminal B HER2+ malignant epithelial cells, consistent with previous study in bulk RNA-Seq (Prat et al. 2020). High intra-tumoral heterogeneity within the same subtype which depends on HR or HER2 expression were showed. Our analysis at single-cell level might discriminate multiple microenvironment subtypes and provide a potential alternative which could predict a target gene that lead to effective therapy in breast cancer.

EMT is an evolutionarily conserved developmental program in carcinogenesis, which is related to the metastasis, invasion and resistance of cancer (Mittal 2018). Our result indicated that EMT could be a common cancer process across several subgroups included in the study which suggested EMT intervention could have a general therapeutic effect on various subtypes. Recent study proved that cancer cells in various stages of the EMT have different contribution to hallmarks of breast cancer malignancy, such as tumor invasion, metastasis, and chemoresistance(Lüönd et al. 2021). In addition, cisplatin could prevent breast cancer metastasis through TNFβ signaling pathway blocking early EMT(Wang et al. 2021).

P53 is a key tumor suppressor gene that plays an important role in the occurrence and development of breast cancer. Mutations in the TP53 tumor suppressor gene are not rare in cancer, and attempts to restore the functionality of p53 in tumors as a therapeutic strategy began decades ago(Hassin and Oren 2022; Olivier et al. 2006). Recent research found that MDM2-targeted treatment could be an alternative for p53-inactivated TNBC(Adams et al. 2023; Ma et al. 2023). We found that p53 pathway enriched in TNBC_G1 and TNBC_lumB_HER2_G2 group. This result indicated p53 related therapy could have potential effect on TNBC which were consist with current research(Adams et al. 2023; Ma et al. 2023; Cirillo et al. 2023). Again, analysis at single-cell level that discriminating microenvironment subtypes could actually provide a potential treatment target for breast cancer.

TNF-alpha (Tumor Necrosis Factor-alpha) is a pro-inflammatory cytokine that has been shown to have complex and context-dependent effects on cancer growth and progression, including in breast cancer (Cai et al. 2017; Liu et al. 2020a, b). In our study, TNF-alpha signaling showed a enriched hallmark in most of the groups except luminal B HER2+ cells which showed the most enriched function in EMT.

Recent studies reported that mTOR signaling has been linked with resistance to HER2 therapies in breast cancer (Hare and Harvey 2017). This study showed mTORC1 signaling was activated in HER2_G1, also TNBC_lumB_HER2_G2, and TNBC_lumB_HER2_G3 cells. This suggested that anti-mTOR treatment such as everolimus may have potential therapeutic effect in luminal B HER2+ and even TNBC breast cancer patients.

Overall, exploration of both myeloid and malignant cells highlights the tumor heterogeneity and progression of breast cancer microenvironment. Four patients belonged to different subtypes by CNV pattern, malignant cells from the same subtype were more similar. In addition, most of PAM 50 genes expressed low in a part of TNBC patient’s cells. EGFR and PHGDH were mutually exclusively expressed in a part of TNBC cells. These results hinted that TNBC might have more unknown strategies to maintain cancer status. Furthermore, NMF results indicated that variable expression groups existed in malignant cells. Different patients in the same group revealed their common functions in breast cancer. Subtype specific groups indicated the heterogeneous in tumor samples. Focusing on the top 50 genes in each gene expression group, we did the pathway enrichment analysis and identified top 1 enriched function in luminal B HER2+ cells is EMT. Although EMT was enriched in TNBC_G1, LumB_G1 and TNBC_lumB_HER2_G1, most of the leading edge genes in each group were different (Fig. 4B and F). These results suggested that malignant cells might have multiple choices to launch the EMT process in tumor immune microenvironment. In addition, different groups in the same subtype also revealed the heterogeneity in breast cancer at single-cell level, which may facilitate more accurate breast cancer diagnosis and prognosis assessment in clinical management.

TAMs are the key myeloid cells and present in multiple cancer types (Ngambenjawong et al. 2017). Two functionally contrasting subtypes of TAMs are classical activated M1 macrophages (TAM1) and alternatively activated M2 macrophages (TAM2) (Qian and Pollard 2010). We observed an abundance of TAM2 in breast cancer, which exhibited not only oxidative phosphorylation but also Interferon alpha and gamma response functions, suggesting TAM2 might be involved in the anti-tumor immune response and the occurrence of tumor. The influence of TAMs in the breast cancer tumor microenvironment depends on the tumor type(Chang et al. 2023). Although there are few data describing the effects of oxidative phosphorylation on tumor-associated macrophages in breast cancer, oxidative phosphorylation do have some connection with TAMs in other type of tumor(Qu et al. 2022; Meng et al. 2022; Pan et al. 2023). In our study, TAMs seemed to have an effect in tumor microenvironment though adipogenesis pathways. APOE+ macrophages have been shown to affect T-cell inflammation, thus reshaping the tumor microenvironment in intrahepatic cholangiocarcinoma(Bao et al. 2022). FABP5 and PPARγ signaling pathways participate in mediating doxorubicin resistance in breast cancer(Nan-Nan et al. 2023). FABP4+ macrophages subsets can be reprogrammed to up-regulate immune checkpoints and various inflammatory chemokines, thereby inhibiting antitumor immunity(Zhou et al. 2020), indicating that FABP4 may be a potential target for breast cancer treatment(Hao et al. 2024). The specific mechanism of action will be further studied that we plan to first validate on more clinical specimens and further screen them to identify the most relevant pathways for following research. Macrophages would be selected by flow cytometry and the differences in tumor microenvironment of different subtypes would be analyzed by comparing the expression of key factors of adipogenesis and oxidative phosphorylation pathways in different subtypes of breast cancer. Furthermore, comparing the DEGs in TAM1 with those in TAM2 (TAMs from tumor samples versus adjacent normal samples), we found that all the DEGs in TAM1 group had significant change in TAM2 group, indicating that TAM2 represented the majority characteristics of TAM in breast cancer, which further explained the role of TAM2 in the occurrence and progression of breast cancer.

In summary, through unbiasedly revealing the heterogeneous tumor microenvironment of breast cancer at a high resolution, we constructed a comprehensive single-cell transcriptome atlas of 54,055 cells from 4 malignant and 4 non-malignant patients, providing insights into the mechanisms underlying breast cancer progression and the development of potential therapeutic strategies in breast cancer. Although our findings of TAM and malignant various expression reprograms in breast cancer patients were obtained based on bioinformatic analyses of single-cell RNA-Seq data from our previous study, further functional experiments are necessary to uncover the underlying mechanisms.