Abstract
Interest in genomics and transcriptomics at the single-cell level continues to grow. A key requirement for single-cell studies is cell-sorting technology that separates cells by type. However, cell isolation alters the cellular microenvironment, which in turn affects gene activity, and the resulting changes in gene expression can bias the conclusions of a single-cell study. To address this problem, we propose a novel PEnalized deconvolution Analysis for Cell separation-induced Heterogeneity (PEACH). By adopting a Bayesian variable selection scheme, PEACH simultaneously decomposes cell-type-specific expression from bulk tissue and identifies cell separation-induced differential expression (CSI-DE) genes. We validated PEACH on four benchmark datasets and one in silico mixture dataset. In real-data applications, we used PEACH to analyze an immune-related disease dataset, a blood dataset, and a skin dataset, and we consistently identified immediate-early genes, ribosomal protein genes, and mitochondrial genes across the three datasets. Our study shows that genes sensitive to the cell-sorting process are biologically meaningful and nonnegligible, and the findings may provide new insights for single-cell transcriptomic analysis. The model is implemented in the R package “PEACH,” which is available for download.
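The idea of jointly deconvolving bulk tissue and flagging genes whose sorted-cell expression departs from their expression in tissue can be illustrated with a small simulation. The sketch below is not the PEACH algorithm: it replaces the Bayesian variable selection scheme with plain alternating least squares and hard thresholding, and it assumes a working model Y ≈ (X − D)H, where Y is bulk expression, X is the sorted-cell reference, H holds cell-type proportions, and D is a sparse matrix of separation-induced shifts. The function name, the threshold `lam`, and all simulated quantities are hypothetical and chosen only for illustration.

```python
import numpy as np


def sparse_shift_deconvolution(Y, X, lam=1.0, n_iter=10):
    """Alternate between proportion estimation and thresholded shift estimation.

    Y: bulk expression (genes x samples).
    X: sorted-cell reference profiles (genes x cell types).
    Assumed working model: Y ~ (X - D) @ H, with D sparse.
    """
    D = np.zeros_like(X)
    for _ in range(n_iter):
        # Step 1: proportions by least squares on the shift-corrected reference,
        # clipped to be non-negative and rescaled so each sample sums to 1.
        H = np.linalg.lstsq(X - D, Y, rcond=None)[0]
        H = np.clip(H, 0.0, None)
        H = H / H.sum(axis=0, keepdims=True)

        # Step 2: per-gene shifts by least squares on the residual X @ H - Y,
        # followed by hard thresholding so that small shifts are set to zero.
        R = X @ H - Y
        D_ls = np.linalg.solve(H @ H.T, H @ R.T).T
        D = np.where(np.abs(D_ls) > lam, D_ls, 0.0)
    return H, D


# Simulated example (all values are made up for illustration).
rng = np.random.default_rng(0)
G, K, N = 500, 3, 30                               # genes, cell types, bulk samples
W = rng.uniform(2.0, 12.0, size=(G, K))            # expression of cells in tissue
H_true = rng.dirichlet(np.ones(K), size=N).T       # true proportions (K x N)
Y = W @ H_true + rng.normal(0.0, 0.1, size=(G, N)) # bulk tissue expression

D_true = np.zeros((G, K))
D_true[:10, 0] = 3.0                               # 10 genes shifted by sorting in cell type 0
X = W + D_true                                     # reference measured on sorted cells

H_hat, D_hat = sparse_shift_deconvolution(Y, X)
flagged = np.flatnonzero(np.any(D_hat != 0.0, axis=1))
print("flagged genes:", flagged.tolist())
print("max proportion error:", float(np.abs(H_hat - H_true).max()))
```

In this toy setting the threshold plays the role that variable selection plays in a Bayesian formulation: it decides which gene and cell-type entries are allowed a nonzero shift. A fixed threshold is a crude stand-in, and real data would need a principled choice of penalty and noise model.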
Data Availability
The methods reported in this article are implemented in the R package “PEACH,” which is publicly available for performing the deconvolution analysis (https://github.com/AshTai/PEACH). The benchmark datasets are available from GEO under accessions GSE11058, GSE19830, and GSE60424. The datasets used in the real-data applications were downloaded from GEO under accessions GSE115898, GSE51984, and GSE60424.
Acknowledgements
This work was supported by the Ministry of Science and Technology [MOST 107-2118-M-007-001]. This manuscript was edited by Wallace Academic Editing.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Cite this article
Tai, AS., Wang, CC. & Hsieh, WP. Detection of Cell Separation-Induced Gene Expression Through a Penalized Deconvolution Approach. Stat Biosci 15, 692–718 (2023). https://doi.org/10.1007/s12561-022-09344-8
DOI: https://doi.org/10.1007/s12561-022-09344-8