Pan-cancer pseudogene RNA analysis reveals a regulatory network promoting cancer cell proliferation

  • Original Research Paper
Genome Instability & Disease

Abstract

Pseudogenes have long been considered nonfunctional because, unlike protein-coding genes (PCGs), they lack protein-coding ability. However, pseudogenes can be transcribed into functional long non-coding RNAs (lncRNAs) that are involved in cancer development and progression. These lncRNAs may regulate mRNA levels by acting as competing endogenous RNAs (ceRNAs). A systematic pan-cancer analysis of pseudogene ceRNA networks is still lacking. Here, we curated 9455 pseudogenes and constructed ceRNA networks for 14 cancer types. We found that pseudogenes in the ceRNA networks were the most cancer type-specific and that ~20% of the PCGs in these networks (cePCGs) were cancer hallmark genes, supporting a close relationship between pseudogenes and cancer. Notably, in breast cancer (BRCA), we found that the ceRNA subnetwork ZNF252P/AL390726.5 (pseudogenes)-miRNAs-ESR1 may enhance the proto-oncogene MYC, and we experimentally validated the oncogenic effects of the two pseudogenes. Moreover, pseudogene-based subtyping of BRCA tumors revealed a new subtype characterized by immunoglobulin pseudogenes. Collectively, our findings could pave the way for more pseudogene research in cancer. We provide our results in a user-friendly database, cePseudo, available at http://bioinfo-sysu.com/cePseudo2.
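The abstract does not state how candidate ceRNA pairs were scored, so the following is only a minimal sketch of one commonly used approach, not the authors' pipeline: a pseudogene and a PCG are treated as a candidate pair when they share more miRNAs than expected by chance (hypergeometric test) and their expression is positively correlated. The function names, the cutoffs, and the placeholder miRNA labels are all assumptions made for illustration.

```python
from math import comb, sqrt


def shared_mirna_pvalue(mirnas_a, mirnas_b, n_total):
    """Hypergeometric upper-tail probability P(X >= k), where k is the number
    of miRNAs shared by two transcripts and n_total is the size of the miRNA
    universe (an assumed input)."""
    a, b = set(mirnas_a), set(mirnas_b)
    k = len(a & b)
    upper = min(len(a), len(b))
    favourable = sum(
        comb(len(a), i) * comb(n_total - len(a), len(b) - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(n_total, len(b))


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sd_x = sqrt(sum((xi - mean_x) ** 2 for xi in x))
    sd_y = sqrt(sum((yi - mean_y) ** 2 for yi in y))
    return cov / (sd_x * sd_y)


def is_candidate_pair(p_value, r, p_cutoff=0.05, r_cutoff=0.3):
    """Hypothetical decision rule: significant miRNA overlap and positive
    co-expression. Both cutoffs are illustrative, not taken from the paper."""
    return p_value < p_cutoff and r > r_cutoff


# Placeholder miRNA labels and expression values, not real data.
pseudogene_mirnas = {"miR-a", "miR-b", "miR-c"}
pcg_mirnas = {"miR-b", "miR-c", "miR-d"}
p = shared_mirna_pvalue(pseudogene_mirnas, pcg_mirnas, n_total=10)
r = pearson([1.0, 2.0, 3.0, 4.0], [1.5, 2.9, 4.2, 6.1])
print(is_candidate_pair(p, r))
```

In practice such a test would be run per cancer type over all pseudogene-PCG pairs, with multiple-testing correction applied to the resulting p-values.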

Data availability

GTEx RNA-seq read files were downloaded from dbGaP (phs000424.v8.p2). TCGA RNA-seq BAM files were downloaded from GDC (https://portal.gdc.cancer.gov). Expression levels of PCGs (HTSeq-FPKM and HTSeq-Counts) and miRNAs (mirbase21.isoforms.quantification.txt), together with clinical information, were downloaded from the UCSC Xena database (https://xenabrowser.net). CPTAC proteomics data are available at https://proteomics.cancer.gov/programs/cptac.

Acknowledgements

This research was supported by the Integrated Project of Major Research Plan of National Natural Science Foundation of China (NSFC) (92249303) and the Guangdong Basic and Applied Basic Research Foundation (2021A1515110972). The results shown here are in whole or in part based upon data generated by the TCGA or CPTAC Research Network.

Funding

Integrated Project of Major Research Plan of National Natural Science Foundation of China, 92249303, Yuanyan Xiong; Basic and Applied Basic Research Foundation of Guangdong Province, 2021A1515110972, Mengbiao Guo.

Author information

Contributions

YYX and MBG conceived the project. QLL, MBG, and JKZ analyzed the data and interpreted the results. JXZ, QW, MHX, and ZWF performed experimental validation. MBG wrote the manuscript. MBG and JKZ revised the manuscript. YYX, MBG, QW, and ZSY reviewed the manuscript. All authors approved the final manuscript.

Corresponding authors

Correspondence to Zhou Songyang or Yuanyan Xiong.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (XLSX 16 KB)

Supplementary file1 (docx 2896 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Guo, M., Zhang, J., Liang, Q. et al. Pan-cancer pseudogene RNA analysis reveals a regulatory network promoting cancer cell proliferation. GENOME INSTAB. DIS. 4, 85–97 (2023). https://doi.org/10.1007/s42764-023-00097-2

  • DOI: https://doi.org/10.1007/s42764-023-00097-2
