
Gene clustering for time-series microarray with production outputs

  • Focus
Soft Computing

Abstract

The identification of coexpressed genes from microarray data is a challenging problem in bioinformatics and computational biology. The objective of this study is to identify the most important genes and clusters related to the production outputs of real-world time-series microarray data in industrial microbiology. Each sample in the microarray experiment is complemented with measurements of the corresponding production and growth values. A novel aspect of this research is that the relation between coexpression patterns and the measured outputs is used to guide the biological interpretation of results. Shape-based clustering models are developed using the pattern of gene expression values over time, and further incorporate knowledge about the correlation between the change in gene expression level and the output value. Experiments are performed on time-series microarray data from bacteria, and the results are analysed from a biological perspective. The results confirm the existence of relationships between output variables and gene expression. Moreover, the shape-based clustering methods show promising results and can guide metabolic engineering actions by identifying potential targets.
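The abstract does not spell out the clustering algorithm, so the following is only a minimal sketch of one possible reading of "shape-based clustering with output correlation", not the authors' method. It assumes that a gene's shape is the sequence of up/flat/down steps between consecutive time points, and that the link to the output is the Pearson correlation between the step-wise change in expression and the step-wise change in the output. All function names, the `eps` flatness threshold, and the `min_corr` cut-off are hypothetical choices made for illustration.

```python
from collections import defaultdict
from math import sqrt


def diffs(series):
    """Step-wise change between consecutive time points."""
    return [b - a for a, b in zip(series, series[1:])]


def shape_signature(profile, eps=0.1):
    """Encode a profile as a tuple of +1 (up), 0 (flat), -1 (down) steps.

    eps is an assumed threshold below which a change counts as flat.
    """
    return tuple(
        0 if abs(d) <= eps else (1 if d > 0 else -1) for d in diffs(profile)
    )


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def cluster_by_shape_and_output(expression, output, eps=0.1, min_corr=0.8):
    """Group genes by (shape signature, sign of correlation with the output).

    expression: dict mapping gene name -> list of expression values over time
    output:     list of output measurements at the same time points
    min_corr:   assumed cut-off for calling a correlation positive/negative
    """
    output_change = diffs(output)
    clusters = defaultdict(list)
    for gene, profile in expression.items():
        r = pearson(diffs(profile), output_change)
        if r >= min_corr:
            link = "positive"
        elif r <= -min_corr:
            link = "negative"
        else:
            link = "weak"
        clusters[(shape_signature(profile, eps), link)].append(gene)
    return dict(clusters)


# Made-up toy data, for illustration only.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 3.0],
    "geneB": [0.5, 1.5, 2.5, 2.5],
    "geneC": [3.0, 2.0, 1.0, 1.0],
}
toy_output = [10.0, 20.0, 30.0, 30.0]
toy_clusters = cluster_by_shape_and_output(toy_expression, toy_output)
```

On this toy data, geneA and geneB share a rising-then-flat shape that tracks the output, while geneC falls as the output rises, so it lands in a separate cluster. Keying clusters on both shape and correlation sign is one simple way to keep genes with the same temporal pattern but opposite relation to the output apart; a real analysis would need normalisation and a more robust similarity measure.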


Notes

  1. The microarray data set obtained and used in this paper is available on request for academic purposes.


Acknowledgments

This research has been supported by the Spanish Ministry of Science and Innovation under Project TIN2014-56967-R, and by the Junta de Castilla y León under Project BIO/BU01/15.

Author information

Corresponding author

Correspondence to Camelia Chira.

Ethics declarations

Conflict of interest

C. Chira, J. Sedano, J. R. Villar, M. Camara and C. Prieto declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by A. Herrero.

Cite this article

Chira, C., Sedano, J., Villar, J.R. et al. Gene clustering for time-series microarray with production outputs. Soft Comput 20, 4301–4312 (2016). https://doi.org/10.1007/s00500-016-2299-3

  • DOI: https://doi.org/10.1007/s00500-016-2299-3
