Use of principal component analysis (PCA) and hierarchical cluster analysis (HCA) for multivariate association between bioactive compounds and functional properties in foods: A critical perspective

https://doi.org/10.1016/j.tifs.2017.12.006

Highlights

  • Chemometric tools are widely used for classification, calibration and exploratory issues.

  • Unsupervised statistical methods are used to study data structure and look for clusters of samples.

  • PCA and CA are the most widely used methods.

  • PCA and CA can be useful in studies regarding bioactive compounds in foods.

  • We criticize the indiscriminate use of PCA and CA.

Abstract

Background

The development of statistical software has enabled food scientists to perform a wide variety of mathematical/statistical analyses and solve problems. Consequently, the use of both sophisticated analytical methods and multivariate statistical methods has increased considerably. Among these, principal component analysis (PCA) and hierarchical cluster analysis (HCA) are the most widely used tools to explore similarities and hidden patterns among samples when the relationships in the data and the groupings are still unclear. Usually, larger chemical data sets, bioactive compounds and functional properties are the target of these methodologies.

Scope and approach

In this article, we criticize the use of these methods in cases where a correlation analysis should instead be performed and its results analyzed.

Key findings and conclusions

The use of PCA and HCA in food chemistry studies has increased because the results are easy to interpret and discuss. However, their indiscriminate use to assess the association between bioactive compounds and in vitro functional properties is criticized as they provide a qualitative view of the data. When appropriate, one should bear in mind that the correlation between the content of chemical compounds and bioactivity could be duly discussed using correlation coefficients.

Introduction

As stressed by Ropodi, Panagou, and Nychas (2016), in the 21st century, governmental, industrial, and academic problems need to be addressed by using sophisticated analytical tools with proper data collection, analysis and interpretation. In this sense, data mining and data analysis are two interrelated approaches that have developed rapidly to address problems related to engineering and technology, as well as medicine, economics, biology, and food science (Brown, 2017).

Chemometrics is an interfacial discipline that extracts useful information from large chemical and biochemical data sets using different mathematical and statistical methods (Brown, 2017, Nunes et al., 2015). In applied chemistry, the use of chemometrics has been widespread and well recognized since 1960 (Brereton, 2014), but in food science and technology the applications of chemometrics and sensometrics (multivariate methods applied to sensory data and consumer studies) are somewhat new (Aquino et al., 2014, Munck et al., 1998, Qannari, 2017). Conversely, the application of chemometrics for assessing the adulteration and geographical origin of foods based on chemical markers is well established in food science (Granato et al., 2015b, Granato et al., 2015d, Paneque et al., 2017, Giannetti et al., 2017, Opatić et al., 2018). For example, Garrido-Delgado, Muñoz-Pérez, and Arce (2018) used ion mobility spectrometry (IMS) to determine the origin and quality of olive oil and its adulteration with low-cost vegetable oils. Using different statistical tools, the authors were able to predict the level of contaminating oil in olive oil. Therefore, there is no doubt that chemometric tools are of fundamental importance for solving real-life problems.

Granato, Nunes, and Barba (2017) stated that the use of design of experiments together with appropriate statistical data analysis is of pivotal importance to assess the association between nutrition, biology, pharmacology, functional properties and the chemical components of foods and their extracts. In this sense, chemometric tools and other statistical methodologies may be of interest when different food extracts and bioactivities need to be evaluated (Granato, de Araújo Calado, & Jarvis, 2014).

In real-life applications, chemometrics may be employed in food science and technology studies either to assess similarities/differences between multiple objects (samples) or to project the objects onto a two- or three-dimensional factor plane based on various characteristics. In this way, clusters can be observed and the reasons for the grouping can be pinpointed (Erasmus et al., 2018, Jandrić and Cannavan, 2017, Lund et al., 2017). Additionally, multivariate techniques have been widely used to authenticate/trace the geographical origin of foods, to verify the farming system employed by a company and check whether it complies with the information declared on the label, and to check for adulterations (intentional or not) of foods and raw materials (Granato et al., 2015c, Chiesa et al., 2016, Müller-Maatsch et al., 2016, Tavares et al., 2016, Zhu et al., 2017, Karabagias et al., 2017, Chung et al., 2017, Giannetti et al., 2017, Acierno et al., 2018).

For example, Luo, Shi, and Feng (2017) aimed to characterize the metabolites of Zhi-Zi-Hou-Po decoction, a traditional Chinese medicine, in rat bile, urine and feces after oral administration, using untargeted liquid chromatography time-of-flight mass spectrometry combined with orthogonal partial least squares discriminant analysis (OPLS-DA). After analyzing the experimental data, the authors were able to identify 83 compounds, of which 39 were metabolites, in the biological samples. In addition, the metabolic pathway (glucuronidation) by which these metabolites formed after oral administration of the decoction was identified by using OPLS-DA. This research is an example of how chemometric tools are important aids not only in the food chemistry field but also in experimental nutrition studies.

According to Brereton (2015), chemometrics users tend to ‘follow the crowd’ and use the available software indiscriminately, without knowing the principles and fundamentals of each method applied in their data analysis. In food chemistry studies, Principal Component Analysis (PCA) and Hierarchical Cluster Analysis (HCA) are widely (and, sometimes, improperly) applied as “unsupervised classification” methods to assess the association between bioactive compounds and in vitro functional properties (i.e., antioxidant activity and enzyme inhibition). Herein, a critical perspective on these display techniques (PCA and HCA) is presented, together with some comments on their use in the field of bioactive compounds.
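To make the contrast with display techniques concrete, the following minimal Python sketch computes two correlation coefficients between a compound content and a bioactivity measure. It is an illustration only and is not taken from the article: the function names, the variable names, and the six data points are hypothetical and invented for the example, and the choice of Pearson and Spearman coefficients is an assumption about which coefficients a reader might report.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical values, invented for illustration only (not data from the article).
total_phenolics = [120.0, 150.0, 180.0, 210.0, 260.0, 300.0]
antioxidant_activity = [35.0, 41.0, 50.0, 48.0, 63.0, 70.0]

r = pearson_r(total_phenolics, antioxidant_activity)
rho = spearman_rho(total_phenolics, antioxidant_activity)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Unlike a score plot or a dendrogram, each coefficient is a single number between -1 and 1 that can be reported and compared across studies. A significance test would normally accompany it, but that is omitted here to keep the sketch within the standard library.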


Study of bioactive compounds and in vitro potential functional properties with the use of chemometrics

Chemometrics may be used for both qualitative and quantitative analysis of experimental data (Martínez et al., 2017, Szymanska et al., 2015). Determining whether a rice sample comes from European countries or elsewhere based on its NMR spectra, or determining the presence or absence of a chemical compound in an HPLC chromatogram, are two typical examples of qualitative analysis. On the other hand, assessing the correlation between the content of chlorogenic acid derivatives and antioxidant activity of coffee

Final comments and recommendations

The use of PCA and HCA in food chemistry studies has increased in the past years because the results are easy to interpret and discuss, especially if a large data set is analyzed. However, the indiscriminate use of multivariate exploratory statistical techniques (PCA and HCA) to assess the association between bioactive compounds and in vitro functional properties is criticized as the results will be, in most cases, a sine qua non observation. When appropriate, the researcher should bear in mind

Acknowledgements

Daniel Granato acknowledges CNPq for a productivity grant (process 303188/2016-2). J. S. Santos and G. B. Escher thank CAPES/Fundação Araucária for their Ph.D. scholarships.

References (59)

  • M. Fidelis et al.

    Authentication of juices from antioxidant and chemical perspectives: A feasibility quality control study using chemometrics

    Food Control

    (2017)
  • R. Garrido-Delgado et al.

    Detection of adulteration in extra virgin olive oils by using UV-IMS and chemometric analysis

    Food Control

    (2018)
  • V. Giannetti et al.

    Volatile fraction analysis by HS-SPME/GC-MS and chemometric modeling for traceability of apples cultivated in the Northeast Italy

    Food Control

    (2017)
  • D. Granato et al.

    Observations on the use of statistical methods in food science and technology

    Food Research International

    (2014)
  • D. Granato et al.

    An integrated strategy between food chemistry, biology, nutrition, pharmacology, and statistics in the development of functional foods: A proposal

    Trends in Food Science & Technology

    (2017)
  • Y. Guo et al.

    An integrated antioxidant activity fingerprint for commercial teas based on their capacities to scavenge reactive oxygen species

    Food Chemistry

    (2017)
  • Z. Jandrić et al.

    An investigative study on differentiation of citrus fruit/fruit juices by UPLC-QToF MS and chemometrics

    Food Control

    (2017)
  • Z. Kalaycıoğlu et al.

    Characterization of Turkish honeybee pollens by principal component analysis based on their individual organic acids, sugars, minerals, and antioxidant activities

    LWT - Food Science and Technology

    (2017)
  • I.K. Karabagias et al.

    Characterization and geographical discrimination of commercial Citrus spp. honeys produced in different Mediterranean countries based on minerals, volatile compounds and physicochemical parameters, using chemometrics

    Food Chemistry

    (2017)
  • I. Kasprzyk et al.

    FTIR-ATR spectroscopy of pollen and honey as a tool for unifloral honey authentication. The case study of rape honey

    Food Control

    (2018)
  • N. Liu et al.

    Portraying and tracing the impact of different production systems on the volatile organic compound composition of milk by PTR-(Quad)MS and PTR-(ToF)MS

    Food Chemistry

    (2018)
  • J.A. Lund et al.

    Differentiation of Crataegus spp. guided by nuclear magnetic resonance spectrometry with chemometric analyses

    Phytochemistry

    (2017)
  • K. Luo et al.

    Characterization of global metabolic profile of Zhi-Zi-Hou-Po decoction in rat bile, urine and feces after oral administration based on a strategy combining LC–MS and chemometrics

    Journal of Chromatography B

    (2017)
  • H. Lv et al.

    Phytochemical profiles and antioxidant activities of Chinese dark teas obtained by different processing technologies

    Food Research International

    (2017)
  • P. Mapelli-Brahm et al.

    Isoprenoids composition and colour to differentiate virgin olive oils from a specific mill

    LWT - Food Science and Technology

    (2018)
  • S. Mehretie et al.

    Classification of raw Ethiopian honeys using front face fluorescence spectra with multivariate analysis

    Food Control

    (2018)
  • J. Müller-Maatsch et al.

    Adulteration of anthocyanin- and betalain-based coloring foodstuffs with the textile dye ‘Reactive Red 195’ and its detection by spectrophotometric, chromatic and HPLC-PDA-MS/MS analyses

    Food Control

    (2016)
  • L. Munck et al.

    Chemometrics in food science – a demonstration of the feasibility of a highly exploratory, inductive evaluation strategy of fundamental scientific significance

    Chemometrics and Intelligent Laboratory Systems

    (1998)
  • G.A. Nayik et al.

    A chemometric approach to evaluate the phenolic compounds, antioxidant activity and mineral content of different unifloral honey types from Kashmir, India

    LWT - Food Science and Technology

    (2016)