Original papers
Element selection and concentration analysis for classifying South America wine samples according to the country of origin
Introduction
The growth in international trade and in potential markets for food and beverage products has motivated producing regions to develop and apply regulations that ensure product traceability (Zhao et al., 2013). In trading markets, the association of brands with their places of origin tends to boost product acceptance, leading to premium prices and commercial advantages (Diniz et al., 2014; Karoui and De Baerdemaeker, 2007). Thus, food and beverage manufacturers have shown increasing interest in categorizing products precisely according to their place of origin, as well as in improving mechanisms for confirming product authenticity (Borràs et al., 2015; Marcelo et al., 2014).
Wine attributes and quality strongly depend on grape features, soil properties and climate conditions, among other variables; such aspects, when combined with specific cultivation, production and preservation techniques, become fundamental for product promotion and distinction (Marini et al., 2006). Wines produced in specific geographical regions and subjected to strict regulations are certified by a Controlled Denomination of Origin (CDO) distinction, which ensures their superior quality and adherence to best practices (Gómez-Meire et al., 2014). Thus, the development of reliable, fast and straightforward techniques for verifying the authenticity of a wine's origin is relevant to preserving the reputation of a CDO distinction (Marini et al., 2006).
An analytical approach to tracing the origin of food and beverage products consists of assessing their elemental composition and chemical concentration (Drivelos and Georgiou, 2012). The analysis of element concentrations determined by inductively coupled plasma optical emission spectrometry (ICP-OES) and/or inductively coupled plasma mass spectrometry (ICP-MS) has been widely used to determine the quality of products such as organic coffee (Barbosa et al., 2014a), eggs (Barbosa et al., 2014b), rice (Maione et al., 2016), and tea (Diniz et al., 2014; Moreda-Pineiro et al., 2003). Due to its high sensitivity and ability to measure isotopes, ICP-MS is deemed one of the most appropriate techniques for the determination of trace elements in wine (Gonzálvez et al., 2009). This technique quantifies several chemical elements (e.g. Cu, Fe, Mn, Sn and Zn) that may affect wine stability in terms of color, taste, and organoleptic aspects. The concentration of those elements may be determined by geochemical features, as well as by variations in winemaking procedures. Thus, the assessment of element concentrations becomes a valuable resource for corroborating the authenticity of a wine. With that in view, focusing on the chemical elements with higher discriminant ability becomes a crucial step to ensure proper classification of wine samples according to producing country or region. Although some studies have applied statistical and data mining-based techniques for classifying wines according to organoleptic features and geographical origin (e.g. Marini et al., 2006; Coetzee et al., 2014; Azcarate et al., 2015), few have focused on the selection of relevant features (i.e. chemical elements) that enable accurate discrimination and classification of wine samples.
This paper proposes a novel framework for feature selection aimed at categorizing wine samples into classes according to place of origin. The method combines filter and wrapper-based feature selection procedures, and relies on two operational steps. In the first step, the method applies the Kruskal-Wallis (KW) non-parametric test to each feature; features presenting a p-value higher than a given threshold h are removed from the analysis. The aim here is to discard features with no significant ability to discriminate wine samples regarding their place of origin, reducing computational effort and potentially increasing the classifier's performance. Next, a Linear Discriminant Analysis (LDA) is applied to the remaining features, and a feature importance index is derived from the LDA parameters; this index guides the selection process carried out in the next step of the proposed method. In the second step, a forward procedure based on the ranking of features given by the LDA importance index is employed. Best-ranked features are inserted one by one into the subset of features used for classification; after each insertion, classification performance is assessed. The number of selected features is the one that yields the maximum accuracy in the repeated cross-validation. To improve categorization accuracy, different classification techniques are tested.
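The ranking and forward-insertion steps described above can be sketched roughly as follows. This is only an illustration under several assumptions that are not taken from the paper: the importance index is taken here as the sum of absolute LDA coefficients per feature (the paper only says the index is derived from LDA parameters), the classifier is a k-nearest-neighbors model chosen for illustration, features are standardized before ranking, and the data are synthetic. The function names `lda_ranking` and `forward_selection` are hypothetical, and the input is assumed to have already passed the KW filter.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler


def lda_ranking(X, y):
    """Rank features, best first, by summed absolute LDA coefficients.

    ASSUMPTION: this index is a stand-in; the paper's exact index is not
    specified in the text above.
    """
    lda = LinearDiscriminantAnalysis().fit(X, y)
    importance = np.abs(lda.coef_).sum(axis=0)  # one value per feature
    return np.argsort(importance)[::-1]         # descending importance


def forward_selection(X, y, ranking, classifier, cv):
    """Insert ranked features one by one; keep the size with best CV accuracy."""
    accuracies = []
    for k in range(1, len(ranking) + 1):
        subset = ranking[:k]
        scores = cross_val_score(classifier, X[:, subset], y,
                                 cv=cv, scoring="accuracy")
        accuracies.append(scores.mean())
    best_k = int(np.argmax(accuracies)) + 1
    return ranking[:best_k], accuracies


# Synthetic stand-in data (NOT the wine data): 53 samples in 4 classes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2, 3], [13, 15, 13, 12])
X = rng.normal(size=(y.size, 6))
X[:, 0] += 3.0 * y          # feature 0 separates all four classes
X[:, 1] += 3.0 * (y % 2)    # feature 1 separates two groups of classes
X = StandardScaler().fit_transform(X)  # assumed preprocessing step

ranking = lda_ranking(X, y)
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
selected, accuracies = forward_selection(
    X, y, ranking, KNeighborsClassifier(n_neighbors=3), cv)
print("ranking:", ranking)
print("selected features:", selected)
```

Swapping `KNeighborsClassifier` for another scikit-learn estimator is enough to compare classification techniques on the same ranked feature list.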
Section snippets
Samples, materials and sample preparation
Fifty-three (53) samples of red wine from four wine-producing countries in South America were purchased in local markets: 13 from Argentina, 15 from Brazil, 13 from Chile, and 12 from Uruguay. The cultivars (mostly Vitis vinifera species) and geographical origin were labeled on the wine bottles. The number of samples differed among producing countries because some cultivars were not found in local markets.
For sample preparation and dilution, nitric acid (Merck) was used. High-purity water
Results and discussion
Table 1 (adapted from Bentlin et al., 2011) depicts the interval of concentrations for the assessed elements in μg L−1 (note that Ca, Na, Mg, P and K concentrations are given in mg L−1); concentration means and standard deviations are presented in bold and between parentheses, respectively.
The proposed method starts by applying the KW test on all features to select the ones with p-values smaller than a given threshold h. To better assess the influence of h in the results, we tested three
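A minimal sketch of this filter step, assuming SciPy's `scipy.stats.kruskal` as the KW implementation, is given below. The helper name `kw_filter`, the default value `h=0.05`, and the synthetic data are illustrative assumptions only; they are not the thresholds or concentrations used in the study.

```python
import numpy as np
from scipy.stats import kruskal


def kw_filter(X, y, h=0.05):
    """Return indices of features whose Kruskal-Wallis p-value is below h."""
    classes = np.unique(y)
    kept = []
    for j in range(X.shape[1]):
        # One group of values per class (country) for feature j.
        groups = [X[y == c, j] for c in classes]
        _, p_value = kruskal(*groups)
        if p_value < h:
            kept.append(j)
    return kept


# Synthetic stand-in data (NOT the wine data): class sizes 13/15/13/12.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2, 3], [13, 15, 13, 12])
informative = rng.normal(loc=3.0 * y, scale=1.0)  # level shifts with class
noise = rng.normal(size=y.size)                   # unrelated to class
X = np.column_stack([informative, noise])

kept = kw_filter(X, y, h=0.05)
print("features kept:", kept)
```

Calling `kw_filter` with several values of `h` shows how a stricter threshold shrinks the set of features passed on to the ranking step.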
Conclusion
The development of frameworks aimed at recognizing the authenticity of a wine's origin is deemed fundamental to preserving the reputation of a CDO. This paper proposed a framework for feature selection to classify wine samples into categories according to place of origin. The framework first applies the Kruskal-Wallis (KW) test to each feature to remove features with no significant ability to discriminate wine samples. Next, a feature importance index is derived from
References (32)
- et al. Multicriteria variable selection for classification of production batches. Eur. J. Oper. Res. (2012)
- et al. Classification of monovarietal Argentinean white wines by their elemental profile. Food Control (2015)
- et al. The use of advanced chemometric techniques and trace element levels for controlling the authenticity of organic coffee. Food Res. Int. (2014)
- et al. A hybrid feature selection method based on instance learning and cooperative subset search. Pattern Recogn. Lett. (2016)
- et al. Data fusion methodologies for food and beverage authentication and quality assessment – a review. Anal. Chim. Acta (2015)
- et al. Intraregional classification of wine via ICP-MS elemental fingerprinting. Food Chem. (2014)
- et al. Multi-element and multi-isotope-ratio analysis to determine the geographical origin of foods in the European Union. TrAC – Trends Anal. Chem. (2012)
- et al. A novel forward gene selection algorithm for microarray data. Neurocomputing (2014)
- et al. Characterization of wines according the geographical origin by analysis of isotopes and minerals and the influence of harvest on the isotope values. Food Chem. (2013)
- et al. Hybrid decision tree and naïve Bayes classifiers for multi-class classification tasks. Expert Syst. Appl. (2014)
- Feature subset selection in large dimensionality domains. Pattern Recogn.
- Assuring the authenticity of northwest Spain white wine varieties using machine learning techniques. Food Res. Int.
- Elemental fingerprint of wines from the protected designation of origin Valencia. Food Chem.
- A review of the analytical methods coupled with chemometric tools for the determination of the quality and identity of dairy products. Food Chem.
- Classification of geographic origin of rice by data mining and inductively coupled plasma mass spectrometry. Comput. Electron. Agric.
- Authentication of Italian CDO wines by class-modeling techniques. Chemomet. Intell. Lab. Syst.