María-Pilar Sáenz-Navajas, Chelo Ferreira, Susan E.P. Bastian, David W. Jeffery
Title: Bagging and boosting machine learning algorithms for modelling sensory perception from simple chemical variables: Wine mouthfeel as a case study
Journal: Food Quality and Preference, volume 129, Article 105494 (published 2025-03-01)
DOI: 10.1016/j.foodqual.2025.105494
URL: https://www.sciencedirect.com/science/article/pii/S0950329325000692
Publication type: Journal Article (not open access)
Journal metrics: impact factor 4.9; JCR Q1 (Food Science & Technology); region 1 category: Agricultural and Forestry Sciences
Citation count: 0
Abstract
Aiming to predict sensory properties from chemical data, the application of bagging and boosting machine learning (ML) algorithms was comprehensively investigated and applied to modelling of red wine mouthfeel from simple chemical measurements. A panel of 15 Australian winemakers described the mouthfeel properties of a total of 30 commercial red wines from Australia and Spain using rate-all-that-apply sensory methodology. In parallel, linear sweep voltammetry signals and excitation-emission matrix (EEM) and absorbance data were acquired for the wines. Data were analysed following unsupervised statistical strategies including principal component analysis (PCA with varimax rotation) to simplify the interpretation of sensory variables, along with supervised regression models based on ML, namely random forest (RF) and extreme gradient boosting (XGBoost). PCA results showed that four independent and uncorrelated sensory dimensions mainly related to perceptions of ‘drying’, ‘full body’, ‘velvety’, and ‘gummy’ differentiated among the wines. The RF and XGBoost algorithms yielded superior validated regression models compared to classical PLS modelling. The ML algorithms exhibited strong predictive performance on test data, with an average value exceeding 80 % accuracy for any of the three sets of chemical variables employed. Although XGBoost provided slightly better models, the low computational effort required by RF is advantageous. Key variables included in the models are discussed along with the importance of controlling overfitting. Overall, absorbance, voltammetric or EEM signals coupled with RF or XGBoost algorithms are presented as cheap, easy-to-use, and rapid approaches to predicting sensory properties from chemical signals in complex matrices such as wine.
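The contrast between bagging and boosting that the abstract relies on can be sketched in a few lines. The following is a minimal illustration only, not the authors' pipeline: it uses synthetic one-feature data in place of the chemical signals, and plain regression stumps stand in for the trees used by RF and XGBoost (no feature subsampling and no regularisation). All function names and parameter values are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression stump by minimising squared error."""
    best = None
    # Skip the largest value: splitting there would leave the right side empty.
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x values identical: predict the mean everywhere
        m = sum(ys) / len(ys)
        return (float("inf"), m, m)
    return best[1:]

def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm

def fit_bagging(xs, ys, n_models, rng):
    """Bagging: fit each stump on a bootstrap resample; predictions are averaged."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_bagging(models, x):
    return sum(predict_stump(m, x) for m in models) / len(models)

def fit_boosting(xs, ys, n_rounds, learning_rate):
    """Boosting: fit each stump to the residuals left by the previous rounds."""
    base = sum(ys) / len(ys)
    residuals = [y - base for y in ys]
    stumps = []
    for _ in range(n_rounds):
        s = fit_stump(xs, residuals)
        stumps.append(s)
        residuals = [r - learning_rate * predict_stump(s, x)
                     for r, x in zip(residuals, xs)]
    return base, stumps, learning_rate

def predict_boosting(model, x):
    base, stumps, lr = model
    return base + lr * sum(predict_stump(s, x) for s in stumps)

def mse(predict, xs, ys):
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Synthetic stand-in data: 30 samples, one feature, a noisy linear response.
rng = random.Random(0)
xs = [rng.random() for _ in range(30)]
ys = [2.0 * x + rng.gauss(0, 0.1) for x in xs]

bag = fit_bagging(xs, ys, n_models=50, rng=rng)
boost = fit_boosting(xs, ys, n_rounds=50, learning_rate=0.3)

mean_y = sum(ys) / len(ys)
baseline_mse = mse(lambda x: mean_y, xs, ys)
bag_mse = mse(lambda x: predict_bagging(bag, x), xs, ys)
boost_mse = mse(lambda x: predict_boosting(boost, x), xs, ys)
print(f"baseline {baseline_mse:.4f}  bagging {bag_mse:.4f}  boosting {boost_mse:.4f}")
```

The errors printed here are training errors. With many boosting rounds the training error keeps shrinking, which is why the abstract stresses controlling overfitting and reporting performance on held-out test data.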
About the journal:
Food Quality and Preference is a journal devoted to sensory, consumer and behavioural research in food and non-food products. It publishes original research, critical reviews, and short communications in sensory and consumer science, and sensometrics. In addition, the journal publishes special invited issues on important timely topics and from relevant conferences. These are aimed at bridging the gap between research and application, bringing together authors and readers in consumer and market research, sensory science, sensometrics and sensory evaluation, nutrition and food choice, as well as food research, product development and sensory quality assurance. Submissions to Food Quality and Preference are limited to papers that include some form of human measurement; papers that are limited to physical/chemical measures or the routine application of sensory, consumer or econometric analysis will not be considered unless they specifically make a novel scientific contribution in line with the journal's coverage as outlined below.