Statistical machine learning techniques applied to NIR spectral data for rapid detection of sudan dye-I in turmeric powders with optimized pre-processing and wavelength selection
Authors: Saumita Kar, Bipan Tudu, Rajib Bandyopadhyay
Journal: Journal of Food Science and Technology
Published: 2024-03-19
DOI: 10.1007/s13197-024-05971-9
URL: https://link.springer.com/article/10.1007/s13197-024-05971-9
Citations: 0
Abstract
Machine learning techniques were applied systematically to near-infrared (NIR) spectral data to detect Sudan dye I adulteration in turmeric powder. Turmeric powder is one of the most commonly used spices and an easy target for adulteration. Pure turmeric powder was prepared in the laboratory and spiked with Sudan dye I. Spectra of the adulterated mixtures were acquired with an NIR spectrometer. The adulterant concentrations were 1%, 5%, 10%, 15%, 20%, 25%, and 30% (w/w). Principal component analysis (PCA) was used for exploratory analysis and to visualize the adulterant classes. The pre-processing and wavelength selection were optimized by cross-validation using a partial least squares regression (PLSR) model. For quantitative analysis, four regression techniques were applied and compared: ensemble tree regression (ENTR), support vector regression (SVR), principal component regression (PCR), and PLSR. PLSR performed best, with a coefficient of determination (R²) greater than 0.97 and a root mean square error (RMSE) of less than 0.93. To verify the robustness of the model, its figure of merit (FOM) was derived using net analyte signal (NAS) theory. The study established that NIR spectroscopy can detect and quantify Sudan dye I added to turmeric powder with satisfactory accuracy.
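The two accuracy metrics quoted in the abstract, R² and RMSE, can be illustrated with a minimal sketch. The reference values are the adulterant levels listed in the abstract. The predicted values are invented purely for illustration and are not results from the study. The function names `rmse` and `r_squared` are hypothetical helpers, and the standard definitions used here are assumed, since the abstract does not state which formulas the authors used.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Adulterant levels from the abstract, in % (w/w).
reference = [1, 5, 10, 15, 20, 25, 30]

# Hypothetical model predictions, invented for illustration only
# (assumption: these are NOT values reported by the authors).
predicted = [1.6, 4.5, 10.8, 14.3, 20.9, 24.4, 30.5]

print(f"R2   = {r_squared(reference, predicted):.3f}")
print(f"RMSE = {rmse(reference, predicted):.3f}")
```

With these definitions, a model meets the thresholds reported in the abstract when `r_squared(...)` exceeds 0.97 and `rmse(...)` stays below 0.93. The abstract does not give units for the RMSE. If it is expressed in the same units as the concentrations, it would be in % (w/w), but that is an assumption.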