Identification of millet origin using terahertz spectroscopy combined with ensemble learning
Infrared Physics & Technology (IF 3.1; JCR Q2, Instruments & Instrumentation), published 2024-09-01
DOI: 10.1016/j.infrared.2024.105547
https://www.sciencedirect.com/science/article/pii/S1350449524004316
Citations: 0
Abstract
Accurately tracing the origin of millet matters to both producers and consumers, because price and taste differ significantly between millets from different origins. Traditional methods of identifying millet origin are time-consuming, laborious, complex, and destructive. In this study, a fast, non-destructive method for differentiating millet origins is developed by combining terahertz time-domain spectroscopy with ensemble learning. First, three machine learning algorithms, namely support vector machine (SVM), random forest (RF), and kernel extreme learning machine (KELM), were used to build discriminative models, and the impact of six preprocessing methods on the models' classification performance was compared. Models using Savitzky-Golay preprocessing were clearly superior at determining the millet's geographical origin. Building on these findings, the study introduces an ensemble learning strategy that combines TOPSIS and stacking to harness the collective strengths of the three algorithms. This approach distinguishes millets from five distinct locations without any parameter fine-tuning. The accuracy, F1 score, and Kappa on the prediction set are all 100%, significantly outperforming the single models, the traditional voting method, and the stacking method. The results suggest that combining terahertz time-domain spectroscopy with TOPSIS-Stacking ensemble learning is a promising method for rapid, non-destructive, and highly accurate discrimination of millet geographical origins.
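The pipeline described in the abstract (Savitzky-Golay preprocessing, several base classifiers, a TOPSIS step, and stacking) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the spectra are synthetic, the filter settings are illustrative, a k-nearest-neighbour classifier stands in for KELM (scikit-learn has no KELM), and TOPSIS is used in one plausible way, namely to rank base learners by validation accuracy, macro F1, and Kappa and keep the top two for stacking. The abstract does not say how TOPSIS and stacking are actually combined.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC


def topsis_scores(matrix, weights=None):
    """Relative closeness of each row to the ideal solution (benefit criteria only)."""
    m = np.asarray(matrix, dtype=float)
    norms = np.linalg.norm(m, axis=0)
    norms[norms == 0] = 1.0                      # avoid division by zero
    if weights is None:
        weights = np.full(m.shape[1], 1.0 / m.shape[1])
    v = (m / norms) * weights                    # weighted, vector-normalised matrix
    ideal, anti = v.max(axis=0), v.min(axis=0)
    d_pos = np.linalg.norm(v - ideal, axis=1)    # distance to best values
    d_neg = np.linalg.norm(v - anti, axis=1)     # distance to worst values
    denom = d_pos + d_neg
    # If all rows are identical, every alternative gets a neutral score of 0.5.
    return np.divide(d_neg, denom, out=np.full_like(d_neg, 0.5), where=denom > 0)


def make_synthetic_spectra(n_per_class=40, n_points=200, n_classes=5, seed=0):
    """Hypothetical stand-in for THz spectra: one shifted Gaussian peak per class."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    X, y = [], []
    for c in range(n_classes):
        peak = np.exp(-((x - 0.2 - 0.15 * c) ** 2) / 0.01)
        X.append(peak + 0.1 * rng.standard_normal((n_per_class, n_points)))
        y.append(np.full(n_per_class, c))
    return np.vstack(X), np.concatenate(y)


X, y = make_synthetic_spectra()

# Savitzky-Golay smoothing along the spectral axis (illustrative settings).
X_sg = savgol_filter(X, window_length=11, polyorder=2, axis=1)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_sg, y, test_size=0.3, stratify=y, random_state=0)
X_fit, X_val, y_fit, y_val = train_test_split(
    X_tr, y_tr, test_size=0.3, stratify=y_tr, random_state=0)

base = {
    "svm": SVC(probability=True, random_state=0),
    "rf": RandomForestClassifier(n_estimators=100, random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),  # stand-in for KELM (assumption)
}

# Score each base learner on the validation split with three benefit criteria.
names, rows = list(base), []
for name in names:
    pred = base[name].fit(X_fit, y_fit).predict(X_val)
    rows.append([accuracy_score(y_val, pred),
                 f1_score(y_val, pred, average="macro"),
                 cohen_kappa_score(y_val, pred)])

closeness = topsis_scores(rows)
keep = [names[i] for i in np.argsort(closeness)[::-1][:2]]  # top two learners

stack = StackingClassifier(
    estimators=[(n, base[n]) for n in keep],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
y_hat = stack.fit(X_tr, y_tr).predict(X_te)
print("kept:", keep, "test accuracy:", accuracy_score(y_te, y_hat))
```

Selecting learners by TOPSIS closeness is only one option. The same scores could instead weight each learner's class probabilities before the meta-learner sees them, which is another reading of "TOPSIS-Stacking" that the abstract does not rule out.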
About the journal:
The Journal covers the entire field of infrared physics and technology: theory, experiment, application, devices and instrumentation. "Infrared" is defined as covering the near, mid and far infrared (terahertz) regions from 0.75 µm (750 nm) to 1 mm (300 GHz). Submissions in the 300 GHz to 100 GHz region may be accepted at the editors' discretion if their content is relevant to shorter wavelengths. Submissions must be primarily concerned with and directly relevant to this spectral region.
Its core topics can be summarized as the generation, propagation and detection of infrared radiation; the associated optics, materials and devices; and its use in all fields of science, industry, engineering and medicine.
Infrared techniques occur in many different fields, notably spectroscopy and interferometry; material characterization and processing; atmospheric physics, astronomy and space research. Scientific aspects include lasers, quantum optics, quantum electronics, image processing and semiconductor physics. Some important applications are medical diagnostics and treatment, industrial inspection and environmental monitoring.