An improved deep convolutional generative adversarial network for quantification of catechins in fermented black tea
Fengle Zhu, Yuqian Zhang, Jian Wang, Xiangdong Luo, Dengtao Liu, Kaicheng Jin, Jiyu Peng
Journal: Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, volume 327, Article 125357 (impact factor 4.3; JCR Q1, Spectroscopy)
DOI: 10.1016/j.saa.2024.125357
Published: 2024-10-30
URL: https://www.sciencedirect.com/science/article/pii/S1386142524015233
Citations: 0
Abstract
The rapid and non-destructive quantification of catechins in fermented black tea is crucial for evaluating tea quality. The combination of hyperspectral imaging and chemometrics has been applied for quantitative detection, but its performance is usually constrained by limited dataset size. To address the challenge of insufficient samples in regression analysis of catechins, this study proposes an improved deep convolutional generative adversarial network with a labeling module, named DCGAN-L, for hyperspectral data augmentation. DCGAN-L consists of a spectral generating module and a label generating module. First, synthetic spectra were generated, and an indicator was proposed to evaluate their quality. Then, the corresponding label values were generated, including epicatechin gallate (ECG), epicatechin (EGC), catechin (C), and total catechin (CC). For label generation, the Euclidean distances between each synthetic spectrum and all true spectra were measured, and weights based on these distances were then allocated to calculate the label values. Subsequently, the training dataset was augmented with the generated synthetic data. Finally, the effect of data augmentation was evaluated using two regression models, random forest (RF) and broad learning system (BLS), for the quantification of catechins. Compared with the results before data augmentation, the average R² of the RF and BLS models increased by 0.044 and 0.164, respectively. The proposed DCGAN-L model allows for the rapid, non-destructive quantification of catechins in black tea when the sample size is limited.
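The label generating step described in the abstract (distances to all true spectra, then distance-based weights) can be illustrated with a minimal sketch. The abstract does not state the weighting formula, so the inverse-distance weighting below is an assumption, and the function names `euclidean` and `assign_labels` are hypothetical, not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two spectra of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_labels(synthetic, real_spectra, real_labels, eps=1e-12):
    """Estimate label values for one synthetic spectrum.

    ASSUMPTION: weights are inverse distances, normalized to sum to 1.
    The paper's actual weighting scheme may differ.
    """
    # Distance from the synthetic spectrum to every true spectrum.
    distances = [euclidean(synthetic, s) for s in real_spectra]
    # Closer true spectra get larger weights; eps avoids division by zero.
    weights = [1.0 / (d + eps) for d in distances]
    total = sum(weights)
    n_labels = len(real_labels[0])
    # Weighted average of each label column (e.g. one column per catechin).
    return [
        sum(w * labels[j] for w, labels in zip(weights, real_labels)) / total
        for j in range(n_labels)
    ]


# Toy data: two true spectra with one label each (values are made up).
real_spectra = [[0.0, 0.0], [2.0, 0.0]]
real_labels = [[1.0], [3.0]]
midpoint_label = assign_labels([1.0, 0.0], real_spectra, real_labels)
```

In this toy case the synthetic spectrum is equidistant from both true spectra, so its label is the plain average of their labels. A synthetic spectrum that coincides with a true spectrum receives a label almost identical to that spectrum's label.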
About the journal:
Spectrochimica Acta, Part A: Molecular and Biomolecular Spectroscopy (SAA) is an interdisciplinary journal which spans from basic to applied aspects of optical spectroscopy in chemistry, medicine, biology, and materials science.
The journal publishes original scientific papers that feature high-quality spectroscopic data and analysis. From the broad range of optical spectroscopies, the emphasis is on electronic, vibrational or rotational spectra of molecules, rather than on spectroscopy based on magnetic moments.
Criteria for publication in SAA are novelty, uniqueness, and outstanding quality. Routine applications of spectroscopic techniques and computational methods are not appropriate.
Topics of particular interest to Spectrochimica Acta Part A include, but are not limited to:
Spectroscopy and dynamics of bioanalytical, biomedical, environmental, and atmospheric sciences,
Novel experimental techniques or instrumentation for molecular spectroscopy,
Novel theoretical and computational methods,
Novel applications in photochemistry and photobiology,
Novel interpretational approaches as well as advances in data analysis based on electronic or vibrational spectroscopy.