Muhamad Adib bin Ahmad, Loong Chuen Lee, Nur Ain Najihah Binti Mohd Rosdi, Nadirah Binti Abd Hamid, A. Ishak, Hukil Sino
Forensic Sciences Research. Published 2023-12-19. DOI: 10.1093/fsr/owad045 (https://doi.org/10.1093/fsr/owad045)
Comparing baseline correction algorithms in discriminating brownish soils from five proximity locations based on UPLC and PLS-DA methods
Soil is commonly collected from outdoor crime scenes and can therefore help link a suspect and a victim to a crime scene. The chemical profiles of soils can be acquired with analytical instruments such as ultra-performance liquid chromatography (UPLC). However, UPLC chromatograms are often affected by an unstable baseline. In this paper, we compared the performance of five baseline correction (BC) algorithms, i.e., asymmetric least squares (AsLS), fill peak (FP), iterative restricted least squares, median window (MW), and modified polynomial fitting, in discriminating 30 chromatograms of brownish soils by five locations of origin, i.e., PP, HK, KU, BL and KB. The preprocessed sub-datasets were first inspected visually through their mean chromatograms and then explored further via principal component analysis (PCA) scores plots. Finally, the predictive performance of partial least squares-discriminant analysis (PLS-DA) models, estimated from 1000 pairs of training and testing samples (prepared via iterative random resampling with a 75:25 split), was studied to identify the best BC method. The mean raw chromatograms of the ten soil samples differed from each other and showed evidently fluctuating baselines. The AsLS- and MW-corrected chromatograms showed the greatest improvement over their raw counterparts. Meanwhile, the PCA scores plots revealed that most of the sub-datasets produced three separate clusters. The sub-datasets were then modelled via PLS-DA. MW emerged as the best BC method based on the mean prediction accuracy estimated from the 1000 pairs of training and testing samples. In conclusion, MW outperformed the other BC methods in correcting the UPLC data of soil.
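To illustrate the general idea behind a median window correction, the sketch below estimates the baseline as a running median and subtracts it from the signal. It is a simplified illustration only, not the authors' implementation: the abstract does not state the exact MW variant, window width, or software used, so the function names, the `half_window` parameter, and the synthetic chromatogram are all assumptions made for this example.

```python
from statistics import median


def median_window_baseline(signal, half_window):
    """Estimate a baseline as the running median of `signal`.

    Each baseline point is the median of the samples lying within
    `half_window` positions on either side; windows are truncated
    at the two ends of the signal.
    """
    n = len(signal)
    baseline = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        baseline.append(median(signal[lo:hi]))
    return baseline


def correct_baseline(signal, half_window):
    """Subtract the running-median baseline from the signal."""
    baseline = median_window_baseline(signal, half_window)
    return [y - b for y, b in zip(signal, baseline)]


# Synthetic chromatogram (hypothetical, not real data):
# a linear drift with one narrow three-point peak on top.
raw = [0.1 * i for i in range(50)]
raw[24] += 5.0
raw[25] += 10.0  # peak apex
raw[26] += 5.0

corrected = correct_baseline(raw, half_window=10)
```

Because the window (21 points here) is much wider than the peak, the median follows the drifting baseline rather than the peak, so the drift is largely removed while the peak height is mostly preserved. A window that is too narrow relative to the peak width would absorb part of the peak into the baseline estimate.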