Multi-Analyte Concentration Analysis of Marine Samples through Regression-Based Machine Learning

ACS Earth and Space Chemistry (IF 2.9; Zone 3, Chemistry; JCR Q2, Chemistry, Multidisciplinary) | Pub Date: 2024-07-31 | DOI: 10.1021/acsearthspacechem.4c00018
Nicole M. North, Jessica B. Clark, Abigail A. A. Enders, Alex J. Grooms, Salmika G. Wairegi, Kezia A. Duah, Efthimia I. Palassis-Naziri, Abraham Badu-Tawiah, Heather C. Allen

Abstract

Marine systems are incredibly chemically complex. An understanding of the chemical compounds that make up the chemical diversity in marine samples is critical to understanding ecological and ocean health metrics. Using Raman spectroscopy in tandem with machine learning combines a low-cost, highly transportable analytical technique with a powerful and rapid computational approach that can aid in marine analysis. Here, we use Raman spectroscopy and machine learning to identify mM concentrations of three chemically relevant compounds in three distinct classes in a complex aqueous matrix. Saccharides are represented by glucose, fatty acids by butyric acid, and proteins by an amino acid proxy through glycine. Eight classical machine learning models (gradient boosted regressors, random forests, histogram gradient boosted regressors, decision trees, k-nearest neighbors, support vector regression, multi-layer perceptrons, and multivariate linear regression) were tested for their accuracy in identifying the concentrations of glycine, glucose, and butyric acid in marine samples, which were benchmarked through a mass spectrometric method. Support vector regression was able to best identify all three concentrations of glycine, butyric acid, and glucose. Butyric acid was similarly well described through gradient boosted regression and histogram gradient boosted regression. The described spectroscopy and machine learning methodology has the potential to significantly advance rapid field analysis of marine samples.
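The regression setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn, uses synthetic spectra in place of measured Raman data, and the band positions, the 0-50 mM range, the noise level, and the SVR settings are all made up for illustration. Because support vector regression predicts a single target, the sketch fits one model per analyte through `MultiOutputRegressor`.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

ANALYTES = ("glycine", "glucose", "butyric acid")


def make_synthetic_spectra(n_samples=120, n_points=200, seed=0):
    """Return (spectra, concentrations) for a toy three-analyte mixture.

    Everything here is hypothetical: band centers, widths, the 0-50 mM
    range, and the noise level are NOT taken from the paper.
    """
    rng = np.random.default_rng(seed)
    axis = np.linspace(0.0, 1.0, n_points)  # arbitrary, unitless shift axis
    centers = (0.25, 0.50, 0.75)            # one Gaussian band per analyte
    bands = np.stack([np.exp(-((axis - c) / 0.05) ** 2) for c in centers])
    conc = rng.uniform(0.0, 50.0, size=(n_samples, len(ANALYTES)))  # mM
    # Each spectrum is a concentration-weighted sum of bands plus noise.
    spectra = conc @ bands + rng.normal(0.0, 0.5, size=(n_samples, n_points))
    return spectra, conc


X, y = make_synthetic_spectra()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# One scaled linear-kernel SVR per analyte (assumed hyperparameters).
model = MultiOutputRegressor(
    make_pipeline(StandardScaler(), SVR(kernel="linear", C=10.0))
)
model.fit(X_train, y_train)

pred = model.predict(X_test)
mae = mean_absolute_error(y_test, pred, multioutput="raw_values")
for name, err in zip(ANALYTES, mae):
    print(f"{name}: MAE = {err:.2f} mM")
```

Swapping `SVR` for another regressor (for example `RandomForestRegressor` or `KNeighborsRegressor`) in the same pipeline would give a like-for-like comparison on the same split, which is the kind of model comparison the abstract reports.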

Source journal: ACS Earth and Space Chemistry (Earth and Planetary Sciences: Geochemistry and Petrology)
CiteScore: 5.30
Self-citation rate: 11.80%
Articles published: 249
Journal description: The scope of ACS Earth and Space Chemistry includes the application of analytical, experimental and theoretical chemistry to investigate research questions relevant to the Earth and Space. The journal encompasses the highly interdisciplinary nature of research in this area, while emphasizing chemistry and chemical research tools as the unifying theme. The journal publishes broadly in the domains of high- and low-temperature geochemistry, atmospheric chemistry, marine chemistry, planetary chemistry, astrochemistry, and analytical geochemistry. ACS Earth and Space Chemistry publishes Articles, Letters, Reviews, and Features to provide flexible formats to readily communicate all aspects of research in these fields.