Deborah F McGlynn, Lindsay D Yee, H Martin Garraffo, Lewis Y Geer, Tytus D Mak, Yuri A Mirokhin, Dmitrii V Tchekhovskoi, Coty N Jen, Allen H Goldstein, Anthony J Kearsley, Stephen E Stein
Journal of the American Society for Mass Spectrometry. Published 2025-01-13. DOI: 10.1021/jasms.4c00451 (https://doi.org/10.1021/jasms.4c00451)
New Library-Based Methods for Nontargeted Compound Identification by GC-EI-MS.
While gas chromatography mass spectrometry (GC-MS) has long been used to identify compounds in complex mixtures, this process is often subjective and time-consuming and leaves a large fraction of seemingly good-quality spectra unidentified. In this work, we describe a set of new mass spectral library-based methods to assist compound identification in complex mixtures. These methods employ mass spectral uniqueness and compound ubiquity of library entries alongside noise reduction and automated comparison of retention indices to library compounds. As a test data set, we used a publicly available electron ionization mass spectrometry data set consisting of 4833 spectra of particulate organic compounds emitted by combustion of wildland fuels. In the present work, spectra in this data set were first identified using the NIST 2023 EI-MS Library and associated batch process identification software (NIST MS PepSearch) using retention-index corrected Identity Search scoring. Resulting identifications and related information were then employed to parametrize other factors that correlate with identification. A method for identifying compounds absent from but related to those present in mass spectral libraries using the Hybrid Similarity Search is illustrated. Nevertheless, some 90% of the spectra remain unidentified. Through comparison of unidentified to identified mass spectra in this data set, a new simple measure, namely median relative abundance, was developed for evaluating the likelihood of identification.
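The abstract names median relative abundance as a measure but does not define it. The sketch below is one plausible reading, not the authors' implementation: it assumes a spectrum is a list of (m/z, intensity) pairs, scales each intensity so the base peak is 100, and takes the median of the scaled values. The function name and the input format are hypothetical.

```python
from statistics import median


def median_relative_abundance(peaks):
    """Median of peak intensities after scaling the base peak to 100.

    `peaks` is a list of (m/z, intensity) pairs. This definition is an
    assumption for illustration; the abstract does not spell out the formula.
    """
    # Ignore zero or negative intensities, which carry no peak information.
    intensities = [intensity for _, intensity in peaks if intensity > 0]
    if not intensities:
        raise ValueError("spectrum has no positive-intensity peaks")
    base_peak = max(intensities)
    return median(100.0 * intensity / base_peak for intensity in intensities)


# Hypothetical five-peak spectrum, made up for this example.
example_spectrum = [(41, 100), (43, 1000), (57, 500), (71, 250), (85, 50)]
print(median_relative_abundance(example_spectrum))
```

Under this reading, a spectrum dominated by one strong peak with many weak ones gives a low value, while a spectrum with many peaks of comparable height gives a high value. How the value relates to the likelihood of identification is a result of the paper and is not reproduced here.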
About the journal:
The Journal of the American Society for Mass Spectrometry presents research papers covering all aspects of mass spectrometry, incorporating coverage of fields of scientific inquiry in which mass spectrometry can play a role.
Comprehensive in scope, the journal publishes papers on both fundamentals and applications of mass spectrometry. Fundamental subjects include instrumentation principles, design, and demonstration; structures and chemical properties of gas-phase ions; studies of thermodynamic properties; ion spectroscopy; chemical kinetics; mechanisms of ionization; theories of ion fragmentation; cluster ions; and potential energy surfaces. In addition to full papers, the journal offers Communications, Application Notes, and Accounts and Perspectives.