Applications in Plant Sciences: Latest Articles

Using large language models to extract plant functional traits from unstructured text
IF 2.4 | Zone 3, Biology | Q2 Plant Sciences | Pub Date: 2025-06-03 | DOI: 10.1002/aps3.70011
Viktor Domazetoski, Holger Kreft, Helena Bestova, Philipp Wieder, Radoslav Koynov, Alireza Zarei, Patrick Weigelt

Premise

Functional plant ecology seeks to understand how functional traits govern species distributions, community assembly, and ecosystem functions. While global trait datasets have advanced the field, substantial gaps remain, and extracting trait information from text in books, research articles, and online sources via machine learning offers a valuable complement to costly field campaigns.

Methods

We propose a natural language processing pipeline that extracts traits from unstructured species descriptions by using classification models for categorical traits and question-answering models for numerical traits. The pipeline's performance is evaluated on two large databases with over 50,000 species descriptions, utilizing approaches ranging from a keyword search to large language models.
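
For illustration only, a keyword or regular-expression baseline of the kind mentioned above might look like the following minimal sketch. The trait (growth form), its categories, and the keyword patterns are hypothetical assumptions chosen for the example; they are not taken from the article.

```python
import re

# Hypothetical trait vocabulary: the categories and keywords are illustrative
# assumptions, not the vocabulary used in the article.
GROWTH_FORM_PATTERNS = {
    "tree": re.compile(r"\btrees?\b", re.IGNORECASE),
    "shrub": re.compile(r"\bshrubs?\b", re.IGNORECASE),
    "herb": re.compile(r"\bherbs?\b|\bherbaceous\b", re.IGNORECASE),
}


def extract_growth_form(description: str) -> set[str]:
    """Return every growth-form label whose keyword occurs in the description."""
    return {
        label
        for label, pattern in GROWTH_FORM_PATTERNS.items()
        if pattern.search(description)
    }


# A description can match several categories at once.
print(sorted(extract_growth_form("Shrub or small tree to 5 m tall.")))
```

A baseline like this only finds traits that are named with the expected keywords, which is one plausible reason a learned classifier could recover more of them.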

Results

Our final optimized pipeline used a transformer architecture and obtained a mean precision of 90.8% (range 81.6–97%) and a mean recall of 88.6% (77.4–97%) across five categorical traits, representing a 9.83% increase in precision and 42.35% increase in recall over a regular expression-based approach. The question-answering model yielded a normalized mean absolute error of 10.3% averaged across three numerical traits.
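
The metrics reported above can be sketched as follows. This is a minimal illustration under stated assumptions: precision and recall are computed per category in the standard way, and the mean absolute error is normalized by the range of the true values. The abstract does not say which normalization the authors used, so that choice is an assumption.

```python
def precision_recall(true_labels, predicted_labels, positive):
    """Precision and recall for one category of a categorical trait."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def normalized_mae(true_values, predicted_values):
    """Mean absolute error divided by the range of the true values.

    Normalizing by the range is an assumption made for this sketch.
    """
    errors = [abs(t - p) for t, p in zip(true_values, predicted_values)]
    mae = sum(errors) / len(errors)
    return mae / (max(true_values) - min(true_values))
```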

Discussion

The natural language processing pipeline we propose has the potential to facilitate the digitization and extraction of large amounts of plant functional trait information residing in scattered textual descriptions.

Citations: 0
Fully automatic extraction of morphological traits from the web: Utopia or reality?
IF 2.4 | Zone 3, Biology | Q2 Plant Sciences | Pub Date: 2025-06-01 | DOI: 10.1002/aps3.70005
Diego Marcos, Robert van de Vlasakker, Ioannis N. Athanasiadis, Pierre Bonnet, Hervé Goëau, Alexis Joly, W. Daniel Kissling, César Leblanc, André S. J. van Proosdij, Konstantinos P. Panousis

Premise

Plant morphological traits, their observable characteristics, are fundamental to understanding the role played by each species within its ecosystem; however, compiling trait information for even a moderate number of species is a demanding task that may take experts years to accomplish. At the same time, online species descriptions contain massive amounts of information about morphological traits, but the lack of structure makes this source of data impossible to use at scale.

Methods

To overcome this, we propose to leverage recent advances in large language models and devise a mechanism for gathering and processing plant trait information in the form of unstructured textual descriptions, without manual curation.

Results

We evaluate our approach by automatically replicating three manually created species–trait matrices. Our method found values for over half of all species–trait pairs, with an F1 score of over 75%.
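
One way to score an automatically filled species–trait matrix against a manually created one is sketched below. The definitions are assumptions made for this example: coverage is the share of reference pairs that received any value, precision is computed over filled pairs, and recall over all reference pairs. The abstract does not state how the authors computed their F1 score, and the species and trait names in the usage example are placeholders.

```python
def coverage_and_f1(reference, predicted):
    """Compare two species-trait matrices stored as {(species, trait): value}.

    Pairs missing from `predicted` (or mapped to None) count as unfilled.
    """
    filled = [pair for pair in reference if predicted.get(pair) is not None]
    correct = sum(predicted[pair] == reference[pair] for pair in filled)
    coverage = len(filled) / len(reference)
    precision = correct / len(filled) if filled else 0.0
    recall = correct / len(reference)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return coverage, f1


# Placeholder species and traits, for illustration only.
reference = {
    ("species_1", "leaf shape"): "ovate",
    ("species_1", "flower colour"): "white",
    ("species_2", "leaf shape"): "linear",
    ("species_2", "flower colour"): "red",
}
predicted = {
    ("species_1", "leaf shape"): "ovate",
    ("species_2", "leaf shape"): "ovate",
    ("species_2", "flower colour"): "red",
}
coverage, f1 = coverage_and_f1(reference, predicted)
```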

Discussion

Our results suggest that large-scale creation of structured trait databases from unstructured online text is now feasible due to the information extraction capabilities of large language models. However, the process is currently limited by the availability of textual descriptions that cover all traits of interest.

Citations: 0
Chromosome-scale reference genome of Pectocarya recurvata, the species with the smallest reported genome size in Boraginaceae
IF 2.4 | Zone 3, Biology | Q2 Plant Sciences | Pub Date: 2025-05-21 | DOI: 10.1002/aps3.70008
Poppy C. Northing, Jessie A. Pelosi, D. Lawrence Venable, Katrina M. Dlugosch

Premise

Pectocarya recurvata (Boraginaceae, subfamily Cynoglossoideae), a species native to the Sonoran Desert (North America), has served as a model system for a suite of ecological and evolutionary studies. However, no reference genomes are currently available in Cynoglossoideae. A high-quality reference genome for P. recurvata would be valuable for addressing questions in this system and across broader taxonomic scales.

Methods

Using PacBio HiFi sequencing, we assembled a reference genome for P. recurvata and annotated coding regions with full-length transcripts from an Iso-Seq library. We assessed genome completeness with BUSCO and k-mer analysis, and estimated the genome size of six individuals using flow cytometry.

Results

The chromosome-scale genome assembly for P. recurvata was 216.0 Mbp long (N50 = 12.1 Mbp). Previous observations indicated P. recurvata is 2n = 24. Our assembly included 12 primary contigs (158.3 Mbp) containing 30,655 genes with telomeres at 23 out of 24 ends. Flow cytometry measurements from the same population included two plants with 1C = 196.9 Mbp, the smallest measured for Boraginaceae, and four with 1C = 385.8 Mbp, which is consistent with tetraploidy in this population.
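
Two of the quantities above lend themselves to a short worked sketch: the N50 statistic and the ratio between the two flow cytometry estimates. The N50 function follows the usual definition (the length of the contig at which the running total of sorted contig lengths first reaches half the assembly size); the contig lengths in the usage example are made up for illustration, whereas the two 1C values are those reported above.

```python
def n50(contig_lengths):
    """Length of the contig at which the cumulative size reaches half the total."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Made-up contig lengths, only to show how the statistic behaves.
example_n50 = n50([10, 8, 6, 4, 2])

# Ratio of the two reported 1C estimates (Mbp). A value close to 2 is what
# a doubling of genome content between the two groups of plants would give.
size_ratio = 385.8 / 196.9
```

The ratio comes out just under 2, which is in line with the abstract's interpretation of tetraploidy in the population.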

Discussion

The P. recurvata genome assembly and annotation provide a high-quality genomic resource in a sparsely represented area of the angiosperm phylogeny. This new reference genome will facilitate answering open questions in ecophysiology, biogeography, and systematics.

Citations: 0
Advancing plant metabolic research by using large language models to expand databases and extract labeled data
IF 2.4 | Zone 3, Biology | Q2 Plant Sciences | Pub Date: 2025-05-14 | DOI: 10.1002/aps3.70007
Rachel Knapp, Braidon Johnson, Lucas Busta

Premise

Recently, plant science has seen transformative advances in scalable data collection for sequence and chemical data. These large datasets, combined with machine learning, have demonstrated that conducting plant metabolic research on large scales yields remarkable insights. A key next step in increasing scale has been revealed with the advent of accessible large language models, which, even in their early stages, can distill structured data from the literature. This brings us closer to creating specialized databases that consolidate virtually all published knowledge on a topic.

Methods

Here, we first test different combinations of prompt engineering techniques and language models in the identification of validated enzyme–product pairs. Next, we evaluate the application of automated prompt engineering and retrieval-augmented generation to identify compound–species associations. Finally, we build and determine the accuracy of a multimodal language model–based pipeline that transcribes images of tables into machine-readable formats.

Results

When tuned for each specific task, these methods perform with high (80–90%) or modest (50%) accuracies for enzyme–product pair identification and table image transcription, but with lower false-negative rates than previous methods (decreasing from 55% to 40%) for compound–species pair identification.
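
To put the false-negative rates in context, the sketch below uses the standard definition, FN / (TP + FN), under which recall equals one minus the false-negative rate. The abstract does not spell out its definition, so this is an assumption, and the counts are hypothetical numbers chosen only to reproduce the reported 55% and 40%.

```python
def false_negative_rate(true_positives, false_negatives):
    """Share of real pairs that the method failed to report."""
    return false_negatives / (true_positives + false_negatives)


# Hypothetical counts out of 100 real compound-species pairs.
fnr_previous = false_negative_rate(true_positives=45, false_negatives=55)
fnr_tuned = false_negative_rate(true_positives=60, false_negatives=40)

# Under this definition, recall is the complement of the false-negative rate.
recall_previous = 1 - fnr_previous
recall_tuned = 1 - fnr_tuned
```

Read this way, a drop in the false-negative rate from 55% to 40% corresponds to recall rising from 45% to 60%.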

Discussion

We enumerate several suggestions for researchers working with language models, among which is the importance of the user's domain-specific expertise and knowledge.

Citations: 0
A new spin on chemotaxonomy: Using non-proteogenic amino acids as a test case
IF 2.4 | Zone 3, Biology | Q2 Plant Sciences | Pub Date: 2025-04-14 | DOI: 10.1002/aps3.70006
Makenzie Gibson, William Thives Santos, Alan R. Oyler, Lucas Busta, Craig A. Schenck

Premise

Specialized metabolites serve various roles for plants and humans. Unlike core metabolites, specialized metabolites are restricted to certain plant lineages; thus, in addition to their ecological functions, specialized metabolites can serve as diagnostic markers of plant lineages.

Methods

We investigated the phylogenetic distribution of plant metabolites using non-proteogenic amino acids (NPAA). Species–NPAA associations for eight NPAAs were identified from the existing literature and placed within a phylogenetic context using R packages and the Interactive Tree of Life. To confirm and extend the literature-based NPAA distribution, we selected azetidine-2-carboxylic acid (Aze) and screened over 70 diverse plants using gas chromatography–mass spectrometry (GC-MS).

Results

Literature searches identified 163 NPAA-relevant articles, which were manually inspected to identify 822 species–NPAA associations. NPAAs were mapped at the order and genus level, revealing that some NPAAs are restricted to single orders, whereas others are present across divergent taxa. The observed distribution of Aze across plants and ancestral state reconstruction suggests a convergent evolutionary history.
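
The order-level mapping described above can be pictured with a small sketch: group the associations by compound, then check which compounds occur in a single plant order. The authors worked with R packages and the Interactive Tree of Life; the Python below is only an illustration of the grouping step, and the compound and order names are placeholders rather than data from the article.

```python
from collections import defaultdict


def orders_per_compound(associations):
    """Collect the set of plant orders in which each compound was reported.

    `associations` is an iterable of (compound, order) pairs.
    """
    orders = defaultdict(set)
    for compound, order in associations:
        orders[compound].add(order)
    return dict(orders)


def restricted_to_single_order(associations):
    """Compounds reported from exactly one order, sorted by name."""
    grouped = orders_per_compound(associations)
    return sorted(c for c, found_in in grouped.items() if len(found_in) == 1)


# Placeholder names, for illustration only.
example = [
    ("NPAA_1", "Order_A"),
    ("NPAA_1", "Order_A"),
    ("NPAA_2", "Order_A"),
    ("NPAA_2", "Order_B"),
]
single_order_compounds = restricted_to_single_order(example)
```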

Discussion

Although reliance on chemotaxonomy has decreased in recent years, there is still value in placing metabolites within a phylogenetic context to understand the evolutionary processes of plant chemical diversification. This approach can be applied to metabolites present in any organism and compared at a range of taxonomic levels.

Citations: 0
A low-cost protocol for the optical method of vulnerability curves to calculate P50
IF 2.4 | Zone 3, Biology | Q2 Plant Sciences | Pub Date: 2025-03-31 | DOI: 10.1002/aps3.70004
Georgina González-Rebeles, Miguel Ángel Alonso-Arevalo, Eulogio López, Rodrigo Méndez-Alonzo

Premise

The quantification of plant drought resistance, particularly embolism formation, within and across species, is critical for ecosystem management and agriculture. We developed a cost-effective protocol to measure the water potential at which 50% of hydraulic conductivity is lost in stems (P50), using materials that are more affordable and accessible than those required by the traditional optical method.

Methods and Results

Our protocol uses inexpensive USB microscopes, which are secured along with the plants to a pegboard base to avoid movement. A Python program automated the image acquisition. This method was applied to quantify P50 in an exotic species (Nicotiana glauca) and native species (Rhus integrifolia) of the Mediterranean vegetation in Baja California, Mexico.
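
As a rough illustration of how P50 can be read off a vulnerability curve, the sketch below linearly interpolates the water potential at which the cumulative loss crosses 50%. This is a simplification assumed for the example, not the authors' analysis (a fitted curve may be preferable in practice), and the measurements in the usage example are hypothetical.

```python
def interpolate_p50(water_potentials_mpa, percent_loss, threshold=50.0):
    """Water potential at which cumulative loss first crosses `threshold`.

    Assumes the measurements are ordered as the stem dries, so that
    `percent_loss` does not decrease along the series.
    """
    points = list(zip(water_potentials_mpa, percent_loss))
    for (psi_a, loss_a), (psi_b, loss_b) in zip(points, points[1:]):
        if loss_a <= threshold <= loss_b and loss_b != loss_a:
            fraction = (threshold - loss_a) / (loss_b - loss_a)
            return psi_a + fraction * (psi_b - psi_a)
    raise ValueError("threshold is not bracketed by the measurements")


# Hypothetical drying series: water potential (MPa) and cumulative loss (%).
p50 = interpolate_p50([-1.0, -2.0, -3.0, -4.0], [5.0, 30.0, 70.0, 95.0])
```

With these made-up values the 50% crossing falls halfway between the second and third measurements.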

Conclusions

The intra- and interspecific patterns of variation in stem P50 of N. glauca and R. integrifolia were obtained using the low-cost optical method with widely available and affordable materials that can be easily replicated for other species.

Citations: 0
The PteridoPortal: A publicly accessible collection of over three million records of extant and extinct pteridophytes
IF 2.4 | Zone 3, Biology | Q2 Plant Sciences | Pub Date: 2025-03-10 | DOI: 10.1002/aps3.70003
Carl J. Rothfels, Jaemin Lee, Michael A. Sundue, Alan R. Smith, Amy Kasameyer, Joyce Gross, Garth Holman, Shusheng Hu, Matt von Konrat, Emily B. Sessa, Kimberly Watson, Alan Weakley, Libing Zhang, Patricia Gensel, Michael Hassler, Katelin D. Pearson, Ed Gilbert, Robyn J. Burnham, Richard K. Rabeler, Patrick Sweeney, Alejandra Vasco, Weston Testo, David E. Giblin, Stefanie M. Ickert-Bond, Margaret Landis, Melanie Link-Perez, Tatyana Livshultz, Ian Miller, Christopher Neefus, Kathleen Pigg, Mitchell Power, Alan Prather, Tiana Rehman, Lena Struwe, Michael Vincent, George Weiblen, Timothy Whitfeld, Michael D. Windham, George Yatskievych, Aaron Liston, Elizabeth Makings, Kathleen M. Pryer, Caroline Strömberg, Eve Atri, Jason Best, Ian Glasspool, Layne Huiet, Elizabeth Johnson, Megan R. King, Az Klymiuk, Richard Lupia, Lucas C. Majure, Carol Ann McCormick, Richard McCourt, Shanna Oberreiter, Kent D. Perkins, Yarency Rodriguez, Chelsea Smith, James Solomon, Jordan Teisher, Donna Ford-Werntz, Petra Fuehrding-Potschkat, Holly Little, Tom A. Ranker, Eric Schuettpelz, Carrie M. Tribble, Diane M. Erwin, Cindy V. Looy

Premise

Pteridophytes—vascular land plants that disperse by spores—are a powerful system for studying plant evolution, particularly with respect to the impact of abiotic factors on evolutionary trajectories through deep time. However, our ability to use pteridophytes to investigate such questions—or to capitalize on the ecological and conservation-related applications of the group—has been impaired by the relative isolation of the neo- and paleobotanical research communities and by the absence of large-scale biodiversity data sources.

Methods

Here we present the Pteridophyte Collections Consortium (PCC), an interdisciplinary community uniting neo- and paleobotanists, and the associated PteridoPortal, a publicly accessible online portal that serves over three million pteridophyte records, including herbarium specimens, paleontological museum specimens, and iNaturalist observations. We demonstrate the utility of the PteridoPortal through discussion of three example PteridoPortal-enabled research projects.

Results

The data within the PteridoPortal are global in scope and are queryable in a flexible manner. The PteridoPortal contains a taxonomic thesaurus (a digital version of a Linnaean classification) that includes both extant and extinct pteridophytes in a common phylogenetic framework. The PteridoPortal allows applications such as greatly accelerated classic floristics, entirely new “next-generation” floristic approaches, and the study of environmentally mediated evolution of functional morphology across deep time.
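
The role of a taxonomic thesaurus in querying can be pictured with a toy sketch: names recorded on specimens are resolved to an accepted name before records are matched. Everything here is hypothetical, including the record layout and the placeholder names; it does not describe the PteridoPortal's actual data model or API.

```python
def resolve_name(name, synonym_to_accepted):
    """Map a recorded name to its accepted name (unchanged if not a synonym)."""
    return synonym_to_accepted.get(name, name)


def records_for_taxon(records, accepted_name, synonym_to_accepted):
    """All records whose recorded name resolves to `accepted_name`."""
    return [
        record
        for record in records
        if resolve_name(record["name"], synonym_to_accepted) == accepted_name
    ]


# Placeholder names and records, for illustration only.
thesaurus = {"Genus_a old_epithet": "Genus_b accepted_epithet"}
records = [
    {"id": 1, "name": "Genus_a old_epithet", "kind": "herbarium specimen"},
    {"id": 2, "name": "Genus_b accepted_epithet", "kind": "fossil specimen"},
    {"id": 3, "name": "Genus_c other_epithet", "kind": "observation"},
]
matches = records_for_taxon(records, "Genus_b accepted_epithet", thesaurus)
```

Resolving names first means a single query can return records filed under a synonym alongside those filed under the accepted name.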

Discussion

The PCC and PteridoPortal provide a comprehensive resource enabling novel research into plant evolution, ecology, and conservation across deep time, facilitating rapid floristic analyses and other biodiversity-related investigations, and providing new opportunities for education and community engagement.

Citations: 0
A comprehensive illustrated protocol for clearing, mounting, and imaging leaf venation networks
IF 2.4 Zone 3 (Biology) Q2 PLANT SCIENCES Pub Date: 2025-03-07 DOI: 10.1002/aps3.70002
Isabella Niewiadomski, Monica Antonio, Luiza Maria T. Aparecido, Mickey Boakye, Sonoma Carlos, Andrea Echevarria, Adrian Fontao, Joseph Mann, Ilaíne Silveira Matos, Norma Salinas, Bradley Vu, Benjamin Wong Blonder

Premise

Leaf venation network architecture can provide insights into plant evolution, ecology, and physiology. Venation networks are typically assessed through histological methods, but existing protocols provide limited guidance on processing large or challenging leaves.

Methods and results

We present an illustrated protocol for visualizing whole leaf venation networks, including sample preparation, clearing, staining, mounting, imaging, and archiving steps. The protocol also includes supply lists, troubleshooting procedures, safety considerations, and examples of successful and unsuccessful outcomes. The protocol is suitable for a wide range of leaf sizes and morphologies and has been used with all major plant groups.

Conclusion

We provide a workflow for obtaining high-quality mounts and images of venation networks of a wide range of species, using readily available materials.

Citations: 0
Analysis of plant metabolomics data using identification-free approaches
IF 2.4 Zone 3 (Biology) Q2 PLANT SCIENCES Pub Date: 2025-03-01 DOI: 10.1002/aps3.70001
Xinyu Yuan, Nathaniel S. S. Smith, Gaurav D. Moghe

Plant metabolomes are structurally diverse. One of the most popular techniques for sampling this diversity is liquid chromatography–mass spectrometry (LC-MS), which typically detects thousands of peaks from single organ extracts, many representing true metabolites. These peaks are usually annotated using in-house retention time or spectral libraries, in silico fragmentation libraries, and increasingly through computational techniques such as machine learning. Despite these advances, over 85% of LC-MS peaks remain unidentified, posing a major challenge for data analysis and biological interpretation. This bottleneck limits our ability to fully understand the diversity, functions, and evolution of plant metabolites. In this review, we first summarize current approaches for metabolite identification, highlighting their challenges and limitations. We further focus on alternative strategies that bypass the need for metabolite identification, allowing researchers to interpret global metabolic patterns and pinpoint key metabolite signals. These methods include molecular networking, distance-based approaches, information theory–based metrics, and discriminant analysis. Additionally, we explore their practical applications in plant science and highlight a set of useful tools to support researchers in analyzing complex plant metabolomics data. By adopting these approaches, researchers can enhance their ability to uncover new insights into plant metabolism.
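
As a rough illustration of what an information theory–based metric can look like when no peak is identified, the sketch below computes the Shannon entropy of one sample's peak-intensity profile. The abstract does not say which metrics the review covers, so the choice of Shannon entropy, the function name `shannon_diversity`, and the example intensities are all assumptions made for illustration only.

```python
import math


def shannon_diversity(intensities):
    """Shannon entropy (in bits) of a peak-intensity profile.

    `intensities` holds one non-negative abundance value per LC-MS peak.
    No peak needs to be identified: only relative abundances are used.
    (Hypothetical helper for illustration, not taken from the review.)
    """
    total = sum(intensities)
    if total <= 0:
        raise ValueError("intensities must sum to a positive value")
    entropy = 0.0
    for value in intensities:
        if value > 0:  # zero-intensity peaks contribute nothing
            p = value / total
            entropy -= p * math.log2(p)
    return entropy


# Made-up profiles: four equally abundant peaks vs. one dominant peak.
even_sample = [100.0, 100.0, 100.0, 100.0]
skewed_sample = [970.0, 10.0, 10.0, 10.0]

print(shannon_diversity(even_sample))    # 2.0 bits for four equal peaks
print(shannon_diversity(skewed_sample))  # lower: one peak dominates
```

A profile dominated by a single peak scores lower than an even one, which is why a metric of this kind can be compared across samples without annotating any peak.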

Citations: 0
Enhancing plant morphological trait identification in herbarium collections through deep learning–based segmentation
IF 2.4 Zone 3 (Biology) Q2 PLANT SCIENCES Pub Date: 2025-02-13 DOI: 10.1002/aps3.70000
Hanane Ariouat, Youcef Sklab, Edi Prifti, Jean-Daniel Zucker, Eric Chenin

Premise

Deep learning has become increasingly important in the analysis of digitized herbarium collections, which comprise millions of scans that provide valuable resources for studying plant evolution and biodiversity. However, leveraging deep learning algorithms to analyze these scans presents significant challenges, partly due to the heterogeneous nature of the non-plant material that forms the background of the scans. We hypothesize that removing such backgrounds can improve the performance of these algorithms.

Methods

We propose a novel method based on deep learning to segment and generate plant masks from herbarium scans and subsequently remove the non-plant backgrounds. The semi-automatic preprocessing stages involve the identification and removal of non-plant elements, substantially reducing the manual effort required to prepare the training dataset.

Results

The results highlight the importance of effective image segmentation, which achieved an F1 score of up to 96.6%. Moreover, when used in classification models for plant morphological trait identification, the images resulting from segmentation improved classification accuracy by up to 3% and F1 score by up to 7% compared to non-segmented images.

Discussion

Our approach isolates plant elements in herbarium scans by removing background elements to improve classification tasks. We demonstrate that image segmentation significantly enhances the performance of plant morphological trait identification models.
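
For readers unfamiliar with the F1 score quoted in the Results, the sketch below shows one common way to compute it for a segmentation mask, by comparing predicted and reference plant pixels. Whether the authors scored their masks at the pixel level is not stated in the abstract, so that reading, the function name `mask_f1`, and the toy masks are assumptions for illustration only.

```python
def mask_f1(predicted, truth):
    """F1 score between two binary masks given as nested lists of 0/1.

    1 marks a plant pixel and 0 marks background. Returns 0.0 when
    neither mask contains plant pixels (a convention chosen for this
    sketch). Hypothetical helper, not the authors' evaluation code.
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1  # plant pixel correctly kept
            elif p and not t:
                fp += 1  # background wrongly kept as plant
            elif t and not p:
                fn += 1  # plant pixel wrongly removed
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
truth = [[0, 1, 1],
         [0, 1, 0]]
predicted = [[0, 1, 0],
             [1, 1, 0]]

print(mask_f1(predicted, truth))  # 2*2 / (2*2 + 1 + 1) = 0.666...
```

Because F1 ignores true negatives, large uniform backgrounds do not inflate the score, which makes it a natural choice when plant material occupies a small part of each scan.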

Citations: 0