A unified approach to publish semantic annotations of agricultural documents as knowledge graphs

Smart Agricultural Technology | IF 6.3 | JCR Q1 (Agricultural Engineering) | Pub Date: 2024-08-01 | DOI: 10.1016/j.atech.2024.100484
Citation count: 0

Abstract

The research results presented in this paper were obtained as part of the D2KAB project (Data to Knowledge in Agriculture and Biodiversity), which aims to develop semantic web-based tools to describe agronomical data and make it actionable and accessible following the FAIR principles. We focus on constructing domain-specific Knowledge Graphs (KGs) from textual data sources, using Natural Language Processing (NLP) techniques to extract and structure relevant entities. Our approach is based on the formalization of a semantic data model using common linked open vocabularies such as the Web Annotation Ontology (OA) and the Provenance Ontology (PROV). The model was developed by formulating motivating scenarios and competency questions from domain experts. This model has been used to construct three different KGs from three distinct corpora: PubMed scientific publications on wheat genetics and phenotyping, PubMed scientific publications on rice genetics and phenotyping, and French agricultural alert bulletins. The named entities to be recognized include genes, phenotypes, traits, genetic markers, taxa and phenological stages, normalized using semantic resources such as the Wheat Trait and Phenotype Ontology (WTO), the French Crop Usage (FCU) thesaurus and the Plant Phenological Description Ontology (PPDO). Named entities were extracted using different NLP approaches and tools. The relevance of the semantic model was validated by implementing the experts' questions as SPARQL queries answered on the constructed RDF knowledge graphs. Our work demonstrates how domain-specific vocabularies and systematic querying of KGs can reveal hidden interactions and support agronomists in navigating vast amounts of data. The resources and transformation pipelines developed are publicly available in Git repositories.
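
To make the annotation idea concrete, the following is a minimal sketch, not the authors' actual data model or pipeline. It shows how a single named-entity mention could be represented as a Web Annotation (OA) resource with a PROV attribution, and how a competency question such as "which documents mention a given concept?" could be answered over the resulting triples. The OA and PROV property names are taken from the W3C vocabularies; every `example.org` URI, the sample mentions, and the helper names `annotate`, `match` and `documents_mentioning` are assumptions made for illustration. A plain Python set of triples stands in for an RDF store, and the SPARQL string is only an indicative equivalent of the Python lookup.

```python
# Minimal sketch: NER results expressed as Web Annotations with provenance.
# All example.org URIs, sample mentions and helper names are hypothetical.

OA = "http://www.w3.org/ns/oa#"        # W3C Web Annotation vocabulary
PROV = "http://www.w3.org/ns/prov#"    # W3C Provenance ontology
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"             # hypothetical namespace


def annotate(ann_id, doc_uri, exact_text, concept_uri, tool_uri):
    """Return the triples linking a text span in a document to a concept."""
    ann = EX + "annotation/" + ann_id
    target = ann + "/target"
    selector = ann + "/selector"
    return {
        (ann, RDF_TYPE, OA + "Annotation"),
        (ann, OA + "hasBody", concept_uri),         # normalized entity
        (ann, OA + "hasTarget", target),
        (ann, PROV + "wasAttributedTo", tool_uri),  # which tool produced it
        (target, OA + "hasSource", doc_uri),        # annotated document
        (target, OA + "hasSelector", selector),
        (selector, RDF_TYPE, OA + "TextQuoteSelector"),
        (selector, OA + "exact", exact_text),       # literal: matched text
    }


def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def documents_mentioning(graph, concept_uri):
    """Competency question: which documents carry an annotation whose body
    is the given concept?"""
    docs = set()
    for ann, _, _ in match(graph, p=OA + "hasBody", o=concept_uri):
        for _, _, target in match(graph, s=ann, p=OA + "hasTarget"):
            for _, _, doc in match(graph, s=target, p=OA + "hasSource"):
                docs.add(doc)
    return sorted(docs)


# Indicative SPARQL equivalent of documents_mentioning (not executed here).
SPARQL_EQUIVALENT = """
PREFIX oa: <http://www.w3.org/ns/oa#>
SELECT DISTINCT ?doc WHERE {
  ?ann a oa:Annotation ;
       oa:hasBody <http://example.org/concept/PlantHeight> ;
       oa:hasTarget [ oa:hasSource ?doc ] .
}
"""

graph = annotate("a1", EX + "doc/1", "plant height",
                 EX + "concept/PlantHeight", EX + "tool/ner")
graph |= annotate("a2", EX + "doc/2", "rust resistance",
                  EX + "concept/RustResistance", EX + "tool/ner")

print(documents_mentioning(graph, EX + "concept/PlantHeight"))
```

In a real setting the triples would be loaded into an RDF store and queried with SPARQL directly. The in-memory pattern matching here only illustrates the graph shape that such a query traverses: from annotation, to target, to source document.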