
Journal of Biomedical Semantics: Latest Articles

RecSOI: recommending research directions using statements of ignorance
IF 1.9; Zone 3, Engineering and Technology; Q3, MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2024-04-22; DOI: 10.1186/s13326-024-00304-3
Adrien Bibal, Nourah M. Salem, Rémi Cardon, Elizabeth K. White, Daniel E. Acuna, Robin Burke, Lawrence E. Hunter
The more science advances, the more questions are asked. This compounding growth can make it difficult to keep up with current research directions. Furthermore, this difficulty is exacerbated for junior researchers who enter fields with already large bases of potentially fruitful research avenues. In this paper, we propose a novel task and a recommender system for research directions, RecSOI, that draws from statements of ignorance (SOIs) found in the research literature. By building researchers’ profiles based on textual elements, RecSOI generates personalized recommendations of potential research directions tailored to their interests. In addition, RecSOI provides context for the recommended SOIs, so that users can quickly evaluate how relevant the research direction is for them. In this paper, we provide an overview of RecSOI’s functioning, implementation, and evaluation, demonstrating its effectiveness in guiding researchers through the vast landscape of potential research directions.
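The idea of matching a text-based researcher profile against candidate statements of ignorance can be illustrated with a minimal sketch. It assumes a plain bag-of-words profile and cosine similarity; the abstract does not say which representation or ranking function RecSOI uses, and the names `bag_of_words`, `cosine`, and `rank_statements` are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple


def bag_of_words(text: str) -> Counter:
    """Lowercase, whitespace-tokenized term counts (an assumed, very simple representation)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_statements(profile_texts: Iterable[str], statements: List[str]) -> List[Tuple[float, str]]:
    """Build one profile from the researcher's texts and rank statements by similarity."""
    profile = Counter()
    for text in profile_texts:
        profile.update(bag_of_words(text))
    scored = [(cosine(profile, bag_of_words(s)), s) for s in statements]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Illustrative inputs only; they are not taken from the paper.
ranked = rank_statements(
    ["protein folding prediction"],
    [
        "the role of soil microbes is unclear",
        "it remains unknown how protein folding is regulated",
    ],
)
for score, statement in ranked:
    print(f"{score:.3f}  {statement}")
```

A real system would likely use richer text representations; the sketch only shows the profile-then-rank shape of the task.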
Citations: 0
Enriching the FIDEO ontology with food-drug interactions from online knowledge sources.
IF 2; Zone 3, Engineering and Technology; Q3, MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2024-03-04; DOI: 10.1186/s13326-024-00302-5
Rabia Azzi, Georgeta Bordea, Romain Griffier, Jean Noël Nikiema, Fleur Mougin

The increasing number of articles on adverse interactions that may occur when specific foods are consumed with certain drugs makes it difficult to keep up with the latest findings. Conflicting information is available in the scientific literature and specialized knowledge bases because interactions are described in an unstructured or semi-structured format. The FIDEO ontology aims to integrate and represent information about food-drug interactions in a structured way. This article reports on the new version of this ontology in which more than 1700 interactions are integrated from two online resources: DrugBank and Hedrine. These food-drug interactions have been represented in FIDEO in the form of precompiled concepts, each of which specifies both the food and the drug involved. Additionally, competency questions that can be answered are reviewed, and avenues for further enrichment are discussed.
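A precompiled concept that names both the food and the drug can be pictured as a single record holding the two participants. The sketch below is a plain-Python analogy under that assumption, not FIDEO's actual OWL modelling; the class `FoodDrugInteraction`, the helper `drugs_interacting_with`, and the placeholder names `food_A`, `drug_X`, and so on are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class FoodDrugInteraction:
    """One precompiled concept: a single record naming both participants."""
    food: str
    drug: str
    source: str  # knowledge source the record came from, e.g. "DrugBank" or "Hedrine"


def drugs_interacting_with(food: str, interactions: List[FoodDrugInteraction]) -> Set[str]:
    """Answer a simple competency-style question: which drugs interact with this food?"""
    return {i.drug for i in interactions if i.food == food}


# Placeholder records only; no real interaction data is implied.
interactions = [
    FoodDrugInteraction("food_A", "drug_X", "DrugBank"),
    FoodDrugInteraction("food_A", "drug_Y", "Hedrine"),
    FoodDrugInteraction("food_B", "drug_X", "DrugBank"),
]
print(sorted(drugs_interacting_with("food_A", interactions)))
```

Because each record fixes both participants, a question about one food reduces to a filter over the records.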

Citations: 0
The use of foundational ontologies in biomedical research
IF 1.9; Zone 3, Engineering and Technology; Q3, MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2023-12-11; DOI: 10.1186/s13326-023-00300-z
César H. Bernabé, Núria Queralt-Rosinach, Vítor E. Silva Souza, Luiz Olavo Bonino da Silva Santos, Barend Mons, Annika Jacobsen, Marco Roos
The FAIR principles recommend the use of controlled vocabularies, such as ontologies, to define data and metadata concepts. Ontologies are currently modelled following different approaches, sometimes describing conflicting definitions of the same concepts, which can affect interoperability. To cope with that, prior literature suggests organising ontologies in levels, where domain specific (low-level) ontologies are grounded in domain independent high-level ontologies (i.e., foundational ontologies). In this level-based organisation, foundational ontologies work as translators of intended meaning, thus improving interoperability. Despite their considerable acceptance in biomedical research, there are very few studies testing foundational ontologies. This paper describes a systematic literature mapping that was conducted to understand how foundational ontologies are used in biomedical research and to find empirical evidence supporting their claimed (dis)advantages. From a set of 79 selected papers, we found that foundational ontologies are used for several purposes: ontology construction, repair, mapping, and ontology-based data analysis. Foundational ontologies are claimed to improve interoperability, enhance reasoning, speed up ontology development and facilitate maintainability. The complexity of using foundational ontologies is the most commonly cited downside. Despite being used for several purposes, there were hardly any experiments (1 paper) testing the claims for or against the use of foundational ontologies. In the subset of 49 papers that describe the development of an ontology, low adherence to formal methods was observed for ontology construction (16 papers) and ontology evaluation (4 papers). Our findings have two main implications. First, the lack of empirical evidence about the use of foundational ontologies indicates a need for evaluating the use of such artefacts in biomedical research. 
Second, the low adherence to formal methods illustrates how the field could benefit from a more systematic approach when dealing with the development and evaluation of ontologies. The understanding of how foundational ontologies are used in the biomedical field can drive future research towards the improvement of ontologies and, consequently, data FAIRness. The adoption of formal methods can impact the quality and sustainability of ontologies, and reusing these methods from other fields is encouraged.
Citations: 0
BioBLP: a modular framework for learning on multimodal biomedical knowledge graphs
IF 1.9; Zone 3, Engineering and Technology; Q3, MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2023-12-08; DOI: 10.1186/s13326-023-00301-y
Daniel Daza, Dimitrios Alivanistos, Payal Mitra, Thom Pijnenburg, Michael Cochez, Paul Groth
Knowledge graphs (KGs) are an important tool for representing complex relationships between entities in the biomedical domain. Several methods have been proposed for learning embeddings that can be used to predict new links in such graphs. Some methods ignore valuable attribute data associated with entities in biomedical KGs, such as protein sequences, or molecular graphs. Other works incorporate such data, but assume that entities can be represented with the same data modality. This is not always the case for biomedical KGs, where entities exhibit heterogeneous modalities that are central to their representation in the subject domain. We aim to understand how to incorporate multimodal data into biomedical KG embeddings, and analyze the resulting performance in comparison with traditional methods. We propose a modular framework for learning embeddings in KGs with entity attributes, that allows encoding attribute data of different modalities while also supporting entities with missing attributes. We additionally propose an efficient pretraining strategy for reducing the required training runtime. We train models using a biomedical KG containing approximately 2 million triples, and evaluate the performance of the resulting entity embeddings on the tasks of link prediction, and drug-protein interaction prediction, comparing against methods that do not take attribute data into account. In the standard link prediction evaluation, the proposed method results in competitive, yet lower performance than baselines that do not use attribute data. When evaluated in the task of drug-protein interaction prediction, the method compares favorably with the baselines. Further analyses show that incorporating attribute data does outperform baselines over entities below a certain node degree, comprising approximately 75% of the diseases in the graph. We also observe that optimizing attribute encoders is a challenging task that increases optimization costs. 
Our proposed pretraining strategy yields significantly higher performance while reducing the required training runtime. BioBLP makes it possible to investigate different ways of incorporating multimodal biomedical data for learning representations in KGs. With a particular implementation, we find that incorporating attribute data does not consistently outperform baselines, but improvements are obtained on a comparatively large subset of entities below a specific node degree. Our results indicate a potential for improved performance in scientific discovery tasks where understudied areas of the KG would benefit from link prediction methods.
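The modular idea in the abstract, one encoder per attribute modality plus support for entities with missing attributes, can be sketched as a dispatcher with a fallback. This is a toy illustration and not the BioBLP implementation: the class `ModularEncoder`, the stand-in encoder `encode_sequence`, the dimension `DIM`, and the use of a per-entity lookup vector as the fallback are all assumptions.

```python
import random
from typing import Callable, Dict, List, Optional

DIM = 4  # toy embedding size (assumption)


def encode_sequence(seq: str) -> List[float]:
    """Toy stand-in for a sequence encoder: bucketed character codes, normalized to sum to 1."""
    vec = [0.0] * DIM
    for i, ch in enumerate(seq):
        vec[i % DIM] += ord(ch)
    total = sum(vec) or 1.0
    return [v / total for v in vec]


class ModularEncoder:
    """Dispatch to a modality-specific encoder; fall back to a lookup vector otherwise."""

    def __init__(self, encoders: Dict[str, Callable[[str], List[float]]], seed: int = 0):
        self.encoders = encoders
        self._rng = random.Random(seed)
        self._lookup: Dict[str, List[float]] = {}

    def embed(self, entity_id: str, modality: Optional[str] = None,
              attribute: Optional[str] = None) -> List[float]:
        encoder = self.encoders.get(modality) if attribute is not None else None
        if encoder is not None:
            return encoder(attribute)
        # Missing attribute or unknown modality: use a stable per-entity vector.
        if entity_id not in self._lookup:
            self._lookup[entity_id] = [self._rng.uniform(-1.0, 1.0) for _ in range(DIM)]
        return self._lookup[entity_id]


model = ModularEncoder({"protein_sequence": encode_sequence})
print(model.embed("protein:1", "protein_sequence", "MKTAYIAK"))  # attribute-based
print(model.embed("disease:7"))                                  # fallback lookup
```

In a trained model both paths would produce learned vectors in the same space, so a downstream link-prediction scorer would not need to know which path produced an embedding.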
Citations: 0
Assessing resolvability, parsability, and consistency of RDF resources: a use case in rare diseases.
IF 2; Zone 3, Engineering and Technology; Q3, MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2023-12-05; DOI: 10.1186/s13326-023-00299-3
Shuxin Zhang, Nirupama Benis, Ronald Cornet

Introduction: Healthcare data and the knowledge gleaned from it play a key role in improving the health of current and future patients. These knowledge sources are regularly represented as 'linked' resources based on the Resource Description Framework (RDF). Making resources 'linkable' to facilitate their interoperability is especially important in the rare-disease domain, where health resources are scattered and scarce. However, to benefit from using RDF, resources need to be of good quality. Based on existing metrics, we aim to assess the quality of RDF resources related to rare diseases and provide recommendations for their improvement.

Methods: Sixteen resources of relevance for the rare-disease domain were selected: two schemas, three metadatasets, and eleven ontologies. These resources were tested on six objective metrics regarding resolvability, parsability, and consistency. Any URI that failed the test based on any of the six metrics was recorded as an error. The error count and percentage of each tested resource were recorded. The assessment results were represented in RDF, using the Data Quality Vocabulary schema.

Results: For three out of the six metrics, the assessment revealed quality issues. Eleven resources have non-resolvable URIs, whose proportion of all URIs ranges from 0.1% (6/6,712) in the Anatomical Therapeutic Chemical Classification to 13.7% (17/124) in the WikiPathways Ontology; seven resources have undefined URIs; and two resources have incorrectly used properties of the 'owl:ObjectProperty' type. Individual errors were examined to generate suggestions for the development of high-quality RDF resources, including the tested resources.

Conclusion: We assessed the resolvability, parsability, and consistency of RDF resources in the rare-disease domain, and determined the extent of these types of errors that potentially affect interoperability. The qualitative investigation on these errors reveals how they can be avoided. All findings serve as valuable input for the development of a guideline for creating high-quality RDF resources, thereby enhancing the interoperability of biomedical resources.
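The error percentages quoted in the Results can be reproduced from the reported counts. The short sketch below assumes the percentage is simply failing URIs divided by all URIs in a resource, rounded to one decimal place; the function name `error_percentage` is hypothetical.

```python
def error_percentage(errors: int, total: int) -> float:
    """Share of URIs that failed a metric, as a percentage rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * errors / total, 1)


# Counts taken from the abstract's Results paragraph.
print(error_percentage(6, 6712))  # Anatomical Therapeutic Chemical Classification
print(error_percentage(17, 124))  # WikiPathways Ontology
```

With the abstract's counts this gives 0.1 and 13.7, matching the reported range.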

Citations: 0
Impact of COVID-19 research: a study on predicting influential scholarly documents using machine learning and a domain-independent knowledge graph.
IF 2; Zone 3, Engineering and Technology; Q3, MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2023-11-28; DOI: 10.1186/s13326-023-00298-4
Gollam Rabby, Jennifer D'Souza, Allard Oelen, Lucie Dvorackova, Vojtěch Svátek, Sören Auer

Multiple studies have investigated bibliometric features and uncategorized scholarly documents for the influential scholarly document prediction task. In this paper, we describe our work that attempts to go beyond bibliometric metadata to predict influential scholarly documents. Furthermore, this work also examines the influential scholarly document prediction task over categorized scholarly documents. We also introduce a new approach to enhance the document representation method with a domain-independent knowledge graph to find the influential scholarly document using categorized scholarly content. As the input collection, we use the WHO corpus with scholarly documents on the theme of COVID-19. This study examines different document representation methods for machine learning, including TF-IDF, BOW, and embedding-based language models (BERT). The TF-IDF document representation method works better than the others. Of the various machine learning methods tested, logistic regression outperformed the others for scholarly document category classification, and the random forest algorithm obtained the best results for influential scholarly document prediction, with the help of a domain-independent knowledge graph, specifically DBpedia, to enhance the document representation method for predicting influential scholarly documents with categorical scholarly content. In this case, our study combines state-of-the-art machine learning methods with the BOW document representation method. We also enhance the BOW document representation with the direct type (RDF type) and unqualified relation from DBpedia. From this experiment, we did not find any impact of the enhanced document representation for the scholarly document category classification. We found an effect in the influential scholarly document prediction with categorical data.
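Since the abstract compares TF-IDF with BOW representations, a small worked sketch of TF-IDF weighting may help. It assumes one common variant (term frequency as count divided by document length, inverse document frequency as log(N/df)); the abstract does not state which variant or library the authors used, and the function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[str]) -> List[Dict[str, float]]:
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Illustrative documents only; they are not from the WHO corpus.
vectors = tf_idf(["covid vaccine trial", "covid ontology mapping"])
print(vectors[0])
```

A term that occurs in every document ("covid" here) gets weight zero, whereas plain BOW would keep its raw count; that down-weighting of ubiquitous terms is the main difference between the two representations.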

Citations: 0
Data management plans as linked open data: exploiting ARGOS FAIR and machine actionable outputs in the OpenAIRE research graph.
IF 2 | Zone 3, Engineering & Technology | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2023-11-02 | DOI: 10.1186/s13326-023-00297-5
Elli Papadopoulou, Alessia Bardi, George Kakaletris, Diamadis Tziotzios, Paolo Manghi, Natalia Manola

Background: Open Science Graphs (OSGs) are scientific knowledge graphs representing different entities of the research lifecycle (e.g. projects, people, research outcomes, institutions) and the relationships among them. They present a contextualized view of current research that supports discovery, re-use, reproducibility, monitoring, transparency and omni-comprehensive assessment. A Data Management Plan (DMP) contains information concerning both the research processes and the data collected, generated and/or re-used during a project's lifetime. Automated solutions and workflows that connect DMPs with the actual data and other contextual information (e.g., publications, funding) are missing from the landscape. Submitting DMPs as deliverables also limits their findability. In an open and FAIR-enabling research ecosystem, linking information between research processes and research outputs is essential. The ARGOS tool for FAIR data management contributes to the OpenAIRE Research Graph (RG) and utilises its underlying services and trusted sources to progressively validate and automate Research Data Management (RDM) practices.

Results: A comparative analysis was conducted between the data models of ARGOS and OpenAIRE Research Graph against the DMP Common Standard. Following this, we extended ARGOS with export format converters and semantic tagging, and the OpenAIRE RG with a DMP entity and semantics between existing entities and relationships. This enabled the integration of ARGOS machine-actionable DMPs (ma-DMPs) into the OpenAIRE OSG, enriching and exposing DMPs as FAIR outputs.

Conclusions: To our knowledge, this paper is the first to expose ma-DMPs in OSGs and to link OSGs with DMPs, introducing the latter as entities in the research lifecycle. Further, it provides insight into ARGOS DMP service interoperability practices and integrations that populate the OpenAIRE Research Graph with DMP entities and relationships and strengthen both the FAIRness of outputs and information exchange in a standard way.
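
To make the idea of a DMP as a graph entity more concrete, here is a minimal Python sketch that flattens a DMP record into subject-predicate-object triples linking it to a project, datasets and publications. It is only an illustration: the record layout, the identifiers and the predicate names (`isProducedBy`, `describes`, `isReferencedBy`) are hypothetical and are not taken from ARGOS, the OpenAIRE Research Graph or the DMP Common Standard.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def dmp_to_triples(dmp: Dict) -> List[Triple]:
    """Flatten a DMP record into (subject, predicate, object) triples.

    The keys and predicate names are hypothetical, chosen for this sketch.
    """
    dmp_id = dmp["id"]
    triples: List[Triple] = [
        (dmp_id, "type", "DMP"),
        (dmp_id, "title", dmp["title"]),
    ]
    # Link the plan to the project that produced it, if known.
    if "project" in dmp:
        triples.append((dmp_id, "isProducedBy", dmp["project"]))
    # Link the plan to each dataset it describes.
    for dataset in dmp.get("datasets", []):
        triples.append((dmp_id, "describes", dataset))
    # Link the plan to publications that reference it.
    for publication in dmp.get("publications", []):
        triples.append((dmp_id, "isReferencedBy", publication))
    return triples


example_dmp = {
    "id": "dmp:example-001",
    "title": "Example data management plan",
    "project": "project:example-grant",
    "datasets": ["dataset:raw-measurements", "dataset:processed-tables"],
    "publications": ["publication:example-article"],
}

triples = dmp_to_triples(example_dmp)
for triple in triples:
    print(triple)
```

Once a plan is expressed as links of this kind, it can be found by following edges from a project or dataset, which is the findability gain the abstract describes.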

Citations: 0
Context-based refinement of mappings in evolving life science ontologies.
IF 1.9 | Zone 3, Engineering & Technology | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2023-10-19 | DOI: 10.1186/s13326-023-00294-8
Victor Eiti Yamamoto, Juliana Medeiros Destro, Julio Cesar Dos Reis

Background: Biomedical computational systems benefit from ontologies and their associated mappings. Indeed, aligned ontologies in life sciences play a central role in several semantic-enabled tasks, especially in data exchange. It is crucial to keep alignments up to date as new knowledge is inserted in new ontology releases. How to refine ontology mappings in place when concepts are added demands further research.

Results: This article studies the mapping refinement phenomenon by proposing techniques to refine a set of established mappings based on the evolution of biomedical ontologies. In our first analysis, we investigate ways of suggesting correspondences with the new ontology version without applying a matching operation to the whole set of ontology entities. In the second analysis, the refinement technique enables deriving new mappings and updating the semantic type of the mapping beyond equivalence. Our study explores the neighborhood of concepts in the alignment process to refine mapping sets.

Conclusion: Experimental evaluations with several versions of aligned biomedical ontologies were conducted. Those experiments demonstrated the usefulness of ontology evolution changes to support the process of mapping refinement. Furthermore, using context in ontological concepts was effective in our techniques.
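
The idea of using a concept's neighborhood to refine mappings without re-matching the whole ontology can be sketched as follows. This is a simplified Python illustration, not the authors' technique: the data structures, the `suggest_mapping` helper, the relation names and the 0.8 threshold are assumptions, and label similarity is computed with the standard library's `difflib.SequenceMatcher` as a stand-in for a real matcher. For a concept added under an already mapped parent, only the mapped target and its children are compared; if none is similar enough, the sketch falls back to a non-equivalence relation to the parent's target.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Tuple


def label_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def suggest_mapping(
    new_label: str,
    parent_id: str,
    mappings: Dict[str, str],
    target_children: Dict[str, List[str]],
    target_labels: Dict[str, str],
    threshold: float = 0.8,
) -> Optional[Tuple[str, str]]:
    """Suggest (target concept, relation) for a concept added under parent_id.

    Only the neighborhood of the parent's mapped target is inspected.
    """
    mapped_parent = mappings.get(parent_id)
    if mapped_parent is None:
        return None  # no context available for this new concept
    candidates = [mapped_parent] + target_children.get(mapped_parent, [])
    best = max(candidates, key=lambda c: label_similarity(new_label, target_labels[c]))
    if label_similarity(new_label, target_labels[best]) >= threshold:
        return best, "equivalent"
    # No close label in the neighborhood: relate to the parent's target instead.
    return mapped_parent, "narrower-than"


# Hypothetical source and target ontology fragments.
mappings = {"S:1": "T:10"}  # source "Pneumonia" is already mapped to target "Pneumonia"
target_labels = {"T:10": "Pneumonia", "T:11": "Viral pneumonia", "T:12": "Bacterial pneumonia"}
target_children = {"T:10": ["T:11", "T:12"]}

exact = suggest_mapping("Viral pneumonia", "S:1", mappings, target_children, target_labels)
fallback = suggest_mapping("Aspiration pneumonia", "S:1", mappings, target_children, target_labels)
print(exact)
print(fallback)
```

Restricting the candidates to the neighborhood keeps the cost proportional to the number of added concepts rather than to the size of both ontologies.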

Citations: 0
Analysis and implementation of the DynDiff tool when comparing versions of ontology.
IF 1.9 | Zone 3, Engineering & Technology | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2023-09-28 | DOI: 10.1186/s13326-023-00295-7
Sara Diaz Benavides, Silvio D Cardoso, Marcos Da Silveira, Cédric Pruski

Background: Ontologies play a key role in the management of medical knowledge because they have the properties to support a wide range of knowledge-intensive tasks. The dynamic nature of knowledge requires frequent changes to the ontologies to keep them up-to-date. The challenge is to understand and manage these changes and their impact on dependent systems well enough to handle the growing volume of data annotated with ontologies and the limited documentation describing the changes.

Methods: We present a method to detect and characterize the changes occurring between different versions of an ontology together with an ontology of changes entitled DynDiffOnto, designed according to Semantic Web best practices and FAIR principles. We further describe the implementation of the method and the evaluation of the tool with different ontologies from the biomedical domain (i.e. ICD9-CM, MeSH, NCIt, SNOMEDCT, GO, IOBC and CIDO), showing its performance in terms of execution time and capacity to classify ontological changes, compared with other state-of-the-art approaches.

Results: The experiments show top-level performance of DynDiff for large ontologies and good performance for smaller ones, with respect to execution time and the capability to identify complex changes. In this paper, we further highlight the impact of ontology matchers on the diff computation and the possibility of parameterizing the matcher in DynDiff, which makes it possible to benefit from state-of-the-art matchers.

Conclusion: DynDiff is an efficient tool to compute differences between ontology versions and classify these differences according to DynDiffOnto concepts. This work also contributes to a better understanding of ontological changes through DynDiffOnto, which was designed to express the semantics of the changes between versions of an ontology and can be used to document the evolution of an ontology.
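
A much simplified picture of what computing and classifying differences between two ontology versions involves is given by the Python sketch below. It is not DynDiff and does not use DynDiffOnto: the `Concept` structure, the four change categories and the example identifiers are assumptions made for illustration, and concepts are matched by identifier only, whereas the abstract notes that DynDiff relies on a configurable ontology matcher.

```python
from typing import Dict, List, NamedTuple, Optional


class Concept(NamedTuple):
    label: str
    parent: Optional[str]


def diff_versions(old: Dict[str, Concept], new: Dict[str, Concept]) -> Dict[str, List[str]]:
    """Classify changes between two versions keyed by concept identifier."""
    changes: Dict[str, List[str]] = {"added": [], "removed": [], "renamed": [], "moved": []}
    changes["added"] = sorted(new.keys() - old.keys())
    changes["removed"] = sorted(old.keys() - new.keys())
    for cid in sorted(old.keys() & new.keys()):
        if old[cid].label != new[cid].label:
            changes["renamed"].append(cid)  # same identifier, different label
        if old[cid].parent != new[cid].parent:
            changes["moved"].append(cid)    # same identifier, different parent
    return changes


# Hypothetical versions of a tiny ontology.
version_1 = {
    "C1": Concept("Disease", None),
    "C2": Concept("Infection", "C1"),
    "C3": Concept("Flu", "C1"),
    "C4": Concept("Obsolete term", "C1"),
}
version_2 = {
    "C1": Concept("Disease", None),
    "C2": Concept("Infection", "C1"),
    "C3": Concept("Influenza", "C2"),
    "C5": Concept("Viral infection", "C2"),
}

changes = diff_versions(version_1, version_2)
print(changes)
```

Identifier-based matching misses complex changes such as splits and merges, which is one reason the choice of matcher affects the diff computation.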

Citations: 0
Development and validation of the early warning system scores ontology.
IF 1.9 | Zone 3, Engineering & Technology | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2023-09-20 | DOI: 10.1186/s13326-023-00296-6
Cilia E Zayas, Justin M Whorton, Kevin W Sexton, Charles D Mabry, S Clint Dowland, Mathias Brochhausen

Background: Clinical early warning scoring systems have improved patient outcomes in a range of specializations and global contexts. These systems are used to predict patient deterioration. A multitude of patient-level physiological decompensation data has been made available through the widespread integration of early warning scoring systems within EHRs across national and international health care organizations. These data can be used to promote secondary research. The diversity of early warning scoring systems and of EHR systems is one barrier to secondary analysis of early warning score data. Because early warning score parameters vary, it is difficult to query across providers and EHR systems. Moreover, mapping and merging the parameters is challenging. To overcome these problems, we develop and validate the Early Warning System Scores Ontology (EWSSO), representing three commonly used early warning scores: the National Early Warning Score (NEWS), the six-item modified Early Warning Score (MEWS), and the quick Sequential Organ Failure Assessment (qSOFA).

Methods: We apply the Software Development Lifecycle Framework, conceived by Winston Royce in 1970, to model the activities involved in organizing, producing, and evaluating the EWSSO. We also follow OBO Foundry Principles and the principles of best practice for domain ontology design, terms, definitions, and classifications to meet BFO requirements for ontology building.

Results: We developed twenty-nine new classes and reused four classes and four object properties to create the EWSSO. When we queried the data, our ontology-based process could differentiate between necessary and unnecessary features for score calculation 100% of the time. Further, our process applied the proper temperature conversions for the early warning score calculator 100% of the time.

Conclusions: Using synthetic datasets, we demonstrate that the EWSSO can be used to generate and query health system data on vital signs and provide input to calculate the NEWS, six-item MEWS, and qSOFA. Future work includes extending the EWSSO by introducing additional early warning scores for adult and pediatric patient populations and creating patient profiles that contain clinical, demographic, and outcomes data regarding the patient.
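
As an illustration of the two checks reported in the Results (selecting only the features a score needs, and converting temperature units), the following Python sketch computes qSOFA from a flat patient record. It is not the EWSSO or the authors' pipeline: the record layout, field names and helper functions are assumptions made for this example. The qSOFA criteria used are the standard ones (respiratory rate of at least 22 breaths/min, systolic blood pressure of at most 100 mmHg, and a Glasgow Coma Scale score below 15).

```python
from typing import Dict, Tuple

# Features required by qSOFA; temperature and heart rate are not among them.
QSOFA_FEATURES: Tuple[str, ...] = ("respiratory_rate", "systolic_bp", "gcs")


def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def select_features(record: Dict[str, float], needed: Tuple[str, ...]) -> Dict[str, float]:
    """Keep only the features a given score needs, ignoring the rest."""
    return {name: record[name] for name in needed}


def qsofa(record: Dict[str, float]) -> int:
    """Return the qSOFA score (0 to 3), one point per criterion met."""
    f = select_features(record, QSOFA_FEATURES)
    return (
        int(f["respiratory_rate"] >= 22)  # breaths per minute
        + int(f["systolic_bp"] <= 100)    # mmHg
        + int(f["gcs"] < 15)              # altered mentation
    )


# Hypothetical synthetic record; temperature is stored in Fahrenheit.
patient = {
    "respiratory_rate": 24,
    "systolic_bp": 95,
    "gcs": 15,
    "temperature_f": 100.4,
    "heart_rate": 88,
}

score = qsofa(patient)
temp_c = fahrenheit_to_celsius(patient["temperature_f"])
print(score, round(temp_c, 1))
```

Scores such as NEWS and MEWS do use temperature, so a conversion step of this kind matters when source systems record it in different units.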

Citations: 0