Using SPARQL to Test for Lattices: Application to Quality Assurance in Biomedical Ontologies

Guo-Qiang Zhang, Olivier Bodenreider
DOI: https://doi.org/10.1007/978-3-642-17749-1_18
Published in: The semantic Web--ISWC ... : ... International Semantic Web Conference ... proceedings. International Semantic Web Conference, vol. 6497, pp. 273-288
Publication date: 2010-01-01
Publication type: Journal Article
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4330995/pdf/nihms-654705.pdf
Citations: 24

Abstract

We present a scalable, SPARQL-based computational pipeline for testing the lattice-theoretic properties of partial orders represented as RDF triples. The use case for this work is quality assurance in biomedical ontologies, one desirable property of which is conformance to lattice structures. At the core of our pipeline is the algorithm called NuMi, for detecting the Number of Minimal upper bounds of any pair of elements in a given finite partial order. Our technical contribution is the coding of NuMi completely in SPARQL. To show its scalability, we applied NuMi to the entirety of SNOMED CT, the largest clinical ontology (over 300,000 concepts). Our experimental results have been groundbreaking: for the first time, all non-lattice pairs in SNOMED CT have been identified exhaustively from 34 million candidate pairs using over 2.5 billion queries issued to Virtuoso. The percentage of non-lattice pairs ranges from 0% to 1.66% among the 19 SNOMED CT hierarchies. These non-lattice pairs represent target areas for focused curation by domain experts. RDF, SPARQL and related tooling provide an efficient platform for implementing lattice algorithms on large data structures.
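To make the idea of counting minimal upper bounds concrete, the following is a minimal in-memory sketch in Python. It is not the authors' SPARQL implementation of NuMi and says nothing about how their queries are written. It assumes a hypothetical toy hierarchy given as child-to-parent ("is-a") edges that form an acyclic graph, and the names `upward_closures`, `minimal_upper_bounds`, `non_lattice_pairs` and `EXAMPLE_PARENTS` are illustrative only. A pair is flagged when it has more than one minimal upper bound, since such a pair has no least upper bound.

```python
from itertools import combinations


def upward_closures(parents):
    """Map every node to the set containing itself and all of its ancestors."""
    nodes = set(parents)
    for direct_parents in parents.values():
        nodes.update(direct_parents)

    closures = {}
    for node in nodes:
        seen = {node}
        stack = [node]
        while stack:
            current = stack.pop()
            for parent in parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        closures[node] = seen
    return closures


def minimal_upper_bounds(up, a, b):
    """Return the minimal elements among the common upper bounds of a and b."""
    common = up[a] & up[b]
    # u is minimal if no other common upper bound v lies strictly below it,
    # that is, if u is not an ancestor of any other v in `common`.
    return {u for u in common if not any(v != u and u in up[v] for v in common)}


def non_lattice_pairs(parents):
    """List (a, b, bounds) for every pair with more than one minimal upper bound."""
    up = upward_closures(parents)
    flagged = []
    for a, b in combinations(sorted(up), 2):
        bounds = minimal_upper_bounds(up, a, b)
        if len(bounds) > 1:
            flagged.append((a, b, sorted(bounds)))
    return flagged


# Hypothetical toy hierarchy (not SNOMED CT data): a and b each have
# two parents, c and d, which share the single parent "top".
EXAMPLE_PARENTS = {
    "a": ["c", "d"],
    "b": ["c", "d"],
    "c": ["top"],
    "d": ["top"],
}

if __name__ == "__main__":
    for a, b, bounds in non_lattice_pairs(EXAMPLE_PARENTS):
        print(a, b, bounds)
```

In the toy hierarchy, `a` and `b` share the upper bounds `c`, `d` and `top`, of which `c` and `d` are both minimal, so the pair is flagged. This sketch checks every pair exhaustively and keeps all ancestor sets in memory, which would not be practical at the scale reported in the abstract; it is meant only to illustrate the property being tested.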
