Maryamsadat Mohtashamian, Rashmie Abeysinghe, Xubing Hao, Licong Cui
Proceedings. IEEE International Conference on Bioinformatics and Biomedicine, vol. 2022, pp. 3274-3279. Published 2022-12-01. DOI: https://doi.org/10.1109/bibm55620.2022.9995614. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9918376/pdf/nihms-1870911.pdf
Identifying Missing IS-A Relations in Orphanet Rare Disease Ontology.
The Orphanet Rare Disease Ontology (ORDO) provides a structured vocabulary of rare diseases. Downstream applications of ORDO depend on its accuracy to perform their tasks effectively. In this paper, we implement an automated quality assurance pipeline to identify missing is-a relations in ORDO. We first obtain lexical features from concept names. We then generate related and unrelated feature-sharing concept-pairs, each of which can in turn generate derived term-pairs. If an unrelated and a related feature-sharing concept-pair generate the same derived term-pair, we suggest a potential missing is-a relation between the concepts of the unrelated pair. Applying this approach to the 2022-06-27 release of ORDO, we obtained 705 potential missing is-a relations. Leveraging external ontological information in the Unified Medical Language System, we validated 164 of them. This indicates that our approach is a promising way to audit is-a relations in ORDO, although domain expert evaluation is still needed to validate the remaining potential missing is-a relations.
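The matching step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define lexical features or derived term-pairs precisely, so the sketch assumes that a concept's features are the lowercase words of its name, that two concepts share features when they have at least one word in common, and that the derived term-pair is the pair of word sets left after removing the shared words. The concept identifiers and names are made-up toy data, not taken from ORDO, and all function names are hypothetical.

```python
from itertools import permutations


def features(name):
    """Assumed lexical features: the set of lowercase words in a concept name."""
    return frozenset(name.lower().split())


def ancestors(concept, parents):
    """All concepts reachable from `concept` by following is-a edges upward."""
    seen, stack = set(), list(parents.get(concept, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def derived_term_pair(a, b, names):
    """Assumed derivation: drop the shared words, keep what is left on each side.

    Returns None when the two concepts share no feature at all.
    """
    fa, fb = features(names[a]), features(names[b])
    if not (fa & fb):
        return None
    return (tuple(sorted(fa - fb)), tuple(sorted(fb - fa)))


def suggest_missing_is_a(names, parents):
    """Suggest (child, parent) pairs for unrelated concept-pairs whose derived
    term-pair is also produced by some related (descendant, ancestor) pair."""
    anc = {c: ancestors(c, parents) for c in names}
    related_terms = set()   # term-pairs derived from related concept-pairs
    unrelated = {}          # term-pair -> unrelated concept-pairs producing it
    for a, b in permutations(names, 2):
        term_pair = derived_term_pair(a, b, names)
        if term_pair is None:
            continue
        if b in anc[a]:
            related_terms.add(term_pair)
        elif a not in anc[b]:
            unrelated.setdefault(term_pair, []).append((a, b))
    return sorted(
        pair
        for term_pair, pairs in unrelated.items()
        if term_pair in related_terms
        for pair in pairs
    )


# Toy data (hypothetical, not from ORDO): only C1 is-a C2 is asserted.
NAMES = {
    "C1": "autosomal dominant ataxia",
    "C2": "hereditary ataxia",
    "C3": "autosomal dominant neuropathy",
    "C4": "hereditary neuropathy",
}
PARENTS = {"C1": ["C2"]}

suggestions = suggest_missing_is_a(NAMES, PARENTS)
print(suggestions)
```

In this toy example the related pair (C1, C2) and the unrelated pair (C3, C4) both reduce to the same derived term-pair ("autosomal dominant", "hereditary"), so the sketch proposes C3 is-a C4 as a candidate. Such candidates would still need validation, for example against an external source or by a domain expert, as the abstract notes.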