Assessing Artificial Intelligence (AI) Implementation for Assisting Gene Linking (at the National Library of Medicine).

JAMIA Open, volume 8, issue 1, ooae129 (IF 2.5; JCR Q2, Health Care Sciences & Services). Pub Date: 2025-01-07. eCollection Date: 2025-02-01. DOI: 10.1093/jamiaopen/ooae129. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11706533/pdf/
Rezarta Islamaj, Chih-Hsuan Wei, Po-Ting Lai, Melanie Huston, Cathleen Coss, Preeti Gokal Kochar, Nicholas Miliaras, James G Mork, Oleg Rodionov, Keiko Sekiya, Dorothy Trinh, Deborah Whitman, Craig Wallin, Zhiyong Lu
Citations: 0

Abstract

Assessing Artificial Intelligence (AI) Implementation for Assisting Gene Linking (at the National Library of Medicine).

Objectives: The National Library of Medicine (NLM) currently indexes close to a million articles each year from more than 5300 medicine and life sciences journals. Of these, a significant number of articles contain critical information about the structure, genetics, and function of genes and proteins in normal and disease states. These articles are identified by the NLM curators, and a manual link is created between these articles and the corresponding gene records in the NCBI Gene database. Thus, the information is interconnected with all the NLM resources and services, which brings considerable value to the life sciences. The NLM aims to provide timely access to all metadata, which requires that article indexing scale with the volume of the published literature. At the same time, although automatic information extraction methods have been shown to achieve accurate results in biomedical text mining research, it remains difficult to evaluate them on established pipelines and integrate them within daily workflows.

Materials and methods: Here, we demonstrate how our machine learning model, GNorm2, which achieved state-of-the-art performance on identifying genes and their corresponding species while handling innate textual ambiguities, could be integrated with the established daily workflow at the NLM and evaluated for its performance in this new environment.

Results: We worked with 8 biomedical curator experts and evaluated the integration using these parameters: (1) gene identification accuracy, (2) interannotator agreement with and without GNorm2, (3) GNorm2 potential bias, and (4) indexing consistency and efficiency. We identified key interface changes that significantly helped the curators to maximize the GNorm2 benefit, and further improved the GNorm2 algorithm to cover 135 species of genes including viral and bacterial genes, based on the biocurator expert survey.
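
The abstract names the evaluation parameters but not the formulas behind them. As a rough illustration only, the sketch below assumes that gene identification accuracy is scored as set-based precision, recall, and F1 over gene identifiers per article, and that interannotator agreement is approximated by mean pairwise Jaccard overlap between curators. Both choices, the function names, and the gene identifiers are assumptions made for this example and are not taken from the paper.

```python
from itertools import combinations


def precision_recall_f1(predicted, gold):
    """Set-based precision, recall, and F1 for one article's gene IDs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def jaccard(a, b):
    """Jaccard overlap of two gene ID sets; two empty sets count as full agreement."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_pairwise_agreement(annotations):
    """Average Jaccard overlap over all pairs of curators for one article.

    `annotations` maps a curator name to the set of gene IDs that curator linked.
    """
    pairs = list(combinations(annotations.values(), 2))
    if not pairs:
        raise ValueError("need at least two curators to measure agreement")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical gene identifiers and curator names, for illustration only.
model_output = {"GENE:1", "GENE:2", "GENE:3"}
curated_links = {"GENE:2", "GENE:3", "GENE:4"}
print(precision_recall_f1(model_output, curated_links))

curators = {
    "curator_a": {"GENE:1", "GENE:2"},
    "curator_b": {"GENE:1", "GENE:2"},
    "curator_c": {"GENE:1"},
}
print(mean_pairwise_agreement(curators))
```

Running the agreement function once on annotations made without model suggestions and once on annotations made with them would give a simple before-and-after comparison of the kind the second parameter describes.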

Conclusion: GNorm2 is currently being fully integrated into the curators' regular workflow.

Source journal: JAMIA Open (Medicine, Health Informatics). CiteScore: 4.10. Self-citation rate: 4.80%. Articles published: 102. Review time: 16 weeks.