医学主题词的两个相似度度量:对生物医学文本挖掘和作者姓名消歧的帮助。

Neil R Smalheiser, Gary Bonifield
{"title":"医学主题词的两个相似度度量:对生物医学文本挖掘和作者姓名消歧的帮助。","authors":"Neil R Smalheiser,&nbsp;Gary Bonifield","doi":"10.5210/disco.v7i0.6654","DOIUrl":null,"url":null,"abstract":"<p><p>In the present paper, we have created and characterized several similarity metrics for relating any two Medical Subject Headings (MeSH terms) to each other. The article-based metric measures the tendency of two MeSH terms to appear in the MEDLINE record of the same article. The author-based metric measures the tendency of two MeSH terms to appear in the body of articles written by the same individual (using the 2009 Author-ity author name disambiguation dataset as a gold standard). The two metrics are only modestly correlated with each other (r = 0.50), indicating that they capture different aspects of term usage. The article-based metric provides a measure of semantic relatedness, and MeSH term pairs that co-occur more often than expected by chance may reflect relations between the two terms. In contrast, the author metric is indicative of how individuals practice science, and may have value for author name disambiguation and studies of scientific discovery. We have calculated article metrics for all MeSH terms appearing in at least 25 articles in MEDLINE (as of 2014) and author metrics for MeSH terms published as of 2009. The dataset is freely available for download and can be queried at http://arrowsmith.psych.uic.edu/arrowsmith_uic/mesh_pair_metrics.html. Handling editor: Elizabeth Workman, MLIS, PhD. </p>","PeriodicalId":87404,"journal":{"name":"Journal of biomedical discovery and collaboration","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2016-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.5210/disco.v7i0.6654","citationCount":"0","resultStr":"{\"title\":\"Two Similarity Metrics for Medical Subject Headings (MeSH): An Aid to Biomedical Text Mining and Author Name Disambiguation.\",\"authors\":\"Neil R Smalheiser,&nbsp;Gary Bonifield\",\"doi\":\"10.5210/disco.v7i0.6654\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>In the present paper, we have created and characterized several similarity metrics for relating any two Medical Subject Headings (MeSH terms) to each other. The article-based metric measures the tendency of two MeSH terms to appear in the MEDLINE record of the same article. The author-based metric measures the tendency of two MeSH terms to appear in the body of articles written by the same individual (using the 2009 Author-ity author name disambiguation dataset as a gold standard). The two metrics are only modestly correlated with each other (r = 0.50), indicating that they capture different aspects of term usage. The article-based metric provides a measure of semantic relatedness, and MeSH term pairs that co-occur more often than expected by chance may reflect relations between the two terms. In contrast, the author metric is indicative of how individuals practice science, and may have value for author name disambiguation and studies of scientific discovery. We have calculated article metrics for all MeSH terms appearing in at least 25 articles in MEDLINE (as of 2014) and author metrics for MeSH terms published as of 2009. The dataset is freely available for download and can be queried at http://arrowsmith.psych.uic.edu/arrowsmith_uic/mesh_pair_metrics.html. Handling editor: Elizabeth Workman, MLIS, PhD. </p>\",\"PeriodicalId\":87404,\"journal\":{\"name\":\"Journal of biomedical discovery and collaboration\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-04-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.5210/disco.v7i0.6654\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of biomedical discovery and collaboration\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5210/disco.v7i0.6654\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of biomedical discovery and collaboration","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5210/disco.v7i0.6654","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

在本文中,我们已经创建并表征了几个相似性指标,用于将任何两个医学主题词(MeSH术语)相互关联。基于文章的度量衡量两个MeSH术语在同一篇文章的MEDLINE记录中出现的趋势。基于作者的指标衡量两个MeSH术语出现在同一个人撰写的文章主体中的趋势(使用2009年author- authority作者姓名消歧数据集作为黄金标准)。这两个指标彼此之间只有适度的相关性(r = 0.50),这表明它们捕获了术语使用的不同方面。基于文章的度量提供了语义相关性的度量,同时出现的MeSH术语对可能反映了两个术语之间的关系。相比之下,作者度量指标表明个人如何从事科学研究,可能对作者姓名歧义消除和科学发现研究有价值。我们计算了MEDLINE中至少25篇文章中出现的所有MeSH术语(截至2014年)的文章指标,以及截至2009年发表的MeSH术语的作者指标。该数据集可免费下载,并可在http://arrowsmith.psych.uic.edu/arrowsmith_uic/mesh_pair_metrics.html上查询。代办编辑:Elizabeth Workman, MLIS,博士。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Two Similarity Metrics for Medical Subject Headings (MeSH): An Aid to Biomedical Text Mining and Author Name Disambiguation.

In the present paper, we have created and characterized several similarity metrics for relating any two Medical Subject Headings (MeSH terms) to each other. The article-based metric measures the tendency of two MeSH terms to appear in the MEDLINE record of the same article. The author-based metric measures the tendency of two MeSH terms to appear in the body of articles written by the same individual (using the 2009 Author-ity author name disambiguation dataset as a gold standard). The two metrics are only modestly correlated with each other (r = 0.50), indicating that they capture different aspects of term usage. The article-based metric provides a measure of semantic relatedness, and MeSH term pairs that co-occur more often than expected by chance may reflect relations between the two terms. In contrast, the author metric is indicative of how individuals practice science, and may have value for author name disambiguation and studies of scientific discovery. We have calculated article metrics for all MeSH terms appearing in at least 25 articles in MEDLINE (as of 2014) and author metrics for MeSH terms published as of 2009. The dataset is freely available for download and can be queried at http://arrowsmith.psych.uic.edu/arrowsmith_uic/mesh_pair_metrics.html. Handling editor: Elizabeth Workman, MLIS, PhD.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Two Similarity Metrics for Medical Subject Headings (MeSH): An Aid to Biomedical Text Mining and Author Name Disambiguation. The language of discovery. Bias associated with mining electronic health records. Literature-based Resurrection of Neglected Medical Discoveries. A cognitive task analysis of a visual analytic workflow: Exploring molecular interaction networks in systems biology.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1