Demystifying the Semantics of Relevant Objects in Scholarly Collections: A Probabilistic Approach

J. M. Pinto, Wolf-Tilo Balke
Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries, published 2015-06-21
DOI: 10.1145/2756406.2756923 (https://doi.org/10.1145/2756406.2756923)
Citation count: 6

Abstract

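Efforts to make highly specialized knowledge accessible through scientific digital libraries need to go beyond mere bibliographic metadata, since information search in this setting is mostly entity-centric. Previous work has recognized this trend and developed different methods to recognize and (to some degree even automatically) annotate several important types of entities: genes and proteins, chemical structures and molecules, or drug names, to name but a few. Moreover, such entities are often cross-referenced with entries in curated databases. However, several questions remain to be answered: Given a scientific discipline, what are the important entities? How can they be automatically identified? Are all of them really relevant, i.e., do all of them carry deeper semantics for assessing a publication? How can they be represented, described, and subsequently annotated? How can they be used for search tasks? In this work we focus on answering some of these questions. We claim that to bring the use of scientific digital libraries to the next level, we must treat topic-specific entities as first-class citizens and deeply integrate their semantics into the search process. To support this claim, we propose a novel probabilistic approach that not only provides a solution to the integration problem, but also demonstrates how to leverage the knowledge encoded in entities and offers insights into the use of our approach in different scenarios. Finally, we show how our results can benefit information providers.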