Improving Patent Search by Search Result Diversification

Youngho Kim, W. Bruce Croft
Published in: Proceedings of the 2015 International Conference on The Theory of Information Retrieval
Publication date: 2015-09-27
DOI: 10.1145/2808194.2809455 (https://doi.org/10.1145/2808194.2809455)
Citations: 5

Abstract

Patent retrieval has some unique features relative to web search. One major task in this domain is prior-art (or invalidity) search: finding existing patents that may invalidate a new patent, where search queries are formulated from the query patent (i.e., the new patent). Because a patent document generally contains long and complex descriptions, generating effective search queries is difficult. Typically, these queries must cover diverse aspects of the new patent application in order to retrieve relevant documents that cover the full scope of the patent. Given this context, search result diversification techniques can potentially improve the retrieval performance of patent search by introducing diversity into the document ranking. In this paper, we examine how effective a recent term-based diversification framework is for patent search. Using this framework involves developing methods to identify effective phrases related to the topics mentioned in the query patent. In our experiments, we evaluate our diversification approach using standard measures of retrieval effectiveness and diversity, and show significant improvements relative to state-of-the-art baselines.
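To make the idea of term-based diversification concrete, here is a minimal sketch of greedy, coverage-based re-ranking. It is not the framework evaluated in the paper; it is a simplified illustration under stated assumptions. The function name `diversify`, the trade-off parameter `lam`, the binary substring matching, and the toy documents and topic phrases are all hypothetical.

```python
def diversify(candidates, topic_terms, k, lam=0.5):
    """Greedily re-rank candidates so the top k cover many topic terms.

    candidates:  list of (doc_id, relevance, text) tuples (hypothetical format).
    topic_terms: dict mapping a topic phrase to its weight (assumed given).
    lam:         trade-off between relevance (0.0) and novelty (1.0).
    """
    remaining = list(candidates)
    covered = set()   # topic phrases already covered by selected documents
    ranking = []

    def gain(cand):
        _, rel, text = cand
        text = text.lower()
        # Novelty: total weight of topic phrases this document adds.
        novelty = sum(w for t, w in topic_terms.items()
                      if t not in covered and t in text)
        return (1 - lam) * rel + lam * novelty

    while remaining and len(ranking) < k:
        best = max(remaining, key=gain)
        ranking.append(best[0])
        covered.update(t for t in topic_terms if t in best[2].lower())
        remaining.remove(best)
    return ranking


# Toy data: two near-duplicate documents and one covering a second topic.
candidates = [
    ("A", 0.90, "lithium battery electrode coating"),
    ("B", 0.85, "lithium battery electrode materials"),
    ("C", 0.60, "wireless charging circuit for a battery pack"),
]
topic_terms = {"battery electrode": 0.5, "wireless charging": 0.5}

diversified = diversify(candidates, topic_terms, k=2)            # ["A", "C"]
relevance_only = diversify(candidates, topic_terms, k=2, lam=0)  # ["A", "B"]
```

With `lam=0` the ranking follows relevance alone and returns two documents about the same aspect. With the default `lam=0.5`, the second slot goes to the document that covers the topic phrase not yet represented, which is the behavior the abstract motivates for query patents with several distinct aspects.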