A novel re-ranking architecture for patent search

Impact factor: 2.2; JCR: Q2 (Information Science & Library Science); Journal: World Patent Information; Pub Date: 2024-05-28; DOI: 10.1016/j.wpi.2024.102282
Vasileios Stamatis, Michail Salampasis, Konstantinos Diamantaras
World Patent Information, volume 78, Article 102282 (journal article). Full text: https://www.sciencedirect.com/science/article/pii/S017221902400022X
Citations: 0

Abstract

Patent search presents unique challenges because patent documents have an intricate structure and use specialized terminology. Although neural models have been applied successfully to many information retrieval (IR) tasks, these complexities have limited their effectiveness in patent search. To address these challenges, we propose a novel re-ranking architecture that handles long, structured patent documents and uses AI models to interpolate lexical and semantic signals of relevance. The architecture also incorporates query-specific weights in the final re-ranking step. To account for partial relevance between patent sections, our method models the relevance relationships between the different sections of patent documents. We calculate lexical and semantic signals of relevance from each document section and feed them as input features to AI models that estimate a combined relevance score. Finally, we compute query-specific weights that determine the relative contributions of lexical and semantic relevance to the final re-ranking. Extensive experiments on the CLEF-IP dataset show that our method outperforms several baselines, achieving substantial and statistically significant improvements in retrieval performance. We further assess the adaptability of our method on the MSMARCO dataset, where it exhibits limited performance, indicating its suitability for domain-specific patent research.
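
The pipeline described in the abstract (per-section lexical and semantic signals, a model that combines them, and a query-specific weight for the final interpolation) might be sketched roughly as follows. This is an illustrative reading of the abstract only, not the authors' implementation. The section names, the term-overlap lexical signal (a stand-in for a ranker such as BM25), the character-trigram cosine (a stand-in for a neural encoder), the fixed linear weights (a stand-in for the trained AI models), and the `alpha` parameter are all assumptions introduced for the example.

```python
import math
from collections import Counter

# Assumed section names; the abstract does not list which sections are used.
SECTIONS = ("title", "abstract", "claims")


def lexical_signal(query, section_text):
    """Fraction of distinct query terms present in the section (stand-in for BM25)."""
    q_terms = set(query.lower().split())
    s_terms = set(section_text.lower().split())
    return len(q_terms & s_terms) / len(q_terms) if q_terms else 0.0


def trigram_vector(text):
    """Character-trigram counts, used here as a toy 'embedding'."""
    t = text.lower()
    return Counter(t[i:i + 3] for i in range(len(t) - 2))


def semantic_signal(query, section_text):
    """Cosine similarity of trigram vectors (stand-in for a neural encoder)."""
    a, b = trigram_vector(query), trigram_vector(section_text)
    dot = sum(a[k] * b[k] for k in a)  # Counter returns 0 for missing keys
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def combine(features, weights):
    """Weighted sum of per-section features (stand-in for a trained model)."""
    return sum(weights[s] * features[s] for s in SECTIONS)


def rerank(query, candidates, lex_weights, sem_weights, alpha):
    """Re-rank candidates by alpha * lexical + (1 - alpha) * semantic.

    alpha is the hypothetical query-specific weight; how the paper
    computes it is not stated in the abstract, so it is passed in.
    """
    scored = []
    for doc_id, sections in candidates.items():
        lex = combine({s: lexical_signal(query, sections.get(s, ""))
                       for s in SECTIONS}, lex_weights)
        sem = combine({s: semantic_signal(query, sections.get(s, ""))
                       for s in SECTIONS}, sem_weights)
        scored.append((alpha * lex + (1 - alpha) * sem, doc_id))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Toy first-stage candidates (invented documents, for illustration only).
candidates = {
    "EP1": {"title": "battery cooling system",
            "abstract": "a cooling plate for a vehicle battery pack",
            "claims": "a battery pack comprising a cooling plate"},
    "EP2": {"title": "microalgae bioreactor",
            "abstract": "a reactor for growing algae",
            "claims": "a bioreactor comprising a light source"},
}
uniform = {s: 1 / len(SECTIONS) for s in SECTIONS}
ranking = rerank("battery cooling plate", candidates, uniform, uniform, alpha=0.5)
print(ranking)
```

Passing `alpha` per query keeps the interpolation step separate from the scoring models, so a query whose terms match the collection vocabulary poorly could lean more heavily on the semantic score. The rule for choosing that weight is left open here because the abstract does not specify it.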

Source journal: World Patent Information (Information Science & Library Science)
CiteScore: 3.50
Self-citation rate: 18.50%
Articles published: 40
About the journal: The aim of World Patent Information is to provide a worldwide forum for the exchange of information between people working professionally in the field of Industrial Property information and documentation and to promote the widest possible use of the associated literature. Regular features include: papers concerned with all aspects of Industrial Property information and documentation; new regulations pertinent to Industrial Property information and documentation; short reports on relevant meetings and conferences; bibliographies, together with book and literature reviews.