A novel re-ranking architecture for patent search
Vasileios Stamatis, Michail Salampasis, Konstantinos Diamantaras
World Patent Information, volume 78, Article 102282. Published 2024-05-28. JCR: Q2 (Information Science & Library Science). Impact factor: 2.2.
DOI: 10.1016/j.wpi.2024.102282
https://www.sciencedirect.com/science/article/pii/S017221902400022X
Patent search presents unique challenges due to the intricate structure and specialized terminology embedded in patent documents. While neural models have been successfully applied in various information retrieval (IR) tasks, these inherent complexities have hindered their effectiveness in patent search. To address these challenges, we propose a novel re-ranking architecture that effectively handles long, structured patent documents and leverages AI models to interpolate lexical and semantic signals of relevance. Additionally, the architecture incorporates query-specific weights for the final re-ranking process. To address partial relevance between patent sections, our method models the relevance relationships between different sections of patent documents. We calculate lexical and semantic signals of relevance from each document section and feed them as input features to AI models that estimate a combined relevance score. Finally, we compute query-specific weights to determine the relative contributions of lexical and semantic relevance for the final re-ranking. Extensive experiments on the CLEF-IP dataset demonstrate that our method outperforms several baselines, achieving substantial and statistically significant improvements in retrieval performance. We further assess the adaptability of our method using the MSMARCO dataset, where it exhibits limited performance, indicating its suitability for domain-specific patent research.
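The interpolation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the section names, the `Candidate` type, the `weighted_sum` stand-in for the learned models, and the way the query-specific weight `alpha` is supplied are all assumptions made for illustration, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumed section names; the abstract does not list the sections used.
SECTIONS: Tuple[str, ...] = ("title", "abstract", "description", "claims")


@dataclass
class Candidate:
    """A first-stage retrieval result with per-section relevance signals."""
    doc_id: str
    lexical: Dict[str, float]   # per-section lexical score, assumed scaled to [0, 1]
    semantic: Dict[str, float]  # per-section semantic score, assumed scaled to [0, 1]


def weighted_sum(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Hypothetical stand-in for a learned model: a weighted sum over sections.

    Sections missing from `scores` contribute 0.
    """
    return sum(weights[s] * scores.get(s, 0.0) for s in SECTIONS)


def rerank(
    candidates: List[Candidate],
    lexical_weights: Dict[str, float],
    semantic_weights: Dict[str, float],
    alpha: float,
) -> List[Tuple[str, float]]:
    """Re-rank candidates by alpha * lexical + (1 - alpha) * semantic.

    `alpha` plays the role of a query-specific weight; how it would be
    estimated per query is outside this sketch.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    scored = []
    for c in candidates:
        lex = weighted_sum(c.lexical, lexical_weights)
        sem = weighted_sum(c.semantic, semantic_weights)
        scored.append((c.doc_id, alpha * lex + (1.0 - alpha) * sem))
    # Highest combined score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: document A matches lexically, document B matches semantically.
uniform = {s: 0.25 for s in SECTIONS}
docs = [
    Candidate("A", {"title": 0.8, "claims": 0.8}, {"title": 0.2, "claims": 0.2}),
    Candidate("B", {"title": 0.2, "claims": 0.2}, {"title": 0.8, "claims": 0.8}),
]
ranking_lexical = rerank(docs, uniform, uniform, alpha=0.75)
ranking_semantic = rerank(docs, uniform, uniform, alpha=0.25)
```

With these toy values, a query whose weight favours lexical evidence ranks A first, while a query whose weight favours semantic evidence ranks B first. That is the effect a per-query weight is meant to capture.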
About the journal:
The aim of World Patent Information is to provide a worldwide forum for the exchange of information between people working professionally in the field of Industrial Property information and documentation and to promote the widest possible use of the associated literature. Regular features include: papers concerned with all aspects of Industrial Property information and documentation; new regulations pertinent to Industrial Property information and documentation; short reports on relevant meetings and conferences; bibliographies, together with book and literature reviews.