Is your search query well-formed? A natural query understanding for patent prior art search
Renukswamy Chikkamath, Deepak Rastogi, Mahesh Maan, Markus Endres
World Patent Information, Volume 76, Article 102254. Published 2023-11-27. DOI: 10.1016/j.wpi.2023.102254
https://www.sciencedirect.com/science/article/pii/S0172219023000844
Citation count: 0
Abstract
Recent advances in Deep Learning-based prior art search have enabled the development of easy-to-use prior art search engines that accept natural language search queries and provide improved search performance. However, unlike conventional keyword-based techniques, where the results are readily interpreted by the presence of the queried keywords, Deep Learning-based techniques act like a black box. As a result, it is difficult for users to articulate their information needs in a way that yields optimal results. In this paper, we share insights on query well-formedness from extensive experimentation with PQAI, an open source Deep Learning-based prior art search engine. We study the effects of query parameters such as grammar, specificity, and verbosity on the search results and show that ill-formed queries containing grammatical errors, non-essential content, and broad terminology adversely affect the relevance of search results. We also develop a number of Machine Learning models, viz. the Grammatical Error Detection Model (GEDM), the Query Specificity Model (QSM), and the Query Verbosity Model (QVM), to identify and mitigate commonly encountered issues with ill-formed queries. The data, survey forms, and code relating to this work will be released to the community. Finally, we outline critical areas of query understanding in prior art search where further research is needed.
About the journal:
The aim of World Patent Information is to provide a worldwide forum for the exchange of information between people working professionally in the field of Industrial Property information and documentation and to promote the widest possible use of the associated literature. Regular features include: papers concerned with all aspects of Industrial Property information and documentation; new regulations pertinent to Industrial Property information and documentation; short reports on relevant meetings and conferences; bibliographies, together with book and literature reviews.