World Patent Information: Latest Articles

Patent alert for targeted protein degradation

IF 1.9 Q2 INFORMATION SCIENCE & LIBRARY SCIENCE

World Patent Information

Pub Date: 2026-03-01 Epub Date: 2026-02-02 DOI: 10.1016/j.wpi.2026.102428

Julia N. Heinrich

This article develops a search strategy for a weekly alert in the field of targeted protein degradation (TPD). The task is challenging due to the rapid evolution of therapeutic approaches, the complexity of patent publication feeds, nuances in patent database indexing, language translation issues, and the tendency of inventors and attorneys to use unique terminology. TPD is an emerging, multidisciplinary technology that aims to redirect molecules to hijack natural protein degradation pathways, targeting previously “undruggable” proteins. Unlike traditional “occupancy-driven” drugs, TPD drugs use “event-driven” pharmacology, initiating specific biological events such as protein degradation or modulation of protein-protein interactions (PPIs). This article provides an overview of TPD, presents a patent search strategy for identifying small molecules targeting the ubiquitin-proteasome system (UPS), and highlights the need for effective innovation tracking in this field.

Citations: 0

Identification of domain-relevant patents via weakly supervised deep learning

IF 1.9 Q2 INFORMATION SCIENCE & LIBRARY SCIENCE

World Patent Information

Pub Date: 2026-03-01 Epub Date: 2026-02-05 DOI: 10.1016/j.wpi.2026.102434

Mustafa Sofean

Patent identification is the process of finding patents relevant to a specific technical topic, especially in the early stages of research and development (R&D) projects. Accurately identifying relevant patents helps scientists, researchers, and industry maximize IP value, anticipate challenges, and gain insights into technological trends, competition, and future innovation opportunities. Conventional approaches, such as keyword searches or retrieval based on patent classification codes, frequently result in low precision, while machine learning methods demand extensive manual annotation, posing a major bottleneck for domain-specific applications. In this work, we present a deep learning–based approach for domain-specific patent identification, with a focus on the plasma physics and cybersecurity domains. Our methodology employs a weak supervision paradigm to construct a high-quality training dataset by integrating multiple noisy labeling sources, including linguistic patterns, domain heuristics, and expert-defined rules. Using this synthesized training dataset, we fine-tune pre-trained transformer models, systematically optimizing hyperparameters to maximize performance. The resulting models can be deployed as automated patent identification systems tailored to specialized scientific and industrial contexts. We evaluate our models on previously unseen test sets using standard performance metrics; the evaluation demonstrates that our approach achieves high accuracy and significantly outperforms a benchmark in-context learning approach based on large language models.
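The weak supervision step described in this abstract can be illustrated with a small sketch. It is not the author's pipeline: the three labeling functions, their keywords, and the majority-vote rule are all hypothetical choices made here for illustration, assuming each noisy source either votes relevant (1), votes irrelevant (0), or abstains.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns None when it has no opinion

# Hypothetical labeling functions for a plasma-physics relevance task.
def lf_keyword(text):
    return 1 if "plasma" in text.lower() else ABSTAIN

def lf_off_topic(text):
    return 0 if "blood plasma" in text.lower() else ABSTAIN

def lf_device(text):
    lowered = text.lower()
    return 1 if "tokamak" in lowered or "discharge" in lowered else ABSTAIN

LABELING_FUNCTIONS = [lf_keyword, lf_off_topic, lf_device]

def weak_label(text):
    """Combine noisy votes by simple majority; return None on no votes or a tie."""
    votes = [v for v in (lf(text) for lf in LABELING_FUNCTIONS) if v is not ABSTAIN]
    if not votes:
        return None
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # conflicting sources cancel out, so the document stays unlabeled
    return counts[0][0]
```

Documents that receive a label form the training set for a downstream classifier; documents with no votes or tied votes are left out. A learned label model that weights sources by estimated accuracy would be a natural refinement over plain majority voting.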

Citations: 0

From filing to grant: Predicting patent outcomes in FinTech using a predictive analytics perspective

IF 1.9 Q2 INFORMATION SCIENCE & LIBRARY SCIENCE

World Patent Information

Pub Date: 2026-03-01 Epub Date: 2025-12-20 DOI: 10.1016/j.wpi.2025.102423

Milad Armani Dehghani, Mehmet Sahiner, Noptanit Chotisarn

Patents are critical indicators of innovation, especially in fast-evolving domains like Financial Technology (FinTech). However, accurately predicting patent grant outcomes with modern artificial intelligence techniques has remained challenging. This study addresses that gap by applying state-of-the-art machine learning (ML), including ensemble methods and deep learning models, to a dataset of 20,008 FinTech patent applications from 2000 to 2020. We demonstrate that our ML framework can forecast grant success with high precision (up to 89%), revealing that patent quality and strategic filing choices, such as optimal IPC classes and jurisdictions, are key determinants of grant probability. The findings highlight practical implications for innovators and intellectual property managers, such as better resource allocation and informed patent strategy decisions. Overall, this work introduces a novel, AI-driven approach to patent analytics in FinTech, offering a forward-looking tool to enhance innovation management and strategic IP planning.

Citations: 0

Enhancing mechanical performance of thick steel plates for offshore wind structures: A classification and patent landscape study

IF 1.9 Q2 INFORMATION SCIENCE & LIBRARY SCIENCE

World Patent Information

Pub Date: 2026-03-01 Epub Date: 2025-12-08 DOI: 10.1016/j.wpi.2025.102419

Jeong-sang Eom, Dong-chan Kim, Ji-hun Han, Won-Gyu Bae

Offshore wind energy is emerging as a pivotal energy resource, and as turbine dimensions expand to meet growing power demands, structural requirements for support towers have intensified. This has led to the use of thicker steel plates, introducing challenges such as microstructural inhomogeneity from uneven cooling across plate thicknesses. To address these issues, we conducted a comprehensive patent analysis on heavy steel plate technologies to identify technological gaps and track innovation trends. We developed a classification framework to organize production methods aimed at enhancing mechanical properties. Additionally, we assessed average steel plate thicknesses across countries and companies, reflecting the trend towards larger turbines and towers. Patent impact and market potential were evaluated using the Cites Per Patent (CPP) and Patent Family Size (PFS) indices.
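The two indices named at the end of this abstract can be sketched as simple portfolio averages. The abstract does not give the formulas, so the definitions below are assumptions: CPP as mean forward citations per patent, and PFS as a portfolio's mean family size divided by a reference mean. The numbers are invented for illustration.

```python
def cites_per_patent(forward_citations):
    """Assumed definition: average number of forward citations per patent."""
    return sum(forward_citations) / len(forward_citations)

def patent_family_size_index(family_sizes, reference_mean):
    """Assumed definition: mean family size relative to a reference mean."""
    return (sum(family_sizes) / len(family_sizes)) / reference_mean

# Hypothetical portfolio of four patents.
portfolio_citations = [4, 0, 7, 1]  # forward citations received by each patent
portfolio_families = [3, 1, 6, 2]   # jurisdictions covered by each patent family

cpp = cites_per_patent(portfolio_citations)
pfs = patent_family_size_index(portfolio_families, reference_mean=2.0)
```

Under these assumed definitions, a CPP above the field average suggests higher technological impact, and a PFS above 1 suggests broader international filing than the reference set.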

Citations: 0

Clustering doc2vec output for topic-dimensionality reduction: A MITRE ATT&CK calibration

IF 1.9 Q2 INFORMATION SCIENCE & LIBRARY SCIENCE

World Patent Information

Pub Date: 2026-03-01 Epub Date: 2026-01-14 DOI: 10.1016/j.wpi.2026.102426

Nathan Monnet, Loïc Maréchal

We introduce a novel approach to text classification by combining doc2vec embeddings with advanced clustering techniques to improve the analysis of specialized, high-dimensional textual data. We integrate unsupervised methods such as Louvain, K-means, and Spectral clustering with doc2vec to enhance the detection of semantic patterns across a large corpus. As a case study, we apply this methodology to cybersecurity risk analysis using the MITRE ATT&CK framework to structure and reduce the dimensionality of cyberattack tactics. Louvain clustering proved the most effective among the tested methods, achieving the best balance between cluster coherence and computational efficiency. Our approach identifies four “super tactics”, demonstrating how clustering improves thematic coherence and risk attribution. The results validate the utility of combining doc2vec with clustering, particularly Louvain, for enhancing topic modelling and text classification.
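Louvain clustering works by searching for a partition that maximizes graph modularity. The sketch below computes that objective for a given partition. It assumes the documents have already been linked into an unweighted similarity graph (for example, by thresholding similarities between doc2vec vectors); the toy graph and the function name are hypothetical, and the search procedure itself is not shown.

```python
def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of (L_c / m) - (d_c / (2 * m)) ** 2, where
    m is the number of edges, L_c the number of edges inside c, and
    d_c the total degree of the nodes in c.
    """
    m = len(edges)
    internal = {}  # community -> number of internal edges
    degree = {}    # community -> summed node degree
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())

# Toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {node: "A" for node in range(6)}

q_split = modularity(edges, two_groups)  # 5/14, about 0.357
q_merged = modularity(edges, one_group)  # 0.0
```

Splitting the graph at the bridge scores higher than lumping everything together, which is the kind of comparison Louvain makes repeatedly while merging nodes and communities.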

Citations: 0

Global patent panorama of 3D bioprinting: Trends, maturity and key stakeholders

IF 1.9 Q2 INFORMATION SCIENCE & LIBRARY SCIENCE

World Patent Information

Pub Date: 2026-03-01 Epub Date: 2025-12-05 DOI: 10.1016/j.wpi.2025.102421

K.C. Pantoja, V.S. Tarabal, M.E.J. Oliveira, A.G.S. Oliveira, C.L.V. Silva, P.F. Nascimento, T.A. França, R.I.M.A. Ribeiro, J.A. Dernowsek, P.A. Granjeiro

Three-dimensional (3D) bioprinting is emerging as a high-complexity technology in the field of biofabrication, integrating interdisciplinary principles from engineering, materials science, cell biology, and regenerative medicine. This technique enables the fabrication of functional biological constructs composed of living cells and biomaterials through additive manufacturing methods with high spatial resolution. This article provides an in-depth analysis of the main applications, recent advances, and technical limitations related to 3D bioprinting, with emphasis on its implementation in bioprocesses. In the biomedical context, significant progress has been observed in tissue engineering and 3D disease modeling, particularly in translational oncology and the development of predictive drug screening platforms. In industrial biotechnology, bioprinting has been employed for the production of high-purity biological inputs, such as extracellular matrix (ECM) proteins, using human cell systems, thereby promoting more sustainable, animal-free production routes. In the food industry, this technology allows the development of personalized and nutritionally tailored products incorporating innovative and environmentally sustainable ingredients, such as microalgae and insects. In the agricultural sector, 3D bioprinting has been applied to plant tissue engineering and the design of biomimetic models to optimize crop systems. Additionally, a patentometric analysis highlights the global expansion of 3D bioprinting, with a notable increase in filings across international jurisdictions and a gradual transition toward technological maturity. The findings underscore the strategic role of 3D bioprinting as a driver of technological innovation with significant impacts on health, sustainability, and the bioeconomy.

Citations: 0

Intellectual property awareness in the Gulf region

IF 1.9 Q2 INFORMATION SCIENCE & LIBRARY SCIENCE

World Patent Information

Pub Date: 2026-03-01 Epub Date: 2025-12-05 DOI: 10.1016/j.wpi.2025.102422

Hady M. Khawand, Markus Kittler, Elie Chahda

This study assesses the level of intellectual property (IP) awareness among top executives in small and medium-sized enterprises (SMEs) within the Gulf Cooperation Council (GCC) region. It addresses a notable gap in the literature on IP familiarity and its strategic use in emerging markets. We surveyed 526 executives across the six GCC states, with scales developed to measure IP familiarity, perception of IP's importance, and understanding of central IP concepts (trademarks, patents, copyrights). Statistical analysis reveals a significant lack of IP awareness, particularly in fundamental areas like patent protection and territorial limitations, underscoring potential risks to strategic decision-making and growth. The findings demonstrate a strong, positive correlation between participation in IP-related education and familiarity with IP concepts, yet most executives lack practical understanding of IP's strategic value. Tailored IP education—through workshops, university courses, and industry conferences—is recommended to bridge this gap, aligning executive knowledge with international standards and fostering an innovation-driven business environment in the GCC.

Citations: 0

Literature listing

IF 1.9 Q2 INFORMATION SCIENCE & LIBRARY SCIENCE

World Patent Information

Pub Date: 2026-03-01 Epub Date: 2026-01-02 DOI: 10.1016/j.wpi.2025.102424

Susan Bates

Welcome to the latest quarterly Literature Listing, a current awareness service that points readers to newly published books, journal articles, and conference papers on IP management; Information Retrieval Techniques; Patent Landscapes; Education & Certification; and Legal & Intellectual Property Office Matters. The current Literature Listing was compiled in mid-November 2025. Key resources include Scopus, Digital Commons, publishers' RSS feeds, and serendipity! This article gives a selection of interesting references to whet your appetite; the full list of references can be found in the companion datafile.

Citations: 0

Rethinking patent retrieval with language models: Toward scalable and efficient search

IF 1.9 Q2 INFORMATION SCIENCE & LIBRARY SCIENCE

World Patent Information

Pub Date: 2026-03-01 Epub Date: 2026-01-28 DOI: 10.1016/j.wpi.2026.102433

Renukswamy Chikkamath, Linda Andersson, Markus Endres

Semantic search with embedding models offers an alternative to traditional keyword-based patent retrieval but often struggles with computational cost and efficiency in real-time scenarios compared to methods like BM25. Meanwhile, the rapid advancement of language models raises questions about the necessity of domain-specific models versus the viability of general-purpose ones. This work presents a comprehensive evaluation of embedding-based patent search using the CLEF-IP 2011 dataset. We assess 10 configurations employing language models as retrievers, re-rankers, or hybrids, across 9 models, both patent-specific and general-purpose, tested in 105 experimental setups. Our best configurations deliver a 14.81% absolute MAP improvement over state-of-the-art baselines and outperform patent-specific embeddings by at least 28.95% in MAP. We further show that embedding quantization enables large-scale patent search with up to 30× faster retrieval and 32× lower memory usage. These results provide practical guidance for integrating embedding models into patent prior art search while addressing performance and scalability constraints.
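One way to see where a 32× memory saving can come from is binary quantization, which keeps only the sign of each embedding dimension: 1 bit replaces a 32-bit float. The abstract does not say which quantization scheme the authors use, so the sketch below is an assumption, and the four-dimensional vectors and document IDs are invented.

```python
def binarize(vector):
    """Pack the sign of each dimension into one integer (1 bit per dimension)."""
    bits = 0
    for value in vector:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")

def search(query, corpus_codes, top_k=2):
    """Rank documents by Hamming distance between binary codes (smaller is closer)."""
    q = binarize(query)
    ranked = sorted(corpus_codes, key=lambda doc_id: hamming(q, corpus_codes[doc_id]))
    return ranked[:top_k]

# Hypothetical 4-dimensional embeddings; real models use hundreds of dimensions.
corpus = {
    "doc_a": [0.9, -0.2, 0.4, -0.7],
    "doc_b": [-0.8, 0.3, -0.5, 0.6],
    "doc_c": [0.7, -0.1, -0.3, -0.9],
}
codes = {doc_id: binarize(vec) for doc_id, vec in corpus.items()}
results = search([0.8, -0.3, 0.5, -0.6], codes)
```

Hamming distance on packed bits is cheap to compute, which is one reason quantized search can be faster. A common follow-up is to re-score the top candidates with the original float vectors to recover some of the lost precision.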

Citations: 0

Towards efficient patent analysis: A large language model and BERT-refined methodology for keyphrase extraction

IF 1.9 Q2 INFORMATION SCIENCE & LIBRARY SCIENCE

World Patent Information

Pub Date: 2026-03-01 Epub Date: 2026-02-06 DOI: 10.1016/j.wpi.2026.102435

Yaojia Mu, Jianhua Wang, Huaxiang Zhang, Zhongxue Gan, Guo-Niu Zhu

Patents play a pivotal role in engineering design by safeguarding innovation, forecasting technical trends, and promoting knowledge sharing. However, the vast volume of patents and their complex technical descriptions pose significant challenges for effective analysis and information retrieval. To address these issues, we propose an integrated framework that combines large language models (LLMs) and a BERT-refined approach for patent analysis. Specifically, patent titles and abstracts are first collected, and term frequency-inverse document frequency (TF-IDF) is used to extract candidate keyphrases. An LLM is then employed to refine these keyphrases by filtering irrelevant terms and identifying significant keywords. Subsequently, a fine-tuned BERT model is developed for named entity recognition (NER) to extract domain-specific keywords, which are further refined into keyphrases through our BERT-refined keyphrase extraction (BRKE) method. Experimental results on a large dataset of USPTO patents demonstrate the effectiveness of the proposed BRKE. It achieves the highest F1-score of 52.97% when the top-10 keyphrases are retained, outperforming KeyBERT, YAKE, and RAKE by 9.52%, 6.1%, and 2.35%, respectively. By enhancing the accuracy of patent keyphrase extraction, our contributions make patent analysis more efficient and accessible to both analysts and design engineers.
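The first stage of this pipeline, TF-IDF candidate extraction, can be sketched in a few lines. This is a simplified illustration rather than the authors' implementation: it scores single whitespace-separated tokens instead of phrases, uses the plain tf × log(N / df) weighting, and runs on three invented abstracts.

```python
import math
from collections import Counter

def tfidf_scores(documents):
    """Return one {term: score} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

def top_candidates(documents, index, k=3):
    """Top-k terms of one document, highest score first, ties broken alphabetically."""
    ranked = sorted(tfidf_scores(documents)[index].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

# Hypothetical patent abstracts.
abstracts = [
    "robot gripper with adaptive gripper fingers",
    "robot arm with torque sensor",
    "battery pack with cooling plate",
]
candidates = top_candidates(abstracts, 0)
```

Terms that appear in every document, such as "with", score zero and drop out, while terms concentrated in one document rise to the top. In the framework the abstract describes, such candidates would then be filtered by an LLM and refined by a fine-tuned BERT NER model.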

Citations: 0
