Kai Zhang , Hongbo Gang , Feng Hu , Runlong Yu , Qi Liu
Adaptive ensemble learning for efficient keyphrase extraction: Diagnosis, aggregation, and distillation
Expert Systems with Applications, Volume 278, Article 127236. Published 2025-06-10 (Epub 2025-03-25). Impact factor 7.5 (JCR Q1, Computer Science, Artificial Intelligence).
DOI: 10.1016/j.eswa.2025.127236
URL: https://www.sciencedirect.com/science/article/pii/S0957417425008589
Abstract
Keyphrase extraction (KE) refers to the process of identifying words or phrases that signify the primary themes of a document. Although keyphrase extraction is important in many downstream applications, including scientific document indexing, search, and question answering, performing it both adaptively and effectively remains challenging. To this end, we propose a novel Distillation-based Adaptive Ensemble Learning (DAEL) method specifically designed for efficient keyphrase extraction, encompassing diagnosis, aggregation, and distillation processes. Specifically, we begin with a Cognitive Diagnosis Module (CDM) to evaluate the diverse capabilities of individual KE models. Following this, an Adaptive Aggregation Module (AAM) is employed to create a weight distribution uniquely suited to each data instance. The process concludes with a Knowledge Distillation Module (KDM) that distills the superior performance of the ensemble model into a single model, thereby improving efficiency and reducing computational cost. Extensive testing on real-world datasets highlights the superior performance of the proposed model. In comparison with leading-edge methods, our approach notably excels in processing text with complex structures or significant noise, marking a substantial advancement in KE effectiveness.
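The abstract does not specify how the three stages are computed, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes each base KE model returns a phrase-to-score mapping, that a diagnosis step yields one per-instance logit per model (the name `ability_logits` is hypothetical), that aggregation is a softmax-weighted sum, and that distillation uses a temperature-softened KL divergence between ensemble (teacher) and single-model (student) scores.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(candidate_scores, ability_logits):
    """Combine base-model phrase scores with per-instance weights.

    candidate_scores: list of dicts, one per base model, phrase -> score.
    ability_logits: one logit per base model for this instance
        (hypothetical output of a diagnosis step; an assumption).
    """
    weights = softmax(ability_logits)
    combined = {}
    for w, scores in zip(weights, candidate_scores):
        for phrase, s in scores.items():
            # A phrase missing from a model contributes nothing for that model.
            combined[phrase] = combined.get(phrase, 0.0) + w * s
    return combined


def distillation_loss(teacher_scores, student_scores, temperature=2.0):
    """KL(teacher || student) over the teacher's candidate phrases.

    Scores are softened by `temperature` before the softmax; phrases the
    student did not score default to 0.0 (an illustrative choice).
    """
    phrases = sorted(teacher_scores)
    p = softmax([teacher_scores[k] / temperature for k in phrases])
    q = softmax([student_scores.get(k, 0.0) / temperature for k in phrases])
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    base_outputs = [
        {"neural network": 0.9, "data": 0.1},
        {"neural network": 0.5, "keyphrase": 0.7},
    ]
    ensemble = aggregate(base_outputs, ability_logits=[0.0, 0.0])
    print(ensemble)
    print(distillation_loss(ensemble, {"neural network": 0.6}))
```

With equal logits the aggregation reduces to a plain average of the base models; unequal logits shift weight toward the models judged stronger on that instance, which is the adaptive behavior the abstract describes at a high level.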
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.