The above article, published online on 21 November 2021 in Wiley Online Library (wileyonlinelibrary.com), has been retracted by agreement between the Editor-in-Chief, Diana Inkpen, and Wiley Periodicals LLC. The article was published as part of a guest-edited special issue. Following publication, it came to our attention that two of those named as Guest Editors of this issue were being impersonated and/or misrepresented by a fraudulent entity. An investigation by the publisher found that all of the articles, including this one, experienced compromised editorial handling and peer review which was not in line with the journal's ethical standards. Therefore, a decision has been made to retract this article. We did not find any evidence of misconduct by the authors. The authors have been informed of the decision to retract.
{"title":"Retraction: Mahaboob John, Y. M., Ravi, G. Multi constrained network feature approximation based secure routing for improved quality of service in mobile ad-hoc network. Comput Intell 40: e12489, 2024 (10.1111/coin.12489)","authors":"","doi":"10.1111/coin.12670","DOIUrl":"https://doi.org/10.1111/coin.12670","url":null,"abstract":"<p>The above article, published online on 21 November 2021 in Wiley Online Library (wileyonlinelibrary.com), has been retracted by agreement between the Editor-in-Chief, Diana Inkpen, and Wiley Periodicals LLC. The article was published as part of a guest-edited special issue. Following publication, it came to our attention that two of those named as Guest Editors of this issue were being impersonated and/or misrepresented by a fraudulent entity. An investigation by the publisher found that all of the articles, including this one, experienced compromised editorial handling and peer review which was not in line with the journal's ethical standards. Therefore, a decision has been made to retract this article. We did not find any evidence of misconduct by the authors. The authors have been informed of the decision to retract.</p>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"40 3","pages":""},"PeriodicalIF":2.8,"publicationDate":"2024-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/coin.12670","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141315376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The above article, published online on 08 March 2022 in Wiley Online Library (wileyonlinelibrary.com), has been retracted by agreement between the Editor-in-Chief, Diana Inkpen, and Wiley Periodicals LLC. The article was published as part of a guest-edited special issue. Following publication, it came to our attention that two of those named as Guest Editors of this issue were being impersonated and/or misrepresented by a fraudulent entity. An investigation by the publisher found that all of the articles, including this one, experienced compromised editorial handling and peer review which was not in line with the journal's ethical standards. Therefore, a decision has been made to retract this article. We did not find any evidence of misconduct by the authors. The authors have been informed of the decision to retract.
{"title":"Retraction: Manikam Babu, Thangaraju Jesudas. An artificial intelligence-based smart health system for biological cognitive detection based on wireless telecommunication. Comput Intell 38: 1365–1378, 2022 (10.1111/coin.12513)","authors":"","doi":"10.1111/coin.12678","DOIUrl":"https://doi.org/10.1111/coin.12678","url":null,"abstract":"<p>The above article, published online on 08 March 2022 in Wiley Online Library (wileyonlinelibrary.com), has been retracted by agreement between the Editor-in-Chief, Diana Inkpen, and Wiley Periodicals LLC. The article was published as part of a guest-edited special issue. Following publication, it came to our attention that two of those named as Guest Editors of this issue were being impersonated and/or misrepresented by a fraudulent entity. An investigation by the publisher found that all of the articles, including this one, experienced compromised editorial handling and peer review which was not in line with the journal's ethical standards. Therefore, a decision has been made to retract this article. We did not find any evidence of misconduct by the authors. The authors have been informed of the decision to retract.</p>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"40 3","pages":""},"PeriodicalIF":2.8,"publicationDate":"2024-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/coin.12678","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141315380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The above article, published online on 12 July 2022 in Wiley Online Library (wileyonlinelibrary.com), has been retracted by agreement between the Editor-in-Chief, Diana Inkpen, and Wiley Periodicals LLC. The article was published as part of a guest-edited special issue. Following publication, it came to our attention that two of those named as Guest Editors of this issue were being impersonated and/or misrepresented by a fraudulent entity. An investigation by the publisher found that all of the articles, including this one, experienced compromised editorial handling and peer review which was not in line with the journal's ethical standards. Therefore, a decision has been made to retract this article. We did not find any evidence of misconduct by the authors. The authors have been informed of the decision to retract.
{"title":"Retraction: Nehru Veerabatheran, Prabhu Venkatesan, Rakesh Kumar Mahendran. Denoising and segmentation of brain image by proficient blended threshold and conserve edge scrutinize technique. Comput Intell 40: e12542, 2024 (10.1111/coin.12542)","authors":"","doi":"10.1111/coin.12680","DOIUrl":"https://doi.org/10.1111/coin.12680","url":null,"abstract":"<p>The above article, published online on 12 July 2022 in Wiley Online Library (wileyonlinelibrary.com), has been retracted by agreement between the Editor-in-Chief, Diana Inkpen, and Wiley Periodicals LLC. The article was published as part of a guest-edited special issue. Following publication, it came to our attention that two of those named as Guest Editors of this issue were being impersonated and/or misrepresented by a fraudulent entity. An investigation by the publisher found that all of the articles, including this one, experienced compromised editorial handling and peer review which was not in line with the journal's ethical standards. Therefore, a decision has been made to retract this article. We did not find any evidence of misconduct by the authors. The authors have been informed of the decision to retract.</p>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"40 3","pages":""},"PeriodicalIF":2.8,"publicationDate":"2024-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/coin.12680","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141315386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The above article, published online on 21 February 2022 in Wiley Online Library (wileyonlinelibrary.com), has been retracted by agreement between the Editor-in-Chief, Diana Inkpen, and Wiley Periodicals LLC. The article was published as part of a guest-edited special issue. Following publication, it came to our attention that two of those named as Guest Editors of this issue were being impersonated and/or misrepresented by a fraudulent entity. An investigation by the publisher found that all of the articles, including this one, experienced compromised editorial handling and peer review which was not in line with the journal's ethical standards. Therefore, a decision has been made to retract this article. We did not find any evidence of misconduct by the authors. The authors have been informed of the decision to retract.
{"title":"Retraction: Meeran Sheriff, Rajagopal Gayathri. An enhanced ensemble machine learning classification method to detect attention deficit hyperactivity for various artificial intelligence and telecommunication applications. Comput Intell 38: 1327–1337, 2022 (10.1111/coin.12509)","authors":"","doi":"10.1111/coin.12673","DOIUrl":"https://doi.org/10.1111/coin.12673","url":null,"abstract":"<p>The above article, published online on 21 February 2022 in Wiley Online Library (wileyonlinelibrary.com), has been retracted by agreement between the Editor-in-Chief, Diana Inkpen, and Wiley Periodicals LLC. The article was published as part of a guest-edited special issue. Following publication, it came to our attention that two of those named as Guest Editors of this issue were being impersonated and/or misrepresented by a fraudulent entity. An investigation by the publisher found that all of the articles, including this one, experienced compromised editorial handling and peer review which was not in line with the journal's ethical standards. Therefore, a decision has been made to retract this article. We did not find any evidence of misconduct by the authors. The authors have been informed of the decision to retract.</p>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"40 3","pages":""},"PeriodicalIF":2.8,"publicationDate":"2024-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/coin.12673","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141315382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dental caries, a common oral disease, poses serious risks if untreated, necessitating effective preventive measures such as pit and fissure sealing. However, reliance on experienced dentists to detect caries or pits and fissures limits accessibility, potentially leading to missed treatment opportunities, especially among children. To bridge this gap, we apply deep-learning object detection to develop a method for automatically identifying caries and determining pit and fissure sealing requirements from smartphone oral photos. We test several detection models and adopt a tiling strategy to reduce information loss during image pre-processing. Our implementation achieves an mAP@0.5 of 72.3 with the YOLOXs model and the tiling strategy. We enhance accessibility by deploying the trained network as a WeChat applet on mobile devices, enabling in-home detection by parents or guardians. In addition, our data set of children's first permanent molars will aid the broader study of pediatric oral disease.
{"title":"Object detection for caries or pit and fissure sealing requirement in children's first permanent molars","authors":"Chenyao Jiang, Shiyao Zhai, Hengrui Song, Yuqing Ma, Yachen Fan, Yancheng Fang, Dongmei Yu, Canyang Zhang, Sanyang Han, Runming Wang, Yong Liu, Zhenglin Chen, Jianbo Li, Peiwu Qin","doi":"10.1111/coin.12653","DOIUrl":"https://doi.org/10.1111/coin.12653","url":null,"abstract":"<p>Dental caries, a common oral disease, poses serious risks if untreated, necessitating effective preventive measures like pit and fissure sealing. However, the reliance on experienced dentists for pit and fissures or caries detection limits accessibility, potentially leading to missed treatment opportunities, especially among children. To bridge this gap, we leverage deep learning in object detection to develop a method for autonomously identifying caries and determining pit and fissure sealing requirements using smartphone oral photos. We test several detection models and adopt a tiling strategy to reduce information loss during image pre-processing. Our implementation achieves 72.3 mAP.5 with the YOLOXs model and tiling strategy. We enhance accessibility by deploying the pre-trained network as a WeChat applet on mobile devices, enabling in-home detection by parents or guardians. In addition, our data set of children's first permanent molars will also aid in the broader study of pediatric oral disease.</p>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"40 3","pages":""},"PeriodicalIF":2.8,"publicationDate":"2024-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141298525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
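The tiling strategy mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the tile size, overlap, or merging rule, so the values and the function names `tile_boxes` and `to_full_image` below are assumptions for illustration only, not the authors' implementation. The idea is to cut a large smartphone photo into overlapping crops so that small lesions are not lost when the detector input is downscaled, and then to shift per-tile detections back into full-image coordinates.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_boxes(width: int, height: int, tile: int = 640,
               overlap: float = 0.2) -> List[Box]:
    """Cover a width x height image with overlapping square tiles.

    The tile size and overlap are assumed values, not taken from the paper.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    stride = max(1, int(tile * (1 - overlap)))

    def starts(size: int) -> List[int]:
        # An image smaller than one tile is covered by a single crop.
        if size <= tile:
            return [0]
        pos = list(range(0, size - tile, stride))
        pos.append(size - tile)  # last tile sits flush with the image edge
        return pos

    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in starts(height) for x in starts(width)]


def to_full_image(det: Box, tile_box: Box) -> Box:
    """Shift a detection from tile coordinates to full-image coordinates."""
    dx, dy = tile_box[0], tile_box[1]
    return (det[0] + dx, det[1] + dy, det[2] + dx, det[3] + dy)


if __name__ == "__main__":
    for box in tile_boxes(1000, 800):
        print(box)
```

In a full pipeline, detections that land in the overlap between two tiles would typically be deduplicated, for example with non-maximum suppression over the merged boxes. That step is omitted here.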
As industrial production grows in scale and complexity, the rapid localization and diagnosis of equipment failures has become a core technical challenge. In response to the demand for intelligent fault diagnosis in large-scale industrial equipment, this study presents “MultiCogniGraph”, a multi-hop reasoning diagnostic method that integrates multimodal data fusion, knowledge graphs, and graph convolutional networks (GCN). The method leverages Internet of Things (IoT) sensor data, small-sample imagery, and expert knowledge to comprehensively characterize the equipment state and accurately detect subtle distinctions in fault patterns. By using a knowledge graph to synthesize data from multiple sources and a GCN for deep reasoning, “MultiCogniGraph” achieves swift and effective fault localization and diagnosis. The integration of these techniques enhances not only the efficiency and accuracy of fault diagnosis but also its interpretability, marking a new direction in the field of intelligent fault diagnostics.
{"title":"MultiCogniGraph: A multimodal data fusion and graph convolutional network-based multi-hop reasoning method for large equipment fault diagnosis","authors":"Sen Chen, Jian Wang","doi":"10.1111/coin.12646","DOIUrl":"https://doi.org/10.1111/coin.12646","url":null,"abstract":"<p>As industrial production escalates in scale and complexity, the rapid localization and diagnosis of equipment failures have become a core technical challenge. In response to the demand for intelligent fault diagnosis in large-scale industrial equipment, this study presents “MultiCogniGraph”—a multi-hop reasoning diagnostic method that integrates multimodal data fusion, knowledge graphs, and graph convolutional networks (GCN). This method leverages internet of things (IoT) sensor data, small-sample imagery, and expert knowledge to comprehensively characterize the equipment state and accurately detect subtle distinctions in fault patterns. Utilizing a knowledge graph to synthesize data from multiple sources and deep reasoning with GCN, “MultiCogniGraph” achieves swift and effective fault localization and diagnosis. The integration of these techniques not only enhances the efficiency and accuracy of fault diagnosis but also its interpretability, marking a new direction in the field of intelligent fault diagnostics.</p>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"40 3","pages":""},"PeriodicalIF":2.8,"publicationDate":"2024-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141298472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
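As a rough illustration of how stacked graph convolutions support multi-hop reasoning over a knowledge graph, the sketch below implements the widely used GCN propagation rule H' = ReLU(Â H W), where Â is the symmetrically normalized adjacency matrix with self-loops. The three-node chain (sensor reading, component, fault), the identity weight matrix, and all function names are hypothetical. The abstract does not describe the MultiCogniGraph architecture at this level of detail.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def normalized_adjacency(edges: Sequence[Tuple[int, int]], n: int) -> Matrix:
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected graph."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x: Matrix, y: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]


def gcn_layer(a_hat: Matrix, h: Matrix, w: Matrix) -> Matrix:
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


if __name__ == "__main__":
    # Hypothetical chain: node 0 = sensor reading, 1 = component, 2 = fault.
    a_hat = normalized_adjacency([(0, 1), (1, 2)], n=3)
    h0 = [[1.0], [0.0], [0.0]]   # only the sensor node carries a signal
    w = [[1.0]]                  # identity weights, for illustration
    h1 = gcn_layer(a_hat, h0, w)
    h2 = gcn_layer(a_hat, h1, w)
    print("after 1 hop:", h1)
    print("after 2 hops:", h2)
```

After one layer the signal reaches only the neighboring component node. It takes a second layer to reach the fault node two hops away, which is the sense in which layer depth corresponds to reasoning hops.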
Although market demand for smart devices (SDs) in the Internet of Things (IoT) era is surging, the corresponding thunderstorm protection measures have attracted little attention. This paper presents a thunderstorm prediction method with elevation correction that aims to reduce thunderstorm damage to SDs by visually tracking thunderstorm activity. First, a self-made three-dimensional atmospheric electric field apparatus (3DAEFA) deployed in an IoT setting is developed to collect real-time atmospheric electric field (AEF) data. A 3DAEFA-based localization model is established, and the corrected localization formula is derived. AEF data predicted by the bi-directional long short-term memory (BiLSTM) model are fed into this formula to obtain thunderstorm point charge localization results. Then, the localization skill is evaluated. Finally, the proposed method is assessed in experiments under single and multiple point charge conditions. Ranging and elevation angle errors are reduced by at least 33.1% and 8.8%, respectively. In particular, the post-prediction correction reduces the deviation of fitted point charge moving paths by up to 0.189 km, demonstrating its practical value. Comparisons with radar charts and existing methods indicate that this method can effectively predict thunderstorms.
{"title":"BiLSTM-based thunderstorm prediction for IoT applications","authors":"Li Zhuang, Lin Zhu","doi":"10.1111/coin.12683","DOIUrl":"https://doi.org/10.1111/coin.12683","url":null,"abstract":"<p>Although the market demand for smart devices (SDs) in the Internet of Things (IoT) era is surging, the corresponding thunderstorm protection measures have rarely attracted attention. This paper presents a thunderstorm prediction method with elevation correction, to reduce the thunderstorm damage to SDs by visually tracking thunderstorm activities. First, a self-made three-dimensional atmospheric electric field apparatus (3DAEFA) deployed in IoT is developed to collect real-time AEF data. A 3DAEFA-based localization model is established, and the localization formula after correction is derived. AEF data predicted by the bi-directional long short-term memory (BiLSTM) model are input to this formula to obtain thunderstorm point charge localization results. Then, the localization skill is evaluated. Finally, the proposed method is assessed in experiments, under single and multiple point charge conditions. There are significant reductions of at least 33.1% and 8.8% in ranging and elevation angle errors, respectively. Particularly, this post-prediction correction reduces the deviation of fitted point charge moving paths by at most 0.189 km, demonstrating excellent application effects. Comparisons with radar charts and existing methods testify that this method can effectively predict thunderstorms.</p>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"40 3","pages":""},"PeriodicalIF":2.8,"publicationDate":"2024-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141298524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Click-through rate (CTR) prediction is a pivotal challenge in recommendation systems. Existing models are prone to disturbances from noise and redundant features, hindering their ability to fully capture the implicit and higher-order feature interactions present in sparse feature data. Moreover, conventional dual-tower models overlook the significance of layer-level feature interactions. To address these limitations, this article introduces Gate-enhanced Multi-space Interactive Neural Networks (GMINN), a novel model for CTR prediction. GMINN adopts a dual-tower architecture in which a multi-space interaction layer is introduced after each layer of the dual-tower deep neural network. This layer allocates features into multiple subspaces and employs matrix multiplication to establish layer-level interactions between the two towers. In addition, a field-aware gate mechanism is proposed to extract crucial latent information from the original features. Experimental validation on two publicly available datasets, Criteo and Avazu, demonstrates the superiority of the proposed GMINN model. Comparative analyses against baseline models show that GMINN improves AUC by up to 4.09% and reduces Logloss by up to 7.21%. Ablation experiments further validate the effectiveness of GMINN.
{"title":"GMINN: Gate-enhanced multi-space interaction neural networks for click-through rate prediction","authors":"Xingyu Feng, Xuekang Yang, Boyun Zhou","doi":"10.1111/coin.12645","DOIUrl":"https://doi.org/10.1111/coin.12645","url":null,"abstract":"<p>Click-through rate (CTR) prediction is a pivotal challenge in recommendation systems. Existing models are prone to disturbances from noise and redundant features, hindering their ability to fully capture implicit and higher-order feature interactions present in sparse feature data. Moreover, conventional dual-tower models overlook the significance of layer-level feature interactions. To address these limitations, this article introduces <b>G</b>ate-enhanced <b>M</b>ulti-space <b>I</b>nteractive <b>N</b>eural <b>N</b>etworks (GMINN), a novel model for CTR prediction. GMINN adopts a dual-tower architecture in which a multi-space interaction layer is introduced after each layer in the dual-tower deep neural network. This layer allocates features into multiple subspaces and employs matrix multiplication to establish layer-level interactions between the dual towers. Simultaneously, a field-aware gate mechanism is proposed to extract crucial latent information from the original features. Experimental validation on publicly available datasets, Criteo and Avazu, demonstrates the superiority of the proposed GMINN model. Comparative analyses against baseline models reveal that GMINN substantially improves up to 4.09% in AUC and a maximum reduction of 7.21% in Logloss. Additionally, ablation experiments provide further validation of the effectiveness of GMINN.</p>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"40 3","pages":""},"PeriodicalIF":2.8,"publicationDate":"2024-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141298526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
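One possible reading of the multi-space interaction layer described in the abstract above is sketched below: each tower's hidden vector is split into k equal subspaces, and a matrix product between the two stacks of subspace vectors yields a k x k table of cross-tower interaction scores. This is an assumption-laden illustration. The abstract does not specify how subspaces are formed or how the product is fed back into the towers, and the names `split_subspaces` and `multi_space_interaction` are hypothetical.

```python
from typing import List, Sequence


def split_subspaces(vec: Sequence[float], k: int) -> List[List[float]]:
    """Split a hidden vector into k equal-width subspace vectors."""
    if k <= 0 or len(vec) % k != 0:
        raise ValueError("vector length must be a positive multiple of k")
    d = len(vec) // k
    return [list(vec[i * d:(i + 1) * d]) for i in range(k)]


def multi_space_interaction(h_a: Sequence[float], h_b: Sequence[float],
                            k: int) -> List[List[float]]:
    """Return the k x k matrix A @ B^T of subspace-level dot products.

    Row i, column j holds the interaction between subspace i of tower A
    and subspace j of tower B (an assumed formulation).
    """
    a = split_subspaces(h_a, k)
    b = split_subspaces(h_b, k)
    return [[sum(x * y for x, y in zip(ra, rb)) for rb in b] for ra in a]


if __name__ == "__main__":
    tower_a = [1.0, 2.0, 3.0, 4.0]   # hidden activations of tower A at one layer
    tower_b = [1.0, 0.0, 0.0, 1.0]   # hidden activations of tower B at the same layer
    print(multi_space_interaction(tower_a, tower_b, k=2))
```

Computing this at every layer, rather than only on the final tower outputs, is what would make the interaction "layer-level" in the sense the abstract describes.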
Marija Kopanja, Stefan Hačko, Sanja Brdar, Miloš Savić
Cost-sensitive ensemble learning, a combination of ensemble learning and cost-sensitive learning, enables the generation of cost-sensitive tree-based ensemble models using the cost-sensitive decision tree (CSDT) learning algorithm. In general, tree-based models offer an intuitive graphical representation that can explain a model's decision-making process. However, the depth of the tree and the number of base models in the ensemble can limit how well the model's decision for each sample can be understood. CSDT models are widely used in finance (e.g., credit scoring and fraud detection) but lack effective explanation methods. We previously addressed this gap with the cost-sensitive tree Shapley Additive Explanation Method (CSTreeSHAP), a cost-sensitive tree explanation method for the single-tree CSDT model. Here, we extend that methodology to cost-sensitive ensemble models, particularly cost-sensitive random forest models. The paper presents the theoretical foundation and implementation details of CSTreeSHAP for both single CSDT and ensemble models. The usefulness of the proposed method is demonstrated by providing explanations for single and ensemble CSDT models trained on well-known benchmark credit scoring datasets. Finally, we apply our methodology to analyze the stability of explanations for those models compared with cost-insensitive tree-based models. Our analysis reveals statistically significant differences between SHAP values despite seemingly similar global feature importance plots of the models. This highlights the value of our methodology as a comprehensive tool for explaining CSDT models.
{"title":"Cost-sensitive tree SHAP for explaining cost-sensitive tree-based models","authors":"Marija Kopanja, Stefan Hačko, Sanja Brdar, Miloš Savić","doi":"10.1111/coin.12651","DOIUrl":"https://doi.org/10.1111/coin.12651","url":null,"abstract":"<p>Cost-sensitive ensemble learning as a combination of two approaches, ensemble learning and cost-sensitive learning, enables generation of cost-sensitive tree-based ensemble models using the cost-sensitive decision tree (CSDT) learning algorithm. In general, tree-based models characterize nice graphical representation that can explain a model's decision-making process. However, the depth of the tree and the number of base models in the ensemble can be a limiting factor in comprehending the model's decision for each sample. The CSDT models are widely used in finance (e.g., credit scoring and fraud detection) but lack effective explanation methods. We previously addressed this gap with cost-sensitive tree Shapley Additive Explanation Method (CSTreeSHAP), a cost-sensitive tree explanation method for the single-tree CSDT model. Here, we extend the introduced methodology to cost-sensitive ensemble models, particularly cost-sensitive random forest models. The paper details the theoretical foundation and implementation details of CSTreeSHAP for both single CSDT and ensemble models. The usefulness of the proposed method is demonstrated by providing explanations for single and ensemble CSDT models trained on well-known benchmark credit scoring datasets. Finally, we apply our methodology and analyze the stability of explanations for those models compared to the cost-insensitive tree-based models. Our analysis reveals statistically significant differences between SHAP values despite seemingly similar global feature importance plots of the models. This highlights the value of our methodology as a comprehensive tool for explaining CSDT models.</p>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"40 3","pages":""},"PeriodicalIF":2.8,"publicationDate":"2024-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141298523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
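For readers unfamiliar with the Shapley values that SHAP-style explanations build on, the following sketch computes them exactly by enumerating feature subsets. It is not CSTreeSHAP and not the polynomial-time tree algorithm. It is a brute-force reference on a toy value function, where `value(S)` stands in for a model's expected output (for instance, an expected cost) when only the features in S are known. The toy function and all names are assumptions.

```python
from itertools import combinations
from math import factorial
from typing import Callable, FrozenSet, List


def shapley_values(value: Callable[[FrozenSet[int]], float],
                   n: int) -> List[float]:
    """Exact Shapley values for n features by subset enumeration.

    Cost grows as O(2^n), so this is only practical for small n.
    """
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            # Weight of a coalition of size r that excludes feature i.
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for subset in combinations(others, r):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_value(s: FrozenSet[int]) -> float:
    """Hypothetical model: feature 0 adds 10; features 1 and 2 add 5 jointly."""
    total = 0.0
    if 0 in s:
        total += 10.0
    if 1 in s and 2 in s:
        total += 5.0
    return total


if __name__ == "__main__":
    print(shapley_values(toy_value, n=3))
```

The attributions sum to `value(all features) - value(no features)`, the additivity property that lets per-feature contributions be read as a decomposition of a single prediction. In this toy case the joint effect of features 1 and 2 is split evenly between them.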
Min Pan, Shuting Zhou, Teng Li, Yu Liu, Quanli Pei, Angela J. Huang, Jimmy X. Huang
BERT, a pre-trained language model (PLM) based on the Transformer encoder, has achieved state-of-the-art results in the field of Information Retrieval. Existing BERT-based ranking models divide documents into passages and aggregate passage-level relevance to rank the document list. However, these common score aggregation strategies cannot capture important semantic information such as document structure and have not been extensively studied. In this article, we propose a novel kernel-based score pooling system that captures document-level relevance by aggregating passage-level relevance. In particular, we propose and study several representative kernel pooling functions and several document ranking strategies based on passage-level relevance. Our proposed framework, KnBERT, naturally incorporates kernel functions at the passage level into the BERT-based re-ranking method, which provides a promising avenue for building universal retrieve-then-rerank information retrieval systems. Experiments conducted on two widely used TREC test collections, Robust04 and GOV2, show that KnBERT achieves significant improvements over other BERT-based ranking approaches in terms of MAP, P@20, and NDCG@20, with no additional, or even less, computation.
{"title":"Utilizing passage-level relevance and kernel pooling for enhancing BERT-based document reranking","authors":"Min Pan, Shuting Zhou, Teng Li, Yu Liu, Quanli Pei, Angela J. Huang, Jimmy X. Huang","doi":"10.1111/coin.12656","DOIUrl":"https://doi.org/10.1111/coin.12656","url":null,"abstract":"<p>The pre-trained language model (PLM) based on the Transformer encoder, namely BERT, has achieved state-of-the-art results in the field of Information Retrieval. Existing BERT-based ranking models divide documents into passages and aggregate passage-level relevance to rank the document list. However, these common score aggregation strategies cannot capture important semantic information such as document structure and have not been extensively studied. In this article, we propose a novel kernel-based score pooling system to capture document-level relevance by aggregating passage-level relevance. In particular, we propose and study several representative kernel pooling functions and several different document ranking strategies based on passage-level relevance. Our proposed framework KnBERT naturally incorporates kernel functions from the passage level into the BERT-based re-ranking method, which provides a promising avenue for building universal retrieval-then-rerank information retrieval systems. Experiments conducted on two widely used TREC Robust04 and GOV2 test datasets show that the KnBERT has made significant improvements over other BERT-based ranking approaches in terms of MAP, P@20, and NDCG@20 indicators with no extra or even less computations.</p>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"40 3","pages":""},"PeriodicalIF":2.8,"publicationDate":"2024-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/coin.12656","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141286951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
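To make the contrast between simple score aggregation and kernel-based pooling concrete, the sketch below turns a list of passage-level relevance scores into a document-level representation in two ways: the common first/max/sum aggregations, and a bank of Gaussian (RBF) kernels that soft-counts how many passages fall near each relevance level. The kernel centers, the bandwidth, and the function names are assumptions chosen for illustration. The abstract does not state which kernel functions KnBERT uses.

```python
import math
from typing import List, Sequence


def first_p(scores: Sequence[float]) -> float:
    """Document score = score of the first passage."""
    return scores[0]


def max_p(scores: Sequence[float]) -> float:
    """Document score = best passage score."""
    return max(scores)


def sum_p(scores: Sequence[float]) -> float:
    """Document score = sum of all passage scores."""
    return sum(scores)


def gaussian_kernel_pool(scores: Sequence[float],
                         mus: Sequence[float] = (0.1, 0.3, 0.5, 0.7, 0.9),
                         sigma: float = 0.1) -> List[float]:
    """Soft-count passages around each kernel center, then log-scale.

    Each feature answers "roughly how many passages score near mu?",
    keeping distributional information that max or sum would discard.
    """
    feats = []
    for mu in mus:
        soft_count = sum(math.exp(-((s - mu) ** 2) / (2 * sigma ** 2))
                         for s in scores)
        feats.append(math.log1p(soft_count))
    return feats


if __name__ == "__main__":
    passage_scores = [0.9, 0.85, 0.2]  # hypothetical per-passage relevance in [0, 1]
    print("FirstP:", first_p(passage_scores))
    print("MaxP:", max_p(passage_scores))
    print("SumP:", sum_p(passage_scores))
    print("Kernel features:", gaussian_kernel_pool(passage_scores))
```

In a learned ranker, the kernel features would typically be combined by a small trainable layer to produce the final document score. Because pooling operates on scores that are already computed, it adds little cost on top of the passage-level BERT inference.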