
Expert Systems with Applications: Latest Publications

Dual-stage explainable ensemble learning model for diabetes diagnosis
IF 7.5 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-22 · DOI: 10.1016/j.eswa.2025.126899
Ibrahim A. Elgendy , Mohamed Hosny , Mousa Ahmad Albashrawi , Shrooq Alsenan
Early diagnosis of diabetes is crucial for effective management and prevention of complications. However, traditional diagnostic methods are often constrained by the complexity of clinical datasets. To this end, this study proposes a novel explainable machine learning (ML) framework to enhance diabetes prediction. Specifically, the developed methodology involves the detection of outliers using the local outlier factor and data reconstruction through a sparse autoencoder. Subsequently, multiple imputation strategies are employed to effectively address missing or erroneous data, while the synthetic minority oversampling technique is applied to mitigate class imbalance. Afterward, a stacking ensemble model, consisting of seven base ML models, is developed for classification, and the outputs of these base models are aggregated using four meta-models. To enhance interpretability, two layers of model explainability are implemented: feature importance analysis is conducted to identify the significance of input variables, and Shapley additive explanations are employed to assess the contribution of each base model to the meta-model predictions. The results demonstrated that replacing missing data with zeros or mean values led to a noticeable decrease in accuracy compared to K-nearest neighbor imputation or removing samples. Notably, hypertension and kidney failure are pivotal features in the diabetes diagnosis process. Among the base models, the Extra Trees model had the most significant impact on the meta-model decisions. The stacking multi-layer perceptron model achieved the highest accuracy of 92.54% for diabetes detection, surpassing the performance of standalone ML techniques. This approach enhances diagnostic precision and provides transparency in model predictions, which is essential for clinical applications.
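To make the pipeline concrete, here is a minimal sketch of the imputation, oversampling, and stacking stages using scikit-learn and imbalanced-learn; the base learners, meta-model, hyperparameters, and synthetic data are illustrative stand-ins, not the authors' exact configuration (which uses seven base models and four meta-models).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier, StackingClassifier
from sklearn.impute import KNNImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from imblearn.over_sampling import SMOTE

# Synthetic imbalanced stand-in for a clinical dataset, with missing entries.
X, y = make_classification(n_samples=1000, n_features=10, weights=[0.8], random_state=0)
X[np.random.default_rng(0).random(X.shape) < 0.05] = np.nan

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

imputer = KNNImputer(n_neighbors=5)            # KNN imputation outperformed zero/mean fills
X_tr = imputer.fit_transform(X_tr)
X_te = imputer.transform(X_te)
X_tr, y_tr = SMOTE(random_state=0).fit_resample(X_tr, y_tr)  # rebalance the minority class

stack = StackingClassifier(                     # three of the seven base models shown
    estimators=[("extra_trees", ExtraTreesClassifier(random_state=0)),
                ("random_forest", RandomForestClassifier(random_state=0)),
                ("logreg", LogisticRegression(max_iter=1000))],
    final_estimator=MLPClassifier(max_iter=2000, random_state=0),  # MLP meta-model
)
stack.fit(X_tr, y_tr)
print("test accuracy:", stack.score(X_te, y_te))
```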
{"title":"Dual-stage explainable ensemble learning model for diabetes diagnosis","authors":"Ibrahim A. Elgendy ,&nbsp;Mohamed Hosny ,&nbsp;Mousa Ahmad Albashrawi ,&nbsp;Shrooq Alsenan","doi":"10.1016/j.eswa.2025.126899","DOIUrl":"10.1016/j.eswa.2025.126899","url":null,"abstract":"<div><div>Early diagnosis of diabetes is crucial for effective management and prevention of complications. However, traditional diagnostic methods are often constrained by the complexity of clinical datasets. To this end, this study proposes a novel explainable machine learning (ML) framework to enhance diabetes prediction. Specifically, the developed methodology involves the detection of outliers using local outlier factor and data reconstruction through a sparse autoencoder. Subsequently, multiple imputation strategies are employed to effectively address missing or erroneous data, while the synthetic minority oversampling technique is applied to mitigate class imbalance. Afterward, a stacking ensemble model, consisting of seven base ML models, is developed for classification, and the outputs of these base models are aggregated using four meta models. To enhance interpretability, two layers of model explainability are implemented. Feature importance analysis is conducted to identify the significance of input variables and Shapley additive explanations is employed to assess the contribution of each base model to the meta model predictions. The results demonstrated that replacing missing data with zeros or mean values led to a noticeable decrease in accuracy compared to K-nearest neighbor imputation or removing samples. Notably, hypertension and kidney failure are pivotal features in the diabetes diagnosis process. Among the base models, Extra Trees model had the most significant impact on the meta model decisions. The stacking multi-layer perceptron model achieved the highest accuracy of 92.54% for diabetes detection, surpassing the performance of standalone ML techniques. This approach enhances diagnostic precision and provides transparency in model predictions, essential for clinical applications.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"274 ","pages":"Article 126899"},"PeriodicalIF":7.5,"publicationDate":"2025-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143488616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Knowing What and Why: Causal emotion entailment for emotion recognition in conversations
IF 7.5 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-22 · DOI: 10.1016/j.eswa.2025.126924
Hao Liu , Runguo Wei , Geng Tu , Jiali Lin , Dazhi Jiang , Erik Cambria
The clues for eliciting emotion deserve attention in the realm of Emotion Recognition in Conversations (ERC). In an ideal dialog system, comprehending emotions alone is insufficient; understanding the causes of emotion is also imperative. However, previous research has long overlooked the integration of causal emotion entailment. Therefore, an emotion-cause hybrid framework that utilizes causal emotion entailment (CEE) is proposed to advance the ERC task. Specifically, the presented method integrates the information of the emotion-triggering cause clause, extracted through the CEE module, into the utterance representations obtained by the ERC model. Moreover, a Bidirectional Reasoning Network (BRN) is designed to extract emotional cues and simulate complex human emotional cognition. Experimental results demonstrate that our framework achieves a new state-of-the-art performance on different datasets, indicating that the proposed framework can improve the model's ability to understand emotions.
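As a hedged sketch of the core fusion idea (injecting a CEE-derived cause representation into the utterance representation before emotion classification), the following PyTorch module uses gated additive fusion; the module name, gating mechanism, and dimensions are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CauseAwareFusion(nn.Module):
    """Hypothetical fusion head: gate how much cause-clause info enters the utterance."""
    def __init__(self, dim: int, n_emotions: int):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)        # decides how much cause info to inject
        self.classifier = nn.Linear(dim, n_emotions)

    def forward(self, utt: torch.Tensor, cause: torch.Tensor) -> torch.Tensor:
        # utt, cause: (batch, dim) utterance and CEE-extracted cause representations
        g = torch.sigmoid(self.gate(torch.cat([utt, cause], dim=-1)))
        fused = utt + g * cause                    # cause-augmented utterance representation
        return self.classifier(fused)

logits = CauseAwareFusion(dim=256, n_emotions=7)(torch.randn(4, 256), torch.randn(4, 256))
print(logits.shape)  # torch.Size([4, 7])
```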
{"title":"Knowing What and Why: Causal emotion entailment for emotion recognition in conversations","authors":"Hao Liu ,&nbsp;Runguo Wei ,&nbsp;Geng Tu ,&nbsp;Jiali Lin ,&nbsp;Dazhi Jiang ,&nbsp;Erik Cambria","doi":"10.1016/j.eswa.2025.126924","DOIUrl":"10.1016/j.eswa.2025.126924","url":null,"abstract":"<div><div>The clues for eliciting emotion deserve attention in the realm of Emotion Recognition in Conversations (ERC). In an ideal dialog system, comprehending emotions alone is insufficient, and underlying the causes of emotion is also imperative. However, previous research overlooked the integration of causal emotion entailment for a prolonged period. Therefore, an emotion-cause hybrid framework that utilizes causal emotion entailment (CEE) is proposed to promote the ERC task. Specifically, the presented method integrates the information of the cause clause extracted through the CEE module that triggers emotions into the utterance representations obtained by the ERC model. Moreover, a Bidirectional Reasoning Network (BRN) is designed to extract emotional cues to simulate human complex emotional cognitive behavior. Experimental results demonstrate that our framework achieves a new state-of-the-art performance on different datasets, indicating that the proposed framework can improve the model’s ability to emotion understanding.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"274 ","pages":"Article 126924"},"PeriodicalIF":7.5,"publicationDate":"2025-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143474274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
WTSF-ReID: Depth-driven Window-oriented Token Selection and Fusion for multi-modality vehicle re-identification with knowledge consistency constraint
IF 7.5 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-22 · DOI: 10.1016/j.eswa.2025.126921
Zhi Yu , Zhiyong Huang , Mingyang Hou , Yan Yan , Yushi Liu
Multi-modality vehicle re-identification, as a crucial task in intelligent transportation systems, aims to retrieve specific vehicles across non-overlapping cameras by amalgamating visible and infrared images. The main challenge lies in mitigating inter-modality discrepancies and extracting modality-irrelevant vehicle information. Existing methods concentrate on the integration of distinct modalities, but less attention is paid to modality-specific crucial information. To this end, we propose a novel depth-driven Window-oriented Token Selection and Fusion network, designated WTSF-ReID. Specifically, WTSF-ReID comprises three distinct modules. The first is a Multi-modality General Feature Extraction (MGFE) module, which employs a weight-shared vision transformer to extract features from multi-modality images. The second is a depth-driven Window-oriented Token Selection and Fusion (WTSF) module, which uses local-to-global windows to select the most significant tokens, followed by token fusion and feature aggregation, extracting modality-specific crucial information while mitigating inter-modality discrepancies. Finally, to further reduce inter-modality heterogeneity and enhance feature discriminability, a Knowledge Consistency Constraint (KCC) loss is constructed that simultaneously deploys an inter-modality token selection constraint, a modality center constraint, and a modality triplet constraint. Extensive experiments on popular datasets demonstrate competitive performance against state-of-the-art methods. The datasets and codes are available at https://github.com/unicofu/WTSF-ReID.
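The window-oriented token selection step can be illustrated as follows: tokens are grouped into local windows, scored, and only the top-k per window are kept for fusion. The learned linear scorer and all shapes here are assumptions for illustration; the paper's selection criterion and fusion design may differ.

```python
import torch
import torch.nn as nn

def select_tokens(x: torch.Tensor, window: int, k: int, scorer: nn.Linear) -> torch.Tensor:
    # x: (batch, n_tokens, dim); n_tokens must be divisible by window
    b, n, d = x.shape
    xw = x.view(b, n // window, window, d)             # group tokens into local windows
    scores = scorer(xw).squeeze(-1)                    # (b, n_windows, window) token scores
    idx = scores.topk(k, dim=-1).indices.unsqueeze(-1).expand(-1, -1, -1, d)
    kept = torch.gather(xw, 2, idx)                    # keep the top-k tokens per window
    return kept.reshape(b, -1, d)                      # compact token set passed to fusion

tokens = torch.randn(2, 196, 768)                      # e.g., ViT-B/16 patch tokens
kept = select_tokens(tokens, window=14, k=4, scorer=nn.Linear(768, 1))
print(kept.shape)  # torch.Size([2, 56, 768])
```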
{"title":"WTSF-ReID: Depth-driven Window-oriented Token Selection and Fusion for multi-modality vehicle re-identification with knowledge consistency constraint","authors":"Zhi Yu ,&nbsp;Zhiyong Huang ,&nbsp;Mingyang Hou ,&nbsp;Yan Yan ,&nbsp;Yushi Liu","doi":"10.1016/j.eswa.2025.126921","DOIUrl":"10.1016/j.eswa.2025.126921","url":null,"abstract":"<div><div>Multi-modality vehicle re-identification, as a crucial task in intelligent transportation system, aims to retrieve specific vehicles across non-overlapping cameras by amalgamating visible and infrared images. The main challenge lies in mitigating inter-modality discrepancies and extracting modality-irrelevant vehicle information. Existing methods concentrate on the integration of distinct modalities, but less attention is paid to the modality-specific crucial information. To this end, we propose a novel depth-driven Window-oriented Token Selection and Fusion network, designated as WTSF-ReID. Specifically, WTSF-ReID is comprised of three distinct modules. The initial component is a Multi-modality General Feature Extraction (MGFE) module, which employs a weight-shared vision transformer to extract features from multi-modality images. The subsequent component is a depth-driven Window-oriented Token Selection and Fusion (WTSF) module, which implements local-to-global windows to select the significant tokens, followed by token fusion and feature aggregation to extract modality-specific crucial information while mitigating inter-modality discrepancies. Finally, to further reduce inter-modality heterogeneity and enhance feature discriminability, a Knowledge Consistency Constraint (KCC) loss simultaneously deploying inter-modality token selection constraint, modality center constraint, and modality triplet constraint is constructed. Extensive experiments on the popular datasets demonstrate the competitive performance against state-of-the-art methods. The datasets and codes are available at <span><span>https://github.com/unicofu/WTSF-ReID</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"274 ","pages":"Article 126921"},"PeriodicalIF":7.5,"publicationDate":"2025-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143474169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Risk assessment for digital transformation projects in construction Enterprises: An enhanced FMEA model
IF 7.5 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-22 · DOI: 10.1016/j.eswa.2025.126991
Tangzhenhao Li , Jianxin You , Emel Aktas , Yongxin Dong , Miying Yang
The digital transformation of the construction industry is crucial for advancing global digital economies, but it involves significant risks that require a standardized and robust assessment methodology. This paper presents an enhanced Failure Mode and Effect Analysis (FMEA) model that integrates the Multiple Attribute Border Approximation Area Comparison (MABAC) method with Grey Relational Analysis (GRA). Unlike previous approaches, this integration aligns grey relational changes with border approximation vector components, capturing both positive and negative correlations between failure modes. This enhances the prioritization process by distinguishing failure modes that may amplify or mitigate each other's impact, leading to more precise risk assessments and mitigation strategies. The model also employs interval numbers instead of crisp numbers to reduce the information loss caused by ambiguities in heterogeneous expert evaluations. Applied to a real-life case study, the improved model effectively accommodates biases and hesitations in expert decision-making, enhancing the accuracy and reliability of risk assessments in digital transformation projects. The findings highlight the model's potential as a comprehensive and reliable framework for identifying, prioritizing, and mitigating risks in the digital transformation of the construction industry.
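To make the MABAC core concrete, the toy sketch below ranks interval-valued failure modes via normalization, weighting, a geometric-mean border approximation vector, and signed distances to it. The endpoint-wise interval handling and the data are illustrative simplifications; the paper's interval arithmetic and its GRA coupling are richer.

```python
import numpy as np

# 3 failure modes x 2 risk criteria; each cell is an interval [lo, hi] on a 1-9 scale
X = np.array([[[6., 7.], [4., 5.]],
              [[3., 4.], [7., 8.]],
              [[5., 6.], [5., 6.]]])
w = np.array([0.6, 0.4])                          # criteria weights
m = X.shape[0]                                    # number of failure modes

lo = X[..., 0].min(axis=0)                        # smallest lower bound per criterion
hi = X[..., 1].max(axis=0)                        # largest upper bound per criterion
N = (X - lo[None, :, None]) / (hi - lo)[None, :, None]  # normalize intervals to [0, 1]
V = w[None, :, None] * (N + 1.0)                  # weighted decision matrix
G = V.prod(axis=0) ** (1.0 / m)                   # border approximation area vector, (n, 2)
Q = V - G[None]                                   # signed distances to the border
score = Q.sum(axis=(1, 2)) / 2.0                  # midpoint of each mode's interval sum
print(np.argsort(-score))                         # failure modes ranked by priority
```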
{"title":"Risk assessment for digital transformation projects in construction Enterprises: An enhanced FMEA model","authors":"Tangzhenhao Li ,&nbsp;Jianxin You ,&nbsp;Emel Aktas ,&nbsp;Yongxin Dong ,&nbsp;Miying Yang","doi":"10.1016/j.eswa.2025.126991","DOIUrl":"10.1016/j.eswa.2025.126991","url":null,"abstract":"<div><div>The digital transformation of the construction industry is crucial for advancing global digital economies, but it involves significant risks that require a standardized and robust assessment methodology. This paper presents an enhanced Failure Mode and Effect Analysis (FMEA) model that integrates the Multiple Attribute Border Approximation Area Comparison (MABAC) method with Grey Relational Analysis (GRA). Unlike previous approaches, this integration aligns grey relational changes with border approximation vector components, capturing both positive and negative correlations between modes. This enhances the prioritization process by distinguishing failure modes that may amplify or mitigate each other’s impact, leading to more precise risk assessments and mitigation strategies. The model also employs interval numbers instead of crisp numbers to reduce information loss from decision-making ambiguities caused by heterogeneous expert evaluations. Applied in a real-life case study, the improved model effectively accommodates biases and hesitations in expert decision-making, enhancing the accuracy and reliability of risk assessments in digital transformation projects. The findings highlight the model’s potential as a comprehensive and reliable framework for identifying, prioritizing, and mitigating risks in the digital transformation of the construction industry.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"274 ","pages":"Article 126991"},"PeriodicalIF":7.5,"publicationDate":"2025-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143480046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Embedded Separate Deep Localization Feature Information Vision Transformer for Hash Image Retrieval
IF 7.5 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-22 · DOI: 10.1016/j.eswa.2025.126902
Jing Zhang , Shuli Cheng , Liejun Wang
The development of multimedia technology has led to an increasing number of images, making the search for similar images an urgent need in daily life. Hash image retrieval has gradually dominated the field of image retrieval due to its advantages of computational efficiency and high accuracy. Currently, image retrieval algorithms based on Convolutional Neural Network (CNN) and Vision Transformer (ViT) remain inadequate in extracting target category features, ignoring local fine-grained features, which affects retrieval accuracy. This paper proposes an Embedded Separate Deep Localization Feature Information Vision Transformer for Hash Image Retrieval. Firstly, based on the diversity of image feature scales, a Channel Separation Attention Embedding Block (CSAE Block) is designed within the deep semantic feature layer. This block not only extracts global features but also incorporates a separated local attention branch to capture local features of objects at various scales. This enhances the output of deep features, providing rich semantic information for discrete mapping. Secondly, we design a quantization function that promotes the discreteness of hash codes, forcing the discrete values of the model output towards ±1. This ensures that the binary output of the hash code is more stable and representative. Finally, we conduct extensive experiments with the proposed algorithm on four public image retrieval datasets: MS-COCO, NUS-WIDE, ImageNet and CIFAR-10, achieving excellent retrieval performance with accuracies of 93.94%, 89.26%, 92.54% and 96.63%, respectively.
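The ±1-promoting quantization idea can be sketched with a standard penalty of the form (|h| - 1)^2, which is zero exactly when every code entry is ±1; whether the paper uses exactly this form is not stated here, so treat it as an illustrative stand-in.

```python
import torch

def quantization_loss(h: torch.Tensor) -> torch.Tensor:
    # h: (batch, n_bits) real-valued codes from the network's hash head
    return ((h.abs() - 1.0) ** 2).mean()   # zero exactly when every entry is +1 or -1

h = torch.tanh(torch.randn(8, 64))          # tanh keeps continuous codes in (-1, 1)
loss = quantization_loss(h)                 # added to the retrieval training objective
binary_codes = torch.sign(h)                # discrete codes used at retrieval time
```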
{"title":"Embedded Separate Deep Localization Feature Information Vision Transformer for Hash Image Retrieval","authors":"Jing Zhang,&nbsp;Shuli Cheng ,&nbsp;Liejun Wang","doi":"10.1016/j.eswa.2025.126902","DOIUrl":"10.1016/j.eswa.2025.126902","url":null,"abstract":"<div><div>The development of multimedia technology has led to an increasing number of images, making the search for similar images an urgent need in daily life. Hash image retrieval has gradually dominated the field of image retrieval due to its advantages of computational efficiency and high accuracy. Currently, image retrieval algorithms based on Convolutional Neural Network (CNN) and Vision Transformer (ViT) remain inadequate in extracting target category features, ignoring local fine-grained features, which affects retrieval accuracy. This paper proposes an Embedded Separate Deep Localization Feature Information Vision Transformer for Hash Image Retrieval. Firstly, based on the diversity of image feature scales, a Channel Separation Attention Embedding Block (CSAE Block) is designed within the deep semantic feature layer. This block not only extracts global features but also incorporates a separated local attention branch to capture local features of objects at various scales. This enhances the output of deep features, providing rich semantic information for discrete mapping. Secondly, we design a quantization function that promotes the discreteness of hash codes, forcing the discrete values of the model output towards ±1. This ensures that the binary output of the hash code is more stable and representative. Finally, we conduct extensive experiments with the proposed algorithm on four public image retrieval datasets: MS-COCO, NUS-WIDE, ImageNet and CIFAR-10, achieving excellent retrieval performance with accuracies of 93.94%, 89.26%, 92.54% and 96.63%, respectively.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"274 ","pages":"Article 126902"},"PeriodicalIF":7.5,"publicationDate":"2025-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143474272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
TMC-Net: A temporal multivariate correction network in temperature forecasting
IF 7.5 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-22 · DOI: 10.1016/j.eswa.2025.127015
Wei Fang , Zhong Yuan , Binglun Wang
Numerical weather prediction and large meteorological models have emerged as the predominant methods for modern temperature forecasting, with continuous advances towards higher resolution and accuracy in recent years. However, as the forecast lead time increases, errors inevitably accumulate, necessitating bias correction techniques to mitigate these inaccuracies. Existing bias correction models, however, exhibit several limitations, including suboptimal correction performance and insufficient utilization of historical information. To address these shortcomings, we propose a novel bias correction model called the Temporal Multivariate Correction Net (TMC-Net). The proposed model is composed of three principal modules: a Temporal Extraction Module, which captures the temporal variation patterns of forecast errors by accounting for factors such as seasonality and forecast lead time, making full use of historical information; a Multi-scale Fusion Module, which integrates multi-scale features from multiple variables and selects the most effective ones; and a Transformer-based High-order Feature Fusion Module, which performs a deep fusion of interactive features among multiple variables. Empirical results from applying TMC-Net to correct 2-m temperature forecasts from the ECMWF HRES, ECMWF ENS, and Pangu models for lead times ranging from 12 to 240 h demonstrate that TMC-Net can reduce forecast errors by 0.4 °C, enhance forecast accuracy by 5%, and increase the anomaly correlation coefficient by 0.2 within that range. These findings highlight the efficacy of TMC-Net in mitigating numerical forecast errors and improving forecast accuracy, indicating its potential application in high-resolution temperature forecasting.
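Viewed abstractly, this kind of bias correction is residual learning: predict the forecast error from context features such as lead time and seasonality, then subtract it from the raw forecast. The toy sketch below uses synthetic data and a gradient-boosting regressor as a stand-in for TMC-Net.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 5000
lead_h = rng.choice(np.arange(12, 241, 12), size=n)          # lead time in hours
doy = rng.integers(1, 366, size=n)                           # day of year (seasonality)
truth = 15 + 10 * np.sin(2 * np.pi * doy / 365)              # synthetic 2-m temperature
bias = 0.01 * lead_h + 0.5 * np.cos(2 * np.pi * doy / 365)   # synthetic error pattern
forecast = truth + bias + rng.normal(0, 0.3, size=n)

X = np.column_stack([lead_h,
                     np.sin(2 * np.pi * doy / 365),
                     np.cos(2 * np.pi * doy / 365)])
err_model = GradientBoostingRegressor().fit(X, forecast - truth)  # learn the error
corrected = forecast - err_model.predict(X)                       # subtract it
print("MAE before:", np.abs(forecast - truth).mean(),
      "after:", np.abs(corrected - truth).mean())
```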
{"title":"TMC-Net: A temporal multivariate correction network in temperature forecasting","authors":"Wei Fang ,&nbsp;Zhong Yuan ,&nbsp;Binglun Wang","doi":"10.1016/j.eswa.2025.127015","DOIUrl":"10.1016/j.eswa.2025.127015","url":null,"abstract":"<div><div>Numerical weather prediction and meteorological grand models have emerged as the predominant methods for modern temperature forecasting, with continuous advancements towards higher resolution and accuracy in recent years. However, as the forecast lead time increases, errors inevitably accumulate, necessitating the application of bias correction techniques to mitigate these inaccuracies. Existing bias correction models, however, exhibit several limitations, including suboptimal correction performance and insufficient utilization of historical information. To address these shortcomings, we propose a novel bias correction model called the Temporal Multivariate Correction Net (TMC-Net). The proposed model is composed of three principal modules: a Temporal Extraction Module, which captures the temporal variation patterns of forecast errors by accounting for factors such as seasonality and forecast lead time, making full use of historical information; a Multi-scale Fusion Module, which integrates multi-scale features from multiple variables and selects the most effective features; and a Transformer-based High-order Feature Fusion Module, which performs a deep fusion of interactive features among multiple variables. Empirical results, derived from applying TMC-Net to correct 2-m temperature forecasts from ECMWF HRES, ECMWF ENS, and Pangu models for lead times ranging from 12 to 240 h, demonstrate that TMC-Net can reduce forecast errors by 0.4 °C, enhance forecast accuracy by 5 %, and increase the anomaly correlation coefficient by 0.2 within the 12 to 240-h forecast range. These findings highlight the efficacy of TMC-Net in mitigating numerical forecast errors and improving forecast accuracy, indicating its potential application in high-resolution temperature forecasting.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"274 ","pages":"Article 127015"},"PeriodicalIF":7.5,"publicationDate":"2025-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143488655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
EIKA: Explicit & Implicit Knowledge-Augmented Network for entity-aware sports video captioning
IF 7.5 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-22 · DOI: 10.1016/j.eswa.2025.126906
Zeyu Xi, Ge Shi, Haoying Sun, Bowen Zhang, Shuyi Li, Lifang Wu
Sports video captioning in real application scenarios requires both entities and specific scenes. However, it is difficult to extract this fine-grained information solely from the video content. This paper introduces an Explicit & Implicit Knowledge-Augmented Network for Entity-Aware Sports Video Captioning (EIKA), which leverages both explicit game-related knowledge (i.e., the set of involved player entities) and implicit visual scene knowledge extracted from the training set. Our Entity-Video Interaction Module (EVIM) and Video-Knowledge Interaction Module (VKIM) enhance the extraction of entity-related and scene-specific video features, respectively. The spatiotemporal information in video is encoded by the Spatial-Temporal Modeling Module (STMM). The designed Scene-To-Entity (STE) decoder fully utilizes the two kinds of knowledge to generate informative captions with a distributed decoding approach. Extensive evaluations on the VC-NBA-2022, Goal, and NSVA datasets demonstrate that our method achieves leading performance compared with existing methods.
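A hedged sketch of the knowledge-augmentation idea: video features cross-attend over a bank of entity-knowledge embeddings before decoding. The single attention layer and all names below are illustrative stand-ins for the EVIM/STE design, not the paper's implementation.

```python
import torch
import torch.nn as nn

class EntityAttention(nn.Module):
    """Hypothetical module: enrich video features with an entity-knowledge bank."""
    def __init__(self, dim: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

    def forward(self, video: torch.Tensor, entities: torch.Tensor) -> torch.Tensor:
        # video: (b, frames, dim); entities: (b, n_entities, dim) knowledge embeddings
        ctx, _ = self.attn(query=video, key=entities, value=entities)
        return video + ctx            # entity-aware video representation for the decoder

out = EntityAttention(512)(torch.randn(2, 16, 512), torch.randn(2, 10, 512))
print(out.shape)  # torch.Size([2, 16, 512])
```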
{"title":"EIKA: Explicit & Implicit Knowledge-Augmented Network for entity-aware sports video captioning","authors":"Zeyu Xi,&nbsp;Ge Shi,&nbsp;Haoying Sun,&nbsp;Bowen Zhang,&nbsp;Shuyi Li,&nbsp;Lifang Wu","doi":"10.1016/j.eswa.2025.126906","DOIUrl":"10.1016/j.eswa.2025.126906","url":null,"abstract":"<div><div>Sports video captioning in real application scenarios requires both entities and specific scenes. However, it is difficult to extract this fine-grained information solely from the video content. This paper introduces an Explicit &amp; Implicit Knowledge-Augmented Network for Entity-Aware Sports Video Captioning (EIKA), which leverages both explicit game-related knowledge (i.e., the set of involved player entities) and implicit visual scene knowledge extracted from the training set. Our innovative Entity-Video Interaction Module (EVIM) and Video-Knowledge Interaction Module (VKIM) are instrumental in enhancing the extraction of entity-related and scene-specific video features, respectively. The spatiotemporal information in video is encoded by introducing the Spatial-Temporal Modeling Module (STMM). And the designed Scene-To-Entity (STE) decoder fully utilizes the two kinds of knowledge to generate informative captions with the distributed decoding approach. Extensive evaluations on the VC-NBA-2022, Goal and NSVA datasets demonstrate that our method has the leading performance compared with existing methods.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"274 ","pages":"Article 126906"},"PeriodicalIF":7.5,"publicationDate":"2025-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143474351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Dominance relation-based feature selection for interval-valued multi-label ordered information system
IF 7.5 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-22 · DOI: 10.1016/j.eswa.2025.126898
Yujie Qin , Guoping Lin , Yidong Lin , Yi Kou , Wenyue Hu
Multi-label learning addresses situations where an instance is linked to several labels. Existing multi-label feature selection has mainly addressed single-valued problems, while research on attribute reduction for interval-valued multi-label systems has yet to be reported. Exploring how to apply the dominance principle to interval-valued multi-label ordered data is therefore a promising research direction. In this article, a new feature selection method is introduced, aiming to identify a more relevant and compact subset of features by incorporating label correlations and the dominance principle. First, we combine multi-label learning with interval-valued information systems and design a new information system. Second, to simplify knowledge representation, we discuss the dominance principle in interval-valued multi-label information systems. On this basis, we present a novel method for generating reduction information for each label and introduce a label-correlation learning approach that exploits the overlap of this reduction information. Subsequently, an innovative feature selection algorithm utilizing dominance-based rough sets is developed to efficiently filter out redundant features in the feature space. Finally, extensive experiments on nine multi-label datasets were performed, and the results confirm that the proposed algorithms surpass six state-of-the-art methods in performance and exhibit robustness.
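One common way to realize dominance on interval values is endpoint-wise comparison: object x dominates object y on an attribute when both its lower and upper bounds are at least y's. The sketch below computes a dominating set under this assumption; the paper's relation may instead use a possibility-degree threshold.

```python
import numpy as np

def dominates(x: np.ndarray, y: np.ndarray) -> bool:
    # x, y: (n_attributes, 2) arrays of [lower, upper] interval values;
    # x dominates y when every endpoint of x is >= the matching endpoint of y
    return bool(np.all(x >= y))

objects = np.array([[[5, 7], [3, 4]],
                    [[4, 6], [2, 3]],
                    [[6, 8], [1, 2]]], dtype=float)

# dominating set of object 1: indices of objects that dominate it
print([i for i, o in enumerate(objects) if dominates(o, objects[1])])  # [0, 1]
```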
{"title":"Dominance relation-based feature selection for interval-valued multi-label ordered information system","authors":"Yujie Qin ,&nbsp;Guoping Lin ,&nbsp;Yidong Lin ,&nbsp;Yi Kou ,&nbsp;Wenyue Hu","doi":"10.1016/j.eswa.2025.126898","DOIUrl":"10.1016/j.eswa.2025.126898","url":null,"abstract":"<div><div>Multi-label learning addresses situations where a instance is linked to several labels. Existing multi-label feature selection has mainly addressed single-valued problems, while research on attribute reduction for interval-valued multi-label systems has yet to be reported. And explore how to apply the dominance principle to interval-valued multi-label ordered data is a promising area for future research. In this article, a new feature selection method was introduced, aiming to identify a more relevant and compact subset of features by incorporating label correlations and the dominance principle. First we combine multi-label learning with interval-valued information systems and design a new information system. Second, to make knowledge representation simpler, we discuss the dominance principle of interval-valued multi-label information systems. On this basis, we present a novel method for generating reduction information for each label and introduce a label correlation learning approach that exploits the overlap of this reduction information. Subsequently, an innovative feature selection algorithm utilizes dominance-based rough set is developed to efficiently filter out redundant features in the feature space. Finally, extensive experiments on nine multi-label datasets were performed, and the results confirm that the proposed algorithms surpass six state-of-the-art methods in performance and exhibit robustness.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"274 ","pages":"Article 126898"},"PeriodicalIF":7.5,"publicationDate":"2025-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143488658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A hybrid grey approach for battery remaining useful life prediction considering capacity regeneration
IF 7.5 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-22 · DOI: 10.1016/j.eswa.2025.126905
Kailing Li , Naiming Xie , Hui Li
Remaining useful life (RUL) prediction is the core of prognostics and health management. Lithium-ion batteries are a consumable resource whose RUL directly informs their safe use and management. However, battery degradation does not follow a strictly monotonic downward trend, a fact that traditional prediction methods ignore. This paper proposes a hybrid grey approach to predict the RUL under multiple states. Capturing the capacity regeneration phenomenon proves crucial for reproducing the detailed degradation behavior as faithfully as possible. A hybrid grey forecasting model is established to forecast the normal degradation and capacity regeneration processes in detail, and the Ensemble Kalman Filter (EnKF) algorithm is applied to attenuate the errors of the grey forecasting models in long-term prediction. Based on single models from grey modeling and filtering, an ablation study shows the influence of the different components of the hybrid approach. Compared with other data-driven models, the hybrid grey-EnKF delivers highly satisfactory RUL predictions for a set of batteries. Results show the grey-EnKF predictions are highly accurate, with a mean absolute percentage error smaller than 1%.
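For reference, the classical GM(1,1) grey model underlying such approaches can be sketched in a few lines: accumulate the series, fit the grey development coefficients by least squares, and invert the accumulation. The paper's hybrid adds regeneration-aware modeling and EnKF error filtering on top; the capacity values below are toy data.

```python
import numpy as np

def gm11_forecast(x0: np.ndarray, steps: int) -> np.ndarray:
    x1 = np.cumsum(x0)                                  # accumulated series (AGO)
    z = 0.5 * (x1[1:] + x1[:-1])                        # background values
    B = np.column_stack([-z, np.ones_like(z)])
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]    # grey development coefficients
    k = np.arange(len(x0) + steps)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a   # time-response function
    return np.diff(x1_hat, prepend=0.0)[len(x0):]       # restore by IAGO, future part only

capacity = np.array([1.86, 1.84, 1.83, 1.81, 1.80, 1.78])  # toy capacity-fade series (Ah)
print(gm11_forecast(capacity, steps=3))
```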
Citations: 0
Self-prompt contextual learning with AxialMamba for multi-label segmentation in carotid ultrasound
IF 7.5 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-21 · DOI: 10.1016/j.eswa.2025.126749
Congyu Tian , Yan Hu , Meng Zhang , Xiangyun Liao , Jianping Lv , Weixin Si
Plaque and vessel segmentation in carotid ultrasound videos is critical for assessing carotid artery stenosis and providing essential information for doctors' diagnostic and treatment planning. However, most existing methods segment vessels and plaques without distinguishing the plaque types and their corresponding vascular segments. To address this limitation, we define a novel multi-label carotid ultrasound video segmentation task that categorizes vessels based on their anatomical locations and classifies plaques according to their echo characteristics. For this task, we constructed a novel dataset, CAUS45, comprising 7479 annotated frames from 45 patients. In this dataset, vessels are segmented into three categories: the internal carotid artery (ICA), external carotid artery (ECA), and common carotid artery (CCA). Plaques are classified based on echogenicity into three types: weakly echogenic, moderately echogenic, and strongly echogenic. To further advance this task, we propose a self-prompt contextual segmentation framework, termed SPCNet. To address the challenges posed by the significant variability in ultrasound images, we leverage foundation models pretrained on large-scale ultrasound datasets as part of our video clip encoder to extract features from individual frames. To effectively utilize the inter-frame contextual information within a clip, we propose a novel AxialMamba module designed to extract inter-frame features. Additionally, to fully exploit the correlation between different clips within a video, we introduce a self-prompted contextual learning strategy to establish contextual dependencies across clips. Experiments demonstrate that SPCNet achieves a Dice coefficient of 89.08%, a 3.04% improvement over the current state-of-the-art method, along with a Hausdorff Distance (HD) of 5.04 and an Average Surface Distance (ASD) of 1.21 on our private CAUS45 dataset. Our method shows great potential for application in practical large-scale screening.
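The axial temporal modeling idea can be sketched by flattening spatial positions into the batch so a 1-D sequence layer runs along the frame axis at each location; nn.GRU below is merely a stand-in for the Mamba state-space block the paper uses, and all shapes are illustrative.

```python
import torch
import torch.nn as nn

class AxialTemporal(nn.Module):
    """Sketch: run a 1-D sequence model along the temporal axis at every pixel."""
    def __init__(self, dim: int):
        super().__init__()
        self.seq = nn.GRU(dim, dim, batch_first=True)   # stand-in for a Mamba block

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, channels, height, width) clip features
        b, t, c, h, w = x.shape
        seq = x.permute(0, 3, 4, 1, 2).reshape(b * h * w, t, c)  # one sequence per pixel
        out, _ = self.seq(seq)                                    # inter-frame modeling
        return out.reshape(b, h, w, t, c).permute(0, 3, 4, 1, 2)

y = AxialTemporal(32)(torch.randn(1, 8, 32, 14, 14))
print(y.shape)  # torch.Size([1, 8, 32, 14, 14])
```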
{"title":"Self-prompt contextual learning with AxialMamba for multi-label segmentation in carotid ultrasound","authors":"Congyu Tian ,&nbsp;Yan Hu ,&nbsp;Meng Zhang ,&nbsp;Xiangyun Liao ,&nbsp;Jianping Lv ,&nbsp;Weixin Si","doi":"10.1016/j.eswa.2025.126749","DOIUrl":"10.1016/j.eswa.2025.126749","url":null,"abstract":"<div><div>Plaque and vessel segmentation in carotid ultrasound videos is critical for assessing carotid artery stenosis and providing essential information for doctors’ diagnostic and treatment planning. However, most existing methods segment vessels and plaques without distinguishing the plaque types and their corresponding vascular segments. To address this limitation, we define a novel multi-label carotid ultrasound video segmentation task that categorizes vessels based on their anatomical locations and classifies plaques according to their echo characteristics. To address this task, we constructed a novel dataset, CAUS45, comprising 7479 annotated frames from 45 patients. In this dataset, vessels are segmented into three categories: the internal carotid artery (ICA), external carotid artery (ECA), and common carotid artery (CCA). Plaques are classified based on echogenicity into three types: weakly echogenic, moderately echogenic, and strongly echogenic. To further advance this task, we propose a self-prompt contextual segmentation framework, termed SPCNet. To address the challenges posed by the significant variability in ultrasound images, we leveraged foundational models pretrained on large-scale ultrasound datasets as part of our video clip encoder to extract features from individual frames. To effectively utilize the inter-frame contextual information within a clip, we propose a novel AxialMamba module designed for extracting inter-frame features. Additionally, to fully exploit the correlation between different clips within a video, we introduce a self-prompted contextual learning strategy to establish contextual dependencies across clips. Experiments demonstrate that SPCNet achieves a Dice coefficient of 89.08%, with a 3.04% improvement over the current state-of-the-art method. Additionally, SPCNet achieves a Hausdorff Distance (HD) of 5.04 and an Average Surface Distance (ASD) of 1.21 on our private CAUS45 dataset. Our method shows the great potential to be applied in practical large-scale screening.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"274 ","pages":"Article 126749"},"PeriodicalIF":7.5,"publicationDate":"2025-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143474353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0