Pub Date: 2024-09-09 DOI: 10.1007/s10489-024-05773-8
Huageng Fan, Tongwei Lu
Transformer-based scene text detection methods have recently attracted growing attention. However, these methods usually use attention to model visual content relationships within a single sample, ignoring the relationships between samples. Exploring sample relationships enables feature propagation between samples, which helps the detector handle scene text images with more complex features. To address this gap, this paper proposes the exploring sample relationships network (ESRNet) for detecting arbitrary-shaped text. In detail, we construct the exploring sample relationships module (ESRM) to model sample relationships in the encoder, capturing interactions between all samples in each batch and propagating features across samples. Because batch sizes differ between training and testing, sample relationships are explored differently in the two phases, so a two-stream encoder is used to resolve this inconsistency. Moreover, we propose location-aware factorized self-attention (LAFSA), which incorporates the sequential information of text polygon control points into the modeling and effectively improves the accuracy of label reading order in terms of visual features. Experimental results on multiple datasets demonstrate that ESRNet achieves superior performance compared to other methods. Notably, ESRNet achieves F-measures of 88.9%, 88.4%, and 77.4% on the Total-Text, CTW1500, and ArT datasets, respectively.
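The abstract does not give the internals of ESRM, so the following is only a minimal sketch of the general idea of attention across the samples of a batch, assuming each sample has already been reduced to one pooled feature vector. The function name `cross_sample_attention` and the residual connection are illustrative assumptions, not the authors' design.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_sample_attention(batch):
    """Let every sample attend to every sample in the same batch.

    batch: list of B feature vectors (one pooled vector per sample;
    this pooling is an assumption made for the sketch).
    Returns B vectors: the input plus an attention-weighted mix of
    all samples in the batch (residual connection).
    """
    dim = len(batch[0])
    scale = math.sqrt(dim)
    out = []
    for query in batch:
        # Scaled dot-product similarity between this sample and all samples.
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in batch]
        weights = softmax(scores)
        # Feature propagation: weighted sum of the other samples' features.
        mixed = [sum(w * v[d] for w, v in zip(weights, batch)) for d in range(dim)]
        out.append([x + m for x, m in zip(query, mixed)])
    return out


# With a batch of one, the sample can only attend to itself.
single = cross_sample_attention([[1.0, 2.0]])
pair = cross_sample_attention([[1.0, 2.0], [1.0, 2.0]])
```

With a single-sample batch the attention weight collapses to 1 and no cross-sample information is available, which illustrates why a mismatch between training and test batch sizes changes the module's behavior.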
{"title":"ESRNet: an exploring sample relationships network for arbitrary-shaped scene text detection","authors":"Huageng Fan, Tongwei Lu","doi":"10.1007/s10489-024-05773-8","DOIUrl":"10.1007/s10489-024-05773-8","url":null,"abstract":"<div><p>Recently transformer-based scene text detection methods have been gradually investigated. However, these methods usually use attention to model visual content relationships in single sample, ignoring the relationships between samples. Exploring sample relationships enables feature propagation between samples, which facilitates detector to detect scene text images with more complex features. Aware of the challenges above, this paper proposes exploring sample relationships network (ESRNet) for detecting arbitrary-shaped texts. In detail, we construct the exploring sample relationships module (ESRM) to model sample relationships in the encoder, capturing interactions between all samples in each batch and propagating features across samples. Because of the inconsistency in batch sizes for training and testing leads to differences in exploring sample relationships between these two phases, so two-stream encoder method is used to solve the problem. Moreover, we propose location-aware factorized self-attention (LAFSA), which incorporates the sequential information of text polygon control points into the modeling and effectively improves the accuracy of label reading order in terms of visual features. Experimental results on multiple datasets demonstrate that ESRNet exhibits superior performance compared to other methods. 
Notably, ESRNet achieves F-measure of 88.9<span>(%)</span>, 88.4<span>(%)</span>, and 77.4<span>(%)</span> on the Total-Text, CTW1500, and ArT datasets, respectively.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 22","pages":"11995 - 12008"},"PeriodicalIF":3.4,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142220594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-09 DOI: 10.1007/s10489-024-05811-5
Xinyu Zhao, Jianxiang Liu, Faguo Wu, Xiao Zhang, Guojian Wang
Uncertainty in the evolution of opponent behavior creates a non-stationary environment for the agent, reducing the reliability of value estimation and strategy selection while compromising security during the exploration process. Previous studies have developed various uncertainty quantification techniques and designed uncertainty-aware exploration methods for multi-agent reinforcement learning (MARL). However, existing methods have gaps in theoretical research and experimental verification of decoupling uncertainty between opponents and environment, which can decrease learning efficiency and lead to an unstable training process. Due to inaccurate opponent modeling, the agent is vulnerable to harm from opponents, which is undesirable in real-world tasks. To address these issues, this study proposes a novel uncertainty-guided safe exploration strategy for MARL that decouples the two types of uncertainty originating from the environment and opponents. Specifically, we introduce an uncertainty decoupling quantification technique based on a novel variance decomposition method for action-value functions. Furthermore, we present an uncertainty-aware policy optimization mechanism to facilitate safe exploration in MARL. Finally, we propose a new adaptive parameter scaling method to ensure efficient exploration by the agents. Theoretical analysis establishes the proposed approach’s convergence rate, and its effectiveness is demonstrated empirically. Extensive experiments on benchmark tasks spanning differential games, multi-agent particle environments, and RoboSumo validate the proposed uncertainty-guided method’s significant advantages in attaining higher scores and facilitating safe agent exploration.
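The abstract does not state the authors' variance decomposition. As a hedged illustration of how uncertainty in an action-value can be split into two sources, the standard law of total variance is shown below. The symbols are assumptions for this sketch: $Q(s,a)$ is the action-value treated as a random variable and $o$ denotes the opponents' behavior.

```latex
\operatorname{Var}\big[Q(s,a)\big]
  = \underbrace{\mathbb{E}_{o}\Big[\operatorname{Var}\big[Q(s,a)\mid o\big]\Big]}_{\text{environment uncertainty, opponents fixed}}
  + \underbrace{\operatorname{Var}_{o}\Big[\mathbb{E}\big[Q(s,a)\mid o\big]\Big]}_{\text{uncertainty due to opponents}}
```

The first term averages the variance that remains once the opponents' behavior is fixed, and the second term measures how much the expected value moves as that behavior changes.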
{"title":"Uncertainty modified policy for multi-agent reinforcement learning","authors":"Xinyu Zhao, Jianxiang Liu, Faguo Wu, Xiao Zhang, Guojian Wang","doi":"10.1007/s10489-024-05811-5","DOIUrl":"10.1007/s10489-024-05811-5","url":null,"abstract":"<div><p>Uncertainty in the evolution of opponent behavior creates a non-stationary environment for the agent, reducing the reliability of value estimation and strategy selection while compromising security during the exploration process. Previous studies have developed various uncertainty quantification techniques and designed uncertainty-aware exploration methods for multi-agent reinforcement learning (MARL). However, existing methods have gaps in theoretical research and experimental verification of decoupling uncertainty between opponents and environment, which can decrease learning efficiency and lead to an unstable training process. Due to inaccurate opponent modeling, the agent is vulnerable to harm from opponents, which is undesirable in real-world tasks. To address these issues, this study proposes a novel uncertainty-guided safe exploration strategy for MARL that decouples the two types of uncertainty originating from the environment and opponents. Specifically, we introduce an uncertainty decoupling quantification technique based on a novel variance decomposition method for action-value functions. Furthermore, we present an uncertainty-aware policy optimization mechanism to facilitate safe exploration in MARL. Finally, we propose a new adaptive parameter scaling method to ensure efficient exploration by the agents. Theoretical analysis establishes the proposed approach’s convergence rate, and its effectiveness is demonstrated empirically. 
Extensive experiments on benchmark tasks spanning differential games, multi-agent particle environments, and RoboSumo validate the proposed uncertainty-guided method’s significant advantages in attaining higher scores and facilitating safe agent exploration.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 22","pages":"12020 - 12034"},"PeriodicalIF":3.4,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142227482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Individuals’ emotions, such as hesitation and unwavering confidence, can influence the ability of decision-makers (DMs) to make rational judgments. Such emotions, referred to as emotion soft factors, are often hidden in individual preference series, and identifying them is a prerequisite for avoiding unfavorable impacts on the consensus-reaching process. This study focuses on structuring a consensus model with emotion soft factors in linguistic preference time sequences. Specifically, a personalized individual semantics (PIS) learning process is implemented to obtain the personalized numerical scales of DMs’ linguistic terms. Subsequently, we propose a consensus model incorporating consensus measurement and feedback modification phases. In the process, a grey clustering scheme is devised to mine emotion soft factors from DMs’ preference sequences and manage individuals in different grey classes. Finally, numerical examples, a simulation analysis, and a comparison study are presented to illustrate the influence of different parameters and justify the validity of the proposed model.
{"title":"Mining emotion soft factors in linguistic preference time sequences based on personalized individual semantics in group decision-making","authors":"Fuying Jing, Mengru Xu, Xiangrui Chao, Enrique Herrera-viedma","doi":"10.1007/s10489-024-05697-3","DOIUrl":"10.1007/s10489-024-05697-3","url":null,"abstract":"<div><p>Individuals’ emotions, such as hesitation and unwavering confidence, can influence the ability of decision-makers (DMs) to make rational judgments. The emotion is always hidden in individual preference series, which is referred to as emotion soft factors, It is a prerequisite for avoiding unfavorable impacts on consensus reaching process. This study focuses on structuring a consensus model with emotion soft factors in linguistic preference time sequence. Specifically, a personalized individual semantics (PIS) learning process is implemented to obtain the personalized numerical scales of DMs’ linguistic terms. Subsequently, we propose a consensus model incorporating the consensus measurement and feedback modification phase. In the process, a grey clustering scheme is devised to mine emotion soft factors from DMs’ preference sequences and manage individuals in different grey classes. 
Finally, numerical examples, simulation analysis, and comparison study are presented to illustrate the influence of different parameters and justify the validity of the proposed model.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 21","pages":"11120 - 11143"},"PeriodicalIF":3.4,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142227477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-09 DOI: 10.1007/s10489-024-05508-9
Huaqing Zhang, Hongbin Ma, Bemnet Wondimagegnehu Mersha, Ying Jin
On-policy deep reinforcement learning (DRL) has the inherent advantage of using multi-step interaction data for policy learning. However, on-policy DRL still faces challenges in improving the sample efficiency of policy evaluation. Therefore, we propose a multi-step on-policy DRL method assisted by off-policy policy evaluation (abbreviated as MSOAO), which integrates on-policy and off-policy policy evaluation and constitutes a new type of DRL method. We propose a low-pass filtering algorithm for state-values to perform off-policy policy evaluation and make it efficiently assist on-policy policy evaluation. The filtered state-values and the multi-step interaction data are used as the input of the V-trace algorithm. Then, the state-value function is learned by simultaneously approximating the target state-values obtained from the V-trace output and the action-values of the current policy. The action-value function is learned by using the one-step bootstrapping algorithm to approximate the target action-values obtained from the V-trace output. Extensive evaluation results indicate that MSOAO outperforms state-of-the-art on-policy DRL algorithms, and that the simultaneous learning of the state-value function and the action-value function in MSOAO is mutually beneficial, thus improving the learning capability of the algorithm.
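The V-trace step mentioned above can be sketched as follows. This is a minimal implementation of the standard V-trace target computation in its usual backward-recursive form, not the MSOAO code. The function name, argument names, and default clipping thresholds are assumptions, and `values` stands in for whatever state-values (for example, the filtered ones) are fed to V-trace.

```python
def vtrace_targets(rewards, values, bootstrap_value, ratios,
                   gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Compute V-trace state-value targets for one rollout.

    rewards[t]       reward received after acting in state t
    values[t]        state-value estimate V(x_t) (a Python list)
    bootstrap_value  V(x_n), the estimate for the state after the rollout
    ratios[t]        importance ratio pi(a_t|x_t) / mu(a_t|x_t)
    """
    n = len(rewards)
    next_values = values[1:] + [bootstrap_value]
    targets = [0.0] * n
    carry = 0.0  # v_{t+1} - V(x_{t+1}); zero beyond the end of the rollout
    for t in reversed(range(n)):
        rho = min(rho_bar, ratios[t])  # clipped ratio for the TD error
        c = min(c_bar, ratios[t])      # clipped ratio for the trace
        delta = rho * (rewards[t] + gamma * next_values[t] - values[t])
        carry = delta + gamma * c * carry
        targets[t] = values[t] + carry
    return targets


# On-policy case: all ratios are 1, so the targets are n-step returns.
on_policy = vtrace_targets(
    rewards=[1.0, 1.0], values=[0.0, 0.0], bootstrap_value=0.0,
    ratios=[1.0, 1.0], gamma=1.0,
)
```

When every importance ratio equals 1, the clipping has no effect and the targets reduce to ordinary multi-step returns, which is the sense in which V-trace generalizes on-policy multi-step evaluation.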
{"title":"A multi-step on-policy deep reinforcement learning method assisted by off-policy policy evaluation","authors":"Huaqing Zhang, Hongbin Ma, Bemnet Wondimagegnehu Mersha, Ying Jin","doi":"10.1007/s10489-024-05508-9","DOIUrl":"10.1007/s10489-024-05508-9","url":null,"abstract":"<div><p>On-policy deep reinforcement learning (DRL) has the inherent advantage of using multi-step interaction data for policy learning. However, on-policy DRL still faces challenges in improving the sample efficiency of policy evaluations. Therefore, we propose a multi-step on-policy DRL method assisted by off-policy policy evaluation (abbreviated as MSOAO), whichs integrates on-policy and off-policy policy evaluations and belongs to a new type of DRL method. We propose a low-pass filtering algorithm for state-values to perform off-policy policy evaluation and make it efficiently assist on-policy policy evaluation. The filtered state-values and the multi-step interaction data are used as the input of the V-trace algorithm. Then, the state-value function is learned by simultaneously approximating the target state-values obtained from the V-trace output and the action-values of the current policy. The action-value function is learned by using the one-step bootstrapping algorithm to approximate the target action-values obtained from the V-trace output. 
Extensive evaluation results indicate that MSOAO outperformed the performance of state-of-the-art on-policy DRL algorithms, and the simultaneous learning of the state-value function and the action-value function in MSOAO can promote each other, thus improving the learning capability of the algorithm.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 21","pages":"11144 - 11159"},"PeriodicalIF":3.4,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142227478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-07 DOI: 10.1007/s10489-024-05815-1
Shiyu Yang, Qunyong Wu, Yuhang Wang, Tingyu Lin
Current research often formalizes traffic prediction tasks as spatio-temporal graph modeling problems. Despite some progress, this approach still has the following limitations. First, space can be divided into intrinsic and latent spaces. Static graphs in intrinsic space lack flexibility when facing changing prediction tasks, while dynamic relationships in latent space are influenced by multiple factors. A deep understanding of specific traffic patterns in different spaces is crucial for accurately modeling spatial dependencies. Second, most studies focus on correlations in sequential time periods, neglecting both reverse and global temporal correlations. This oversight leads to incomplete temporal representations in models. In this work, we propose a Space-Specific Graph Convolutional Recurrent Transformer Network (SSGCRTN) to address these limitations simultaneously. For the spatial aspect, we propose a space-specific graph convolution operation to identify patterns unique to each space. For the temporal aspect, we introduce a spatio-temporal interaction module that integrates spatial and temporal domain knowledge of nodes at multiple granularities. This module learns and utilizes parallel spatio-temporal relationships between different time points from both forward and backward perspectives, revealing latent patterns in spatio-temporal associations. Additionally, we use a transformer-based global temporal fusion module to capture global spatio-temporal correlations. We conduct experiments on four real-world traffic flow datasets (PeMS03/04/07/08) and two traffic speed datasets (PeMSD7(M)/(L)), achieving better performance than existing technologies. Notably, on the PeMS08 dataset, our model improves the MAE by 6.41% compared to DGCRN. The code of SSGCRTN is available at https://github.com/OvOYu/SSGCRTN.
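For readers unfamiliar with graph convolution on traffic sensor networks, the propagation step can be sketched as neighborhood averaging over a static adjacency matrix. This is a generic, weight-free illustration under stated assumptions (binary adjacency, added self-loops, mean aggregation). It is not the space-specific operation proposed in the paper, and the name `mean_graph_conv` is hypothetical.

```python
def mean_graph_conv(adj, feats):
    """One weight-free graph propagation step.

    adj:   N x N binary adjacency matrix (list of lists)
    feats: N feature vectors, one per node (e.g. recent flow readings)
    Each node's new feature is the mean over itself and its neighbors.
    """
    n = len(adj)
    dim = len(feats[0])
    out = []
    for i in range(n):
        # Neighbors of node i, plus node i itself (self-loop).
        neigh = [j for j in range(n) if adj[i][j] or j == i]
        out.append([sum(feats[j][d] for j in neigh) / len(neigh)
                    for d in range(dim)])
    return out


# Two connected sensors exchange information and end up with the same value.
smoothed = mean_graph_conv([[0, 1], [1, 0]], [[0.0], [2.0]])
```

A learned graph convolution would additionally multiply the aggregated features by a trainable weight matrix; that part is omitted here to keep the sketch dependency-free.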
{"title":"SSGCRTN: a space-specific graph convolutional recurrent transformer network for traffic prediction","authors":"Shiyu Yang, Qunyong Wu, Yuhang Wang, Tingyu Lin","doi":"10.1007/s10489-024-05815-1","DOIUrl":"10.1007/s10489-024-05815-1","url":null,"abstract":"<div><p>Current research often formalizes traffic prediction tasks as spatio-temporal graph modeling problems. Despite some progress, this approach still has the following limitations. First, space can be divided into intrinsic and latent spaces. Static graphs in intrinsic space lack flexibility when facing changing prediction tasks, while dynamic relationships in latent space are influenced by multiple factors. A deep understanding of specific traffic patterns in different spaces is crucial for accurately modeling spatial dependencies. Second, most studies focus on correlations in sequential time periods, neglecting both reverse and global temporal correlations. This oversight leads to incomplete temporal representations in models. In this work, we propose a Space-Specific Graph Convolutional Recurrent Transformer Network (SSGCRTN) to address these limitations simultaneously. For the spatial aspect, we propose a space-specific graph convolution operation to identify patterns unique to each space. For the temporal aspect, we introduce a spatio-temporal interaction module that integrates spatial and temporal domain knowledge of nodes at multiple granularities. This module learns and utilizes parallel spatio-temporal relationships between different time points from both forward and backward perspectives, revealing latent patterns in spatio-temporal associations. Additionally, we use a transformer-based global temporal fusion module to capture global spatio-temporal correlations. We conduct experiments on four real-world traffic flow datasets (PeMS03/04/07/08) and two traffic speed datasets (PeMSD7(M)/(L)), achieving better performance than existing technologies. 
Notably, on the PeMS08 dataset, our model improves the MAE by 6.41% compared to DGCRN. The code of SSGCRTN is available at https://github.com/OvOYu/SSGCRTN.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 22","pages":"11978 - 11994"},"PeriodicalIF":3.4,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142220595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-06 DOI: 10.1007/s10489-024-05817-z
Mary Carlota Bernal, Edgar Batista, Antoni Martínez-Ballesté, Agusti Solanas
As society experiences accelerated ageing, understanding the complex biological processes of human ageing, which are affected by a large number of variables and factors, becomes increasingly crucial. Artificial intelligence (AI) presents a promising avenue for ageing research, offering the ability to detect patterns, make accurate predictions, and extract valuable insights from large volumes of complex, heterogeneous data. As ageing research increasingly leverages AI techniques, we present a timely systematic literature review to explore the current state-of-the-art in this field following a rigorous and transparent review methodology. As a result, a total of 77 articles have been identified, summarised, and categorised based on their characteristics. AI techniques, such as machine learning and deep learning, have been extensively used to analyse diverse datasets, comprising imaging, genetic, behavioural, and contextual data. Findings showcase the potential of AI in predicting age-related outcomes, developing ageing biomarkers, and determining factors associated with healthy ageing. However, challenges related to data quality, interpretability of AI models, and privacy and ethical considerations have also been identified. Despite the advancements, novel approaches suggest that there is still room for improvement to provide personalised AI-driven healthcare services and promote active ageing initiatives with the ultimate goal of enhancing the quality of life and well-being of older adults.
{"title":"Artificial intelligence for the study of human ageing: a systematic literature review","authors":"Mary Carlota Bernal, Edgar Batista, Antoni Martínez-Ballesté, Agusti Solanas","doi":"10.1007/s10489-024-05817-z","DOIUrl":"10.1007/s10489-024-05817-z","url":null,"abstract":"<p>As society experiences accelerated ageing, understanding the complex biological processes of human ageing, which are affected by a large number of variables and factors, becomes increasingly crucial. Artificial intelligence (AI) presents a promising avenue for ageing research, offering the ability to detect patterns, make accurate predictions, and extract valuable insights from large volumes of complex, heterogeneous data. As ageing research increasingly leverages AI techniques, we present a timely systematic literature review to explore the current state-of-the-art in this field following a rigorous and transparent review methodology. As a result, a total of 77 articles have been identified, summarised, and categorised based on their characteristics. AI techniques, such as machine learning and deep learning, have been extensively used to analyse diverse datasets, comprising imaging, genetic, behavioural, and contextual data. Findings showcase the potential of AI in predicting age-related outcomes, developing ageing biomarkers, and determining factors associated with healthy ageing. However, challenges related to data quality, interpretability of AI models, and privacy and ethical considerations have also been identified. 
Despite the advancements, novel approaches suggest that there is still room for improvement to provide personalised AI-driven healthcare services and promote active ageing initiatives with the ultimate goal of enhancing the quality of life and well-being of older adults.</p><p>Overview of the literature review.</p>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 22","pages":"11949 - 11977"},"PeriodicalIF":3.4,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10489-024-05817-z.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142220598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-06 DOI: 10.1007/s10489-024-05771-w
Jiang Longting, Wei Ruixuan, Wang Dong
A collaborators’ experiences learning (CEL) algorithm based on multiagent reinforcement learning (MARL) is presented for multi-UAV cooperative path-finding, where reaching destinations and avoiding obstacles are simultaneously considered as independent or interactive tasks. In this article, inspired by the experience learning phenomenon, we propose a multiagent experience learning theory based on MARL. A strategy for updating parameters randomly is also suggested to allow homogeneous UAVs to effectively learn cooperative strategies. Additionally, the convergence of this algorithm is theoretically demonstrated. To demonstrate the effectiveness of the algorithm, we conduct experiments with different numbers of UAVs and different algorithms. The experiments show that the proposed method can achieve experience sharing and learning among UAVs and complete the cooperative path-finding task very well in unknown dynamic environments.
{"title":"Improving multi-UAV cooperative path-finding through multiagent experience learning","authors":"Jiang Longting, Wei Ruixuan, Wang Dong","doi":"10.1007/s10489-024-05771-w","DOIUrl":"10.1007/s10489-024-05771-w","url":null,"abstract":"<div><p>A collaborators’ experiences learning (CEL) algorithm, based on multiagent reinforcement learning (MARL) is presented for multi-UAV cooperative path-finding, where reaching destinations and avoiding obstacles are simultaneously considered as independent or interactive tasks. In this article, we are inspired by the experience learning phenomenon to propose the multiagent experience learning theory based on MARL. A strategy for updating parameters randomly is also suggested to allow homogeneous UAVs to effectively learn cooperative strategies. Additionally, the convergence of this algorithm is theoretically demonstrated. To demonstrate the effectiveness of the algorithm, we conduct experiments with different numbers of UAVs and different algorithms. The experiments show that the proposed method can achieve experience sharing and learning among UAVs and complete the cooperative path-finding task very well in unknown dynamic environments.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 21","pages":"11103 - 11119"},"PeriodicalIF":3.4,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10489-024-05771-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142220741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-06 DOI: 10.1007/s10489-024-05807-1
Kaiqiang Xu, Kewei Tang, Zhixun Su
Multi-view subspace clustering has attracted extensive attention due to its ability to efficiently handle data from diverse sources. In recent years, many multi-view subspace clustering methods have emerged and achieved satisfactory clustering performance. However, these methods rarely consider simultaneously handling data with a nonlinear structure and exploiting the structural and multi-level information inherent in the data. To remedy these shortcomings, we propose a novel method, multi-view deep subspace clustering via level-by-level guided multi-level features learning (MDSC-LGMFL). Specifically, an autoencoder is used for each view to extract the view-specific multi-level features, and multiple self-representation layers are introduced into the autoencoder to learn the subspace representations corresponding to the multi-level features. These self-representation layers not only provide multiple information flow paths through the autoencoder but also force multiple encoder layers to produce multi-level features that satisfy the linear subspace assumption. With the novel level-by-level guidance strategy, the last-level feature is guaranteed to encode the structural information from the view and the previous-level features. Naturally, the subspace representation of the last-level feature can more reliably reflect the data affinity relationship and thus can be viewed as a new, better representation of the view. Furthermore, to guarantee structural consistency among different views, instead of simply learning the common subspace structure by enforcing it to be close to the different view-specific representations, we conduct self-representation on these new representations to learn the common subspace structure, which is then passed to the spectral clustering algorithm to obtain the final clustering results. Extensive experiments on six widely used benchmark datasets show the superiority of the proposed method.
{"title":"Multi-view deep subspace clustering via level-by-level guided multi-level features learning","authors":"Kaiqiang Xu, Kewei Tang, Zhixun Su","doi":"10.1007/s10489-024-05807-1","DOIUrl":"10.1007/s10489-024-05807-1","url":null,"abstract":"<div><p>Multi-view subspace clustering has attracted extensive attention due to its ability to efficiently handle data from diverse sources. In recent years, plentiful multi-view subspace clustering methods have emerged and achieved satisfactory clustering performance. However, these methods rarely consider simultaneously handling data with a nonlinear structure and exploiting the structural and multi-level information inherent in the data. To remedy these shortcomings, we propose the novel multi-view deep subspace clustering via level-by-level guided multi-level features learning (MDSC-LGMFL). Specifically, an autoencoder is used for each view to extract the view-specific multi-level features, and multiple self-representation layers are introduced into the autoencoder to learn the subspace representations corresponding to the multi-level features. These self-representation layers not only provide multiple information flow paths through the autoencoder but also enforce multiple encoder layers to produce the multi-level features that satisfy the linear subspace assumption. With the novel level-by-level guidance strategy, the last-level feature is guaranteed to encode the structural information from the view and the previous-level features. Naturally, the subspace representation of the last-level feature can more reliably reflect the data affinity relationship and thus can be viewed as the new, better representation of the view. 
Furthermore, to guarantee the structural consistency among different views, instead of simply learning the common subspace structure by enforcing it to be close to different view-specific new, better representations, we conduct self-representation on these new, better representations to learn the common subspace structure, which can be applied to the spectral clustering algorithm to achieve the final clustering results. Numerous experiments on six widely used benchmark datasets show the superiority of the proposed method.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 21","pages":"11083 - 11102"},"PeriodicalIF":3.4,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142220597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conventional statistical methods for missing data imputation are difficult to adapt to large-scale, high-dimensional data. Moreover, missing data imputation methods based on Generative Adversarial Networks (GAN) are plagued by gradient vanishing and mode collapse. To address these problems, we propose a new GAN-based imputation method to enhance the accuracy of missing data imputation. We refer to our missing data imputation method, built on Generative Adversarial Imputation Networks, as MGAIN. Specifically, the least squares loss is first introduced to solve the gradient vanishing problem and ensure the high quality of the output data in MGAIN. To mitigate mode collapse, a dual discriminator is used in the model, which improves the diversity of the output data and avoids the performance degradation caused by homogeneous outputs. As a result, MGAIN generates rich and accurate imputation values. MGAIN enhances imputation accuracy and reduces the root mean square error by 21.66% compared to the baseline model. We evaluated our method on baseline datasets and found that MGAIN outperformed state-of-the-art and popular imputation methods, demonstrating its effectiveness and superiority.
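The least squares loss mentioned above can be illustrated with the standard least-squares GAN objectives. This is a simplified sketch, assuming a single discriminator that outputs one real-valued score per example with targets 1 for real and 0 for generated data. It does not reproduce MGAIN's dual-discriminator setup or its mask-based imputation inputs, and the function names are hypothetical.

```python
def mean(xs):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(xs) / len(xs)


def lsgan_discriminator_loss(d_real, d_fake):
    """Least-squares discriminator loss.

    d_real: discriminator scores on real (observed) data, target 1
    d_fake: discriminator scores on generated (imputed) data, target 0
    """
    real_term = mean([(d - 1.0) ** 2 for d in d_real])
    fake_term = mean([d ** 2 for d in d_fake])
    return 0.5 * real_term + 0.5 * fake_term


def lsgan_generator_loss(d_fake):
    """Least-squares generator loss: push scores on generated data toward 1."""
    return 0.5 * mean([(d - 1.0) ** 2 for d in d_fake])


# A discriminator that separates real from generated data perfectly.
d_loss = lsgan_discriminator_loss([1.0, 1.0], [0.0, 0.0])
g_loss = lsgan_generator_loss([0.0, 0.0])
```

Unlike the sigmoid cross-entropy loss, the squared error still yields a non-zero gradient for generated samples that the discriminator classifies confidently, which is the usual argument for why it eases the vanishing-gradient problem.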
Improved generative adversarial imputation networks for missing data. Xiwen Qin, Hongyu Shi, Xiaogang Dong, Siqi Zhang, Liping Yuan. Applied Intelligence, vol. 54, no. 21, pp. 11068-11082. Published 2024-09-05. DOI: 10.1007/s10489-024-05814-2
Pub Date : 2024-09-05 DOI: 10.1007/s10489-024-05801-7
Haibin Liao, Mou Wu, Li Yuan, Yiyang Hu, Haowei Gong
Air pollution is one of the main public health and safety issues facing humanity. PM2.5 concentration prediction (PCP) helps the public take preventive measures and helps governments make decisions in advance. PCP is a typical knowledge mining problem on spatiotemporal sequential data and remains highly challenging. To handle interference from meteorological, geographical, and temporal factors, as well as sudden changes in concentration, a dynamic spatiotemporal graph neural network (DST_GNN) method for PCP is proposed that combines the advantages of graph neural networks (GNN) and a mechanism model. Its main components are as follows: a graph structure models the spatial relationship of PM2.5 among monitoring stations; the mechanism model HYSPLIT constructs the dynamic edge relationships among graph nodes; and a gated recurrent unit with an attention mechanism learns the temporal behavior of PM2.5 concentration. Together these form a GNN architecture that integrates machine learning and domain knowledge. In addition, a loss function based on trend and shape is proposed as part of the model's objective function. The use of HYSPLIT to help build a dynamic spatiotemporal graph network, together with the trend loss function for training, offers a new way to construct GNNs dynamically and a reference for PCP that combines domain knowledge with deep learning. Experimental results show that the proposed method has the best prediction accuracy among GNN-based methods, reducing the mean absolute error by about 14% and the root mean square error by about 13% compared with advanced GNN methods. The mean absolute error for forecasts within 48 h is less than 50, a predictive performance far superior to that of the traditional mechanism model, and the method is also flexible to deploy and easy to implement.
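The error figures above are relative reductions in two standard metrics. A minimal sketch of how mean absolute error, root mean square error, and a relative reduction can be computed is given here; the function names and the sample numbers are illustrative assumptions, not values or code from the paper.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def relative_reduction(baseline_error, proposed_error):
    """Fraction by which the proposed model lowers the baseline error."""
    return (baseline_error - proposed_error) / baseline_error


if __name__ == "__main__":
    # Hypothetical PM2.5 readings and forecasts, for illustration only.
    observed = [35.0, 42.0, 58.0, 61.0]
    forecast = [33.0, 45.0, 55.0, 66.0]
    print("MAE:", mae(observed, forecast))
    print("RMSE:", rmse(observed, forecast))
    # A baseline error of 50.0 lowered to 43.0 is a reduction of about 14%.
    print("Relative reduction:", relative_reduction(50.0, 43.0))
```

RMSE squares each error before averaging, so it weights large misses (such as sudden concentration changes) more heavily than MAE does, which is why the two metrics are usually reported together.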
PM2.5 prediction based on dynamic spatiotemporal graph neural network. Haibin Liao, Mou Wu, Li Yuan, Yiyang Hu, Haowei Gong. Applied Intelligence, vol. 54, no. 22, pp. 11933-11948. Published 2024-09-05. DOI: 10.1007/s10489-024-05801-7