Pub Date: 2024-10-24 | DOI: 10.1007/s10489-024-05756-9
Xinyu Meng, Meng Zhao, Chenxi Zhang, Yimai Zhang
Optimizing hotel recommendation systems based on consumer preferences is crucial for online hotel booking platforms. The purpose of this study is to reveal differences in hotel recommendation results for different types of consumers by considering consumer expectations. Specifically, this study introduces an online hotel recommendation method that considers three preferences for five types of consumers (business, couples, families, friends, and solo): attribute importance, consumer expectations, and actual hotel attribute performance. Here, consumer expectations are expressed in the form of a 2-tuple: customers can express not only specific demands but also the probability of those demands being met. Further, using the three consumer preferences, a similarity measurement model is constructed to recommend hotels to different types of consumers. The method is tested on a dataset covering 40 hotels in the Beijing area, and the impact of the three preferences on the recommendation results for each consumer type is analyzed. The method has two management implications. On the one hand, the preference-based recommendation method can optimize hotel recommendation systems and help online hotel booking platforms improve the accuracy of recommendation results. On the other hand, it can offer valuable insights to hotel managers, helping them measure their competitiveness and guiding the development of service improvement strategies.
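For illustration only, the sketch below (not the paper's formulation) scores hotels against 2-tuple expectations of the form (expected attribute level, probability that the level must be met), weighted by attribute importance; the attribute names, the [0, 1] scaling, and the similarity formula are all assumptions.

```python
import numpy as np

# Hypothetical scoring of hotels against 2-tuple expectations, weighted by
# attribute importance. The paper's exact similarity measurement may differ.
def hotel_score(importance, expectations, performance):
    """importance: (A,) attribute weights summing to 1
    expectations: (A, 2) rows of (expected level, probability of meeting it)
    performance:  (A,) observed attribute performance of one hotel, in [0, 1]"""
    level, prob = expectations[:, 0], expectations[:, 1]
    # per-attribute similarity: closeness to the expected level, discounted by
    # how strongly the consumer insists on that level
    sim = 1.0 - prob * np.abs(performance - level)
    return float(np.dot(importance, sim))

importance = np.array([0.5, 0.3, 0.2])            # e.g. location, cleanliness, service
expectations = np.array([[0.8, 0.9], [0.7, 0.6], [0.6, 0.5]])
hotels = {"hotel_A": np.array([0.85, 0.65, 0.70]),
          "hotel_B": np.array([0.60, 0.90, 0.80])}
ranking = sorted(hotels, key=lambda h: hotel_score(importance, expectations, hotels[h]),
                 reverse=True)
print(ranking)                                    # hotels ordered by similarity score
```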
{"title":"Making platform recommendations more responsive to the expectations of different types of consumers: a recommendation method based on online reviews","authors":"Xinyu Meng, Meng Zhao, Chenxi Zhang, Yimai Zhang","doi":"10.1007/s10489-024-05756-9","DOIUrl":"10.1007/s10489-024-05756-9","url":null,"abstract":"<div><p>Optimizing hotel recommendation systems based on consumer preferences is crucial for online hotel booking platforms. The purpose of this study is to reveal differences in hotel recommendation results for different types of consumers by considering consumer expectations. Specifically, this study introduces an online hotel recommendation method that considers three preferences for five types of consumers (business, couples, families, friends, and solo): attribute importance, consumer expectations, and actual hotel attribute performance. Here, consumer expectations are expressed in the form of the 2-tuple. 2-tuple expectations mean that customers can not only express specific demands but also express the probability of meeting the demands. Further, using three different consumer preferences, a similarity measurement model is constructed to recommend hotels for different types of consumers. This study puts this innovative method to the test using a dataset covering 40 hotels in the Beijing area and analyzes the impact of three preferences for different types of consumers on their hotel recommendation results. The method introduced in this study has two management implications. On the one hand, the recommendation method based on consumer preferences can optimize hotel recommendation systems and help online hotel booking platforms improve the accuracy of recommendation results. On the other hand, the proposed method can offer valuable insights to hotel managers, helping them measure their competitiveness and providing guidance for developing service improvement strategies.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"13075 - 13100"},"PeriodicalIF":3.4,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-22 | DOI: 10.1007/s10489-024-05819-x
Qiaokang Liang, Jintao Li, Hai Qin, Mingfeng Liu, Xiao Xiao, Dongbo Zhang, Yaonan Wang, Dan Zhang
Kitchen waste images encompass a wide range of garbage categories, posing a typical multi-label classification challenge. However, due to complex backgrounds and significant variations in garbage morphology, there is currently limited research on kitchen waste classification. In this paper, we propose a multi-head attention-driven dynamic graph convolution lightweight network for multi-label classification of kitchen waste images. Firstly, we address the issue of large model parameterization in traditional GCN methods by optimizing the backbone network for a lightweight model design. Secondly, to overcome the performance losses resulting from reduced model parameters, we introduce a multi-head attention mechanism to mitigate feature information loss, enhancing the feature extraction capability of the backbone network in complex scenarios and improving the correlation between graph nodes. Finally, a dynamic graph convolution module is employed to adaptively capture semantic-aware regions, further boosting recognition capability. Experiments on our self-constructed multi-label kitchen waste classification dataset MLKW demonstrate that the proposed algorithm achieves 8.6% and 4.8% improvements in mAP over the benchmark GCN-based methods ML-GCN and ADD-GCN, respectively, establishing state-of-the-art performance. Additionally, extensive experiments on two public datasets, MS-COCO and VOC2007, showcase excellent classification results, highlighting the strong generalization ability of our algorithm.
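As a rough sketch of the pipeline described above (a hedged illustration, not the authors' MHA-DGCLN code), the block below passes backbone patch features through multi-head self-attention and one dynamic graph convolution step over label nodes; the feature dimension, number of labels, and the adjacency construction are assumptions.

```python
import torch
import torch.nn as nn

# Schematic head: attention-enriched backbone features drive a dynamic graph
# convolution over label nodes to produce multi-label logits.
class MHADynGCNHead(nn.Module):
    def __init__(self, feat_dim=256, num_labels=20, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(feat_dim, heads, batch_first=True)
        self.label_embed = nn.Parameter(torch.randn(num_labels, feat_dim))
        self.gcn_weight = nn.Linear(feat_dim, feat_dim)
        self.classifier = nn.Linear(feat_dim, 1)

    def forward(self, feats):                        # feats: (B, num_patches, feat_dim)
        feats, _ = self.attn(feats, feats, feats)    # multi-head self-attention over patches
        nodes = self.label_embed.unsqueeze(0).expand(feats.size(0), -1, -1)
        content = feats.transpose(1, 2) @ feats / feats.size(-1)          # image-content statistics
        adj = torch.softmax(nodes @ content @ nodes.transpose(1, 2), -1)  # dynamic label adjacency
        nodes = torch.relu(adj @ self.gcn_weight(nodes))                  # one graph convolution step
        return self.classifier(nodes).squeeze(-1)                         # (B, num_labels) logits

logits = MHADynGCNHead()(torch.randn(2, 49, 256))    # e.g. a 7x7 feature map flattened to 49 patches
print(logits.shape)                                  # torch.Size([2, 20])
```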
{"title":"MHA-DGCLN: multi-head attention-driven dynamic graph convolutional lightweight network for multi-label image classification of kitchen waste","authors":"Qiaokang Liang, Jintao Li, Hai Qin, Mingfeng Liu, Xiao Xiao, Dongbo Zhang, Yaonan Wang, Dan Zhang","doi":"10.1007/s10489-024-05819-x","DOIUrl":"10.1007/s10489-024-05819-x","url":null,"abstract":"<div><p>Kitchen waste images encompass a wide range of garbage categories, posing a typical multi-label classification challenge. However, due to the complex background and significant variations in garbage morphology, there is currently limited research on kitchen waste classification. In this paper, we propose a multi-head attention-driven dynamic graph convolution lightweight network for multi-label classification of kitchen waste images. Firstly, we address the issue of large model parameterization in traditional GCN methods by optimizing the backbone network for lightweight model design. Secondly, to overcome performance losses resulting from reduced model parameters, we introduce a multi-head attention mechanism to mitigate feature information loss, enhancing the feature extraction capability of the backbone network in complex scenarios and improving the correlation between graph nodes. Finally, the dynamic graph convolution module is employed to adaptively capture semantic-aware regions, further boosting recognition capabilities. Experiments conducted on our self-constructed multi-label kitchen waste classification dataset MLKW demonstrate that our proposed algorithm achieves a 8.6% and 4.8% improvement in mAP compared to the benchmark GCN-based methods ML-GCN and ADD-GCN, respectively, establishing state-of-the-art performance. Additionally, extensive experiments on two public datasets, MS-COCO and VOC2007, showcase excellent classification results, highlighting the strong generalization ability of our algorithm.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"13057 - 13074"},"PeriodicalIF":3.4,"publicationDate":"2024-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10489-024-05819-x.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-18 | DOI: 10.1007/s10489-024-05693-7
Zixin Chen, Jiong Yu, Qiyin Tan, Shu Li, XuSheng Du
The advancement of the computer and information industry has led to the emergence of new demands for multivariate time series anomaly detection (MTSAD) models, namely, the necessity for unsupervised anomaly detection that is both efficient and accurate. However, long-term time series data typically encompass a multitude of intricate temporal pattern variations and noise. Consequently, accurately capturing anomalous patterns within such data and establishing precise and rapid anomaly detection models pose challenging problems. In this paper, we propose a decomposition GAN-based transformer for anomaly detection (DGTAD) in multivariate time series data. Specifically, DGTAD integrates a time series decomposition structure into the original transformer model, further decomposing the extracted global features into deep trend information and seasonal information. On this basis, we improve the attention mechanism, which uses decomposed time-dependent features to change the traditional focus of the transformer, enabling the model to reconstruct anomalies of different types in a targeted manner. This makes it difficult for anomalous data to adapt to these changes, thereby amplifying the anomalous features. Finally, by combining the GAN structure and using multiple generators from different perspectives, we alleviate the mode collapse issue, thereby enhancing the model’s generalizability. DGTAD has been validated on nine benchmark datasets, demonstrating significant performance improvements and thus proving its effectiveness in unsupervised anomaly detection.
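The series-decomposition step described above can be sketched as a moving-average trend plus a seasonal residual, as is common in decomposition transformers; the kernel size is an assumption, and DGTAD's attention, reconstruction, and GAN components are omitted.

```python
import torch
import torch.nn as nn

# Minimal sketch of time series decomposition: a moving-average trend and a
# seasonal residual, computed per variable.
class SeriesDecomposition(nn.Module):
    def __init__(self, kernel_size=25):
        super().__init__()
        self.avg = nn.AvgPool1d(kernel_size, stride=1, padding=kernel_size // 2,
                                count_include_pad=False)

    def forward(self, x):                  # x: (batch, length, variables)
        trend = self.avg(x.transpose(1, 2)).transpose(1, 2)   # smooth each variable over time
        seasonal = x - trend                                   # residual after removing the trend
        return trend, seasonal

x = torch.randn(4, 100, 8)                 # an 8-variate series of length 100
trend, seasonal = SeriesDecomposition()(x)
print(trend.shape, seasonal.shape)         # both torch.Size([4, 100, 8])
```

In a reconstruction-based detector such as the one described, the decomposed components would feed the attention blocks, and anomaly scores would come from how poorly anomalous points are reconstructed.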
{"title":"DGTAD: decomposition GAN-based transformer for anomaly detection in multivariate time series data","authors":"Zixin Chen, Jiong Yu, Qiyin Tan, Shu Li, XuSheng Du","doi":"10.1007/s10489-024-05693-7","DOIUrl":"10.1007/s10489-024-05693-7","url":null,"abstract":"<div><p>The advancement of the computer and information industry has led to the emergence of new demands for multivariate time series anomaly detection (MTSAD) models, namely, the necessity for unsupervised anomaly detection that is both efficient and accurate. However, long-term time series data typically encompass a multitude of intricate temporal pattern variations and noise. Consequently, accurately capturing anomalous patterns within such data and establishing precise and rapid anomaly detection models pose challenging problems. In this paper, we propose a decomposition GAN-based transformer for anomaly detection (DGTAD) in multivariate time series data. Specifically, DGTAD integrates a time series decomposition structure into the original transformer model, further decomposing the extracted global features into deep trend information and seasonal information. On this basis, we improve the attention mechanism, which uses decomposed time-dependent features to change the traditional focus of the transformer, enabling the model to reconstruct anomalies of different types in a targeted manner. This makes it difficult for anomalous data to adapt to these changes, thereby amplifying the anomalous features. Finally, by combining the GAN structure and using multiple generators from different perspectives, we alleviate the mode collapse issue, thereby enhancing the model’s generalizability. DGTAD has been validated on nine benchmark datasets, demonstrating significant performance improvements and thus proving its effectiveness in unsupervised anomaly detection.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"13038 - 13056"},"PeriodicalIF":3.4,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-17 | DOI: 10.1007/s10489-024-05818-y
Adekanmi Adegun, Serestina Viriri, Jules-Raymond Tapamo
Automatic classification of remote sensing images using machine learning techniques is challenging due to the complex features of the images. The images are characterized by features such as multi-resolution, heterogeneous appearance and multi-spectral channels. Deep learning methods have achieved promising results in the analysis of remote sensing satellite images in the recent past. However, deep learning methods based on convolutional neural networks (CNNs) experience difficulties in the analysis of intrinsic objects from satellite images. These techniques have not achieved optimum performance in the analysis of remote sensing satellite images due to their complex features, such as coarse resolution, cloud masking, varied sizes of embedded objects and appearance. The receptive fields in convolutional operations are not able to establish long-range dependencies and lack global contextual connectivity for effective feature extraction. To address this problem, we propose an improved deep learning-based vision transformer model for the efficient analysis of remote sensing images. The proposed model incorporates a multi-head local self-attention mechanism with a patch-shifting procedure to provide both local and global context for effective extraction of multi-scale and multi-resolution spatial features of remote sensing images. The model is further enhanced by fine-tuning the hyper-parameters, introducing dropout modules and a linear decay learning rate scheduler. This approach leverages local self-attention for learning and extraction of the complex features in satellite images. Experiments were conducted on four distinct remote sensing image datasets: RSSCN, EuroSat, UC Merced (UCM) and SIRI-WHU. The results show that the proposed vision transformer improves on the CNN-based methods.
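A minimal, hypothetical sketch of a ViT-style classifier with a patch-shift step before multi-head self-attention is shown below; the image size, patch size, and the way shifted patches are merged are assumptions rather than the authors' design.

```python
import torch
import torch.nn as nn

# Hypothetical scene classifier: patch embedding, spatial patch shifting,
# multi-head self-attention, pooling, and a linear classification head.
class ShiftedPatchViT(nn.Module):
    def __init__(self, patch=8, dim=128, heads=4, num_classes=21):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):                                        # x: (B, 3, 64, 64)
        tokens = self.embed(x)                                   # (B, dim, 8, 8) patch grid
        shifted = torch.roll(tokens, shifts=(1, 1), dims=(2, 3)) # shift patches spatially
        tokens = (tokens + shifted).flatten(2).transpose(1, 2)   # (B, 64, dim) token sequence
        out, _ = self.attn(tokens, tokens, tokens)               # multi-head self-attention
        out = self.norm(tokens + out).mean(dim=1)                # residual + mean pooling
        return self.head(out)                                    # (B, num_classes) logits

print(ShiftedPatchViT()(torch.randn(2, 3, 64, 64)).shape)        # torch.Size([2, 21])
```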
{"title":"Automated classification of remote sensing satellite images using deep learning based vision transformer","authors":"Adekanmi Adegun, Serestina Viriri, Jules-Raymond Tapamo","doi":"10.1007/s10489-024-05818-y","DOIUrl":"10.1007/s10489-024-05818-y","url":null,"abstract":"<div><p>Automatic classification of remote sensing images using machine learning techniques is challenging due to the complex features of the images. The images are characterized by features such as multi-resolution, heterogeneous appearance and multi-spectral channels. Deep learning methods have achieved promising results in the analysis of remote sensing satellite images in the recent past. However, deep learning methods based on convolutional neural networks (CNN) experience difficulties in the analysis of intrinsic objects from satellite images. These techniques have not achieved optimum performance in the analysis of remote sensing satellite images due to their complex features, such as coarse resolution, cloud masking, varied sizes of embedded objects and appearance. The receptive fields in convolutional operations are not able to establish long-range dependencies and lack global contextual connectivity for effective feature extraction. To address this problem, we propose an improved deep learning-based vision transformer model for the efficient analysis of remote sensing images. The proposed model incorporates a multi-head local self-attention mechanism with patch shifting procedure to provide both local and global context for effective extraction of multi-scale and multi-resolution spatial features of remote sensing images. The proposed model is also enhanced by fine-tuning the hyper-parameters by introducing dropout modules and a decay linear learning rate scheduler. This approach leverages local self-attention for learning and extraction of the complex features in satellite images. Four distinct remote sensing image datasets, namely RSSCN, EuroSat, UC Merced (UCM) and SIRI-WHU, were subjected to experiments and analysis. The results show some improvement in the proposed vision transformer on the CNN-based methods.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"13018 - 13037"},"PeriodicalIF":3.4,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10489-024-05818-y.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-15 | DOI: 10.1007/s10489-024-05829-9
Huohai Yang, Yi Li, Chao Min, Jie Yue, Fuwei Li, Renze Li, Xiangshu Chu
The micro- and nanopore throats in shale oil reservoirs are finer than those in conventional oil reservoirs and have a larger specific surface area, potentially resulting in a more pronounced crude oil boundary effect. The prediction of recoverable reserves in shale oil reservoirs is influenced by factors such as geological complexity, fracture characteristics, and multiphase flow characteristics. The application of conventional reservoir seepage theories and engineering methods is challenging because of the unique characteristics of shale formations. A novel computational framework is proposed for the prediction of recoverable reserves and the optimization of fracturing parameters by combining machine learning algorithms with causal discovery. Based on the theory of causal inference, the framework discovers the underlying causal relationships in the data, mines its internal laws, and evaluates the causal effects, aiming to build an interpretable machine learning model that better reflects the properties of shale oil reservoirs. Compared with traditional methods, the interpretable machine learning model has outstanding prediction ability, with an R² of 0.94 and an average error as low as 8.57%, which is 5.22% lower than that of traditional methods. Moreover, the maximum prediction error is only 21.84%, which is 25.2% smaller than the maximum error of traditional methods, indicating robust and accurate prediction of recoverable reserves. Furthermore, by integrating particle swarm optimization and TabNet, a fracturing parameter optimization model for shale oil reservoirs is developed. According to an on-site validation, this optimization yields an average increase of 13.45% in recoverable reserves. This study provides an accurate reference for reserve assessment and production design in the exploration and development of shale oil reservoirs.
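The fracturing-parameter optimization step can be illustrated with a toy particle swarm optimization loop over a placeholder objective; in the paper a trained TabNet predictor would supply that objective, so the surrogate function, parameter bounds, and PSO coefficients below are purely illustrative assumptions.

```python
import numpy as np

# Toy PSO over normalized fracturing parameters; surrogate_reserves stands in
# for a trained predictor of recoverable reserves (made up for illustration).
def surrogate_reserves(params):            # params: (n_particles, n_params)
    return -np.sum((params - np.array([0.6, 0.4, 0.7])) ** 2, axis=1)

rng = np.random.default_rng(0)
n, dim, iters = 30, 3, 100
pos = rng.uniform(0, 1, (n, dim))
vel = np.zeros((n, dim))
pbest, pbest_val = pos.copy(), surrogate_reserves(pos)
gbest = pbest[np.argmax(pbest_val)]

for _ in range(iters):
    r1, r2 = rng.uniform(size=(2, n, dim))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, 0, 1)                 # keep parameters in their normalized range
    val = surrogate_reserves(pos)
    improved = val > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], val[improved]
    gbest = pbest[np.argmax(pbest_val)]

print("optimized fracturing parameters:", np.round(gbest, 3))
```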
{"title":"Interpretable fracturing optimization of shale oil reservoir production based on causal inference","authors":"Huohai Yang, Yi Li, Chao Min, Jie Yue, Fuwei Li, Renze Li, Xiangshu Chu","doi":"10.1007/s10489-024-05829-9","DOIUrl":"10.1007/s10489-024-05829-9","url":null,"abstract":"<div><p>The micro- and nanopore throats in shale oil reservoirs are finer than those in conventional oil reservoirs and have a larger specific surface area, potentially resulting in a more pronounced crude oil boundary effect. The prediction of recoverable reserves in shale oil reservoirs is influenced by factors such as geological complexity, fracture characteristics, and multiphase flow characteristics. The application of conventional reservoir seepage theories and engineering methods is challenging because of the unique characteristics of shale formations. A novel computational framework is proposed for the prediction of recoverable reserves and optimization of fracturing parameters by combining machine learning algorithms with causal discovery. Based on the theory of causal inference, the framework discovers the underlying causal relationships of the data, mines the internal laws of the data, and evaluates the causal effects, aiming to build an interpretable machine learning model to better understand the properties of shale oil reservoirs. Compared to traditional methods, the interpretable machine learning model has an outstanding prediction ability, with R<sup>2</sup> of 0.94 and average error as low as 8.57%, which is 5.22% lower than that of traditional methods. Moreover, the maximum prediction error is only 21.84%, which is 25.2% smaller than the maximum error of traditional methods. The prediction robustness is good. An accurate prediction of recoverable reserves can be achieved. Furthermore, by integrating particle swarm optimization and TabNet, a fracturing parameter optimization model for shale oil reservoirs is developed. According to an on-site validation, this optimization results in an average increase of 13.45% in recoverable reserves. This study provides an accurate reference for reserve assessment and production design in the exploration and development of shale oil reservoirs.</p><h3>Graphical Abstract</h3><div><figure><div><div><picture><source><img></source></picture></div></div></figure></div></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"13001 - 13017"},"PeriodicalIF":3.4,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-15 | DOI: 10.1007/s10489-024-05789-0
Yucheng Wu, Shuxin Wang, Xianghua Fu
Short-term forecasting of the Limit Order Book (LOB) is challenging due to market noise. Traditionally, technical analysis using candlestick charts has been effective for market analysis and predictions. Inspired by this, we introduce a novel methodology. First, we preprocess the LOB data into long-term frame data resembling candlestick patterns to reduce noise interference. We then present the Long Short-Term Temporal Fusion Transformer (LSTFT), skillfully integrating both short-term and long-term information to capture complex dependencies and enhance prediction accuracy. Additionally, we propose a Temporal Attention Mechanism (TAM) that effectively distinguishes between long-term and short-term temporal relationships in LOB data. Our experimental results demonstrate the effectiveness of our approach in accurately forecasting the Limit Order Book in the short term.
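The preprocessing idea of aggregating noisy tick-level LOB data into candlestick-like frames can be sketched as follows; the column names, one-second tick spacing, and one-minute frame length are assumptions, not the paper's exact settings.

```python
import numpy as np
import pandas as pd

# Aggregate tick-level mid-prices into OHLC "candlestick" frames to suppress
# market noise before feeding a forecasting model.
ticks = pd.DataFrame(
    {"bid": 100 + np.cumsum(np.random.randn(600)) * 0.01,
     "ask": 100.02 + np.cumsum(np.random.randn(600)) * 0.01},
    index=pd.date_range("2024-01-02 09:30:00", periods=600, freq="s"),
)
ticks["mid"] = (ticks["bid"] + ticks["ask"]) / 2

frames = ticks["mid"].resample("1min").ohlc()   # open / high / low / close per frame
print(frames.head())
```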
{"title":"Long short-term temporal fusion transformer for short-term forecasting of limit order book in China markets","authors":"Yucheng Wu, Shuxin Wang, Xianghua Fu","doi":"10.1007/s10489-024-05789-0","DOIUrl":"10.1007/s10489-024-05789-0","url":null,"abstract":"<div><p>Short-term forecasting of the Limit Order Book (LOB) is challenging due to market noise. Traditionally, technical analysis using candlestick charts has been effective for market analysis and predictions. Inspired by this, we introduce a novel methodology. First, we preprocess the LOB data into long-term frame data resembling candlestick patterns to reduce noise interference. We then present the Long Short-Term Temporal Fusion Transformer (LSTFT), skillfully integrating both short-term and long-term information to capture complex dependencies and enhance prediction accuracy. Additionally, we propose a Temporal Attention Mechanism (TAM) that effectively distinguishes between long-term and short-term temporal relationships in LOB data. Our experimental results demonstrate the effectiveness of our approach in accurately forecasting the Limit Order Book in the short term.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12979 - 13000"},"PeriodicalIF":3.4,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-12 | DOI: 10.1007/s10489-024-05835-x
Yuehua Gan, Qianqian Wang, Zhejun Huang, Lili Yang
Out-of-distribution (OOD) recommendation has emerged as a popular field in recommendation systems. Traditional causal OOD recommendation (COR) frameworks often overlook shifts in latent user features and the interrelations between different user preferences. To address these issues, this paper proposes an innovative framework called Attention-based Causal OOD Recommendation (ABCOR), which applies the attention mechanism in two distinct ways. For shifts in latent user features, variational attention is employed to analyze shift information and refine the interaction-generation process. Besides, ABCOR integrates a multi-head self-attention layer to infer the complex relationships among user preferences and enhance recommendation accuracy before calculating post-intervention interaction probabilities. The proposed method has been validated on two public real-world datasets, and the results demonstrate that ABCOR significantly outperforms the current state-of-the-art COR methods. Codes are available at https://github.com/YaffaGan/ABCOR.
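A minimal sketch of the preference-refinement step is given below, using a multi-head self-attention layer over latent user preference factors before scoring items; ABCOR's variational attention and causal intervention are omitted, and all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Multi-head self-attention over a user's latent preference factors, followed
# by a simple dot-product scoring of candidate items.
class PreferenceSelfAttention(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, prefs):                      # prefs: (B, n_preferences, dim)
        refined, _ = self.attn(prefs, prefs, prefs)
        return self.norm(prefs + refined)          # preference factors aware of each other

prefs = torch.randn(8, 5, 64)                       # 5 latent preference factors per user
items = torch.randn(8, 100, 64)                     # candidate item embeddings
user = PreferenceSelfAttention()(prefs).mean(dim=1) # (8, 64) user representation
scores = torch.sigmoid((items * user.unsqueeze(1)).sum(-1))  # interaction probabilities
print(scores.shape)                                 # torch.Size([8, 100])
```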
{"title":"Attention-based causal representation learning for out-of-distribution recommendation","authors":"Yuehua Gan, Qianqian Wang, Zhejun Huang, Lili Yang","doi":"10.1007/s10489-024-05835-x","DOIUrl":"10.1007/s10489-024-05835-x","url":null,"abstract":"<p>Out-of-distribution (OOD) recommendations have emerged as a popular field in recommendation systems. Traditional causal OOD recommendation frameworks often overlook shifts in latent user features and the interrelations between different user preferences. To address these issues, this paper proposes an innovative framework called Attention-based Causal OOD Recommendation (ABCOR), which applies the attention mechanism in two distinct ways. For shifts in latent user features, variational attention is employed to analyze shift information and refine the interaction-generation process. Besides, ABCOR integrates a multi-head self-attention layer to infer the complex user preference relationship and enhance recommendation accuracy before calculating post-intervention interaction probabilities. The proposed method has been validated on two public real-world datasets, and the results demonstrate that the proposal significantly outperforms the current state-of-the-art COR methods. Codes are available at https://github.com/YaffaGan/ABCOR.</p>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12964 - 12978"},"PeriodicalIF":3.4,"publicationDate":"2024-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-10 | DOI: 10.1007/s10489-024-05788-1
Jing Gao, Yiting Gui, Wen Ji, Jun Wen, Yueyu Zhou, Xiaoxiao Huang, Qiang Wang, Chenlong Wei, Zhong Huang, Chuanlong Wang, Zhu Zhu
An enhanced U-shaped network (EU-Net) based on deep semantic information fusion and edge information guidance is studied to improve the segmentation accuracy of road cracks under hazy conditions. The EU-Net comprises multimode feature fusion, side information fusion and edge extraction modules. The feature and side information fusion modules are applied to fuse deep semantic information with multiscale features. The edge extraction module uses the Canny edge detection algorithm to guide and constrain crack edge information from the neural network. The experimental results show that the method in this work is superior to the most widely used crack segmentation methods. Compared with that of the baseline U-Net, the mIoU of the EU-Net increases by 0.59% and 5.7% on the Crack500 and Masonry datasets, respectively.
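The Canny-based edge guidance can be illustrated as below, deriving a crack-edge map (here from a synthetic mask) that could supervise or constrain an edge branch; the thresholds and the use of a ground-truth-style mask as input are assumptions, not the EU-Net training recipe.

```python
import cv2
import numpy as np

# Build a binary crack-edge map with Canny to act as an edge-guidance target.
mask = np.zeros((256, 256), dtype=np.uint8)        # stand-in crack mask
cv2.line(mask, (20, 30), (230, 200), 255, 3)       # draw a synthetic crack

edges = cv2.Canny(mask, 50, 150)                   # crack-edge map (uint8, 0 or 255)
edge_target = (edges > 0).astype(np.float32)       # supervision signal for an edge module
print(edge_target.shape, int(edge_target.sum()))
```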
{"title":"EU-Net: a segmentation network based on semantic fusion and edge guidance for road crack images","authors":"Jing Gao, Yiting Gui, Wen Ji, Jun Wen, Yueyu Zhou, Xiaoxiao Huang, Qiang Wang, Chenlong Wei, Zhong Huang, Chuanlong Wang, Zhu Zhu","doi":"10.1007/s10489-024-05788-1","DOIUrl":"10.1007/s10489-024-05788-1","url":null,"abstract":"<div><p>An enhanced U-shaped network (EU-Net) based on deep semantic information fusion and edge information guidance is studied to improve the segmentation accuracy of road cracks under hazy conditions. The EU-Net comprises multimode feature fusion, side information fusion and edge extraction modules. The feature and side information fusion modules are applied to fuse deep semantic information with multiscale features. The edge extraction module uses the Canny edge detection algorithm to guide and constrain crack edge information from the neural network. The experimental results show that the method in this work is superior to the most widely used crack segmentation methods. Compared with that of the baseline U-Net, the mIoU of the EU-Net increases by 0.59% and 5.7% on the Crack500 and Masonry datasets, respectively.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12949 - 12963"},"PeriodicalIF":3.4,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-09 | DOI: 10.1007/s10489-024-05764-9
Yongfeng Su, Juhui Zhang, Qiuyue Li
Deep learning-based models have emerged as promising tools for multivariate long-term time series forecasting. These models are finely structured to perform feature extraction from time series, greatly improving the accuracy of multivariate long-term time series forecasting. However, to the best of our knowledge, few scholars have focused their research on preprocessing time series, such as analyzing their periodic distributions or analyzing their values and volatility at the global level. In fact, properly preprocessing time series can often significantly improve the accuracy of multivariate long-term time series forecasting. In this paper, using the cross-variable transformer as a basis, we introduce a statistical characteristics space fusion module to preprocess the time series. This module takes the mean and standard deviation values of the time series during different periods as part of the model's inputs and greatly improves the model's performance. The Statistical Characteristics Space Fusion Module consists of a statistical characteristics space, which represents the mean and standard deviation values of a time series under different periods, and a convolutional neural network, which is used to fuse the original time series with the corresponding mean and standard deviation values. Moreover, to extract the linear dependencies of the time series variables more efficiently, we introduce three different linear projection layers at different nodes of the model, which we call the Multi-level Linear Projection Module. This new methodology, called the SCSformer, includes three innovations. First, we propose a Statistical Characteristics Space Fusion Module, which is capable of calculating the statistical characteristics space of the time series and fusing the original time series with a specific element of the statistical characteristics space as inputs of the model. Second, we introduce a Multi-level Linear Projection Module to capture linear dependencies of time series from different stages of the model. Third, we combine the Statistical Characteristics Space Fusion Module, the Multi-level Linear Projection Module, the Reversible Instance Normalization and the Cross-variable Transformer proposed in Client in a certain order to generate the SCSformer. We test this combination on nine real-world time series datasets and achieve optimal results on eight of them. Our code is publicly available at https://github.com/qiuyueli123/SCSformer.
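A small sketch of the statistical characteristics space idea follows: per-period mean and standard deviation of each variable are computed and fused with the raw series through a 1-D convolution; the period length and the single fusion layer are assumptions standing in for the module's convolutional network.

```python
import torch
import torch.nn as nn

# Compute per-period mean/std of each variable and broadcast them back to the
# original series length, then fuse with the raw series via a 1-D convolution.
def statistical_characteristics(x, period=24):      # x: (B, L, V), L divisible by period
    b, l, v = x.shape
    seg = x.reshape(b, l // period, period, v)
    mean = seg.mean(dim=2, keepdim=True).expand(-1, -1, period, -1).reshape(b, l, v)
    std = seg.std(dim=2, keepdim=True).expand(-1, -1, period, -1).reshape(b, l, v)
    return mean, std

x = torch.randn(2, 96, 7)                            # 7-variate series of length 96
mean, std = statistical_characteristics(x)
fuse = nn.Conv1d(3 * 7, 7, kernel_size=3, padding=1) # fusion CNN (illustrative)
fused = fuse(torch.cat([x, mean, std], dim=-1).transpose(1, 2)).transpose(1, 2)
print(fused.shape)  # torch.Size([2, 96, 7]), ready for the cross-variable transformer
```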
{"title":"SCSformer: cross-variable transformer framework for multivariate long-term time series forecasting via statistical characteristics space","authors":"Yongfeng Su, Juhui Zhang, Qiuyue Li","doi":"10.1007/s10489-024-05764-9","DOIUrl":"10.1007/s10489-024-05764-9","url":null,"abstract":"<div><p>Deep learning-based models have emerged as promising tools for multivariate long-term time series forecasting. These models are finely structured to perform feature extraction from time series, greatly improving the accuracy of multivariate long-term time series forecasting. However, to the best of our knowledge, few scholars have focused their research on preprocessing time series, such as analyzing their periodic distributions or analyzing their values and volatility at the global level. In fact, properly preprocessing time series can often significantly improve the accuracy of multivariate long-term time series forecasting. In this paper, using the cross-variable transformer as a basis, we introduce a statistical characteristics space fusion module to preprocess the time series, this module takes the mean and standard deviation values of the time series during different periods as part of the model’s inputs and greatly improves the model’s performance. The Statistical Characteristics Space Fusion Module consists of a statistical characteristics space, which represents the mean and standard deviation values of a time series under different periods, and a convolutional neural network, which is used to fuse the original time series with the corresponding mean and standard deviation values. Moreover, to extract the linear dependencies of the time series variables more efficiently, we introduce three different linear projection layers at different nodes of the model, which we call the Multi-level Linear Projection Module. This new methodology, called <b>the SCSformer</b>, includes three innovations. First, we propose a Statistical Characteristics Space Fusion Module, which is capable of calculating the statistical characteristics space of the time series and fusing the original time series with a specific element of the statistical characteristics space as inputs of the model. Second, we introduce a Multi-level Linear Projection Module to capture linear dependencies of time series from different stages of the model. Third, we combine the Statistical Characteristics Space Fusion Module, the Multi-level Linear Projection Module, the Reversible Instance Normalization and the Cross-variable Transformer proposed in Client in a certain order to generate the SCSformer. We test this combination on nine real-world time series datasets and achieve optimal results on eight of them. Our code is publicly available at https://github.com/qiuyueli123/SCSformer.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12922 - 12948"},"PeriodicalIF":3.4,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-08 | DOI: 10.1007/s10489-024-05840-0
Bangjun Lei, Qishuai Ding, Weisheng Li, Hao Tian, Lifang Zhou
Siamese visual trackers based on segmentation have garnered considerable attention due to their high accuracy. However, these trackers rely solely on simple classification confidence to distinguish between positive and negative samples (foreground or background), lacking more precise discrimination capabilities for objects. Moreover, the backbone network excels at focusing on local information during feature extraction, failing to capture the long-distance contextual semantics crucial for classification. Consequently, these trackers are highly susceptible to interference during actual tracking, leading to erroneous object segmentation and subsequent tracking failures, thereby compromising robustness. For this purpose, we propose a Siamese visual segmentation and tracking network with a classification-rank loss and classification-aware attention (Siam2C). We design a classification-rank loss (CRL) algorithm to enlarge the margin between positive and negative samples, ensuring that positive samples are ranked higher than negative ones. This optimization enhances the network's ability to learn from positive and negative samples, allowing the tracker to accurately select the object for segmentation and tracking rather than being misled by interfering targets. Additionally, we design a classification-aware attention module (CAM), which employs spatial and channel self-attention mechanisms to capture long-distance dependencies between different positions in the feature map. The module enhances the feature representation capability of the backbone network, providing richer global contextual semantic information for the tracking network's classification decisions. Extensive experiments on the VOT2016, VOT2018, VOT2019, OTB100, UAV123, GOT-10k, DAVIS2016, and DAVIS2017 datasets demonstrate the outstanding performance of Siam2C.
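A hedged sketch of a classification-rank style loss is shown below: every positive-sample score should exceed every negative-sample score by a margin. The pairwise hinge form and the margin value are assumptions; the exact CRL formulation in Siam2C may differ.

```python
import torch
import torch.nn.functional as F

# Pairwise hinge ranking loss: penalize any negative score that comes within
# `margin` of (or exceeds) a positive score.
def classification_rank_loss(pos_scores, neg_scores, margin=0.5):
    diff = neg_scores.unsqueeze(0) - pos_scores.unsqueeze(1) + margin  # (P, N) pairwise gaps
    return F.relu(diff).mean()

pos = torch.tensor([0.9, 0.7, 0.8])      # classification scores of positive samples
neg = torch.tensor([0.2, 0.6, 0.1, 0.4]) # classification scores of negative samples
print(classification_rank_loss(pos, neg))  # about 0.10 for these toy scores
```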
{"title":"Siam2C: Siamese visual segmentation and tracking with classification-rank loss and classification-aware","authors":"Bangjun Lei, Qishuai Ding, Weisheng Li, Hao Tian, Lifang Zhou","doi":"10.1007/s10489-024-05840-0","DOIUrl":"10.1007/s10489-024-05840-0","url":null,"abstract":"<div><p>Siamese visual trackers based on segmentation have garnered considerable attention due to their high accuracy. However, these trackers rely solely on simple classification confidence to distinguish between positive and negative samples (foreground or background), lacking more precise discrimination capabilities for objects. Moreover, the backbone network excels at focusing on local information during feature extraction, failing to capture the long-distance contextual semantics crucial for classification. Consequently, these trackers are highly susceptible to interference during actual tracking, leading to erroneous object segmentation and subsequent tracking failures, thereby compromising robustness. For this purpose, we propose a Siamese visual segmentation and tracking network with classification-rank loss and classification-aware (Siam2C). We design a classification-rank loss (CRL) algorithm to enlarge the margin between positive and negative samples, ensuring that positive samples are ranked higher than negative ones. This optimization enhances the network’s ability to learn from positive and negative samples, allowing the tracker to accurately select the object for segmentation and tracking rather than being misled by interfering targets. Additionally, we design a classification-aware attention module (CAM), which employs spatial and channel self-attention mechanisms to capture long-distance dependencies between different positions in the feature map. The module enhances the feature representation capability of the backbone network, providing richer global contextual semantic information for the tracking network’s classification decisions. Extensive experiments on the VOT2016, VOT2018, VOT2019, OTB100, UAV123, GOT-10k, DAVIS2016, and DAVIS2017 datasets demonstrate the outstanding performance of Siam2C.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12898 - 12921"},"PeriodicalIF":3.4,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}