Pub Date : 2025-12-10 DOI: 10.1007/s40747-025-02186-z
Yang Lian, Ruizhi Han, Shiyuan Han, Defu Qiu, Jin Zhou
Skin cancer research is essential to finding new treatments and improving survival rates in computer-aided medicine. Within this research, accurate segmentation of skin lesion images is an important step for both early diagnosis and personalized treatment strategies. However, while popular Transformer-based models achieve competitive segmentation results, they often ignore computational complexity and the high cost of training. In this paper, we propose a lightweight multi-scale atrous attention network (MAAN) for skin lesion segmentation. First, we optimize the residual basic block by constructing a dual-path framework with high- and low-resolution paths, which reduces the number of parameters while maintaining effective feature extraction. Second, to better capture the information in skin lesion images and further improve performance, we design an adaptive multi-scale atrous attention (AMAA) module at the final stage of the low-resolution path. Experiments on the ISIC 2017 and ISIC 2018 datasets show that MAAN achieves mIoU of 85.20% and 85.67%, respectively, outperforming the recent MHorNet while using only 0.37M parameters and 0.23 GFLOPs. Additionally, ablation studies demonstrate that the AMAA module can serve as a plug-and-play module that improves the performance of CNN-based methods.
Title : MAAN: multi-scale atrous attention network for skin lesion segmentation
Journal : Complex & Intelligent Systems (IF 5.8)
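The MAAN abstract relies on atrous (dilated) convolution at several scales. The sketch below is only an illustration of that general idea in one dimension, not the authors' AMAA module: the function names, the softmax fusion of branches, and the `branch_scores` parameter are all assumptions made for this example. It shows how a larger dilation rate widens the receptive field without adding kernel weights.

```python
import math


def dilated_conv1d(signal, kernel, dilation):
    """Same-size 1D atrous convolution (cross-correlation) with zero padding."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            # Taps are spaced `dilation` samples apart around position i.
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilation):
    """Span of input samples covered by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_scale_atrous(signal, kernel, dilations, branch_scores):
    """Hypothetical fusion: softmax-weighted sum of branches with different dilations."""
    weights = softmax(branch_scores)
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [
        sum(w * branch[i] for w, branch in zip(weights, branches))
        for i in range(len(signal))
    ]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    print([receptive_field(3, d) for d in (1, 2, 4)])
    print(dilated_conv1d(x, [1, 1, 1], dilation=2))
    print(multi_scale_atrous(x, [1, 1, 1], dilations=[1, 2], branch_scores=[0.0, 0.0]))
```

In a learned model the branch scores would come from an attention sub-network rather than being passed in by hand; here they are fixed inputs so that the arithmetic stays easy to follow.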
Pub Date : 2025-12-09 DOI: 10.1007/s40747-025-02156-5
Hongguan Hu, Jianjun Peng, Zhidong Xiao, Li Guo, Yi Hu, Di Wu
Continuous Sign Language Recognition (CSLR) is fundamental to bridging the communication gap between hearing-impaired individuals and broader society. The primary challenge lies in effectively modeling the complex spatial-temporal dynamic features in sign language videos. Current approaches typically process motion feature extraction and temporal modeling independently, which impedes the unified modeling of action continuity and semantic integrity in sign language sequences. To address these limitations, we propose the Motion-Temporal Calibration Network (MTCNet), a novel framework for continuous sign language recognition that integrates dynamic feature enhancement and temporal calibration. The framework consists of two key modules. First, the Cross-Frame Motion Refinement (CFMR) module implements an inter-frame differential attention mechanism combined with residual learning, enabling precise motion feature modeling and effective enhancement of dynamic information between adjacent frames. Second, the Temporal-Channel Adaptive Recalibration (TCAR) module uses an adaptive convolution kernel design and a dual-branch feature extraction architecture, facilitating joint optimization in both the temporal and channel dimensions. In experimental evaluations, our method demonstrates competitive performance on the widely used PHOENIX-2014 and PHOENIX-2014-T datasets, achieving results comparable to leading unimodal approaches. Moreover, it achieves state-of-the-art performance on the Chinese Sign Language (CSL) dataset. Through comprehensive ablation studies and quantitative analysis, we validate the effectiveness of the proposed method in fine-grained dynamic feature modeling and long-term dependency capture while maintaining computational efficiency.
Title : Motion-temporal calibration network for continuous sign language recognition
Journal : Complex & Intelligent Systems (IF 5.8)
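The CFMR description combines inter-frame differences with attention and a residual connection. The following is a minimal sketch of that general pattern only, assuming per-frame feature vectors and a hand-picked `tanh` gate; the function name and the gating choice are hypothetical and are not taken from the MTCNet paper.

```python
import math


def frame_difference_attention(frames):
    """Enhance each frame's features where they change relative to the next frame.

    frames: list of T feature vectors of equal length.
    Returns a new list of T vectors computed as f + gate * f (residual form).
    """
    last = len(frames) - 1
    enhanced = []
    for t, frame in enumerate(frames):
        # The final frame has no successor, so it is compared with itself.
        nxt = frames[min(t + 1, last)]
        diff = [b - a for a, b in zip(frame, nxt)]
        # Assumed gate: 0 for static features, approaching 1 for large motion.
        gate = [math.tanh(abs(d)) for d in diff]
        # Residual connection keeps the original feature and adds the gated copy.
        enhanced.append([f + g * f for f, g in zip(frame, gate)])
    return enhanced


if __name__ == "__main__":
    video = [[1.0, 2.0], [1.0, 4.0], [1.0, 4.0]]
    for vec in frame_difference_attention(video):
        print(vec)
```

With this gate, features that do not change between adjacent frames pass through unchanged, while changing features are amplified; a trained module would learn the gate from data instead.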
Pub Date : 2025-12-05 DOI: 10.1007/s40747-025-02141-y
Angela Cortecchia, Giovanni Ciatto, Roberto Casadei, Danilo Pianini
Title : FieldVMC: an asynchronous model and platform for self-organising morphogenesis of artificial structures
Journal : Complex & Intelligent Systems (IF 5.8)
Pub Date : 2025-12-05 DOI: 10.1007/s40747-025-02179-y
Ch. Srilakshmi, N. Ramakrishnaiah, E. Laxmi Lydia
The last few years have witnessed a rapid increase in the mortality rate caused by skin cancer. Despite innovation and growth in computer vision and artificial intelligence technologies, the complex shapes, sizes, textural patterns, and ambiguous edges of lesions limit the reliability of existing approaches. Deep learning methods have outperformed traditional approaches; yet the need for better skin-lesion segmentation and ROI-specific feature extraction and learning remains. Moreover, class imbalance must be addressed to avoid skewed learning and prediction. Motivated by these needs, this paper proposes SDENet, a novel and robust semantic segmentation assisted deep ensemble feature learning environment for skin-cancer detection and classification. SDENet targets multi-class skin-cancer classification. To this end, it first performs standard pre-processing followed by the synthetic minority over-sampling technique (SMOTE) to alleviate the class-imbalance problem. It then applies Fuzzy C-means clustering based on the firefly heuristic algorithm to segment the skin lesions (the regions of interest, ROI), followed by ROI-specific deep spatio-textural ensemble feature extraction and fusion (DeS-TEFF). Specifically, SDENet uses the AlexNet and DenseNet121 deep networks together with gray-level co-occurrence matrix (GLCM) feature extraction. Here, AlexNet provides high-dimensional, information-rich features, while DenseNet121 yields a feature set driven by layer-wise learning and feature reuse. The AlexNet, DenseNet121, and GLCM features are horizontally concatenated, and principal component analysis (PCA) feature selection is then applied, which helped to avoid local minima and convergence problems. The selected features are normalized with z-score normalization to avoid over-fitting. Finally, the normalized features are used to train a heterogeneous ensemble classifier comprising SVM, decision tree (DT), Random Forest, Extra Trees, and XGBoost classifiers, which performs the classification. With maximum-voting ensemble classification on the HAM10000 dataset, the model achieved an average accuracy of 98.97%, precision of 99.38%, recall of 98.94%, and an F-measure of 0.99, confirming its superiority over existing approaches for real-time skin cancer diagnosis.
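The abstract names GLCM as the texture descriptor. As a rough illustration of what such a matrix contains, the sketch below counts co-occurring gray levels for one pixel offset and derives three common Haralick-style statistics. The function names, the choice of offset, and the selected statistics are assumptions for this example; the paper's actual GLCM settings are not given in the abstract.

```python
def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Count how often gray level i occurs next to gray level j at offset (dx, dy).

    image: 2D list of integer gray levels in range(levels).
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                matrix[i][j] += 1
                if symmetric:
                    # Also count the pair in the reverse direction.
                    matrix[j][i] += 1
    return matrix


def glcm_features(matrix):
    """Contrast, angular second moment (ASM), and homogeneity of a GLCM."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("GLCM is empty: no pixel pairs at this offset")
    contrast = asm = homogeneity = 0.0
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            p = count / total  # normalized co-occurrence probability
            contrast += (i - j) ** 2 * p
            asm += p * p
            homogeneity += p / (1 + (i - j) ** 2)
    return {"contrast": contrast, "asm": asm, "homogeneity": homogeneity}


if __name__ == "__main__":
    patch = [[0, 0, 1],
             [0, 1, 1]]
    m = glcm(patch, levels=2)
    print(m)
    print(glcm_features(m))
```

In practice the ROI would first be quantized to a small number of gray levels, and statistics from several offsets and directions are usually combined into one texture feature vector.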
Title : Semantic segmentation assisted deep ensemble feature learning model for skin-cancer detection and classification: SDENet
Journal : Complex & Intelligent Systems (IF 5.8)
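The last two stages of the SDENet pipeline described above, z-score normalization and maximum (majority) voting over several classifiers, can be sketched with the standard library alone. This is an illustrative sketch, not the authors' code: the function names, the handling of constant columns, and the tie-breaking behaviour are assumptions, and the class labels in the usage example are placeholders.

```python
from collections import Counter
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Assumption: a constant column (std 0) is divided by 1 instead of 0.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


def majority_vote(predictions):
    """Combine per-classifier label lists into one voted label per sample.

    predictions: one list of labels per classifier, all of equal length.
    Ties go to the label that appears first among the tied votes.
    """
    voted = []
    for sample_votes in zip(*predictions):
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted


if __name__ == "__main__":
    features = [[1.0, 10.0], [3.0, 10.0]]
    print(zscore_columns(features))

    # Placeholder outputs of three hypothetical classifiers on three samples.
    per_classifier = [
        ["mel", "nv", "bkl"],
        ["mel", "bkl", "bkl"],
        ["nv", "nv", "bkl"],
    ]
    print(majority_vote(per_classifier))
```

When such a pipeline is evaluated, the column means and standard deviations are normally estimated on the training split only and then reused for the test split, so that no test information leaks into the normalization.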
Pub Date : 2025-12-05 DOI: 10.1007/s40747-025-02188-x
Nguyen Hoang Vu, Tran Van Duc, Pham Quang Tien, Nguyen Thi Ngoc Anh, Nguyen Tien Dat
Title : A real-time mobile solution for shoe try-on using foot pose estimation and 3D processing techniques
Journal : Complex & Intelligent Systems (IF 5.8)