
Latest Publications in Cognitive Computation

Category-Aware Siamese Learning Network for Few-Shot Segmentation
IF 5.4 · CAS Tier 3 (Computer Science) · Q1 Computer Science · Pub Date: 2024-05-08 · DOI: 10.1007/s12559-024-10273-5
Hui Sun, Ziyan Zhang, Lili Huang, Bo Jiang, Bin Luo

Few-shot segmentation (FS), which aims to segment an unseen query image based on a few annotated support samples, is an active problem in computer vision and the multimedia field. The core issue of FS is how to leverage the annotated information from the support images to guide query image segmentation. Existing methods mainly adopt a Siamese Convolutional Neural Network (SCNN), which first encodes both support and query images and then uses masked Global Average Pooling (GAP) to facilitate pixel-level representation and segmentation of the query image. However, this pipeline generally fails to fully exploit the category/class-coherent information between support and query images. For the FS task, one can observe that support and query images share the same category information. This inherent property provides an important cue for the FS task, yet previous methods generally fail to exploit it fully. To overcome this limitation, in this paper we propose a novel Category-aware Siamese Learning Network (CaSLNet) to encode both support and query images. The proposed CaSLNet conducts Category Consistent Learning (CCL) for both support and query images and thus achieves more sufficient information communication between them. Comprehensive experimental results on several public datasets demonstrate the advantage of the proposed CaSLNet. Our code is publicly available at https://github.com/HuiSun123/CaSLN.
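As a rough, hedged illustration of the masked Global Average Pooling step that the SCNN pipeline above relies on, the sketch below pools support features over the annotated mask to form a class prototype and compares it against query features; the function names, tensor shapes, and toy inputs are assumptions for illustration, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def masked_gap(support_feat, support_mask):
    """Masked global average pooling: average support features inside the
    annotated foreground mask to obtain a single class prototype.
    support_feat: (B, C, H, W) encoder features of the support image
    support_mask: (B, 1, h, w) binary foreground mask
    """
    mask = F.interpolate(support_mask, size=support_feat.shape[-2:],
                         mode="bilinear", align_corners=False)
    proto = (support_feat * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)  # (B, C)
    return proto

def prototype_similarity(query_feat, proto):
    """Cosine similarity between every query pixel and the support prototype,
    used as a coarse foreground cue for the query image."""
    proto = proto[:, :, None, None]                       # (B, C, 1, 1)
    return F.cosine_similarity(query_feat, proto, dim=1)  # (B, H, W)

# Toy usage with random tensors standing in for encoder outputs.
s_feat, q_feat = torch.randn(2, 256, 32, 32), torch.randn(2, 256, 32, 32)
s_mask = (torch.rand(2, 1, 128, 128) > 0.5).float()
print(prototype_similarity(q_feat, masked_gap(s_feat, s_mask)).shape)  # torch.Size([2, 32, 32])
```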

{"title":"Category-Aware Siamese Learning Network for Few-Shot Segmentation","authors":"Hui Sun, Ziyan Zhang, Lili Huang, Bo Jiang, Bin Luo","doi":"10.1007/s12559-024-10273-5","DOIUrl":"https://doi.org/10.1007/s12559-024-10273-5","url":null,"abstract":"<p>Few-shot segmentation (FS) which aims to segment unseen query image based on a few annotated support samples is an active problem in computer vision and multimedia field. It is known that the core issue of FS is how to leverage the annotated information from the support images to guide query image segmentation. Existing methods mainly adopt Siamese Convolutional Neural Network (SCNN) which first encodes both support and query images and then utilizes the masked Global Average Pooling (GAP) to facilitate query image pixel-level representation and segmentation. However, this pipeline generally fails to fully exploit the category/class coherent information between support and query images. <i>For FS task, one can observe that both support and query images share the same category information</i>. This inherent property provides an important cue for FS task. However, previous methods generally fail to fully exploit it for FS task. To overcome this limitation, in this paper, we propose a novel Category-aware Siamese Learning Network (CaSLNet) to encode both support and query images. The proposed CaSLNet conducts <i>Category Consistent Learning (CCL)</i> for both support images and query images and thus can achieve the information communication between support and query images more sufficiently. Comprehensive experimental results on several public datasets demonstrate the advantage of our proposed CaSLNet. Our code is publicly available at https://github.com/HuiSun123/CaSLN.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":null,"pages":null},"PeriodicalIF":5.4,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140935614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
ChatGPT Needs SPADE (Sustainability, PrivAcy, Digital divide, and Ethics) Evaluation: A Review
IF 5.4 · CAS Tier 3 (Computer Science) · Q1 Computer Science · Pub Date: 2024-05-05 · DOI: 10.1007/s12559-024-10285-1
Sunder Ali Khowaja, Parus Khuwaja, Kapal Dev, Weizheng Wang, Lewis Nkenyereye

ChatGPT is another large language model (LLM) that is widely available to consumers on their devices; owing to its performance and ability to converse effectively, it has gained huge popularity in both the research and industrial communities. Recently, many studies have been published on the effectiveness, efficiency, integration, and sentiments of ChatGPT and other LLMs. In contrast, this study focuses on important aspects that are mostly overlooked, i.e., sustainability, privacy, digital divide, and ethics, and suggests that not only ChatGPT but every subsequent entry in the category of conversational bots should undergo Sustainability, PrivAcy, Digital divide, and Ethics (SPADE) evaluation. This paper discusses in detail the issues and concerns raised over ChatGPT in line with the aforementioned characteristics. We also briefly discuss the recent EU AI Act in light of the SPADE evaluation. We support our hypothesis with preliminary data collection and visualizations, along with hypothesized facts. We also suggest mitigations and recommendations for each of the concerns. Furthermore, we suggest policies and recommendations for the EU AI Act concerning ethics, the digital divide, and sustainability.

{"title":"ChatGPT Needs SPADE (Sustainability, PrivAcy, Digital divide, and Ethics) Evaluation: A Review","authors":"Sunder Ali Khowaja, Parus Khuwaja, Kapal Dev, Weizheng Wang, Lewis Nkenyereye","doi":"10.1007/s12559-024-10285-1","DOIUrl":"https://doi.org/10.1007/s12559-024-10285-1","url":null,"abstract":"<p>ChatGPT is another large language model (LLM) vastly available for the consumers on their devices but due to its performance and ability to converse effectively, it has gained a huge popularity amongst research as well as industrial community. Recently, many studies have been published to show the effectiveness, efficiency, integration, and sentiments of chatGPT and other LLMs. In contrast, this study focuses on the important aspects that are mostly overlooked, i.e. sustainability, privacy, digital divide, and ethics and suggests that not only chatGPT but every subsequent entry in the category of conversational bots should undergo Sustainability, PrivAcy, Digital divide, and Ethics (SPADE) evaluation. This paper discusses in detail the issues and concerns raised over chatGPT in line with aforementioned characteristics. We also discuss the recent EU AI Act briefly in accordance with the SPADE evaluation. We support our hypothesis by some preliminary data collection and visualizations along with hypothesized facts. We also suggest mitigations and recommendations for each of the concerns. Furthermore, we also suggest some policies and recommendations for EU AI policy act concerning ethics, digital divide, and sustainability.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":null,"pages":null},"PeriodicalIF":5.4,"publicationDate":"2024-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
CIL-Net: Densely Connected Context Information Learning Network for Boosting Thyroid Nodule Segmentation Using Ultrasound Images
IF 5.4 · CAS Tier 3 (Computer Science) · Q1 Computer Science · Pub Date: 2024-05-04 · DOI: 10.1007/s12559-024-10289-x
Haider Ali, Mingzhao Wang, Juanying Xie

Thyroid nodule (TYN) is a life-threatening disease commonly observed among adults globally. The application of deep learning in computer-aided diagnosis systems (CADs) for diagnosing thyroid nodules has attracted attention among clinical professionals because of its significant potential for reducing missed diagnoses. However, most techniques for segmenting thyroid nodules rely on U-Net structures or deep convolutional neural networks, which are limited in capturing different context information because of the diversity of shapes and sizes, ambiguous boundaries, and heterogeneous structure of thyroid nodules. To resolve these challenges, we present an encoder-decoder-based architecture (referred to as CIL-Net) for boosting TYN segmentation. The proposed CIL-Net makes three contributions. First, the encoder is established using dense connectivity for efficient feature extraction and the triplet attention block (TAB) for highlighting essential feature maps. Second, we design a feature improvement block (FIB) using dilated convolutions and attention mechanisms to capture global context information and to build robust feature maps between the encoder-decoder branches. Third, we introduce the residual context block (RCB), which leverages residual units (ResUnits) to accumulate context information from the multiple decoder blocks in the decoder branch. We assess the segmentation quality of the proposed method using six different evaluation metrics on two standard TYN datasets (DDTI and TN3K) and demonstrate competitive performance against advanced state-of-the-art methods. We consider that the proposed method advances TYN region localization and segmentation, which rely heavily on an accurate assessment of different context information. This advancement is primarily attributed to the comprehensive incorporation of dense connectivity, TAB, FIB, and RCB, which effectively capture both extensive and intricate contextual details. We anticipate that the approach's reliability and visual explainability make it a valuable tool with the potential to significantly enhance clinical practice by offering reliable predictions that facilitate cognitive and healthcare decision-making.
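The feature improvement block described above combines dilated convolutions with an attention mechanism to capture global context; a minimal sketch of that general idea (the branch count, dilation rates, and squeeze-and-excitation-style gate are illustrative assumptions, not the published CIL-Net layers) could look like this:

```python
import torch
import torch.nn as nn

class FeatureImprovementBlock(nn.Module):
    """Illustrative block: parallel dilated convolutions gather context at
    several receptive-field sizes, and a channel-attention gate reweights
    the fused result (loosely in the spirit of the FIB described above)."""
    def __init__(self, channels: int):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in (1, 2, 4)
        ])
        self.fuse = nn.Conv2d(3 * channels, channels, 1)
        self.attn = nn.Sequential(                # squeeze-and-excitation style gate
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        ctx = self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
        return x + ctx * self.attn(ctx)           # residual connection keeps the input path

feat = torch.randn(1, 64, 56, 56)
print(FeatureImprovementBlock(64)(feat).shape)    # torch.Size([1, 64, 56, 56])
```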

{"title":"CIL-Net: Densely Connected Context Information Learning Network for Boosting Thyroid Nodule Segmentation Using Ultrasound Images","authors":"Haider Ali, Mingzhao Wang, Juanying Xie","doi":"10.1007/s12559-024-10289-x","DOIUrl":"https://doi.org/10.1007/s12559-024-10289-x","url":null,"abstract":"<p>Thyroid nodule (TYN) is a life-threatening disease that is commonly observed among adults globally. The applications of deep learning in computer-aided diagnosis systems (CADs) for diagnosing thyroid nodules have attracted attention among clinical professionals due to their significantly potential role in reducing the occurrence of missed diagnoses. However, most techniques for segmenting thyroid nodules rely on U-Net structures or deep convolutional neural networks, which have limitations in obtaining different context information due to the diversities in the shapes and sizes, ambiguous boundaries, and heterostructure of thyroid nodules. To resolve these challenges, we present an encoder-decoder-based architecture (referred to as CIL-Net) for boosting TYN segmentation. There are three contributions in the proposed CIL-Net. First, the encoder is established using dense connectivity for efficient feature extraction and the triplet attention block (TAB) for highlighting essential feature maps. Second, we design a feature improvement block (FIB) using dilated convolutions and attention mechanisms to capture the global context information and also build up robust feature maps between the encoder-decoder branches. Third, we introduce the residual context block (RCB), which leverages residual units (ResUnits) to accumulate the context information from the multiple blocks of decoders in the decoder branch. We assess the segmentation quality of our proposed method using six different evaluation metrics on two standard datasets (DDTI and TN3K) of TYN and demonstrate competitive performance against advanced state-of-the-art methods. We consider that the proposed method advances the performance of TYN region localization and segmentation, which heavily rely on an accurate assessment of different context information. This advancement is primarily attributed to the comprehensive incorporation of dense connectivity, TAB, FIB, and RCB, which effectively capture both extensive and intricate contextual details. We anticipate that this approach reliability and visual explainability make it a valuable tool that holds the potential to significantly enhance clinical practices by offering reliable predictions to facilitate cognitive and healthcare decision-making.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":null,"pages":null},"PeriodicalIF":5.4,"publicationDate":"2024-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
ConceptGlassbox: Guided Concept-Based Explanation for Deep Neural Networks
IF 5.4 · CAS Tier 3 (Computer Science) · Q1 Computer Science · Pub Date: 2024-05-03 · DOI: 10.1007/s12559-024-10262-8
Radwa El Shawi

Various industries and fields have utilized machine learning models, particularly those that demand a significant degree of accountability and transparency. With the introduction of the General Data Protection Regulation (GDPR), it has become imperative for machine learning model predictions to be both plausible and verifiable. One approach to explaining these predictions involves assigning an importance score to each input element. Another category aims to quantify the importance of human-understandable concepts to explain global and local model behaviours. The way concepts are constructed in such concept-based explanation techniques lacks inherent interpretability. Additionally, the magnitude and diversity of the discovered concepts make it difficult for machine learning practitioners to comprehend and make sense of the concept space. To this end, we introduce ConceptGlassbox, a novel local explanation framework that seeks to learn high-level transparent concept definitions. Our approach leverages human knowledge and feedback to facilitate the acquisition of concepts with minimal human labelling effort. The ConceptGlassbox learns concepts consistent with the user’s understanding of a concept’s meaning. It then dissects the evidence for the prediction by identifying the key concepts the black-box model uses to arrive at its decision regarding the instance being explained. Additionally, ConceptGlassbox produces counterfactual explanations, proposing the smallest changes to the instance’s concept-based explanation that would result in a counterfactual decision as specified by the user. Our systematic experiments confirm that ConceptGlassbox successfully discovers relevant and comprehensible concepts that are important for neural network predictions.

{"title":"ConceptGlassbox: Guided Concept-Based Explanation for Deep Neural Networks","authors":"Radwa El Shawi","doi":"10.1007/s12559-024-10262-8","DOIUrl":"https://doi.org/10.1007/s12559-024-10262-8","url":null,"abstract":"<p>Various industries and fields have utilized machine learning models, particularly those that demand a significant degree of accountability and transparency. With the introduction of the General Data Protection Regulation (GDPR), it has become imperative for machine learning model predictions to be both plausible and verifiable. One approach to explaining these predictions involves assigning an importance score to each input element. Another category aims to quantify the importance of human-understandable concepts to explain global and local model behaviours. The way concepts are constructed in such concept-based explanation techniques lacks inherent interpretability. Additionally, the magnitude and diversity of the discovered concepts make it difficult for machine learning practitioners to comprehend and make sense of the concept space. To this end, we introduce ConceptGlassbox, a novel local explanation framework that seeks to learn high-level transparent concept definitions. Our approach leverages human knowledge and feedback to facilitate the acquisition of concepts with minimal human labelling effort. The ConceptGlassbox learns concepts consistent with the user’s understanding of a concept’s meaning. It then dissects the evidence for the prediction by identifying the key concepts the black-box model uses to arrive at its decision regarding the instance being explained. Additionally, ConceptGlassbox produces counterfactual explanations, proposing the smallest changes to the instance’s concept-based explanation that would result in a counterfactual decision as specified by the user. Our systematic experiments confirm that ConceptGlassbox successfully discovers relevant and comprehensible concepts that are important for neural network predictions.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":null,"pages":null},"PeriodicalIF":5.4,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Enhancing Document-Level Relation Extraction with Attention-Convolutional Hybrid Networks and Evidence Extraction
IF 5.4 · CAS Tier 3 (Computer Science) · Q1 Computer Science · Pub Date: 2024-05-02 · DOI: 10.1007/s12559-024-10269-1
Feiyu Zhang, Ruiming Hu, Guiduo Duan, Tianxi Huang

Document-level relation extraction aims at extracting relations between entities in a document. In contrast to sentence-level extraction, document-level relation extraction requires reasoning over multiple sentences to extract complex relational triples. Recent work has found that adding an auxiliary evidence extraction task and using the extracted evidence sentences to aid prediction can improve the performance of document-level relation extraction; however, these approaches still suffer from inadequate modeling of the interactions between entity pairs. In this paper, drawing on a review of human cognitive processes, we propose a hybrid network, HIMAC, applied to the entity-pair feature matrix, in which a multi-head attention sub-module fuses global entity-pair information along a specific inference path, while a convolution sub-module captures local information from adjacent entity pairs. We then incorporate the contextual interaction information learned by the entity pairs into the relation prediction and evidence extraction tasks. Finally, the extracted evidence sentences are used to further correct the relation extraction results. We conduct extensive experiments on two document-level relation extraction benchmark datasets (DocRED/Re-DocRED), and the experimental results demonstrate that our method achieves state-of-the-art performance (62.84/75.89 F1), confirming its effectiveness.
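A hedged sketch of the hybrid idea described above: a multi-head attention branch models global interactions across all entity pairs while a convolution branch gathers local information from adjacent pairs in the entity-pair feature matrix. The dimensions, module layout, and fusion by simple addition are illustrative assumptions rather than the HIMAC specification.

```python
import torch
import torch.nn as nn

class AttnConvHybrid(nn.Module):
    """Global branch: multi-head attention over all entity pairs.
    Local branch: 2D convolution over the (num_entities x num_entities) pair grid."""
    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.conv = nn.Conv2d(dim, dim, kernel_size=3, padding=1)
        self.norm = nn.LayerNorm(dim)

    def forward(self, pair_feats):
        # pair_feats: (B, N, N, D) features for every head/tail entity pair
        b, n, _, d = pair_feats.shape
        seq = pair_feats.reshape(b, n * n, d)
        glob, _ = self.attn(seq, seq, seq)                     # global interactions
        loc = self.conv(pair_feats.permute(0, 3, 1, 2))        # local neighbourhoods
        loc = loc.permute(0, 2, 3, 1).reshape(b, n * n, d)
        return self.norm(seq + glob + loc).reshape(b, n, n, d)

pairs = torch.randn(2, 10, 10, 128)   # 2 documents, 10 entities each
print(AttnConvHybrid()(pairs).shape)  # torch.Size([2, 10, 10, 128])
```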

{"title":"Enhancing Document-Level Relation Extraction with Attention-Convolutional Hybrid Networks and Evidence Extraction","authors":"Feiyu Zhang, Ruiming Hu, Guiduo Duan, Tianxi Huang","doi":"10.1007/s12559-024-10269-1","DOIUrl":"https://doi.org/10.1007/s12559-024-10269-1","url":null,"abstract":"<p>Document-level relation extraction aims at extracting relations between entities in a document. In contrast to sentence-level correspondences, document-level relation extraction requires reasoning over multiple sentences to extract complex relational triples. Recent work has found that adding additional evidence extraction tasks and using the extracted evidence sentences to help predict can improve the performance of document-level relation extraction tasks, however, these approaches still face the problem of inadequate modeling of the interactions between entity pairs. In this paper, based on the review of human cognitive processes, we propose a hybrid network HIMAC applied to the entity pair feature matrix, in which the multi-head attention sub-module can fuse global entity-pair information on a specific inference path, while the convolution sub-module is able to obtain local information of adjacent entity pairs. Then we incorporate the contextual interaction information learned by the entity pairs into the relation prediction and evidence extraction tasks. Finally, the extracted evidence sentences are used to further correct the relation extraction results. We conduct extensive experiments on two document-level relation extraction benchmark datasets (DocRED/Re-DocRED), and the experimental results demonstrate that our method achieves state-of-the-art performance (62.84/75.89 F1). Experiments demonstrate the effectiveness of the proposed method.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":null,"pages":null},"PeriodicalIF":5.4,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Accurate Prediction of Lysine Methylation Sites Using Evolutionary and Structural-Based Information
IF 5.4 · CAS Tier 3 (Computer Science) · Q1 Computer Science · Pub Date: 2024-05-02 · DOI: 10.1007/s12559-024-10268-2
Md. Easin Arafat, Md. Wakil Ahmad, S. M. Shovan, Towhid Ul Haq, Nazrul Islam, Mufti Mahmud, M. Shamim Kaiser

Methylation is considered one of the most important post-translational modifications (PTMs) of proteins. Plasticity and cellular dynamics are among the many traits regulated by methylation. Currently, methylation sites are identified using experimental approaches; however, these methods are time-consuming and expensive. With computational modelling, methylation sites can be identified quickly and accurately, providing valuable information for further trials and investigation. In this study, we propose a new machine-learning model called MeSEP for predicting methylation sites that incorporates both evolutionary and structure-based information. To build this model, we first extract evolutionary and structural features from the PSSM and SPD2 profiles, respectively. We then employ Extreme Gradient Boosting (XGBoost) as the classification model to predict methylation sites. To address the issue of imbalanced data and bias towards negative samples, we use the SMOTETomek-based hybrid sampling method. MeSEP was validated on an independent test set (ITS) and with 10-fold cross-validation (TCV) using lysine methylation sites. The method achieved an accuracy of 82.9% on the ITS and 84.6% in TCV; precision of 0.92 on the ITS and 0.94 in TCV; area-under-the-curve values of 0.90 on the ITS and 0.92 in TCV; F1 scores of 0.81 on the ITS and 0.83 in TCV; and MCC of 0.67 on the ITS and 0.70 in TCV. MeSEP significantly outperformed previous studies reported in the literature. MeSEP is available as a standalone toolkit, and all its source code is publicly available at https://github.com/arafatro/MeSEP.
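The resampling-plus-boosting pipeline described above (SMOTETomek followed by XGBoost) can be approximated with standard libraries; the sketch below uses a synthetic feature matrix in place of the PSSM/SPD2 features, and the hyperparameters are illustrative guesses rather than the values tuned in the paper.

```python
import numpy as np
from imblearn.combine import SMOTETomek
from sklearn.model_selection import train_test_split
from sklearn.metrics import matthews_corrcoef, f1_score
from xgboost import XGBClassifier

# Synthetic stand-in for PSSM/SPD2-derived feature vectors (imbalanced labels).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 40))
y = (rng.random(2000) < 0.15).astype(int)          # ~15% positive (methylated) sites

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

# Hybrid resampling: SMOTE oversampling + Tomek-link cleaning, on the training split only.
X_bal, y_bal = SMOTETomek(random_state=0).fit_resample(X_tr, y_tr)

clf = XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.1, eval_metric="logloss")
clf.fit(X_bal, y_bal)

pred = clf.predict(X_te)
print("F1:", round(f1_score(y_te, pred), 3), "MCC:", round(matthews_corrcoef(y_te, pred), 3))
```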

{"title":"Accurate Prediction of Lysine Methylation Sites Using Evolutionary and Structural-Based Information","authors":"Md. Easin Arafat, Md. Wakil Ahmad, S. M. Shovan, Towhid Ul Haq, Nazrul Islam, Mufti Mahmud, M. Shamim Kaiser","doi":"10.1007/s12559-024-10268-2","DOIUrl":"https://doi.org/10.1007/s12559-024-10268-2","url":null,"abstract":"<p>Methylation is considered one of the proteins’ most important post-translational modifications (PTM). Plasticity and cellular dynamics are among the many traits that are regulated by methylation. Currently, methylation sites are identified using experimental approaches. However, these methods are time-consuming and expensive. With the use of computer modelling, methylation sites can be identified quickly and accurately, providing valuable information for further trial and investigation. In this study, we propose a new machine-learning model called MeSEP to predict methylation sites that incorporates both evolutionary and structural-based information. To build this model, we first extract evolutionary and structural features from the PSSM and SPD2 profiles, respectively. We then employ Extreme Gradient Boosting (XGBoost) as the classification model to predict methylation sites. To address the issue of imbalanced data and bias towards negative samples, we use the SMOTETomek-based hybrid sampling method. The MeSEP was validated on an independent test set (ITS) and 10-fold cross-validation (TCV) using lysine methylation sites. The method achieved: an accuracy of 82.9% in ITS and 84.6% in TCV; precision of 0.92 in ITS and 0.94 in TCV; area under the curve values of 0.90 in ITS and 0.92 in TCV; F1 score of 0.81 in ITS and 0.83 in TCV; and MCC of 0.67 in ITS and 0.70 in TCV. MeSEP significantly outperformed previous studies found in the literature. MeSEP as a standalone toolkit and all its source codes are publicly available at https://github.com/arafatro/MeSEP.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":null,"pages":null},"PeriodicalIF":5.4,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The Duo of Visual Servoing and Deep Learning-Based Methods for Situation-Aware Disaster Management: A Comprehensive Review
IF 5.4 · CAS Tier 3 (Computer Science) · Q1 Computer Science · Pub Date: 2024-05-01 · DOI: 10.1007/s12559-024-10290-4
Senthil Kumar Jagatheesaperumal, Mohammad Mehedi Hassan, Md. Rafiul Hassan, Giancarlo Fortino

Unmanned aerial vehicles (UAVs) have become essential in disaster management due to their ability to provide real-time situational awareness and support decision-making processes. Visual servoing, a technique that uses visual feedback to control the motion of a robotic system, has been used to improve the precision and accuracy of UAVs in disaster scenarios. The study integrates visual servoing to enhance UAV precision while exploring recent advancements in deep learning. This integration enhances the precision and efficiency of disaster response by enabling UAVs to navigate complex environments, identify critical areas for intervention, and provide actionable insights to decision-makers in real time. It discusses disaster management aspects like search and rescue, damage assessment, and situational awareness, while also analyzing the challenges associated with integrating visual servoing and deep learning into UAVs. This review article provides a comprehensive analysis to offer real-time situational awareness and decision support in disaster management. It highlights that deep learning along with visual servoing enhances precision and accuracy in disaster scenarios. The analysis also summarizes the challenges and the need for high computational power, data processing, and communication capabilities. UAVs, especially when combined with visual servoing and deep learning, play a crucial role in disaster management. The review underscores the potential benefits and challenges of integrating these technologies, emphasizing their significance in improving disaster response and recovery, with possible means of enhanced situational awareness and decision-making.

{"title":"The Duo of Visual Servoing and Deep Learning-Based Methods for Situation-Aware Disaster Management: A Comprehensive Review","authors":"Senthil Kumar Jagatheesaperumal, Mohammad Mehedi Hassan, Md. Rafiul Hassan, Giancarlo Fortino","doi":"10.1007/s12559-024-10290-4","DOIUrl":"https://doi.org/10.1007/s12559-024-10290-4","url":null,"abstract":"<p>Unmanned aerial vehicles (UAVs) have become essential in disaster management due to their ability to provide real-time situational awareness and support decision-making processes. Visual servoing, a technique that uses visual feedback to control the motion of a robotic system, has been used to improve the precision and accuracy of UAVs in disaster scenarios. The study integrates visual servoing to enhance UAV precision while exploring recent advancements in deep learning. This integration enhances the precision and efficiency of disaster response by enabling UAVs to navigate complex environments, identify critical areas for intervention, and provide actionable insights to decision-makers in real time. It discusses disaster management aspects like search and rescue, damage assessment, and situational awareness, while also analyzing the challenges associated with integrating visual servoing and deep learning into UAVs. This review article provides a comprehensive analysis to offer real-time situational awareness and decision support in disaster management. It highlights that deep learning along with visual servoing enhances precision and accuracy in disaster scenarios. The analysis also summarizes the challenges and the need for high computational power, data processing, and communication capabilities. UAVs, especially when combined with visual servoing and deep learning, play a crucial role in disaster management. The review underscores the potential benefits and challenges of integrating these technologies, emphasizing their significance in improving disaster response and recovery, with possible means of enhanced situational awareness and decision-making.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":null,"pages":null},"PeriodicalIF":5.4,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140828458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Towards Efficient Recurrent Architectures: A Deep LSTM Neural Network Applied to Speech Enhancement and Recognition
IF 5.4 · CAS Tier 3 (Computer Science) · Q1 Computer Science · Pub Date: 2024-04-30 · DOI: 10.1007/s12559-024-10288-y
Jing Wang, Nasir Saleem, Teddy Surya Gunawan

Long short-term memory (LSTM) has proven effective in modeling sequential data, but it may struggle to capture long-term temporal dependencies accurately. LSTM plays a central role in speech enhancement by effectively modeling and capturing temporal dependencies in speech signals. This paper introduces a variable-neurons-based LSTM designed to capture long-term temporal dependencies by reducing the neuron representation in layers without loss of data. A skip connection between nonadjacent layers is added to prevent gradient vanishing, and an attention mechanism in these connections highlights important features and spectral components. Our LSTM is inherently causal, making it well suited for real-time processing without relying on future information. Training utilizes combined acoustic feature sets for improved performance, and the models estimate two time-frequency masks: the ideal ratio mask (IRM) and the ideal binary mask (IBM). Comprehensive evaluation using perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) showed that the proposed LSTM architecture delivers enhanced speech intelligibility and perceptual quality. Composite measures covering residual noise distortion (Cbak) and speech distortion (Csig) further substantiated this performance. The proposed model showed a 16.21% improvement in STOI and a 0.69 improvement in PESQ on the TIMIT database; on the LibriSpeech database, STOI and PESQ improved by 16.41% and 0.71 over the noisy mixtures. The proposed LSTM architecture outperforms deep neural networks (DNNs) under different stationary and nonstationary background noise conditions. To train an automatic speech recognition (ASR) system on the enhanced speech, the Kaldi toolkit is used to evaluate word error rate (WER); with the proposed LSTM at the front end, WERs were notably reduced, reaching 15.13% across different noisy backgrounds.
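A minimal sketch of the mask-estimation idea in the abstract above: a causal (unidirectional) LSTM maps noisy spectral frames to an ideal ratio mask that is then applied to the noisy magnitudes. The layer sizes, feature dimensionality, and plain stacked-LSTM structure are assumptions; the paper's variable-neuron layers, skip connections, and attention are not reproduced here.

```python
import torch
import torch.nn as nn

class MaskLSTM(nn.Module):
    """Causal LSTM mask estimator: noisy spectral frames in,
    per-frequency ideal-ratio-mask values in [0, 1] out."""
    def __init__(self, n_freq: int = 257, hidden: int = 512, layers: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(n_freq, hidden, num_layers=layers, batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden, n_freq), nn.Sigmoid())

    def forward(self, noisy_spec):
        # noisy_spec: (B, T, F) magnitude or log-magnitude frames
        h, _ = self.lstm(noisy_spec)
        return self.head(h)                       # (B, T, F) estimated IRM

model = MaskLSTM()
noisy = torch.rand(4, 100, 257)                   # 4 utterances, 100 frames each
irm = model(noisy)
enhanced = irm * noisy                            # apply the mask to the noisy magnitudes
print(enhanced.shape)                             # torch.Size([4, 100, 257])
```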

{"title":"Towards Efficient Recurrent Architectures: A Deep LSTM Neural Network Applied to Speech Enhancement and Recognition","authors":"Jing Wang, Nasir Saleem, Teddy Surya Gunawan","doi":"10.1007/s12559-024-10288-y","DOIUrl":"https://doi.org/10.1007/s12559-024-10288-y","url":null,"abstract":"<p>Long short-term memory (LSTM) has proven effective in modeling sequential data. However, it may encounter challenges in accurately capturing long-term temporal dependencies. LSTM plays a central role in speech enhancement by effectively modeling and capturing temporal dependencies in speech signals. This paper introduces a variable-neurons-based LSTM designed for capturing long-term temporal dependencies by reducing neuron representation in layers with no loss of data. A skip connection between nonadjacent layers is added to prevent gradient vanishing. An attention mechanism in these connections highlights important features and spectral components. Our LSTM is inherently causal, making it well-suited for real-time processing without relying on future information. Training involves utilizing combined acoustic feature sets for improved performance, and the models estimate two time–frequency masks—the ideal ratio mask (IRM) and the ideal binary mask (IBM). Comprehensive evaluation using perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) showed that the proposed LSTM architecture demonstrates enhanced speech intelligibility and perceptual quality. Composite measures further substantiated performance, considering residual noise distortion (Cbak) and speech distortion (Csig). The proposed model showed a 16.21% improvement in STOI and a 0.69 improvement in PESQ on the TIMIT database. Similarly, with the LibriSpeech database, the STOI and PESQ showed improvements of 16.41% and 0.71 over noisy mixtures. The proposed LSTM architecture outperforms deep neural networks (DNNs) in different stationary and nonstationary background noisy conditions. To train an automatic speech recognition (ASR) system on enhanced speech, the Kaldi toolkit is used for evaluating word error rate (WER). The proposed LSTM at the front-end notably reduced WERs, achieving a notable 15.13% WER across different noisy backgrounds.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":null,"pages":null},"PeriodicalIF":5.4,"publicationDate":"2024-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140828347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DFootNet: A Domain Adaptive Classification Framework for Diabetic Foot Ulcers Using Dense Neural Network Architecture
IF 5.4 · CAS Tier 3 (Computer Science) · Q1 Computer Science · Pub Date: 2024-04-29 · DOI: 10.1007/s12559-024-10282-4
Nishu Bansal, Ankit Vidyarthi

Diabetic foot ulcers (DFUs) are a prevalent and serious complication of diabetes, often leading to severe morbidity and even amputation if not diagnosed and managed in time. The increasing prevalence of DFUs poses a significant challenge to healthcare systems worldwide. Accurate and timely classification of DFUs is crucial for effective treatment and the prevention of complications. In this paper, we present “DFootNet”, an innovative and comprehensive classification framework for the accurate assessment of diabetic foot ulcers using a dense neural network architecture. The proposed approach leverages deep learning to automatically extract relevant features from diverse clinical DFU images. The model comprises a multi-layered dense neural network designed to handle the intricate patterns and variations present in different stages and types of DFUs. The network architecture integrates convolutional and fully connected layers, allowing for hierarchical feature extraction and robust feature representation. To evaluate the efficacy of DFootNet, we conducted experiments on a large and diverse dataset of diabetic foot ulcers. Our results show that DFootNet achieves an accuracy of 98.87%, precision of 99.01%, recall of 98.73%, F1-score of 98.86%, and AUC-ROC of 98.13%, outperforming existing methods in distinguishing between ulcer and non-ulcer images. Moreover, the framework provides insight into the decision-making process, offering transparency and interpretability through attention mechanisms that highlight important regions within ulcer images. We also present a comparative analysis of DFootNet's performance against other popular deep learning models, showcasing its robustness and adaptability across various scenarios.
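As a rough, hedged illustration of a convolutional feature extractor followed by densely connected (fully connected) classification layers, in the general spirit of the architecture the abstract describes; the exact DFootNet layout is not given here, so all layer sizes are placeholders.

```python
import torch
import torch.nn as nn

class UlcerClassifier(nn.Module):
    """Toy binary classifier: convolutional feature extraction, then a
    stack of fully connected (dense) layers for ulcer vs. non-ulcer."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(4),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 4 * 4, 256), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(256, 2),                     # ulcer / non-ulcer logits
        )

    def forward(self, x):
        return self.classifier(self.features(x))

print(UlcerClassifier()(torch.randn(8, 3, 224, 224)).shape)  # torch.Size([8, 2])
```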

{"title":"DFootNet: A Domain Adaptive Classification Framework for Diabetic Foot Ulcers Using Dense Neural Network Architecture","authors":"Nishu Bansal, Ankit Vidyarthi","doi":"10.1007/s12559-024-10282-4","DOIUrl":"https://doi.org/10.1007/s12559-024-10282-4","url":null,"abstract":"<p>Diabetic foot ulcers (DFUs) are a prevalent and serious complication of diabetes, often leading to severe morbidity and even amputations if not timely diagnosed and managed. The increasing prevalence of DFUs poses a significant challenge to healthcare systems worldwide. Accurate and timely classification of DFUs is crucial for effective treatment and prevention of complications. In this paper, we present “DFootNet”, an innovative and comprehensive classification framework for the accurate assessment of diabetic foot ulcers using a dense neural network architecture. Our proposed approach leverages the power of deep learning to automatically extract relevant features from diverse clinical DFU images. The proposed model comprises a multi-layered dense neural network designed to handle the intricate patterns and variations present in different stages and types of DFUs. The network architecture integrates convolutional and fully connected layers, allowing for hierarchical feature extraction and robust feature representation. To evaluate the efficacy of DFootNet, we conducted experiments on a large and diverse dataset of diabetic foot ulcers. Our results demonstrate that DFootNet achieves a remarkable accuracy of 98.87%, precision—99.01%, recall—98.73%, F1-score as 98.86%, and AUC-ROC as 98.13%, outperforming existing methods in distinguishing between ulcer and non-ulcer images. Moreover, our framework provides insights into the decision-making process, offering transparency and interpretability through attention mechanisms that highlight important regions within ulcer images. We also present a comparative analysis of DFootNet’s performance against other popular deep learning models, showcasing its robustness and adaptability across various scenarios.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":null,"pages":null},"PeriodicalIF":5.4,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140828536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Novel Method for Human-Vehicle Recognition Based on Wireless Sensing and Deep Learning Technologies
IF 5.4 · CAS Tier 3 (Computer Science) · Q1 Computer Science · Pub Date: 2024-04-18 · DOI: 10.1007/s12559-024-10276-2
Liangliang Lou, Ruyin Cai, Mingan Lu, Mingmin Wang, Guang Chen

Currently, human-vehicle recognition (HVR) methods are applied in road monitoring, congestion control, and safety protection. However, traditional vision-based HVR methods suffer from problems such as high construction cost and low robustness in scenarios with insufficient lighting. For this reason, it is necessary to develop a low-cost and highly robust HVR method for intelligent street light systems (ISLSs). A well-designed HVR method can aid brightness adjustment in ISLSs that operate exclusively at night, facilitating lower power consumption and carbon emissions. This paper proposes a novel wireless-sensing-based human-vehicle recognition (WsHVR) method built on deep learning technologies, which can be applied in ISLSs equipped with a wireless sensor network (WSN). To address the limited recognition ability of wireless sensing technology, a deep feature extraction model that combines multi-scale convolution and an attention mechanism is proposed, in which the received signal strength (RSS) features of road users are extracted by multi-scale convolution. WsHVR integrates an adaptive registration convolutional attention mechanism (ARCAM) for further feature extraction and classification, and the final normalized classification result is obtained by a SoftMax function. Experiments show that the proposed WsHVR outperforms existing methods with an accuracy of 99.07%. The dataset and source code related to the paper have been published at https://github.com/TZ-mx/WiParam and https://github.com/TZ-mx/WsHVR, respectively. The proposed WsHVR method performs strongly in human-vehicle recognition and can provide valuable guidance for the design of intelligent streetlight systems within intelligent transportation systems.
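A hedged sketch of the multi-scale convolution idea applied to 1D RSS sequences, followed by simple attention-weighted pooling over time; the kernel sizes, channel counts, class set, and pooling scheme are illustrative assumptions rather than the WsHVR/ARCAM design.

```python
import torch
import torch.nn as nn

class MultiScaleRSSNet(nn.Module):
    """Parallel 1D convolutions with different kernel sizes extract RSS
    patterns at several temporal scales; attention pooling aggregates them
    before classifying the road user (e.g., pedestrian / vehicle / none)."""
    def __init__(self, n_classes: int = 3, channels: int = 32):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv1d(1, channels, k, padding=k // 2) for k in (3, 7, 15)
        ])
        self.attn = nn.Linear(3 * channels, 1)
        self.fc = nn.Linear(3 * channels, n_classes)

    def forward(self, rss):
        # rss: (B, T) received-signal-strength time series from the WSN nodes
        x = rss.unsqueeze(1)                                              # (B, 1, T)
        feats = torch.cat([torch.relu(b(x)) for b in self.branches], dim=1)  # (B, 3C, T)
        feats = feats.transpose(1, 2)                                     # (B, T, 3C)
        weights = torch.softmax(self.attn(feats), dim=1)                  # attention over time steps
        pooled = (weights * feats).sum(dim=1)                             # (B, 3C)
        return self.fc(pooled)

print(MultiScaleRSSNet()(torch.randn(16, 128)).shape)                     # torch.Size([16, 3])
```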

{"title":"A Novel Method for Human-Vehicle Recognition Based on Wireless Sensing and Deep Learning Technologies","authors":"Liangliang Lou, Ruyin Cai, Mingan Lu, Mingmin Wang, Guang Chen","doi":"10.1007/s12559-024-10276-2","DOIUrl":"https://doi.org/10.1007/s12559-024-10276-2","url":null,"abstract":"<p>Currently, human-vehicle recognition (HVR) method has been applied in road monitoring, congestion control, and safety protection situations. However, traditional vision-based HVR methods suffer from problems such as high construction cost and low robustness in scenarios with insufficient lighting. For this reason, it is necessary to develop a low-cost and high-robust HVR method for intelligent street light systems (ISLS). A well-designed HVR method can aid the brightness adjustment in ISLSs that operate exclusively at night, facilitating lower power consumption and carbon emission. The paper proposes a novel wireless sensing-based human-vehicle recognition (WsHVR) method based on deep learning technologies, which can be applied in ISLSs that assembled with wireless sensor network (WSN). To solve the problem of limited recognition ability of wireless sensing technology, a deep feature extraction model that combines multi-scale convolution and attention mechanism is proposed, in which the received signal strength (RSS) features of road users are extracted by multi-scale convolution. WsHVR integrates an adaptive registration convolutional attention mechanism (ARCAM) to further feature extraction and classification. The final normalized classification result is obtained by SoftMax function. Experiments show that the proposed WsHVR outperforms existing methods with an accuracy of 99.07%. The dataset and source code related to the paper have been published at https://github.com/TZ-mx/WiParam and https://github.com/TZ-mx/WsHVR, respectively. The proposed WsHVR method has high performance in the field of human-vehicle recognition, potentially providing valuable guidance for the design of intelligent streetlight systems in intelligent transportation systems.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":null,"pages":null},"PeriodicalIF":5.4,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140626025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0