
Neural Computing and Applications — Latest Publications

Evidential neural network for tensile stress uncertainty quantification in thermoplastic elastomers
Pub Date : 2024-08-15 DOI: 10.1007/s00521-024-10320-0
Alejandro E. Rodríguez-Sánchez

This work presents the use of artificial neural networks (ANNs) with deep evidential regression to model the tensile stress response of a thermoplastic elastomer (TPE) under uncertainty. Three Gaussian noise scenarios were added to a previously published TPE dataset to simulate noise in the stress response. The trained ANN models handled stress–strain data that were not used for training or validation, even in the presence of noise. In all tested ANN scenarios, the predicted uncertainty enclosed the noisy TPE stress-response data within ±3σ. The method was extended to other grades of Hytrel material, where the ANN architectures achieved a coefficient of determination of about 0.9. These results suggest that shallow neural networks, equipped with evidential output layers and trained with an evidential regression loss, can predict, generalize, and simulate noisy tensile stress responses in TPE materials.
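Deep evidential regression in the style referenced here (Amini et al., 2020) has the network output the four parameters (γ, ν, α, β) of a Normal-Inverse-Gamma distribution and trains them with its negative log-likelihood; the aleatoric and epistemic variances then give the ±3σ-style band checked against the noisy data. A minimal numpy-free sketch of that loss and the implied uncertainties (the paper's exact loss, regularizer, and band construction may differ):

```python
import math

def nig_nll(y, gamma, nu, alpha, beta):
    """Negative log-likelihood of target y under a Normal-Inverse-Gamma
    evidential head (Amini et al., 2020). gamma: predicted mean,
    nu > 0: virtual observation count, alpha > 1, beta > 0."""
    omega = 2.0 * beta * (1.0 + nu)
    return (0.5 * math.log(math.pi / nu)
            - alpha * math.log(omega)
            + (alpha + 0.5) * math.log(nu * (y - gamma) ** 2 + omega)
            + math.lgamma(alpha) - math.lgamma(alpha + 0.5))

def uncertainties(nu, alpha, beta):
    """Aleatoric and epistemic variances implied by the NIG parameters."""
    aleatoric = beta / (alpha - 1.0)         # E[sigma^2]: data noise
    epistemic = beta / (nu * (alpha - 1.0))  # Var[mu]: model uncertainty
    return aleatoric, epistemic
```

Predictions far from the target incur a higher loss, while low evidence (small ν, α) inflates the epistemic variance, which is what lets the ±3σ band widen in the noisy scenarios.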

Citations: 0
Traffic sign detection and recognition based on MMS data using YOLOv4-Tiny algorithm
Pub Date : 2024-08-14 DOI: 10.1007/s00521-024-10279-y
Hilal Gezgin, Reha Metin Alkan

Traffic signs are of great importance to driving safety, and recently emerging autonomous vehicles must be able to automatically detect and recognize all road inventory, such as traffic signs. In this study, a method based on a mobile mapping system (MMS) is first proposed for detecting traffic signs and establishing a Turkish traffic sign dataset. Obtaining images from real traffic scenes with the MMS enhances the reliability of the model, and the approach is easy to apply in practice in terms of both cost and suitability for mobile and autonomous systems. Within this framework, YOLOv4-Tiny, an object detection algorithm considered well suited to mobile platforms, is used to detect and recognize traffic signs. Thanks to its simple neural network structure, this algorithm has a lower computational cost than other algorithms and is more suitable for embedded devices; it is also a better option for real-time detection than other approaches. To train the model, a dataset was used consisting partly of images captured with the MMS during realistic field measurements and partly of images obtained from open datasets. Training yielded a mean average precision (mAP) of 98.1%. The trained model was first tested on existing images and then tested in real time in a laboratory environment using a simple fixed web camera. The test results show that the suggested method can improve driving safety by detecting traffic signs quickly and accurately, especially for autonomous vehicles; the proposed method is therefore considered suitable for use in autonomous vehicles.
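YOLOv4-Tiny is a full detection network, but its post-processing stage — confidence filtering followed by non-maximum suppression (NMS) over the predicted boxes — is compact enough to sketch. A hedged numpy illustration (the `(x1, y1, x2, y2)` box format and the thresholds are illustrative assumptions, not the authors' exact pipeline):

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, conf_thr=0.25, iou_thr=0.45):
    """Drop low-confidence boxes, then greedily keep the highest-scoring
    box and suppress any remaining box that overlaps it too strongly."""
    order = [i for i in np.argsort(scores)[::-1] if scores[i] >= conf_thr]
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_thr for j in keep):
            keep.append(i)
    return keep
```

On an embedded device this greedy loop runs over at most a few hundred candidate boxes per frame, which is why the post-processing cost is negligible next to the network forward pass.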

Citations: 0
PRF: deep neural network compression by systematic pruning of redundant filters
Pub Date : 2024-08-14 DOI: 10.1007/s00521-024-10256-5
C. H. Sarvani, Mrinmoy Ghorai, S. H. Shabbeer Basha

In deep neural networks, the filters of convolutional layers play an important role in extracting features from the input. Redundant filters often extract similar features, leading to increased computational overhead and larger model size. To address this issue, a two-step approach is proposed in this paper. First, clusters of redundant filters are identified based on the cosine distance between them using hierarchical agglomerative clustering (HAC). Next, instead of pruning all the redundant filters from every cluster in a single shot, we propose to prune the filters in a systematic manner. To prune the filters, the importance of each cluster and the importance of each filter within its cluster are determined using an ℓ1-norm-based criterion. Then, based on the pruning ratio, filters are pruned systematically from the least important cluster to the most important one. The proposed method showed better results than other clustering-based works. The benchmark datasets CIFAR-10 and ImageNet are used in the experiments. After pruning 83.92% of the parameters from the VGG-16 architecture, an improvement over the baseline is observed. After pruning 54.59% and 49.33% of the FLOPs from ResNet-56 and ResNet-110, respectively, both showed an improvement in accuracy. After pruning 52.97% of the FLOPs, the top-5 accuracy of ResNet-50 drops by only 0.56 on ImageNet.
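The first step — grouping filters whose pairwise cosine distance is small with HAC, then ranking by ℓ1 norm — can be sketched with scipy. The linkage method (`average`) and the distance threshold are illustrative assumptions, not the paper's tuned settings:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_redundant_filters(filters, dist_thr=0.3):
    """filters: (n_filters, k*k*c) flattened conv kernels.
    Returns a cluster label per filter; filters in the same cluster
    point in nearly the same direction (small cosine distance)."""
    Z = linkage(filters, method="average", metric="cosine")
    return fcluster(Z, t=dist_thr, criterion="distance")

def l1_importance(filters):
    """Per-filter importance criterion: the l1 norm of the kernel weights."""
    return np.abs(filters).sum(axis=1)
```

With the labels and importances in hand, one would sort clusters by their total importance and prune low-ℓ1 filters from the least important cluster first, up to the target pruning ratio.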

Citations: 0
A two-stage algorithm for heterogeneous face recognition using Deep Stacked PCA Descriptor (DSPD) and Coupled Discriminant Neighbourhood Embedding (CDNE)
Pub Date : 2024-08-14 DOI: 10.1007/s00521-024-10272-5
Shubhobrata Bhattacharya

Automatic face recognition has made significant progress in recent decades, particularly in controlled environments. However, recognizing faces across different modalities, known as heterogeneous face recognition (HFR), remains challenging because of the modality gap. This paper addresses the HFR problem with a two-stage algorithm. In the first stage, a deep stacked PCA descriptor (DSPD) is introduced to extract domain-invariant features from face images of different modalities. The DSPD applies multiple convolution layers of domain-trained PCA filters, and the features extracted from each layer are concatenated into a final feature representation. Additionally, pre-processing steps are applied to the input images to enhance the prominence of facial edges, making the features more distinctive. The DSPD features can be used directly for recognition with nearest-neighbour algorithms. To further improve recognition robustness, a coupled subspace called coupled discriminant neighbourhood embedding (CDNE) is proposed in the second stage. CDNE is trained with a limited number of data samples and projects DSPD features from different modalities onto a common subspace in which data points of the same subject from different modalities lie close together, while those of different subjects lie apart. This spatial arrangement enhances the recognition of heterogeneous faces using nearest-neighbour algorithms. Experimental results demonstrate the effectiveness of the proposed algorithm in various HFR scenarios, including VIS-NIR, VIS-Sketch, and VIS-Thermal face pairs from the respective databases. The algorithm shows promising performance in addressing the challenges posed by the modality gap, providing a potential solution for accurate and robust heterogeneous face recognition.
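A PCA filter bank of the kind the DSPD stacks can be learned directly from image patches: collect all k×k patches, remove their mean, and take the leading eigenvectors of the patch covariance as convolution kernels. A minimal sketch under PCANet-style assumptions (the patch size, filter count, and function name are illustrative, not the paper's exact construction):

```python
import numpy as np

def learn_pca_filters(images, k=3, n_filters=4):
    """Learn n_filters k-by-k PCA convolution kernels from grayscale images."""
    patches = []
    for img in images:
        h, w = img.shape
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                patches.append(img[i:i + k, j:j + k].ravel())
    X = np.array(patches)
    X = X - X.mean(axis=0)                # remove the mean patch
    cov = X.T @ X / len(X)                # patch covariance, (k*k, k*k)
    vals, vecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_filters]    # leading principal directions
    return top.T.reshape(n_filters, k, k)
```

Because `eigh` returns orthonormal eigenvectors, the learned kernels are mutually orthogonal unit-norm filters, each capturing a dominant direction of patch variation in the training domain.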

Citations: 0
Gene expression clock: an unsupervised deep learning approach for predicting circadian rhythmicity from whole genome expression
Pub Date : 2024-08-14 DOI: 10.1007/s00521-024-10316-w
Aram Ansary Ogholbake, Qiang Cheng

Circadian rhythms are driven by an internal molecular clock that controls physiological and behavioral processes, and disruptions in these rhythms have been associated with health problems. Studying circadian rhythms is therefore crucial for understanding physiology, behavior, and pathophysiology. However, studying circadian rhythms from gene expression data is challenging because time labels are scarce. In this paper, we propose a novel approach to predict the phases of un-timed samples based on a deep neural network (DNN) architecture. This approach addresses two challenges: (1) predicting sample phases and reliably identifying cyclic genes from high-dimensional expression data without relying on conserved circadian genes, and (2) handling datasets with small sample sizes. Our algorithm begins with an initial gene screening that selects candidate cyclic genes using a Minimum Distortion Embedding framework. This stage is followed by greedy layer-wise pre-training of our DNN. Pre-training accomplishes two critical objectives: first, it initializes the hidden layers of our DNN model, enabling them to effectively capture features from the gene profiles with limited samples; second, it provides suitable initial values for essential aspects of the genes' periodic oscillations. Subsequently, we fine-tune the pre-trained network to achieve precise sample phase predictions. Extensive experiments on both animal and human datasets show accurate and robust prediction of both sample phases and cyclic genes. Moreover, based on an Alzheimer’s disease (AD) dataset, we identify a set of hub genes that show significant oscillations in cognitively normal subjects but are disrupted in AD, as well as their potential therapeutic targets.
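The core quantity recovered here — a per-sample circadian phase — is classically estimated by cosinor regression: fit y = M + a·cos(ωt) + b·sin(ωt) by least squares and read the phase from atan2(b, a). A numpy sketch of that classical baseline (not the paper's DNN, just the periodic-oscillation model such methods build on):

```python
import numpy as np

def cosinor_phase(t_hours, y, period=24.0):
    """Least-squares fit of y = M + A*cos(w*t - phi); returns phi in radians.
    Uses the identity A*cos(w*t - phi) = a*cos(w*t) + b*sin(w*t)."""
    w = 2.0 * np.pi / period
    X = np.column_stack([np.ones_like(t_hours),
                         np.cos(w * t_hours),
                         np.sin(w * t_hours)])
    m, a, b = np.linalg.lstsq(X, y, rcond=None)[0]
    return np.arctan2(b, a)  # acrophase of the fitted oscillation
```

The deep approach described above replaces the known time axis `t_hours` with phases inferred jointly across genes, which is exactly what makes the un-timed setting hard.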

Citations: 0
Hybrid-Mode tracker with online SA-LSTM updater
Pub Date : 2024-08-14 DOI: 10.1007/s00521-024-10354-4
Hongsheng Zheng, Yun Gao, Yaqing Hu, Xuejie Zhang

The backbone network and the target template are pivotal factors in the performance of Siamese trackers. However, traditional approaches struggle to eliminate local redundancy and establish global dependencies when learning visual data representations: convolutional neural networks (CNNs) and vision transformers (ViTs), the backbones commonly employed in Siamese trackers, each primarily address only one of these challenges. Furthermore, tracking is a dynamic process, yet many Siamese trackers rely solely on a fixed initial template for target state matching, which often proves inadequate for scenes characterized by target deformation, occlusion, and fast motion. In this paper, we propose a Hybrid-Mode Siamese tracker featuring an online SA-LSTM updater. Distinct learning operators are tailored to the characteristics of different depth levels of the backbone, integrating convolution and transformers into a Hybrid-Mode backbone. This backbone efficiently learns global dependencies among input tokens while minimizing redundant computations in local domains, enriching the features available for target tracking. The online SA-LSTM updater comprehensively integrates spatial–temporal context during tracking, producing dynamic template features with enhanced representations of target appearance. Extensive experiments on multiple benchmark datasets, including GOT-10K, LaSOT, TrackingNet, OTB-100, UAV123, and NFS, demonstrate that the proposed method achieves outstanding performance, running at 35 FPS on a single GPU.
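At the heart of an LSTM-based template updater is the standard LSTM recurrence that carries template state from frame to frame. A minimal numpy sketch of one cell step (the self-attention component, dimensions, and parameter layout are assumptions; this is the generic recurrence, not the paper's SA-LSTM):

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. x: input feature, h/c: previous hidden/cell state.
    W: (4H, D), U: (4H, H), b: (4H,) hold the stacked i, f, o, g gates."""
    z = W @ x + U @ h + b
    H = h.shape[0]
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    i, f, o = sig(z[:H]), sig(z[H:2 * H]), sig(z[2 * H:3 * H])
    g = np.tanh(z[3 * H:])
    c_new = f * c + i * g        # gated cell-state update: keep vs. write
    h_new = o * np.tanh(c_new)   # new hidden state, i.e. template feature
    return h_new, c_new
```

In a tracker, `x` would be the current frame's target feature and `h_new` the dynamic template feature matched against the search region.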

Citations: 0
HMedCaps: a new hybrid capsule network architecture for complex medical images
Pub Date : 2024-08-14 DOI: 10.1007/s00521-024-10147-9
Sumeyra Busra Sengul, Ilker Ali Ozkan

Recognizing and analyzing medical images is crucial for early disease detection and for planning treatment tailored to the patient's individual needs and disease history. Deep learning technologies are widely used in healthcare because they can analyze images rapidly and precisely. However, because every object in a medical image can carry disease information, it is critical to analyze the images with minimal information loss. In this context, the Capsule Network (CapsNet) architecture is an important approach that aims to reduce information loss by storing the locations and properties of objects in images as capsules. However, because CapsNet maintains information about each object in the image, the presence of many objects in complicated images can impair its performance. This work proposes a new model, HMedCaps, to improve the performance of CapsNet. The proposed model builds a deeper, hybrid structure by combining a Residual Block and a FractalNet module in the feature extraction layer: deepening the network increases the number of extracted features and yields richer feature maps, while the skip connections in these modules counteract the vanishing gradient problem that can arise with increasing depth. Furthermore, a new squash function is proposed to make distinctive capsules more prominent by customizing capsule activation. The study was evaluated on the CIFAR10 dataset of complex images, the RFMiD dataset of retinal images, and the Blood Cell Count dataset of blood cell images. When the proposed model was compared with the basic CapsNet and with studies in the literature, performance on complex images improved and more accurate classification results were obtained for medical image analysis. The proposed hybrid HMedCaps architecture has the potential to enable more accurate diagnoses in medical image analysis.
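The standard CapsNet squash that such customized activations build on maps a capsule vector s to a vector with the same direction and a norm compressed into [0, 1): squash(s) = (‖s‖² / (1 + ‖s‖²)) · s/‖s‖. A numpy sketch of this baseline (the paper's new squash function differs; this is the original Sabour et al. form for reference):

```python
import numpy as np

def squash(s, eps=1e-9):
    """Standard capsule squash (Sabour et al., 2017): preserves the
    capsule's direction while mapping its norm into [0, 1)."""
    sq_norm = np.sum(s * s, axis=-1, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm) / np.sqrt(sq_norm + eps)
    return scale * s
```

Long capsule vectors are pushed toward unit length and short ones toward zero, so the output norm can be read as the probability that the entity the capsule represents is present.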

{"title":"HMedCaps: a new hybrid capsule network architecture for complex medical images","authors":"Sumeyra Busra Sengul, Ilker Ali Ozkan","doi":"10.1007/s00521-024-10147-9","DOIUrl":"https://doi.org/10.1007/s00521-024-10147-9","url":null,"abstract":"<p>Recognizing and analyzing medical images is crucial for disease early detection and treatment planning with appropriate treatment options based on the patient's individual needs and disease history. Deep learning technologies are widely used in the field of healthcare because they can analyze images rapidly and precisely. However, because each object on the image has the potential to hold illness information in medical images, it is critical to analyze the images with minimal information loss. In this context, Capsule Network (CapsNet) architecture is an important approach that aims to reduce information loss by storing the location and properties of objects in images as capsules. However, because CapsNet maintains information on each object in the image, the existence of several objects in complicated images can impair CapsNet's performance. This work proposes a new model called HMedCaps to improve the performance of CapsNet. In the proposed model, it is aimed to develop a deeper and hybrid structure by using Residual Block and FractalNet module together in the feature extraction layer. While it is aimed to obtain rich feature maps by increasing the number of features extracted by deepening the network, it is aimed to prevent the vanishing gradient problem that may occur in the network with increasing depth with these modules with skip connections. Furthermore, a new squash function is proposed to make distinctive capsules more prominent by customizing capsule activation. The CIFAR10 dataset of complex images, RFMiD dataset of retinal images, and Blood Cell Count Dataset dataset of blood cell images were used to evaluate the study. 
When the proposed model was compared with the basic CapsNet and studies in the literature, it was observed that the performance in complex images was improved and more accurate classification results were obtained in the field of medical image analysis. The proposed hybrid HMedCaps architecture has the potential to make more accurate diagnoses in the field of medical image analysis.</p>","PeriodicalId":18925,"journal":{"name":"Neural Computing and Applications","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
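The customized squash function itself is not given in the abstract; as a point of reference, the standard CapsNet squash that HMedCaps modifies can be sketched in a few lines (NumPy; the array shapes are illustrative assumptions):

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Standard CapsNet squash: shrinks each capsule vector's norm
    # into [0, 1) while preserving its orientation, so the length
    # can be read as an existence probability.
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * s / np.sqrt(sq_norm + eps)

caps = np.array([[3.0, 4.0], [0.1, 0.0]])  # two 2-D capsules
out = squash(caps)
print(np.linalg.norm(out, axis=-1))  # norms lie in [0, 1)
```

Any replacement squash, such as the one proposed here, typically keeps this bounded-norm property while reshaping how sharply long and short capsules are separated.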
Citations: 0
YOLOv7 for brain tumour detection using morphological transfer learning model
Pub Date : 2024-08-12 DOI: 10.1007/s00521-024-10246-7
Sanat Kumar Pandey, Ashish Kumar Bhandari

Accurate diagnosis of a brain tumour in its early stages is required to improve a cancer patient's chances of survival. Owing to the structural complexity of the brain, diagnosing brain tumours at an initial stage with common manual approaches is difficult and tedious for neurologists and radiologists. To improve diagnostic performance, computer-aided diagnosis (CAD) systems built on artificial intelligence have been developed. In this manuscript, we analyse various CAD-based approaches and design a modern approach that applies transfer learning on top of deep learning to magnetic resonance imaging (MRI). Specifically, we apply transfer learning with the object detection model YOLO (You Only Look Once) and analyse the MRI dataset with several modified versions of YOLO. Based on this analysis, we propose an object detection model that combines a modified YOLOv7 with a morphological filtering approach to reach an efficient and accurate diagnosis. To maximise accuracy, we compare the YOLOv7 variants and find that the proposed model using the YOLOv7-E6E detection technique achieves the best performance, with precision, recall, F1, and mAP@50 of 1, 0.92, 0.958333, and 0.974, respectively. Introducing the morphological filtering step before object detection improves mAP@50 to 0.992. The complete analysis uses the BraTS 2021 dataset, which contains brain MR images from the RSNA-MICCAI brain tumour radiogenomic competition; the complete dataset is labelled using the online tool MakeSense AI.
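The abstract reports that adding a morphological filtering step before detection lifts mAP@50 from 0.974 to 0.992, but does not specify the filter. A minimal sketch of one common choice — grey-scale opening, which suppresses small bright speckles while preserving larger structures — might look like this (SciPy; the synthetic image and 3×3 kernel are illustrative assumptions, not the paper's settings):

```python
import numpy as np
from scipy import ndimage

def morphological_preprocess(img, size=3):
    # Grey-scale opening (erosion followed by dilation): removes
    # bright speckles smaller than the structuring element while
    # keeping larger regions (e.g. a tumour) intact.
    return ndimage.grey_opening(img, size=(size, size))

# Synthetic MRI-like slice: one large bright blob plus salt noise.
img = np.zeros((64, 64))
img[20:40, 20:40] = 1.0   # "tumour" region
img[5, 5] = 1.0           # isolated speckle
clean = morphological_preprocess(img)
print(clean[5, 5], clean[30, 30])  # speckle suppressed, blob kept
```

The cleaned image would then be passed to the detector in place of the raw slice.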
Citations: 0
Enhancing CNN model classification performance through RGB angle rotation method
Pub Date : 2024-08-12 DOI: 10.1007/s00521-024-10232-z
Yahya Dogan, Cuneyt Ozdemir, Yılmaz Kaya

In recent years, convolutional neural networks (CNNs) have significantly advanced the field of computer vision by automatically extracting features from image data. CNNs model complex and abstract image features using learnable filters, eliminating the need for manual feature extraction. Combining feature maps obtained from CNNs with different approaches can lead to more complex and interpretable inferences, thereby enhancing model performance and generalizability. In this study, we propose a new method called RGB angle rotation to effectively obtain feature maps from RGB images. Our method rotates the color channels at different angles and uses the angle information between channels to generate new feature maps. We then investigate the effects of integrating models trained with these feature maps into an ensemble architecture. Experimental results on the CIFAR-10 dataset show that using the proposed method in the ensemble model increases performance by 9.10% and 8.42% for the B and R channels, respectively, compared to the original model, while the effect of the G channel is very limited. For the CIFAR-100 dataset, the proposed method yielded a 17.09% improvement in ensemble model performance for the R channel, a 5.06% increase for the B channel, and no significant improvement for the G channel compared to the original model. Additionally, we compared our method with traditional feature extraction methods such as the scale-invariant feature transform and local binary patterns and observed higher performance. In conclusion, the proposed RGB angle rotation method has a significant impact on model performance.

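The abstract leaves the exact rotation mechanics open. One hypothetical reading — rotating each channel spatially by its own angle and deriving per-pixel angle maps between channel pairs — can be sketched as follows (the angles, channel pairings, and the `rgb_angle_rotation` name are all assumptions for illustration, not the paper's implementation):

```python
import numpy as np
from scipy import ndimage

def rgb_angle_rotation(img, angles=(0, 15, 30), eps=1e-8):
    # Hypothetical sketch: rotate each channel by its own angle
    # (bilinear interpolation, same output size), then encode the
    # per-pixel angle between channel pairs as new feature maps.
    rotated = np.stack(
        [ndimage.rotate(img[..., c], angles[c], reshape=False, order=1)
         for c in range(3)], axis=-1)
    r, g, b = rotated[..., 0], rotated[..., 1], rotated[..., 2]
    feats = np.stack([np.arctan2(r, g + eps),
                      np.arctan2(g, b + eps),
                      np.arctan2(b, r + eps)], axis=-1)
    return rotated, feats

img = np.random.rand(32, 32, 3).astype(np.float32)
rotated, feats = rgb_angle_rotation(img)
print(rotated.shape, feats.shape)
```

Models trained on such angle-derived maps would then be combined in an ensemble, as the abstract describes.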
Citations: 0
Privacy-preserving hierarchical federated learning with biosignals to detect drowsiness while driving
Pub Date : 2024-08-12 DOI: 10.1007/s00521-024-10282-3
Sergio López Bernal, José Manuel Hidalgo Rogel, Enrique Tomás Martínez Beltrán, Mario Quiles Pérez, Gregorio Martínez Pérez, Alberto Huertas Celdrán

In response to the global safety concern of drowsiness during driving, the European Union requires that new vehicles integrate detection systems compliant with the General Data Protection Regulation. To identify drowsiness patterns while preserving drivers' data privacy, recent literature has combined Federated Learning (FL) with different biosignals, such as facial expressions, heart rate, electroencephalography (EEG), or electrooculography (EOG). However, existing solutions are unsuitable for drowsiness detection where heterogeneous stakeholders want to collaborate at different levels while guaranteeing data privacy. Works are lacking that evaluate the benefits of using Hierarchical FL (HFL) with EEG and EOG biosignals and that compare HFL against traditional FL and Machine Learning (ML) approaches for detecting drowsiness at the wheel while ensuring data confidentiality. Thus, this work proposes a flexible framework for drowsiness identification using HFL, FL, and ML over EEG and EOG data. To validate the framework, this work defines a scenario of three transportation companies aiming to share data from their drivers without compromising confidentiality, defining a two-level hierarchical structure. This study presents three incremental Use Cases (UCs) to assess detection performance: UC1) intra-company FL, yielding 77.3% accuracy while ensuring the privacy of individual drivers' data; UC2) inter-company FL, achieving 71.7% accuracy for known drivers and 67.1% for new subjects, ensuring data confidentiality between companies but not within an organization; and UC3) inter-company HFL, which ensured comprehensive data privacy both within and between companies, with an accuracy of 71.9% for training subjects and 65.5% for new subjects.
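The two-level hierarchy described for UC3 amounts to two nested FedAvg steps: each company first averages its drivers' models, and a global server then averages the company models. A minimal sketch, with hypothetical model vectors and sample counts (not the paper's implementation):

```python
import numpy as np

def fedavg(weights, counts):
    # Weighted average of model parameter vectors (FedAvg),
    # weighting each model by its number of training samples.
    counts = np.asarray(counts, dtype=float)
    return np.average(np.stack(weights), axis=0,
                      weights=counts / counts.sum())

def hierarchical_round(companies):
    # One HFL round: aggregate per company, then globally.
    # `companies` maps name -> list of (model_vector, n_samples).
    company_models, company_counts = [], []
    for drivers in companies.values():
        models = [m for m, _ in drivers]
        counts = [n for _, n in drivers]
        company_models.append(fedavg(models, counts))
        company_counts.append(sum(counts))
    return fedavg(company_models, company_counts)

companies = {
    "A": [(np.array([1.0, 0.0]), 10), (np.array([3.0, 0.0]), 30)],
    "B": [(np.array([0.0, 4.0]), 40)],
}
global_model = hierarchical_round(companies)
print(global_model)  # -> [1.25 2.  ]
```

Weighting both levels by sample counts makes the hierarchical average identical to a flat FedAvg over all drivers, while raw driver updates never leave their company's server.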
Citations: 0