
Journal of Ambient Intelligence and Humanized Computing: Latest Articles

LW-MHFI-Net: a lightweight multi-scale network for medical image segmentation based on hierarchical feature incorporation
Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2024-06-15, DOI: 10.1007/s12652-024-04820-z
Yasmeen A. Kassem, S. Kishk, Mohamed A. Yakout, Doaa A. Altantawy
Citations: 0
Rf-based fingerprinting for indoor localization: deep transfer learning approach
Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2024-06-14, DOI: 10.1007/s12652-024-04819-6
Rokaya Safwat, Eman Shaaban, S. Al-Tabbakh, Karim Emara
Citations: 0
$\lambda$-possibility-center based MCDM technique on the control of Ganga river pollution under non-linear pentagonal fuzzy environment
Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2024-06-12, DOI: 10.1007/s12652-024-04817-8
Totan Garai
Citations: 0
Attention based: modeling human perception of reflectional symmetry in the wild
Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2024-06-07, DOI: 10.1007/s12652-024-04821-y
Habibie Akbar, Muhammad Munwar Iqbal
Citations: 0
Multiview data fusion technique for missing value imputation in multisensory air pollution dataset
Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2024-06-04, DOI: 10.1007/s12652-024-04816-9
Asif Iqbal Middya, Sarbani Roy

Missing readings from the sensors of air pollution monitoring stations are a common issue, and they can substantially degrade the monitoring and analysis of air pollution data. To address this problem, this paper proposes a multi-view missing value (MV) imputation method called MVDI (Multi-View Data Imputation) for air-pollution-related time series data. MVDI combines four models, namely LSTM (Long Short-Term Memory), IDS (Inverse Distance Squared), SVR (Support Vector Regressor), and KNN (K-Nearest Neighbors), to estimate MVs. These four models are mainly employed to capture the variations in the data from different views of the dataset, where each view represents a different portion (subset) of the actual dataset. The MV estimates from all the views are combined using a kernel function to obtain an overall result. MVDI is evaluated on a real-world air pollution dataset in terms of RMSE, MAE, MAPE, and R². The experimental results show that MVDI outperforms the baseline methods, namely AR (AutoRegressive), ARIMA (AutoRegressive Integrated Moving Average), RFR (Random Forest Regressor), ANN (Artificial Neural Network), LI (Linear Interpolation), NN (Nearest Neighbors), MI (Mean Imputation), CNN (Convolutional Neural Network), and ConvLSTM (Convolutional LSTM).
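The abstract names Inverse Distance Squared (IDS) as one of the four estimators but gives no formula. As a rough illustration only, the sketch below shows generic inverse-distance-squared weighting: a missing reading is estimated as the average of neighboring readings weighted by 1/d². The function name `ids_impute`, the planar station coordinates, and the sample readings are all hypothetical and are not taken from the paper, whose exact formulation may differ.

```python
import math


def ids_impute(target_xy, neighbors):
    """Estimate a missing reading at target_xy from stations with known readings.

    neighbors is a list of ((x, y), value) pairs. Each neighbor is weighted
    by 1 / distance**2, so closer stations dominate the estimate.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in neighbors:
        dist = math.hypot(x - target_xy[0], y - target_xy[1])
        if dist == 0.0:
            # A co-located station: use its reading directly.
            return value
        weight = 1.0 / dist ** 2
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("need at least one neighbor with a known reading")
    return weighted_sum / weight_total


# Hypothetical readings at three stations around a target located at (0, 0).
stations = [((1.0, 0.0), 40.0), ((0.0, 2.0), 60.0), ((2.0, 0.0), 80.0)]
estimate = ids_impute((0.0, 0.0), stations)
print(estimate)
```

In this toy case the nearest station carries four times the weight of each of the two farther ones, which pulls the estimate toward its reading.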

Citations: 0
Road intersection detection using the YOLO model based on traffic signs and road signs
Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2024-06-01, DOI: 10.1007/s12652-024-04815-w
William Eric Manongga, Rung-Ching Chen

A road intersection is an area where more than two roads in different directions connect. It is a point of transition where the driver navigates and makes decisions, which makes it a high-risk area for traffic accidents. Road intersection detection identifies and analyzes road intersections in real time using various technologies and algorithms, and it is an essential part of intelligent transportation systems and autonomous driving. It helps the driver identify an intersection early, make good driving decisions, and avoid accidents. Despite its importance, little research exists on this topic; existing research mainly focuses on detecting and classifying traffic signs, vehicles, and pedestrians. In this research, we propose an algorithm that detects road intersections using an image from the front-facing camera installed on the car as input. We use traffic sign detection to detect seven types of traffic signs that have a high probability of an intersection nearby, and combine it with our novel road intersection detection algorithm to locate the intersection. Our road intersection detection algorithm leverages the relationship between the area of the traffic signs and the location of the intersection. In experiments, the proposed method gives promising results and can detect road intersections from greater distances. It can also perform detection in real time.

Citations: 0
Non intrusive load monitoring using additive time series modeling via finite mixture models aggregation
Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2024-06-01, DOI: 10.1007/s12652-024-04814-x
Soudabeh Tabarsaii, Manar Amayri, Nizar Bouguila, Ursula Eicker

Energy disaggregation, or Non-Intrusive Load Monitoring (NILM), covers methods that aim to distinguish the individual contributions of appliances given the aggregated power signal. In this paper, the application of finite Generalized Gaussian and finite Gamma mixtures to energy disaggregation is proposed and investigated. The procedure approximates the distribution of the sum of two Generalized Gaussian random variables (RVs), and the distribution of the sum of two Gamma RVs, using Method-of-Moments matching. With this procedure, the probability distribution of each combination of appliance consumption is obtained and used to predict and disaggregate the data of a specific device from the aggregated data. Moreover, to make the models more practical, we propose a deep version, which we call DNN-Mixture: a cascade model that combines a deep neural network with each of the proposed mixture models. As part of our extensive evaluation, we apply the proposed models to three datasets from different geographical locations with different sampling rates. The results indicate the superiority of the proposed models over the Gaussian mixture model and other widely used approaches. To investigate the applicability of our models in challenging unsupervised settings, we tested them on unseen houses with unlabeled data. The outcomes demonstrated the extensibility and robustness of the proposed approach. Finally, the evaluation of the cascade model against the state of the art shows that, by benefiting from the advantages of both neural networks and finite mixtures, the cascade model can produce promising results that are competitive with an RNN without suffering from its inherent disadvantages.
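To make the Method-of-Moments step concrete, here is a minimal sketch for the Gamma case, assuming the two appliance loads are independent Gamma RVs in shape/scale form. It matches the mean and variance of the sum to a single Gamma distribution. The function name and example parameters are hypothetical, and the paper's own derivation (including the Generalized Gaussian case) may use different moments or parameterizations.

```python
def gamma_sum_moment_match(k1, theta1, k2, theta2):
    """Approximate Gamma(k1, theta1) + Gamma(k2, theta2) by one Gamma(shape, scale).

    Assumes the two RVs are independent, so means and variances add.
    A Gamma(k, theta) RV has mean k*theta and variance k*theta**2.
    """
    if min(k1, theta1, k2, theta2) <= 0:
        raise ValueError("shape and scale parameters must be positive")
    mean = k1 * theta1 + k2 * theta2
    var = k1 * theta1 ** 2 + k2 * theta2 ** 2
    # Solve shape*scale = mean and shape*scale**2 = var.
    scale = var / mean
    shape = mean ** 2 / var
    return shape, scale


# With equal scales the sum is exactly Gamma, so the match recovers it.
shape_eq, scale_eq = gamma_sum_moment_match(2.0, 3.0, 4.0, 3.0)

# With different scales the result is only an approximation of the true sum.
shape_ne, scale_ne = gamma_sum_moment_match(2.0, 1.0, 3.0, 2.0)
print(shape_eq, scale_eq, shape_ne, scale_ne)
```

The equal-scale case is a useful sanity check: the sum of independent Gamma RVs with a common scale is itself Gamma with the shapes added, so the matched parameters should be shape 6 and scale 3.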

Citations: 0
Suspicious activities detection using spatial–temporal features based on vision transformer and recurrent neural network
Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2024-05-29, DOI: 10.1007/s12652-024-04818-7
Saba Hameed, Javaria Amin, Muhammad Almas Anjum, Muhammad Sharif

Demand for surveillance applications is growing because of the need for safety and security against anomalous events. An anomaly in a video is an event that exhibits unusual behavior. Recognizing such events takes time, but computerized methods may help reduce it and enable efficient prediction. However, accurate anomaly detection remains a challenge due to complex backgrounds, illumination, variations, and occlusion. To handle these challenges, a vision transformer convolutional recurrent neural network method, named the ViT-CNN-RCNN model, is proposed for classifying suspicious activities based on frames and videos. The pre-trained ViT-base-patch16-224-in21k model takes 224 × 224 × 3 video frames as input and splits them into 16 × 16 patches. The ViT-base-patch16-224-in21k has a patch embedding layer, a ViT encoder, a ViT transformer layer with 11 blocks, layer-norm, and a ViT pooler. The ViT model is trained with selected learning parameters, such as 20 training epochs and a batch size of 10, to categorize the input frames into thirteen different classes such as robbery, fighting, shooting, stealing, shoplifting, arrest, arson, abuse, exploiting, road accident, burglary, and vandalism. The CNN-RNN sequential model is designed to process sequential data and contains an input layer, a GRU layer, a GRU-1 layer, and a dense layer. This model is trained with optimal hyperparameters, such as 32 video frame sizes, 30 training epochs, and a batch size of 16, to classify inputs into the corresponding class labels. The proposed model is evaluated on the UNI-crime and UCF-crime datasets. The experimental results show that the proposed approach performs better than recently published works.
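The patch arithmetic implied by a 224 × 224 × 3 input and a 16 × 16 patch size can be checked with a short sketch. The helper names `patch_stats` and `patchify` are hypothetical and written in plain Python for illustration. They are not part of any ViT library, and the toy 4 × 4 single-channel image stands in for a real frame.

```python
def patch_stats(height, width, channels, patch):
    """Return (number of patches, flattened values per patch)."""
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    num_patches = (height // patch) * (width // patch)
    return num_patches, patch * patch * channels


def patchify(image, patch):
    """Split an H x W image (list of rows) into flattened patch x patch tiles.

    Tiles are non-overlapping and returned in row-major order.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles.append([image[r][c]
                          for r in range(top, top + patch)
                          for c in range(left, left + patch)])
    return tiles


# Sizes quoted in the abstract: 224 x 224 x 3 frames, 16 x 16 patches.
num_patches, patch_dim = patch_stats(224, 224, 3, 16)

# Toy 4 x 4 single-channel image split into 2 x 2 patches.
toy = [[r * 4 + c for c in range(4)] for r in range(4)]
tiles = patchify(toy, 2)
print(num_patches, patch_dim, tiles[0])
```

For the quoted sizes this gives a 14 × 14 grid, that is 196 patches of 768 values each, before any embedding layer is applied.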

Citations: 0
Research on badminton take-off recognition method based on improved deep learning
Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2024-05-28, DOI: 10.1007/s12652-024-04809-8
Lu Lianju, Zhang Haiying

Because the take-off in badminton is fast, a single action recognition method cannot identify the action quickly and accurately. Therefore, a new badminton take-off recognition method based on improved deep learning is proposed to capture badminton take-off accurately. Badminton videos are collected, and images of the athletes' activity areas are obtained by tracking the moving targets in badminton competition videos. The static characteristics of the players' take-off actions are extracted from these activity-area images using 3D ConvNets. Based on the human joint points in the player's target tracking image, a human skeleton sequence is constructed using a 2D coordinate pseudo-image and a 2D skeleton data design algorithm, and the dynamic characteristics of the take-off action are extracted from the skeleton sequence using an LSTM (Long Short-Term Memory network). After the static and dynamic features are fused by weighted summation, the fused take-off features are input into a convolutional neural network (CNN) to complete take-off recognition. The CNN pooling layer is improved with adaptive pooling, and network convergence is accelerated with batch normalization to further optimize the recognition results. Experiments show that the human skeleton model can accurately match human movements and assist in extracting action features. The improved CNN greatly improves the accuracy of take-off recognition. When applied to real images, it can accurately identify human movements and judge whether a take-off action occurs.
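The weighted-summation fusion step can be illustrated with a minimal sketch, assuming the static and dynamic feature vectors have already been brought to the same length and that a single scalar weight `alpha` balances them. The abstract does not state the weights or the feature dimensions, so the function name, the value of `alpha`, and the sample vectors are all hypothetical.

```python
def fuse_features(static_feat, dynamic_feat, alpha=0.5):
    """Fuse two equal-length feature vectors by weighted summation.

    Returns alpha * static + (1 - alpha) * dynamic, element by element.
    """
    if len(static_feat) != len(dynamic_feat):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * s + (1.0 - alpha) * d
            for s, d in zip(static_feat, dynamic_feat)]


# Hypothetical 3-dimensional static and dynamic features.
static_feat = [1.0, 2.0, 3.0]
dynamic_feat = [3.0, 2.0, 1.0]
fused = fuse_features(static_feat, dynamic_feat, alpha=0.25)
print(fused)
```

With `alpha=0.25` the dynamic (skeleton) features contribute three times as much as the static ones. In a trained system such a weight would normally be tuned or learned rather than fixed by hand.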

Citations: 0
Cryptography-based location privacy protection in the Internet of Vehicles
Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2024-05-21, DOI: 10.1007/s12652-024-04752-8
George Routis, George Katsouris, Ioanna Roussaki
Citations: 0