Journal of Ambient Intelligence and Humanized Computing: Latest Articles

Ensemble deep learning for high-precision classification of 90 rice seed varieties from hyperspectral images
Zone 3 (Computer Science) | Q1 Computer Science | Pub Date: 2024-04-05 | DOI: 10.1007/s12652-024-04782-2
AmirMasoud Taheri, Hossein Ebrahimnezhad, Mohammadhossein Sedaaghi

To develop rice varieties with better nutritional qualities, it is important to classify rice seeds accurately. Hyperspectral imaging can be used to extract spectral information from rice seeds, which can then be used to classify them into different varieties. Precise classification becomes more challenging when there are many classes and few training samples. In this paper, we present a novel method for high-precision Hyperspectral Image (HSI) classification of 90 different classes of rice seeds using ensemble deep learning. Our method first employs band selection techniques to select the optimal hyperspectral bands for rice seed classification. Then, a deep neural network is trained with the selected hyperspectral and RGB data from rice seed images to obtain different models for different bands. Finally, an ensemble of deep learning models is employed to classify rice seed images and improve classification accuracy. The proposed method achieves an overall precision ranging from 92.73% to 96.17% with only 15 selected hyperspectral bands, despite the large number of classes and the few data samples for each class. This precision is significantly higher than that of state-of-the-art classical machine learning methods such as random forest, confirming the effectiveness of the proposed method in classifying hyperspectral images of rice seeds.
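The final ensembling step can be illustrated with a minimal sketch. It assumes soft voting (averaging the per-class probabilities of several models); the abstract does not say which combination rule the authors use, and the function name `soft_vote` and the example probabilities are hypothetical.

```python
def soft_vote(per_model_probs):
    """Average class probabilities across models; return (label, averaged_probs).

    per_model_probs: one probability vector per model (hypothetical outputs of
    networks trained on different band subsets). Soft voting is an assumption,
    not the combination rule confirmed by the abstract.
    """
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    avg = [sum(p[c] for p in per_model_probs) / n_models for c in range(n_classes)]
    label = max(range(n_classes), key=lambda c: avg[c])
    return label, avg


# Three hypothetical models scoring one seed over three varieties.
probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
]
label, avg = soft_vote(probs)
```

In this toy case the first model alone would pick class 0, while the averaged vote picks class 1, which is the kind of correction an ensemble is meant to provide.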

Citations: 0
Software security with natural language processing and vulnerability scoring using machine learning approach
Zone 3 (Computer Science) | Q1 Computer Science | Pub Date: 2024-04-03 | DOI: 10.1007/s12652-024-04778-y
Birendra Kumar Verma, Ajay Kumar Yadav

As software gets more complicated, diverse, and crucial to people’s daily lives, exploitable software vulnerabilities constitute a major security risk to the computer system. These vulnerabilities allow unauthorized access, which can cause losses in banking, energy, the military, healthcare, and other key infrastructure systems. Most vulnerability scoring methods employ Natural Language Processing to generate models from descriptions. These models ignore Impact scores, Exploitability scores, Attack Complexity and other statistical features when scoring vulnerabilities. A feature vector for machine learning models is created from a description, impact score, exploitability score, attack complexity score, etc. We score vulnerabilities more precisely than we categorize them. The Decision Tree Regressor, Random Forest Regressor, AdaBoost Regressor, K-nearest Neighbors Regressor, and Support Vector Regressor have been evaluated using the metrics explained variance, r-squared, mean absolute error, mean squared error, and root mean squared error. The tenfold cross-validation method verifies regressor test results. The research uses 193,463 Common Vulnerabilities and Exposures from the National Vulnerability Database. The Random Forest regressor performed well on four of the five criteria, and the tenfold cross-validation test performed even better (0.9968 vs. 0.9958).
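The five regression metrics named in the abstract can be written out in a few lines. This is a minimal sketch using their standard definitions; the function name `regression_metrics` and the toy scores are hypothetical and are not taken from the paper's data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute explained variance, R^2, MAE, MSE and RMSE from two equal-length lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    var_true = sum((t - mean_true) ** 2 for t in y_true) / n
    mean_err = sum(errors) / n
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    return {
        "explained_variance": 1.0 - var_err / var_true,  # ignores a constant bias
        "r2": 1.0 - mse / var_true,                      # penalises a constant bias
        "mae": mae,
        "mse": mse,
        "rmse": rmse,
    }


# Hypothetical severity scores: every prediction is 0.5 too high.
metrics = regression_metrics([2.0, 4.0, 6.0, 8.0], [2.5, 4.5, 6.5, 8.5])
```

With a constant offset, explained variance stays at 1.0 while R-squared drops to 0.95, which is why reporting both is informative.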

Citations: 0
Existence of fixed points in soft metric spaces with application to boundary value problem
Zone 3 (Computer Science) | Q1 Computer Science | Pub Date: 2024-04-03 | DOI: 10.1007/s12652-024-04772-4
Vishal Gupta, Aanchal Gondhi

In this paper, we have proved fixed point results for a pair of soft fuzzy maps in complete ordered soft metric spaces. We have also given some useful corollaries to our main result along with examples. Moreover, the application is also presented in this communication to show the validity of new results.

Citations: 0
Static video summarization with multi-objective constrained optimization
Zone 3 (Computer Science) | Q1 Computer Science | Pub Date: 2024-04-01 | DOI: 10.1007/s12652-024-04777-z

Abstract

Video summarization is an emerging research field. In particular, static video summarization plays a major role in the abstraction and indexing of video repositories. It extracts the vital events in a video so that the summary covers the entire content of the video. Frames containing those important events are called keyframes, which are eventually used in video indexing. Summarization also gives an abstract view of the video content, so that internet users are aware of the events present in the video before watching it completely. The proposed research work focuses on efficient static video summarization by extracting various visual features, namely color, texture and shape features. These features are aggregated and clustered using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. To produce a good video summary by clustering, the parameters of the DBSCAN algorithm are optimized using a population-based metaheuristic called the Artificial Algae Algorithm (AAA). The experimental results on two public datasets, VSUMM and OVP, show that the proposed Static Video Summarization with Multi-objective Constrained Optimization (SVS_MCO) achieves better results than existing methods.
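The clustering step can be sketched as follows. This is a plain textbook DBSCAN over per-frame feature vectors, followed by picking the frame nearest each cluster centroid as the keyframe. The keyframe rule, the 2-D toy features and the hand-set `eps` and `min_pts` are assumptions for illustration; the paper tunes the DBSCAN parameters with AAA, which is not reproduced here.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # noise reachable from a core point becomes a border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reach)
    return labels


def keyframes(features, labels):
    """Pick, per cluster, the frame index closest to the cluster centroid (assumed rule)."""
    picks = []
    dim = len(features[0])
    for c in sorted(set(labels) - {-1}):
        members = [i for i, lab in enumerate(labels) if lab == c]
        centroid = [sum(features[i][d] for i in members) / len(members) for d in range(dim)]
        picks.append(min(members, key=lambda i: math.dist(features[i], centroid)))
    return picks


# Hypothetical 2-D features for seven frames: two scenes and one outlier frame.
frames = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0),
          (10.0, 10.0), (10.1, 10.0), (10.2, 10.0)]
labels = dbscan(frames, eps=0.15, min_pts=2)
summary = keyframes(frames, labels)
```

The outlier frame is labelled as noise and contributes no keyframe, which is the main reason to prefer a density-based method over one that forces every frame into a cluster.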

Citations: 0
An intelligent auction-based capacity allocation algorithm in shared railways
Zone 3 (Computer Science) | Q1 Computer Science | Pub Date: 2024-03-30 | DOI: 10.1007/s12652-024-04773-3
Mohsen Shahmohammadi, M. Fakhrzad, H. H. Nasab, S. F. Ghannadpour
Citations: 0
Visualization of movements in sports training based on multimedia information processing technology
Zone 3 (Computer Science) | Q1 Computer Science | Pub Date: 2024-03-28 | DOI: 10.1007/s12652-024-04767-1
Yanle Li

The rapid development of multimedia information processing technology provides opportunities for digitization in sports. Motion capture, as the latest achievement of this technology, has gradually gained the attention of scholars and has started to be used for the visualization of sports movements. This paper therefore introduces a monocular video motion capture method and optimizes it to address problems in reconstructed human movements such as floating, ground penetration and sliding. This provides a technical path for the specific application of motion capture technology in sports training and a technical guarantee for the visualization of sports training movements. A new motion capture optimization method is introduced. The method captures human motion trajectories from monocular videos, and its trajectory operations combine human pose estimation with physical constraints. The proposed method uses foot contact judgment to obtain foot contact events for each motion frame. It then optimizes the overall body motion trajectory of the key points based on the obtained contact conditions, making the generated motion visually closer to reality. This article proposes LiteHumanPose Net, with an inference speed of up to 22 FPS, and compares it experimentally with several popular pose estimation methods, such as SimpleBaseline, HRNet, and Hourglass Net, in terms of frame rate and average accuracy. LiteHumanPose Net outperforms Hourglass Net in both frame rate and accuracy, while HRNet has high accuracy owing to its many parameters but a low frame rate. The LiteHumanPose network proposed in this article strikes a good balance between accuracy and frame rate and has clear advantages for practical deployment.
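The foot contact judgment and the subsequent correction can be illustrated with a minimal sketch. It assumes a simple rule (a foot is in contact when it is both low and nearly stationary vertically) and a simple correction (snap contact frames to the ground and clamp the rest above it). The abstract does not give the authors' actual rule or optimizer, so the thresholds, units and function names below are hypothetical.

```python
def foot_contacts(heights, height_thresh=0.05, speed_thresh=0.02):
    """Return one boolean per frame: True when the foot is judged to be on the ground.

    heights: per-frame foot height above the ground plane (assumed to be in metres).
    The threshold rule is an illustrative assumption, not the paper's method.
    """
    contacts = []
    for t, h in enumerate(heights):
        prev = heights[t - 1] if t > 0 else h
        speed = abs(h - prev)  # vertical displacement since the previous frame
        contacts.append(h < height_thresh and speed < speed_thresh)
    return contacts


def snap_to_ground(heights, contacts):
    """Set contact frames to height 0 (removes floating) and clamp all
    other frames to be non-negative (removes ground penetration)."""
    return [0.0 if c else max(h, 0.0) for h, c in zip(heights, contacts)]


# Hypothetical foot heights: slight floating at the start, penetration at the end.
heights = [0.01, 0.02, 0.20, 0.30, 0.03, -0.02]
contacts = foot_contacts(heights)
fixed = snap_to_ground(heights, contacts)
```

A real system would apply such constraints to the whole body trajectory rather than to one coordinate, but the per-frame contact flags play the same role.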

Citations: 0
A study of learning models for COVID-19 disease prediction
Zone 3 (Computer Science) | Q1 Computer Science | Pub Date: 2024-03-28 | DOI: 10.1007/s12652-024-04775-1
Sakshi Jain, Pradeep Kumar Roy

Coronavirus belongs to the family Coronaviridae. It is responsible for the communicable disease COVID-19, which has affected 213 countries and territories worldwide. Researchers in computational fields have been active in proposing techniques to filter information and recommendations about this disease and to provide surveillance for controlling the outbreak. Researchers have used chest X-ray images, abdominal Computed Tomography scans, and Tweet datasets to build machine learning and deep learning-based models for COVID-19 prediction and forecasting. Accuracy, sensitivity, specificity, precision, and F1-measure are the five primary evaluation criteria researchers employ to evaluate the quality of their studies. This article summarises research works on COVID-19 based on machine learning and deep learning models. The analysis of these research works, along with their limitations and dataset sources, will give future research a quick start towards a defined direction.
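The five evaluation criteria listed above all follow from the four cells of a binary confusion matrix. The sketch below uses their standard definitions; the function name `binary_metrics` and the counts are hypothetical and are not taken from any surveyed study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall: share of actual positives that are found
    specificity = tn / (tn + fp)   # share of actual negatives that are found
    precision = tp / (tp + fp)     # share of predicted positives that are correct
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical screening result: 100 infected and 100 healthy patients.
scores = binary_metrics(tp=80, fp=10, tn=90, fn=20)
```

Here accuracy is 0.85 while sensitivity is only 0.80, a reminder that a single headline number can hide missed positive cases.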

Citations: 0
Spatiotemporal crowds features extraction of infrared images using neural network
Zone 3 (Computer Science) | Q1 Computer Science | Pub Date: 2024-03-27 | DOI: 10.1007/s12652-024-04771-5
Anas M. Al-Oraiqat, Oleksandr Drieiev, Hanna Drieieva, Yelyzaveta Meleshko, Hazim AlRawashdeh, Karim A. Al-Oraiqat, Yassin M. Y. Hasan, Noor Maricar, Sheroz Khan

Crowds can lead to severe, disastrous consequences resulting in fatalities. Videos obtained through public cameras or captured by drones flying overhead can be processed with artificial intelligence-based crowd analysis systems. In this area, a hot research topic over the past few years, the goal is not only to identify the presence of crowds but also to predict the probability of crowd formation in order to issue timely warnings and take preventive measures. Such systems can significantly reduce the probability of potential disasters. Developing effective systems is a challenging task, especially due to factors such as naturally occurring diverse conditions, variations in people or background pixel areas, noise, behaviors of individuals, relative amounts/distributions/directions of crowd movements, and the reasons crowds build up. This paper proposes an infrared video processing system based on a U-Net convolutional neural network for crowd monitoring in infrared video frames, to help estimate crowds with normal or abnormal trends. The proposed U-Net architecture aims to extract crowd features efficiently and to achieve sufficient people-marking accuracy, competitive with optimal network configurations, while keeping the depth and the number of filters low so as to minimise the number of coefficients. For still faster processing, savings in hardware resources and implementation area, and lower power, the optimized network coefficients are represented in Canonic Signed Digit form with a minimal number of nonzero (± 1) digits, minimizing the number of underlying shift-add/subtract operations in all multipliers. The significantly reduced computational cost makes the proposed U-Net well suited to resource-constrained and low-power applications.
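The benefit of the Canonic Signed Digit representation can be shown with a short sketch. It converts an integer coefficient to signed digits in {-1, 0, +1} with no two adjacent nonzero digits, then multiplies using only shifts, additions and subtractions. Treating coefficients as integers (that is, already quantised to fixed point) is an assumption made for illustration, and the function names are hypothetical.

```python
def to_csd(n):
    """Return the CSD digits of integer n, least significant first, each in {-1, 0, 1}."""
    digits = []
    while n != 0:
        if n % 2:
            d = 2 - (n % 4)  # +1 when n = 1 (mod 4), -1 when n = 3 (mod 4)
            n -= d           # makes n divisible by 4, so the next digit is 0
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits


def csd_multiply(x, digits):
    """Multiply x by the coefficient encoded in digits using shifts and add/subtract only."""
    total = 0
    for shift, d in enumerate(digits):
        if d == 1:
            total += x << shift
        elif d == -1:
            total -= x << shift
    return total


# Coefficient 7 is 111 in binary (three additions) but 8 - 1 in CSD (one subtraction).
digits = to_csd(7)
product = csd_multiply(5, digits)
```

Each nonzero digit costs one adder or subtractor in hardware, so fewer nonzero digits translate directly into smaller and lower-power multipliers.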

Citations: 0
Sign language detection using convolutional neural network
Zone 3 (Computer Science) | Q1 Computer Science | Pub Date: 2024-03-26 | DOI: 10.1007/s12652-024-04761-7
Pranati Rakshit, Sarbajeet Paul, Shruti Dey

Sign language recognition is an important social issue; addressing it can benefit the deaf and hard of hearing community by providing easier and faster communication. Some previous studies on sign language recognition have used complex input modalities and feature extraction methods, limiting their practical applicability. This research compares two custom-made convolutional neural network (CNN) models for recognizing the 26 distinct hand signs that represent the American Sign Language (ASL) letters A to Z, and determines which model performs better. The proposed models combine a CNN with the Softmax activation function, both of which are powerful and widely used in classification tasks in computer vision. The study found that Model_2 had better overall performance than Model_1, with an accuracy of 98.44% and an F1 score of 98.41%. However, the performance of each model varied depending on the specific label, suggesting that the choice of model may depend on the specific use case and the labels of interest. This research contributes to the growing field of sign language recognition using deep learning techniques and highlights the importance of designing custom models.
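The role of the Softmax function in the final layer can be shown with a minimal sketch: it turns 26 raw scores into a probability distribution over the letters A to Z. The logits below are made up for illustration and do not come from either model in the paper.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical output of the last dense layer: one score per letter, 'C' strongest.
logits = [0.0] * 26
logits[2] = 5.0
probs = softmax(logits)
letter = chr(ord("A") + probs.index(max(probs)))
```

The predicted letter is simply the index with the highest probability, and the probabilities always sum to 1, which makes per-letter confidence directly comparable.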

引用次数: 0
Heptagonal Reinforcement Learning (HRL): a novel algorithm for early prevention of non-sinus cardiac arrhythmia
Zone 3 Computer Science Q1 Computer Science Pub Date: 2024-03-25 DOI: 10.1007/s12652-024-04776-0
Arman Daliri, Roghaye Sadeghi, Neda Sedighian, Abbas Karimi, Javad Mohammadzadeh

Medical science and artificial intelligence have become closely connected in recent years, but many problems remain in making that connection work well. Cardiac arrhythmia is one of the most dangerous diseases, and artificial intelligence methods can support its prevention. Two topics in artificial intelligence addressed here are the automatic selection of balancing algorithms and of classification algorithms. In this study, metrics for machine learning algorithm selection are presented. The first problem is choosing the best balancing algorithm for the data sets, for which a metric called triangle rate (TR) is introduced. The second is selecting the best classification algorithm automatically. The third step is to use a scoring algorithm to predict sinus and non-sinus arrhythmias. By combining these three types of algorithms, heptagonal reinforcement learning (HRL) achieved results competitive with standard algorithms. The data used in this study come from a 12-lead electrocardiogram (ECG) arrhythmia database covering 10,646 patients. The HRL algorithm improves on previous algorithms by 5%, reaching 86% in cardiac arrhythmia prediction.
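To make the notion of a balancing algorithm concrete, here is a minimal sketch of random oversampling, one common way to balance a data set. It is not the paper's triangle rate (TR) metric or the HRL algorithm, neither of which is defined in the abstract; the function name and the toy feature values are assumptions made for illustration.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_label = {}
    for x, y in zip(samples, labels):
        by_label.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_label.values())
    out_x, out_y = [], []
    for y, xs in by_label.items():
        # Draw extra copies (with replacement) to reach the target size.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y

# Toy, made-up feature vectors: four "sinus" records and one "non-sinus".
samples = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = ["sinus", "sinus", "sinus", "sinus", "non-sinus"]

bal_x, bal_y = random_oversample(samples, labels)
counts = Counter(bal_y)
print(counts)
```

After balancing, both classes contain four samples. A selection procedure like the one the abstract describes would compare several such balancing algorithms rather than fix one in advance.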
