Journal of Advanced Computational Intelligence and Intelligent Informatics: Latest Articles

3D Street Object Detection from Monocular Images Using Deep Learning and Depth Information
IF 0.7 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-03-20 DOI: 10.20965/jaciii.2023.p0198
Wei Liu, Zhang Tao, Yun Ma, Longsheng Wei
In this study, we present a three-dimensional (3D) object detection algorithm for monocular images, built as an end-to-end network that incorporates depth information. The network consists of three parts. The first part is a basic object detection network that serves as the main body and uses a region proposal network to obtain the two-dimensional (2D) region proposal of the object. The second part is a depth estimation branch that obtains the depth of the object pixels and calculates the corresponding 3D point cloud. In the last part, the features from the first two parts are concatenated and fed into fully connected layers, which produce the 2D and 3D detection results. The detection accuracy improves on that of certain existing methods.
Pages: 198-206
Citations: 0
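The depth-to-point-cloud step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a pinhole camera model with known intrinsics; the function name `depth_to_point_cloud` and the parameters `fx`, `fy`, `cx`, `cy` are illustrative assumptions and are not taken from the paper, whose exact projection is not described in this listing.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (H*W, 3) point cloud.

    Assumes a pinhole camera: fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point. These are hypothetical inputs.
    """
    h, w = depth.shape
    # u indexes columns (image x), v indexes rows (image y)
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# Toy example: a flat surface 10 m in front of the camera
depth = np.full((2, 3), 10.0)
points = depth_to_point_cloud(depth, fx=100.0, fy=100.0, cx=1.0, cy=0.5)
```

In a detection pipeline of the kind described, only the pixels inside a 2D region proposal would typically be back-projected, which keeps the point cloud small.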
Positioning Method of Four-Wheel-Steering Mobile Robots Based on Improved UMBmark of Michigan Benchmark Algorithm
IF 0.7 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-03-20 DOI: 10.20965/jaciii.2023.p0135
Dianjun Wang, Meng Xu, Ya Chen, Haoxiang Zhong, Y. Zhu, Zilong Wang, Linlin Gao
To reduce the error of the odometer positioning system and improve the positioning accuracy of four-wheel-steering mobile robots, this study extends the University of Michigan Benchmark (UMBmark) method to account for three types of coupled errors: unequal track widths, unequal wheel diameters, and speed differences between ipsilateral wheels. A “dual direction square path experiment” is designed to decouple these errors, a new system error model is defined, and an improved UMBmark method for four-wheel mobile robots is proposed. In the positioning system, a laser tracker measures the absolute positions of the robot's starting and ending points. Positioning tests with the improved UMBmark method show an odometer system error of 69.103 mm, which is 2.6 times lower than that of the traditional UMBmark method. Hence, the improved UMBmark method better compensates for the system error of four-wheel-steering mobile robots.
Pages: 135-142
Citations: 0
Effect of Alcohol Consumption on the Frequency of Microsaccades
IF 0.7 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-03-20 DOI: 10.20965/jaciii.2023.p0148
Toumi Ohara, Fumiya Kinoshita
In recent years, as eye movement measurement devices have become relatively cheap, many attempts have been made to quantitatively evaluate covert attention by focusing on microsaccades. However, many aspects of microsaccade measurement remain unclear, and a unified analysis method is still lacking, so the interpretation of results differs among research groups. To solve this problem, empirical studies that evaluate microsaccades with a unified method are needed. In this study, we conducted an empirical experiment on the effects of alcohol consumption on microsaccades, using alcohol to temporarily suppress cerebellar activity. The results showed that the frequency of microsaccades was significantly reduced 30, 50, and 70 min after drinking compared with before drinking (p < 0.05). These results suggest that the decrease in brain function caused by alcohol consumption suppresses the frequency of microsaccades, and that this may be the cause of the constriction of the peripheral visual field when drinking.
Pages: 148-153
Citations: 0
Capsule Network Extension Based on Metric Learning
IF 0.7 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-03-20 DOI: 10.20965/jaciii.2023.p0173
Nozomu Ohta, Shin Kawai, H. Nobuhara
A capsule network (CapsNet) is a deep learning model for image classification that is robust to changes in the poses of objects in images. A capsule is a vector whose direction represents the presence, position, size, and pose of an object. However, with CapsNet, the distribution of capsules is concentrated in a class, and the number of capsules increases with the number of classes. In addition, training a CapsNet is computationally expensive. We propose a method that increases the diversity of capsule directions and decreases the computational cost of CapsNet training by allowing a single capsule to represent multiple object classes. To determine the distance between classes, we use an additive angular margin loss called ArcFace. To validate the proposed method, we examined the distribution of the capsules using principal component analysis. In addition, using the MNIST, Fashion-MNIST, EMNIST, SVHN, and CIFAR-10 datasets, as well as the corresponding affine-transformed datasets, we measured the accuracy and training time of the proposed method and the original CapsNet. The accuracy of the proposed method improved by 8.91% on the CIFAR-10 dataset, and the training time was reduced by more than 19% on each dataset compared with the original CapsNet.
Pages: 173-181
Citations: 0
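The additive angular margin loss (ArcFace) named in the abstract above can be sketched in its generic form as follows. This is a NumPy illustration of the standard ArcFace logit, not the authors' CapsNet integration; the function names, the `scale` and `margin` defaults, and the toy inputs are all assumptions. The sketch also ignores the edge case where the angle plus the margin exceeds π.

```python
import numpy as np

def arcface_logits(features, class_weights, labels, scale=30.0, margin=0.5):
    """Cosine logits with an additive angular margin on the true class."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = class_weights / np.linalg.norm(class_weights, axis=1, keepdims=True)
    cos = np.clip(f @ w.T, -1.0, 1.0)      # cosine between sample and class
    theta = np.arccos(cos)                 # angle in radians
    rows = np.arange(len(labels))
    theta[rows, labels] += margin          # penalize only the target class
    return scale * np.cos(theta)

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy, computed in a numerically stable way."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Toy example: two unit features that align exactly with their class vectors
features = np.array([[1.0, 0.0], [0.0, 1.0]])
class_weights = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = np.array([0, 1])
loss_with_margin = cross_entropy(
    arcface_logits(features, class_weights, labels, scale=1.0, margin=0.5), labels)
loss_without_margin = cross_entropy(
    arcface_logits(features, class_weights, labels, scale=1.0, margin=0.0), labels)
```

Because the margin lowers the target-class logit, the loss with the margin is larger than without it for the same features, which is what pushes the classes further apart in angle during training.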
Research on Image Inpainting Algorithms Based on Attention Guidance
IF 0.7 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-03-20 DOI: 10.20965/jaciii.2023.p0190
Yankun Shen, Yaya Su, L. Wang, Dongli Jia
In recent years, deep learning has yielded positive results in image inpainting. However, existing image inpainting algorithms pay insufficient attention to the structural and textural features of the image, which leads to blurring and distortion in the inpainting results. To address these problems, a channel attention mechanism was introduced to emphasize the importance of structure and texture after extraction by the convolutional network. A bidirectional gated feature fusion module was employed to exchange and fuse the structural and textural features, ensuring the overall consistency of the image. In addition, in the contextual feature aggregation module, the ordinary convolution was replaced with a deformable convolution that can adapt its receptive field, so that image features were captured better. This resulted in vivid and realistic restoration results with more reasonable details. Qualitative and quantitative experiments showed that the repair results of this algorithm were more realistic than those of current mainstream networks.
Pages: 190-197
Citations: 0
optIFnet: A Capacitive Antenna Dipole Indention-Flexure Predictive Model Optimized Using Hybrid Lichtenberg Algorithm and Neural Network
IF 0.7 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-01-20 DOI: 10.20965/jaciii.2023.p0027
Mike Louie C. Enriquez, Ronnie S. Concepcion, R. Relano, Kate G. Francisco, Jonah Jahara G. Baun, Adrian Genevie G. Janairo, R. Baldovino, R. R. Vicerra, A. Bandala, E. Dadios
In underground imaging surveys, it is vital to coat the antenna dipole plates with a robust and durable material that protects them against rough road features. Doing so improves the mechanical properties of the metallic antenna dipole and shields it from deterioration. This study therefore developed an indentation-flexure algorithm, optimized using a hybrid Lichtenberg algorithm (LA) and artificial neural network (ANN), that predicts the indentation-flexure as a function of the coating material’s elastic modulus, Poisson ratio, and thickness, as well as the load antenna weight. Acrylic, epoxy, nylon 101, high-density polyethylene, and polyvinyl chloride were chosen as the five most popular coating materials. A 120° titanium cone indenter with a 0.5-inch diameter, a slightly rounded point, and a constant compressive force of 200 N at the center was used in a nonlinear mechanical finite element analysis of an antenna dipole plate in SolidWorks. Nature-inspired and evolutionary metaheuristics, such as the African vultures, Lichtenberg, and gorilla troop optimization algorithms, together with the genetic algorithm (GA), were employed to optimize the models of hardness indentation for capacitively coupled antenna dipoles. Based on the results, the hybrid LA-ANN solution with 3000 hidden neurons and a sigmoid activation function is the best performing model: it achieved an MSE of 0.0061 in validation and 0.1478 in testing, compared with 0.1610 for the GA model with 100 hidden neurons and a sigmoid activation function. Thus, the LA-ANN model is designated optIFnet, as it exhibited the best prediction performance and the fastest convergence among all optimizers used.
Pages: 27-34
Citations: 0
Speech-Section Extraction Using Lip Movement and Voice Information in Japanese
IF 0.7 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-01-20 DOI: 10.20965/jaciii.2023.p0054
Etsuro Nakamura, Y. Kageyama, Satoshi Hirose
In recent years, several Japanese companies have attempted to improve the efficiency of their meetings, which has been a significant challenge. For instance, voice recognition technology is used to considerably improve the creation of meeting minutes. In an automatic minutes-creating system, identifying the speaker and adding speaker information to the text would substantially improve the overall efficiency of the process. A few companies and research groups have therefore proposed speaker estimation methods; however, these methods have drawbacks, such as requiring advance preparation, special equipment, and multiple microphones. These problems can be solved by using speech sections extracted from lip movements and voice information. When a person speaks, voice and lip movements occur simultaneously, so the speaker’s speech section can be extracted from videos by using lip movement and voice information. When the speech section is extracted from voice information alone, the voiceprint of each meeting participant is required for speaker identification. When lip movements are used, the speech section and speaker position can be extracted without voiceprint information. Therefore, in this study, we propose a speech-section extraction method that uses image and voice information in Japanese for speaker identification. The proposed method consists of three processes: i) the extraction of speech frames using lip movements, ii) the extraction of speech frames using voices, and iii) the classification of speech sections using these extraction results. We evaluated the method on video data and compared it with state-of-the-art techniques. The average F-measure of the proposed method was higher than that of conventional methods based on state-of-the-art techniques. The evaluation results showed that the proposed method achieves state-of-the-art performance using a simpler process than the conventional method.
Pages: 54-63
Citations: 0
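The three-step process in the abstract above (lip-based frames, voice-based frames, then section classification) can be illustrated with a small sketch. The abstract does not say how the two cues are combined, so the logical AND used here is an assumption, as are the function names `extract_sections` and `f_measure` and the toy per-frame flags.

```python
def extract_sections(lip_frames, voice_frames):
    """Return [start, end) frame-index pairs where both cues flag speech.

    Combining the cues with a logical AND is an assumption made for
    this sketch; the paper's actual classification rule may differ.
    """
    n = min(len(lip_frames), len(voice_frames))
    sections, start = [], None
    for i in range(n):
        speaking = bool(lip_frames[i]) and bool(voice_frames[i])
        if speaking and start is None:
            start = i                      # a new speech section begins
        elif not speaking and start is not None:
            sections.append((start, i))    # the current section ends
            start = None
    if start is not None:
        sections.append((start, n))        # close a section at the last frame
    return sections

def f_measure(predicted, truth):
    """Frame-level F-measure: harmonic mean of precision and recall."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    fn = sum(1 for p, t in zip(predicted, truth) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy per-frame flags (1 = speech detected); purely illustrative data
lip = [0, 1, 1, 1, 0, 1, 1]
voice = [0, 0, 1, 1, 1, 1, 0]
sections = extract_sections(lip, voice)
predicted = [int(l and v) for l, v in zip(lip, voice)]
score = f_measure(predicted, truth=[0, 1, 1, 1, 0, 1, 0])
```

Requiring both cues suppresses frames where the lips move silently or where another participant's voice is picked up, at the cost of missing frames where one detector fails.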
Gradient-Based Scheduler for Scientific Workflows in Cloud Computing
IF 0.7 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-01-20 DOI: 10.20965/jaciii.2023.p0064
Danjing Wang, Huifang Li, Youwei Zhang, Baihai Zhang
Executing workflows in the cloud is becoming increasingly attractive, as the cloud environment enables scientific applications to use elastic computing resources on demand. However, although workflow scheduling is key to managing application execution efficiently, traditional workflow scheduling algorithms face significant challenges in the cloud environment. The gradient-based optimizer (GBO) is a recently proposed evolutionary algorithm whose search engine is based on Newton’s method; it employs a set of vectors to search the solution space. This study designs a gradient-based scheduler that uses GBO for workflow scheduling to minimize the usage costs of workflows under given deadline constraints. Extensive experiments are conducted on well-known scientific workflows of different sizes and types using WorkflowSim. The experimental results show that the proposed scheduling algorithm outperforms five other state-of-the-art algorithms in both constraint satisfiability and cost optimization, thereby verifying its advantages in addressing workflow scheduling problems.
Pages: 64-73
Citations: 0
SpeedX: Smart Speed Controller Model of Towed Subterranean Imaging System for Resistivity Data Distortion Reduction Using Computational Intelligence
IF 0.7 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-01-20 DOI: 10.20965/jaciii.2023.p0003
R. Relano, Kate G. Francisco, Ronnie S. Concepcion, Mike Louie C. Enriquez, Jonah Jahara G. Baun, Adrian Genevie G. Janairo, R. R. Vicerra, A. Bandala, E. Dadios
Land surveying is one of the core operations in underground imaging. In this technique, dynamic and continuous resistivity readings are taken with an array of capacitive electrodes towed by a light vehicle. The main challenge in subsurface surveying is that the speed of the system changes when it meets unavoidable obstacles and sloping road surfaces. To address this, the study develops prediction models using different computational intelligence techniques, namely multigene symbolic regression genetic programming (MSRGP), a regression-based decision tree (RTree), and a feed-forward neural network (FFNN), to obtain a smart speed controller that keeps the towed subterranean system at a constant speed. The best-performing prediction model is designated SpeedX. The expected output is a correction factor that signals the speed controller, in slow-down or inclined road conditions, to maintain a constant speed of 1.6667 m/s and so avoid data distortion during land surveying. The MSEs for MSRGP, FFNN, and RTree are 0.00163, 0.00178, and 0.00240, respectively, so MSRGP is the best-performing model and was adopted as the SpeedX model. Other evaluation metrics, such as MAE and R², also indicate the advantage of SpeedX. Furthermore, a comparison between the CI-controlled and uncontrolled towed subterranean imaging trailer systems highlights the advantage of embedding SpeedX in the system.
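The model-selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names (mse, mae, r2, best_model) are hypothetical, and the abstract does not state how the metrics were computed, so the standard textbook definitions are assumed. Only the three MSE values and the 1.6667 m/s target speed come from the abstract.

```python
from typing import Dict, Sequence

# Constant towing speed stated in the abstract (m/s).
TARGET_SPEED_M_S = 1.6667


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between reference and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error between reference and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination (assumes y_true is not constant)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def best_model(mse_by_model: Dict[str, float]) -> str:
    """Return the name of the model with the lowest MSE."""
    return min(mse_by_model, key=mse_by_model.get)


# MSE values reported in the abstract.
reported_mse = {"MSRGP": 0.00163, "FFNN": 0.00178, "RTree": 0.00240}
print(best_model(reported_mse))  # MSRGP
```

Selecting on MSE alone mirrors the abstract's wording; the MAE and R² helpers are included only because the abstract mentions them as supporting metrics.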
Citations: 1
Digital Twin Concept Utilizing Electrical Resistivity Tomography for Monitoring Seawater Intrusion
IF 0.7 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-01-20 DOI: 10.20965/jaciii.2023.p0012
J. A. D. Leon, Ronnie S. Concepcion, R. Billones, Jonah Jahara G. Baun, Jose Miguel F. Custodio, R. R. Vicerra, A. Bandala, E. Dadios
Several works have regarded electrical resistivity tomography (ERT) as an appropriate instrument for monitoring, and aiding in the control of, seawater intrusion (SWI) in coastal groundwater systems. This study discusses the synthesis of a digital twin that couples information between the physical space, with ERT as the monitoring sensor, and the digital space, with SWI simulations, to accurately model the behavior of SWI in present and future settings. To showcase the concept, a Python-based simulation is presented that shows (a) the joint forward modeling-simulation scheme for calculating expected ERT apparent resistivity values from simulated SWI and (b) the calibration of the digital coastal aquifer system through a genetic algorithm so that the outputs of the SWI simulations accurately match the ERT measurements.
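The calibration idea in (b) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: toy_forward_model is a hypothetical linear stand-in for the SWI simulation plus ERT forward modeling, and the genetic operators used here (truncation selection, uniform crossover, Gaussian mutation) are assumptions, since the abstract only says that a genetic algorithm is used.

```python
import random
from typing import Callable, List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def misfit(simulated: Sequence[float], measured: Sequence[float]) -> float:
    """Mean squared difference between simulated and measured readings."""
    return sum((s - m) ** 2 for s, m in zip(simulated, measured)) / len(measured)


def toy_forward_model(params: Sequence[float]) -> List[float]:
    """Hypothetical stand-in: maps two model parameters to four readings."""
    a, b = params
    return [a * x + b for x in (0.0, 1.0, 2.0, 3.0)]


def calibrate(
    forward: Callable[[Sequence[float]], List[float]],
    measured: Sequence[float],
    bounds: Bounds,
    pop_size: int = 40,
    generations: int = 60,
    mutation_scale: float = 0.1,
    seed: int = 0,
) -> List[float]:
    """Search for parameters whose simulated readings match the measurements."""
    rng = random.Random(seed)  # local generator keeps runs reproducible

    def cost(individual: Sequence[float]) -> float:
        return misfit(forward(individual), measured)

    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=cost)
        elite = population[: pop_size // 2]  # truncation selection: keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            parent1, parent2 = rng.sample(elite, 2)
            child = []
            for g1, g2, (lo, hi) in zip(parent1, parent2, bounds):
                gene = g1 if rng.random() < 0.5 else g2  # uniform crossover
                gene += rng.gauss(0.0, mutation_scale * (hi - lo))  # Gaussian mutation
                child.append(min(hi, max(lo, gene)))  # clamp to the search bounds
            children.append(child)
        population = elite + children
    return min(population, key=cost)


# Synthetic "measurements" generated from known parameters (2.0, 5.0).
measured = toy_forward_model([2.0, 5.0])
search_bounds = [(0.0, 10.0), (0.0, 10.0)]
best = calibrate(toy_forward_model, measured, search_bounds)
print(best, misfit(toy_forward_model(best), measured))
```

Because the elite half is carried over unchanged, the best misfit never gets worse from one generation to the next; in a real digital twin the forward call would be replaced by the coupled SWI simulation and ERT forward model.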
Citations: 0