International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023): Latest Publications

Box-driven coarse-grained segmentation for stroke rehabilitation scenarios

Pub Date : 2024-01-09 DOI: 10.1117/12.3014426

Yiming Fan, Yunjia Liu, Xiaofeng Lu

In complex stroke rehabilitation scenarios, visual algorithms such as motion recognition and video understanding struggle to focus on patient regions, where motion amplitude is small, and instead attend to targets with drastic changes in optical flow. Using a semantic segmentation algorithm to extract the patient region from the captured image can therefore provide critical perspective and adequate information for these visual tasks. Current weakly supervised segmentation algorithms based on bounding boxes tend to rely on existing image classification methods, applying secondary processing to the image content inside each box to obtain larger areas of pseudo-label information. To avoid the redundancy caused by chaining algorithms together, this paper proposes an end-to-end weakly supervised segmentation algorithm. A U-shaped residual module with variable depth is designed to capture deep semantic features of the image, and its output is integrated into the target matrix of the NCut problem in block form. The target region is then indicated by solving for the eigenvector associated with the second-smallest eigenvalue of the generalized eigensystem, which yields the segmentation. In experiments on the PASCAL VOC 2012 dataset, the proposed method achieved 67.7% mIoU. On a private dataset, the proposed method was compared with similar algorithms and segmented the target area more densely.
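
For readers unfamiliar with NCut, the generalized eigensystem mentioned in the abstract is usually written as below. This is the standard formulation from Shi and Malik, given as background only: the symbols W, D, and y are assumptions, since the abstract neither defines the paper's notation nor says how the network output enters the matrix.

```latex
\[
(D - W)\,y = \lambda\, D\,y,
\qquad
D_{ii} = \sum_{j} W_{ij}
\]
```

Here W is a symmetric affinity matrix between pixels (or blocks) and D is its diagonal degree matrix. The smallest eigenvalue is 0 with a constant eigenvector, which carries no partition information, so the two-way split is read from the eigenvector of the second-smallest eigenvalue, for example by thresholding its entries.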

Citations: 0

Research on automatic scoring algorithm for English composition based on machine learning

Pub Date : 2024-01-09 DOI: 10.1117/12.3014482

Hui Li

Scoring methods for English compositions that rely on hand-crafted features struggle to extract deep semantic features, while methods based on neural networks struggle to capture shallow features such as word count, so each approach has its limitations. Building on existing research, this paper proposes an English composition scoring method that combines hand-crafted feature extraction with deep learning. The method uses hand-designed features to extract shallow features at the word and sentence levels, draws on existing methods to extract semantic features of the composition, and performs regression on the deep and shallow features together to obtain the total score. The experiments use the Pearson correlation coefficient to measure the correlation between the predicted and true total scores under the combined method. Compared with the average results of 0.747 and 0.645 for baseline models such as BiLSTM and RNN, the proposed algorithm improves by 0.068 and 0.17, respectively, which demonstrates the effectiveness of the method.
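
The evaluation metric named in the abstract can be illustrated with a short sketch. It computes the Pearson correlation coefficient between predicted and true scores in plain Python; the function name `pearson` and the sample scores are hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalized; the factors cancel in the ratio).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical essay scores, for illustration only (not data from the paper).
predicted = [7.5, 8.0, 6.0, 9.0, 5.5]
true = [8.0, 8.0, 6.0, 9.0, 5.0]
r = pearson(predicted, true)
```

A value of `r` close to 1 means the predicted totals rise and fall with the human-assigned totals, which is what the reported figures measure.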

Citations: 0

Enhancing audio perception in augmented reality: a dynamic vocal information processing framework

Pub Date : 2024-01-09 DOI: 10.1117/12.3014440

Danqing Zhao, Shuyi Xin, Lechen Liu, Yihan Sun, Anqi Du

The development of the Metaverse has sparked widespread interest among researchers, and many technologies have accordingly been developed to improve the user's sense of reality in the Metaverse. In particular, Extended Reality (XR), an indispensable technology and research direction in the study of the Metaverse, aims to bring seamless transitions between the virtual and real worlds and an immersive experience. However, what is currently lacking is the ability to simultaneously separate, classify, and locate dynamic human voice information so as to enhance voice perception in complex noise environments. This article proposes a framework that uses an FCNN for separation, algebraic models for positioning to obtain estimated distances, and an SVM for classification. The dataset is built to simulate distance-related changes with accurate ground-truth labels. The results show that the method can effectively separate, classify, and locate mixed sound data, providing users with comprehensive information about the content, gender, and distance of the speaker in complex sound environments and enhancing their immersive experience and perception ability. The innovation lies in the combination of three audio processing technologies, and the proposed framework may well inspire future work on related topics.

Citations: 0

Collaborative filtering recommendation method based on graph convolutional neural networks

Pub Date : 2024-01-09 DOI: 10.1117/12.3014407

Zhengwu Yuan, Xiling Zhan, Yatao Zhou, Hao Yang

In an era of rapidly advancing information technology, information overload poses a significant challenge. Recommender systems offer a partial solution, yet traditional methods grapple with issues such as sparse data and limited accuracy. This paper therefore introduces a high-order graph convolutional collaborative filtering model. The model employs a subgraph generation module to enhance the importance of neighbor nodes during high-order graph convolutions. The approach yields enhanced embeddings by embedding user-item interaction information using graph techniques, stacking multi-layer graph convolutional networks to capture complex interactions, and leveraging both the initial and the convolved embeddings. A constraint loss function is introduced to address over-smoothing in graph-based recommendation. The method's effectiveness is confirmed through extensive experiments on three real-world datasets.
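
The idea of stacking graph convolutions and combining the initial and convolved embeddings can be sketched as follows. This is a LightGCN-style propagation chosen for illustration, not the paper's model: the subgraph generation module and the constraint loss are not reproduced, and the function names, the layer-averaging rule, and the toy data are all assumptions.

```python
import math


def propagate(edges, num_users, num_items, emb, num_layers=2):
    """LightGCN-style propagation on a user-item bipartite graph.

    edges: list of (user, item) interactions.
    emb:   list of embedding vectors, users first, then items.
    Returns the mean of the initial and all propagated embeddings.
    """
    n = num_users + num_items
    neighbors = [[] for _ in range(n)]
    for u, i in edges:
        neighbors[u].append(num_users + i)
        neighbors[num_users + i].append(u)
    degree = [len(nb) for nb in neighbors]
    dim = len(emb[0])

    layers = [emb]
    current = emb
    for _ in range(num_layers):
        nxt = []
        for node in range(n):
            vec = [0.0] * dim
            for nb in neighbors[node]:
                # Symmetric normalization: 1 / sqrt(deg(node) * deg(nb)).
                w = 1.0 / math.sqrt(degree[node] * degree[nb])
                for d in range(dim):
                    vec[d] += w * current[nb][d]
            nxt.append(vec)
        layers.append(nxt)
        current = nxt

    # Combine the initial and convolved embeddings by averaging over layers.
    return [
        [sum(layer[node][d] for layer in layers) / len(layers) for d in range(dim)]
        for node in range(n)
    ]


def score(final_emb, user, item, num_users):
    """Predicted preference: inner product of user and item embeddings."""
    u, v = final_emb[user], final_emb[num_users + item]
    return sum(a * b for a, b in zip(u, v))


# Toy graph with 2 users and 2 items (hypothetical data).
edges = [(0, 0), (0, 1), (1, 1)]
emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
final = propagate(edges, num_users=2, num_items=2, emb=emb, num_layers=2)
```

Averaging over layers keeps part of the initial embedding in the final representation, which is one simple way to limit over-smoothing as more layers are stacked.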

Citations: 0

Three-dimensional target detection algorithm for dangerous goods in CT security inspection

Pub Date : 2024-01-09 DOI: 10.1117/12.3014353

Jingze He, Yao Guo, qing song

This paper proposes a 3D dangerous goods detection method based on RetinaNet. The method uses the bidirectional feature pyramid network structure of RetinaNet to extract multi-scale features from point cloud data and trains the system with the focal loss function to achieve fast and accurate detection of dangerous goods. To further improve detection accuracy, the paper also introduces a 3D region proposal network (3D RPN) and the non-maximum suppression (NMS) algorithm. Experimental results show that the proposed method performs well on the authors' self-built CT dataset, with high accuracy and a low false positive rate, and is suitable for dangerous goods detection in practical scenarios.
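
The focal loss mentioned above down-weights well-classified examples so that training concentrates on hard ones: FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The sketch below shows the binary form for a single prediction. The values alpha = 0.25 and gamma = 2.0 are commonly used defaults and are assumptions here, since the abstract does not state the settings used.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for one prediction.

    p: predicted probability of the positive class, in [0, 1].
    y: ground-truth label, 1 (positive) or 0 (negative).
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # clamp to avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.95, 1)  # well-classified positive: strongly down-weighted
hard = focal_loss(0.10, 1)  # misclassified positive: dominates the loss
```

With gamma set to 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy.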

Citations: 0

Rapid identification of adulterated rice using fusion of near-infrared spectroscopy and machine vision data: the combination of feature optimization and nonlinear modeling

Pub Date : 2024-01-09 DOI: 10.1117/12.3014380

Chenxuan Song, Jinming Liu, Chunqi Wang, Zhijiang Li

Rice is susceptible to mold and mildew during storage, and metabolites such as aflatoxin produced during mildew can do great harm to consumers. To meet the need for rapid detection of normal rice adulterated with moldy rice, a rapid identification method was established based on data fusion of near-infrared spectroscopy and machine vision. Competitive adaptive reweighted sampling (CARS), a genetic algorithm (GA), and least angle regression (LARS) were used for spectral and image feature extraction and combined with support vector classification (SVC), random forest (RF), and gradient boosting tree (GBT) nonlinear discriminant models, with Bayesian search used to optimize the modeling parameters. The results show that the GBT fusion data model built on LARS-optimized spectral and image feature variables has the highest discrimination accuracy, with recognition accuracies of 100.00% and 98.11% on its training and testing sets, respectively. Discrimination performance is significantly improved compared with near-infrared spectroscopy or machine vision alone. The results indicate that rapid identification of adulterated rice based on fused near-infrared spectroscopy and machine vision data is feasible, providing theoretical support for the development of online identification equipment for adulterated rice.

Citations: 0

Fast and high quality neural radiance fields reconstruction based on depth regularization

Pub Date : 2024-01-09 DOI: 10.1117/12.3014528

Bin Zhu, Gaoxiang He, Bo Xie, Yi Chen, Yaoxuan Zhu, Liuying Chen

Although Neural Radiance Fields (NeRF) have been shown to achieve high-quality novel view synthesis, existing models still perform poorly in some scenarios, particularly unbounded scenes. These models either require excessively long training times or produce suboptimal synthesis results. We therefore propose SD-NeRF, which consists of a compact neural radiance field model and self-supervised depth regularization. Experimental results demonstrate that SD-NeRF can shorten training time by a factor of more than 20 compared with Mip-NeRF 360 without compromising reconstruction accuracy.
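
Depth regularization in NeRF-style models typically acts on the depth implied by the volume-rendering weights along each ray. The sketch below computes that expected depth using the standard NeRF quadrature. It does not show SD-NeRF's regularizer itself, which the abstract does not specify; the function name and the use of interval midpoints are assumptions.

```python
import math


def render_depth(sigmas, t_vals):
    """Expected ray-termination depth from volume-rendering weights.

    sigmas: densities, one per interval along the ray.
    t_vals: interval edges along the ray, strictly increasing,
            with exactly one more entry than sigmas.
    Returns (depth, weights).
    """
    if len(t_vals) != len(sigmas) + 1:
        raise ValueError("t_vals must have len(sigmas) + 1 entries")
    weights = []
    transmittance = 1.0  # fraction of light that reaches the current interval
    for i, sigma in enumerate(sigmas):
        delta = t_vals[i + 1] - t_vals[i]
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this interval
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    # Weighted average of interval midpoints.
    mids = [(t_vals[i] + t_vals[i + 1]) / 2 for i in range(len(sigmas))]
    depth = sum(w * m for w, m in zip(weights, mids))
    return depth, weights


# A ray that hits an opaque surface in its second interval (toy values).
depth, weights = render_depth([0.0, 1e6, 0.0], [0.0, 1.0, 2.0, 3.0])
```

A depth regularizer would then penalize the difference between this rendered depth and some reference depth; how SD-NeRF obtains its self-supervised reference is not stated in the abstract.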

Citations: 0

Research on collaborative positioning of intelligent vehicle aided navigation based on computer vision technology

Pub Date : 2024-01-09 DOI: 10.1117/12.3014415

Shun Zhang

Because vehicle position information is collected with low accuracy, the error in the positioning stage is relatively large. Collaborative positioning for intelligent vehicle aided navigation based on computer vision technology is therefore proposed. VOF/VOF-S smart cameras serve as the data acquisition device, and the parameters of the data acquisition stage are set according to the running state of the vehicle so that vehicle position information is acquired accurately. In the positioning stage, the plane on which the wheels rest is taken as the road plane, and the coordinates of several road ground points captured by the VOF/VOF-S device are combined to transform the vehicle position into real space. In the test results, the positioning error under different driving conditions stays within 1.50 m, indicating high accuracy.
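
One common way to map image measurements onto a road plane is a planar homography calibrated from several known ground points. The sketch below applies such a mapping to a pixel. It is an illustration under that assumption, not the paper's procedure: the abstract does not say which transformation is used, and the function name and matrix values are hypothetical.

```python
def image_to_road(h, u, v):
    """Map pixel (u, v) to road-plane coordinates with a 3x3 homography h."""
    x = h[0][0] * u + h[0][1] * v + h[0][2]
    y = h[1][0] * u + h[1][1] * v + h[1][2]
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity (it lies on the horizon line)")
    # Divide by the homogeneous coordinate to get plane coordinates.
    return x / w, y / w


# Hypothetical homography; in practice it would be estimated from
# at least four image points whose road-plane positions are known.
H_EXAMPLE = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
]
road_xy = image_to_road(H_EXAMPLE, 2.0, 1.0)
```

Because the result is divided by the homogeneous coordinate, multiplying the whole matrix by a nonzero constant leaves the mapped position unchanged.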

Citations: 0

Image segmentation of rail surface defects based on fractional order particle swarm optimization 2D-Otsu algorithm

Pub Date : 2024-01-09 DOI: 10.1117/12.3014444

Na Geng, Hu Sheng, Weizhi Sun, Yifeng Wang, Tan Yu, Zihan Liu

Under high-density operation and the influence of the natural environment, the rail surface suffers abrasion damage, which affects the safety and comfort of trains. Rail surface defect detection is an important part of ensuring the safe and efficient operation of the railway system. To determine whether the rail surface has defects, a rail surface defect image segmentation method based on the FPSO 2D-Otsu algorithm is proposed. The rail image is denoised and enhanced by adaptive fractional calculus and then segmented by the FPSO 2D-Otsu algorithm. To verify its accuracy, the proposed algorithm is compared with the PSO 2D-Otsu image segmentation algorithm. The experimental results show that the FPSO 2D-Otsu algorithm achieves a rail image segmentation accuracy of 83.59%, compared with 48.76% for the PSO 2D-Otsu algorithm.
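
The Otsu criterion underlying the method can be illustrated with its simplest form. The sketch below is the classic 1D Otsu threshold found by exhaustive search, a deliberate simplification: 2D-Otsu applies the same between-class-variance idea to a joint histogram of gray level and neighborhood mean, and the paper replaces exhaustive search with fractional-order particle swarm optimization. Neither extension is reproduced here, and the function name and toy histogram are assumptions.

```python
def otsu_threshold(hist):
    """Return the threshold t that maximizes between-class variance.

    hist: list of pixel counts per gray level. Pixels with level <= t
    form one class and the remaining pixels form the other.
    """
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the lower class
    sum0 = 0  # intensity sum of the lower class
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the variance is undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Toy bimodal histogram over 8 gray levels (hypothetical data).
hist = [10, 10, 0, 0, 0, 0, 10, 10]
threshold = otsu_threshold(hist)
```

Exhaustive search is cheap in one dimension but grows quadratically with the number of gray levels in the 2D case, which is the usual motivation for using a swarm optimizer to search the threshold pair.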

Citations: 0

Microexpression recognition algorithm based on multi feature fusion

Pub Date : 2024-01-09 DOI: 10.1117/12.3014469

BaiYang Xiang, BoKai Li, Huaijuan Zang, Zeliang Zhao, Shu Zhan

Features are difficult to extract in video-based facial micro-expression recognition because micro-expressions are short in duration and small in amplitude. To better combine the temporal and spatial information in video, the model is divided into a local attention module, a global attention module, and a temporal module. First, the local attention module crops the key areas and, after processing, sends them to a network with channel attention. Next, the global attention module applies random erasing that avoids the key areas and sends the data to a network with spatial attention. Then, the temporal module processes the micro-expression occurrence frames and sends them to a network with a temporal shift module and spatial attention. Finally, after feature fusion, the classification results are obtained through three fully connected layers. The method is evaluated on the CASME II dataset; after five-fold cross-validation, the average accuracy is 76.15% and the unweighted F1 score is 0.691, an improvement over mainstream algorithms.
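
The unweighted F1 score reported above is the macro average of per-class F1 scores, so every emotion class counts equally regardless of how many samples it has. A minimal sketch follows; the function name and the labels are hypothetical and are not results from the paper.

```python
def unweighted_f1(y_true, y_pred):
    """Macro-averaged F1 over the classes that appear in y_true."""
    classes = sorted(set(y_true))
    f1_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        # F1 = 2TP / (2TP + FP + FN); the denominator is positive
        # because every class in `classes` has at least one true sample.
        f1_scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(f1_scores) / len(f1_scores)


# Hypothetical labels for three emotion classes.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]
uf1 = unweighted_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

On imbalanced micro-expression datasets this metric can differ noticeably from plain accuracy, which is why the two are usually reported together.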

Citations: 0
