International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023): Latest Papers

A deep learning approach for fruit detection: YOLO-GF
J. Guo, Wei Wu
To achieve automatic fruit object recognition in complex backgrounds, this paper proposes a fruit object detection algorithm based on YOLO-GF. Addressing challenges such as complex backgrounds, significant variations in target shapes, and instances of occlusion in fruit images, we utilize the Global Attention Mechanism (GAM) to enhance the feature extraction capability for fruit targets, thereby improving fruit recognition accuracy. Additionally, the Focal-EIOU loss function is used instead of the CIOU loss function to expedite model convergence. Experimental results demonstrate a significant improvement in recognition accuracy under the same hardware conditions. On the same test dataset, the improved model achieves an mAP50 of 92.1% and mAP50:95 of 76.5%, representing increases of 5.8% and 11.9% compared to the original model, respectively.
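The loss swap mentioned in the abstract can be illustrated with a small sketch. It assumes the commonly cited Focal-EIoU formulation (an EIoU loss with center, width, and height penalties, reweighted by IoU raised to a power gamma); the function names, the corner-format boxes, and the default gamma = 0.5 are assumptions for illustration, not details taken from the YOLO-GF paper.

```python
from typing import Tuple

# Hypothetical box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def eiou_terms(pred: Box, target: Box) -> Tuple[float, float]:
    """Return (IoU, EIoU loss) for two axis-aligned boxes."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Overlap area and IoU.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union

    # Width and height of the smallest box enclosing both inputs.
    enc_w = max(px2, tx2) - min(px1, tx1)
    enc_h = max(py2, ty2) - min(py1, ty1)

    # Squared center distance, normalized by the squared enclosing diagonal.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (enc_w ** 2 + enc_h ** 2)

    # Squared width and height differences, normalized per axis.
    width_term = ((px2 - px1) - (tx2 - tx1)) ** 2 / enc_w ** 2
    height_term = ((py2 - py1) - (ty2 - ty1)) ** 2 / enc_h ** 2

    return iou, 1.0 - iou + center_term + width_term + height_term


def focal_eiou_loss(pred: Box, target: Box, gamma: float = 0.5) -> float:
    """EIoU loss reweighted by IoU**gamma (gamma is an assumed default)."""
    iou, eiou = eiou_terms(pred, target)
    return (iou ** gamma) * eiou
```

Under this formulation the IoU weighting gives low-overlap boxes a smaller share of the loss, and a pair of boxes with no overlap at all contributes zero, so practical implementations may treat that case separately.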
DOI: https://doi.org/10.1117/12.3014430; Published: 2024-01-09; Pages: 129691E - 129691E-5
Citations: 0
The automated segmentation and enhancement of cracks on airport pavements using three-dimensional imaging techniques
Shanshan Zhai, Yanna Xu
This study explores automatic segmentation and enhancement methods for airfield runway surface cracks based on 3D images. First, a standard 2D Gaussian filter removes noise from the pavement surface data. Next, a Steerable Matched Filter (SMFB) is introduced to extract crack features: a set of 52 SMFB filters with different parameters accurately captures cracks of different directions and sizes. The Tensor Voting (TV) technique is then applied to further enhance the continuity of the cracks. With this method, cracks in the airfield runway surface can be detected and segmented for more accurate and comprehensive analysis. The experimental results show that the proposed method performs well in crack detection and segmentation, providing strong support for airport pavement maintenance and management.
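The denoising step in this abstract can be sketched in a few lines. The example below builds a normalized 2D Gaussian kernel and applies it with edge replication; the kernel size, sigma, border handling, and function names are all assumptions for illustration, since the abstract does not state the parameters used.

```python
import math
from typing import List

Image = List[List[float]]


def gaussian_kernel_2d(size: int, sigma: float) -> Image:
    """Build a size x size Gaussian kernel whose weights sum to 1."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center")
    half = size // 2
    kernel = [
        [math.exp(-(x * x + y * y) / (2 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]


def smooth(image: Image, kernel: Image) -> Image:
    """Filter the image with the kernel, replicating edge pixels at borders."""
    height, width = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            acc = 0.0
            for kr, row in enumerate(kernel):
                for kc, weight in enumerate(row):
                    # Clamp coordinates so border pixels reuse the nearest edge value.
                    rr = min(max(r + kr - half, 0), height - 1)
                    cc = min(max(c + kc - half, 0), width - 1)
                    acc += weight * image[rr][cc]
            out[r][c] = acc
    return out
```

Because the kernel is normalized, flat regions of the surface keep their value while isolated spikes are spread over their neighbors, which is the behavior wanted before a crack filter is applied.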
DOI: https://doi.org/10.1117/12.3014473; Published: 2024-01-09; Pages: 129691H - 129691H-8
Citations: 0
Optimization research on pedestrian multiobjects tracking model based on TBD strategy
Shi Wang, Xiangju Liu, Xinshu Liu, JiaHui Chen, XiaoHong Wang
The main task of pedestrian multi-object tracking is to track multiple pedestrians simultaneously and continuously in video sequences while maintaining their unique ID numbers. Current pedestrian multi-object tracking models still suffer from false detections, missed detections, and frequent ID switching when pedestrians are occluded or look too similar, which ultimately leads to tracking failure. This paper therefore proposes a pedestrian multi-object tracking model based on the TBD (tracking-by-detection) strategy. It consists of two parts: a pedestrian detector and a pedestrian tracker. For the detector, the paper uses the ES-YOLO pedestrian detector. For the tracker, the paper draws on the omni-scale feature learning module in OSNet to redesign the StrongSORT pedestrian appearance feature extraction network, obtaining a StrongSORT pedestrian tracker based on omni-scale feature fusion with stronger pedestrian feature extraction ability. Experimental results on the MOT16 dataset show that the proposed model effectively improves the accuracy of pedestrian multi-object tracking and reduces frequent pedestrian ID switching.
DOI: https://doi.org/10.1117/12.3014360; Published: 2024-01-09; Pages: 129692K - 129692K-7
Citations: 0
Three-dimensional target detection algorithm for dangerous goods in CT security inspection
Jingze He, Yao Guo, Qing Song
This paper proposes a 3D dangerous goods detection method based on RetinaNet. The method uses the bidirectional feature pyramid network structure of RetinaNet to extract multi-scale features from point cloud data and trains the system with the Focal Loss function to achieve fast and accurate detection of dangerous goods. To improve detection accuracy, the paper also introduces a 3D region proposal network (3D RPN) and the non-maximum suppression (NMS) algorithm. The experimental results show that the proposed method performs well on the authors' self-built CT dataset, with high accuracy and a low false positive rate, and is suitable for dangerous goods detection tasks in practical scenarios.
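The NMS step named in the abstract can be sketched for volumetric detections. The example assumes axis-aligned 3D boxes given as two opposite corners and a greedy score-ordered suppression rule; the box format, the 0.5 threshold, and the function names are illustrative assumptions, and the paper's own implementation may differ (for example, by using rotated boxes).

```python
from typing import List, Sequence, Tuple

# Hypothetical box format: (x1, y1, z1, x2, y2, z2) with min corner first.
Box3D = Tuple[float, float, float, float, float, float]


def iou_3d(a: Box3D, b: Box3D) -> float:
    """Volume IoU of two axis-aligned 3D boxes."""
    inter = 1.0
    for axis in range(3):
        overlap = min(a[axis + 3], b[axis + 3]) - max(a[axis], b[axis])
        if overlap <= 0:
            return 0.0  # Separated along this axis, so no shared volume.
        inter *= overlap
    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return inter / (vol_a + vol_b - inter)


def nms_3d(boxes: Sequence[Box3D], scores: Sequence[float],
           iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: keep the highest-scoring boxes, drop heavy overlaps."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep box i only if it does not overlap too much with any kept box.
        if all(iou_3d(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

The function returns indices into the input list in descending score order, so callers can recover both the surviving boxes and their scores.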
DOI: https://doi.org/10.1117/12.3014353; Published: 2024-01-09; Pages: 1296902 - 1296902-6
Citations: 0
Research on collaborative positioning of intelligent vehicle aided navigation based on computer vision technology
Shun Zhang
Because vehicle position information is collected with low accuracy, the error in the positioning stage is relatively large. This paper therefore proposes collaborative positioning for intelligent vehicle aided navigation based on computer vision technology. The VOF/VOF-S smart cameras serve as the data acquisition devices, and the acquisition parameters are set differently according to the running state of the vehicle so that vehicle position information is acquired accurately. In the positioning stage, the plane on which the wheels rest is taken as the road plane, and the coordinate parameters collected by the VOF/VOF-S devices at several road ground points are integrated to transform the vehicle position information into real space. In the tests, the positioning error of the vehicle position under different driving conditions remains within 1.50 m, indicating high accuracy.
DOI: https://doi.org/10.1117/12.3014415; Published: 2024-01-09; Pages: 129692P - 129692P-5
Citations: 0
Image segmentation of rail surface defects based on fractional order particle swarm optimization 2D-Otsu algorithm
Na Geng, Hu Sheng, Weizhi Sun, Yifeng Wang, Tan Yu, Zihan Liu
Under high-density operation and exposure to the natural environment, rail surfaces develop abrasion damage, which affects the safety and comfort of trains. Rail surface defect detection is therefore an important part of ensuring the safe and efficient operation of a railway system. To determine whether a rail surface has defects, a rail surface defect image segmentation method based on the fractional-order particle swarm optimization (FPSO) 2D-Otsu algorithm is proposed. The rail image is first denoised and enhanced with adaptive fractional calculus and then segmented with the FPSO 2D-Otsu algorithm. To verify its accuracy, the proposed algorithm is compared with the PSO 2D-Otsu image segmentation algorithm. The experimental results show that FPSO 2D-Otsu raises rail image segmentation accuracy from 48.76% with PSO 2D-Otsu to 83.59%.
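The thresholding criterion underlying this method can be illustrated with the classic one-dimensional Otsu rule. This is a deliberately simplified stand-in: the paper uses a 2D-Otsu criterion and searches for the threshold with fractional-order particle swarm optimization, whereas the sketch below uses a 1D gray-level histogram and an exhaustive search. The function name and the 256-level assumption are illustrative.

```python
from typing import Sequence


def otsu_threshold(pixels: Sequence[int], levels: int = 256) -> int:
    """Return the gray level t maximizing between-class variance.

    Pixels with value <= t form one class and the rest form the other.
    """
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight0 = 0  # Number of pixels in the lower class so far.
    sum0 = 0     # Sum of gray levels in the lower class so far.
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        if weight0 == 0:
            continue  # Lower class is still empty.
        weight1 = total - weight0
        if weight1 == 0:
            break  # Upper class is empty from here on.
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        # Between-class variance up to a constant factor of 1 / total**2.
        between = weight0 * weight1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t
```

The exhaustive scan is cheap in one dimension; the appeal of a swarm-based search in the 2D case is that the candidate space grows to pairs of (gray level, neighborhood mean) values.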
DOI: https://doi.org/10.1117/12.3014444; Published: 2024-01-09; Pages: 129690A - 129690A-4
Citations: 0
Microexpression recognition algorithm based on multi feature fusion
BaiYang Xiang, BoKai Li, Huaijuan Zang, Zeliang Zhao, Shu Zhan
Features are difficult to extract for video-based facial micro-expression recognition because micro-expressions are brief and their movements are small in amplitude. To better combine the temporal and spatial information in video, the model is divided into a local attention module, a global attention module, and a temporal module. First, the local attention module crops the key areas, processes them, and sends them to a network with channel attention. Next, the global attention module applies random erasure that avoids the key areas and sends the data to a network with spatial attention. The temporal module then processes the micro-expression occurrence frames and sends them to a network with a temporal shift module and spatial attention. Finally, after feature fusion, the classification results are obtained through three fully connected layers. Experiments are run on the CASME II dataset; after five-fold cross-validation, the average accuracy is 76.15 and the unweighted F1 score is 0.691. Compared with mainstream algorithms, the method shows an improvement.
DOI: https://doi.org/10.1117/12.3014469; Published: 2024-01-09; Pages: 1296908 - 1296908-10
Citations: 0
Rapid identification of adulterated rice using fusion of near-infrared spectroscopy and machine vision data: the combination of feature optimization and nonlinear modeling
Chenxuan Song, Jinming Liu, Chunqi Wang, Zhijiang Li
Rice is susceptible to mold and mildew during storage, and metabolites such as aflatoxin produced during mildew can do great harm to consumers. To meet the need for rapid detection of normal rice adulterated with moldy rice, a rapid identification method was established based on data fusion of near-infrared spectroscopy and machine vision. Competitive adaptive reweighted sampling (CARS), a genetic algorithm (GA), and least angle regression (LARS) were used for spectral and image feature extraction, combined with support vector classification (SVC), random forest (RF), and gradient boosting tree (GBT) nonlinear discriminant models, with Bayesian search used to optimize the modeling parameters. The results show that the GBT fusion data model built on LARS-optimized spectral and image feature variables has the highest discrimination accuracy, with recognition accuracy rates of 100.00% and 98.11% on its training and testing sets, respectively. Discrimination performance is significantly better than with near-infrared spectroscopy or machine vision alone. The results indicate that rapid identification of adulterated rice based on near-infrared spectroscopy and machine vision data fusion is feasible, providing theoretical support for the development of online identification equipment for adulterated rice.
DOI: https://doi.org/10.1117/12.3014380; Published: 2024-01-09; Pages: 129692J - 129692J-16
Citations: 0
Fast and high quality neural radiance fields reconstruction based on depth regularization
Bin Zhu, Gaoxiang He, Bo Xie, Yi Chen, Yaoxuan Zhu, Liuying Chen
Although Neural Radiance Fields (NeRF) have been shown to achieve high-quality novel view synthesis, existing models still perform poorly in some scenarios, particularly unbounded scenes. These models either require excessively long training times or produce suboptimal synthesis results. Consequently, we propose SD-NeRF, which consists of a compact neural radiance field model and self-supervised depth regularization. Experimental results demonstrate that SD-NeRF can shorten training time by over 20 times compared to Mip-NeRF360 without compromising reconstruction accuracy.
DOI: https://doi.org/10.1117/12.3014528; Published: 2024-01-09; Pages: 129692F - 129692F-9
Citations: 0
Combinatorial action recognition based on causal segment intervention
Xiaozhou Sun
Combinatorial action recognition has recently attracted the attention of researchers in computer vision. It focuses on the effective representation and discrimination of spatio-temporal interactions between different actions and objects in video data. Existing work tends to strengthen a framework's object recognition and relationship modeling capabilities, e.g., through attention mechanisms and graph structures. We find that existing algorithms can be influenced by interaction-independent segments of a video, which mislead the algorithm into focusing on irrelevant visual information. So that the algorithm analyzes the spatio-temporal interactions of the causally related segments of a video, a Causal Slice Recognition Network (CSRN) is proposed. This method effectively removes the interference of background segments by explicitly recognizing and extracting the causally related segments of the video. We validate the method on the Something-else dataset and obtain the best results.
DOI: https://doi.org/10.1117/12.3014465; Published: 2024-01-09; Pages: 129692W - 129692W-6
Citations: 0