Color image quantization is an important task in graphics manipulation and image processing, and its key step is generating an efficient color palette. Many color image quantization methods have been proposed, most of which are fundamentally clustering-based algorithms; the K-means clustering algorithm is a popular example. However, K-means has received limited attention in color quantization because of the high computational cost of its many iterations and its sensitivity to initialization. This paper presents an efficient color clustering method for fast color quantization. It addresses the drawbacks of conventional K-means by reducing the number of data samples and using the triangle inequality to accelerate the nearest-neighbor search. The method has two stages: in the first, an initial palette is generated; in the second, quantized images are produced by a modified K-means procedure. The main modifications are data sampling and mean sorting, which avoid traversing all cluster centers and shorten the palette search time. Experimental results show that the method is competitive with previously published color quantization algorithms in both efficiency and effectiveness.
{"title":"Fast image quantization with efficient color clustering","authors":"Yingying Liu","doi":"10.1117/12.2668985","DOIUrl":"https://doi.org/10.1117/12.2668985","url":null,"abstract":"Color image quantization has been widely used as an important task in graphics manipulation and image processing. The key to color image quantization is to generate an efficient color palette. At present, there are many color image quantization methods that have been presented, which are fundamentally clustering-based algorithms. As an illustration, the K-means clustering algorithm is quite popular. However, the K-means algorithm has not been given sufficient focus in the field of color quantization due to its high computational effort caused by multiple iterations and its very susceptibility to initialization. This paper presented an efficient color clustering method to implement fast color quantization. This method mainly addresses the drawbacks of the conventional K-means clustering algorithm, which involves reducing the data samples and making use of triangular inequalities to accelerate the nearest neighbor search. The method mainly contains two stages. During the first phase, an initial palette is generated. In the second phase, quantized images are generated by a modified K-means method. Major modifications include data sampling and mean sorting, avoiding traversal of all cluster centers, and speeding up the time to search the palette. The experimental results illustrate that this presented method is quite competitive with previously presented color quantization algorithms both in the matter of efficiency and effectiveness.","PeriodicalId":236099,"journal":{"name":"International Workshop on Frontiers of Graphics and Image Processing","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129494544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the development of the mobile internet, real-time target detection on mobile devices has wide application prospects, but limited terminal computing power constrains both the speed and the accuracy of detection. Edge-cloud collaborative computing is the main way to compensate for this shortfall, yet current methods do not solve the problem of computation scheduling in an edge-cloud collaboration system. To address these problems, this paper proposes pruning of classical deep learning target detection networks; a training and prediction offloading strategy for the edge-to-cloud network; and a dynamic load-balancing migration strategy based on changes in CPU, memory, bandwidth, and disk state across the cluster. In tests, the edge-to-cloud deep learning method reduces inference delay by 50% and increases system throughput by 40%, while the maximum waiting time for a job is reduced by about 20%. The efficiency and accuracy of target detection are effectively improved.
{"title":"Performance optimization of target detection based on edge-to-cloud deep learning","authors":"Zhongkui Fan, Yepeng Guan","doi":"10.1117/12.2668891","DOIUrl":"https://doi.org/10.1117/12.2668891","url":null,"abstract":"With the development of mobile internet, real-time target detection using mobile devices has wide application prospects, but the computing power of the terminal greatly limits the speed and accuracy of target detection. Edge-cloud collaborative computing is the main method to solve the lack of computing power of mobile terminals. The current method can't settle the problem of computation scheduling in the edge-cloud collaboration system. Given the existing problems, this paper proposes the pruning technology of classical target detection deep learning networks; training and prediction offloading strategy of edge-to-cloud deep learning network; dynamic load balancing migration strategy based on CPU, memory, bandwidth, and disk state-changing in cluster. After testing, the edge-to-cloud deep learning method can reduce the inference delay by 50% and increase the system throughput by 40%. The maximum waiting time for operation can be reduced by about 20%. The efficiency and accuracy of target detection are effectively improved.","PeriodicalId":236099,"journal":{"name":"International Workshop on Frontiers of Graphics and Image Processing","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125768413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the increasing number of private cars, traffic management departments pay growing attention to vehicle traffic problems, and in daily management video images are often the most intuitive, effective, and fast source of information. In practice, managing vehicle access to a parking lot is complex and laborious, since most lots still rely on manual checks to record vehicles entering and leaving. To solve this problem, this paper designs a parking lot vehicle entry system based on human image recognition and analysis technology that records the information of incoming and outgoing personnel and vehicles efficiently, so that the data can be analyzed statistically and a response plan can be produced quickly in an emergency. The system also improves road traffic efficiency and safety, and gives the relevant staff a simple, fast, and easy-to-operate tool, which has practical value.
{"title":"Design of parking lot vehicle entry system based on human image recognition analysis technology","authors":"Liang Zhu, Junhong Xi","doi":"10.1117/12.2669161","DOIUrl":"https://doi.org/10.1117/12.2669161","url":null,"abstract":"Due to the increasing number of private cars, traffic management departments are paying more and more attention to vehicle traffic problems. In the daily management, video image is often the most intuitive, effective and fast way to obtain information resources. In the actual parking lot, vehicle access management is a very complex and difficult job. As most of the parking lots are scanned manually to complete the task of entering and leaving the parking lot. In order to solve this problem, this paper is based on the human image recognition analysis technology to realize the effective and fast recording of incoming and outgoing personnel and vehicle information of the parking lot vehicle entry system, so that these information can be statistically analyzed and the corresponding processing plan can be made quickly when there is an unexpected situation, and at the same time can improve the road traffic efficiency and safety performance, and also provide a simple and fast, easy to operate work for the relevant staff. It also provides a simple, fast and easy to operate method for the relevant staff, which has certain practical value.","PeriodicalId":236099,"journal":{"name":"International Workshop on Frontiers of Graphics and Image Processing","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116096286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Video frame interpolation (VFI), which aims to synthesize intermediate frames from bidirectional historical references, has made remarkable progress with the development of deep convolutional neural networks (CNNs) in recent years. Existing CNNs generally struggle with large motions because of the locality of convolution operations, resulting in slow inference. We introduce the Real-time Video Frame Interpolation Transformer (RVFIT), a novel framework that overcomes this limitation. Unlike traditional CNN-based methods, RVFIT does not process video frames separately with different network modules in the spatial domain; instead, it batches adjacent frames through a single end-to-end, UNet-style Transformer architecture. Moreover, it adds two-stage interpolation sampling before and after the end-to-end network to get the most out of classical computer vision processing. Experiments show that, compared with the state-of-the-art TMNet, RVFIT achieves comparable quality with only 50% of the parameters (6.2M vs. 12.3M) and 80% higher speed (26.1 fps vs. 14.3 fps at a frame size of 720×576).
{"title":"RVFIT: Real-time Video Frame Interpolation Transformer","authors":"Linlin Ou, Yuanping Chen","doi":"10.1117/12.2669055","DOIUrl":"https://doi.org/10.1117/12.2669055","url":null,"abstract":"Video frame interpolation (VFI), which aims to synthesize predictive frames from bidirectional historical references, has made remarkable progress with the development of deep convolutional neural networks (CNNs) over the past years. Existing CNNs generally face challenges in handing large motions due to the locality of convolution operations, resulting in a slow inference structure. We introduce a Real-time video frame interpolation transformer (RVFIT), a novel framework to overcome this limitation. Unlike traditional methods based on CNNs, this paper does not process video frames separately with different network modules in the spatial domain but batches adjacent frames through a single UNet-style structure end-to-end Transformer network architecture. Moreover, this paper creatively sets up two-stage interpolation sampling before and after the end-to-end network to maximize the performance of the traditional CV algorithm. The experimental results show that compared with SOTA TMNet, RVFIT has only 50% of the network size (6.2M vs 12.3M, parameters) while ensuring comparable performance, and the speed is increased by 80% (26.1 fps vs 14.3 fps, frame size is 720*576).","PeriodicalId":236099,"journal":{"name":"International Workshop on Frontiers of Graphics and Image Processing","volume":"211 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133114220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Analyzing data from the Sloan Digital Sky Survey (SDSS) Data Release 16, in which quasar spectra form the main sample, we investigate possible variations of the fine-structure constant on cosmological time scales. Applying the emission-line method to the [OIII] doublet of 14,495 quasar samples (redshift z < 1) constrained in the literature, we obtain Δα/α = (0.70 ± 1.6) × 10⁻⁵. We also investigate the precision limit of fine-structure-constant measurements from SDSS spectra by simulating the three main sources of systematics: noise, gas outflow, and skylines. In addition, we perform a cross-correlation analysis on a high-resolution MagE spectrum (observed at the Magellan II Clay telescope) of J131651.29+055646.9 and obtain Δα/α = (−9.16 ± 11.38) × 10⁻⁷. Better constraints (e.g., a skyline-subtraction algorithm) may improve the SDSS precision slightly; a more promising and efficient approach is to constrain Δα/α with high-resolution spectroscopy and large active galaxy/QSO surveys.
{"title":"Measuring the fine-structure constant on quasar spectra: High spectral resolution gains more than large size of moderate spectral resolution spectra","authors":"Haoran Liang, Zhe Wu","doi":"10.1117/12.2670012","DOIUrl":"https://doi.org/10.1117/12.2670012","url":null,"abstract":"By analyzing the data from the Sloan Digital Sky Survey (SDSS) Data Release 16, which the spectra of the Quasars are major samples, we focus on the investigation of the possible variations of the fine structure constant on the cosmological temporal scales over the universe. We analyzed 14495 quasar samples (red shift z<1) constrained in the literature by using emission-line method on [OIII] doublet and obtained Δα/α=0.70±1.6×10-5. We investigated the precision limit for the measurement of fine-structure constant by SDSS spectrum analysis by designing the simulation about three main sources of systematics: Noise, Outflow of gas, and Skyline. In addition, we exerted cross-correlation analysis on a high-resolution spectrum from MagE (MagE Observations at the Magellan II Clay telescope) named “J131651.29+055646.9” and got the result Δα/α=-9.16±11.38×10-7. Better constraints (Skyline subtraction algorithm) may improve the precision slightly by using SDSS. The more possible and efficient method may be to constrain Δα/α with the spectra of high-resolution spectroscopy and large active galaxy/QSO surveys.","PeriodicalId":236099,"journal":{"name":"International Workshop on Frontiers of Graphics and Image Processing","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113998950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
To avoid errors in distance measurement caused by physical characteristics, environmental influences, human error, and so on, visual measurement and modern digital image processing techniques are used. This requires building an image acquisition system and detecting the edges of the acquired images with specialized procedures, on the basis of which the differences between key points on the image edges enter the measurement equation. The values obtained show that the final results of this measurement method are consistent with the actual values, confirming the accuracy of digital image correlation processing in distance measurement. Digital image processing technology is a product of advanced manufacturing technology, and the rapid development of computer technology has advanced digital image recognition and analysis capabilities. In engineering surveying, it is mainly applied to engineering construction and management, where it can greatly reduce manufacturing error and shorten inspection cycles. Based on digital image processing technology, this paper proposes an engineering displacement measurement method, an industrial part size measurement method, and an industrial thread standard measurement method. Compared with traditional manual measurement, digital image technology shortens the work cycle and improves efficiency.
{"title":"Application of digital pictures process technique in engineering surveying","authors":"Xiaowen Hu, Yeming Wang","doi":"10.1117/12.2669154","DOIUrl":"https://doi.org/10.1117/12.2669154","url":null,"abstract":"In order to avoid some errors in distance measurement, e.g. due to physical characteristics, environmental influences, human errors etc. during the measurement process, visual measurement and modern digital image related processing techniques are used. This requires the creation of relevant image acquisition systems and the detection of the edges of the acquired images using specialised procedures, on the basis of which the differences between the basic points of the image edges are brought into the measurement equation. The values obtained prove that the final results of this measurement method are consistent with the actual values, proving the accuracy of digital image correlation processing techniques in distance measurement. Digital image processing technology is a derivative of advanced manufacturing technology, and the rapid development of computer technology has led to the development of digital image recognition and image analysis capabilities. Engineering survey research is mainly applied in the process of engineering construction and engineering management, which can greatly reduce the error of engineering manufacturing and the period of engineering inspection. Based on digital image processing technology, this paper proposes an engineering displacement measurement method, an industrial part size measurement method and an industrial thread standard measurement method. Compared with the traditional manual measurement technology, the use of digital image technology can shorten the working period and improve the working efficiency.","PeriodicalId":236099,"journal":{"name":"International Workshop on Frontiers of Graphics and Image Processing","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128851527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Naked-eye 3D imaging uses the cell phone's front camera to detect the position of the human eyes. Because the front camera sits some distance from the center of the phone screen, there is an offset between the eye position the camera detects and the position relative to the screen, while the 3D display switches views using, as its origin, the midpoint of the two eyes detected when the viewer looks directly at the screen center. This paper therefore corrects the eye-positioning offset by a direct measurement method combined with formula derivation.
{"title":"Offset correction scheme for human eye positioning in naked eye 3D for Android","authors":"Ke Wang, zonghai pan, Yuting Chen, Fei Li, C. Lan","doi":"10.1117/12.2669407","DOIUrl":"https://doi.org/10.1117/12.2669407","url":null,"abstract":"Naked-eye 3D imaging needs to call the cell phone camera to detect the position of the human eye, the front camera of the cell phone is a certain distance away from the center of the cell phone screen, so there is a certain offset between the front camera and the human eye position detected by the front camera and the cell phone screen, and the 3D image display is the center of the two eyes detected by the front camera when the human eye looks directly at the center of the cell phone screen as the origin to switch the image. Therefore, the human eye positioning offset problem will be solved by direct measurement method and formula derivation method.","PeriodicalId":236099,"journal":{"name":"International Workshop on Frontiers of Graphics and Image Processing","volume":"87 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127410766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Verification of IP cores implementing image processing algorithms is important for SoC and FPGA applications in machine vision. This paper proposes a general-purpose, real-time, and agile verification framework for such IP cores using a heterogeneous platform composed of an ARM processor and an FPGA. In the framework, Gigabit Ethernet communication is established between the PC and the ARM side. The FPGA hosts a data bus compatible with multiple image types and, combined with partial reconfiguration, enables fast iteration of the algorithm IP cores under verification. The framework is reusable across algorithm IP cores, and deploying an IP core under test is 25 times faster than with global reconfiguration. Compared with existing FPGA verification techniques, it offers better reusability, a shorter verification cycle, more targeted test stimuli, and faster deployment of the IP cores to be verified.
{"title":"Research on verification framework of image processing IP core based on real-time reconfiguration","authors":"Wei Mo, Lu Zhao, Jianping Wen","doi":"10.1117/12.2669153","DOIUrl":"https://doi.org/10.1117/12.2669153","url":null,"abstract":"The verification of IP core with image processing algorithm is important for SoC and FPGA application in the field of machine vision. This paper proposes a verification framework with general purpose, real-time performance and agility for IP core with image processing algorithm by using heterogeneous platform composed of ARM and FPGA. In the verification framework, the Gigabit Ethernet communication between PC and ARM is established. The FPGA is used to build the data bus to be compatible with multiple types of images, and combine with a partial reconfiguration to achieve fast iteration of IP cores of the algorithm to be verified. The validation framework is reusable for the algorithm IP core, and the deployment speed of the IP cores to be verified is 25 times faster than global reconfiguration. Compared with the existing FPGA verification technology, it has better reusability, shorter verification cycle, more targeted test stimulus, and faster deployment of IP cores to be verified.","PeriodicalId":236099,"journal":{"name":"International Workshop on Frontiers of Graphics and Image Processing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115017067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Object grasping is a challenging problem in computer vision and robotics. Existing algorithms generally have large numbers of training parameters, which leads to long training times and demands high-performance hardware. In this paper, we present a lightweight neural network for object grasping. Our network generates grasps at real-time speeds (∼30 ms) and can therefore run on mobile devices. The main idea of GhostNet is to reduce the parameter count by generating some feature maps cheaply from others during convolution. We adopt this idea and also apply it to the deconvolution process, and we build the lightweight grasp network on these two components. Extensive experiments on grasping datasets show that our network performs well: we achieve 94% accuracy on the Cornell grasp dataset and 91.8% on the Jacquard dataset, while requiring only 15% of the parameters and 47% of the training time of traditional models.
{"title":"A lightweight object grasping network using GhostNet","authors":"Yangfan Deng, Qinghua Guo, Yong Zhao, Junli Xu","doi":"10.1117/12.2669156","DOIUrl":"https://doi.org/10.1117/12.2669156","url":null,"abstract":"Object grasping is a very challenging problem in computer vision and robotics. Existing algorithms generally have a large number of training parameters, which lead to long training times and require high performance facilities. In this paper, we present a lightweight neural network to solve the problem of object grasping. Our network is able to generate grasps at real-time speeds (∼30ms), thus can be used on mobile devices. The main idea of GhostNet is to reduce the number of parameters by generating feature maps from each other in the process of convolution. We adopt this idea and apply it on the deconvolution process. Besides, we construct the lightweight grasp network based on these two processes. A lot of experiments on grasping datasets demonstrate that our network performs well. We achieve accuracy of 94% on Cornell grasp dataset and 91.8% on Jacquard dataset. At the same time, compared to traditional models, our model only requires 15% of the number of parameters and 47% of training time.","PeriodicalId":236099,"journal":{"name":"International Workshop on Frontiers of Graphics and Image Processing","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132101477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Focusing on how to obtain high-quality and sufficient synthetic aperture radar (SAR) data for deep learning, this paper proposes a new method named SARCUT (Self-Attention Relativistic Contrastive Learning for Unpaired Image-to-Image Translation) that translates optical images into SAR images. To improve the coherence of the generated images and stabilize training, we construct a generator with a self-attention mechanism and spectral normalization. A relativistic discrimination adversarial loss function is designed to accelerate model convergence and improve the authenticity of the generated images. Experiments on open datasets with six quantitative image evaluation metrics show that the model learns deeper internal relations and the main features shared across the source images. Compared with classical methods, SARCUT is better at establishing the mapping to the real image domain, and both the quality and the authenticity of the generated images improve significantly.
{"title":"SARCUT: Contrastive learning for optical-SAR image translation with self-attention and relativistic discrimination","authors":"Yusen ZHANG, Min Li, Wei Cai, Yao Gou, Shuaibing Shi","doi":"10.1117/12.2669086","DOIUrl":"https://doi.org/10.1117/12.2669086","url":null,"abstract":"Focusing on how to obtain high-quality and sufficient synthetic aperture radar (SAR) data in deep learning, this paper proposed a new mothed named SARCUT (Self-Attention Relativistic Contrastive Learning for Unpaired Image-to-Image Translation) to translate optical images into SAR images. In order to improve the coordination of generated images and stabilize the training process, we constructed a generator with the self-attention mechanism and spectral normalization operation. Meanwhile, relativistic discrimination adversarial loss function was designed to accelerate the model convergence and improved the authenticity of the generated images. Experiments on open datasets with 6 image quantitative evaluation metrics showed our model can learn the deeper internal relations and main features between multiple source images. Compared with the classical methods, SARCUT has more advantages in establishing the real image domain mapping, both the quality and authenticity of the generated image are significantly improved.","PeriodicalId":236099,"journal":{"name":"International Workshop on Frontiers of Graphics and Image Processing","volume":"12644 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131223347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}