
2021 International Conference on Visual Communications and Image Processing (VCIP): Latest Publications

Aerial Image Classification with Label Splitting and Optimized Triplet Loss Learning
Pub Date: 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675441
Rijun Liao, Zhu Li, S. Bhattacharyya, George York
With the development of airplane platforms, aerial image classification plays an important role in a wide range of remote sensing applications. Most aerial image datasets are very limited in size compared with other computer vision datasets. Unlike many works that use data augmentation to solve this problem, we adopt a novel strategy, called label splitting, to deal with limited samples. Specifically, each sample retains its original semantic label, and label splitting assigns it an additional appearance label via unsupervised clustering. An optimized triplet loss learning is then applied to distill domain-specific knowledge; this is achieved through a binary-tree forest partitioning and a triplet selection and optimization scheme that controls triplet quality. Simulation results on the NWPU, UCM and AID datasets demonstrate that the proposed solution achieves state-of-the-art performance in aerial image classification.
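As a rough illustration of the strategy described above, the sketch below assigns appearance labels by k-means clustering and scores a triplet with a standard triplet loss. This is not the authors' code: the feature source, cluster count, and margin are assumptions, and the paper's binary-tree forest partitioning is not reproduced here.

```python
import numpy as np
from sklearn.cluster import KMeans

def split_labels(features, n_appearance_clusters=4):
    """Label splitting: assign each sample an extra appearance label
    via unsupervised k-means clustering of its features."""
    return KMeans(n_clusters=n_appearance_clusters, n_init=10).fit_predict(features)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Pull the positive closer to the anchor than the negative, by a margin."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(0.0, d_ap - d_an + margin)

# Toy usage: 100 samples with 64-D features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 64))
appearance_labels = split_labels(feats)   # new labels alongside the semantic ones
loss = triplet_loss(feats[0], feats[1], feats[2])
```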
Citations: 2
Towards Universal GAN Image Detection
Pub Date: 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675329
D. Cozzolino, Diego Gragnaniello, G. Poggi, L. Verdoliva
The ever higher quality and wide diffusion of fake images have spawned a quest for reliable forensic tools. Many GAN image detectors have been proposed recently. In real-world scenarios, however, most of them show limited robustness and generalization ability. Moreover, they often rely on side information not available at test time; that is, they are not universal. We investigate these problems and propose a new GAN image detector based on a limited sub-sampling architecture and a suitable contrastive learning paradigm. Experiments carried out in challenging conditions prove the proposed method to be a first step towards universal GAN image detection, ensuring good robustness to common image impairments and good generalization to unseen architectures.
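For readers unfamiliar with the contrastive paradigm such detectors build on, here is a generic InfoNCE-style loss in Python. The paper's actual network, sub-sampling architecture, and pairing strategy are not given in the abstract, so everything below is an illustrative assumption.

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Contrastive loss: the anchor should match its positive view
    more strongly than any of the negatives."""
    def sim(a, b):  # cosine similarity
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    logits = np.array([sim(anchor, positive)] +
                      [sim(anchor, n) for n in negatives]) / temperature
    logits -= logits.max()                    # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                  # the positive sits at index 0

# Toy usage with random 64-D embeddings.
rng = np.random.default_rng(0)
a, p = rng.normal(size=64), rng.normal(size=64)
negs = [rng.normal(size=64) for _ in range(8)]
loss = info_nce(a, p, negs)
```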
Citations: 16
Deep Color Constancy Using Spatio-Temporal Correlation of High-Speed Video
Pub Date: 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675406
Dong-Jae Lee, Kang-Kyu Lee, Jong-Ok Kim
Since the invention of the electric bulb, most of the lights around us have been powered by alternating current (AC). The resulting intensity variation can be captured with a high-speed camera, and the intensity difference between consecutive video frames can be exploited for various vision tasks. For color constancy, conventional methods usually focus on exploiting only spatial features. To overcome their limitations, a couple of methods that utilize AC flickering have been proposed; previous work employed the temporal correlation between high-speed video frames. To go further, we propose a deep spatio-temporal color constancy method using both spatial and temporal correlations. To extract temporal features for illuminant estimation, we calculate the temporal correlation between feature maps, where global as well as local features are learned. By learning global features through spatio-temporal correlation, the proposed method estimates illumination more accurately and is particularly robust in noisy practical environments. Experimental results demonstrate that its performance is superior to that of existing methods.
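As an illustration of the temporal-correlation step, the sketch below computes a per-channel Pearson correlation between feature maps of consecutive frames. The map shapes and the downstream illuminant estimator are assumptions, not the paper's network.

```python
import numpy as np

def temporal_correlation(feat_t, feat_t1):
    """Pearson correlation per channel between the feature maps of
    two consecutive frames."""
    c = feat_t.shape[0]
    corr = np.empty(c)
    for k in range(c):
        a = feat_t[k].ravel() - feat_t[k].mean()
        b = feat_t1[k].ravel() - feat_t1[k].mean()
        corr[k] = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
    return corr  # one correlation score per channel

# Toy usage: 8-channel 16x16 feature maps from consecutive high-speed frames.
rng = np.random.default_rng(1)
f0, f1 = rng.normal(size=(8, 16, 16)), rng.normal(size=(8, 16, 16))
scores = temporal_correlation(f0, f1)
```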
Citations: 0
Two Stage Optimal Bit Allocation for HEVC Hierarchical Coding Structure
Pub Date: 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675396
Zizheng Liu, Zhenzhong Chen, Shan Liu
In this paper, we propose a two-stage optimal bit allocation scheme for the HEVC hierarchical coding structure. The two stages, i.e., frame-level and CTU-level bit allocation, are conducted separately in traditional rate control methods. In our method, the optimal allocation of the second stage is considered first; the second-stage allocation strategy is then treated as foreknowledge in the first stage and applied to guide frame-level bit allocation. With this formulation, the two-stage bit allocation problem can be converted into a joint optimization problem. Solving it yields the two-stage optimal bit allocation scheme, in which a more appropriate number of bits can be allocated to each frame and each CTU. Experimental results show that the proposed method brings higher coding efficiency while precisely satisfying the bit rate constraint.
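To give a feel for Lagrangian bit allocation of this kind, the sketch below solves a single-level version in closed form under an assumed hyperbolic R-D model D_i(R_i) = C_i * R_i^(-K). The paper's joint frame/CTU formulation is more elaborate; this is only a baseline illustration.

```python
import numpy as np

def allocate_bits(complexities, total_bits, K=1.0):
    """Closed-form optimum of sum_i C_i * R_i^-K  s.t.  sum_i R_i = total_bits:
    setting the derivatives equal shows bits are shared in proportion
    to C_i^(1/(K+1))."""
    c = np.asarray(complexities, dtype=float)
    weights = c ** (1.0 / (K + 1.0))
    return total_bits * weights / weights.sum()

# Toy usage: four frames of a GOP with different content complexity.
print(allocate_bits([4.0, 2.0, 1.0, 1.0], total_bits=8000))
```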
Citations: 0
Alpha-trimmed Mean Filter and XOR based Image Enhancement for Embedding Data in Image
Pub Date: 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675315
S. Alam, Moid Ul Huda, Muhammad Farhan
In the age of digital content creation and distribution, steganography, that is, the hiding of secret data within other data, is needed in many applications, such as secret communication between two parties and piracy protection. In image steganography, secret data is generally embedded within the image in an additional step after a mandatory image enhancement process. In this paper, we propose embedding the data during the image enhancement process itself, which saves the additional work required to separately encode the data inside the cover image. We use the alpha-trimmed mean filter for image enhancement and the XOR of the 6 MSBs for embedding two bits of the bitstream in the 2 LSBs, with extraction being the reverse process. The quantitative and qualitative results obtained are better than those of a methodology presented in a very recent paper.
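To make the embedding mechanics concrete, below is a minimal Python sketch (not the authors' code) of the two building blocks the abstract names: an alpha-trimmed mean filter and an XOR-keyed 2-bit LSB embedding. The exact folding of the 6 MSBs into 2 key bits is not specified in the abstract, so the alternating-bit XOR used here is an assumption.

```python
import numpy as np

def alpha_trimmed_mean(window, alpha=2):
    """Mean of the window after discarding the alpha smallest and
    alpha largest samples (rejects outliers such as noise spikes)."""
    w = np.sort(window.ravel())
    return w[alpha:len(w) - alpha].mean()

def msb_key(pixel):
    """Fold the 6 MSBs into 2 key bits (assumed scheme: XOR of
    alternating MSBs)."""
    bits = [(pixel >> i) & 1 for i in range(2, 8)]  # the 6 MSBs
    return ((bits[0] ^ bits[2] ^ bits[4]) << 1) | (bits[1] ^ bits[3] ^ bits[5])

def embed(pixel, data2):
    """Write the 2 data bits, XORed with the MSB key, into the 2 LSBs."""
    return (pixel & 0b11111100) | (data2 ^ msb_key(pixel))

def extract(pixel):
    """Recover the 2 data bits; XOR with the same key is self-inverse."""
    return (pixel & 0b11) ^ msb_key(pixel)

print(alpha_trimmed_mean(np.array([[1, 2, 3], [4, 250, 6], [7, 8, 9]])))  # 5.6
stego = embed(173, 0b10)
assert extract(stego) == 0b10  # round-trips because the 6 MSBs are untouched
```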
Citations: 0
RDPlot – An Evaluation Tool for Video Coding Simulations
Pub Date: 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675399
J. Schneider, Johannes Sauer, M. Wien
RDPlot is an open-source GUI application for plotting Rate-Distortion (RD) curves and calculating Bjøntegaard Delta (BD) statistics [1]. It supports parsing the output of commonly used reference software packages as well as *.csv- and *.xml-formatted files. Once the data are parsed, RDPlot offers the ability to evaluate video coding results interactively. Conceptually, several measures can be plotted over the bitrate, and BD measurements can be conducted accordingly. Moreover, plots and corresponding BD statistics can be exported and directly integrated into LaTeX documents.
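As background on the BD statistics such tools compute, here is a minimal sketch of the classic Bjøntegaard Delta rate calculation: a cubic fit of log-rate over PSNR, integrated over the overlapping quality range. The rate/PSNR points are toy values, and this is not RDPlot's own implementation.

```python
import numpy as np

def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average rate difference (%) between two R-D curves, integrated
    over the overlapping PSNR range after a cubic fit of log10(rate)."""
    lr_ref, lr_test = np.log10(rate_ref), np.log10(rate_test)
    p_ref = np.polyfit(psnr_ref, lr_ref, 3)
    p_test = np.polyfit(psnr_test, lr_test, 3)
    lo = max(min(psnr_ref), min(psnr_test))   # overlapping PSNR interval
    hi = min(max(psnr_ref), max(psnr_test))
    int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)
    avg_diff = (int_test - int_ref) / (hi - lo)
    return (10 ** avg_diff - 1) * 100         # negative means rate savings

# Toy usage: four rate points per codec (kbit/s, dB).
print(bd_rate([500, 1000, 2000, 4000], [33.0, 35.5, 38.0, 40.5],
              [450, 900, 1800, 3600], [33.1, 35.6, 38.1, 40.6]))
```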
Citations: 0
Optimization of Probability Distributions for Residual Coding of Screen Content
Pub Date: 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675326
Hannah Och, T. Strutz, A. Kaup
Probability distribution modeling is the basis for most competitive methods for lossless coding of screen content. One such state-of-the-art method is known as soft context formation (SCF). For each pixel to be encoded, a probability distribution is estimated based on the neighboring pattern and the occurrence of that pattern in the already encoded image. Using an arithmetic coder, the pixel color can thus be encoded very efficiently, provided that the current color has been observed before in association with a similar pattern. If this is not the case, the color is instead encoded using a color palette or, if it is still unknown, via residual coding. Both palette-based coding and residual coding have significantly worse compression efficiency than coding based on soft context formation. In this paper, the residual coding stage is improved by adaptively trimming the probability distributions for the residual error. Furthermore, an enhanced probability model is proposed for signaling a new color, conditioned on the occurrence of new colors in the neighborhood. These modifications yield an average bitrate reduction of up to 2.9%. Compared to HEVC (HM-16.21 + SCM-8.8) and FLIF, the improved SCF method saves about 11% and 18% rate on average, respectively.
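As an illustration of the trimming idea, the sketch below restricts a residual symbol distribution to an assumed feasible range and renormalizes it before arithmetic coding, so no probability mass is wasted on symbols that cannot occur. The paper's adaptive trimming rule is not spelled out in the abstract; clamping to an observed range is an assumption.

```python
import numpy as np

def trimmed_distribution(counts, lo, hi):
    """Zero out residual symbols outside [lo, hi] and renormalize,
    concentrating probability mass on the feasible range."""
    p = np.asarray(counts, dtype=float)
    p[:lo] = 0.0
    p[hi + 1:] = 0.0
    return p / p.sum()

# Toy usage: residuals could span [0, 255], but only [120, 135] are feasible.
counts = np.ones(256)                      # uniform prior over all symbols
p = trimmed_distribution(counts, 120, 135)  # 16 symbols at 1/16 each
```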
Citations: 2
Using Regularity Unit As Guidance For Summarization-Based Image Resizing
Pub Date: 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675328
Fang-Tsung Hsiao, Yi Lin, Yi-Chang Lu
In this paper, we propose a novel algorithm for summarization-based image resizing. Previously, the precise locations of repeating patterns had to be detected before the pattern removal step of resizing. However, it is difficult to find repeating patterns that are illuminated under different lighting conditions and viewed from different perspectives. To solve this problem, we first identify the regularity unit of the repeating patterns statistically. The regularity unit then guides shift-map optimization to obtain a better resized image. Experimental results show that our method is competitive with other well-known methods.
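To illustrate how a regularity unit might be found statistically, the sketch below estimates the period of a repeating pattern from autocorrelation peaks along one axis. Real regularity-unit detection is two-dimensional and more robust; this 1-D version is only an illustrative assumption.

```python
import numpy as np

def pattern_period(signal):
    """Return the lag of the strongest non-trivial autocorrelation peak,
    i.e. the estimated period of the repeating pattern."""
    s = signal - signal.mean()
    ac = np.correlate(s, s, mode="full")[len(s) - 1:]  # keep lags >= 0
    ac[0] = -np.inf                                    # ignore the lag-0 peak
    return int(np.argmax(ac))

# Toy usage: a row of an image with a repeating unit of width 16.
row = np.tile(np.arange(16), 8).astype(float)
print(pattern_period(row))   # -> 16
```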
Citations: 0
No-Reference Stereoscopic Image Quality Assessment Considering Binocular Disparity and Fusion Compensation
Pub Date: 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675398
Jinhui Feng, Sumei Li, Yongli Chang
In this paper, we propose an optimized dual-stream convolutional neural network (CNN) that considers binocular disparity and fusion compensation for no-reference stereoscopic image quality assessment (SIQA). Different from previous methods, we extract both disparity and fusion features at multiple levels to simulate the hierarchical processing of stereoscopic images in the human brain. Given that ocular dominance plays an important role in quality evaluation, a fusion weights assignment module (FWAM) is proposed to assign weights that guide the fusion of the left and right features, respectively. Experimental results on four public stereoscopic image databases show that the proposed method is superior to state-of-the-art SIQA methods on both symmetrically and asymmetrically distorted stereoscopic images.
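As a rough sketch of FWAM-style weighting, the code below gates the fusion of left- and right-view features with a softmax over per-view scores, mimicking ocular dominance. The scoring function and feature shapes are assumptions, not the paper's module.

```python
import numpy as np

def fuse_views(feat_left, feat_right):
    """Blend left/right view features with softmax weights derived from
    a per-view score (here simply the mean activation)."""
    scores = np.array([feat_left.mean(), feat_right.mean()])
    w = np.exp(scores - scores.max())   # stable softmax over the two views
    w /= w.sum()
    return w[0] * feat_left + w[1] * feat_right, w

# Toy usage with random 32-D view features.
rng = np.random.default_rng(2)
fused, weights = fuse_views(rng.normal(size=32), rng.normal(size=32))
```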
Citations: 1
Rethinking Anchor-Object Matching and Encoding in Rotating Object Detection
Pub Date: 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675390
Zhiyuan Huang, Zhaohui Hou, Pingyu Wang, Fei Su, Zhicheng Zhao
Rotating object detection is more challenging than horizontal object detection because of the multiple orientations of the objects involved. In recent anchor-based rotating object detectors, the IoU-based matching mechanism suffers from mismatching and wrong-matching problems. Moreover, the encoding mechanism does not correctly reflect the location relationships between anchors and objects. In this paper, an RBox-Diff-based matching (RDM) mechanism and an angle-first encoding (AE) method are proposed to solve these problems. RDM optimizes anchor-object matching by replacing IoU (Intersection-over-Union) with a new measure called RBox-Diff, while AE optimizes the encoding mechanism to make the encoding results more consistent with the relative positions of objects and anchors. The proposed methods can easily be applied to most anchor-based rotating object detectors without introducing extra parameters. Extensive experiments on the DOTA-v1.0 dataset show the effectiveness of the proposed methods over other advanced methods.
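For context on what AE revises, the sketch below shows the conventional anchor-to-box delta encoding for rotated boxes (x, y, w, h, theta). This is the common baseline, not the paper's angle-first variant, whose exact form the abstract does not give.

```python
import numpy as np

def encode_rbox(anchor, gt):
    """Conventional deltas for rotated boxes: centers normalized by the
    anchor size, log size ratios, and a plain angle offset."""
    ax, ay, aw, ah, at = anchor
    gx, gy, gw, gh, gtheta = gt
    return np.array([(gx - ax) / aw,
                     (gy - ay) / ah,
                     np.log(gw / aw),
                     np.log(gh / ah),
                     gtheta - at])

# Toy usage: anchor and ground truth as (x, y, w, h, theta in radians).
print(encode_rbox((50, 50, 40, 20, 0.0), (54, 48, 44, 18, 0.1)))
```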
Citations: 0