
Latest publications from the 2021 International Conference on Visual Communications and Image Processing (VCIP)

On the Impact of Viewing Distance on Perceived Video Quality
Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675431
Hadi Amirpour, R. Schatz, C. Timmerer, M. Ghanbari
As optimizing the quality and efficiency of video streaming delivery grows in importance, so does the accurate assessment of user-perceived video quality. However, because real-world viewing settings involve a wide range of viewing distances, perceived video quality can vary significantly in everyday viewing situations. In this paper, we investigate and quantify the influence of viewing distance on perceived video quality. A subjective experiment was conducted with full HD sequences at three different fixed viewing distances, with each video sequence encoded at three different quality levels. Our results confirm that viewing distance has a significant influence on quality assessment. In particular, they show that an increased viewing distance generally leads to increased perceived video quality, especially at low encoding quality levels. In this context, we also estimate the bitrate savings that knowledge of the actual viewing distance would enable in practice. Since current objective video quality metrics do not systematically take viewing distance into account, we also analyze and quantify its influence on the correlation between objective and subjective metrics. Our results confirm the need for distance-aware objective metrics when accurate prediction of perceived video quality in real-world environments is required.
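A minimal sketch of how the distance-dependent agreement between an objective metric and subjective scores could be quantified, in the spirit of the paper's correlation analysis. All the numbers below are hypothetical placeholders (distances in screen heights, PSNR-like objective scores, MOS values), not the paper's measurements; scipy supplies the correlation statistics, and note that PLCC is often computed after a nonlinear regression step, which is skipped here.

```python
# Sketch: quantify how viewing distance affects the agreement between an
# objective metric (e.g., PSNR) and subjective MOS. All data below are
# hypothetical placeholders, not the paper's measurements.
from scipy.stats import pearsonr, spearmanr

# One list of (objective score, MOS) pairs per fixed viewing distance.
scores_by_distance = {
    "1.5H": ([32.1, 35.4, 38.9, 41.2], [2.1, 3.0, 3.9, 4.4]),
    "3.0H": ([32.1, 35.4, 38.9, 41.2], [2.8, 3.6, 4.2, 4.6]),
    "4.5H": ([32.1, 35.4, 38.9, 41.2], [3.4, 4.0, 4.5, 4.7]),
}

for distance, (objective, mos) in scores_by_distance.items():
    plcc, _ = pearsonr(objective, mos)    # linearity of the relationship
    srocc, _ = spearmanr(objective, mos)  # monotonicity of the relationship
    print(f"{distance}: PLCC={plcc:.3f}, SROCC={srocc:.3f}")
```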
Citations: 7
A Distortion Propagation Oriented CU-tree Algorithm for x265
Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675426
Xinye Jiang, Zhenyu Liu, Yongbing Zhang, Xiangyang Ji
Rate-distortion optimization (RDO) is widely used in video coding to improve coding efficiency. Conventionally, RDO is applied to each block independently to avoid high computational complexity. However, various prediction techniques introduce spatio-temporal dependency between blocks, so independent RDO is not optimal. Specifically, because of motion compensation, the distortion of reference blocks affects the quality of subsequent prediction blocks, and considering this temporal dependency in RDO can improve global rate-distortion (R-D) performance. x265 leverages a lookahead module to analyze the temporal dependency between blocks and weights the quality of each block based on its reference strength. However, the original algorithm in x265 ignores the impact of quantization, and this shortcoming degrades its R-D performance. In this paper, we propose a new linear distortion propagation model to estimate the temporal dependency that accounts for the impact of quantization. From the perspective of global RDO, a corresponding adaptive quantization formula is derived. The proposed algorithm was implemented in x265 version 3.2. Experiments show that it achieves average PSNR-based and SSIM-based BD-rate reductions of 15.43% and 23.81%, outperforming the original x265 algorithm by 4.14% and 9.68%, respectively.
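For context, a simplified sketch of the baseline lookahead CU-tree propagation that the paper modifies, following the well-known x264/x265 mbtree idea. The single-reference same-index assumption, the strength constant, and the frame ordering are simplifying assumptions for illustration; the paper's contribution replaces this propagation model with a quantization-aware linear one.

```python
import math

def cutree_backward_pass(intra_cost, inter_cost, strength=2.0):
    """Simplified single-reference CU-tree over frames ordered newest-first.

    intra_cost[f][b], inter_cost[f][b]: per-block costs of frame f; each
    block is assumed to reference the same block index in the next frame.
    Returns per-block QP offsets for the oldest frame.
    """
    n_frames = len(intra_cost)
    n_blocks = len(intra_cost[0])
    propagate = [0.0] * n_blocks  # propagate-in of the currently visited frame

    # Walk from the future towards the past, accumulating how much each
    # block's distortion would propagate into frames that reference it.
    for f in range(n_frames - 1):
        nxt = [0.0] * n_blocks
        for b in range(n_blocks):
            # Fraction of this block's cost that inter prediction inherits.
            frac = max(0.0, 1.0 - inter_cost[f][b] / intra_cost[f][b])
            nxt[b] = (intra_cost[f][b] + propagate[b]) * frac
        propagate = nxt

    # Heavily referenced blocks get a negative QP offset (finer quantization).
    last = n_frames - 1
    return [-strength * math.log2((intra_cost[last][b] + propagate[b])
                                  / intra_cost[last][b])
            for b in range(n_blocks)]
```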
Citations: 2
Dictionary Learning-based Reference Picture Resampling in VVC
Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675361
J. Schneider, Christian Rohlfing
Versatile Video Coding (VVC) introduces the concept of Reference Picture Resampling (RPR), which allows the resolution of the video to change during decoding without introducing an additional Intra Random Access Point (IRAP) into the bitstream. When the resolution is increased, an upsampling operation on the reference picture is required in order to apply motion-compensated prediction. Conceptually, upsampling by linear interpolation filters fails to recover frequencies lost during downsampling. Yet the quality of the upsampled reference picture is crucial to prediction performance. In recent years, machine-learning-based Super-Resolution (SR) has been shown to outperform conventional interpolation filters by far at super-resolving a previously downsampled image. In particular, Dictionary Learning-based Super-Resolution (DLSR) was shown to improve inter-layer prediction in SHVC [1]. Thus, this paper introduces DLSR to the prediction process in RPR. Further, the approach is evaluated experimentally with an implementation based on the VTM-9.3 reference software. The simulation results show an average reduction of the instantaneous bitrate of 0.98% at the same objective quality in terms of PSNR. Moreover, a peak bitrate reduction of 4.74% is measured for the "Johnny" sequence of the JVET test set.
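A minimal sketch of the coupled-dictionary SR idea underlying DLSR, as it might be applied to reference-picture upsampling. The joint-dictionary formulation, all dimensions, and the random patches standing in for real LR/HR training pairs are illustrative assumptions, not the paper's trained model; scikit-learn provides the dictionary learning and sparse coding.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning, SparseCoder

rng = np.random.default_rng(0)

# Placeholder training data: vectorized low-res patches and the matching
# high-res patches (in practice extracted from down-/up-sampled frame pairs).
n_patches, lr_dim, hr_dim = 2000, 25, 100
P_lr = rng.standard_normal((n_patches, lr_dim))
P_hr = rng.standard_normal((n_patches, hr_dim))

# Learn a joint dictionary on concatenated (LR, HR) patch pairs so both
# halves share one sparse code per patch.
joint = np.hstack([P_lr, P_hr])
dl = MiniBatchDictionaryLearning(n_components=128, transform_algorithm="omp",
                                 random_state=0).fit(joint)
D_lr, D_hr = dl.components_[:, :lr_dim], dl.components_[:, lr_dim:]

# At decode time: sparse-code an LR reference-picture patch against the LR
# dictionary, then synthesize its HR estimate with the HR dictionary.
coder = SparseCoder(dictionary=D_lr, transform_algorithm="omp",
                    transform_n_nonzero_coefs=5)
lr_patch = rng.standard_normal((1, lr_dim))
alpha = coder.transform(lr_patch)
hr_estimate = alpha @ D_hr  # upsampled patch for motion-compensated prediction
```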
Citations: 0
An Error Self-learning Semi-supervised Method for No-reference Image Quality Assessment
Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675352
Yingjie Feng, Sumei Li, Sihan Hao
In recent years, deep learning has achieved significant progress in many areas. However, unlike research fields such as image recognition with millions of labeled samples, only several thousand labeled images are available in the image quality assessment (IQA) field, which heavily hinders the development and application of deep learning for IQA. To tackle this problem, we propose an error self-learning semi-supervised method for no-reference (NR) IQA (ESSIQA) based on deep learning. We employ an advanced full-reference (FR) IQA method to expand the databases and supervise the training of the network. In addition, the network outputs for the expanded images are used as proxy labels, replacing the errors between subjective and objective scores, to achieve error self-learning. Two error back-propagation weights are designed to reduce the impact of inaccurate outputs. The experimental results show that the proposed method yields competitive performance.
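A schematic of the generic proxy-label training step that this kind of semi-supervised IQA builds on: FR-metric scores supervise the unlabeled (expanded) images alongside human MOS on the labeled ones. The tiny network, the single weighting constant, and the data layout are assumptions for illustration; ESSIQA's actual backbone, its error self-learning formulation, and its two learned weights differ.

```python
import torch
import torch.nn as nn

class TinyIQANet(nn.Module):
    """Stand-in NR-IQA regressor; the real ESSIQA backbone is deeper."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU(),
                                      nn.AdaptiveAvgPool2d(1))
        self.head = nn.Linear(16, 1)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

net = TinyIQANet()
opt = torch.optim.Adam(net.parameters(), lr=1e-4)
mse = nn.MSELoss()

def train_step(labeled, unlabeled, w_proxy=0.5):
    """One step: human MOS supervises labeled images; an FR metric's score
    acts as a proxy label for the unlabeled (expanded) images."""
    imgs, mos = labeled              # imgs: (B,3,H,W), mos: (B,1) subjective
    u_imgs, fr_score = unlabeled     # fr_score: (B,1) from an FR-IQA method
    loss = mse(net(imgs), mos) + w_proxy * mse(net(u_imgs), fr_score)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```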
Citations: 0
Portable Congenital Glaucoma Detection System
Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675423
Chunjun Hua, Menghan Hu, Yue Wu
Congenital glaucoma is an eye disease caused by embryonic developmental disorders that damages the optic nerve. In this demo paper, we propose a portable, non-contact congenital glaucoma detection system that evaluates the condition of children's eyes by measuring corneal size with the developed mobile application. The system consists of two modules: a cornea identification module and a diagnosis module. The system can be used by anyone with a smartphone, which gives it wide applicability, and it can serve as a convenient home self-examination tool for children in large-scale screening for congenital glaucoma. A demo video of the proposed detection system is available at: https://doi.org/10.6084/m9.figshare.14728854.v1.
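A hedged sketch of the kind of cornea-localization step such an app could use, based on OpenCV's Hough circle transform. The detector parameters and the pixel-to-millimetre factor are placeholders, not the demo's actual implementation, which is not described in the abstract.

```python
import cv2
import numpy as np

def estimate_cornea_diameter(image_path, mm_per_pixel=0.05):
    """Locate the most salient circle in an eye photo and report its
    diameter; mm_per_pixel must come from a calibration step in practice."""
    gray = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2GRAY)
    gray = cv2.medianBlur(gray, 5)  # suppress specular highlights and noise
    circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.2, minDist=100,
                               param1=120, param2=40,
                               minRadius=20, maxRadius=200)
    if circles is None:
        return None
    x, y, r = np.round(circles[0, 0]).astype(int)  # strongest candidate
    return 2 * r * mm_per_pixel  # diameter in millimetres

# usage: estimate_cornea_diameter("eye.jpg")
```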
Citations: 0
VCIP 2021 Organizing Committee
Pub Date : 2021-12-05 DOI: 10.1109/vcip53242.2021.9675374
{"title":"VCIP 2021 Organizing Committee","authors":"","doi":"10.1109/vcip53242.2021.9675374","DOIUrl":"https://doi.org/10.1109/vcip53242.2021.9675374","url":null,"abstract":"","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125832555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multicomponent Secondary Transform
Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675447
M. Krishnan, Xin Zhao, Shanchun Liu
The Alliance for Open Media has recently initiated coding-tool exploration activities towards the next-generation video codec beyond AV1. In this regard, this paper explores a frequency-domain coding tool designed to leverage the cross-component correlation between collocated chroma blocks. The tool, henceforth known as the multi-component secondary transform (MCST), is implemented as a low-complexity secondary transform that takes the primary transform coefficients of multiple color components as input. The proposed tool is implemented and tested on top of libaom. Experimental results show that, compared to libaom, the proposed method achieves average overall coding-efficiency gains of 0.34% to 0.44% for the All Intra (AI) coding configuration across a wide range of video content.
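A toy illustration of the MCST data flow: low-frequency primary transform coefficients of both chroma components feed one joint secondary transform. The 2x2 low-frequency selection, the random orthonormal kernel, and the block sizes are assumptions for illustration; libaom's actual kernels would be trained offline.

```python
import numpy as np
from scipy.fft import dctn

rng = np.random.default_rng(1)

# Collocated 4x4 chroma residual blocks (placeholders for real residuals).
cb = rng.standard_normal((4, 4))
cr = rng.standard_normal((4, 4))

# Primary transform: 2-D DCT per component.
Cb, Cr = dctn(cb, norm="ortho"), dctn(cr, norm="ortho")

# Gather the low-frequency primary coefficients of BOTH components into one
# vector: this cross-component input is what distinguishes MCST from a
# single-component secondary transform.
lf = (slice(0, 2), slice(0, 2))                       # top-left 2x2 region
x = np.concatenate([Cb[lf].ravel(), Cr[lf].ravel()])  # length 8

# Secondary transform: one (here random orthonormal) 8x8 kernel applied
# jointly; a trained kernel would compact cross-component correlation.
K, _ = np.linalg.qr(rng.standard_normal((8, 8)))
y = K @ x          # coefficients that would be entropy-coded
x_rec = K.T @ y    # decoder inverts with the transpose (orthonormality)
assert np.allclose(x_rec, x)
```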
Citations: 0
Urban Planter: A Web App for Automatic Classification of Urban Plants
Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675318
Sarit Divekar, Irina Rabaev, Marina Litvak
Plant classification requires an expert because subtle differences in leaf or petal forms can distinguish different species; conversely, some species exhibit high variability in appearance. This paper introduces a web app that assists people in identifying plants and discovering the best growing methods. The uploaded picture is submitted to the back-end server, and a pre-trained neural network classifies it into one of the predefined classes. The classification label and confidence are displayed to the end user on the front-end page. The application focuses on house and garden plant species that grow mainly in a desert climate and are not covered by existing datasets. To train the model, we collected the Urban Planter dataset. The installation code of the alpha version and a demo video of the app can be found at https://github.com/UrbanPlanter/urbanplanterapp.
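A minimal sketch of the described upload-classify-respond flow using Flask and torchvision. The backbone choice, the placeholder class list, and the /classify route name are assumptions, not the Urban Planter code; in the real app the network would be fine-tuned on the Urban Planter dataset.

```python
import io

import torch
from flask import Flask, jsonify, request
from PIL import Image
from torchvision import models, transforms

app = Flask(__name__)
model = models.mobilenet_v2(weights="IMAGENET1K_V1").eval()  # stand-in backbone
CLASS_NAMES = [f"class_{i}" for i in range(1000)]            # placeholder labels

preprocess = transforms.Compose([
    transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

@app.route("/classify", methods=["POST"])
def classify():
    # Front end posts the picture; back end returns label + confidence.
    img = Image.open(io.BytesIO(request.files["image"].read())).convert("RGB")
    with torch.no_grad():
        probs = model(preprocess(img).unsqueeze(0)).softmax(dim=1)
    conf, idx = probs.max(dim=1)
    return jsonify(label=CLASS_NAMES[idx.item()],
                   confidence=round(conf.item(), 3))
```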
Citations: 0
Two-stage Parallax Correction and Multi-stage Cross-view Fusion Network Based Stereo Image Super-Resolution
Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675418
Yijian Zheng, Sumei Li
Stereo image super-resolution (SR) has achieved great progress in recent years. However, existing methods suffer from two major problems: parallax correction is insufficient, and cross-view information fusion occurs only at the beginning of the network. To address these problems, we propose a two-stage parallax-correction and multi-stage cross-view fusion network for better stereo image SR. Specifically, the two-stage parallax-correction module consists of horizontal parallax correction and refined parallax correction. The first stage corrects horizontal parallax by parallax attention. The second stage builds on deformable convolution to refine horizontal parallax and correct vertical parallax simultaneously. Then, multiple cascaded enhanced residual spatial feature transform blocks are developed to fuse cross-view information at multiple stages. Extensive experiments show that our method achieves state-of-the-art performance on the KITTI2012, KITTI2015, Middlebury and Flickr1024 datasets.
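A compact sketch of the parallax-attention step used in the first correction stage: each left-view pixel attends to right-view pixels on the same row, exploiting the horizontal epipolar constraint. The layer sizes are illustrative, and the deformable-convolution refinement of the second stage is omitted.

```python
import torch
import torch.nn as nn

class ParallaxAttention(nn.Module):
    """Match each left-view pixel to right-view pixels on the same row
    (the horizontal epipolar line) and warp right features accordingly."""
    def __init__(self, channels=64):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)

    def forward(self, feat_left, feat_right):
        b, c, h, w = feat_left.shape
        q = self.q(feat_left).permute(0, 2, 3, 1)     # (B, H, W, C)
        k = self.k(feat_right).permute(0, 2, 1, 3)    # (B, H, C, W)
        attn = torch.softmax(q @ k / c**0.5, dim=-1)  # (B, H, W, W)
        v = feat_right.permute(0, 2, 3, 1)            # (B, H, W, C)
        warped = (attn @ v).permute(0, 3, 1, 2)       # (B, C, H, W)
        return warped  # right features aligned to the left view

# usage:
# fl, fr = torch.randn(1, 64, 32, 96), torch.randn(1, 64, 32, 96)
# aligned = ParallaxAttention()(fl, fr)
```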
Citations: 0
No-Reference Stereoscopic Image Quality Assessment Based on The Visual Pathway of Human Visual System
Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675346
F. Meng, Sumei Li
With the development of stereoscopic imaging technology, stereoscopic image quality assessment (SIQA) has become increasingly important, yet designing a method consistent with human visual perception is challenging because of the complex relationship between the binocular views. In this article, we first build a convolutional neural network (CNN) based on the visual pathway of the human visual system (HVS), simulating different parts of the pathway such as the optic chiasm, the lateral geniculate nucleus (LGN), and the visual cortex. Second, the two pathways of our method simulate the 'what' and 'where' visual pathways respectively and are endowed with different feature-extraction capabilities. Finally, we find a different way to apply 3D convolution, employing it to fuse the information from the left and right views rather than merely extracting temporal features as in video. The experimental results show that our proposed method agrees better with subjective scores and generalizes well.
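A small sketch of the 3D-convolution fusion idea (the channel counts and shapes are assumptions): the left- and right-view feature maps are stacked along a new depth axis so the 3-D kernel convolves across views instead of across time.

```python
import torch
import torch.nn as nn

# Per-view feature maps from the two pathways (placeholder shapes).
feat_left = torch.randn(1, 32, 48, 48)   # (B, C, H, W)
feat_right = torch.randn(1, 32, 48, 48)

# Stack views along a depth dimension: (B, C, D=2, H, W).
stacked = torch.stack([feat_left, feat_right], dim=2)

# A 3-D kernel spanning the full view axis fuses binocular information in
# one step, rather than extracting temporal features as in video tasks.
fuse = nn.Conv3d(in_channels=32, out_channels=32, kernel_size=(2, 3, 3),
                 padding=(0, 1, 1))
fused = fuse(stacked).squeeze(2)         # back to (B, C, H, W)
print(fused.shape)  # torch.Size([1, 32, 48, 48])
```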
Citations: 0