
Latest publications: 2022 Eleventh International Conference on Image Processing Theory, Tools and Applications (IPTA)

Using fractal interpolation over complex network modeling of deep texture representation
J. Florindo, O. Bruno
Convolutional neural networks have been a fundamental model in computer vision in recent years. Nevertheless, specifically in the analysis of texture images, using such a model as a feature extractor, rather than training it from scratch or extensively fine-tuning it, has proven more effective. In this scenario, such deep features can also benefit from further advanced analysis that provides a more meaningful representation than the direct use of feature maps. A successful example of such a procedure is the recent use of visibility graphs to analyze deep features in texture recognition. It has been found that models based on complex networks can quantify properties such as periodicity, randomness and chaoticity, all of which have demonstrated usefulness in texture classification. Inspired by this context, here we propose an alternative complex-network-based modeling to leverage the effectiveness of deep texture features. More specifically, we employ recurrence matrices of the neural activations at the penultimate layer. Moreover, the importance of complexity attributes, such as chaoticity and fractality, also leads us to associate the complex networks with a fractal technique. More precisely, we complement the complex network representation by applying fractal interpolation over the degree distribution of the recurrence matrix. The final descriptors are employed for texture classification and the results are compared, in terms of accuracy, with classical and state-of-the-art approaches. The achieved results are competitive and pave the way for future analysis of how such complexity measures can be useful in deep learning-based texture recognition.
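The core construction described above (a recurrence matrix over penultimate-layer activations, read as a complex-network adjacency whose degree distribution is then further processed) can be illustrated with a minimal NumPy sketch; the threshold `eps` and the toy activation vector are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def recurrence_matrix(x, eps):
    """Binary recurrence matrix: R[i, j] = 1 iff |x[i] - x[j]| < eps."""
    d = np.abs(x[:, None] - x[None, :])
    return (d < eps).astype(int)

def node_degrees(R):
    """Treat R as the adjacency matrix of an undirected graph
    (dropping self-recurrences) and return each node's degree."""
    A = R - np.eye(len(R), dtype=int)
    return A.sum(axis=1)

# Toy "activations" standing in for penultimate-layer features.
x = np.array([0.1, 0.12, 0.5, 0.52, 0.9])
R = recurrence_matrix(x, eps=0.05)
deg = node_degrees(R)
print(deg)  # two close pairs, one isolated node
```

The degree sequence obtained this way is what the fractal interpolation would then be applied to.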
DOI: 10.1109/IPTA54936.2022.9784138 · Published: 2022-04-19
Citations: 0
Impact of Pooling Methods on Image Quality Metrics
David Norman Díaz Estrada, Marius Pedersen
Image quality assessment using objective metrics is becoming more widespread, and an impressive number of image quality metrics have been proposed in the literature. An aspect that has received little attention compared to the design of these metrics is pooling. In pooling, the quality values, usually one per pixel, are reduced to fewer numbers, typically a single value that represents overall quality. In this paper we investigate the impact of different pooling techniques on the performance of image quality metrics. We have tested different pooling methods with the SSIM and S-CIELAB image quality metrics on the CID:IQ database, and found that the pooling technique has a significant impact on their performance.
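The pooling step can be illustrated with a few common strategies applied to a toy per-pixel quality map; the specific poolings shown (mean, lowest-percentile, Minkowski) are common choices in the literature, not necessarily the exact set tested in the paper:

```python
import numpy as np

def mean_pool(q):
    """Simple average of the per-pixel quality values."""
    return q.mean()

def percentile_pool(q, p=5):
    """Emphasize the worst regions: mean of the lowest p% of values."""
    cut = np.percentile(q, p)
    return q[q <= cut].mean()

def minkowski_pool(q, r=2):
    """Minkowski pooling: larger r weights strong distortions more."""
    return (np.mean(q ** r)) ** (1.0 / r)

# Toy per-pixel quality map in [0, 1] (e.g. an SSIM map) with one
# badly distorted pixel.
q = np.array([[1.0, 0.9],
              [0.8, 0.1]])
print(mean_pool(q), percentile_pool(q), minkowski_pool(q))
```

The three scores differ noticeably for the same map, which is the point the paper quantifies: the pooled number depends strongly on the pooling rule, not only on the underlying metric.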
DOI: 10.1109/IPTA54936.2022.9784142 · Published: 2022-04-19
Citations: 0
Analysis of automatically generated embedding guides for cell classification
Philipp Gräbel, Julian Thull, M. Crysandt, B. Klinkhammer, P. Boor, T. Brümmendorf, D. Merhof
Automated cell classification in human bone marrow microscopy images could lead to faster acquisition and, therefore, to a considerably larger number of cells for the statistical cell count analysis. As a basis for the diagnosis of hematopoietic diseases such as leukemia, this would be a significant improvement of clinical workflows. The classification of such cells, however, is challenging, partially due to dependencies between different cell types. In 2021, guided representation learning was introduced as an approach to include this domain knowledge in training by providing “embedding guides” as an optimization target for individual cell types. In this work, we propose improvements to guided representation learning by automatically generating guides based on graph optimization algorithms. We incorporate information about the visual similarity of cell types and the diagnostic impact of misclassifications. We show that this reduces critical false predictions and improves the overall classification F-score by up to 2.5 percentage points.
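The notion of an "embedding guide" as a per-class optimization target can be sketched as a loss pulling each sample's embedding toward the guide vector assigned to its class; the orthogonal guide vectors and the cosine-distance loss below are illustrative assumptions (the paper derives its guides via graph optimization, which is not reproduced here):

```python
import numpy as np

def guide_loss(embeddings, labels, guides):
    """Mean cosine distance (1 - cosine similarity) between each
    embedding and the guide vector of its class."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    g = guides / np.linalg.norm(guides, axis=1, keepdims=True)
    cos = np.sum(e * g[labels], axis=1)
    return float(np.mean(1.0 - cos))

# Two hypothetical cell classes with orthogonal unit guides.
guides = np.array([[1.0, 0.0],
                   [0.0, 1.0]])
emb = np.array([[0.9, 0.1],   # close to guide 0
                [0.1, 0.9]])  # close to guide 1
labels = np.array([0, 1])
print(guide_loss(emb, labels, guides))
```

Minimizing such a term during training pushes the classes apart in embedding space according to the chosen guide geometry, which is where the similarity and misclassification-cost information can be encoded.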
DOI: 10.1109/IPTA54936.2022.9784119 · Published: 2022-04-19
Citations: 0
Comparison of GWO-SVM and Random Forest Classifiers in a LevelSet based approach for Bladder wall segmentation and characterisation using MR images
Rania Trigui, M. Adel, M. D. Bisceglie, J. Wojak, Jessica Pinol, Alice Faure, K. Chaumoitre
In order to characterize the bladder's state and functioning, its wall must be successfully segmented in MR images. In this context, we propose a computer-aided diagnosis system based on segmentation and classification applied to the bladder wall (BW), as part of a spina bifida study. The proposed system starts with BW extraction using an improved level-set-based algorithm. Then an optimized classification is proposed using selected features. The obtained results prove the efficiency of the proposed system, which can be significantly helpful for radiologists, avoiding tedious manual segmentation and providing a precise idea of spina bifida severity.
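GWO-SVM couples a Grey Wolf Optimizer with an SVM, typically using GWO to search the SVM hyperparameter space. A minimal, self-contained sketch of the GWO search itself, run on a toy objective standing in for a cross-validation error (the swarm size, iteration count, bounds and objective are all illustrative assumptions):

```python
import numpy as np

def gwo(objective, dim, bounds, n_wolves=10, n_iter=50, seed=0):
    """Minimal Grey Wolf Optimizer: wolves move toward the three
    best solutions found so far (alpha, beta, delta)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    X = rng.uniform(lo, hi, size=(n_wolves, dim))
    for t in range(n_iter):
        fitness = np.array([objective(x) for x in X])
        order = np.argsort(fitness)
        alpha, beta, delta = X[order[:3]]   # copies of the 3 leaders
        a = 2.0 * (1 - t / n_iter)          # exploration -> exploitation
        for i in range(n_wolves):
            moves = []
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2 * a * r1 - a, 2 * r2
                moves.append(leader - A * np.abs(C * leader - X[i]))
            X[i] = np.clip(np.mean(moves, axis=0), lo, hi)
    fitness = np.array([objective(x) for x in X])
    return X[np.argmin(fitness)]

# Toy objective standing in for SVM cross-validation error over two
# hyperparameters; its minimum is at (0.5, 0.5).
best = gwo(lambda x: np.sum((x - 0.5) ** 2), dim=2, bounds=(0.0, 1.0))
print(best)
```

In the GWO-SVM setting, `objective` would evaluate an SVM (e.g. its validation error for a given C and kernel width) instead of the toy function.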
DOI: 10.1109/IPTA54936.2022.9784127 · Published: 2022-04-19
Citations: 0
Special Session 1: Biological & Medical Image Analysis
DOI: 10.1109/ipta54936.2022.9784132 · Published: 2022-04-19
Citations: 0
ARIN: Adaptive Resampling and Instance Normalization for Robust Blind Inpainting of Dunhuang Cave Paintings
Alexander Schmidt, Prathmesh Madhu, A. Maier, V. Christlein, Ronak Kosti
Image enhancement algorithms are very useful for real-world computer vision tasks, where image resolution is often physically limited by the sensor size. While state-of-the-art deep neural networks show impressive results for image enhancement, they often struggle to enhance real-world images. In this work, we tackle a real-world setting: inpainting of images from the Dunhuang caves. The Dunhuang dataset consists of murals, half of which suffer from corrosion and aging. These murals feature a range of rich content, such as Buddha statues, bodhisattvas, sponsors, architecture, dance, music, and decorative patterns designed by different artists spanning ten centuries, which makes manual restoration challenging. We modify two different existing methods (CAR, HINet) that are based upon state-of-the-art (SOTA) super-resolution and deblurring networks. We show that these can successfully inpaint and enhance the deteriorated cave paintings. We further show that a novel combination of CAR and HINet, resulting in our proposed inpainting network (ARIN), is very robust to external noise, especially Gaussian noise. To this end, we present a quantitative and qualitative comparison of our proposed approach with existing SOTA networks and winners of the Dunhuang challenge. One of the proposed methods (HINet) represents the new state of the art and outperforms the 1st place of the Dunhuang Challenge, while our combination ARIN, which is robust to noise, is comparable to the 1st place. We also present and discuss qualitative results showing the impact of our inpainting method on Dunhuang cave images.
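Robustness evaluations of this kind corrupt the input with Gaussian noise and measure reconstruction quality; a minimal sketch of the corruption step and a PSNR measure (the flat test patch and the noise level are illustrative, not the paper's protocol):

```python
import numpy as np

def add_gaussian_noise(img, sigma, seed=0):
    """Corrupt an image with values in [0, 1] by additive Gaussian noise."""
    rng = np.random.default_rng(seed)
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def psnr(clean, noisy):
    """Peak signal-to-noise ratio in dB for images in [0, 1]."""
    mse = np.mean((clean - noisy) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(1.0 / mse)

img = np.full((32, 32), 0.5)          # flat toy patch standing in for a mural
noisy = add_gaussian_noise(img, sigma=0.1)
print(psnr(img, noisy))               # around 20 dB for sigma = 0.1
```

A noise-robust inpainting network is one whose output quality degrades slowly as `sigma` grows on such corrupted inputs.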
DOI: 10.1109/IPTA54936.2022.9784144 · Published: 2022-04-19
Citations: 1
AAEGAN Optimization by Purposeful Noise Injection for the Generation of Bright-Field Brain Organoid Images
C. B. Martin, Camille Simon Chane, C. Clouchoux, A. Histace
Brain organoids are three-dimensional tissues generated in vitro from pluripotent stem cells, replicating the early development of the human brain. To implement, test and compare methods that follow their growth in microscopic images, developing automated machine learning solutions requires a large dataset with a trusted ground truth, which is not always available. Recently, optimized generative adversarial networks have proven to generate only similar object content, but not a background specific to the real acquisition modality. In this work, a small database of brain organoid bright-field images, characterized by a shot-noise background, is extended using the already validated AAEGAN architecture, with specific noise or a noise mixture injected into the generator. We hypothesize that this noise injection could help generate a homogeneous and similar bright-field background. To validate or invalidate our generated images, we use metric calculations and a dimensionality reduction on features of the original and generated images. Our results suggest that noise injection can modulate the generated image backgrounds in order to produce content closer to the microscopic reality. A validation of these images by biological experts could augment the original dataset and allow their analysis by deep learning-based solutions.
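Shot noise, which characterizes the bright-field backgrounds here, is signal-dependent Poisson noise rather than additive Gaussian noise. A minimal sketch of injecting it into a synthetic homogeneous background (the photon-count scale is an illustrative assumption):

```python
import numpy as np

def add_shot_noise(img, photons=100, seed=0):
    """Signal-dependent (Poisson) shot noise: each pixel value in [0, 1]
    is treated as an expected photon count and resampled."""
    rng = np.random.default_rng(seed)
    counts = rng.poisson(img * photons)
    return np.clip(counts / photons, 0.0, 1.0)

background = np.full((64, 64), 0.8)   # bright, homogeneous toy field
noisy_bg = add_shot_noise(background)
print(noisy_bg.mean(), noisy_bg.std())
```

Unlike Gaussian noise, the variance here scales with the signal (brighter regions are noisier in absolute terms), which is why a generator fed generic noise does not reproduce this background texture by default.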
DOI: 10.1109/IPTA54936.2022.9784149 · Published: 2022-04-19
Citations: 2
One-Shot Object Detection in Heterogeneous Artwork Datasets
Prathmesh Madhu, Anna Meyer, Mathias Zinnen, Lara Mührenberg, Dirk Suckow, Torsten Bendschus, Corinna Reinhardt, Peter Bell, Ute Verstegen, Ronak Kosti, A. Maier, V. Christlein
Christian archeologists face many challenges in understanding visual narration through artwork images. This understanding is essential to access underlying semantic information. Therefore, narrative elements (objects) need to be labeled, compared, and contextualized by experts, which takes an enormous amount of time and effort. Our work aims to reduce labeling costs by using one-shot object detection to generate a labeled database from unannotated images. Novel object categories can be defined broadly and annotated using visual examples of narrative elements, without training exclusively for such objects. In this work, we propose two ways of using contextual information as data augmentation to improve detection performance. Furthermore, we introduce a multi-relation detector to our framework, which extracts global, local, and patch-based relations of the image. Additionally, we evaluate the use of contrastive learning. We use data from Christian archeology (CHA) and art history, IconArt-v2 (IA). Our context encoding approach improves the typical fine-tuning approach in terms of mean average precision (mAP) by about 3.5 % (4 %) at 0.25 intersection over union (IoU) for UnSeen categories, and 6 % (1.5 %) for Seen categories in CHA (IA). To the best of our knowledge, our work is the first to explore few-shot object detection on heterogeneous artistic data by investigating evaluation methods and data augmentation strategies. We will release the code and models upon acceptance of the work.
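The mAP figures above count a predicted box as correct when its overlap with a ground-truth box reaches the IoU threshold (0.25 here). A minimal sketch of the IoU computation for axis-aligned boxes given as (x1, y1, x2, y2); the example boxes are illustrative:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

pred = (0, 0, 10, 10)
gt = (5, 0, 15, 10)
print(iou(pred, gt))   # 50 / 150 = 1/3, a match at IoU >= 0.25
```

A low threshold such as 0.25 is forgiving of loose localization, which suits artwork datasets where object boundaries are often ambiguous.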
DOI: 10.1109/IPTA54936.2022.9784141 · Published: 2022-04-19
Citations: 1
Copyright
DOI: 10.1109/ipta54936.2022.9784115 · Published: 2022-04-19
Citations: 0
Correction of Secret Images Reconstructed from Noised Shared Images
Laura Bertojo, W. Puech
Protecting sensitive images is nowadays a key issue in the information security domain. As such, numerous techniques have emerged to securely transmit or store such multimedia data, such as encryption, steganography or secret sharing. Most of today's secret image sharing methods rely on the polynomial-based scheme proposed by Shamir. However, some of the shared images distributed to the participants may be noised between their creation and their use to retrieve the secret image. Noise can be added to a shared image during transmission, storage or JPEG compression, for example. However, to our knowledge, no analysis has been made to date of the impact of using a noised shared image in the reconstruction of a secret image. In this paper, we propose a method to correct errors during the reconstruction of a secret image from noised shared images.
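Shamir's polynomial-based scheme, on which these sharing methods build, can be sketched for a single 8-bit pixel value: the secret is the constant term of a random degree-(k-1) polynomial over a prime field, and any k shares recover it by Lagrange interpolation at x = 0. The prime 251 is the common choice for image sharing; the handling of pixel values above 250 is a scheme-specific detail omitted here:

```python
import random

P = 251  # prime field commonly used for 8-bit pixel values

def make_shares(secret, k, n, seed=None):
    """Split one pixel value (< P) into n shares, any k of which
    reconstruct it: shares are points on a random degree-(k-1) poly."""
    rng = random.Random(seed)
    coeffs = [secret] + [rng.randrange(P) for _ in range(k - 1)]
    def poly(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, poly(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over GF(P)."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num = den = 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * -xm) % P
                den = (den * (xj - xm)) % P
        secret = (secret + yj * num * pow(den, P - 2, P)) % P
    return secret

shares = make_shares(137, k=3, n=5, seed=42)
print(reconstruct(shares[:3]))   # any 3 of the 5 shares recover 137
```

A single flipped bit in one share's y-value makes the interpolated constant term wrong, which is exactly the failure mode the paper's correction method addresses.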
DOI: 10.1109/IPTA54936.2022.9784125 · Published: 2022-04-19
Citations: 0