
Latest publications: Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing

Shape Representations for Maya Codical Glyphs: Knowledge-driven or Deep?
G. Can, J. Odobez, D. Gática-Pérez
This paper investigates two types of shape representations for individual Maya codical glyphs: traditional bag-of-words built on knowledge-driven local shape descriptors (HOOSC), and Convolutional Neural Network (CNN) based representations learned from data. For the CNN representations, first, we evaluate the activations of typical CNNs that are pretrained on large-scale image datasets; second, we train a CNN from scratch with all the available individual segments. One of the main challenges in training CNNs is the limited amount of available data and the resulting class imbalance. Here, we attempt to address the imbalance by introducing class weights into the loss computation during training. Another possibility is oversampling the minority-class samples during batch selection. We show that the deep representations outperform the knowledge-driven ones, but that CNN training requires special care for small-scale, unbalanced data, which is usually the case in the cultural heritage domain.
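The paper does not give code for its class-weighting scheme; below is a minimal hypothetical sketch of the general idea it names: scaling each sample's cross-entropy loss by a per-class weight so minority-class errors count for more. The inverse-frequency weights and all numbers are illustrative assumptions, not values from the paper.

```python
import numpy as np

def class_weighted_ce(probs, labels, class_weights):
    """Cross-entropy where each sample's loss is scaled by the weight
    of its true class, counteracting class imbalance."""
    n = len(labels)
    per_sample = -np.log(probs[np.arange(n), labels])
    w = class_weights[labels]
    return float(np.sum(w * per_sample) / np.sum(w))

# Hypothetical 4-sample batch; class 1 is the minority.
labels = np.array([0, 0, 0, 1])
counts = np.bincount(labels)
class_weights = counts.sum() / (len(counts) * counts)  # inverse frequency
probs = np.array([[0.9, 0.1],
                  [0.8, 0.2],
                  [0.7, 0.3],
                  [0.4, 0.6]])  # predicted class probabilities
weighted = class_weighted_ce(probs, labels, class_weights)
plain = float(np.mean(-np.log(probs[np.arange(4), labels])))
```

Because the single minority sample is the worst-predicted one here, the weighted loss exceeds the unweighted mean, which is exactly the pressure on the optimizer that class weighting is meant to create.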
DOI: 10.1145/3095713.3095746 (published 2017-06-19)
Citations: 3
JORD: A System for Collecting Information and Monitoring Natural Disasters by Linking Social Media with Satellite Imagery
Kashif Ahmad, M. Riegler, Konstantin Pogorelov, N. Conci, P. Halvorsen, F. D. Natale
Gathering information and continuously monitoring the affected areas after a natural disaster can be crucial to assess the damage and speed up the recovery process. Satellite imagery is considered one of the most productive sources for monitoring the aftermath of a natural disaster; however, it also comes with many challenges and limitations due to its slow update rate. It would be beneficial to link remote-sensed data with social media for damage assessment and for obtaining detailed information about a disaster. The additional information obtainable from social media can enrich remote-sensed data and overcome its limitations. To tackle this, we present a system called JORD that autonomously collects social media data about natural disasters and links it automatically to remote-sensed data. In addition, we demonstrate that queries in local languages relevant to the exact position of a natural disaster retrieve more accurate information about the event. We also provide content-based analysis along with temporal and geo-location based filtering to provide more accurate information to the users. To show the capabilities of the system, we demonstrate that a large number of disaster events can be detected by it. In addition, we use crowdsourcing to assess the quality of the provided disaster information and the usefulness of JORD from a potential user's point of view.
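JORD's actual pipeline is not specified in code in the abstract; the following is only a rough sketch of the temporal and geo-location filtering idea it mentions, keeping posts that fall within an assumed radius and time window of a known event. The post format, radius, window, and coordinates are all hypothetical.

```python
from datetime import datetime, timedelta
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in km."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def filter_posts(posts, event_latlon, event_time, radius_km=50.0, window_h=24):
    """Keep posts close to the event both in space and in time."""
    lat0, lon0 = event_latlon
    window = timedelta(hours=window_h)
    return [p for p in posts
            if haversine_km(p["lat"], p["lon"], lat0, lon0) <= radius_km
            and abs(p["time"] - event_time) <= window]

event_t = datetime(2017, 6, 19, 12, 0)
posts = [
    {"lat": 45.01, "lon": 7.01, "time": event_t + timedelta(hours=2)},  # keep
    {"lat": 46.00, "lon": 7.00, "time": event_t + timedelta(hours=2)},  # ~111 km away
    {"lat": 45.01, "lon": 7.01, "time": event_t + timedelta(days=3)},   # too late
]
kept = filter_posts(posts, (45.0, 7.0), event_t)
```

Only the first post survives both filters; in a real system the surviving posts would then be linked to satellite imagery of the same region.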
DOI: 10.1145/3095713.3095726 (published 2017-06-19)
Citations: 23
Selection and Combination of Unsupervised Learning Methods for Image Retrieval
Lucas Pascotti Valem, D. C. G. Pedronette
The evolution of technologies to store and share images has made methods to index and retrieve multimedia information based on visual content imperative. CBIR (Content-Based Image Retrieval) systems are the main solution in this scenario. Originally, these systems were based solely on low-level visual features, but they have evolved over the years to incorporate various supervised learning techniques. More recently, unsupervised learning methods have shown promising results for improving the effectiveness of retrieval. However, given the development of many different methods, a challenging task consists in exploiting the advantages of these diverse approaches. As different methods produce distinct results even for the same dataset and set of features, a promising approach is to combine them. In this work, a framework is proposed for selecting the best combination of methods in a given scenario, using different strategies based on effectiveness and correlation measures. For the experimental evaluation, six distinct unsupervised learning methods and two different datasets were used. The results as a whole are promising and also reveal good perspectives for future work.
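The framework's concrete selection strategies are not given in the abstract; one simple, hypothetical instantiation of "correlation-based selection plus combination" is sketched below: pick the least-correlated (most complementary) pair of rank lists, then fuse them by Borda count. The three toy "methods" and the use of Spearman correlation and Borda fusion are my assumptions, not the paper's.

```python
import numpy as np
from itertools import combinations

def rank_correlation(r1, r2):
    """Spearman correlation between two rankings (positions of items)."""
    r1, r2 = np.asarray(r1, float), np.asarray(r2, float)
    r1, r2 = r1 - r1.mean(), r2 - r2.mean()
    return float((r1 @ r2) / (np.linalg.norm(r1) * np.linalg.norm(r2)))

def borda_fuse(rankings):
    """Combine rankings by summing item positions (lower = better)."""
    scores = np.sum(rankings, axis=0)
    return np.argsort(scores)

# Hypothetical methods: m[name][i] = position of item i in that ranking.
m = {"A": [0, 1, 2, 3, 4],
     "B": [0, 2, 1, 3, 4],
     "C": [1, 0, 3, 2, 4]}

# Select the least-correlated pair of methods, then fuse their rankings.
pair = min(combinations(m, 2),
           key=lambda p: abs(rank_correlation(m[p[0]], m[p[1]])))
fused = borda_fuse([m[pair[0]], m[pair[1]]])
```

Here "B" and "C" correlate least (0.5 vs. 0.8 and 0.9 for the other pairs), so they are selected; in the paper's framework, effectiveness estimates would additionally gate which methods are eligible.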
DOI: 10.1145/3095713.3095741 (published 2017-06-19)
Citations: 5
Speaker Clustering Based on Non-Negative Matrix Factorization Using Gaussian Mixture Model in Complementary Subspace
M. Nishida, Seiichi Yamamoto
Speech feature variations are mainly attributed to variations in the phonetic and speaker information included in speech data. If these two types of information are separated from each other, more robust speaker clustering can be achieved. A principal component analysis transformation can separate speaker information from phonetic information, under the assumption that a space with large within-speaker variance is a "phonetic subspace" and a space with small within-speaker variance is a "speaker subspace". We propose a speaker clustering method based on non-negative matrix factorization using a Gaussian mixture model trained in the speaker subspace. We carried out comparative experiments of the proposed method against conventional methods based on the Bayesian information criterion and a Gaussian mixture model in the observation space. The experimental results show that the proposed method achieves higher clustering accuracy than the conventional methods.
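To make the NMF clustering component concrete, here is a generic sketch: plain multiplicative-update NMF applied to a toy non-negative "utterance x feature" matrix, with the cluster of each utterance read off as its dominant basis component. This illustrates only the NMF step, not the paper's GMM or subspace construction; the matrix, rank, and iteration count are all hypothetical.

```python
import numpy as np

def nmf(X, k, iters=500, seed=0):
    """Plain multiplicative-update NMF: non-negative X ~ W @ H."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.uniform(0.1, 1.0, (n, k))
    H = rng.uniform(0.1, 1.0, (k, m))
    eps = 1e-9
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # Lee-Seung update for H
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # Lee-Seung update for W
    return W, H

# Toy matrix: utterances 0-1 share one "speaker" pattern, 2-3 another.
X = np.array([[5.0, 5.0, 0.0, 0.0],
              [2.5, 2.5, 0.0, 0.0],
              [0.0, 0.0, 4.0, 2.0],
              [0.0, 0.0, 2.0, 1.0]])
W, H = nmf(X, k=2)
labels = W.argmax(axis=1)   # cluster = dominant basis per utterance
```

Because X is exactly rank 2 with block structure, the two learned components specialize to the two groups, and the argmax over W recovers the grouping.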
DOI: 10.1145/3095713.3095721 (published 2017-06-19)
Citations: 2
Searching and annotating 100M Images with YFCC100M-HNfc6 and MI-File
Giuseppe Amato, F. Falchi, C. Gennaro, F. Rabitti
We present an image search engine that allows similarity search over the roughly 100M images included in the YFCC100M dataset, and that annotates query images. Image similarity search is performed using YFCC100M-HNfc6, the set of deep features we extracted from the YFCC100M dataset, which was indexed using the MI-File index for efficient similarity searching. A metadata cleaning algorithm, which uses visual and textual analysis, was used to select from the YFCC100M dataset a relevant subset of images and associated annotations, creating a training set for automatic textual annotation of submitted queries. The on-line image search and annotation system demonstrates the effectiveness of the deep features for assessing conceptual similarity among images, the effectiveness of the metadata cleaning algorithm for identifying a relevant training set for annotation, and the efficiency and accuracy of the MI-File similarity index techniques for searching and annotating over a dataset of 100M images with very limited computing resources.
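The annotation-by-retrieval idea described above can be sketched in a few lines: label a query by voting over the tags of its visually nearest training images. This is a generic k-NN tag-voting sketch under assumed toy features and tags, not the paper's MI-File-backed implementation.

```python
import numpy as np
from collections import Counter

def annotate(query_feat, index_feats, index_tags, k=3, top=2):
    """Label a query by voting over the tags of its k visually
    nearest training images (cosine similarity on feature vectors)."""
    q = query_feat / np.linalg.norm(query_feat)
    F = index_feats / np.linalg.norm(index_feats, axis=1, keepdims=True)
    nn = np.argsort(-(F @ q))[:k]                 # k most similar images
    votes = Counter(tag for i in nn for tag in index_tags[i])
    return [t for t, _ in votes.most_common(top)]

# Hypothetical 3-d "deep features" and tags for four indexed images.
feats = np.array([[1.0, 0.0, 0.0],
                  [0.9, 0.1, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
tags = [["beach", "sea"], ["beach"], ["forest"], ["city"]]
result = annotate(np.array([1.0, 0.05, 0.0]), feats, tags)
```

The query is closest to the two "beach" images, so "beach" wins the vote; at the paper's scale the exhaustive similarity computation would be replaced by the MI-File approximate index.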
DOI: 10.1145/3095713.3095740 (published 2017-06-19)
Citations: 16
On Reflection Symmetry In Natural Images
Alessandro Gnutti, Fabrizio Guerrini, R. Leonardi
Many new symmetry detection algorithms have been developed recently, thanks to a revival of interest in computational symmetry for computer graphics and computer vision applications. Notably, in 2013 the IEEE CVPR conference organized a dedicated workshop and an accompanying symmetry detection competition. In this paper we propose an approach to symmetric object detection that is based both on the computation of a symmetry measure for each pixel and on saliency. The symmetry value is obtained as the energy balance of the even-odd decomposition of a patch w.r.t. each possible axis. The candidate symmetry axes are then identified by localizing peaks along the direction perpendicular to each considered axis orientation. The candidate axes are finally evaluated through a confidence measure that also allows removing redundant detected symmetries. The results obtained within the framework adopted by the aforementioned competition show a significant performance improvement.
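The even-odd energy balance mentioned in the abstract has a simple form for a fixed vertical axis: split the patch into even and odd parts about the axis and take the even share of the total energy. The sketch below fixes the axis at the patch's central column for clarity; the paper evaluates every candidate axis, and the toy patches are illustrative only.

```python
import numpy as np

def symmetry_energy_ratio(patch):
    """Share of a patch's energy in the even (mirror-symmetric) part
    of its decomposition about the central vertical axis; a perfectly
    mirror-symmetric patch scores ~1."""
    flipped = patch[:, ::-1]
    even = (patch + flipped) / 2.0
    odd = (patch - flipped) / 2.0
    e_even, e_odd = np.sum(even ** 2), np.sum(odd ** 2)
    return e_even / (e_even + e_odd + 1e-12)

sym_score = symmetry_energy_ratio(np.array([[1.0, 2.0, 1.0],
                                            [3.0, 5.0, 3.0]]))  # mirror-symmetric
asym_score = symmetry_energy_ratio(np.array([[1.0, 2.0, 9.0],
                                             [3.0, 5.0, 0.0]]))
```

Repeating this for every orientation and axis position, and keeping peaks of the ratio, yields the candidate-axis stage the abstract describes.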
DOI: 10.1145/3095713.3095743 (published 2017-06-19)
Citations: 3
Building a Disclosed Lifelog Dataset: Challenges, Principles and Processes
Duc-Tien Dang-Nguyen, Liting Zhou, Rashmi Gupta, M. Riegler, C. Gurrin
In this paper, we address the challenge of how to build a disclosed lifelog dataset by proposing principles for building and sharing such data. Based on the proposed principles, we describe the processes by which we built the benchmarking lifelog dataset for the NTCIR-13 Lifelog 2 tasks. Further, a list of potential applications and a framework for anonymisation are proposed and discussed.
DOI: 10.1145/3095713.3095736 (published 2017-06-19)
Citations: 23
Outdoor Scene Labeling Using ALE and LSC Superpixels
Rabia Tahir, Sheikh Ziauddin, A. R. Shahid, A. Safi
Scene labeling has been an important and popular area of computer vision and image processing for the past few years. It is the process of assigning the pixels of an image to specific predefined categories. A number of techniques have been proposed for scene labeling, but all have limitations regarding accuracy and computational time. Some methods incorporate only the local context of images and ignore the global information about objects in an image, which lowers their labeling accuracy; addressing this is necessary to improve labeling results. In this paper, we perform outdoor scene labeling using the Automatic Labeling Environment (ALE). We enhance this framework by incorporating bilateral-filter-based preprocessing, LSC superpixels, and a large co-occurrence weight. Experiments on the publicly available MSRC v1 dataset showed promising results, with 89.44% pixel-wise accuracy and 78.02% class-wise accuracy.
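The "co-occurrence weight" mentioned above refers to a global prior that penalizes label pairs rarely seen together in training images. A generic, hypothetical sketch of such a term is shown below; the label set, training annotations, and cost form are all my own illustrative assumptions, not ALE's actual formulation.

```python
import numpy as np

# Hypothetical label set and per-image training annotations.
labels = ["sky", "grass", "cow", "boat"]
train_images = [["sky", "grass", "cow"], ["sky", "grass"], ["sky", "boat"]]

idx = {l: i for i, l in enumerate(labels)}
co = np.zeros((len(labels), len(labels)))
for img in train_images:
    for a in img:
        for b in img:
            co[idx[a], idx[b]] += 1   # count joint appearances

def cooccurrence_cost(a, b, weight=1.0):
    """Pairwise cost discouraging label pairs rarely seen together;
    a larger `weight` strengthens this global prior."""
    p = co[idx[a], idx[b]] / co.max()
    return weight * (1.0 - p)
```

Under these toy counts, "cow" next to "boat" (never co-occurring) is penalized more than "sky" next to "grass"; in a full labeling energy this cost would be added to the per-pixel and superpixel terms.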
DOI: 10.1145/3095713.3095739 (published 2017-06-19)
Citations: 0
Dimensionality Reduction for Image Features using Deep Learning and Autoencoders
Stefan Petscharnig, M. Lux, S. Chatzichristofis
The field of similarity-based image retrieval has experienced a game changer lately: hand-crafted image features have been vastly outperformed by machine learning based approaches. Deep learning methods are very good at finding optimal features for a domain, given that enough data is available to learn from. However, hand-crafted features are still a means to an end in domains where the data is not freely available, e.g. because of privacy or commercial concerns, or where it cannot be transmitted, e.g. due to bandwidth limitations. Moreover, we have to rely on hand-crafted methods whenever neural networks cannot be trained effectively, e.g. when there is not enough training data. In this paper, we investigate a particular approach that combines hand-crafted features and deep learning to (i) achieve early fusion of off-the-shelf handcrafted global image features and (ii) reduce the overall number of dimensions, combining both worlds. This method allows for fast image retrieval in domains where training data is sparse.
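The core mechanism of autoencoder-based dimensionality reduction can be shown with the simplest possible case: a linear autoencoder trained by gradient descent to compress 64-d "feature" vectors to 4 dimensions. All sizes, data, and hyperparameters here are illustrative assumptions; the paper's autoencoders are deeper and operate on concatenated handcrafted features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "hand-crafted feature" vectors: 64-d points near a 4-d subspace.
basis = rng.normal(size=(4, 64)) / 2.0
X = rng.normal(size=(200, 4)) @ basis + 0.01 * rng.normal(size=(200, 64))

d_in, d_hid = 64, 4                           # compress 64 dims down to 4
W_enc = 0.1 * rng.normal(size=(d_in, d_hid))  # encoder weights
W_dec = 0.1 * rng.normal(size=(d_hid, d_in))  # decoder weights

def mse(A, B):
    return float(np.mean((A - B) ** 2))

initial = mse(X @ W_enc @ W_dec, X)
lr = 0.01
for _ in range(300):
    H = X @ W_enc                     # encode
    R = H @ W_dec                     # decode (reconstruction)
    G = 2.0 * (R - X) / X.shape[0]    # gradient of the loss w.r.t. R
    g_dec = H.T @ G
    g_enc = X.T @ (G @ W_dec.T)
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc
final = mse(X @ W_enc @ W_dec, X)
code = X @ W_enc                      # compact 4-d representation for indexing
```

Training drives the reconstruction error down, and the 4-d `code` is what a retrieval index would store; nonlinear activations and extra layers generalize the same recipe.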
DOI: 10.1145/3095713.3095737 (published 2017-06-19)
Citations: 27
Lisbon Landmark Lenslet Light Field Dataset: Description and Retrieval Performance
J. A. Teixeira, Catarina Brites, F. Pereira, J. Ascenso
Popular local feature extraction schemes, such as SIFT, are robust to changes in illumination, translation and scale, and play an important role in visual content retrieval. However, these solutions are not very robust to 3D object rotations and camera viewpoint changes. In such scenarios, the emerging and richer lenslet light field image representation can provide additional information such as multiple perspectives and depth data. This paper introduces a new lenslet light field imaging dataset and studies the retrieval performance when popular 2D visual descriptors are applied. The new dataset consists of 25 Lisbon landmarks captured with a lenslet camera from different perspectives. Moreover, this paper proposes and assesses straightforward extensions of visual 2D descriptor matching for lenslet light field retrieval. The experimental results show that gains of up to 14% can be obtained with a light field representation compared to a conventional 2D imaging representation.
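One straightforward extension of 2D descriptor matching to light fields, in the spirit of the abstract, is to match the query against each rendered view of an item and keep the best score. The sketch below assumes toy random descriptors and a mean nearest-neighbour set distance; the paper's actual matching scheme may differ.

```python
import numpy as np

def set_distance(a, b):
    """Mean nearest-neighbour L2 distance from descriptor set a to b."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return float(d.min(axis=1).mean())

def multiview_distance(query, views):
    """Match a 2D query against every view of a light field item and
    keep the best (smallest) set distance."""
    return min(set_distance(query, v) for v in views)

rng = np.random.default_rng(1)
query = rng.normal(size=(10, 32))   # 10 local descriptors, 32-d each
far_view = query + 5.0              # a poorly matching perspective
good_view = query.copy()            # the perspective the query was taken from
d = multiview_distance(query, [far_view, good_view])
```

Because one view matches the query's perspective exactly, the multi-view distance collapses to zero, which is precisely the robustness to viewpoint change that a single 2D image cannot offer.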
DOI: 10.1145/3095713.3095723 (published 2017-06-19)
Citations: 1