
Proceedings of the 21st ACM international conference on Multimedia — latest publications

Learning articulated body models for people re-identification
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502147
Davide Baltieri, R. Vezzani, R. Cucchiara
People re-identification is a challenging problem in surveillance and forensics: it aims to associate multiple instances of the same person acquired from different points of view and after a temporal gap. Image-based appearance features are usually adopted but, in addition to their intrinsically low discriminability, they are subject to perspective and view-point issues. We propose to completely change the approach by mapping local descriptors extracted from RGB-D sensors onto a 3D body model to create a view-independent signature. An original bone-wise color descriptor is generated and reduced with PCA to compute the person signature. The virtual bone set used to map appearance features is learned using a recursive splitting approach. Finally, people matching for re-identification is performed using Relaxed Pairwise Metric Learning, which simultaneously provides feature reduction and weighting. Experiments on a specific dataset created with the Microsoft Kinect sensor and the OpenNI libraries prove the advantages of the proposed technique with respect to state-of-the-art methods based on 2D or non-articulated 3D body models.
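The pipeline described above — concatenating bone-wise color descriptors, reducing them with PCA, and matching under a learned metric — can be sketched as follows. The toy dimensions, the hand-built projection basis, and the identity metric are illustrative assumptions, not the paper's learned quantities:

```python
def project(components, x):
    """Matrix-vector product: project a raw signature onto precomputed components."""
    return [sum(c * xi for c, xi in zip(row, x)) for row in components]

def signature(bone_descriptors, components):
    """Concatenate per-bone color descriptors, then reduce with a PCA-style basis."""
    raw = [v for bone in bone_descriptors for v in bone]
    return project(components, raw)

def metric_distance(a, b, M):
    """Mahalanobis-style distance d^T M d under a learned metric M,
    standing in for the Relaxed Pairwise Metric Learning step."""
    d = [x - y for x, y in zip(a, b)]
    Md = project(M, d)
    return sum(di * mi for di, mi in zip(d, Md))

bones = [[0.2, 0.5], [0.1, 0.9], [0.4, 0.4]]       # 3 bones x 2-bin color hist (toy)
comps = [[1, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0]]    # hand-built 2-component basis
sig = signature(bones, comps)
identity = [[1.0, 0.0], [0.0, 1.0]]                 # identity metric = Euclidean
```

With the identity metric the distance degenerates to squared Euclidean distance; the learned metric additionally weights and decorrelates the reduced dimensions.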
Citations: 21
Session details: Keynote address
Pub Date : 2013-10-21 DOI: 10.1145/3245284
David A. Shamma
Citations: 0
Picture tags and world knowledge: learning tag relations from visual semantic sources
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502113
Lexing Xie, Xuming He
This paper studies the use of everyday words to describe images. The common saying has it that 'a picture is worth a thousand words'; here we ask: which thousand? The proliferation of tagged social multimedia data presents a challenge to understanding collective tag use at large scale -- one can ask whether patterns in photo tags help us understand tag-tag relations, and how they can be leveraged to improve visual search and recognition. We propose a new method to jointly analyze three distinct visual knowledge resources: Flickr, ImageNet/WordNet, and ConceptNet. This allows us to quantify the visual relevance of tags and learn their relationships. We propose a novel network estimation algorithm, Inverse Concept Rank, to infer incomplete tag relationships. We then design an algorithm for image annotation that takes into account both image and tag features. We analyze over 5 million photos with over 20,000 visual tags. The statistics from this collection lead to good results for image tagging, relationship estimation, and generalization to unseen tags. This is a first step in analyzing picture tags and everyday semantic knowledge. Potential further applications include generating natural language descriptions of pictures, as well as validating and supplementing knowledge databases.
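Inverse Concept Rank is the paper's own contribution; as a generic illustration of the random-walk machinery that tag-graph analysis of this kind builds on, here is a plain power-iteration walk with restart over a toy tag co-occurrence graph (the graph and its weights are invented):

```python
def tag_rank(adj, restart=0.15, iters=50):
    """PageRank-style random walk over a weighted tag co-occurrence graph.
    adj[i][j] is the co-occurrence weight between tags i and j.
    A generic sketch, not the paper's Inverse Concept Rank."""
    n = len(adj)
    rank = [1.0 / n] * n
    for _ in range(iters):
        new = [restart / n] * n                 # teleport mass, spread uniformly
        for i, row in enumerate(adj):
            total = sum(row)
            if total == 0:
                continue                        # dangling node keeps no walk mass
            for j, w in enumerate(row):
                new[j] += (1 - restart) * rank[i] * w / total
        rank = new
    return rank

# toy co-occurrence counts among 3 tags: 'cat', 'pet', 'sky'
adj = [[0, 5, 0],
       [5, 0, 1],
       [0, 1, 0]]
scores = tag_rank(adj)   # 'pet' bridges both neighbors, so it ranks highest
```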
Citations: 29
Design, development and evaluation of an adaptive and standardized RTP/RTCP-based IDMS solution
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502219
M. Montagud
Inter-Destination Media Synchronization (IDMS) is essential for enabling pleasant shared media experiences. The goal of my PhD thesis is to design, develop and evaluate an advanced RTP/RTCP-based IDMS solution fitting the requirements of the emerging distributed media consumption paradigm. In particular, standard compliant extensions to RTCP are being specified to allow for an accurate, adaptive and dynamic IDMS control when using RTP for streaming media. Moreover, the feasibility and suitability of several architectural schemes for exchanging the IDMS information, algorithms for allowing a dynamic IDMS monitoring and control, as well as adjustment techniques are being investigated. Objective and subjective testing are being conducted to validate the satisfactory performance of our IDMS solution and to provide insights about the users' tolerance on asynchrony levels in different IDMS scenarios.
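As a rough illustration of what IDMS control computes — not the RTCP extension itself — the sketch below picks a reference playout point among destinations and derives per-client adjustments; the "latest" reference policy and the client values are assumptions:

```python
def idms_adjustments(playout_points, policy="latest"):
    """Given each destination's current playout point (seconds), choose a
    reference and return per-client offsets needed to converge on it.
    A simplified sketch of IDMS control logic."""
    if policy == "latest":
        ref = max(playout_points.values())      # everyone catches up to the leader
    else:
        ref = sum(playout_points.values()) / len(playout_points)  # mean playout
    return {cid: ref - p for cid, p in playout_points.items()}

clients = {"a": 12.40, "b": 12.10, "c": 12.55}
offsets = idms_adjustments(clients)   # positive offset => client must skip ahead
```

In a real deployment the adjustment would be applied gradually (playout-rate adaptation) rather than as a hard skip, which is one of the adjustment techniques the thesis investigates.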
Citations: 7
Robust facial expressions recognition using 3D average face and ameliorated adaboost
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502173
Jinhui Chen, Y. Ariki, T. Takiguchi
One of the most crucial technologies in computer vision deals with facial recognition, especially the automatic estimation of facial expressions. However, in real-time facial expression recognition, when a face turns sideways, expressional feature extraction becomes difficult as the camera view changes, and recognition accuracy degrades significantly. Consequently, many conventional methods are based on static images or limited to situations in which the face is viewed from the front. In this paper, a method that combines Look-Up-Table (LUT) AdaBoost with the three-dimensional average face is proposed to solve this problem. To evaluate the proposed method, experiments comparing it with a conventional method were conducted. These approaches show promising results and very good success rates. This paper covers several methods that can improve results by making the system more robust.
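LUT AdaBoost replaces threshold stumps with look-up-table weak learners, but the boosting loop itself is standard. A minimal discrete AdaBoost with 1-D threshold stumps (toy data, not the paper's features) looks like:

```python
import math

def adaboost_train(xs, ys, rounds=10):
    """Discrete AdaBoost over threshold stumps on 1-D samples with labels +-1.
    LUT-AdaBoost would swap the stump for a look-up-table weak learner."""
    n = len(xs)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        best = None
        for t in set(xs):                       # candidate thresholds
            for sign in (1, -1):
                err = sum(wi for wi, x, y in zip(w, xs, ys)
                          if (sign if x >= t else -sign) != y)
                if best is None or err < best[0]:
                    best = (err, t, sign)
        err, t, sign = best
        err = max(err, 1e-10)                   # avoid log(0) on a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, t, sign))
        # reweight: boost the samples this stump got wrong
        w = [wi * math.exp(-alpha * y * (sign if x >= t else -sign))
             for wi, x, y in zip(w, xs, ys)]
        s = sum(w)
        w = [wi / s for wi in w]
    return model

def adaboost_predict(model, x):
    score = sum(a * (s if x >= t else -s) for a, t, s in model)
    return 1 if score >= 0 else -1

xs = [1, 2, 3, 8, 9, 10]
ys = [-1, -1, -1, 1, 1, 1]
model = adaboost_train(xs, ys)
```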
Citations: 18
Pl@ntNet mobile app
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502251
H. Goëau, P. Bonnet, A. Joly, V. Bakic, Julien Barbe, Itheri Yahiaoui, Souheil Selmi, Jennifer Carré, D. Barthélémy, N. Boujemaa, J. Molino, Grégoire Duché, Aurélien Péronnet
Pl@ntNet is an image sharing and retrieval application for the identification of plants, available on iPhone and iPad devices. Unlike previous content-based identification applications, it can work with several parts of the plant, including flowers, leaves, fruits and bark. It also allows user observations to be integrated into the database thanks to a collaborative workflow involving the members of a social network specialized in plants. The data collected so far make it one of the largest mobile plant identification tools.
Citations: 56
Efficient image and tag co-ranking: a bregman divergence optimization method
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502156
Lin Wu, Yang Wang, J. Shepherd
Ranking in image search has attracted considerable attention, and many graph-based algorithms have been proposed to solve this problem. Despite their remarkable success, these approaches are restricted to separate image networks. To improve ranking performance, one effective strategy is to work beyond the separate image graph by leveraging information from the manual semantic labels (i.e., tags) associated with images. This leads to the technique of co-ranking images and tags, a representative method that aims to exploit the reinforcing relationship between image and tag graphs. The idea of co-ranking is implemented by adopting the paradigm of random walks. However, two problems in co-ranking remain open: high computational complexity and the out-of-sample problem. To address these challenges, in this paper we cast the co-ranking process into a Bregman divergence optimization framework, under which we transform the original random walk into an equivalent optimal kernel matrix learning problem. Building on this new formulation, we derive a novel extension to achieve better performance in both in-sample and out-of-sample cases. Extensive experiments demonstrate the effectiveness and efficiency of our approach.
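The random-walk co-ranking paradigm the paper starts from can be sketched as two coupled walks — one per graph — that exchange mass over the image-tag links each round. Everything here (the mixing weight, the toy graphs) is a generic illustration, not the Bregman formulation:

```python
def normalize(v):
    s = sum(v)
    return [x / s for x in v] if s else v

def walk_step(adj, rank):
    """One step of a row-normalized random walk; adj may be rectangular,
    e.g. the image-tag bipartite matrix."""
    out = [0.0] * len(adj[0])
    for i, row in enumerate(adj):
        total = sum(row)
        if total == 0:
            continue
        for j, w in enumerate(row):
            out[j] += rank[i] * w / total
    return out

def co_rank(img_adj, tag_adj, img_tag, mix=0.5, iters=50):
    """Alternate walks on the image and tag graphs, mixing in scores
    propagated across the bipartite image-tag links each round."""
    ri = [1.0 / len(img_adj)] * len(img_adj)
    rt = [1.0 / len(tag_adj)] * len(tag_adj)
    tag_img = [list(col) for col in zip(*img_tag)]   # transpose: tag -> image
    for _ in range(iters):
        to_tags = walk_step(img_tag, ri)             # images push mass to tags
        to_imgs = walk_step(tag_img, rt)             # tags push mass to images
        ri = normalize([mix * a + (1 - mix) * b
                        for a, b in zip(walk_step(img_adj, ri), to_imgs)])
        rt = normalize([mix * a + (1 - mix) * b
                        for a, b in zip(walk_step(tag_adj, rt), to_tags)])
    return ri, rt

img_adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]   # toy image similarity graph
tag_adj = [[0, 1], [1, 0]]                     # toy tag relation graph
img_tag = [[1, 0], [1, 1], [0, 1]]             # which image carries which tag
ri, rt = co_rank(img_adj, tag_adj, img_tag)
```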
Citations: 43
Efficient video quality assessment based on spacetime texture representation
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502168
Peng Peng, Kevin J. Cannons, Ze-Nian Li
Most existing video quality metrics measure temporal distortions based on optical-flow estimation, which typically has limited descriptive power for visual dynamics and low efficiency. This paper presents a unified and efficient framework for measuring temporal distortions based on a spacetime texture representation of motion. We first propose an effective motion-tuning scheme to capture temporal distortions along motion trajectories by exploiting the distributive characteristic of the spacetime texture. We then reuse the motion descriptors to build a self-information-based spatiotemporal saliency model to guide spatial pooling. Finally, a comprehensive quality metric is developed by combining the temporal distortion measure with a spatial distortion measure. Our method demonstrates high efficiency and excellent correlation with human perception of video quality.
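Saliency-guided spatial pooling reduces, in essence, to a saliency-weighted average of a per-block distortion map. The sketch below illustrates that idea with invented block scores, not the paper's self-information model:

```python
def saliency_weighted_pool(distortion, saliency):
    """Pool per-block distortion scores into one frame-level score,
    weighting each block by its visual saliency (generic sketch)."""
    den = sum(saliency)
    if den == 0:
        return sum(distortion) / len(distortion)   # fall back to plain mean
    return sum(d * s for d, s in zip(distortion, saliency)) / den

dist = [0.1, 0.9, 0.2]   # toy per-block distortion
sal  = [1.0, 3.0, 1.0]   # toy per-block saliency: middle block draws attention
score = saliency_weighted_pool(dist, sal)
```

Because the heavily distorted block is also the salient one, the pooled score (0.6) sits well above the unweighted mean (0.4), matching the intuition that visible distortions hurt perceived quality most.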
Citations: 4
What are the distance metrics for local features?
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502134
Zhendong Mao, Yongdong Zhang, Q. Tian
Previous research has found that the distance metric for similarity estimation is determined by the underlying data noise distribution. The well-known Euclidean (L2) and Manhattan (L1) metrics are then justified when the additive noise is Gaussian or Exponential, respectively. However, finding a suitable distance metric for local features is still a challenge when the underlying noise distribution is unknown and may be neither Gaussian nor Exponential. To address this issue, we introduce a modeling framework for arbitrary noise distributions and propose a generalized distance metric for local features based on this framework. We prove that the proposed distance is equivalent to the L1 or the L2 distance when the noise is Gaussian or Exponential. Furthermore, we justify the Hamming metric when the noise meets the given conditions; in that case, the proposed distance is a linear mapping of the Hamming distance. The proposed metric has been extensively tested on a benchmark data set with five state-of-the-art local features: SIFT, SURF, BRIEF, ORB and BRISK. Experiments show that our framework better models real noise distributions and that more robust results can be obtained by using the proposed distance metric.
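The noise-to-metric correspondence is easy to make concrete: L2 is the maximum-likelihood match for Gaussian noise, L1 for Exponential noise, and Hamming suits binary descriptors such as BRIEF/ORB/BRISK. A minimal sketch of the three metrics:

```python
def lp_distance(a, b, p):
    """Minkowski distance: p=2 gives Euclidean (Gaussian noise),
    p=1 gives Manhattan (Exponential noise)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def hamming(a, b):
    """Count of differing positions, for binary descriptors."""
    return sum(x != y for x, y in zip(a, b))

d1 = lp_distance([1, 2, 3], [2, 2, 5], 1)   # |1| + |0| + |2|
d2 = lp_distance([3, 4], [0, 0], 2)          # sqrt(9 + 16)
dh = hamming([1, 0, 1, 1], [1, 1, 1, 0])     # two bits differ
```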
Citations: 3
Joserlin: joint request and service scheduling for peer-to-peer non-linear media access
Pub Date : 2013-10-21 DOI: 10.1145/2502081.2502090
Z. Zhao, Wei Tsang Ooi
A peer-to-peer non-linear media streaming system needs to schedule both on-demand and prefetch requests carefully so as to reduce server load and ensure a good user experience. In this work we propose Joserlin, a joint request and service scheduling solution that not only alleviates request contention (requests competing for limited service capacity), but also schedules prefetch requests by considering their contribution to the potential reduction of server load. In particular, we propose a novel request binning algorithm to prevent self-contention among on-demand requests issued from the same peer. A service and rejection policy is devised to resolve contention among on-demand requests issued from different neighbors. More importantly, Joserlin employs a gain function to prioritize prefetch requests at both requesters and responders, and a prefetch request issuing algorithm to fully utilize available upload bandwidth. Evaluation with traces collected from a popular networked virtual environment shows that Joserlin reduces server load by 20%~60%.
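The gain-function idea can be illustrated with a simple greedy scheduler that ranks prefetch requests by gain per byte under an upload budget. The gain values and the greedy rule are hypothetical stand-ins for Joserlin's actual function, which the paper derives from server-load reduction:

```python
def schedule_prefetch(requests, budget):
    """Greedily admit prefetch requests in decreasing gain-per-byte order
    until the upload budget is exhausted (illustrative policy only)."""
    ranked = sorted(requests, key=lambda r: r["gain"] / r["size"], reverse=True)
    chosen, used = [], 0
    for r in ranked:
        if used + r["size"] <= budget:
            chosen.append(r["id"])
            used += r["size"]
    return chosen

reqs = [{"id": "a", "size": 4, "gain": 8},
        {"id": "b", "size": 2, "gain": 6},
        {"id": "c", "size": 5, "gain": 5}]
picked = schedule_prefetch(reqs, budget=6)   # "b" (ratio 3) then "a" (ratio 2)
```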
Citations: 4