
Latest Publications: Proceedings of the ACM Multimedia Asia

Session details: Multimedia Service
Pub Date : 2019-12-15 DOI: 10.1145/3379192
T. Yamasaki
{"title":"Session details: Multimedia Service","authors":"T. Yamasaki","doi":"10.1145/3379192","DOIUrl":"https://doi.org/10.1145/3379192","url":null,"abstract":"","PeriodicalId":273086,"journal":{"name":"Proceedings of the ACM Multimedia Asia","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127371842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Surface Normal Data Guided Depth Recovery with Graph Laplacian Regularization
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3366582
Longhua Sun, Jin Wang, Yunhui Shi, Qing Zhu, Baocai Yin
High-quality depth information has been increasingly used in many real-world multimedia applications in recent years. In practice, however, due to the limitations of depth sensors and sensing technology, captured depth maps usually suffer from low resolution and holes. In this paper, inspired by the geometric relationship between the surface normals of a 3D scene and their distance from the camera, we observe that a surface normal map can provide additional spatial geometric constraints for depth map reconstruction, since a depth map is a special image carrying spatial information, which we call a 2.5D image. To exploit this property, we propose a novel surface normal data guided depth recovery method, which uses surface normal data and observed depth values to estimate missing or interpolated depth values. Moreover, to preserve the inherent piecewise-smooth characteristic of depth maps, a graph Laplacian prior is applied to regularize the inverse problem of depth map recovery, and a graph Laplacian regularizer (GLR) is proposed. Finally, the spatial geometric constraint and the graph Laplacian regularization are integrated into a unified optimization framework, which can be solved efficiently by conjugate gradient (CG). Extensive quantitative and qualitative evaluations against state-of-the-art schemes show the effectiveness and superiority of our method.
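The optimization the abstract describes (data fidelity on observed depth plus a graph Laplacian smoothness term, solved by CG) can be made concrete with a minimal sketch. This is not the authors' code: the normal-guided weighting is reduced to generic 4-neighbor edge affinities, and all names are illustrative.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

def recover_depth(depth, mask, weights_h, weights_v, lam=0.1):
    """Minimal sketch: solve min_x ||M(x - d)||^2 + lam * x^T L x via CG.
    depth:     (H, W) observed depth (values in holes are ignored)
    mask:      (H, W) 1 where depth is observed, 0 in holes
    weights_h: (H, W) affinity of the edge to the right neighbor (e.g. from normals)
    weights_v: (H, W) affinity of the edge to the neighbor below
    """
    H, W = depth.shape
    n = H * W
    idx = lambda r, c: r * W + c

    # Assemble the (negative) off-diagonal entries of the 4-neighbor Laplacian.
    rows, cols, vals = [], [], []
    for r in range(H):
        for c in range(W):
            if c + 1 < W:
                w = weights_h[r, c]
                rows += [idx(r, c), idx(r, c + 1)]
                cols += [idx(r, c + 1), idx(r, c)]
                vals += [-w, -w]
            if r + 1 < H:
                w = weights_v[r, c]
                rows += [idx(r, c), idx(r + 1, c)]
                cols += [idx(r + 1, c), idx(r, c)]
                vals += [-w, -w]
    Wmat = sp.csr_matrix((vals, (rows, cols)), shape=(n, n))
    L = sp.diags(np.asarray(-Wmat.sum(axis=1)).ravel()) + Wmat  # graph Laplacian

    M = sp.diags(mask.ravel().astype(float))  # selects observed pixels
    A = M + lam * L                           # normal equations of the objective
    b = M @ depth.ravel()
    x, _ = cg(A, b, maxiter=500)              # conjugate gradient solve
    return x.reshape(H, W)
```

Setting the gradient of the objective to zero gives (M + lam*L) x = M d, which is the symmetric positive semi-definite system CG solves here.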
Citations: 1
Color Recovery from Multi-Spectral NIR Images Using Gray Information
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3368259
Qingtao Fu, Cheolkon Jung, Chen Su
Converting near-infrared (NIR) images into color images is a challenging task due to the different characteristics of visible and NIR images. Most methods that generate color images directly from a single NIR image are limited by the scene and object categories. In this paper, we propose a novel approach to recovering object colors from multi-spectral NIR images using gray information. The multi-spectral NIR images are captured by a 2-CCD NIR/RGB camera with narrow NIR bandpass filters of different wavelengths. The proposed approach uses the multi-spectral NIR images to estimate a conversion matrix for NIR-to-RGB conversion. In addition to the multi-spectral NIR images, a corresponding gray image is used as a complementary channel when estimating the conversion matrix. The conversion matrix is obtained from the ColorChecker's 24 color patches using polynomial regression and then applied to real-world scene NIR images for color recovery. The proposed approach has been evaluated on a large number of real-world scene images, and the results show that it is simple yet effective for recovering object colors.
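The fitting step lends itself to a short sketch: expand the per-patch channel means polynomially, then solve a least-squares problem for the conversion matrix over the 24 ColorChecker patches. The choice of second-order terms and all names are assumptions, not the paper's exact expansion.

```python
import numpy as np

def poly_features(x):
    """Second-order polynomial expansion (assumed form).
    x: (N, C) per-sample means of the NIR bands + gray channel."""
    feats = [np.ones(len(x)), *x.T]
    C = x.shape[1]
    for i in range(C):
        for j in range(i, C):
            feats.append(x[:, i] * x[:, j])
    return np.stack(feats, axis=1)

# 24 ColorChecker patches: inputs are NIR band + gray means, targets are RGB.
nir_gray = np.random.rand(24, 4)   # placeholder measurements (3 NIR bands + gray)
rgb_true = np.random.rand(24, 3)   # placeholder reference RGB values

X = poly_features(nir_gray)                        # (24, F)
M, *_ = np.linalg.lstsq(X, rgb_true, rcond=None)   # conversion matrix (F, 3)

def convert(img_channels):
    """Apply the fitted matrix pixel-wise: (H, W, 4) -> (H, W, 3) RGB estimate."""
    H, W, _ = img_channels.shape
    flat = poly_features(img_channels.reshape(-1, 4))
    return np.clip(flat @ M, 0, 1).reshape(H, W, 3)
```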
Citations: 6
Generalizing Rate Control Strategies for Realtime Video Streaming via Learning from Deep Learning
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3366606
Tianchi Huang, Ruixiao Zhang, Chenglei Wu, Xin Yao, Chao Zhou, Bing Yu, Lifeng Sun
The leading learning-based rate control method, QARC, achieves state-of-the-art performance but fails to expose its underlying principles, and thus lacks the ability to improve itself efficiently. In this paper, we propose EQARC (Explainable QARC) by reconstructing QARC's modules, aiming to demystify how QARC works. In detail, we first utilize a novel hybrid attention-based CNN+GRU model to re-characterize the original quality prediction network, and reasonably replace QARC's 1D-CNN layers with 2D-CNN layers. Using trace-driven experiments, we demonstrate the superiority of EQARC over existing state-of-the-art approaches. Next, we collect useful information from each interpretable module and distill the insights behind EQARC. Following this step, we further propose AQARC (Advanced QARC), a lightweight version of QARC. Experimental results show that AQARC achieves the same performance as QARC with a 90% reduction in overhead. In short, by learning from deep learning, we derive a rate control method that reaches high performance at reduced computation cost.
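A minimal PyTorch sketch of the kind of hybrid model the abstract names: a 2D-CNN per-frame encoder, a GRU over time, and attention pooling over the hidden states to predict perceptual quality. Layer sizes and all names are assumptions, not EQARC's actual configuration.

```python
import torch
import torch.nn as nn

class QualityPredictor(nn.Module):
    """Sketch of a hybrid attention-based CNN+GRU quality predictor."""
    def __init__(self, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(                 # 2D-CNN per-frame encoder
            nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.gru = nn.GRU(32, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)          # scalar attention score per step
        self.head = nn.Linear(hidden, 1)          # predicted quality

    def forward(self, frames):
        # frames: (B, T, 3, H, W), a short clip of video frames
        B, T = frames.shape[:2]
        f = self.cnn(frames.flatten(0, 1)).flatten(1)   # (B*T, 32)
        h, _ = self.gru(f.view(B, T, -1))               # (B, T, hidden)
        a = torch.softmax(self.attn(h), dim=1)          # (B, T, 1) attention weights
        ctx = (a * h).sum(dim=1)                        # attention-pooled context
        return self.head(ctx).squeeze(-1)               # (B,) quality score

pred = QualityPredictor()
score = pred(torch.randn(2, 8, 3, 64, 64))   # two clips of 8 frames each
```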
Citations: 1
An Adaptive Dark Region Detail Enhancement Method for Low-light Images
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3366584
Wengang Cheng, Caiyun Guo, Haitao Hu
Images captured in low-light conditions are often of poor visual quality, as most of the details in dark regions are buried. Although some advanced low-light image enhancement methods can lighten an image and its dark regions, they still cannot reveal the details in dark regions very well. This paper presents an adaptive dark region detail enhancement method for low-light images. Since our method is based on the Retinex theory, we first formulate Retinex-based low-light image enhancement as a Bayesian optimization problem. Then, a dark region prior is proposed, and an adaptive gradient amplification strategy is designed to incorporate this prior into the illumination estimation. The dark region prior, together with the widely used spatial smoothness and structure priors, leads to a dark-region- and structure-aware smoothness regularization term for illumination optimization. We provide a solver for this optimization and obtain the final enhanced results after post-processing. Experiments demonstrate that our method produces good enhancement results with better dark region details than several state-of-the-art methods.
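To make the Retinex setup concrete, here is a heavily simplified sketch: illumination is initialized from the channel-wise maximum, smoothed, and the reflectance is recovered by division. The paper's Bayesian optimization with the dark region prior and adaptive gradient amplification is replaced here by plain Gaussian smoothing for brevity; this is a stand-in, not the authors' solver.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def retinex_enhance(img, sigma=15, gamma=0.6, eps=1e-4):
    """Simplified Retinex enhancement sketch (NOT the paper's optimization).
    img: (H, W, 3) float image in [0, 1]."""
    illum0 = img.max(axis=2)                  # initial illumination estimate
    illum = gaussian_filter(illum0, sigma)    # smooth stand-in for the regularized estimate
    illum = np.maximum(illum, eps)
    reflectance = img / illum[..., None]      # Retinex decomposition: I = R * L
    # Gamma-compress the illumination so dark regions are lifted more than bright ones.
    enhanced = reflectance * (illum ** gamma)[..., None]
    return np.clip(enhanced, 0, 1)
```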
Citations: 0
Learn to Gesture: Let Your Body Speak
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3366602
Tian Gan, Zhixin Ma, Yu Lu, Xuemeng Song, Liqiang Nie
Presentation is one of the most important and vivid ways to deliver information to an audience. Apart from the content of a presentation, how the speaker behaves while presenting makes a big difference. In other words, gestures, as part of visual perception synchronized with verbal information, express subtle information that voice or words alone cannot deliver. One of the most effective ways to improve presentation skills is to practice with feedback and suggestions from an expert. However, hiring human experts is expensive and thus impractical most of the time. To this end, we propose a speech-to-gesture network (POSE) that generates exemplary body language given a speech recording as input. Specifically, we build an "expert" Speech-Gesture database from featured TED talk videos, and design a two-layer attentive recurrent encoder-decoder network to learn the translation from speech to gesture, as well as the hierarchical structure within gestures. Finally, given a speech audio sequence, an appropriate gesture is generated and visualized for more effective communication. Both objective and subjective validation show the effectiveness of our proposed method.
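A compact PyTorch sketch of an attentive recurrent encoder-decoder of the kind described: a two-layer GRU encoder over audio features, a GRU decoder that emits pose vectors, and additive attention over encoder states. Dimensions, the pose representation, and all names are assumptions, not POSE's architecture.

```python
import torch
import torch.nn as nn

class Speech2Gesture(nn.Module):
    """Sketch of an attentive encoder-decoder mapping audio features to poses."""
    def __init__(self, audio_dim=40, pose_dim=30, hidden=128):
        super().__init__()
        self.encoder = nn.GRU(audio_dim, hidden, num_layers=2, batch_first=True)
        self.decoder = nn.GRUCell(pose_dim + hidden, hidden)
        self.attn = nn.Linear(2 * hidden, 1)      # additive attention scorer
        self.out = nn.Linear(hidden, pose_dim)

    def forward(self, audio, steps):
        # audio: (B, T, audio_dim), e.g. MFCC frames; returns (B, steps, pose_dim)
        enc, _ = self.encoder(audio)              # (B, T, hidden)
        B, T, H = enc.shape
        h = enc[:, -1]                            # init decoder state from last step
        pose = torch.zeros(B, self.out.out_features)
        poses = []
        for _ in range(steps):
            # Attend over encoder states conditioned on the decoder state.
            q = h.unsqueeze(1).expand(-1, T, -1)               # (B, T, hidden)
            w = torch.softmax(self.attn(torch.cat([enc, q], -1)), dim=1)
            ctx = (w * enc).sum(1)                             # (B, hidden)
            h = self.decoder(torch.cat([pose, ctx], -1), h)
            pose = self.out(h)                                 # next pose vector
            poses.append(pose)
        return torch.stack(poses, dim=1)

model = Speech2Gesture()
gestures = model(torch.randn(2, 100, 40), steps=50)   # (2, 50, 30)
```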
Citations: 0
Deep Structural Feature Learning: Re-Identification of Similar Vehicles in Structure-Aware Map Space
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3366585
Wenqian Zhu, R. Hu, Zhongyuan Wang, Dengshi Li, Xiyue Gao
Vehicle re-identification (re-ID) has received growing attention in recent years as a significant task that contributes substantially to intelligent video surveillance. The complex intra-class and inter-class variation of vehicle images poses huge challenges for vehicle re-ID, especially for re-identifying similar vehicles. In this paper, we focus on an interesting and challenging problem: re-ID of vehicles of the same or similar model. Previous works mainly focus on extracting global features using deep models, ignoring the individual local regions in the vehicle front window, such as decorations and stickers attached to the windshield, which can be more discriminative for vehicle re-ID. Instead of directly embedding these regions to learn their features, we propose a Regional Structure-Aware model (RSA) that learns structure-aware cues from the position distribution of individual local regions in the front-window area, constructing a front-window (FW) structural map space. In this map space, deep models are able to learn more robust and discriminative spatial structure-aware features, improving vehicle re-ID performance for the same or similar models. We evaluate our method on Vehicle-1M, a large-scale vehicle re-ID dataset. The experimental results show that our method achieves promising performance and outperforms several recent state-of-the-art approaches.
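The abstract does not specify the RSA model in enough detail to reproduce, but the generic re-ID recipe it builds on can be sketched: learn an embedding in which images of the same vehicle lie closer than images of different vehicles, e.g. with a triplet loss. This is a plain stand-in, not the paper's structure-aware model; the backbone and all names are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Generic re-ID embedding backbone (stand-in for the paper's RSA model)."""
    def __init__(self, dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.fc = nn.Linear(64, dim)

    def forward(self, x):
        return F.normalize(self.fc(self.features(x)), dim=1)  # unit-norm embedding

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Pull same-vehicle pairs together, push different vehicles apart."""
    d_pos = (anchor - positive).pow(2).sum(1)
    d_neg = (anchor - negative).pow(2).sum(1)
    return F.relu(d_pos - d_neg + margin).mean()

net = EmbeddingNet()
a, p, n = (torch.randn(4, 3, 128, 128) for _ in range(3))
loss = triplet_loss(net(a), net(p), net(n))
```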
Citations: 1
Session details: Vision in Multimedia
Pub Date : 2019-12-15 DOI: 10.1145/3379197
H. Hang
{"title":"Session details: Vision in Multimedia","authors":"H. Hang","doi":"10.1145/3379197","DOIUrl":"https://doi.org/10.1145/3379197","url":null,"abstract":"","PeriodicalId":273086,"journal":{"name":"Proceedings of the ACM Multimedia Asia","volume":" August","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113946847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Comprehensive Event Storyline Generation from Microblogs
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3366601
Wenjin Sun, Yuhang Wang, Yuqi Gao, Zesong Li, J. Sang, Jian Yu
Microblogging data contains a wealth of information about trending events and has gained increasing attention among users, organizations, and researchers for social media mining across disciplines. Event storyline generation is a typical social media mining task whose goal is to extract the development stages of an event together with descriptions of each stage. Existing storyline generation methods either produce storylines with limited integrity or fail to guarantee coherence between the discovered stages. Moreover, there is no established method for evaluating the quality of a storyline. In this paper, we propose a comprehensive storyline generation framework to address these shortcomings. Given microblogging data related to a specified event, we first propose a hot-word-based stage detection algorithm to identify the potential stages of the event, which effectively avoids missing important stages and prevents inconsistent ordering between stages. A community detection algorithm is then applied to select representative data for each stage. Finally, we run a graph optimization algorithm to generate logically coherent storylines for the event. We also introduce a new evaluation metric, SLEU, to emphasize the integrity and coherence of the generated storyline. Extensive experiments on real-world Chinese microblogging data demonstrate the effectiveness of each module and of the overall framework.
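A minimal sketch of the hot-word-based stage detection idea: bucket posts into time windows, extract each window's hot words by frequency, and start a new stage when the hot-word set shifts sharply. The tokenizer, the Jaccard criterion, and all thresholds are assumptions, not the paper's algorithm.

```python
from collections import Counter

def hot_words(posts, k=10):
    """Top-k most frequent tokens in a window of posts (toy whitespace tokenizer)."""
    counts = Counter(w for p in posts for w in p.lower().split())
    return {w for w, _ in counts.most_common(k)}

def detect_stages(windows, min_overlap=0.3):
    """windows: list of post lists, one per time window.
    Start a new stage whenever the Jaccard overlap between consecutive
    windows' hot-word sets drops below min_overlap."""
    stages, prev = [], set()
    for i, posts in enumerate(windows):
        cur = hot_words(posts)
        union = prev | cur
        overlap = len(prev & cur) / len(union) if union else 0.0
        if not stages or overlap < min_overlap:
            stages.append({"start": i, "hot_words": cur})   # new stage begins
        else:
            stages[-1]["hot_words"] |= cur                   # same stage continues
        prev = cur
    return stages

windows = [["earthquake hits city", "earthquake magnitude high"],
           ["rescue teams arrive", "rescue operation begins"]]
print(detect_stages(windows))   # two stages: onset, then rescue
```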
Citations: 3
Semantic Prior Guided Face Inpainting
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3366587
Zeyang Zhang, Xiaobo Zhou, Shengjie Zhao, Xiaoyan Zhang
Face inpainting is a sub-task of image inpainting designed to repair broken or occluded incomplete portraits. Due to the high complexity of facial detail, inpainting faces is particularly difficult. Face-related tasks often borrow successful techniques from face recognition and face detection, using multi-task learning to boost their effectiveness. Accordingly, this paper proposes to add face prior knowledge to an existing state-of-the-art inpainting model, combined with a perceptual loss and an SSIM loss, to improve the model's inpainting performance. A new face inpainting pipeline and algorithm are implemented, and the repair results are improved.
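To make the combined loss concrete, here is a minimal PyTorch sketch of a perceptual loss on VGG16 features plus a simplified (uniform-window) SSIM term. The layer choice, loss weights, and the omission of VGG input normalization are assumptions for brevity, not the paper's configuration.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

# Frozen VGG16 feature extractor for the perceptual term (layer cut assumed).
_vgg = vgg16(weights=VGG16_Weights.DEFAULT).features[:16].eval()
for p in _vgg.parameters():
    p.requires_grad_(False)

def perceptual_loss(pred, target):
    return F.l1_loss(_vgg(pred), _vgg(target))

def ssim(pred, target, window=11, c1=0.01**2, c2=0.03**2):
    """Simplified SSIM using a uniform window instead of the usual Gaussian."""
    mu_p = F.avg_pool2d(pred, window, 1, window // 2)
    mu_t = F.avg_pool2d(target, window, 1, window // 2)
    var_p = F.avg_pool2d(pred * pred, window, 1, window // 2) - mu_p ** 2
    var_t = F.avg_pool2d(target * target, window, 1, window // 2) - mu_t ** 2
    cov = F.avg_pool2d(pred * target, window, 1, window // 2) - mu_p * mu_t
    s = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / \
        ((mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2))
    return s.mean()

def inpainting_loss(pred, target, w_perc=0.1, w_ssim=0.5):
    """Combined pixel + perceptual + SSIM loss (weights assumed)."""
    return (F.l1_loss(pred, target)
            + w_perc * perceptual_loss(pred, target)
            + w_ssim * (1.0 - ssim(pred, target)))

x, y = torch.rand(1, 3, 128, 128), torch.rand(1, 3, 128, 128)
print(inpainting_loss(x, y))
```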
Citations: 6