
Latest Publications in Proceedings of the ACM Multimedia Asia

Deep Distillation Metric Learning
Pub Date: 2019-12-15 DOI: 10.1145/3338533.3366560
Jiaxu Han, Tianyu Zhao, Changqing Zhang
Due to the emergence of large-scale, high-dimensional data, measuring the similarity between data points has become challenging. To obtain effective representations, metric learning has become one of the most active research areas in computer vision and pattern recognition. However, models that use trained networks for prediction are often cumbersome and difficult to deploy. In this paper, we therefore propose Deep Distillation Metric Learning (DDML), a novel method that performs online teaching while learning the distance metric. Specifically, we employ model distillation to transfer the knowledge acquired by a larger model to a smaller one. Unlike two-step offline distillation or mutual online learning, we train a powerful teacher model that transfers its knowledge to a lightweight, generalizable student model and is itself iteratively improved by feedback from the student. We show that our method achieves state-of-the-art results on CUB200-2011 and CARS196 while offering advantages in computational efficiency.
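The abstract does not spell out the DDML objective; as a generic illustration of the distillation ingredient it builds on — matching temperature-softened teacher and student distributions with a KL-divergence loss — here is a minimal, self-contained Python sketch (all names and the temperature value are illustrative, not from the paper):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax: higher T yields a softer distribution."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """KL divergence KL(teacher || student) over softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [2.5, 0.3, -1.0]   # toy teacher logits
student = [1.8, 0.5, -0.7]   # toy student logits
loss = distillation_loss(teacher, student)
```

The loss is zero exactly when the two softened distributions coincide, which is what drives the student toward the teacher's predictions.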
Citations: 4
Comprehensive Event Storyline Generation from Microblogs
Pub Date: 2019-12-15 DOI: 10.1145/3338533.3366601
Wenjin Sun, Yuhang Wang, Yuqi Gao, Zesong Li, J. Sang, Jian Yu
Microblogging data contains a wealth of information about trending events and has gained increasing attention from users, organizations, and researchers mining social media across disciplines. Event storyline generation is a typical social media mining task whose goal is to extract the development stages of an event together with descriptions of each stage. Existing storyline generation methods either produce storylines that lack integrity or fail to guarantee coherence between the discovered stages. Moreover, there is no established method for evaluating storyline quality. In this paper, we propose a comprehensive storyline generation framework that addresses these shortcomings. Given microblogging data related to a specified event, we first propose a hot-word-based stage detection algorithm to identify the potential stages of the event, which effectively avoids missing important stages and prevents inconsistent ordering between stages. A community detection algorithm is then applied to select representative data for each stage. Finally, we apply a graph optimization algorithm to generate logically coherent storylines for the event. We also introduce a new evaluation metric, SLEU, that emphasizes the integrity and coherence of the generated storyline. Extensive experiments on real-world Chinese microblogging data demonstrate the effectiveness of each module and of the overall framework.
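The paper's hot-word-based stage detection is not detailed in the abstract; the sketch below shows one plausible reading of the idea — track the top-k words per time window and start a new stage when the hot-word set shifts sharply. The Jaccard threshold and window layout are assumptions for illustration only:

```python
from collections import Counter

def hot_words(posts, k=3):
    """Top-k most frequent words across one time window of posts."""
    counts = Counter(w for post in posts for w in post.split())
    return {w for w, _ in counts.most_common(k)}

def detect_stages(windows, k=3, jaccard_threshold=0.5):
    """Group consecutive windows into stages; a sharp hot-word shift
    (low Jaccard similarity to the previous window) opens a new stage."""
    stages, current = [], None
    for i, posts in enumerate(windows):
        hw = hot_words(posts, k)
        if current is None:
            stages.append([i])
        else:
            inter = len(hw & current)
            union = len(hw | current) or 1
            if inter / union < jaccard_threshold:
                stages.append([i])       # hot words changed: new stage
            else:
                stages[-1].append(i)     # similar hot words: same stage
        current = hw
    return stages

windows = [
    ["storm warning issued", "storm warning city"],
    ["storm hits city", "storm damage city"],
    ["storm damage city", "city damage heavy"],
]
stages = detect_stages(windows)
```

On the toy windows above, the vocabulary shift between the first and second windows opens a second stage, while the third window stays in it.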
Citations: 3
Selective Attention Network for Image Dehazing and Deraining
Pub Date: 2019-12-15 DOI: 10.1145/3338533.3366688
Xiao Liang, Runde Li, Jinhui Tang
Image dehazing and deraining are important low-level computer vision tasks. In this paper, we propose a novel method named Selective Attention Network (SAN) to solve both problems. Because the density of haze and the directions of rain streaks are complex and non-uniform, SAN adopts channel-wise attention and spatial-channel attention to remove rain streaks and haze both globally and locally. To better capture the varied details of rain and haze, we propose a Selective Attention Module (SAM) that re-scales the channel-wise and spatial-channel attention instead of using simple element-wise summation. In addition, we conduct ablation studies to validate the effectiveness of each module of SAN. Extensive experimental results on synthetic and real-world datasets show that SAN performs favorably against state-of-the-art methods.
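The abstract names channel-wise attention without defining it; as a framework-free illustration of the general mechanism (not the paper's SAN/SAM architecture), the toy sketch below pools each channel to a scalar, passes it through a sigmoid gate, and rescales the channel — the per-channel gating weights are a hypothetical stand-in for learned parameters:

```python
import math

def channel_attention(feature_map, weights):
    """Channel-wise attention sketch: pool each channel, gate it, rescale.

    feature_map: list of channels, each a 2-D list of activations.
    weights: one toy gating parameter per channel (would be learned).
    """
    gates = []
    for channel, w in zip(feature_map, weights):
        pooled = sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        gates.append(1.0 / (1.0 + math.exp(-w * pooled)))   # sigmoid gate
    return [[[v * g for v in row] for row in channel]
            for channel, g in zip(feature_map, gates)]

fmap = [[[2.0, 2.0], [2.0, 2.0]]]      # one channel, 2x2 activations
out = channel_attention(fmap, [0.0])   # zero weight -> gate is exactly 0.5
```

A gate near 1 preserves a channel, a gate near 0 suppresses it — which is how such attention can emphasize rain- or haze-related channels.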
Citations: 8
Session details: Vision in Multimedia
Pub Date: 2019-12-15 DOI: 10.1145/3379197
H. Hang
Citations: 0
A Performance-Aware Selection Strategy for Cloud-based Video Services with Micro-Service Architecture
Pub Date: 2019-12-15 DOI: 10.1145/3338533.3366609
Zhengjun Xu, Haitao Zhang, Han Huang
The cloud micro-service architecture provides loosely coupled services and efficient virtual resources, making it a promising solution for large-scale video services. However, efficiently selecting the optimal services under a micro-service architecture is difficult, because the large number of micro-services leads to an exponential increase in the number of candidate service-selection solutions. In addition, the time sensitivity of video services increases the complexity of service selection, and the video data itself can affect the selection results. Current video service selection strategies are insufficient under a micro-service architecture because they do not comprehensively account for the resource fluctuation of service instances and the characteristics of video services. In this paper, we focus on the video service selection strategy under a micro-service architecture. First, we propose a QoS Prediction (QP) method using explicit factor analysis and linear regression; QP accurately predicts QoS values based on the features of the video data and service instances. Second, we propose a Performance-Aware Video Service Selection (PVSS) method: we prune the candidate services to reduce computational complexity and then efficiently select the optimal solution with the Fruit Fly Optimization (FFO) algorithm. Finally, we conduct extensive experiments to evaluate our strategy, and the results demonstrate its effectiveness.
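The QP method pairs explicit factor analysis with linear regression; as a minimal stand-in for the regression half alone, the sketch below fits a one-feature ordinary-least-squares model (the idea of predicting a QoS value such as latency from a workload feature is illustrative — the paper's actual features and model are not given here):

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b (closed form, one feature)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    b = my - a * mx
    return a, b

def predict(a, b, x):
    """Predicted QoS value for a new feature value x."""
    return a * x + b

# Toy data lying exactly on y = 2x + 1.
a, b = fit_linear([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

With multiple features the same closed form generalizes to the normal equations, but the one-feature case already shows the fit/predict split the abstract describes.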
Citations: 0
Active Perception Network for Salient Object Detection
Pub Date: 2019-12-15 DOI: 10.1145/3338533.3366580
Junhang Wei, Shuhui Wang, Liang Li, Qingming Huang
To obtain better saliency maps for salient object detection, recent methods fuse features from different levels of convolutional neural networks and have achieved remarkable progress. However, the differences between feature levels complicate the fusion process, which may lead to unsatisfactory saliency predictions. To address this issue, we propose the Active Perception Network (APN) to enhance inter-feature consistency for salient object detection. First, a Mutual Projection Module (MPM) is developed to fuse different features: it uses high-level features as guidance to extract complementary components from low-level features, suppressing background noise and improving semantic consistency. A Self Projection Module (SPM), which can be regarded as an extended residual connection, further refines the fused features; features passed through SPM produce more accurate saliency maps. Finally, we propose a Head Projection Module (HPM) to aggregate global information, which brings strong semantic consistency to the whole network. Comprehensive experiments on five benchmark datasets demonstrate that the proposed method outperforms state-of-the-art approaches on different evaluation metrics.
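The abstract describes SPM as an extended residual connection; the sketch below shows only the plain residual pattern it extends — output = input + refinement(input) — with a hypothetical toy refinement function, not the paper's module:

```python
def residual_refine(features, refine):
    """Residual refinement: output = input + refinement(input)."""
    return [x + r for x, r in zip(features, refine(features))]

# Toy refinement (illustrative only): pull each value toward the mean.
def toward_mean(features):
    m = sum(features) / len(features)
    return [0.5 * (m - x) for x in features]

refined = residual_refine([0.0, 2.0], toward_mean)
```

Because the input is passed through unchanged and only a correction is added, the refinement stage can never make the features worse than doing nothing — the property that makes residual connections attractive for feature refinement.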
Citations: 0
Video Summarization based on Sparse Subspace Clustering with Automatically Estimated Number of Clusters
Pub Date: 2019-12-15 DOI: 10.1145/3338533.3366593
Pengyi Hao, Edwin Manhando, Taotao Ye, Cong Bai
Advances in technology have sharply increased the number of digital cameras at people's disposal all across the world. Consequently, the huge storage space consumed by videos from these devices makes video processing and analysis time-consuming, and also slows down video browsing and retrieval. Video summarization plays a crucial role in solving these issues. Across the many video summarization approaches proposed to date, the goal is to condense a long video into a short video skim without losing the meaning or message of the original. This is done by selecting the important frames, called key-frames. The approach proposed in this work automatically summarizes digital videos based on the deep features of detected objects. To this end, we apply sparse subspace clustering, with an automatically estimated number of clusters, to the objects' deep features. The generated summary stores the meta-data for each short video inferred from the clustering results. In this paper, we also contribute a new video dataset for video summarization. We evaluate our work on the TVSum dataset and on our own video summarization dataset.
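The paper's estimator for the number of clusters is not described in the abstract; one simple way to make the point concrete — an assumption, not the paper's sparse-subspace method — is to count the connected components of the graph whose edges are affinities above a threshold:

```python
def estimate_clusters(affinity, threshold=0.5):
    """Estimate the number of clusters as the number of connected
    components of the graph with edges where affinity >= threshold."""
    n = len(affinity)
    parent = list(range(n))

    def find(i):
        # Union-find with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if affinity[i][j] >= threshold:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

# Two clear blocks of mutually similar items -> two clusters.
affinity = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
```

In sparse subspace clustering the affinity matrix would come from the self-expressive coefficients; here it is a hand-written toy so the counting step stands alone.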
Citations: 2
Session details: Poster Session
Pub Date: 2019-12-15 DOI: 10.1145/3379191
Ting Gan
Citations: 0
Weakly Supervised Video Summarization by Hierarchical Reinforcement Learning
Pub Date: 2019-12-15 DOI: 10.1145/3338533.3366583
Yiyan Chen, Li Tao, Xueting Wang, T. Yamasaki
Conventional video summarization approaches based on reinforcement learning suffer from the problem that the reward is only received after the whole summary has been generated. Such a reward is sparse and makes reinforcement learning hard to converge. Another problem is that labelling each shot is tedious and costly, which usually prohibits the construction of large-scale datasets. To solve these problems, we propose a weakly supervised hierarchical reinforcement learning framework that decomposes the whole task into several subtasks to enhance summarization quality. The framework consists of a manager network and a worker network. For each subtask, the manager is trained to set a subgoal using only a task-level binary label, which requires far fewer labels than conventional approaches. Guided by the subgoal, the worker predicts importance scores for the video shots in the subtask via policy gradient, using both the global reward and newly defined sub-rewards to overcome the sparsity problem. Experiments on two benchmark datasets show that our proposal achieves the best performance, even surpassing supervised approaches.
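The worker's policy-gradient update is not spelled out in the abstract; the following is a deliberately tiny REINFORCE sketch on a single binary "select this shot or not" decision — a toy reward and Bernoulli policy of my own, not the paper's manager/worker networks — just to show the update rule the approach relies on:

```python
import math
import random

def train_shot_selector(rewards, steps=2000, lr=0.1, seed=0):
    """REINFORCE on one binary choice: select (1) or skip (0) a shot.

    rewards: (reward_for_skip, reward_for_select).
    Returns the learned probability of selecting the shot.
    """
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + math.exp(-theta))       # Bernoulli policy
        action = 1 if rng.random() < p else 0
        reward = rewards[action]
        # Gradient of log pi(action) w.r.t. theta for a Bernoulli policy.
        grad = (1.0 - p) if action == 1 else -p
        theta += lr * reward * grad              # REINFORCE update
    return 1.0 / (1.0 + math.exp(-theta))

p_select = train_shot_selector((0.0, 1.0))  # reward only for selecting
```

When selecting the shot is the only rewarded action, the selection probability climbs toward 1; the hierarchical framework's contribution is to supply denser sub-rewards so such updates arrive before the whole summary is finished.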
Citations: 39
Deep Feature Interaction Embedding for Pair Matching Prediction
Pub Date: 2019-12-15 DOI: 10.1145/3338533.3366597
Luwei Zhang, Xueting Wang, T. Yamasaki
Online dating services have become popular in modern society. Predicting pair matches between two users of these services can efficiently increase the chance of their finding life partners. Deep learning methods with automatic feature-interaction functions, such as Factorization Machines (FM) and the cross network of the Deep & Cross Network (DCN), can model sparse categorical features and are effective for many recommendation tasks in web applications. To solve the partner recommendation task, we improve these FM-based deep models and DCN by enhancing the representation of feature-interaction embeddings and by proposing a novel interaction-layer design that avoids information loss. Through experiments on two real-world datasets from two online dating companies, we demonstrate the superior performance of our designs.
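For readers unfamiliar with the FM feature-interaction term the abstract builds on, here is its standard second-order form (this is the classic FM formula, not the paper's improved embedding): the pairwise sum over factor dot products, computed with the well-known O(k·n) identity instead of the naive O(n²) loop:

```python
def fm_pairwise(x, V):
    """Second-order FM term: sum over i<j of <v_i, v_j> * x_i * x_j,
    via 0.5 * sum_f ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)."""
    k = len(V[0])              # number of latent factors
    total = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        total += s * s - sq
    return 0.5 * total

# Toy feature vector and latent factor matrix (3 features, k=2 factors).
x = [1.0, 2.0, 3.0]
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
interaction = fm_pairwise(x, V)
```

For the toy values above, the pairwise dot products give 0·1·2 + 1·1·3 + 1·2·3 = 9, which the fast identity reproduces exactly.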
Citations: 1