ACM Multimedia Asia: Latest Publications

FQM-GC: Full-reference Quality Metric for Colored Point Cloud Based on Graph Signal Features and Color Features
Pub Date : 2021-12-01 DOI: 10.1145/3469877.3490578
Ke-Xin Zhang, G. Jiang, Mei Yu
Colored point clouds (CPCs) are often distorted during acquisition, processing, and compression, so reliable quality assessment metrics are needed to estimate how that distortion is perceived. We propose a Full-reference Quality Metric for colored point clouds based on Graph signal features and Color features (FQM-GC). For geometric distortion, the normal and coordinate information of the sub-clouds obtained via geometric segmentation is used to construct their underlying graphs, from which geometric structure features are extracted. For color distortion, the corresponding color statistical features are extracted from regions divided by color attribute. The color features of different regions are weighted to simulate the visual masking effect. Finally, all extracted features are combined into a feature vector to estimate the quality of CPCs. Experimental results on three databases (CPCD2.0, IRPC, and SJTU-PCQA) show that the proposed metric FQM-GC is more consistent with human visual perception.
Citations: 2
Automatically Generate Rigged Character from Single Image
Pub Date : 2021-12-01 DOI: 10.1145/3469877.3490565
Zhanpeng Huang, Rui Han, Jianwen Huang, Hao Yin, Zipeng Qin, Zibin Wang
Animation plays an important role in virtual reality and augmented reality applications. However, creating animation assets requires great effort from non-professional users. In this paper, we propose a systematic pipeline that generates ready-to-use characters from images for real-time animation without user intervention. Rather than per-pixel mapping or synthesis in image space using optical flow or generative models, we employ an approximate geometric embodiment to perform 3D animation without large distortion. The geometry structure is generated from a type-agnostic character. Skeleton adaptation is then applied to guarantee semantic motion transfer to the geometry proxy. The generated character is compatible with standard 3D graphics engines and ready to use in real-time applications. Experiments show that our method works on various images (e.g., sketches, cartoons, and photos) of most object categories (e.g., humans, animals, and non-creatures). We develop an AR demo to show its potential use for fast prototyping.
Citations: 0
An Efficient Bus Crowdedness Classification System
Pub Date : 2021-12-01 DOI: 10.1145/3469877.3493587
Lingcan Meng, Xiushan Nie, Zhifang Tan
We propose an efficient bus crowdedness classification system that can be used in daily life. In particular, we analyze data collected from real buses, aiming to address the difficulty of bus congestion classification. We combine deep learning and computer vision technology to extract crowd information from images or videos captured by the bus's internal surveillance cameras. This crowd information is then integrated with the algorithms into a complete classification system. When the user enters the system and submits an image or video for detection, the system displays the classification results in turn. The classification results include the passenger density distribution, the number of passengers, the date, and the algorithm running time. In addition, the user can use the mouse to delineate an area in the passenger density distribution map and count the passengers in any image area.
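The region-counting step described above can be illustrated with a minimal sketch. It assumes the passenger density distribution is a 2D density map whose values sum to the passenger count, a common convention in crowd counting that the abstract does not confirm. The function names and the crowdedness thresholds are hypothetical.

```python
def count_in_region(density, top, left, bottom, right):
    """Sum a density map over rows [top, bottom) and columns [left, right).

    Assumption: each cell holds a fractional head count, so the sum over a
    region approximates the number of passengers in that region.
    """
    return sum(sum(row[left:right]) for row in density[top:bottom])


def crowdedness_level(count, thresholds=(5, 15)):
    """Map a passenger count to a label; the thresholds are illustrative only."""
    low, high = thresholds
    if count < low:
        return "low"
    if count < high:
        return "medium"
    return "high"


# A tiny 2x3 density map standing in for a real model output.
density_map = [
    [0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0],
]
region_count = count_in_region(density_map, 0, 1, 2, 3)
print(region_count, crowdedness_level(region_count))
```

A mouse-drawn rectangle would supply the four bounds; a free-form selection would need a per-pixel mask instead of slicing.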
Citations: 0
A Fine-Grained River Ice Semantic Segmentation based on Attentive Features and Enhancing Feature Fusion
Pub Date : 2021-12-01 DOI: 10.1145/3469877.3497698
Rui Wang, Chengyu Zheng, Yanru Jiang, Zhao-Hui Wang, Min Ye, Chenglong Wang, Ning Song, Jie Nie
The semantic segmentation of frazil ice and anchor ice is of great significance for river management, ship navigation, and ice hazard forecasting in cold regions. In particular, distinguishing frazil ice from sediment-carrying anchor ice can increase the accuracy of estimates of a river's sediment transport capacity. Although river ice semantic segmentation methods based on deep learning have achieved high prediction accuracy, they still suffer from insufficient feature extraction. To address this problem, we propose Fine-Grained River Ice Semantic Segmentation (FGRIS), based on attentive features and enhancing feature fusion. First, we propose a Dual-Attention Mechanism (DAM), which combines channel attention features and position attention features to extract more comprehensive semantic features. Then, we propose a novel Branch Feature Fusion (BFF) module that bridges the semantic gap between high-level and low-level semantic features and is robust to different scales. Experimental results on the Alberta River Ice Segmentation Dataset demonstrate the superiority of the proposed method.
Citations: 0
Impression of a Job Interview training agent that gives rationalized feedback: Should Virtual Agent Give Advice with Rationale?
Pub Date : 2021-12-01 DOI: 10.1145/3469877.3493598
Nao Takeuchi, Tomoko Koda
The COVID-19 pandemic has had a significant socio-economic impact on the world. Specifically, social distancing has affected many activities that were previously conducted face-to-face. One of these is the training that students receive for job interviews. We therefore developed a job interview training system that allows students to continue receiving this type of training. Our system recognizes the nonverbal behaviors of an interviewee, namely gaze, facial expression, and posture, and compares the recognition results with models of exemplary nonverbal interviewee behavior. A virtual agent acting as an advisor gives feedback on the interviewee's behaviors that need improvement. To verify the effectiveness of two kinds of feedback, namely rationalized feedback (with quantitative recognition results) and non-rationalized feedback, we compared interviewees' impressions. The results of the evaluation experiment indicated that the virtual agent giving rationalized feedback was rated as more reliable but less friendly than the agent giving non-rationalized feedback.
Citations: 2
Spherical Image Compression Using Spherical Wavelet Transform
Pub Date : 2021-12-01 DOI: 10.1145/3469877.3490577
Huan Wang, Yunhui Shi, Jin Wang, Gang Wu, N. Ling, Baocai Yin
The Spherical Measure Based Spherical Image Representation (SMSIR) has nearly uniformly distributed pixels in the spherical domain with effective index schemes. Based on SMSIR, a spherical wavelet transform can be designed efficiently; it captures spherical geometry features in a compact manner and provides a powerful tool for spherical image compression. In this paper, we propose an efficient compression scheme for SMSIR images named Spherical Set Partitioning in Hierarchical Trees (S-SPIHT). It uses the spherical wavelet transform and exploits the inherent similarities across the subbands in the spherical wavelet decomposition of an SMSIR image. The proposed S-SPIHT progressively transforms spherical wavelet coefficients into a bit-stream and generates an embedded compressed bit-stream that can be efficiently decoded at several spherical image quality levels. The most crucial part of S-SPIHT is the redesigned scanning of the wavelet coefficients for the different index schemes. We design three scanning methods, namely ordered root tree index scanning (ORTIS), dyadic index progressive scanning (DIPS), and dyadic index cross scanning (DICS), to efficiently reorganize the wavelet coefficients. These methods effectively exploit the self-similarity between sub-bands and the fact that the high-frequency sub-bands mostly contain insignificant coefficients. Experimental results on widely used datasets demonstrate that the proposed S-SPIHT outperforms straightforward SPIHT for SMSIR images in terms of PSNR, S-PSNR, and SSIM.
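The notion of "insignificant coefficients" comes from the significance test of classic SPIHT, which S-SPIHT builds on. The sketch below shows only that generic test on a flat list of coefficients. It does not reproduce the ORTIS, DIPS, or DICS scanning orders, which the abstract does not specify, and the function names are illustrative.

```python
import math


def initial_bitplane(coeffs):
    """Return the highest bit plane n such that 2**n <= max |c|."""
    peak = max(abs(c) for c in coeffs)
    if peak < 1:
        raise ValueError("all coefficients are below the lowest bit plane")
    return int(math.floor(math.log2(peak)))


def is_significant(coeffs, n):
    """A set is significant at bit plane n if any |c| reaches 2**n."""
    return any(abs(c) >= 2 ** n for c in coeffs)


# Hypothetical coefficients: one low-frequency set, one high-frequency set.
low_band = [34, -20]
high_band = [3, -1]
n = initial_bitplane(low_band + high_band)
print(n, is_significant(low_band, n), is_significant(high_band, n))
```

Because an entire insignificant set can be signalled with a single bit at each bit plane, scanning orders that group coefficients likely to be insignificant together tend to shorten the bit-stream.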
Citations: 0
Attention-based Dual-Branches Localization Network for Weakly Supervised Object Localization
Pub Date : 2021-12-01 DOI: 10.1145/3469877.3490568
Wenjun Hui, Chuangchuang Tan, Guanghua Gu
Weakly supervised object localization exploits the last convolutional feature maps of a classification model and the weights of its Fully-Connected (FC) layer to achieve localization. However, the high-level feature maps used for localization lack edge features. Additionally, the weights are specific to the classification task, so only discriminative regions are discovered. To fuse edge features and adjust the attention distribution over feature map channels, we propose an efficient method called the Attention-based Dual-Branches Localization (ADBL) Network, in which a dual-branches structure and an attention mechanism are adopted to mine edge features and non-discriminative features and thereby locate more of the target area. Specifically, the dual-branches structure cascades low-level feature maps to mine the edge regions of the target object. During the inference stage, the attention mechanism assigns appropriate attention to different features to preserve non-discriminative areas. Extensive experiments on both the ILSVRC and CUB-200-2011 datasets show that the ADBL method achieves substantial performance improvements.
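The baseline described in the first sentence, combining the last convolutional feature maps with FC-layer weights, is commonly known as a class activation map (CAM). The sketch below shows that baseline only, in plain Python on nested lists. It is not the ADBL network, and the feature values, weights, and threshold are made up for illustration.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum over channels: cam[i][j] = sum_k w[k] * f[k][i][j]."""
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    return cam


def bounding_box(cam, threshold):
    """Smallest box (top, left, bottom, right), inclusive, covering cells >= threshold."""
    hits = [(i, j) for i, row in enumerate(cam)
            for j, value in enumerate(row) if value >= threshold]
    if not hits:
        return None
    rows = [i for i, _ in hits]
    cols = [j for _, j in hits]
    return (min(rows), min(cols), max(rows), max(cols))


# Two hypothetical 2x2 feature maps and the FC weights of one class.
features = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
weights = [1.0, 0.5]
cam = class_activation_map(features, weights)
print(cam, bounding_box(cam, 0.5))
```

Since the weights are trained for classification, the map tends to peak on the most discriminative parts of the object, which is the limitation the abstract sets out to address.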
Citations: 0
SangeetXML: An XML Format for Score Retrieval for Indic Music
Pub Date : 2021-12-01 DOI: 10.1145/3469877.3493697
Chandan Misra
Efficient retrieval of score information from a large set of XML-encoded scores and lyrics in an XML database requires the music data to be stored in a well-structured and systematic way. Current search engines for Indic music (Tagore songs in the present context) retrieve only metadata and lack score and lyric retrieval schemes. Because Indic music differs vastly from its western counterpart, an Indic music piece must be encoded differently from the XML formats used for western music, such as MusicXML. Such an encoding requires a proper understanding of the structure of the music sheet and a careful implementation in XML. In this paper, we propose an XML-based format, SangeetXML, for exchanging and retrieving Indic music information, derived from Swaralipi, a theoretical 2D matrix model. We implement SangeetXML by formatting a sample of Rabindra Sangeet (Tagore songs in English) compositions and highlight the feasibility of an easy and quick retrieval system based on SangeetXML through XQuery, the de facto standard for querying XML-encoded data.
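To make score retrieval from XML concrete, here is a small sketch under heavy assumptions. The element and attribute names (`collection`, `song`, `measure`, `note`, `swara`) are hypothetical and are not the SangeetXML schema, which the abstract does not show. The paper queries with XQuery; the example substitutes the limited XPath support in Python's standard `xml.etree.ElementTree` module as a stand-in.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; the real SangeetXML element names may differ.
SAMPLE = """
<collection>
  <song id="s1">
    <title>Example Song A</title>
    <measure number="1"><note swara="sa"/><note swara="re"/></measure>
  </song>
  <song id="s2">
    <title>Example Song B</title>
    <measure number="1"><note swara="ga"/><note swara="ma"/></measure>
  </song>
</collection>
"""


def songs_with_swara(xml_text, swara):
    """Return titles of songs containing at least one note with the given swara."""
    root = ET.fromstring(xml_text)
    titles = []
    for song in root.findall("song"):
        # ElementTree supports the [@attribute='value'] predicate.
        if song.find(f".//note[@swara='{swara}']") is not None:
            titles.append(song.findtext("title"))
    return titles


print(songs_with_swara(SAMPLE, "ga"))
```

A query on note content rather than on title or composer is the kind of score-level retrieval that metadata-only search engines cannot answer.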
Citations: 1
Prediction of Transcription Factor Binding Sites Using Deep Learning Combined with DNA Sequences and Shape Feature Data
Pub Date : 2021-12-01 DOI: 10.1145/3469877.3497696
Yangyang Li, Jie Liu, Hao Liu
Knowing transcription factor binding sites (TFBSs) is essential for modeling the underlying binding mechanisms and cellular functions. Studies have shown that, in addition to the DNA sequence, the shape of DNA is an important factor affecting its activity. Here, we developed a CNN model that integrates 3D DNA shape information derived using a high-throughput method to predict TFBSs. We identified the best-performing architectures by varying the CNN window size, kernels, hidden nodes, and hidden layers. The performance of the two types of data and their combination was evaluated using 69 different ChIP-seq [1] experiments. Our results showed that the model integrating shape and sequence information compared favorably to the sequence-based model. This work combines knowledge from structural biology and genomics, and DNA shape features improved the description of TF binding specificity.
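One common way to feed sequence and shape data to a CNN together is to one-hot encode each base and append per-position shape values as extra channels. The sketch below illustrates that input layout only. Whether the paper combines the two inputs this way is an assumption, and the shape values shown are placeholders rather than real measurements.

```python
BASES = "ACGT"


def one_hot(sequence):
    """Encode each base as a 4-element vector; unknown bases become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES]
            for base in sequence.upper()]


def combine_sequence_and_shape(sequence, shape_features):
    """Concatenate one-hot channels with per-position shape channels.

    shape_features holds one list of shape values per position (hypothetical
    here; in practice these could be quantities such as minor groove width).
    """
    encoded = one_hot(sequence)
    if len(encoded) != len(shape_features):
        raise ValueError("sequence and shape tracks must have the same length")
    return [seq_row + list(shape_row)
            for seq_row, shape_row in zip(encoded, shape_features)]


# Placeholder shape values, one feature per position.
matrix = combine_sequence_and_shape("ACGN", [[0.1], [0.2], [0.3], [0.4]])
for row in matrix:
    print(row)
```

The resulting length-by-channels matrix can be passed to a 1D convolution, and dropping the shape columns yields the sequence-only input used as the comparison.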
Citations: 0
Delay-sensitive and Priority-aware Transmission Control for Real-time Multimedia Communications
Pub Date : 2021-12-01 DOI: 10.1145/3469877.3493597
Ximing Wu, Lei Zhang, Yingfeng Wu, Haobin Zhou, Laizhong Cui
Today’s multimedia applications usually organize their content into data blocks with different deadlines and priorities. Meeting or missing the deadline of a data block can improve or hurt the user experience to a different degree, depending on the block. To optimize real-time multimedia communications, a transmission control scheme must make two challenging decisions under dynamic network conditions: the proper sending rate and the best data block to send next. In this paper, we propose a delay-sensitive and priority-aware transmission control scheme with two modules: rate control and block selection. The rate control module constantly monitors network conditions and adjusts the sending rate accordingly. The block selection module classifies blocks by whether they are estimated to be delivered before their deadlines, then ranks them by their effective priority scores. Extensive simulation results demonstrate the superiority of the proposed scheme over representative baseline approaches.
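The two modules described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define the effective priority score, the delivery-time estimate, or the rate-adjustment rule, so the `Block` fields, the `priority / slack` score, the `RTT / 2` one-way delay estimate, and the back-off constants below are all assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class Block:
    block_id: int
    size_bytes: int    # bytes still to be sent
    deadline_s: float  # absolute deadline, in seconds
    priority: int      # assumed convention: larger means more important


def adjust_rate(rate_bps, rtt_s, base_rtt_s, loss, min_rate_bps=100_000.0):
    """Hypothetical rate control: back off on loss or inflated RTT, else probe up."""
    if loss or rtt_s > 1.5 * base_rtt_s:
        return max(rate_bps * 0.8, min_rate_bps)
    return rate_bps * 1.05


def estimated_finish(block, now_s, rate_bps, rtt_s):
    """Assumed estimate: transmission time plus one-way delay (RTT / 2)."""
    return now_s + block.size_bytes * 8 / rate_bps + rtt_s / 2


def select_block(blocks, now_s, rate_bps, rtt_s):
    """Pick the next block to send, or None if there is nothing queued."""
    if not blocks:
        return None

    # Step 1: classify blocks by whether they are estimated to meet their deadline.
    on_time, late = [], []
    for b in blocks:
        if estimated_finish(b, now_s, rate_bps, rtt_s) <= b.deadline_s:
            on_time.append(b)
        else:
            late.append(b)

    # Step 2: rank by a hypothetical effective priority score that favors
    # important blocks whose deadlines are close.
    def score(b):
        slack = max(b.deadline_s - now_s, 1e-3)  # avoid division by zero
        return b.priority / slack

    # Prefer blocks that can still arrive in time; fall back to late ones.
    candidates = on_time if on_time else late
    return max(candidates, key=score)
```

In this sketch a sender would call `adjust_rate` whenever it receives new network feedback and `select_block` whenever it is ready to transmit. A real scheme would likely use a more careful bandwidth and delay estimator than the fixed thresholds assumed here.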
Citations: 0
Journal
ACM Multimedia Asia