A Colored Point Cloud (CPC) is often distorted during acquisition, processing, and compression, so reliable quality assessment metrics are required to estimate the perceived distortion of CPCs. We propose a Full-reference Quality Metric for colored point clouds based on Graph signal features and Color features (FQM-GC). For geometric distortion, the normal and coordinate information of the sub-clouds obtained via geometric segmentation is used to construct their underlying graphs, from which geometric structure features are extracted. For color distortion, the corresponding color statistical features are extracted from regions divided by color attributes. Meanwhile, the color features of the different regions are weighted to simulate the visual masking effect. Finally, all the extracted features are combined into a feature vector to estimate the quality of a CPC. Experimental results on three databases (CPCD2.0, IRPC, and SJTU-PCQA) show that the proposed FQM-GC is more consistent with human visual perception.
{"title":"FQM-GC: Full-reference Quality Metric for Colored Point Cloud Based on Graph Signal Features and Color Features","authors":"Ke-Xin Zhang, G. Jiang, Mei Yu","doi":"10.1145/3469877.3490578","DOIUrl":"https://doi.org/10.1145/3469877.3490578","url":null,"abstract":"A Colored Point Cloud (CPC) is often distorted during acquisition, processing, and compression, so reliable quality assessment metrics are required to estimate the perceived distortion of CPCs. We propose a Full-reference Quality Metric for colored point clouds based on Graph signal features and Color features (FQM-GC). For geometric distortion, the normal and coordinate information of the sub-clouds obtained via geometric segmentation is used to construct their underlying graphs, from which geometric structure features are extracted. For color distortion, the corresponding color statistical features are extracted from regions divided by color attributes. Meanwhile, the color features of the different regions are weighted to simulate the visual masking effect. Finally, all the extracted features are combined into a feature vector to estimate the quality of a CPC. Experimental results on three databases (CPCD2.0, IRPC, and SJTU-PCQA) show that the proposed FQM-GC is more consistent with human visual perception.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123766833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
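The FQM-GC abstract builds graphs over geometrically segmented sub-clouds. As a hedged illustration only (not the authors' exact construction), an underlying graph over a sub-cloud's coordinates can be built with a simple k-nearest-neighbour rule:

```python
import numpy as np

def knn_graph(points, k=2):
    """Build a k-nearest-neighbour adjacency matrix over a point cloud.

    points: (N, 3) array of XYZ coordinates.
    Returns a symmetric (N, N) 0/1 adjacency matrix without self-loops.
    """
    n = len(points)
    # Pairwise squared Euclidean distances.
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)          # exclude self-loops
    adj = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in np.argsort(d2[i])[:k]:   # k nearest neighbours of point i
            adj[i, j] = adj[j, i] = 1     # symmetrise
    return adj

# Toy sub-cloud: three nearby points and one outlier.
cloud = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [5, 5, 5]])
A = knn_graph(cloud, k=2)
```

Graph-signal features (e.g. smoothness of normals over this graph) could then be computed from `A`'s graph Laplacian; the paper's actual feature set is not detailed in the abstract.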
Animation plays an important role in virtual reality and augmented reality applications. However, creating animation assets requires great effort from non-professional users. In this paper, we propose a systematic pipeline that generates ready-to-use characters from images for real-time animation without user intervention. Rather than per-pixel mapping or synthesis in image space using optical flow or generative models, we employ an approximate geometric embodiment to perform 3D animation without large distortion. The geometry structure is generated from a type-agnostic character. A skeleton adaptation is then applied to guarantee semantic motion transfer to the geometry proxy. The generated character is compatible with standard 3D graphics engines and ready to use in real-time applications. Experiments show that our method works on various images (e.g., sketches, cartoons, and photos) of most object categories (e.g., humans, animals, and non-creatures). We develop an AR demo to show its potential for fast prototyping.
{"title":"Automatically Generate Rigged Character from Single Image","authors":"Zhanpeng Huang, Rui Han, Jianwen Huang, Hao Yin, Zipeng Qin, Zibin Wang","doi":"10.1145/3469877.3490565","DOIUrl":"https://doi.org/10.1145/3469877.3490565","url":null,"abstract":"Animation plays an important role in virtual reality and augmented reality applications. However, creating animation assets requires great effort from non-professional users. In this paper, we propose a systematic pipeline that generates ready-to-use characters from images for real-time animation without user intervention. Rather than per-pixel mapping or synthesis in image space using optical flow or generative models, we employ an approximate geometric embodiment to perform 3D animation without large distortion. The geometry structure is generated from a type-agnostic character. A skeleton adaptation is then applied to guarantee semantic motion transfer to the geometry proxy. The generated character is compatible with standard 3D graphics engines and ready to use in real-time applications. Experiments show that our method works on various images (e.g., sketches, cartoons, and photos) of most object categories (e.g., humans, animals, and non-creatures). We develop an AR demo to show its potential for fast prototyping.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124754355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose an efficient bus crowdedness classification system that can be used in daily life. In particular, we analyze and study data collected from real buses, aiming to address the difficulty of bus congestion classification. We combine deep learning and computer vision techniques to process images and videos from the buses' internal surveillance cameras. The crowd information is then integrated with our algorithms into a complete classification system. When a user enters the system and submits an image or video to be analyzed, the system displays the classification results in turn, including the passenger density distribution, the number of passengers, the date, and the algorithm's running time. In addition, the user can delineate an arbitrary area of the passenger density distribution map with the mouse and count the passengers within it.
{"title":"An Efficient Bus Crowdedness Classification System","authors":"Lingcan Meng, Xiushan Nie, Zhifang Tan","doi":"10.1145/3469877.3493587","DOIUrl":"https://doi.org/10.1145/3469877.3493587","url":null,"abstract":"We propose an efficient bus crowdedness classification system that can be used in daily life. In particular, we analyze and study data collected from real buses, aiming to address the difficulty of bus congestion classification. We combine deep learning and computer vision techniques to process images and videos from the buses' internal surveillance cameras. The crowd information is then integrated with our algorithms into a complete classification system. When a user enters the system and submits an image or video to be analyzed, the system displays the classification results in turn, including the passenger density distribution, the number of passengers, the date, and the algorithm's running time. In addition, the user can delineate an arbitrary area of the passenger density distribution map with the mouse and count the passengers within it.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127444519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
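The area-delineation feature in the abstract above amounts to integrating a predicted density map over a user-chosen rectangle. A minimal sketch (the density map here is a toy array, not the system's actual model output):

```python
import numpy as np

def count_region(density_map, top, left, bottom, right):
    """Estimate the passenger count inside a rectangular region by
    integrating the predicted density map over that region."""
    return float(density_map[top:bottom, left:right].sum())

# Toy density map: two 'passengers' concentrated in the left half.
dmap = np.zeros((4, 4))
dmap[1, 1] = 1.0
dmap[2, 0] = 1.0

left_half = count_region(dmap, 0, 0, 4, 2)   # region chosen by the user
total = count_region(dmap, 0, 0, 4, 4)       # whole frame
```

In a real crowd-counting pipeline the density map would come from a trained network and sum approximately (not exactly) to the head count.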
Rui Wang, Chengyu Zheng, Yanru Jiang, Zhao-Hui Wang, Min Ye, Chenglong Wang, Ning Song, Jie Nie
The semantic segmentation of frazil ice and anchor ice is of great significance for river management, ship navigation, and ice hazard forecasting in cold regions. In particular, distinguishing frazil ice from sediment-carrying anchor ice can increase the accuracy of estimates of a river's sediment transportation capacity. Although deep-learning-based river ice semantic segmentation methods have achieved high prediction accuracy, their feature extraction remains insufficient. To address this problem, we propose a Fine-Grained River Ice Semantic Segmentation (FGRIS) method based on attentive features and enhanced feature fusion. First, we propose a Dual-Attention Mechanism (DAM), which combines channel attention features and position attention features to extract more comprehensive semantic features. Then, we propose a novel Branch Feature Fusion (BFF) module to bridge the gap between high-level and low-level semantic features, which is robust to different scales. Experimental results on the Alberta River Ice Segmentation Dataset demonstrate the superiority of the proposed method.
{"title":"A Fine-Grained River Ice Semantic Segmentation based on Attentive Features and Enhancing Feature Fusion","authors":"Rui Wang, Chengyu Zheng, Yanru Jiang, Zhao-Hui Wang, Min Ye, Chenglong Wang, Ning Song, Jie Nie","doi":"10.1145/3469877.3497698","DOIUrl":"https://doi.org/10.1145/3469877.3497698","url":null,"abstract":"The semantic segmentation of frazil ice and anchor ice is of great significance for river management, ship navigation, and ice hazard forecasting in cold regions. In particular, distinguishing frazil ice from sediment-carrying anchor ice can increase the accuracy of estimates of a river's sediment transportation capacity. Although deep-learning-based river ice semantic segmentation methods have achieved high prediction accuracy, their feature extraction remains insufficient. To address this problem, we propose a Fine-Grained River Ice Semantic Segmentation (FGRIS) method based on attentive features and enhanced feature fusion. First, we propose a Dual-Attention Mechanism (DAM), which combines channel attention features and position attention features to extract more comprehensive semantic features. Then, we propose a novel Branch Feature Fusion (BFF) module to bridge the gap between high-level and low-level semantic features, which is robust to different scales. Experimental results on the Alberta River Ice Segmentation Dataset demonstrate the superiority of the proposed method.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128081916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
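The Dual-Attention Mechanism described in the abstract above combines position attention and channel attention over a feature map. A generic numpy sketch of that combination (shapes and naming are illustrative, not the paper's implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dual_attention(feat):
    """feat: (C, H, W) feature map. Returns the sum of position-attended
    and channel-attended features, as in a generic dual-attention design."""
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)                      # (C, N) with N = H*W
    # Position attention: each pixel attends to every pixel.
    pos = softmax(x.T @ x, axis=-1)                 # (N, N)
    out_p = (x @ pos.T).reshape(c, h, w)
    # Channel attention: each channel attends to every channel.
    ch = softmax(x @ x.T, axis=-1)                  # (C, C)
    out_c = (ch @ x).reshape(c, h, w)
    return out_p + out_c

feat = np.arange(12.0).reshape(3, 2, 2)
out = dual_attention(feat)
```

Real dual-attention modules add learned projections and a residual connection around each branch; this sketch keeps only the two affinity computations.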
The COVID-19 pandemic has had a significant socio-economic impact on the world. In particular, social distancing has affected many activities that were previously conducted face-to-face, one of which is the training students receive for job interviews. We therefore developed a job interview training system that allows students to continue receiving this type of training. Our system recognizes an interviewee's nonverbal behaviors, namely gaze, facial expression, and posture, and compares the recognition results with models of exemplary interviewee nonverbal behavior. A virtual agent acting as an advisor gives feedback on the behaviors that need improvement. To verify the effectiveness of two kinds of feedback, rationalized feedback (with quantitative recognition results) and non-rationalized feedback, we compared interviewees' impressions of the agent. The results of the evaluation experiment indicate that the virtual agent giving rationalized feedback was rated as more reliable, but less friendly, than the one giving non-rationalized feedback.
{"title":"Impression of a Job Interview training agent that gives rationalized feedback: Should Virtual Agent Give Advice with Rationale?","authors":"Nao Takeuchi, Tomoko Koda","doi":"10.1145/3469877.3493598","DOIUrl":"https://doi.org/10.1145/3469877.3493598","url":null,"abstract":"The COVID-19 pandemic has had a significant socio-economic impact on the world. In particular, social distancing has affected many activities that were previously conducted face-to-face, one of which is the training students receive for job interviews. We therefore developed a job interview training system that allows students to continue receiving this type of training. Our system recognizes an interviewee's nonverbal behaviors, namely gaze, facial expression, and posture, and compares the recognition results with models of exemplary interviewee nonverbal behavior. A virtual agent acting as an advisor gives feedback on the behaviors that need improvement. To verify the effectiveness of two kinds of feedback, rationalized feedback (with quantitative recognition results) and non-rationalized feedback, we compared interviewees' impressions of the agent. The results of the evaluation experiment indicate that the virtual agent giving rationalized feedback was rated as more reliable, but less friendly, than the one giving non-rationalized feedback.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126690738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Huan Wang, Yunhui Shi, Jin Wang, Gang Wu, N. Ling, Baocai Yin
The Spherical Measure Based Spherical Image Representation (SMSIR) has nearly uniformly distributed pixels in the spherical domain with effective index schemes. Based on SMSIR, a spherical wavelet transform can be efficiently designed that captures spherical geometry features in a compact manner and provides a powerful tool for spherical image compression. In this paper, we propose an efficient compression scheme for SMSIR images, named Spherical Set Partitioning in Hierarchical Trees (S-SPIHT), using the spherical wavelet transform; it exploits the inherent similarities across the subbands of the spherical wavelet decomposition of an SMSIR image. The proposed S-SPIHT progressively transforms spherical wavelet coefficients into an embedded compressed bit-stream that can be efficiently decoded at several spherical image quality levels. The most crucial part of S-SPIHT is the redesign of the scanning of wavelet coefficients for the different index schemes. We design three scanning methods, namely ordered root tree index scanning (ORTIS), dyadic index progressive scanning (DIPS), and dyadic index cross scanning (DICS), to efficiently reorganize the wavelet coefficients. These methods effectively exploit the self-similarity between subbands and the fact that the high-frequency subbands mostly contain insignificant coefficients. Experimental results on widely used datasets demonstrate that the proposed S-SPIHT outperforms a straightforward application of SPIHT to SMSIR images in terms of PSNR, S-PSNR, and SSIM.
{"title":"Spherical Image Compression Using Spherical Wavelet Transform","authors":"Huan Wang, Yunhui Shi, Jin Wang, Gang Wu, N. Ling, Baocai Yin","doi":"10.1145/3469877.3490577","DOIUrl":"https://doi.org/10.1145/3469877.3490577","url":null,"abstract":"The Spherical Measure Based Spherical Image Representation (SMSIR) has nearly uniformly distributed pixels in the spherical domain with effective index schemes. Based on SMSIR, a spherical wavelet transform can be efficiently designed that captures spherical geometry features in a compact manner and provides a powerful tool for spherical image compression. In this paper, we propose an efficient compression scheme for SMSIR images, named Spherical Set Partitioning in Hierarchical Trees (S-SPIHT), using the spherical wavelet transform; it exploits the inherent similarities across the subbands of the spherical wavelet decomposition of an SMSIR image. The proposed S-SPIHT progressively transforms spherical wavelet coefficients into an embedded compressed bit-stream that can be efficiently decoded at several spherical image quality levels. The most crucial part of S-SPIHT is the redesign of the scanning of wavelet coefficients for the different index schemes. We design three scanning methods, namely ordered root tree index scanning (ORTIS), dyadic index progressive scanning (DIPS), and dyadic index cross scanning (DICS), to efficiently reorganize the wavelet coefficients. These methods effectively exploit the self-similarity between subbands and the fact that the high-frequency subbands mostly contain insignificant coefficients. Experimental results on widely used datasets demonstrate that the proposed S-SPIHT outperforms a straightforward application of SPIHT to SMSIR images in terms of PSNR, S-PSNR, and SSIM.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130236949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
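SPIHT-style coders like the S-SPIHT above scan wavelet coefficients in a fixed order and, bit-plane by bit-plane, record which coefficients first become significant. A toy sketch of that significance test (scan order and coefficient values are illustrative; the actual S-SPIHT scans follow the ORTIS/DIPS/DICS index schemes):

```python
import numpy as np

def significance_passes(coeffs, planes=3):
    """For each bit-plane threshold 2^n, 2^(n-1), ..., list the indices of
    coefficients that first become significant at that threshold — the basic
    test a SPIHT-style coder performs while scanning in a fixed order."""
    c = np.abs(np.asarray(coeffs, float))
    n = int(np.floor(np.log2(c.max())))      # most significant bit-plane
    found = set()
    passes = []
    for p in range(planes):
        t = 2 ** (n - p)
        new = [i for i in range(len(c)) if c[i] >= t and i not in found]
        found.update(new)
        passes.append((t, new))
    return passes

# Toy wavelet coefficients, e.g. already in scan order.
passes = significance_passes([63, 34, -31, 15, 9, -7], planes=3)
```

The full algorithm additionally maintains lists of insignificant sets so that whole trees of insignificant descendants are coded with a single bit.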
Weakly supervised object localization exploits the last convolutional feature maps of a classification model and the weights of its fully connected (FC) layer to achieve localization. However, the high-level feature maps used for localization lack edge features, and the weights are specific to the classification task, causing only discriminative regions to be discovered. To fuse edge features and adjust the attention distribution over feature map channels, we propose an efficient method called the Attention-based Dual-Branches Localization (ADBL) Network, in which a dual-branch structure and an attention mechanism mine edge features and non-discriminative features to locate more of the target area. Specifically, the dual-branch structure cascades low-level feature maps to mine target object edge regions, while during the inference stage, the attention mechanism assigns appropriate attention to different features to preserve non-discriminative areas. Extensive experiments on the ILSVRC and CUB-200-2011 datasets show that ADBL achieves substantial performance improvements.
{"title":"Attention-based Dual-Branches Localization Network for Weakly Supervised Object Localization","authors":"Wenjun Hui, Chuangchuang Tan, Guanghua Gu","doi":"10.1145/3469877.3490568","DOIUrl":"https://doi.org/10.1145/3469877.3490568","url":null,"abstract":"Weakly supervised object localization exploits the last convolutional feature maps of a classification model and the weights of its fully connected (FC) layer to achieve localization. However, the high-level feature maps used for localization lack edge features, and the weights are specific to the classification task, causing only discriminative regions to be discovered. To fuse edge features and adjust the attention distribution over feature map channels, we propose an efficient method called the Attention-based Dual-Branches Localization (ADBL) Network, in which a dual-branch structure and an attention mechanism mine edge features and non-discriminative features to locate more of the target area. Specifically, the dual-branch structure cascades low-level feature maps to mine target object edge regions, while during the inference stage, the attention mechanism assigns appropriate attention to different features to preserve non-discriminative areas. Extensive experiments on the ILSVRC and CUB-200-2011 datasets show that ADBL achieves substantial performance improvements.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131001813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
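The first sentence of the abstract above refers to the standard class activation mapping (CAM) recipe that such methods start from: weight the last conv feature maps by the FC weights of the target class and sum over channels. A minimal sketch with toy data:

```python
import numpy as np

def class_activation_map(features, fc_weights, cls):
    """Standard CAM: combine the last conv feature maps (C, H, W) using the
    FC weights (num_classes, C) of the target class, yielding an (H, W) map."""
    return np.tensordot(fc_weights[cls], features, axes=1)

feats = np.arange(36.0).reshape(4, 3, 3)   # C=4 channels of 3x3 maps
w = np.ones((10, 4))                       # 10-class FC layer (toy weights)
w[2] = [1, 0, 0, 1]                        # class 2 uses channels 0 and 3
cam = class_activation_map(feats, w, cls=2)
```

ADBL's contribution, per the abstract, is to go beyond this baseline by fusing low-level edge features and re-weighting channels so the map covers more than the most discriminative region.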
Efficient retrieval of score information from a large set of XML-encoded scores and lyrics in an XML database requires the music data to be stored in a well-structured and systematic manner. Current search engines for Indic music (Tagore songs in the present context) retrieve only metadata and lack score and lyric retrieval schemes. Being vastly different from its western counterpart, an Indic music piece must be encoded differently from the XML formats used for western music, such as MusicXML. Such encoding requires a proper understanding of the structure of the music sheet and its careful implementation in XML. In this paper, we propose an XML-based format, SangeetXML, for exchanging and retrieving Indic music information based on a theoretical 2D matrix model, Swaralipi. We implement SangeetXML by formatting a sample of Rabindra Sangeet (Tagore songs) compositions and highlight the feasibility of an easy and quick retrieval system based on SangeetXML through XQuery, the de facto standard for querying XML-encoded data.
{"title":"SangeetXML: An XML Format for Score Retrieval for Indic Music","authors":"Chandan Misra","doi":"10.1145/3469877.3493697","DOIUrl":"https://doi.org/10.1145/3469877.3493697","url":null,"abstract":"Efficient retrieval of score information from a large set of XML-encoded scores and lyrics in an XML database requires the music data to be stored in a well-structured and systematic manner. Current search engines for Indic music (Tagore songs in the present context) retrieve only metadata and lack score and lyric retrieval schemes. Being vastly different from its western counterpart, an Indic music piece must be encoded differently from the XML formats used for western music, such as MusicXML. Such encoding requires a proper understanding of the structure of the music sheet and its careful implementation in XML. In this paper, we propose an XML-based format, SangeetXML, for exchanging and retrieving Indic music information based on a theoretical 2D matrix model, Swaralipi. We implement SangeetXML by formatting a sample of Rabindra Sangeet (Tagore songs) compositions and highlight the feasibility of an easy and quick retrieval system based on SangeetXML through XQuery, the de facto standard for querying XML-encoded data.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132379071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
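Since SangeetXML's schema is not given here, the fragment below is purely hypothetical (element and attribute names invented for illustration); it shows the kind of score retrieval the abstract describes, using Python's ElementTree XPath subset in place of XQuery:

```python
import xml.etree.ElementTree as ET

# Hypothetical score fragment -- NOT the actual SangeetXML schema.
doc = """
<song title="Anondoloke">
  <line n="1"><swara>sa</swara><swara>re</swara><swara>ga</swara></line>
  <line n="2"><swara>ga</swara><swara>ma</swara></line>
</song>
"""
root = ET.fromstring(doc)

# Retrieve every swara (note) in line 1. A rough XQuery equivalent would be
# doc("song.xml")//line[@n="1"]/swara
swaras = [s.text for s in root.findall("./line[@n='1']/swara")]
```

An XQuery engine over an XML database would run the same kind of path query across the whole song collection rather than a single in-memory document.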
Knowing transcription factor binding sites (TFBSs) is essential for modeling the underlying binding mechanisms and cellular functions. Studies have shown that in addition to the DNA sequence, the shape of DNA is also an important factor affecting its activity. Here, we developed a CNN model that integrates 3D DNA shape information, derived using a high-throughput method, for predicting TFBSs. We identify the best-performing architectures by varying the CNN window size, kernels, hidden nodes, and hidden layers. The performance of the two types of data and their combination was evaluated on 69 different ChIP-seq [1] experiments. Our results show that the model integrating shape and sequence information compares favorably to the sequence-based model. This work combines knowledge from structural biology and genomics; DNA shape features improved the description of TF binding specificity.
{"title":"Prediction of Transcription Factor Binding Sites Using Deep Learning Combined with DNA Sequences and Shape Feature Data","authors":"Yangyang Li, Jie Liu, Hao Liu","doi":"10.1145/3469877.3497696","DOIUrl":"https://doi.org/10.1145/3469877.3497696","url":null,"abstract":"Knowing transcription factor binding sites (TFBSs) is essential for modeling the underlying binding mechanisms and cellular functions. Studies have shown that in addition to the DNA sequence, the shape of DNA is also an important factor affecting its activity. Here, we developed a CNN model that integrates 3D DNA shape information, derived using a high-throughput method, for predicting TFBSs. We identify the best-performing architectures by varying the CNN window size, kernels, hidden nodes, and hidden layers. The performance of the two types of data and their combination was evaluated on 69 different ChIP-seq [1] experiments. Our results show that the model integrating shape and sequence information compares favorably to the sequence-based model. This work combines knowledge from structural biology and genomics; DNA shape features improved the description of TF binding specificity.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123887736","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
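Models of this kind typically stack DNA shape features alongside the one-hot sequence as extra input channels. A minimal numpy sketch of that input encoding and a single hand-set convolutional motif filter (the real model learns its kernels; the shape values here are made up):

```python
import numpy as np

BASES = "ACGT"

def encode(seq, shape_vals):
    """One-hot encode a DNA sequence and stack one per-base shape feature
    (e.g. minor groove width) as a fifth input channel -> (L, 5) array."""
    onehot = np.array([[b == base for base in BASES] for b in seq], float)
    return np.hstack([onehot, np.asarray(shape_vals, float)[:, None]])

def conv1d_scores(x, kernel):
    """Valid 1-D convolution of an (L, 5) input with a (k, 5) motif kernel."""
    k = len(kernel)
    return np.array([(x[i:i + k] * kernel).sum() for i in range(len(x) - k + 1)])

x = encode("ACGTA", [0.1, 0.2, 0.2, 0.1, 0.3])   # toy shape values
kernel = np.zeros((2, 5))
kernel[0, 0] = kernel[1, 1] = 1.0                # hand-set filter for the motif 'AC'
scores = conv1d_scores(x, kernel)                # peaks where 'AC' occurs
```

A trained CNN would learn many such kernels, including weights on the shape channel, and follow them with pooling and dense layers to classify bound vs. unbound windows.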
Ximing Wu, Lei Zhang, Yingfeng Wu, Haobin Zhou, Laizhong Cui
Today's multimedia applications usually organize content into data blocks with different deadlines and priorities. Meeting or missing the deadline of a given block can improve or hurt the user experience to different degrees. To optimize real-time multimedia communications, a transmission control scheme must make two challenging decisions under dynamic network conditions: the proper sending rate and the best data block to send. In this paper, we propose a delay-sensitive and priority-aware transmission control scheme with two modules: rate control and block selection. The rate control module constantly monitors the network condition and adjusts the sending rate accordingly. The block selection module classifies blocks by whether they are estimated to be delivered before their deadlines and ranks them according to their effective priority scores. Extensive simulation results demonstrate the superiority of the proposed scheme over representative baseline approaches.
{"title":"Delay-sensitive and Priority-aware Transmission Control for Real-time Multimedia Communications","authors":"Ximing Wu, Lei Zhang, Yingfeng Wu, Haobin Zhou, Laizhong Cui","doi":"10.1145/3469877.3493597","DOIUrl":"https://doi.org/10.1145/3469877.3493597","url":null,"abstract":"Today's multimedia applications usually organize content into data blocks with different deadlines and priorities. Meeting or missing the deadline of a given block can improve or hurt the user experience to different degrees. To optimize real-time multimedia communications, a transmission control scheme must make two challenging decisions under dynamic network conditions: the proper sending rate and the best data block to send. In this paper, we propose a delay-sensitive and priority-aware transmission control scheme with two modules: rate control and block selection. The rate control module constantly monitors the network condition and adjusts the sending rate accordingly. The block selection module classifies blocks by whether they are estimated to be delivered before their deadlines and ranks them according to their effective priority scores. Extensive simulation results demonstrate the superiority of the proposed scheme over representative baseline approaches.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126941681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
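The block selection step described above can be sketched as a deadline-feasibility filter followed by a priority ranking. A minimal illustration (the field names and the scoring rule are assumptions, not the paper's exact scheme):

```python
def select_block(blocks, now, bandwidth):
    """Pick the best block to send: drop blocks whose estimated delivery
    time (size / bandwidth) would miss the deadline, then take the highest
    priority, breaking ties by the earlier deadline."""
    feasible = [b for b in blocks
                if now + b["size"] / bandwidth <= b["deadline"]]
    if not feasible:
        return None
    return max(feasible, key=lambda b: (b["priority"], -b["deadline"]))

blocks = [
    {"id": 1, "size": 1000, "deadline": 0.5, "priority": 2},
    {"id": 2, "size": 8000, "deadline": 0.2, "priority": 3},  # would miss
    {"id": 3, "size": 1000, "deadline": 1.0, "priority": 1},
]
best = select_block(blocks, now=0.0, bandwidth=10000)  # 10 kB/s link
```

Note that the highest-priority block (id 2) is skipped because it cannot arrive in time, so sending it would waste bandwidth without helping the user experience.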