Pub Date: 2006-12-26 DOI: 10.1109/ICME.2006.262596
S. Basu
We present a method for acoustic echo cancellation (AEC) in a channel with rapidly varying gain and thus a rapidly varying channel characteristic, a situation in which standard AEC approaches perform poorly. Our method learns a scale-free channel characteristic ($\tilde{H}$), applies it to a windowed version of the signal, and removes the projection of the transformed signal from the output signal. We also develop a "ramp projection" method that allows the gain to vary linearly within the window. In a telephony application, the simple projection improves on conventional AEC by 3 dB to more than 8 dB, and the ramp projection adds a further 1 dB.
Title: Acoustic Echo Cancellation in a Channel with Rapidly Varying Gain. Published in: 2006 IEEE International Conference on Multimedia and Expo.
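The projection step described in the abstract above can be illustrated with a minimal sketch. It assumes the transformed far-end signal for one window (here called `x_hat`, a hypothetical name) has already been obtained by applying the learned characteristic; how that characteristic is learned is not covered. The simple projection fits one gain per window, and the ramp variant fits a gain of the form `g0 + g1 * n` by least squares. The function names and the 2x2 normal-equation solver are illustrative choices, not the paper's implementation.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def remove_projection(y, x_hat):
    """Subtract from y its least-squares projection onto x_hat.

    Models the echo in this window as g * x_hat with a single unknown gain g.
    Returns (residual, g).
    """
    energy = dot(x_hat, x_hat)
    if energy == 0.0:
        return list(y), 0.0  # nothing to project onto
    g = dot(y, x_hat) / energy
    return [yi - g * xi for yi, xi in zip(y, x_hat)], g


def remove_ramp_projection(y, x_hat):
    """Subtract the best fit of (g0 + g1 * n) * x_hat[n] from y.

    Equivalent to projecting y onto the span of x_hat and n * x_hat,
    solved here through the 2x2 normal equations (an assumed formulation).
    Returns (residual, (g0, g1)).
    """
    ramp = [n * xi for n, xi in enumerate(x_hat)]
    a11, a12, a22 = dot(x_hat, x_hat), dot(x_hat, ramp), dot(ramp, ramp)
    b1, b2 = dot(y, x_hat), dot(y, ramp)
    det = a11 * a22 - a12 * a12
    if det <= 1e-12 * a11 * a22:
        # Basis vectors are (nearly) dependent: fall back to a constant gain.
        residual, g = remove_projection(y, x_hat)
        return residual, (g, 0.0)
    g0 = (b1 * a22 - b2 * a12) / det
    g1 = (a11 * b2 - a12 * b1) / det
    residual = [yi - (g0 + g1 * n) * xi
                for n, (yi, xi) in enumerate(zip(y, x_hat))]
    return residual, (g0, g1)
```

When the echo gain drifts linearly across the window, the single-gain projection leaves a residual that the ramp fit can remove, which is consistent with the extra improvement the abstract reports for the ramp projection.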
Pub Date: 2006-07-09 DOI: 10.1109/ICME.2006.262405
G. Lee, Ming-Jiun Wang, He-Yuan Lin, D. W. Su, Bo-Yun Lin
This paper presents a new spatio-temporal motion estimation algorithm for video coding. The algorithm is based on optimization theory and combines three strategies: 3D spatio-temporal motion vector prediction, a modified one-at-a-time search scheme, and multiple update paths. Simulation results indicate that our algorithm outperforms other recently proposed algorithms under the same computational budget and comes very close to full search. Its low cost and regular demand for computational resources make it suitable for VLSI implementation and make a single-chip solution for high-definition coding feasible.
Title: A 3D Spatio-Temporal Motion Estimation Algorithm for Video Coding
Pub Date: 2006-07-09 DOI: 10.1109/ICME.2006.262841
M. Al-khassaweneh, Selin Aviyente
In this paper, a new watermarking scheme in the joint time-frequency domain is introduced. The Wigner distribution is used to transform an image into the spatial-spectral domain. The proposed method selects the time-frequency cells to be watermarked based on the particular image's energy distribution in the joint domain. This approach ensures the imperceptibility of the embedded watermark. It is shown that embedding in the time-frequency domain is equivalent to a nonlinear embedding function in the spatial domain. A corresponding watermark detection algorithm is also introduced. The performance of the proposed watermarking algorithm is illustrated under possible attacks such as noise, re-sampling, rotation, filtering, and JPEG compression.
Title: Robust Watermarking in the Wigner Domain
Pub Date: 2006-07-09 DOI: 10.1109/ICME.2006.262488
Haifeng Zheng, Congchong Ru, Lun Yu, C. Chen
MIMO-OFDM is a promising technique for broadband wireless communication systems. In this paper, we propose a novel scheme that integrates multiple description coding (MDC), error-resilient video coding, and an unequal error protection strategy with various space-time codes for robust video transmission over a MIMO-OFDM system. The proposed MDC coder generates multiple bitstreams of equal importance, which are well suited to multiple-antenna systems. Furthermore, we apply unequal error protection to each video bitstream using BLAST and STBC space-time codes, according to its contribution to the reconstructed video quality. Experimental results demonstrate that the proposed scheme achieves the desired tradeoff between reconstructed video quality and transmission efficiency.
Title: Robust Video Transmission Over MIMO-OFDM System using MDC and Space Time Codes
Pub Date: 2006-07-09 DOI: 10.1109/ICME.2006.262912
D. Jurca, P. Frossard
We address the problem of delay-constrained streaming of multimedia packets over dynamic bandwidth channels. Efficient streaming solutions generally rely on knowledge of the channel bandwidth in order to select the media packets to be transmitted according to their sending time. However, the streaming server usually cannot have perfect knowledge of the channel bandwidth, and important packets may be lost because of over-estimation. We address the rate prediction mismatch by scheduling media with a conservative delay, which provides a safety margin for packet delivery even in the presence of unpredicted bandwidth variations. We formulate an optimization problem whose goal is to find the optimal conservative delay to be used in the scheduling process, given the network model and the playback delay imposed by the client. We then propose a simple solution to the scheduling delay estimation that is effective in real-time streaming scenarios. Our streaming method proves robust against channel prediction errors and performs better than other mechanisms based on frame reordering strategies.
Title: Media Streaming with Conservative Delay on Variable Rate Channels
Pub Date: 2006-07-09 DOI: 10.1109/ICME.2006.262412
Hung-Ming Wang, Ji-Kun Lin, J. Yang
In H.264 advanced video coding (AVC), variable block size motion estimation plays an important role in the compression of inter frames. In this paper, we propose a fast inter prediction algorithm based on hierarchical homogeneity detection and cost analysis to select the best mode effectively. For each macroblock, we first detect whether the macroblock is spatially homogeneous. For a macroblock that is not spatially homogeneous, we then perform 16×16 motion estimation and examine whether the 16×16 block is temporally homogeneous. Once a homogeneous macroblock is detected in this process, 16×16 is chosen as the best mode. For a non-homogeneous macroblock, we then execute 8×8 motion estimation and analyze the costs of the 8×8 and 16×16 modes to decide whether the best inter mode should be 16×16 or another mode. The process for searching the best 8×8 block sub-type is similar to the process for macroblocks. Finally, the best inter mode is decided by selecting the candidate mode with the least cost. Experimental results show that our proposed algorithm can save about 32% to 54% of computation time without introducing any noticeable performance degradation.
Title: Fast Inter Mode Decision Based on Hierarchical Homogeneous Detection and Cost Analysis for H.264/AVC Coders
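The early-termination flow in the abstract above can be sketched as follows. This is only an outline under stated assumptions: the homogeneity tests and the cost function are passed in as hypothetical inputs (`spatially_homogeneous`, `temporally_homogeneous`, `rd_cost`), because the abstract does not define them, and the sketch compares only the 16×16 and 8×8 modes. The paper's cost analysis over other partition modes and the 8×8 sub-type search are not reproduced.

```python
def choose_inter_mode(spatially_homogeneous, temporally_homogeneous, rd_cost):
    """Pick an inter mode with early termination.

    spatially_homogeneous: bool, result of an (assumed) spatial test.
    temporally_homogeneous: zero-argument callable, evaluated only after
        16x16 motion estimation has run (assumed interface).
    rd_cost: callable mapping a mode name to its cost; calling it stands in
        for running motion estimation for that mode (hypothetical helper).

    Returns (best_mode, evaluated), where evaluated maps each mode that was
    actually searched to its cost.
    """
    evaluated = {}

    # Step 1: a spatially homogeneous macroblock takes 16x16 directly.
    if spatially_homogeneous:
        return "16x16", evaluated

    # Step 2: run 16x16 motion estimation, then test temporal homogeneity.
    evaluated["16x16"] = rd_cost("16x16")
    if temporally_homogeneous():
        return "16x16", evaluated

    # Step 3: non-homogeneous macroblock, so also evaluate 8x8.
    evaluated["8x8"] = rd_cost("8x8")

    # Step 4: least cost among the candidates searched (ties keep 16x16).
    best_mode = min(evaluated, key=evaluated.get)
    return best_mode, evaluated
```

The size of `evaluated` shows where the savings would come from: homogeneous macroblocks exit after zero or one motion searches instead of searching every partition size.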
Pub Date: 2006-07-09 DOI: 10.1109/ICME.2006.262775
Yang Li, H. Chan
Communication technologies, old and new, are driving the development of the telecommunication industry. As a result, performing a multimodal service session (e.g. one end uses data while the other end uses multimedia) is no longer a technical problem. Interest is growing in managing these multimodal services according to the choices and preferences of users. This research introduces a four-option decision mechanism that processes a multimodal service intelligently and may ensure a successful and friendly communication session, for two reasons. First, the mechanism provides four extra options for a service that may fail in current communication systems; after three extra tries, the service session will most probably succeed. Second, users make the decisions themselves by setting, in advance, the rules for how decisions are made.
Title: A Decision Mechanism for Processing Multimodal Services in Future Generation Network
Pub Date: 2006-07-09 DOI: 10.1109/ICME.2006.262948
Hua-Fu Li, Chin-Chuan Ho, M. Shan, Suh-Yin Lee
Mining multimedia data is one of the most important issues in data mining. In this paper, we propose an online one-pass algorithm that mines the set of frequent temporal patterns in online music query streams over a sliding window. An effective bit-sequence representation is used to reduce the processing time and memory needed to slide the window. Experiments show that the proposed algorithm needs only half the memory required by the original music query data and scans the data only once.
Title: Online Mining of Recent Music Query Streams
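To illustrate why a bit-sequence representation makes window sliding cheap, here is a minimal sketch. It assumes each query is a set of items (for example, note or chord symbols) and keeps one integer per item whose bits mark the window positions where that item occurred. The class name `BitWindow` and its methods are hypothetical, and the sketch counts co-occurrence support only; it does not model the temporal ordering of the patterns mined in the paper.

```python
class BitWindow:
    """Sliding window over the most recent `width` queries, one bit-sequence per item."""

    def __init__(self, width):
        self.width = width
        self.mask = (1 << width) - 1   # keeps only the last `width` positions
        self.bits = {}                 # item -> int; bit 0 is the newest query

    def push(self, query):
        """Slide the window by one query in a single pass over the stored items."""
        for item in list(self.bits):
            shifted = (self.bits[item] << 1) & self.mask  # oldest bit falls off
            if shifted:
                self.bits[item] = shifted
            else:
                del self.bits[item]    # item no longer occurs in the window
        for item in set(query):
            self.bits[item] = self.bits.get(item, 0) | 1

    def support(self, items):
        """Number of queries in the window that contain every item in `items`."""
        items = list(items)
        if not items:
            return 0
        acc = self.mask
        for item in items:
            acc &= self.bits.get(item, 0)  # bitwise AND intersects occurrences
        return bin(acc).count("1")
```

Sliding costs one shift-and-mask per stored item, and the support of a candidate pattern is a bitwise AND followed by a bit count, so the raw queries never need to be rescanned.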
Pub Date: 2006-07-09 DOI: 10.1109/ICME.2006.262885
Wei-Yang Lin, Kin-Chung Wong, Y. Hu, N. Boston
In this paper, we develop a family of 2D and 3D invariant features with applications to 3D human face recognition. The main contributions of this paper are: (a) systematically deriving a family of novel features, called summation invariants, that are invariant to Euclidean transformation in both 2D and 3D; and (b) developing an effective method for applying summation invariants to the 3D face recognition problem. Tested on 3D data from the Face Recognition Grand Challenge v1.0 dataset, the proposed features achieve performance that rivals the best 3D face recognition algorithms reported so far.
Title: Face Recognition using 3D Summation Invariant Features
Pub Date: 2006-07-09 DOI: 10.1109/ICME.2006.262945
Shirish S. Karande, U. Parrikar, K. Misra, H. Radha
Radio hardware used for the reception of 802.11b frames is capable of associating a signal to silence ratio (SSR) with each received frame. If a received frame is corrupted, these SSR indications can be used to provide a robust a priori estimate of the bit error rate in the packet. In many recently proposed cross-layer protocols for video transmission over wireless networks, recovering information from partially corrupted packets has shown significant utility. In this paper, based on experiments with actual 802.11b error traces, we show that the channel state information (CSI) provided by the SSR indications can be used to improve the error recovery performance of an FEC scheme employed in conjunction with a cross-layer protocol. H.264-based simulations are used to establish the efficacy of the proposed work for video applications, specifically for video over 802.11b WLAN.
Title: Utilizing SSR Indications for Improved Video Communication in Presence of 802.11B Residue Errors