
Proceedings of the International Conference on Signal Processing and Multimedia Applications: Latest Articles

Visual AER-based processing with convolutions for a parallel supercomputer
R. Montero-Gonzalez, Arturo Morgado Estévez, F. Perez-Peña, A. Linares-Barranco, A. Jiménez-Fernandez, B. Linares-Barranco, J. Pérez-Carrasco
This paper is based on the simulation of a convolution model for multimedia applications using the neuro-inspired Address-Event-Representation (AER) philosophy. AER is a communication mechanism between chips gathering thousands of spiking neurons. These spiking neurons are able to process visual information in a frame-free style, as the human brain does. All the spiking neurons work in parallel, and each of them implements an operation when an input stimulus is received. The result of this operation may or may not be an output event. There exist AER retinas and other sensors, AER processors (convolvers, WTA filters), learning chips and robot actuators. In this paper we present the implementation of an AER convolution processor for the supercomputer CRS (cluster research support) of the University of Cadiz (UCA). This research involves the design of test cases in which the optimal parameters are set to run the AER convolution on parallel processors. These cases consist of running the convolution on an image divided into different numbers of parts, applying a Sobel filter for edge detection to each part, using the AER-TOOL simulator. Runtimes are compared for all cases and the optimal configuration of the system is discussed. In general, CRS performs better when the image is subdivided than when the whole image is processed at once.
DOI: https://doi.org/10.5220/0003519100850090 (published 2011-07-18)
Citations: 0
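A minimal sketch of the subdivision idea in the abstract above: the image is cut into horizontal strips, a Sobel edge filter is applied to each strip, and the results are stitched back together. This is a frame-based illustration in plain Python, not the event-driven AER-TOOL simulation the authors ran; the function names, the one-row overlap between strips, and the |Gx| + |Gy| magnitude are assumptions made for the example.

```python
def sobel_magnitude(img):
    """Return |Gx| + |Gy| for interior pixels; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = abs(gx) + abs(gy)
    return out


def sobel_by_strips(img, parts):
    """Filter `parts` horizontal strips separately, then stitch the rows back."""
    h = len(img)
    bounds = [h * i // parts for i in range(parts + 1)]
    result = []
    for top, bottom in zip(bounds, bounds[1:]):
        # Attach one halo row above and below so strip edges see their neighbours.
        lo, hi = max(top - 1, 0), min(bottom + 1, h)
        strip = sobel_magnitude(img[lo:hi])
        # Keep only the rows this strip owns, dropping the halo rows.
        result.extend(strip[top - lo: top - lo + (bottom - top)])
    return result


# Illustrative 8x6 image with a vertical edge between columns 2 and 3.
image = [[0, 0, 0, 9, 9, 9] for _ in range(8)]
whole = sobel_magnitude(image)
tiled = sobel_by_strips(image, 4)
print(tiled == whole)
```

Each strip is independent once its halo rows are attached, so the strips could be handed to separate workers. The one-row overlap is what makes the stitched output match the whole-image result.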
A genetic approach for improving the side information in Wyner-Ziv video coding with long duration GOP
C. Yaacoub, J. Farah, Chadi Jabroun
This work tackles the problem of side information generation for the case of large-duration GOPs in distributed video coding. Based on a previously developed technique for side-information enhancement, we develop a genetic algorithm particularly designed for large GOPs, taking into account the GOP size, the additional bitrate incurred by encoding hash information, as well as the decoding complexity. The proposed algorithm makes use of different interpolation methods available in the literature in a fusion-based approach. A significant gain in the average PSNR that can reach 2 dB is observed with respect to the best performing interpolation technique, while the algorithm is run for no more than 18% of the total number of blocks in a given video sequence. On the other hand, while the encoding complexity is a main concern in distributed video coding, the proposed solution incurs no additional complexity at the encoder side in the case of hash-based Wyner-Ziv video coding.
DOI: https://doi.org/10.5220/0003526300970103 (published 2011-07-18)
Citations: 0
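The gain reported in the abstract above is stated in PSNR. As a reminder of what that figure measures, the sketch below computes PSNR from the mean squared error between two pixel sequences, assuming 8-bit samples (peak value 255). The function name and the sample values are illustrative and are not taken from the paper.

```python
import math


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return float("inf")  # identical signals
    return 10 * math.log10(peak ** 2 / mse)


reference = [100, 100, 100, 100]
coarse = [110, 90, 110, 90]   # error of 10 per pixel, MSE = 100
finer = [105, 95, 105, 95]    # error of 5 per pixel, MSE = 25
gain_db = psnr(reference, finer) - psnr(reference, coarse)
print(round(gain_db, 2))
```

Because PSNR is logarithmic in the MSE, a gain of 2 dB corresponds to the MSE shrinking by a factor of 10^0.2, roughly 1.58.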
Exploring the differences in surface electromyographic signal between myofascial-pain and normal groups: Feature extraction through wavelet denoising and decomposition
Ching-Fen Jiang, N. Yu, Yu-Ching Lin
Upper-back myofascial pain is an increasingly significant syndrome associated with frequent computer use. However, the changes in neuromuscular function caused by myofascial pain are still poorly understood. This study aims to identify the changes in neuromuscular function on the taut band through signal analysis of surface electromyography. We first developed a fully automatic algorithm to detect the duration of an epoch of muscle contraction. The time-domain and frequency-domain features of the epochs were then extracted from 13 patients and compared with measurements from 13 normal subjects. The higher contraction strength with lower median frequency found in the patient group is similar to the changes reported with muscle fatigue. The signal was further analyzed using wavelet energy at 17 levels. The result shows that the energy measured from the patients exceeds that from the normal group in the low-frequency band, suggesting that an increasing synchronization level of motor unit recruitment may cause the drop in median frequency and the increase in contraction strength.
DOI: https://doi.org/10.5220/0003515402030206 (published 2011-07-18)
Citations: 0
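The abstract above compares median frequency between groups. One common discrete definition is the lowest frequency at which the cumulative spectral power reaches half of the total; the sketch below uses that definition on a made-up power spectrum. The function name and numbers are hypothetical, and the authors' exact estimator may differ.

```python
def median_frequency(freqs, power):
    """Lowest frequency whose cumulative power reaches half the total power."""
    half = sum(power) / 2.0
    running = 0.0
    for f, p in zip(freqs, power):
        running += p
        if running >= half:
            return f
    raise ValueError("empty spectrum")


freqs = [20, 40, 60, 80, 100]            # Hz, illustrative frequency bins
balanced = [1.0, 1.0, 1.0, 1.0, 1.0]     # flat spectrum
low_heavy = [4.0, 1.0, 1.0, 1.0, 1.0]    # extra energy in the lowest band
print(median_frequency(freqs, balanced), median_frequency(freqs, low_heavy))
```

With the extra low-band energy the median frequency moves from 60 Hz to 20 Hz in this toy spectrum, the same direction as the shift the abstract describes for the patient group.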
The WINDSURF library for the efficient retrieval of multimedia hierarchical data
Ilaria Bartolini, M. Patella, Guido Stromei
Several modern multimedia applications require the management of complex data that can be defined as hierarchical objects consisting of several component elements. In such scenarios, the concept of similarity between complex objects depends recursively on the similarity between component data, making it difficult to resolve several common tasks, such as processing queries and understanding the impact of the different alternatives available for defining similarity between objects. To overcome such limitations, in this paper we present the WINDSURF library for the management of multimedia hierarchical data. The goal of the library is to provide a general framework for assessing the performance of alternative query processing techniques for efficient retrieval of complex data that arise in several multimedia applications, such as image/video retrieval and the comparison of collections of documents. We designed the library to be general, flexible, and extensible: these characteristics are provided by way of a number of different templates that can be appropriately instantiated in order to realize the particular retrieval model needed by the user.
DOI: https://doi.org/10.5220/0003451701390148 (published 2011-07-18)
Citations: 20
Effective interference reduction method for spread spectrum fingerprinting
M. Kuribayashi
The iterative detection method proposed at IH2008 was designed for a CDMA-based fingerprinting scheme whose embedding procedure is an additive watermarking method. Such a detection method is applicable to the multiplicative watermarking method, which modulates a fingerprint using the characteristics of the content. In this study, we examine the interference among fingerprints embedded in a content in the hierarchical version of Cox's scheme, and propose an effective detection method that iteratively detects colluders in combination with a removal operation. By introducing two kinds of thresholds, the removal operation is performed adaptively to reduce the interference without causing serious false detection.
DOI: https://doi.org/10.5220/0003497101670172 (published 2011-07-18)
Citations: 0
A spatial immersive office environment for computer-supported collaborative work: Moving towards the office of the future
Maarten Dumont, S. Rogmans, S. Maesen, Karel Frederix, Johannes Taelman, P. Bekaert
In this paper, we present our work in building a prototype office environment for computer-supported collaborative work that spatially and auditorily immerses the participants, as if the augmented and virtually generated environment were a true extension of the physical office. To realize this, we have integrated various hardware, computer vision and graphics technologies, drawn partly from the existing state of the art but mostly from the knowledge and expertise of our research center. The fundamental components of such an office of the future, i.e. image-based modeling, rendering and spatial immersiveness, are illustrated together with surface computing and advanced audio processing, to go even beyond the original concept.
DOI: https://doi.org/10.5220/0003567702120216 (published 2011-07-18)
Citations: 1
Automatic sound restoration system concepts and design
A. Czyżewski, B. Kostek, A. Kupryjanow
A concept of a system for automatic audio recording reconstruction is described. It is supported by a video image reconstruction algorithm focused on video instability analysis. Sound restoration focuses on the analysis of noise and of wow and flutter. The presented algorithms are designed to be automatic and to reduce human effort during the restoration process. A web service designed especially for the automatic restoration process is envisioned as an integration platform for these algorithms and for a repository of recordings.
DOI: https://doi.org/10.5220/0003527702070211 (published 2011-07-18)
Citations: 2
Quality evaluation of novel DTD algorithm based on audio watermarking
A. Ciarkowski, A. Czyżewski
Echo cancellers typically employ a doubletalk detection (DTD) algorithm in order to keep the adaptive filter from diverging in the presence of a near-end speech signal or other disruptive sounds in the microphone signal. A novel doubletalk detection algorithm based on techniques similar to those used for audio signal watermarking was introduced by the authors. The application of the described DTD algorithm within an acoustic echo cancellation system is presented. The proposed algorithm is compared with the very common but simple Geigel algorithm and with Normalized Cross-Correlation algorithms, which represent the current state of the art. Both objective (ROC) and subjective (listening tests) performance evaluation methods are employed to obtain exhaustive evaluation results in simulated real-world conditions. The evaluation results are presented and their relevance is discussed. The issue of the algorithms' computational complexity is emphasized and conclusions are drawn.
DOI: https://doi.org/10.5220/0003524701810186 (published 2011-07-18)
Citations: 0
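For reference, the Geigel detector used as a baseline in the abstract above is simple enough to sketch: it flags doubletalk when the microphone sample exceeds a fraction of the largest recent far-end sample. The window length, the threshold of 0.5 (a commonly quoted value that presumes about 6 dB of echo-path attenuation), and the signals are assumptions for illustration; this is not the authors' watermark-based detector.

```python
def geigel_dtd(far_end, mic, window=4, threshold=0.5):
    """Return one doubletalk flag per sample using the Geigel comparison."""
    flags = []
    for n, d in enumerate(mic):
        # Largest far-end magnitude over the last `window` samples, including n.
        recent = far_end[max(0, n - window + 1): n + 1]
        peak = max(abs(x) for x in recent)
        flags.append(abs(d) > threshold * peak)
    return flags


far_end = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # loudspeaker signal (made up)
mic = [0.3, -0.3, 0.3, 0.9, 0.3, -0.3]        # echo at 0.3, near-end burst at n=3
flags = geigel_dtd(far_end, mic)
print(flags)
```

In an echo canceller, a raised flag would typically be used to freeze adaptation of the filter for that sample, which is the divergence protection the abstract refers to.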
Accuracy of MP3 speech recognition under real-word conditions: Experimental study
P. Pollák, Martin Behunek
This paper presents a study of speech recognition accuracy with respect to different levels of MP3 compression. Special attention is paid to the processing of speech signals of different quality, i.e. with different levels of background noise and channel distortion. The work was motivated by the possible use of ASR for offline automatic transcription of audio recordings collected by standard, widespread MP3 devices. The experiments showed that although the MP3 format is not optimal for speech compression, it does not distort speech significantly, especially at high or moderate bit rates and with high-quality source data. The accuracy of connected-digits ASR decreased only very slowly as the bit rate was lowered to 24 kbps. For the best case of PLP parameterization in the close-talk channel, only a 3% decrease in recognition accuracy was observed while the size of the compressed file was approximately 10% of the original size. All results were slightly worse in the presence of additive background noise and channel distortion in the signal, but the achieved accuracy was still acceptable in this case, especially for PLP features.
DOI: https://doi.org/10.5220/0003512600050010 (published 2011-07-18)
Citations: 13
Hand image segmentation by means of Gaussian multiscale aggregation for biometric applications
A. Sierra, C. S. Ávila, J. Casanova, G. Bailador
Applying biometrics to daily scenarios involves demanding requirements in terms of software and hardware. On the other hand, current biometric techniques are also being adapted to present-day devices, like mobile phones, laptops and the like, which are far from meeting the previously stated requirements. In fact, reconciling both needs is one of the most difficult problems in biometrics at present. Therefore, this paper presents a segmentation algorithm able to provide suitable solutions in terms of precision for hand biometric recognition, considering a wide range of backgrounds like carpets, glass, grass, mud, pavement, plastic, tiles or wood. Results show that segmentation achieves high precision (F-measure ≥ 88%), with competitive running times compared with state-of-the-art segmentation algorithms.
DOI: https://doi.org/10.5220/0003462500400046 (published 2011-07-18)
Citations: 2
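The abstract above reports segmentation quality as an F-measure. A minimal sketch of that metric for binary masks follows, assuming flattened 0/1 pixel labels with 1 meaning "hand"; the masks are made up and the function name is hypothetical.

```python
def f_measure(predicted, truth):
    """Harmonic mean of precision and recall for two flat 0/1 masks."""
    tp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 1)
    if tp == 0:
        return 0.0  # no correctly labelled hand pixels
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


predicted = [1, 1, 0, 0, 1]   # segmentation output (illustrative)
truth = [1, 0, 0, 1, 1]       # ground-truth mask (illustrative)
score = f_measure(predicted, truth)
print(round(score, 3))
```

Here two of the three predicted hand pixels are correct and two of the three true hand pixels are found, so precision and recall are both 2/3 and so is the F-measure.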