
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific: latest publications

Additive noise detection and its application to audio forensics
Rui Yang
Digital audio recordings can easily be manipulated with widely available audio editing software. Forgery is often more than naive splicing: post-processing is frequently part of the tampering and can eliminate obvious traces of forgery. Added noise can mask audible evidence of forgery and destroy the traces of other tampering operations, so detecting additive noise in an audio signal is a useful tool for audio forensics. In this paper, we investigate the effect of additive noise on audio signals and propose a feature named "sign change rate" for detecting additive noise. Theoretical analysis and extensive experiments show that the proposed feature is effective for additive noise detection. The method is also a potential tool for forgery localization in digital audio.
DOI: 10.1109/APSIPA.2014.7041688 (https://doi.org/10.1109/APSIPA.2014.7041688). Published: 2014-12-01.
Citations: 3
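The abstract above names a "sign change rate" feature but does not define it. As a rough illustration only, the sketch below assumes one plausible definition: the fraction of adjacent first-difference pairs whose signs differ. The function name, the definition, and the synthetic test signal are all assumptions, not the paper's method.

```python
import math
import random


def sign_change_rate(samples):
    """Fraction of adjacent first-difference pairs with opposite signs.

    Assumed definition for illustration; the abstract does not give one.
    """
    diffs = [b - a for a, b in zip(samples, samples[1:])]
    pairs = list(zip(diffs, diffs[1:]))
    if not pairs:
        return 0.0
    changes = sum(1 for d0, d1 in pairs if d0 * d1 < 0)
    return changes / len(pairs)


# Synthetic demo: a slow sine wave, with and without added Gaussian noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]
noisy = [s + rng.gauss(0.0, 0.1) for s in clean]
print(sign_change_rate(clean), sign_change_rate(noisy))
```

Under this assumed definition, the smooth signal changes slope direction only at its peaks and troughs, while the noisy version does so far more often, which is the kind of contrast a noise detector could threshold on.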
Feature extraction for human action classification using adaptive key frame interval
Kanokphan Lertniphonphan, S. Aramvith, T. Chalidabhongse
Human actions in video vary in both the spatial and time domains, which makes action classification difficult. Owing to the nature of the articulated body, the amount of movement from point to point is not constant and can be illustrated as a bell shape. In this paper, key frames are detected to specify the starting and ending points of an action cycle. The time between key frames determines the window length for feature extraction in the time domain. Since the cycles vary, the key frame interval varies and adapts to the performer and the action. A local orientation histogram of the Key Pose Energy Image (KPEI) and Motion History Image (MHI) is constructed over this period. Experimental results on the WEIZMANN dataset demonstrate that features within the adaptive key frame interval can effectively classify actions.
DOI: 10.1109/APSIPA.2014.7041766 (https://doi.org/10.1109/APSIPA.2014.7041766). Published: 2014-12-01.
Citations: 4
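The Motion History Image mentioned in the abstract above is commonly built with a simple per-pixel update: a moving pixel is set to a maximum value, and every other pixel decays by one. The sketch below shows that textbook update on nested lists. Detecting motion by thresholded frame differencing, and the `tau` and `threshold` values, are assumptions for illustration and may differ from the paper's setup.

```python
def update_mhi(mhi, prev_frame, frame, tau=5, threshold=10):
    """One MHI update step on grayscale frames stored as lists of rows.

    A pixel whose intensity changed by more than `threshold` is set to `tau`;
    every other pixel decays by 1, never going below 0.
    """
    rows, cols = len(frame), len(frame[0])
    return [
        [
            tau
            if abs(frame[r][c] - prev_frame[r][c]) > threshold
            else max(0, mhi[r][c] - 1)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Tiny 1x3 demo: a bright blob moves from the middle pixel to the right pixel.
frames = [[[0, 0, 0]], [[0, 255, 0]], [[0, 0, 255]], [[0, 0, 255]]]
mhi = [[0, 0, 0]]
history = []
for prev, cur in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev, cur)
    history.append(mhi)
```

In an adaptive scheme such as the one the abstract describes, the number of frames accumulated this way would be set by the interval between detected key frames rather than by a fixed window.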
Reverberation steering and listening area expansion on 3-D sound field reproduction with parametric array loudspeaker
Daisuke Ikefuji, H. Tsujii, S. Masunaga, M. Nakayama, T. Nishiura, Y. Yamashita
Recently, technologies for reproducing a 3-dimensional sound field have been required to provide highly realistic sensations. We therefore previously proposed a system with multiple parametric array loudspeakers (PALs). PALs can form sound images on walls, ceilings, and floors by exploiting the high directivity of ultrasound, so the proposed system can easily present sound arriving from various directions. However, depending on the reverberation time, it is difficult to provide a realistic sensation. In addition, the listening area of a single PAL is small. In this paper, we therefore propose two approaches for overcoming these problems. First, we propose reverberation steering with indirect electrodynamic loudspeakers and PALs. Second, we attempt to expand the listening area of the sound image with a curved-type PAL. Evaluation experiments confirmed the effectiveness of each approach.
DOI: 10.1109/APSIPA.2014.7041606 (https://doi.org/10.1109/APSIPA.2014.7041606). Published: 2014-12-01.
Citations: 6
Estimation of Japanese DRT intelligibility using Articulation Index Band Correlations
K. Kondo
We proposed and evaluated an intelligibility estimation method for the forced-selection Japanese Diagnostic Rhyme Test (DRT). The proposed measure takes into account the way the DRT forces a selection from a pair of rhyming words. The objective distance measure used here is based on the Articulation Index Band Correlation (ABC), which showed favorable results for the English Modified Rhyme Test (MRT). The correlation of time-frequency patterns between the test word and the template speech of each of the two words in the candidate word pair was calculated, and the word with the higher correlation was taken to be the likely candidate word. The time-frequency (T-F) pattern was calculated in the Articulation Index (AI) bands, and the correlation was calculated between corresponding bands of the test and candidate word samples. The candidate word with more AI bands showing higher correlation values was finally chosen. The ratio of the number of bands with higher correlation with the candidate word to the total number of bands was calculated to quantify how well the test word matches the candidate word in the word pair. We estimated a logistic mapping function from this ratio to intelligibility scores using speech mixed with known noise. The mapping functions were then used to estimate the intelligibility of speech mixed with unknown noise. This estimate was compared with another measure that we had previously evaluated, the frequency-weighted segmental SNR, and proved more accurate, with a correlation between subjective and estimated intelligibility above 0.93 and a root mean square error below 0.15. Thus, it should be possible to "screen" the intelligibility in many of the noise conditions to be tested and cut down on the scale of the subjective test needed.
DOI: 10.1109/APSIPA.2014.7041516 (https://doi.org/10.1109/APSIPA.2014.7041516). Published: 2014-12-01.
Citations: 1
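The band-voting idea in the abstract above can be sketched in a few lines: correlate the test word with both template words band by band, count the bands in which the target template wins, and map the resulting ratio to a score with a logistic function. The helper names, the use of Pearson correlation on per-band envelopes, and the `slope` and `midpoint` values are illustrative assumptions; the paper fits its mapping to subjective data.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def band_win_ratio(test_bands, target_bands, rival_bands):
    """Share of bands where the test word correlates better with the target word."""
    wins = sum(
        1
        for t, a, b in zip(test_bands, target_bands, rival_bands)
        if pearson(t, a) > pearson(t, b)
    )
    return wins / len(test_bands)


def logistic_map(ratio, slope=10.0, midpoint=0.5):
    """Map a band ratio to a 0..1 score; slope and midpoint are placeholders."""
    return 1.0 / (1.0 + math.exp(-slope * (ratio - midpoint)))


# Toy per-band envelopes for three bands (made-up numbers).
test_bands = [[1, 2, 3], [3, 2, 1], [1, 3, 2]]
target_bands = [[1, 2, 3], [3, 2, 1], [2, 1, 3]]
rival_bands = [[3, 2, 1], [1, 2, 3], [1, 3, 2]]
ratio = band_win_ratio(test_bands, target_bands, rival_bands)
score = logistic_map(ratio)
```

In practice the two logistic parameters would be estimated from listening-test scores under known noise, as the abstract describes, and then reused for unseen noise conditions.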
Attachable robotic arm for anthropomorphized explanation by pointing
Hirotaka Osawa, Wataru Kayano
Our daily household activities are supported by many complicated home appliances whose functions are difficult to learn. To explain the functions of home appliances clearly to users, we design attachable agential triggers that render home appliances as explanatory agents. We detail how our application helps explain the use of home appliances to users. Our proposed robotic arms are easier to use for pointing at a home appliance than previously used attachable arms.
DOI: 10.1109/APSIPA.2014.7041704 (https://doi.org/10.1109/APSIPA.2014.7041704). Published: 2014-12-01.
Citations: 0
Automatic exudates detection in retinal images using efficient integrated approaches
Wuttichai Luangruangrong, P. Kulkasem, Suwanna Rasmequan, Annupan Rodtook, K. Chinnasarn
Diabetic retinopathy with exudates seriously impairs vision and is a cause of blindness in diabetic patients. In addition, the number of diabetic retinopathy patients is increasing, while the number of doctors cannot easily be increased in the same proportion, which creates a heavy workload for doctors. Past medical image processing research has shown that simply getting a second opinion can significantly help a physician's diagnosis. This research proposes a method to detect exudates in diabetic retinopathy images. Early detection of exudates in diabetic retinopathy patients can reduce the severity of the disease. The proposed method consists of five major steps: 1) improve image quality using contrast limited adaptive histogram equalization (CLAHE); 2) apply the object attribute thresholding (OAT) algorithm to remove non-retinal objects; 3) apply Frangi's algorithm, based on Hessian filtering, for blood vessel detection; 4) detect the optic disc by combining multi-resolution analysis with the Hough transform; and 5) classify exudates in the remaining region with hierarchical fuzzy c-means clustering. The performance of the proposed method is evaluated on DIARETDB, the retinal image database of the Lappeenranta University of Technology, where the performance is good enough for exudate detection.
DOI: 10.1109/APSIPA.2014.7041749 (https://doi.org/10.1109/APSIPA.2014.7041749). Published: 2014-12-01.
Citations: 11
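The final step of the pipeline in the abstract above relies on fuzzy c-means clustering. The sketch below implements plain (non-hierarchical) fuzzy c-means on one-dimensional intensity values, just to show the alternating membership and center updates. It is not the hierarchical variant the paper uses, and the initial centers, fuzzifier `m`, and sample intensities are assumptions.

```python
def fuzzy_c_means_1d(values, centers, m=2.0, iterations=50):
    """Plain fuzzy c-means on scalar values.

    Returns (centers, memberships); memberships[i][k] is the degree to which
    values[i] belonged to cluster k when the last center update was computed.
    """
    centers = list(centers)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for v in values:
            dists = [abs(v - c) for c in centers]
            if any(d == 0 for d in dists):
                # The value coincides with a center: give it full membership there.
                row = [1.0 if d == 0 else 0.0 for d in dists]
                total = sum(row)
                row = [r / total for r in row]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                row = [w / total for w in inv]
            memberships.append(row)
        # Each center becomes the membership-weighted mean of all values.
        centers = [
            sum((u[k] ** m) * v for u, v in zip(memberships, values))
            / sum(u[k] ** m for u in memberships)
            for k in range(len(centers))
        ]
    return centers, memberships


# Made-up pixel intensities: a dark background group and a bright candidate group.
intensities = [10, 12, 11, 200, 205, 198]
final_centers, final_memberships = fuzzy_c_means_1d(intensities, [0.0, 255.0])
```

A hierarchical version would presumably apply this kind of clustering repeatedly on selected sub-clusters; the abstract does not give those details, so they are left out here.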
An evaluation of target speech for a nonaudible murmur enhancement system in noisy environments
Sakura Tsuruta, Kou Tanaka, T. Toda, Graham Neubig, S. Sakti, Satoshi Nakamura
Nonaudible murmur (NAM) is a soft whispered voice recorded with a NAM microphone through body conduction. NAM enables silent speech communication because it lets the speaker convey a message in a nonaudible voice. However, its intelligibility and naturalness are significantly degraded compared with those of natural speech, owing to acoustic changes caused by body conduction. To address this issue, statistical voice conversion (VC) methods from NAM to normal speech (NAM-to-Speech) and to a whispered voice (NAM-to-Whisper) have been proposed. It has been reported that these NAM enhancement methods significantly improve the speech quality and intelligibility of NAM, and that NAM-to-Whisper is more effective than NAM-to-Speech. However, it is still not clear which method is more effective when a listener hears the enhanced speech in a noisy environment, a situation that often arises in silent speech communication. In this paper, assuming a typical situation in which NAM is uttered by a speaker in a quiet environment and conveyed to a listener in a noisy environment, we investigate which kinds of target speech are more effective for NAM enhancement. We also propose NAM enhancement methods for converting NAM to other types of voiced target speech. Experiments show that conversion into voiced speech is more effective than conversion into unvoiced speech for generating more intelligible speech in noisy environments.
DOI: 10.1109/APSIPA.2014.7041618 (https://doi.org/10.1109/APSIPA.2014.7041618). Published: 2014-12-01.
Citations: 1
Classification of electromyogram using vertical visibility algorithm with support vector machine
P. Artameeyanant, Sivarit Sultornsanee, K. Chamnongthai, K. Higuchi
Analyzing the electromyogram is important in the diagnosis of neuromuscular diseases, and classification of electromyogram signals plays a significant role in it. Since the signals are complex and non-stationary, a complex network is an appropriate tool for extracting their features. In this paper, we propose a novel feature extraction technique that transforms the signal into a complex network via the vertical visibility algorithm. Measurements of community structure and distance properties are examined, and the pattern of relationships among nodes in the network is investigated. A support vector machine was employed for classification. The proposed method can classify the signals into three cases, i.e., healthy, myopathy, and neuropathy, with remarkable experimental results.
DOI: 10.1109/APSIPA.2014.7041820 (https://doi.org/10.1109/APSIPA.2014.7041820). Published: 2014-12-01.
Citations: 4
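The abstract above does not spell out the vertical visibility rule, so the sketch below swaps in the widely used natural visibility graph criterion to show the general idea of turning a time series into a network. Two samples are linked when the straight line between them passes above every sample in between. The function names and the toy series are illustrative assumptions.

```python
def natural_visibility_edges(series):
    """Edges (i, j), i < j, of the natural visibility graph of a time series.

    Samples i and j see each other if every intermediate sample lies strictly
    below the straight line joining (i, series[i]) and (j, series[j]).
    """
    n = len(series)
    edges = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            visible = all(
                series[k] < series[i] + (series[j] - series[i]) * (k - i) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                edges.append((i, j))
    return edges


def node_degrees(n, edges):
    """Degree of each of the n nodes, a simple network feature."""
    degrees = [0] * n
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    return degrees


valley = natural_visibility_edges([3, 1, 2])  # end points see over the dip
peak = natural_visibility_edges([1, 3, 1])    # the peak blocks the end points
```

Network measures such as the degree sequence computed here, or the community and distance measures the abstract mentions, could then serve as the feature vector passed to a classifier.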
Telelife: An immersive media experience for rehabilitation
Farid Abedan Kondori, Li Liu, Haibo Li
In recent years, the emergence of telerehabilitation systems for home-based therapy has changed healthcare systems. Telerehabilitation enables therapists to observe patients' status via the Internet, so a patient does not have to visit a rehabilitation facility for every session. Although telerehabilitation provides great opportunities, two major issues limit its effectiveness: relegation of the patient to the home, and loss of direct supervision by the therapist. Since patients have no actual interaction with other people during the rehabilitation period, they become isolated and gradually lose their social skills. Moreover, without direct supervision by therapists, rehabilitation exercises can be performed with bad compensation strategies that lead to poor-quality recovery. To resolve these issues, we propose telelife, a new concept for future rehabilitation systems. The idea is to use media technology to create a totally new immersive media experience for rehabilitation. In telerehabilitation, patients perform exercises locally and therapists monitor their status remotely. In telelife, however, patients perform exercises remotely and therapists monitor locally. Thus, telelife not only enables rehabilitation at a distance but also improves patients' social competence and provides direct supervision by therapists. In this paper, we introduce telelife to enhance telerehabilitation and investigate the technical challenges and possible methods for achieving it.
DOI: 10.1109/APSIPA.2014.7041675 (https://doi.org/10.1109/APSIPA.2014.7041675). Published: 2014-12-01.
Citations: 0
A balancing voltage transformation for robust frequency estimation in unbalanced power systems
Yili Xia, Kai Wang, Wenjiang Pei, D. Mandic
This paper addresses the detection of the fundamental frequency of power systems under unbalanced and distorted conditions. By exploiting the second-order information within the Clarke-transformed voltage, namely both the autocorrelation and the pseudo-autocorrelation, a novel balancing voltage transformation (BVT) is proposed to accurately detect the underlying phase angle evolution of the positive sequence component. This removes a major obstacle to frequency estimation in unbalanced power systems and makes it possible to apply any frequency estimator designed for single-tone exponentials. The robustness of the proposed phase angle detection technique is illustrated for two well-known and efficient frequency estimators: a discrete Fourier transform (DFT) coefficient interpolation method [1] and the weighted linear predictor (WLP) [2]. A windowing technique is used to keep the frequency estimation fast and computationally affordable. Simulations over a range of unbalanced conditions, including voltage dips and swells, frequency deviations, and the presence of higher-order harmonics, support the analysis.
DOI: https://doi.org/10.1109/APSIPA.2014.7041682 (published 2014-12-01)
Citations: 3
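The abstract above relies on two second-order statistics of the Clarke-transformed voltage: the autocorrelation and the pseudo-autocorrelation. The sketch below is not the authors' BVT; it only illustrates, under stated assumptions, why both statistics carry information. It assumes the standard power-invariant Clarke transform, a noise-free synthetic three-phase signal, and lag-zero sample estimates. All function names (`clarke_voltage`, `three_phase`, `second_order_stats`, `circularity_quotient`) are hypothetical and chosen for illustration.

```python
import math


def clarke_voltage(va, vb, vc):
    """Map one three-phase sample to the complex voltage v = v_alpha + j*v_beta.

    Assumption: the power-invariant Clarke transform with scale sqrt(2/3);
    the zero-sequence component is discarded.
    """
    scale = math.sqrt(2.0 / 3.0)
    v_alpha = scale * (va - 0.5 * vb - 0.5 * vc)
    v_beta = scale * (math.sqrt(3.0) / 2.0) * (vb - vc)
    return complex(v_alpha, v_beta)


def three_phase(n_samples, f0, fs, amps):
    """Synthetic, noise-free three-phase voltage (hypothetical test signal).

    `amps` holds the per-phase amplitudes; unequal values model an
    unbalanced system such as a voltage dip on one phase.
    """
    out = []
    for k in range(n_samples):
        theta = 2.0 * math.pi * f0 * k / fs
        va = amps[0] * math.cos(theta)
        vb = amps[1] * math.cos(theta - 2.0 * math.pi / 3.0)
        vc = amps[2] * math.cos(theta + 2.0 * math.pi / 3.0)
        out.append(clarke_voltage(va, vb, vc))
    return out


def second_order_stats(v):
    """Lag-zero sample autocorrelation E[v v*] and pseudo-autocorrelation E[v v]."""
    n = len(v)
    autocorr = sum((x * x.conjugate()).real for x in v) / n
    pseudo = sum(x * x for x in v) / n
    return autocorr, pseudo


def circularity_quotient(v):
    """Ratio |E[v v]| / E[v v*]; near 0 for a balanced system."""
    autocorr, pseudo = second_order_stats(v)
    return abs(pseudo) / autocorr


# Two full cycles of a 50 Hz system sampled at 5 kHz.
balanced = three_phase(200, 50.0, 5000.0, (1.0, 1.0, 1.0))
unbalanced = three_phase(200, 50.0, 5000.0, (1.0, 0.5, 1.0))

print("balanced  :", circularity_quotient(balanced))    # close to 0
print("unbalanced:", circularity_quotient(unbalanced))  # clearly nonzero
```

In the balanced case the complex voltage is a single rotating phasor, so the pseudo-autocorrelation averages out over whole cycles. With unequal phase amplitudes a counter-rotating (negative sequence) term appears and the pseudo-autocorrelation becomes nonzero, which is the kind of information an unbalance-aware method can exploit.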