
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific: Latest Articles

Topic model allocation of conversational dialogue records by Latent Dirichlet Allocation
Jui-Feng Yeh, C. Lee, Yi-Shiuan Tan, Liang-Chih Yu
Topic information in conversational content is important for sustaining communication, so topic detection and tracking is an important research area. Topics shift frequently over long conversations, and a single conversation may contain many topics, so it is important to detect the different topics in conversational content. This paper detects topic information using agglomerative clustering of utterances and a Dynamic Latent Dirichlet Allocation topic model. The proportion of verbs and nouns is used to measure the similarity between utterances, and all utterances in the conversational content are clustered with an agglomerative clustering algorithm. Because the topic structure of conversational content is fragile, we use speech act information and obtain hypernym information from E-HowNet to make the word categories more robust. Latent Dirichlet Allocation is designed to detect topics at the level of whole documents, so when applied directly to conversational content it can detect only one topic, even though such content frequently contains many. We therefore also use the speech act and hypernym information to train the latent Dirichlet allocation models, and then use the trained models to detect the different topics in conversational content. To evaluate the proposed method, a support vector machine is developed for comparison. The experimental results show that the proposed method outperforms the support vector machine approach in topic detection and tracking in spoken dialogue.
DOI: https://doi.org/10.1109/APSIPA.2014.7041546 | Published: 2014-12-01
Citations: 10
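The utterance-clustering step described in the topic-model abstract above can be illustrated with a minimal sketch. It assumes each utterance has already been reduced to its set of nouns and verbs, and it uses Jaccard overlap with single-linkage merging; the abstract does not specify the similarity measure or linkage, so `jaccard`, `cluster_utterances`, the threshold, and the toy word sets are all hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of content words (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_utterances(word_sets, threshold):
    """Single-linkage agglomerative clustering over utterance indices.

    Starts with one cluster per utterance and repeatedly merges the most
    similar pair until no pair reaches the similarity threshold.
    """
    clusters = [[i] for i in range(len(word_sets))]

    def linkage(c1, c2):
        # Single linkage: similarity of the closest pair of members.
        return max(jaccard(word_sets[i], word_sets[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        candidates = [
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        best, i, j = max(candidates)
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Toy dialogue: nouns and verbs per utterance (made-up data).
utterances = [
    {"book", "flight", "taipei"},
    {"flight", "taipei", "leave"},
    {"order", "pizza"},
    {"pizza", "cheese", "order"},
]
groups = cluster_utterances(utterances, threshold=0.3)
```

On this toy input the two travel utterances end up in one cluster and the two food utterances in another; each cluster could then be passed to a topic model as a separate pseudo-document.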
Learning visual co-occurrence with auto-encoder for image super-resolution
Yudong Liang, Jinjun Wang, Shizhou Zhang, Yihong Gong
This paper proposes a novel neural network that learns the essential mapping function between low-resolution and high-resolution images for the image super-resolution problem. In our approach, the patch recurrence property of small patches in natural images is used as a prior to train the network. An autoencoder neural network is designed to reconstruct the high-resolution patches. To mitigate the ill-posed nature of the super-resolution problem, we impose the constraint that the output of the coding part should be similar to the corresponding high-resolution patches. The degradation mapping from the high-resolution image to the low-resolution image is also integrated into the network. Both visual improvements and objective assessments are demonstrated on real images.
DOI: https://doi.org/10.1109/APSIPA.2014.7041671 | Published: 2014-12-01
Citations: 5
Improved cross-layer cooperative MAC protocol for wireless ad hoc networks
Quang-Trung Hoang, X. Tran
This paper considers the design of a cross-layer medium access control (MAC) protocol for wireless ad hoc cooperative networks. Specifically, we redesign the message exchange process of the MAC protocol previously proposed by Shan et al. By using a shorter HRP signal, the proposed protocol reduces the protocol overhead and thus improves transmission reliability. We also propose using only one HRP signal to resolve collisions among helpers with the same cooperative rate. The proposed protocol achieves higher path throughput and lower end-to-end packet latency than both the protocol of Shan et al. and the traditional IEEE 802.11 MAC protocol.
DOI: https://doi.org/10.1109/APSIPA.2014.7041716 | Published: 2014-12-01
Citations: 9
Multi-block dependency based watermarking scheme for binary-text image authentication
Fan Chen, Yao Qin, Hongjie He
To improve resistance to different counterfeiting attacks, a watermarking algorithm is proposed for binary-text image (BTI) authentication. To protect the uniform regions in a BTI, the watermark information of a fixed-size block is generated from the block's content and divided into three parts. One part is embedded in the flippable pixels of the block itself, and the other two parts are embedded in the flippable pixels of two other blocks in the BTI, which are chosen randomly based on the secret key. This strategy not only introduces block-wise dependence but also makes it possible to embed the authentication watermark of a uniform block in the BTI. In the tamper detection stage, a multi-block statistical detection method is designed to verify the validity of an image block. Simulation results show that the proposed algorithm achieves good imperceptibility and resists malicious attacks such as collage attacks, deletion tampering, and replacement tampering.
DOI: https://doi.org/10.1109/APSIPA.2014.7041611 | Published: 2014-12-01
Citations: 1
A clustering analysis of Chinese consonants based on functional load
Bin Wu, Jinsong Zhang, Yanlu Xie
This paper attempts to provide some insight into the relationship between the differentiability and the classification importance of consonants in Chinese speech communication. The two characteristics can be modelled by perceptual distance and functional load, respectively. We perform a clustering analysis of Chinese consonants based on functional load (FL), computed from the mutual information (MI) between a text and its phoneme transcription. We then compare our clustering result with one based on perceptual distance obtained from articulation tests. In experiments on a Chinese newspaper corpus with millions of sentences, we find that most pairs of phonemes that share a place of articulation but differ in manner tend to have large FLs. This is consistent with the finding that such phoneme pairs tend to have a long perceptual distance.
DOI: https://doi.org/10.1109/APSIPA.2014.7041637 | Published: 2014-12-01
Citations: 7
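To make the notion of functional load in the entry above concrete, here is a minimal sketch using a common entropy-based formulation: the FL of a phoneme pair is the relative drop in the entropy of the word distribution when the two phonemes are merged. This is an assumption for illustration and not necessarily the MI-based estimator used in the paper; `entropy`, `functional_load`, and the toy lexicon are hypothetical.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a frequency table."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def functional_load(word_counts, a, b):
    """Relative entropy loss when phoneme b is merged into phoneme a.

    word_counts maps a word, written as a tuple of phonemes, to its frequency.
    """
    base = entropy(word_counts)
    merged = Counter()
    for word, count in word_counts.items():
        collapsed = tuple(a if p == b else p for p in word)
        merged[collapsed] += count
    return (base - entropy(merged)) / base


# Toy lexicon with four equally frequent syllables (made-up data).
toy = {("b", "a"): 4, ("p", "a"): 4, ("m", "a"): 4, ("m", "i"): 4}
fl_bp = functional_load(toy, "b", "p")
fl_unused = functional_load(toy, "b", "z")
```

Merging "b" and "p" collapses two of the four syllables and removes a quarter of the entropy, while merging with a phoneme that never occurs changes nothing. A pairwise FL table computed this way could serve as the distance input to a clustering step.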
Quality-based channel selection in multi-channel radio-over-fiber system
Withawat Tangtrongpairoj, T. Higashino, M. Okada
Radio over Fiber (RoF) is a promising solution for wireless access services that transfers heterogeneous radio signals over an optical fiber link. However, RoF devices have nonlinear characteristics that create intermodulation products in the system. When the downlink and uplink antennas at the base station (BS) are coupled, the intermodulation distortion (IMD) interferes with uplink RF signals. This paper evaluates the performance degradation caused by coupled downlink interference at the uplink antenna. The carrier to distortion plus noise ratio (CDNR) is evaluated for all combinations. Simulations with the NS3 network simulator show that the best combination achieves better performance. The coupled downlink interference in the uplink signal can be reduced by decreasing the number of downlink packets.
DOI: https://doi.org/10.1109/APSIPA.2014.7041690 | Published: 2014-12-01
Citations: 0
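The intermodulation products mentioned in the radio-over-fiber entry above can be illustrated with a short sketch. It lists the two-tone third-order products (2*f1 - f2) of a set of carriers and keeps those that fall inside a victim band. The carrier frequencies and band edges are made-up values, and the function names are hypothetical; the abstract does not give the channel plan.

```python
from itertools import permutations


def third_order_products(carriers):
    """Two-tone third-order intermodulation frequencies 2*f1 - f2."""
    return {2 * f1 - f2 for f1, f2 in permutations(carriers, 2)}


def products_in_band(carriers, band_lo, band_hi):
    """Third-order products that land inside [band_lo, band_hi]."""
    return sorted(
        f for f in third_order_products(carriers) if band_lo <= f <= band_hi
    )


# Hypothetical carrier frequencies in MHz and a hypothetical victim band.
downlink = [2110, 2120, 2140]
hits = products_in_band(downlink, 2095, 2105)
```

With these toy carriers, two different pairs both produce a product at 2100 MHz, which lies inside the assumed band. Repeating the check over candidate channel combinations is one simple way to rank them before a fuller metric such as CDNR is computed.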
A narrowband active noise control system with frequency mismatch compensation
Jinwei Sun, Fei Ma, Boyan Huang, Liang Wen
Narrowband active noise control (ANC) systems perform well when sinusoidal signals dominate the primary noise, provided that a reference signal with the same frequencies as the primary noise is available. However, the frequencies of the reference signal provided by nonacoustic sensors usually differ from those of the primary noise because of temperature changes, aging, etc. Such frequency mismatch (FM) prevents narrowband ANC systems from suppressing the primary noise effectively and can even render them useless. In this paper, we propose a new narrowband ANC system integrated with a frequency estimation subsystem. The frequency estimate is obtained from a spectrum computation based on an adaptive linear prediction filter. The cosine signal generator uses the estimated frequencies to supply a more accurate reference signal to the main controller, which mitigates the performance deterioration caused by FM. The effectiveness of the proposed system has been confirmed by numerous simulations.
DOI: https://doi.org/10.1109/APSIPA.2014.7041689 | Published: 2014-12-01
Citations: 2
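As a rough illustration of frequency estimation with an adaptive linear prediction filter, as mentioned in the ANC entry above, the sketch below adapts a two-tap predictor with LMS on a single noise-free sinusoid. It relies on the identity x[n] = 2cos(w)x[n-1] - x[n-2], so the first tap converges toward 2cos(w). This is a simplification that reads the frequency directly from the tap rather than from a computed spectrum as the abstract describes; the function name, step size, and signal parameters are assumptions.

```python
import math


def estimate_frequency_lms(signal, fs, mu=0.05):
    """Estimate the frequency (Hz) of one sinusoid with a 2-tap LMS predictor."""
    a1, a2 = 0.0, 0.0
    for n in range(2, len(signal)):
        x1, x2 = signal[n - 1], signal[n - 2]
        err = signal[n] - (a1 * x1 + a2 * x2)  # prediction error
        a1 += mu * err * x1                    # LMS tap updates
        a2 += mu * err * x2
    # For a pure sinusoid the optimal first tap is 2*cos(w).
    cos_w = max(-1.0, min(1.0, a1 / 2.0))
    return math.acos(cos_w) * fs / (2.0 * math.pi)


fs = 1000.0      # sampling rate in Hz (assumed)
true_f = 100.0   # frequency of the test tone in Hz (assumed)
x = [math.sin(2 * math.pi * true_f * n / fs) for n in range(5000)]
f_hat = estimate_frequency_lms(x, fs)
```

On this clean test tone the estimate settles very close to the true frequency. With noisy or multi-tone input, a longer predictor and a spectral peak search would be needed, and the tap-based shortcut used here becomes biased.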
An automatic input protocol recommendation method for tailored switch-to-speech communication aid systems
Fuming Fang, T. Shinozaki, Takao Kobayashi
A switch-to-speech interface can provide a means of interactive communication as a support system for people with disabilities who retain some voluntary movement. Any motion of a part of the body, such as eye movements, can be used for the switch input. The number of possible switch operations varies from person to person, but the bandwidth is generally quite limited. Efficient input protocols are therefore needed to map the switch operations to pronunciations. At the same time, the protocol must be easy to learn so that anyone can use it. To this end, we propose a protocol recommendation method that can accommodate individual requirements in switch operations. The method suggests a customized protocol for each user of the interface that is both fast to enter and easy to remember. The two main ideas in the protocol design are to exploit knowledge of the alphabet table that everyone already has, and to improve input speed and learnability by allowing ambiguity in the switch-to-pronunciation conversion. Conversion errors caused by the ambiguity are offset by an N-gram language model. The performance of the protocols was evaluated through simulations and measurements obtained from research participants, and the advantage of the proposed method is shown.
DOI: https://doi.org/10.1109/APSIPA.2014.7041638 | Published: 2014-12-01
Citations: 2
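The idea of an ambiguous input protocol resolved by a language model, described in the switch-to-speech entry above, can be sketched in the style of predictive keypad text entry. Each switch code stands for a whole group of letters, and a word-frequency table (a unigram model, the simplest N-gram) picks the most likely reading. The letter groups, vocabulary, and counts are invented for illustration and do not reflect the paper's protocol.

```python
# Hypothetical protocol: five switch codes, each covering a run of the alphabet.
GROUPS = {"A": "abcde", "B": "fghij", "C": "klmno", "D": "pqrst", "E": "uvwxyz"}
LETTER_TO_SWITCH = {ch: key for key, letters in GROUPS.items() for ch in letters}


def encode(word):
    """Map a word to its ambiguous switch sequence."""
    return "".join(LETTER_TO_SWITCH[ch] for ch in word)


def decode(switch_seq, unigram_counts):
    """Return the most frequent vocabulary word matching the switch sequence."""
    candidates = [w for w in unigram_counts if encode(w) == switch_seq]
    if not candidates:
        return None
    return max(candidates, key=unigram_counts.get)


# Made-up word frequencies standing in for a trained language model.
COUNTS = {"cat": 50, "bat": 30, "eat": 20, "dog": 40}
best_aad = decode("AAD", COUNTS)
best_acb = decode("ACB", COUNTS)
```

Here "cat", "bat", and "eat" all share the sequence "AAD", and the frequency table resolves the tie in favor of "cat". A higher-order N-gram would additionally condition the choice on the preceding words.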
Unsupervised speaker adaptation of DNN-HMM by selecting similar speakers for lecture transcription
M. Mimura, Tatsuya Kawahara
Unsupervised speaker adaptation of a Deep Neural Network (DNN) is investigated for lecture transcription tasks, in which a single speaker gives a long speech and speaker adaptation is therefore important. The proposed method selects speakers similar to the test speaker from the training database and uses their data to retrain the baseline DNN. Several speaker characteristic features are defined for the speaker similarity measure. The feature based on the Universal Background Model (UBM) and principal component analysis (PCA) achieves the best performance, yielding a significant improvement over both the baseline DNN and the adapted GMM-HMM system. Combining the method with a naive adaptation method that uses the initial ASR hypothesis of the test data achieves an additional improvement.
DOI: https://doi.org/10.1109/APSIPA.2014.7041567 | Published: 2014-12-01
Citations: 6
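The speaker-selection step in the entry above can be sketched as a nearest-neighbour search over speaker feature vectors. The sketch assumes each speaker is already summarized by a fixed-length vector (for example, after UBM-based feature extraction and PCA) and ranks training speakers by cosine similarity; the abstract does not state the similarity measure, so cosine similarity, the function names, and the toy vectors are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def select_similar_speakers(test_vec, train_vecs, k):
    """Return the ids of the k training speakers most similar to the test speaker."""
    ranked = sorted(
        train_vecs,
        key=lambda spk: cosine(test_vec, train_vecs[spk]),
        reverse=True,
    )
    return ranked[:k]


# Made-up two-dimensional speaker vectors.
train = {
    "spk_a": [1.0, 0.0],
    "spk_b": [0.9, 0.1],
    "spk_c": [0.0, 1.0],
    "spk_d": [-1.0, 0.2],
}
chosen = select_similar_speakers([1.0, 0.0], train, k=2)
```

The selected ids would then determine which training utterances are used to retrain the baseline model for the test speaker.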
Constrained design of FIR filters with sparse coefficients
Ryo Matsuoka, T. Baba, M. Okuda
We present an algorithm for the constrained design of FIR filters with sparse coefficients. In general, filter design aims to minimize the filter order and maximize the filter performance. Although FIR filter coefficients designed by the least squares method are optimal in the least squares sense, they are not necessarily optimal among the set of filters with the same number of multipliers. That is, a lower mean squared error can be achieved by a filter that has the same number of multipliers but a longer impulse response with some zero-valued entries. Our method minimizes the number of nonzero entries in the impulse response together with the least squares error of its frequency response. In addition, we incorporate some constraints into the design and achieve better performance than the conventional constrained least squares design.
DOI: https://doi.org/10.1109/APSIPA.2014.7041561 | Published: 2014-12-01
Citations: 3