
Latest publications: Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific

Selection of best match keyword using spoken term detection for spoken document indexing
Kentaro Domoto, T. Utsuro, N. Sawada, H. Nishizaki
This paper presents a novel keyword selection-based spoken document-indexing framework that selects the best match keyword from query candidates using spoken term detection (STD) for spoken document retrieval. Our method begins by creating a keyword set containing keywords that are likely to appear in a spoken document. Next, STD is conducted with all the keywords as query terms, yielding a detection result: a set of each keyword and its detection intervals in the spoken document. For keywords with competing intervals, we rank them by STD matching cost and select the best one with the longest duration among the competing detections. This is the final output of the STD process and serves as an index word for the spoken document. The proposed framework was evaluated on lecture speeches as spoken documents in an STD task. The results show that our framework is quite effective at preventing false detection errors and at annotating spoken documents with keyword indices.
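The ranking-and-selection rule described above can be sketched in a few lines. This is a hedged illustration only: the detection tuples, costs, and the exact tie-breaking between matching cost and duration are invented here, not taken from the paper.

```python
# Toy sketch of selecting the best match keyword among competing STD
# detections: prefer lower matching cost, then longer duration, and keep
# only detections whose intervals do not overlap an already-selected one.

def overlaps(a, b):
    """True if two (start, end) detection intervals overlap."""
    return a[0] < b[1] and b[0] < a[1]

def select_best_keyword(detections):
    """detections: list of (keyword, (start, end), matching_cost)."""
    selected = []
    ordered = sorted(detections,
                     key=lambda d: (d[2], -(d[1][1] - d[1][0])))
    for kw, iv, cost in ordered:
        if not any(overlaps(iv, s[1]) for s in selected):
            selected.append((kw, iv, cost))
    return selected

dets = [
    ("signal", (1.0, 1.8), 0.20),
    ("sign",   (1.0, 1.5), 0.25),   # competes with "signal", worse cost
    ("index",  (3.0, 3.6), 0.10),
]
best = select_best_keyword(dets)
```

Here the shorter, higher-cost detection "sign" is suppressed in favor of "signal", which shares its interval.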
DOI: 10.1109/APSIPA.2014.7041589 (published 2014-12-01)
Citations: 0
Self-learning-based signal decomposition for multimedia applications: A review and comparative study
Li-Wei Kang, C. Yeh, Duan-Yu Chen, Chia-Tsung Lin
Decomposition of a signal (e.g., an image or video) into multiple semantic components has been an active research topic for various image/video processing applications, such as image/video denoising, enhancement, and inpainting. In this paper, we present a survey of signal decomposition frameworks based on sparsity and morphological diversity in signal mixtures and their applications in multimedia. First, we analyze existing MCA (morphological component analysis) based image decomposition frameworks with their applications and explore the potential limitations of these approaches for image denoising. Then, we discuss our recently proposed self-learning-based image decomposition framework with its applications to several image/video denoising tasks, including single-image rain streak removal, denoising, deblocking, and joint super-resolution and deblocking of highly compressed images/videos. Exploiting sparse representation and morphological diversity of image signals, the proposed framework first learns an over-complete dictionary from the high-frequency part of an input image for reconstruction purposes. An unsupervised or supervised clustering technique is then applied to the dictionary atoms to identify the morphological component corresponding to the noise pattern of interest (e.g., rain streaks, blocking artifacts, or Gaussian noise). Unlike prior learning-based approaches, our method needs no training data collected in advance and requires no image priors. Our experimental results confirm the effectiveness and robustness of the proposed framework, which has been shown to outperform state-of-the-art approaches.
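The high-frequency split and atom-clustering idea can be illustrated on toy data. This sketch substitutes raw high-frequency patches and a tiny hand-rolled k-means for the paper's learned over-complete dictionary and its clustering step; the patch size, blur, and k are arbitrary assumptions.

```python
import numpy as np

# Split a toy "image" into low- and high-frequency parts, then group
# high-frequency patches into two morphological components with k-means.

rng = np.random.default_rng(0)
img = rng.normal(size=(32, 32))

# Low-frequency part: 3x3 box blur (reflect padding); high-freq = residual.
pad = np.pad(img, 1, mode="reflect")
low = sum(pad[i:i + 32, j:j + 32] for i in range(3) for j in range(3)) / 9.0
high = img - low

# Extract non-overlapping 4x4 patches of the high-frequency layer as "atoms".
patches = np.array([high[i:i + 4, j:j + 4].ravel()
                    for i in range(0, 32, 4) for j in range(0, 32, 4)])

def kmeans(X, k=2, iters=20):
    """Minimal k-means: assign to nearest center, recompute centers."""
    centers = X[:k].copy()
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(0)
    return labels

labels = kmeans(patches)
```

In the paper the clusters would correspond to noise patterns such as rain streaks versus image detail; here they are just two groups of random patches.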
DOI: 10.1109/APSIPA.2014.7041778 (published 2014-12-01)
Citations: 8
R-cube: A dialogue agent for restaurant recommendation and reservation
Seokhwan Kim, Rafael E. Banchs
This paper describes a hybrid dialogue system for restaurant recommendation and reservation. The proposed system combines rule-based and data-driven components using a flexible architecture aimed at diminishing error propagation along the different steps of the dialogue management and processing pipeline. The system implements three basic subsystems for restaurant recommendation, selection, and booking, which leverage the same system architecture and processing components. The specific system described here operates on a data collection of Singapore's F&B industry, but it can easily be adapted to any other city or location by simply replacing the underlying data collection.
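A minimal sketch of the rule-plus-data hybrid routing idea follows. All rules, intent names, and the toy restaurant table are invented for illustration and are not the paper's actual components.

```python
# Rule-based intent detection first; fall back to a data-driven lookup
# against the (stand-in) restaurant collection.

RULES = {
    "book": "reservation",
    "reserve": "reservation",
    "recommend": "recommendation",
    "suggest": "recommendation",
}

RESTAURANTS = [  # stand-in for the Singapore F&B data collection
    {"name": "Laksa House", "cuisine": "peranakan"},
    {"name": "Satay Corner", "cuisine": "malay"},
]

def route(utterance):
    """Return which subsystem should handle the utterance."""
    for w in utterance.lower().split():
        if w in RULES:
            return RULES[w]
    # data-driven fallback: a known restaurant name implies selection
    if any(r["name"].lower() in utterance.lower() for r in RESTAURANTS):
        return "selection"
    return "recommendation"  # default subsystem

intent = route("Please book a table at Satay Corner")
```

A real system would replace the keyword rules with trained language-understanding models; the point is only the routing among the three subsystems the abstract names.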
DOI: 10.1109/APSIPA.2014.7041732 (published 2014-12-01)
Citations: 18
Real-time depth map generation using hybrid multi-view cameras
Yunseok Song, Dong-Won Shin, Eunsang Ko, Yo-Sung Ho
In this paper, we present a hybrid multi-view camera system for real-time depth generation. We set up eight color cameras and three depth cameras. For simple test scenarios, we capture a single object at a blue-screen studio. The objective is depth map generation at eight color viewpoints. Due to hardware limitations, the depth cameras produce low-resolution images, i.e., 176×144. Thus, we warp the depth data to the color camera views (1280×720) and then apply filtering. Joint bilateral filtering (JBF) is used to exploit range and spatial weights, taking color data into account as well. Simulation results show depth generation at 13 frames per second (fps) when treating eight images as a single frame. When the proposed method is executed on one computer per depth camera, the speed becomes three times faster. Thus, we have successfully achieved real-time depth generation using a hybrid multi-view camera system.
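The joint bilateral filtering step can be sketched directly: the spatial weight comes from pixel distance, while the range weight comes from the color guidance image rather than the depth itself, so depth edges aligned with color edges are preserved. This is a toy single-channel version with arbitrary sigmas, not the paper's implementation.

```python
import numpy as np

# Joint bilateral filtering of a (warped) depth map guided by a color image.

def joint_bilateral(depth, guide, radius=2, sigma_s=2.0, sigma_r=0.1):
    h, w = depth.shape
    out = np.zeros_like(depth, dtype=float)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(ys ** 2 + xs ** 2) / (2 * sigma_s ** 2))
    dpad = np.pad(depth, radius, mode="edge")
    gpad = np.pad(guide, radius, mode="edge")
    for i in range(h):
        for j in range(w):
            dwin = dpad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            gwin = gpad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # range weight measured on the guidance image, not the depth
            rng_w = np.exp(-((gwin - guide[i, j]) ** 2) / (2 * sigma_r ** 2))
            wgt = spatial * rng_w
            out[i, j] = (wgt * dwin).sum() / wgt.sum()
    return out

# Toy scene: a depth step whose edge coincides with a color edge.
depth = np.where(np.arange(16)[None, :] < 8, 1.0, 2.0) * np.ones((16, 1))
guide = np.where(np.arange(16)[None, :] < 8, 0.0, 1.0) * np.ones((16, 1))
smoothed = joint_bilateral(depth, guide)
```

Because the guidance image changes sharply at the same column as the depth step, the range weight suppresses averaging across the edge and the step survives filtering.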
DOI: 10.1109/APSIPA.2014.7041683 (published 2014-12-01)
Citations: 2
Robust anchorperson detection based on audio streams using a hybrid I-vector and DNN system
Yun-Fan Chang, Payton Lin, Shao-Hua Cheng, Kai-Hsuan Chan, Y. Zeng, Chia-Wei Liao, Wen-Tsung Chang, Y. Wang, Yu Tsao
Anchorperson segment detection enables efficient video content indexing for information retrieval. Anchorperson detection based on audio analysis has gained popularity due to lower computational complexity and satisfactory performance. This paper presents a robust framework using a hybrid I-vector and deep neural network (DNN) system to perform anchorperson detection based on the audio streams of video content. The proposed system first applies I-vectors to extract speaker identity features from the audio data. With the extracted speaker identity features, a DNN classifier is then used to verify the claimed anchorperson identity. In addition, subspace feature normalization (SFN) is incorporated into the hybrid system for robust feature extraction, to compensate for audio mismatch caused by recording devices. An anchorperson verification experiment was conducted to evaluate the equal error rate (EER) of the proposed hybrid system. Experimental results demonstrate that the proposed system outperforms the state-of-the-art hybrid I-vector and support vector machine (SVM) system. Moreover, the proposed system was further enhanced by integrating SFN to effectively compensate for audio mismatch in anchorperson detection tasks.
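The equal error rate (EER) evaluation mentioned above is a standard verification metric: sweep a decision threshold over the scores and find the point where the false-acceptance rate equals the false-rejection rate. The scores below are invented toy values, not experimental results.

```python
import numpy as np

# EER: the operating point where false acceptances and false rejections
# are equally frequent; reported as a single number for a verifier.

def eer(genuine, impostor):
    scores = np.sort(np.concatenate([genuine, impostor]))
    best = (1.0, 0.0)  # (far, frr) pair with the smallest gap so far
    for t in scores:
        far = np.mean(impostor >= t)   # impostors wrongly accepted
        frr = np.mean(genuine < t)     # genuine trials wrongly rejected
        if abs(far - frr) < abs(best[0] - best[1]):
            best = (far, frr)
    return (best[0] + best[1]) / 2.0

genuine = np.array([0.9, 0.8, 0.75, 0.7])     # target-speaker scores
impostor = np.array([0.3, 0.4, 0.6, 0.72])    # non-target scores
rate = eer(genuine, impostor)
```

With one impostor score (0.72) above one genuine score (0.7), the rates cross at 25%.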
DOI: 10.1109/APSIPA.2014.7041717 (published 2014-12-01)
Citations: 4
Comparison the training methods of neural network for English and Thai character recognition
A. Saenthon, Natchanon Sukkhadamrongrak
Currently, optical character recognition (OCR) is applied in many fields, such as reading office letters and reading serial numbers on industrial parts. Most manufacturers focus on the processing time and accuracy of the inspection process. The learning stage of optical character recognition uses a neural network to recognize fonts and correlate matching values. Neural networks offer many training techniques, and each technique affects processing time and accuracy. This paper therefore compares training procedures for neural networks recognizing both Thai and English characters. The experimental results report the error and processing time of each training technique.
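The kind of comparison the paper performs, training the same model under different update rules and comparing the resulting error, can be sketched on a toy problem. Here a least-squares fit stands in for character recognition, and plain gradient descent versus momentum are generic stand-ins for the training techniques compared; none of this reproduces the paper's actual setup.

```python
import numpy as np

# Train the same linear model with two update rules for a fixed budget
# of steps, then compare the final residual error of each.

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 5))        # toy "feature" matrix
x_true = rng.normal(size=5)
b = A @ x_true                      # toy "labels"

def train(momentum=0.0, steps=200, lr=0.01):
    x = np.zeros(5)
    v = np.zeros(5)
    for _ in range(steps):
        grad = A.T @ (A @ x - b) / len(b)
        v = momentum * v - lr * grad   # momentum=0.0 is plain descent
        x = x + v
    return np.linalg.norm(A @ x - b)   # final training error

err_gd = train(momentum=0.0)
err_mom = train(momentum=0.9)
```

On this toy problem the momentum rule reaches a lower error within the same step budget, which is the shape of result such a comparison produces.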
DOI: 10.1109/APSIPA.2014.7041795 (published 2014-12-01)
Citations: 1
Topic model allocation of conversational dialogue records by Latent Dirichlet Allocation
Jui-Feng Yeh, C. Lee, Yi-Shiuan Tan, Liang-Chih Yu
The topic of conversational content is important for sustaining communication, so topic detection and tracking is an important research problem. Topic shifts occur frequently in long conversations, and a single conversation may contain many topics, so detecting the different topics in conversational content matters. This paper detects topic information using agglomerative clustering of utterances and a dynamic Latent Dirichlet Allocation topic model: the proportions of verbs and nouns are used to measure similarity between utterances, and all utterances in the conversational content are grouped by an agglomerative clustering algorithm. Because the topic structure of conversational content is fragile, we use speech-act information together with hypernym information obtained from E-HowNet, which makes the word categories more robust. A Latent Dirichlet Allocation topic model detects topics at the file level and would find only one topic if applied directly to conversational content, which typically contains many topics; we therefore train the latent Dirichlet allocation models with speech-act and hypernym information, and then use the trained models to detect the different topics in the conversational content. For evaluation of the proposed method, a support vector machine baseline is developed for comparison. According to the experimental results, the proposed method outperforms the approach based on support vector machines in topic detection and tracking in spoken dialogue.
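The utterance-clustering step can be sketched as follows: represent each utterance by its verb/noun proportions and merge the closest clusters until no pair is within a distance threshold. The part-of-speech counts and the threshold are invented toy values, and single-link distance is an assumption, not necessarily the paper's linkage.

```python
import numpy as np

# Agglomerative (single-link) clustering of utterances by POS proportions.

utterances = {          # utterance id -> (verb count, noun count, tokens)
    "u1": (2, 3, 8), "u2": (2, 4, 9),     # similar POS profile
    "u3": (5, 1, 7), "u4": (6, 1, 8),     # a different profile
}
feats = {u: np.array([v / t, n / t]) for u, (v, n, t) in utterances.items()}

def agglomerate(feats, threshold=0.15):
    clusters = [[u] for u in feats]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(np.linalg.norm(feats[a] - feats[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:      # nothing close enough to merge
            break
        _, i, j = best
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters

topic_clusters = agglomerate(feats)
```

On this toy data the four utterances collapse into two clusters matching the two POS profiles, which is the grouping the topic model would then label.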
DOI: 10.1109/APSIPA.2014.7041546 (published 2014-12-01)
Citations: 10
Phase detection of multi-channel SSVEPs via complex sparse spatial weighting
Keita Shimpo, Toshihisa Tanaka
A brain-computer interface (BCI) based on steady-state visual evoked potentials (SSVEPs) is one of the most practical BCIs because of its high recognition accuracy and short training time. The phase of SSVEPs is potentially applicable to generating device commands. However, an effective method for estimating the phase of SSVEPs has not yet been established, especially when using multi-channel electroencephalography (EEG). In this paper, we propose a novel method for estimating the phase of SSVEPs from multi-channel EEG using complex sparse spatial weighting. We conducted experiments with a phase-coded SSVEP-based BCI to evaluate the performance of the proposed method. As a result, our method showed higher recognition accuracy than conventional methods in all six subjects.
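The underlying phase measurement can be shown on synthetic data: projecting the signal onto a complex exponential at the stimulus frequency (a single DFT coefficient) and taking the angle recovers the SSVEP phase. This single-channel sketch is only the measurement the paper builds on; the paper's contribution, the multi-channel complex sparse spatial weighting, is not reproduced here, and the sampling rate and flicker frequency are assumptions.

```python
import numpy as np

# Single-channel SSVEP phase estimation via complex projection.

fs = 250.0            # sampling rate (Hz), assumed
f_stim = 10.0         # flicker frequency (Hz), assumed
t = np.arange(0, 1.0, 1 / fs)          # one second: integer cycles
true_phase = 0.8                        # radians
eeg = np.cos(2 * np.pi * f_stim * t + true_phase)  # noiseless toy SSVEP

# Project onto the complex exponential at the stimulus frequency;
# the angle of the coefficient is the SSVEP phase.
coeff = np.sum(eeg * np.exp(-2j * np.pi * f_stim * t))
est_phase = np.angle(coeff)
```

Because the window spans an integer number of stimulus cycles, the negative-frequency term cancels and the estimate is exact on this noiseless signal.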
DOI: 10.1109/APSIPA.2014.7041666 (published 2014-12-01)
Citations: 3
Modeling spatial uncertainty of imprecise information in images
T. Pham
The description of information content in images is imprecise in nature. Quantification of uncertainty in images for pattern analysis has been addressed with the theories of probability and fuzzy sets. In this paper, an approach for modeling the spatial uncertainty of images is proposed in the setting of geostatistics and probability measure of fuzzy events. The proposed approach can be utilized to extract an effective feature for image classification.
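The probability measure of fuzzy events that the approach builds on is Zadeh's definition, P(A) = Σ_x μ_A(x) p(x): the expected membership of a fuzzy event under a probability distribution. The "dark pixel" membership function and the uniform histogram below are toy assumptions for illustration.

```python
import numpy as np

# Probability of the fuzzy event "the pixel is dark" over an 8-bit
# intensity distribution: expected membership under the histogram.

levels = np.arange(256)
hist = np.ones(256) / 256.0                        # uniform intensities (toy)
mu_dark = np.clip(1.0 - levels / 128.0, 0.0, 1.0)  # membership in "dark"

p_dark = np.sum(mu_dark * hist)                    # Zadeh: sum mu(x) p(x)
```

With membership decreasing linearly to zero at level 128 and a uniform histogram, the value works out to 64.5/256, about 0.252.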
DOI: 10.1109/APSIPA.2014.7041514 (published 2014-12-01)
Citations: 0
Contactless palmprint alignment based on intrinsic local affine-invariant feature points
C. Phromsuthirak, W. Tangsuksant, A. Sanpanich, C. Pintavirooj
The palmprint, a biometric characteristic, is mostly used in civil and commercial security applications because it is reliable and easy to capture with low-resolution devices. This paper develops a new contactless palmprint alignment method using a general USB camera on a tripod. The palmprint image is acquired by this camera, and intrinsic local affine-invariant key points residing on the area patches spanning two successive fingers are used to align the palmprint image. Because the key points are relative affine invariants under affine transformations, the algorithm needs no guidance pegs to fix the hand position during acquisition, avoiding the scaling, translation, and rotation problems of correct palmprint image alignment. Finally, the developed algorithm was tested on 10 left-hand palmprint images collected from different subjects. The simulation results indicate a distance-map error of 1.4899 pixels.
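Once matching key points are found between two palmprint images, the alignment itself reduces to recovering a 2-D affine transform from the correspondences, solvable by least squares with three or more points. The point set and ground-truth transform below are synthetic; this shows only the standard fitting step, not the paper's key-point extraction.

```python
import numpy as np

# Recover the 2x2 linear part M and translation t of dst = M @ src + t
# from point correspondences via linear least squares.

def fit_affine(src, dst):
    """src, dst: (N, 2) arrays of corresponding points, N >= 3."""
    X = np.hstack([src, np.ones((len(src), 1))])       # rows [x, y, 1]
    params, *_ = np.linalg.lstsq(X, dst, rcond=None)   # 3x2 solution
    return params[:2].T, params[2]                     # (M, t)

# Synthetic ground truth: rotation + scale + translation.
theta, s = 0.3, 1.2
M_true = s * np.array([[np.cos(theta), -np.sin(theta)],
                       [np.sin(theta),  np.cos(theta)]])
t_true = np.array([5.0, -2.0])
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dst = src @ M_true.T + t_true

M_est, t_est = fit_affine(src, dst)
```

Because the synthetic correspondences are noise-free and the system is consistent, the least-squares solution recovers the transform exactly.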
DOI: 10.1109/APSIPA.2014.7041563 (published 2014-12-01)
Citations: 1
Journal
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific
Book学术
Literature sharing · Smart journal selection · Latest publications · Sharing guidelines · Contact us: info@booksci.cn
Book学术 provides a free academic resource search service, helping scholars in China and abroad retrieve Chinese and English literature, and is committed to the most convenient, high-quality service experience.
Copyright © 2023 Book学术 All rights reserved.
京公网安备 11010802042870号 京ICP备2023020795号-1