Current home-care systems emphasize bio-signal measurement, security surveillance, and health care, but most of these functions operate independently. This paper proposes a new warm-care framework for the elderly that provides the aforementioned services and adds remote monitoring, web camera management, emergency calls for help, behavior recognition and feedback, and remote-controlled entertainment, yielding a comprehensive humanistic care system. The framework is motivated by the individual letters of “HAPPINESS”, which are interpreted as “Health”, “Ability”, “Protection”, “Personalization”, “Interaction”, “Nursing”, “Entertainment”, “Succor”, and “Smile”. Three main services are highlighted to achieve these goals. The Web-based Central Camera Management Service (WCCMS) provides real-time remote monitoring so that a caregiver can watch over the elderly anytime and anywhere through web services; the Multimodal Human-Machine Interaction Service (MHMIS) provides audio-visual cognitive functions for interacting with the elderly; and the Web-based User Management Service (WUMS) gives the user a smart HMI that includes bio-signal measurement, a help button, remote control, and hospital appointment scheduling. To evaluate the usability of the framework, the Mean Opinion Score (MOS) is applied; the average MOS of 4.2 suggests that the proposed system performs as expected.
{"title":"A happiness-oriented home care system for elderly daily living","authors":"Yang-Yen Ou, Po-Yi Shih, Ta-Wen Kuan, S. Shih, Jhing-Fa Wang, Jaw-Shyang Wu","doi":"10.1109/ICOT.2014.6956632","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6956632","journal":{"name":"2014 International Conference on Orange Technologies","volume":"43 1","pages":"0"},"publicationDate":"2014-11-20"}
Pub Date: 2014-11-20 DOI: 10.1109/ICOT.2014.6956636
Guohong Liang, Ying Li
This paper proposes a thin-cloud removal approach for remote sensing images based on robust kernel regression. Because of atmospheric conditions, cloud cover is one of the most disruptive factors in remote sensing images, so cloud removal is an important step for improving image quality before analysis. Because thin cloud is a low-frequency component of remote sensing images, it can be removed efficiently with the method introduced in this paper.
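The abstract does not give the estimator itself, so the following is only a minimal sketch of the underlying idea. It assumes a plain (non-robust) Nadaraya-Watson kernel regression with a Gaussian kernel, applied to a single 1-D scanline, to estimate the low-frequency component and subtract it. The function names and the `bandwidth` parameter are hypothetical and are not taken from the paper.

```python
import math


def kernel_smooth(signal, bandwidth):
    """Nadaraya-Watson estimate with a Gaussian kernel at every sample position.

    Assumption: a plain kernel regression, not the robust variant of the paper.
    """
    smoothed = []
    for i in range(len(signal)):
        weights = [math.exp(-0.5 * ((i - j) / bandwidth) ** 2)
                   for j in range(len(signal))]
        total = sum(weights)
        smoothed.append(sum(w * s for w, s in zip(weights, signal)) / total)
    return smoothed


def remove_low_frequency(signal, bandwidth):
    """Subtract the smooth (low-frequency) estimate and restore mean brightness."""
    low = kernel_smooth(signal, bandwidth)
    mean_low = sum(low) / len(low)
    return [s - l + mean_low for s, l in zip(signal, low)]
```

A larger `bandwidth` treats more of the signal as low-frequency haze. A robust variant would additionally down-weight samples with large residuals, which this sketch does not do.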
{"title":"Removing thin cloud from remote sensing digital images based on robust kernel regression","authors":"Guohong Liang, Ying Li","doi":"10.1109/ICOT.2014.6956636","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6956636","journal":{"name":"2014 International Conference on Orange Technologies","volume":"107 1","pages":"0"},"publicationDate":"2014-11-20"}
Pub Date: 2014-11-20 DOI: 10.1109/ICOT.2014.6954675
Dahai Yu, Junwei Han, Yibo Ye, Zhijun Fang
In this paper, we present a novel human detection method for infrared thermal imaging cameras that builds a saliency framework on visual-attention HOG features. The proposed approach extends the saliency map to represent not only spatial features but also gaze distribution features. For thermal videos, the framework consists of several computational stages: (a) regions of interest are outlined based on saliency contrast; (b) grids of HOG descriptors are selected to extract features in each image; (c) the training features are optimized by a gaze visual attention map; (d) finally, a support vector machine algorithm is used to register the positive human saliency model for the trained classifiers. To validate our algorithm, we constructed a thermal infrared image database, collected by a real-time inspection system, that contains labeled gaze attention maps. Experimental results on this database demonstrate that our algorithm outperforms previous state-of-the-art methods for human detection in thermal infrared images.
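To make stage (b) more concrete, here is a minimal sketch of the core of a HOG descriptor: a magnitude-weighted histogram of unsigned gradient orientations for one cell. It is a simplification under stated assumptions (central differences, 9 bins, no bin interpolation, no block normalization) and is not the authors' implementation. The saliency, gaze, and SVM stages are not shown, and the function name is hypothetical.

```python
import math


def cell_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0-180 deg).

    `cell` is a list of rows of pixel intensities. Border pixels are skipped
    because central differences need both neighbours.
    """
    hist = [0.0] * bins
    rows, cols = len(cell), len(cell[0])
    bin_width = 180.0 / bins
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The modulo guards against an index equal to `bins` from rounding.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist
```

In a full pipeline, histograms from neighbouring cells would be concatenated and normalized per block before being passed to a classifier.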
{"title":"A novel saliency detection framework for infrared thermal images","authors":"Dahai Yu, Junwei Han, Yibo Ye, Zhijun Fang","doi":"10.1109/ICOT.2014.6954675","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6954675","journal":{"name":"2014 International Conference on Orange Technologies","volume":"49 1","pages":"0"},"publicationDate":"2014-11-20"}
Pub Date: 2014-11-20 DOI: 10.1109/ICOT.2014.6956607
Zhenye Gan, Zhenwen Wang, Hongwu Yang
This paper presents a method for Tibetan Lhasa speech concatenation synthesis based on a large corpus. A large corpus of the Tibetan Lhasa dialect is established by analyzing the characteristics of the dialect. A grapheme-to-phoneme conversion method converts Tibetan sentences to Speech Assessment Methods Phonetic Alphabet (SAMPA)-based Pinyin sequences. First, Tibetan text is converted to Pinyin sequences using the SAMPA-T transformation method. Then the Tibetan acoustic finals and syllables are used as units to build a Classification and Regression Tree (CART) according to the spectral distance of each candidate unit and the context-dependent question sets. The CART algorithm is applied to choose the acoustic finals and syllables that best conform to the context information. Finally, Tibetan Lhasa speech is synthesized by waveform concatenation. Tests show that the MOS of the synthetic Tibetan Lhasa speech is 3.9 with acoustic finals as units and 4.1 with syllables as units; speech synthesized with syllables as units is of better quality than speech synthesized with acoustic finals.
{"title":"Realizing Tibetan Lhasa speech concatenation synthesis system based on a large corpus","authors":"Zhenye Gan, Zhenwen Wang, Hongwu Yang","doi":"10.1109/ICOT.2014.6956607","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6956607","journal":{"name":"2014 International Conference on Orange Technologies","volume":"14 1","pages":"0"},"publicationDate":"2014-11-20"}
Pub Date: 2014-11-20 DOI: 10.1109/ICOT.2014.6956618
Chun-Chia Kung, Yang-Yen Ou, Ding-Ruey Yeh, Chung-Fa Wang
Maternal love is essential to the mother-child relationship and important for a child's development. Despite a series of studies on the neural substrates of parental love, no publication to date has addressed the neural substrates of maternal love in shopping. In this study, we used fMRI to examine first-time mothers' brain activations in response to mother-related (purses, make-up, and clothes) and children-related (toys, books, and clothes) items. Twenty-two mothers (20 of them first-time mothers) were scanned with a 3T MRI scanner. For children- vs. mother-related items, the expected mother-nature site, the periaqueductal gray (PAG), was found to be more activated. In addition, effects of “Buy vs. NotBuy”, as well as the interaction of both factors, were identified. Voxelwise regressions yielded regions associated with mothers' buying characteristics. Lastly, we also applied multi-voxel pattern analysis and searchlight mapping to these data, with some preliminary results.
{"title":"The neural substrate of maternal love in shopping: Mothers' willingness to pay for her child vs. for herself: An fMRI study","authors":"Chun-Chia Kung, Yang-Yen Ou, Ding-Ruey Yeh, Chung-Fa Wang","doi":"10.1109/ICOT.2014.6956618","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6956618","journal":{"name":"2014 International Conference on Orange Technologies","volume":"1 1","pages":"0"},"publicationDate":"2014-11-20"}
Pub Date: 2014-11-20 DOI: 10.1109/ICOT.2014.6954666
Chen Xueling, Zhang Yanning
This paper introduces a blur kernel refinement method that produces a more accurate kernel estimate based on the best light streak selected from a motion-blurred image. The best image patch containing a clear light streak is first selected, and the blur kernel is estimated from the patch by solving an optimization problem. Then, a kernel refinement method based on region growing extracts the motion trajectory as the refined kernel and avoids disturbance from the background. Finally, a non-blind deconvolution method uses the refined kernel to restore the sharp image. Experimental results on both synthetic and real-world images demonstrate that kernel refinement improves the quality of deconvolution and yields a sharper image with fewer ringing artifacts. In addition, normalized cross-correlation is used to evaluate the similarity between the refined and ground-truth kernels, which verifies the improvement of the refined kernels.
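The kernel similarity measure mentioned above can be illustrated with a short sketch. It assumes the simplest form of normalized cross-correlation, evaluated at zero shift between two kernels of equal size. The abstract does not say whether the authors also maximize over spatial shifts, as is common in the deblurring literature, so treat this as one plausible reading. The function name is hypothetical.

```python
import math


def normalized_cross_correlation(kernel_a, kernel_b):
    """Zero-shift NCC between two equally sized 2-D kernels (lists of rows).

    Returns a value in [0, 1] for non-negative kernels; 1 means identical
    up to a scale factor.
    """
    a = [v for row in kernel_a for v in row]
    b = [v for row in kernel_b for v in row]
    if len(a) != len(b):
        raise ValueError("kernels must have the same size")
    norm = math.sqrt(sum(v * v for v in a)) * math.sqrt(sum(v * v for v in b))
    if norm == 0:
        return 0.0  # at least one kernel is all zeros
    return sum(x * y for x, y in zip(a, b)) / norm
```

A shift-tolerant variant would evaluate this quantity for every relative offset of the two kernels and keep the maximum.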
{"title":"Kernel refinement based on best light streak for motion deblurring","authors":"Chen Xueling, Zhang Yanning","doi":"10.1109/ICOT.2014.6954666","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6954666","journal":{"name":"2014 International Conference on Orange Technologies","volume":"122 1","pages":"0"},"publicationDate":"2014-11-20"}
This work presents a low-cost, fast-trainable automatic speaker-speech recognition (ASSR) system that uses the proposed binary halved clustering (BHC) method for a human-machine interface (HMI) on an embedded platform. Low cost is essential to make an ASSR system affordable for real-world applications. In addition, fast training provides a fast response time, and the reduced waiting time makes the proposed HMI friendly for users. Speech recognition uses enhanced cross-word reference templates (ECWRTs) as the template training type. The novel BHC method uses binary-halved splitting to generate speaker models with low complexity. The regularity of the binary-halved behavior benefits data scheduling and resource sharing in the embedded ASSR system. Compared with conventional works, simulation results indicate that the proposed hardware accelerator achieves 28% lower cost, 90% shorter response time, and an ASSR accuracy of 90%. The comparison shows that the proposed system outperforms conventional works, demonstrating that the proposed HMI is friendly and affordable.
{"title":"An automatic speaker-speech recognition system for friendly HMI based on binary halved clustering","authors":"Chih-Hsiang Peng, Chih-Hung Chou, Ta-Wen Kuan, Po-Chuan Lin, Jhing-Fa Wang, P. Yu","doi":"10.1109/ICOT.2014.6956624","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6956624","journal":{"name":"2014 International Conference on Orange Technologies","volume":"BME-26 10","pages":"0"},"publicationDate":"2014-11-20"}
Pub Date: 2014-11-20 DOI: 10.1109/ICOT.2014.6954670
Cong-Cong Zhou, C. Tu, Yun Gao, Fei-Xiang Wang, Hong-Wei Gong, Ping Lian, Cheng He, Xuesong Ye
A new low-power, wrist-worn miniature device for real-time wireless heart rate (HR) monitoring and fall detection is presented. The device consists of sensors, signal conditioning circuits, a microcontroller, and a system communication module. Power management and algorithms are applied to achieve low-power operation. Using PASW Statistics 18.0 (SPSS Statistics) to analyze 54 HR measurements obtained from six subjects, we find that the mean and standard deviation are 60.83 and 9.705 for the proposed device and 61.96 and 9.317 for the POLAR RS100 (Polar Electro). The Pearson correlation coefficient is 0.975 (p<0.01). The results show that the proposed device is in good agreement with the POLAR RS100. A low-power, low-cost MEMS accelerometer is used to detect falls. Results show that a fall can be detected with a threshold that differs significantly from the values observed when stationary, walking, and standing up from sitting. When a person wearing the device falls, an interrupt is generated and sent to the microcontroller immediately for further processing. In tests on 245 samples, the forward-fall detection accuracy is 93.75%. The device is useful for detecting heartbeat problems in long-term vital sign monitoring, for example for combat medics and mountain climbers, and for monitoring the health condition of elderly people.
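The two analyses described above can be sketched briefly. The first function computes the Pearson correlation coefficient used to compare the device with the reference monitor. The second shows a simple threshold test on the acceleration magnitude. The threshold value of 2.5 g, the sample layout, and the function names are assumptions made for illustration, since the abstract does not report them. On the actual device the comparison is done in hardware through an interrupt, not in software as shown here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def detect_fall(samples, threshold_g=2.5):
    """Return the index of the first sample whose acceleration magnitude
    exceeds the threshold, or None if no sample does.

    `samples` is a sequence of (ax, ay, az) tuples in units of g.
    The default threshold is a hypothetical value, not taken from the paper.
    """
    for i, (ax, ay, az) in enumerate(samples):
        if math.sqrt(ax * ax + ay * ay + az * az) > threshold_g:
            return i
    return None
```

In practice the threshold would be chosen from recorded data so that it separates falls from the stationary, walking, and standing-up cases named in the abstract.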
{"title":"A low-power, wireless, wrist-worn device for long time heart rate monitoring and fall detection","authors":"Cong-Cong Zhou, C. Tu, Yun Gao, Fei-Xiang Wang, Hong-Wei Gong, Ping Lian, Cheng He, Xuesong Ye","doi":"10.1109/ICOT.2014.6954670","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6954670","journal":{"name":"2014 International Conference on Orange Technologies","volume":"123 1","pages":"0"},"publicationDate":"2014-11-20"}
Pub Date: 2014-11-20 DOI: 10.1109/ICOT.2014.6956628
Chung-Hsien Wu, Jen-Chun Lin, Wen-Li Wei
Facial occlusion is a critical issue that may dramatically degrade the performance of facial expression-based emotion recognition. In this study, the Error Weighted Cross-Correlation Model (EWCCM) is employed to predict the facial Action Unit (AU) under partial facial occlusion from non-occluded facial regions for facial geometric feature reconstruction. In EWCCM, a Gaussian Mixture Model (GMM)-based Cross-Correlation Model (CCM) is first adopted to construct the statistical dependency among features from paired facial components of the non-occluded regions, such as the eyebrows-cheeks pair, for AU prediction of the occluded region. A Bayesian classifier weighting scheme is then used to enhance the AU prediction accuracy by considering the contributions of the GMM-based CCMs. Based on the predicted AU, a regression fusion scheme is proposed to reconstruct the occluded facial geometric features. Experimental results show that the proposed approach yields satisfactory results on the NCKU-FEPO database for facial AU reconstruction.
{"title":"Action unit reconstruction of occluded facial expression","authors":"Chung-Hsien Wu, Jen-Chun Lin, Wen-Li Wei","doi":"10.1109/ICOT.2014.6956628","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6956628","journal":{"name":"2014 International Conference on Orange Technologies","volume":"199 1","pages":"0"},"publicationDate":"2014-11-20"}
Pub Date: 2014-11-20 DOI: 10.1109/ICOT.2014.6956622
Yen-Lin Chiang, Yuan-Shan Lee, Wen-Chi Hsieh, Jia-Ching Wang
In this work, a query-by-singing (QBS) content-based music retrieval (CBMR) system is proposed. The proposed QBS-CBMR system offers high efficiency and portability and uses a music clip as the search key. First, 13-dimensional Mel-frequency cepstral coefficients (MFCCs) are extracted from an input music clip. Second, each MFCC dimension is transformed into a symbolic sequence using the adapted symbolic aggregate approximation (adapted SAX). Each symbolic sequence is then converted into a tree structure called an advanced fast pattern index (AFPI) tree. To evaluate the similarity between the query clip and the songs in the database, a partial score is first calculated for each AFPI tree. The final score is the weighted sum of all partial scores, where the weight of each partial score is determined by its entropy. The experimental results show that the proposed music retrieval system outperforms other approaches.
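The symbolic conversion step can be illustrated with standard SAX. The abstract does not say how the "adapted" variant differs, so the sketch below should be read as the textbook procedure only: z-normalize the series, average it over equal-width segments (piecewise aggregate approximation), and map each segment mean to a letter using breakpoints that split the standard normal distribution into equiprobable regions. The function name and the four-letter alphabet are assumptions.

```python
import bisect
from statistics import NormalDist, mean, pstdev


def sax(series, segments, alphabet="abcd"):
    """Standard SAX word for one feature dimension (e.g. one MFCC coefficient).

    Assumes len(series) is a positive multiple of `segments`.
    """
    if segments <= 0 or len(series) % segments != 0:
        raise ValueError("series length must be a positive multiple of segments")
    mu, sigma = mean(series), pstdev(series)
    # z-normalize; a constant series maps to all zeros
    normalized = [(v - mu) / sigma if sigma else 0.0 for v in series]
    size = len(series) // segments
    # piecewise aggregate approximation: one mean per segment
    paa = [mean(normalized[i * size:(i + 1) * size]) for i in range(segments)]
    k = len(alphabet)
    # breakpoints cut the standard normal into k equiprobable regions
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    return "".join(alphabet[bisect.bisect_right(breakpoints, v)] for v in paa)
```

For example, `sax(list(range(8)), 4)` maps a steadily rising coefficient track to the word `abcd`. In the retrieval system described above, one such word would be produced for each of the 13 MFCC dimensions before indexing.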
{"title":"Efficient and portable content-based music retrieval system","authors":"Yen-Lin Chiang, Yuan-Shan Lee, Wen-Chi Hsieh, Jia-Ching Wang","doi":"10.1109/ICOT.2014.6956622","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6956622","journal":{"name":"2014 International Conference on Orange Technologies","volume":"34 1","pages":"0"},"publicationDate":"2014-11-20"}