Pub Date: 2014-11-20 | DOI: 10.1109/ICOT.2014.6956606
Xiaoyong Lu, Hongwu Yang, Aibao Zhou
Happiness has attracted much attention from researchers in various fields. This paper realizes prosodic conversion of emotional speech for happiness computing in speech communication. An emotional speech corpus containing 11 kinds of typical emotional utterances is designed, in which each utterance is labeled with emotional information as PAD values in the psychological sense. A five-scale tone model is employed to model the pitch contour of emotional utterances at the syllable level. A generalized regression neural network (GRNN) based prosody conversion model is built to transform the pitch contour, duration and pause duration of an emotional utterance, in which the PAD values of the emotion and context parameters are used to predict the prosodic features. The emotional utterance is then re-synthesized with the STRAIGHT algorithm by modifying its pitch contour, duration and pause duration. Experimental results on the Emotional Mean Opinion Score (EMOS) demonstrate that the prosody converted by the proposed method can express the corresponding feelings.
{"title":"Applying PAD three dimensional emotion model to convert prosody of emotional speech","authors":"Xiaoyong Lu, Hongwu Yang, Aibao Zhou","doi":"10.1109/ICOT.2014.6956606","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6956606","url":null,"abstract":"Happiness has attracted much attention of the researchers in various fields. This paper realizes prosodic conversion of emotional speech for happiness computing on speech communication. An emotional speech corpus includes 11 kinds of typical emotional utterances is designed, where each utterance is labeled the emotional information with PAD value in a psychological sense. A five-scale tone model is employed to model the pitch contour of emotional utterances on the syllable level. A generalized regression neural network (GRNN) based prosody conversion model is built to realize the transformation of pitch contour, duration and pause duration of emotional utterance, in which the PAD values of emotion and context parameter are adopted to predict the prosodic features. Emotional utterance is then re-synthesized with the STRAIGHT algorithm by modifying pitch contour, duration and pause duration. Experimental results on Emotional Mean Opining Score (EMOS) demonstrate that the prosody conversion effect of proposed method can express corresponding feelings.","PeriodicalId":343641,"journal":{"name":"2014 International Conference on Orange Technologies","volume":"313 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122967847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-11-20 | DOI: 10.1109/ICOT.2014.6956633
Ta-Wen Kuan, Jhing-Fa Wang, Tsai Shang-Hung
The paper proposes a first sound-chip design for security-sensitive event-sound recognition, extending the interaction of Orange warming care from human-to-human to environment-to-human perception. The proposed chip can be embedded in smart sensors or appliances at home to detect event sounds in the surroundings, providing timely care for the elderly or children who live alone and actively calling for assistance. To realize the chip with high recognition accuracy, small area and low power dissipation, several MFCC sub-modules, including the radix-2 FFT and the Mel-filter bank, are optimized to reach the required characteristics. In the simulation results, the proposed MFCC with k-NN framework achieves higher recognition accuracy than LPCC and MP features with a k-NN classifier. For chip realization, the optimized MFCC sub-modules improve hardware resource utilization; the chip is designed and simulated in Verilog and synthesized with a TSMC 90 nm library.
{"title":"Optimized radix-2 FFT and Mel-filter bank in MFCC-based events sound recognition chip design for active smart warming care","authors":"Ta-Wen Kuan, Jhing-Fa Wang, Tsai Shang-Hung","doi":"10.1109/ICOT.2014.6956633","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6956633","url":null,"abstract":"The paper proposes a first sound chip design for security-sensitive event sounds recognition that extended the interaction of Orange warming care from human-to-human to environment-to-human perception. The proposed chip is fittingly embedded in smart sensors or appliances at home to surroundingly detect the event sounds, which can timely care the elderly or children who live alone thus actively call for assistance. In order to realize the chip in a high-accuracy performance, a small-size area and a low-power dissipation, the MFCC several sub-modules including, radix-2 FFT, Mel-filter bank etc are optimized for chip design to reach the required characteristics. In the simulation results, the proposed MFCC with k-NN framework performs the higher recognition accuracy than LPCC and MP features having k-NN classifier. For chip realization, the optimized MFCC sub-modules indeed improve the hardware resource utilization, where the chip is designed and simulated by verilog and synthesized by TSMC 90nm library.","PeriodicalId":343641,"journal":{"name":"2014 International Conference on Orange Technologies","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123515154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-11-20 | DOI: 10.1109/ICOT.2014.6956605
An Zhanfu, Pei Dong, Yong HongWu, Wang Quanzhou
We address the problem of precise self-positioning of an autonomous mobile robot. The problem is formulated as a manifold perception problem in which the precise position of the robot is evaluated from the distance to obstacles, critical features or signs in the surroundings, and the depth of the surrounding images. We propose to localize the robot accurately with an algorithm that fuses the local plane-coordinate information obtained from laser ranging with the spatial visual information represented by depth-image features, using variational weights so that the local distance information from laser ranging and the depth-vision information complement each other. First, we apply an EKF algorithm to the laser data to obtain a coarse location of the robot; then an RGB-D camera captures depth images, from which we extract SURF features, and when the features are matched with training examples the RANSAC algorithm is used to check the consistency of the spatial structures. Extensive experiments show that our fusion method significantly improves localization accuracy compared with using either the EKF on laser data or SURF feature matching on depth images alone. In particular, experiments with variational fusion weights demonstrate that the robot can localize itself precisely in real time.
{"title":"A new strategy for improving the self-positioning precision of an autonomous mobile robot","authors":"An Zhanfu, Pei Dong, Yong HongWu, Wang Quanzhou","doi":"10.1109/ICOT.2014.6956605","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6956605","url":null,"abstract":"We address the problem of precise self-positioning of an autonomous mobile robot. This problem is formulated as a manifold perception algorithm such that the precision position of a mobile robot is evaluated based on the distance from an obstacle, critical features or signs of surroundings and the depth of its surrounding images. We propose to accurately localize the position of a mobile robot using an algorithm that fusing the local plane coordinates information getting from laser ranging and space visual information represented by features of a depth image with variational weights, by which the local distance information of laser ranging and depth vision information are relatively complemented. First, we utilize EKF algorithm on the data gathered by laser to get coarse location of a robot, then open RGB-D camera to capture depth images and we extract SURF features of images, when the features are matched with training examples, the RANSAC algorithm is used to check consistency of spatial structures. Finally, extensive experiments show that our fusion method has significantly improved location results of accuracy compared with the results using either EKF on laser data or SURF features matching on depth images. Especially, experiments with variational fusion weights demonstrated that with this method our robot was capable of accomplishing self-location precisely in real time.","PeriodicalId":343641,"journal":{"name":"2014 International Conference on Orange Technologies","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125685887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-11-20 | DOI: 10.1109/ICOT.2014.6956620
Lin Qiu, Jiahui Lu, C. Chiu
Research has shown that subjective well-being has two related but distinct dimensions: eudaimonic well-being and hedonic well-being. Hedonic well-being refers to one's overall positive affective experiences, while eudaimonic well-being is related to having a meaningful and noble purpose in life. While people strive for a happy and meaningful life, their motivations can be influenced by socio-economic conditions and contexts. In this study, we analyzed word frequencies in the Google Books corpus to measure the changing needs for eudaimonic and hedonic well-being and their relationships with economic growth. Results show that the frequencies of words related to hedonic well-being decrease over the years while those related to eudaimonic well-being increase. Furthermore, when people are poor, their motivation for hedonic well-being is relatively high; this hedonic motivational strength decreases dramatically and becomes stable once income reaches a certain level. In contrast, people have relatively low motivation for eudaimonic well-being when they are poor, and this eudaimonic motivational strength increases dramatically and becomes stable once income reaches a certain level. Our study demonstrates an example of measuring subjective well-being through the analysis of digital media.
{"title":"Detecting the needs for happiness and meaning in life from google books","authors":"Lin Qiu, Jiahui Lu, C. Chiu","doi":"10.1109/ICOT.2014.6956620","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6956620","url":null,"abstract":"Research has shown that subjective well-being has two related but distinct dimensions, eudaimonic well-being and hedonic well-being. Hedonic well-being refers to one's overall positive affective experiences, while eudaimonic well-being is related to having a meaningful and noble purpose for life. While people are striving to have a happy and meaningful life, their motivations can be influenced by socio-economic conditions and contexts. In this study, we analyzed words frequencies in the Google Books corpus to measure the changing needs for eudaimonic and hedonic well-being and their relationships with economic growth. Results show that the frequencies of words related to hedonic well-being decrease while those related to eudaimonic well-being increase over the years. Furthermore, when people are poor, their motivation for hedonic well-being is relatively high. The hedonic motivational strength dramatically decreases and becomes stable when income reaches at a certain level. In contrast, people have relatively low motivation for eudaimonic well-being when they are poor. The eudaimonic motivational strength dramatically increases and becomes stable when income reaches at a certain level. Our study demonstrates an example of measuring subjective well-being through analysis of digital media.","PeriodicalId":343641,"journal":{"name":"2014 International Conference on Orange Technologies","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128729009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-11-20 | DOI: 10.1109/ICOT.2014.6956604
Huo Yuanlian, Qiao Yongfeng
In this paper, a new denoising approach for lightning electric field signals is presented, based on noise-reduction algorithms built on empirical mode decomposition (EMD), a method widely used for analyzing nonlinear and non-stationary data. Data from simulations and measurements were analyzed to evaluate the method against a traditional FIR low-pass filter. The results show that the EMD-based denoising methods give very good results for lightning electric field signals and are effective and superior to the FIR filter method.
{"title":"Enhancement of lightning electric field signals using empirical mode decomposition method","authors":"Huo Yuanlian, Qiao Yongfeng","doi":"10.1109/ICOT.2014.6956604","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6956604","url":null,"abstract":"In this paper, a new lightning electric field signals denoising approach based on noise reduction algorithms in empirical mode decomposition(EMD) which was widely used for analyzing nonlinear and nonstationary data was applied. The data from the simulation and measurements were analyzed to evaluate this method comparing with the traditional FIR low-pass filter. The results showed that the denoising methods based on EMD provides very good results for denoising lightning electric field signals and it was effective and superior to the FIR filter method.","PeriodicalId":343641,"journal":{"name":"2014 International Conference on Orange Technologies","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124055182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-11-20 | DOI: 10.1109/ICOT.2014.6956642
Yuchao Fan, Mingxing Xu, Zhiyong Wu, Lianhong Cai
Emotion recognition from speech plays an important role in developing affective and intelligent human-computer interaction. The goal of this work is to build an Automatic Emotion Variation Detection (AEVD) system that determines each emotionally salient segment in continuous speech. We focus on emotion detection in angry-neutral speech, which is common in recent AEVD studies. This study proposes a novel framework for AEVD using a multi-scaled sliding window (MSW-AEVD) that assigns an emotion class to each window shift by fusing the decisions of all the sliding windows containing that shift. First, a fixed-length sliding window is introduced as the basic procedure, and several different fusion methods are investigated. Then a multi-scaled sliding window is employed to support multiple classifiers with features at different timescales, for which another two fusion strategies are provided. Finally, a post-processing step is applied to refine the final outputs. Performance evaluation is carried out on the public Berlin database EMO-DB. Our experimental results show that the proposed MSW-AEVD significantly outperforms the traditional HMM-based AEVD.
{"title":"Automatic emotion variation detection using multi-scaled sliding window","authors":"Yuchao Fan, Mingxing Xu, Zhiyong Wu, Lianhong Cai","doi":"10.1109/ICOT.2014.6956642","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6956642","url":null,"abstract":"Emotion recognition from speech plays an important role in developing affective and intelligent Human Computer Interaction. The goal of this work is to build an Automatic Emotion Variation Detection (AEVD) system to determine each emotional salient segment in continuous speech. We focus on emotion detection in angry-neutral speech, which is common in recent studies of AEVD. This study proposes a novel framework for AEVD using Multi-scaled Sliding Window (MSW-AEVD) to assign an emotion class to each window-shift by fusion decisions of all the sliding windows containing the shift. Firstly, sliding window with fixed-length is introduced as the basic procedure, in which several different fusion methods are investigated. Then multi-scaled sliding window is employed to support multi-classifiers with different timescale features, in which another two fusion strategies are provided. Finally, a postprocessing is applied to refine the final outputs. Performance evaluation is carried out on the public Berlin database EMO-DB. Our experimental results show that proposed MSW-AEVD significantly outperforms the traditional HMM-based AEVD.","PeriodicalId":343641,"journal":{"name":"2014 International Conference on Orange Technologies","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128745268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-11-20 | DOI: 10.1109/ICOT.2014.6954665
Wang Jian, Zhao Xin-Bo
Recently there has been much research on eye gaze points in naturalistic scenes, but in many of these studies subjects are instructed to view scenes without any particular task. So how do gaze patterns differ in visual search? In this paper, we provide a detailed analysis of the eye gaze points from 11 subjects' eye movements as they performed a search task on 1307 images. The eye-tracking data were analyzed in four aspects: agreement among subjects, center bias, the difference in gaze points between target-present and target-absent stimuli, and the distribution of gaze points in target-present images. The results show that during visual search tasks, in which subjects are asked to find a particular target in a display, the target object plays a dominant role in the guidance of eye movements.
{"title":"Analysis of eye gaze points based on visual search","authors":"Wang Jian, Zhao Xin-Bo","doi":"10.1109/ICOT.2014.6954665","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6954665","url":null,"abstract":"Recently there have been a lot of researches on eye gaze points in naturalistic scenes. But in many of these studies, subjects are instructed to view scenes without any particular task. So what are they different in visual research? In this paper, we provide detailed analysis of the eye gaze points from 11 subjects' eye movements when they performed a search task in 1307 images. The eye tracking data was analyzed in the following four aspects: agreement among subjects, center bias, difference of gaze points for each stimulus between target-present and target absent stimuli and the distribution of gaze points in the target-present image, The results of the analysis show that during visual search tasks, in which subjects are asked to find a particular target in a display, target object playa dominant role in the guidance of eye movements.","PeriodicalId":343641,"journal":{"name":"2014 International Conference on Orange Technologies","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116830672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-11-20 | DOI: 10.1109/ICOT.2014.6954669
Ma Zhong, Zhao Xin-Bo
We aim to build a visual vocabulary by applying a model of visual attention. Concretely, we first learn a computational visual attention model from real eye-tracking data. We then use this model to find the most salient regions in images and extract features from these regions to build a visual vocabulary with more expressive power. An experiment was conducted to verify the effectiveness of the proposed visual-attention-based visual vocabulary. The results show that the proposed vocabulary boosts category-recognition performance, outperforming the traditional vocabulary.
{"title":"Visual attention based visual vocabulary","authors":"Ma Zhong, Zhao Xin-Bo","doi":"10.1109/ICOT.2014.6954669","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6954669","url":null,"abstract":"We aim to build a visual vocabulary by applying a model of visual attention. Concretely, we first learn a computational visual attention model from the real eye tracking data. Then using this model to find the most salient regions in the images, and extracting features from these regions to build a visual vocabulary with more expressive power. The experiment was conducted to verify the effectiveness of the proposed visual attention based visual vocabulary. The results show that the proposed vocabulary boosts the performance of the category recognition, which means the proposed vocabulary outperforms the traditional one.","PeriodicalId":343641,"journal":{"name":"2014 International Conference on Orange Technologies","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124787667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-11-20 | DOI: 10.1109/ICOT.2014.6956621
Wen-Jen Hsieh, S. Tsai, Yuh-Tyng Chen, Jenq-daw Lee, Jun-jen Huang, Ignacio Jose Minambres Garcia
According to the World Health Organization (WHO) and Alzheimer's Disease International (ADI), at least 35.6 million people worldwide suffer from dementia. Mild cognitive impairment (MCI) is considered a risk state or prodrome of dementia. This paper aims to further explore the risk factors of mild cognitive impairment by analyzing longitudinal data from three waves of surveys (1999, 2003 and 2007) of the "Taiwan Longitudinal Study on Aging" (TLSA). A hierarchical linear model (HLM) is applied to TLSA samples of Taiwan's elderly aged 65 and over in 1999. Empirical results suggest that the worsening of cognitive function differs among individuals but clearly increases with age. Depressive symptoms show a statistically significant positive effect while educational attainment shows the opposite direction, whereas gender, marital status, ethnicity, health behavior, and family support are all statistically insignificant.
{"title":"Social determinants of mild cognitive impairment among the elderly — A case study of Taiwan","authors":"Wen-Jen Hsieh, S. Tsai, Yuh-Tyng Chen, Jenq-daw Lee, Jun-jen Huang, Ignacio Jose Minambres Garcia","doi":"10.1109/ICOT.2014.6956621","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6956621","url":null,"abstract":"According to the World Health Organization (WHO) and Alzheimer Disease International (ADI), there are at least 35.6 million people suffering from dementia in the world. Mild cognitive impairment (MCI) is considered as a risk state or a prodromal of dementia. This paper aims to make further exploration into the risk factors of mild cognitive impairment, analyzing the longitudinal data of three waves of surveys in 1999, 2003 and 2007 from “Taiwan Longitudinal Study on Aging” (TLSA). The hierarchical linear model (HLM) is applied to analyze samples from the TLSA of Taiwan's elderly 65 years old and over in 1999. Empirical results suggest that cognitive function worsening differs among individuals, but clearly increases with age. Depressive symptoms show statistically positively significance but educational attainment show the opposite direction, whereas gender, marital status, ethnic, health behavior, and family support are all not statistically significant.","PeriodicalId":343641,"journal":{"name":"2014 International Conference on Orange Technologies","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128859860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-11-20 | DOI: 10.1109/ICOT.2014.6954694
Chen Ding, Yong Xia, Ying Li
This paper proposes a neural network based supervised segmentation algorithm for retinal vessel delineation. The histogram of each training image patch and its optimal threshold, obtained by iteratively comparing the binarization result to the manual segmentation, are fed to a BP neural network to establish the correspondence between the intensity distribution and the optimal segmentation parameter. Each test image can then be segmented using a number of local thresholds predicted by the trained neural network from the histograms of its image patches. The proposed algorithm has been evaluated on the DRIVE database, which contains forty retinal images with manually segmented vessel trees. Our results show that the proposed algorithm can effectively segment the vasculature in retinal images.
{"title":"Supervised segmentation of vasculature in retinal images using neural networks","authors":"Chen Ding, Yong Xia, Ying Li","doi":"10.1109/ICOT.2014.6954694","DOIUrl":"https://doi.org/10.1109/ICOT.2014.6954694","url":null,"abstract":"This paper proposes a neural network based supervised segmentation algorithm for retinal vessel delineation. The histogram of each training image patch and its optimal threshold acquired through iteratively comparing the binaryzation result to the manual segmentation are applied to a BP neural network to establish the correspondence between the intensity distribution and optimal segmentation parameter. Finally, each test image can be segmented by using a number of local thresholds that are predicted by the trained the neural network according the histograms of image patches. The propose algorithm has been evaluated on the DRIVE database that contains forty retinal images with manually segmented vessel trees. Our results show that the proposed algorithm can effective segment the vasculature in retinal images.","PeriodicalId":343641,"journal":{"name":"2014 International Conference on Orange Technologies","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124859064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}