Pub Date : 2017-05-01 DOI: 10.1109/ATSIP.2017.8075528
Kadria Ezzine, M. Frikha
Speaker identity, the sound of a person's voice, is one of the most important characteristics in human communication. Voice conversion (VC) is an emerging problem in voice and speech processing that deals with modifying a speaker's identity. More specifically, the speech signal uttered by the source speaker is modified to sound as if it had been pronounced by another speaker, referred to as the target speaker. A variety of VC techniques have been proposed since the voice conversion problem first appeared. The choice among those techniques represents a compromise between the similarity of the converted voice to the target voice and the quality of the output speech signal, both of which depend on the technique used. In this paper, we present a comprehensive review of state-of-the-art voice conversion techniques, pointing out their advantages and disadvantages. These techniques can be applied in many areas of speech technology, with applications that go well beyond speech synthesis.
Title : A comparative study of voice conversion techniques: A review Venue : 2017 International Conference on Advanced Technologies for Signal and Image Processing (ATSIP)
Pub Date : 2017-05-01 DOI: 10.1109/ATSIP.2017.8075583
Soukaina El Idrissi El Kaitouni, A. Abbad, H. Tairi
In this article, we propose an automatic method for the detection and extraction of tumors in mammogram images. Most tumor detection methods require the extraction of a large number of texture features through multiple calculations. The method first preprocesses the images with Otsu thresholding to remove extraneous items. After thresholding, we estimate the number of base classes using the LBP (Local Binary Pattern) technique. To automate the initialization task, classification is performed with dynamic k-means, and the resulting classes are refined with a Markov method. We then compute the correlation between these classes and the original image to deduce the class that contains the tumor and the pectoral muscle. Finally, region growing is used to eliminate the pectoral muscle. The results obtained with this approach show the quality and accuracy of the tumor extraction compared to existing approaches in the literature.
Title : Tumor extraction and elimination of pectoral muscle based on hidden Markov and region growing: Applied based MIAS
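The Otsu thresholding step named in the abstract above can be illustrated with a minimal sketch. This is a generic textbook version, not the authors' implementation; the function name `otsu_threshold`, the flat list of integer grey levels, and the 256-level default are assumptions made for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximizes the between-class variance.

    `pixels` is a flat list of integers in [0, levels).
    Pixels with value <= t form class 0; all others form class 1.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in class 0
    sum0 = 0  # sum of grey levels in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break             # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor 1/total**2
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical data: a dark background and a bright structure
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
mask = [p > t for p in pixels]  # True marks the bright class
```

Because the threshold is derived from the histogram alone, no parameter has to be tuned per image, which fits the abstract's goal of a fully automatic pipeline.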
Pub Date : 2017-05-01 DOI: 10.1109/ATSIP.2017.8075548
T. Frikha, Faten Chaabane, Boukhchim Said, Hassen Drira, Mohamed Abid, C. Amar, Lifl Lille
The development of embedded multimedia applications continues to flourish. A biometric facial recognition system can be used not only on PCs but also in embedded systems, where it can help meet security and surveillance needs. The facial recognition analysis consists of four steps: face analysis, facial expression recognition, missing data completion and full face recognition. This paper proposes a hardware architecture based on the adaptation of an algorithm that has shown good face detection and recognition performance in 3D space. The proposed application was tested using a co-design technique based on a mixed hardware/software architecture on an FPGA platform.
Title : Embedded approach for a Riemannian-based framework of analyzing 3D faces
Pub Date : 2017-05-01 DOI: 10.1109/ATSIP.2017.8075597
Khaoula Tbarki, S. B. Said, Riadh Ksantini, Z. Lachiri
Ground Penetrating Radar (GPR) has been a valuable tool for humanitarian demining. The GPR scans the ground and delivers a three-dimensional matrix representing three types of data: Ascan, Bscan and Cscan. The Ascan data represents the reflected response to a pulse emitted by the GPR at a given position. In the proposed landmine detection method, the Ascan data is normalized and then classified using a kernel-based One-Class Support Vector Machine (OSVM). The main advantage of OSVM is that it handles unbalanced data, which is not the case for multiclass SVM. Our landmine detection method was tested and evaluated on the MACADAM database, which is composed of 11 scenarios of landmines and 3 scenarios of inoffensive objects (wood stick, soda can, pine, stone). Experimental results show the superiority of the RBF-kernel OSVM over multiclass SVMs based on other kernel functions in terms of classification accuracy, especially as landmine data is unbalanced.
Title : Landmine detection improvement using one-class SVM for unbalanced data
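As a rough illustration of the two ingredients named in the abstract above, the sketch below normalizes a single A-scan trace and evaluates an RBF kernel between two traces. The abstract does not say which normalization is used, so the zero-mean, unit-variance scaling is an assumption, as are the function names and the `gamma` value. Training the one-class SVM itself is not shown.

```python
import math


def normalize_ascan(ascan):
    """Scale one A-scan trace to zero mean and unit variance (assumed scheme)."""
    n = len(ascan)
    mean = sum(ascan) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in ascan) / n)
    if std == 0.0:
        return [0.0] * n  # constant trace: nothing to scale
    return [(v - mean) / std for v in ascan]


def rbf_kernel(x, y, gamma=0.1):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2); gamma is a hypothetical value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical traces: the kernel is 1 for identical inputs and decays with distance
a = normalize_ascan([0.1, 0.4, 0.9, 0.4, 0.1])
b = normalize_ascan([0.2, 0.2, 0.3, 0.8, 0.2])
similarity = rbf_kernel(a, b)
```

A one-class SVM is fitted on samples of a single class only, using kernel values such as these, which is why it does not need a balanced set of examples from every class.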
Pub Date : 2017-05-01 DOI: 10.1109/ATSIP.2017.8075530
A. Chetouani
Blur is certainly one of the most frequently encountered and most annoying types of degradation in images. It has several causes, such as compression, motion and filtering. Several metrics have been proposed in the literature to estimate the quality of images degraded in this way. In this paper, we focus on stereoscopic images and propose a fusion-based blind stereoscopic image quality metric for blur degradation. To characterize this degradation type, some relevant features are first computed. These features are extracted from a cyclopean image (CI) derived from the stereoscopic image. The final quality index is obtained by combining all features through a Support Vector Machine (SVM) model used as a regression tool. The 3D LIVE and the IEEE image databases have been used to evaluate our method. The achieved performance has been compared with state-of-the-art methods.
Title : A fusion-based blind image quality metric for blurred stereoscopic images
Pub Date : 2017-05-01 DOI: 10.1109/ATSIP.2017.8075612
O. Dorgham
Medical image segmentation provides vital information for surgical diagnosis and usually demands high accuracy. A fully automated computed tomography image segmentation method is proposed. The method is unsupervised and automatically estimates the parameters required to identify the human body as a region of interest. The proposed methodology consists of four steps. First, a body region of interest is masked by a method based on thresholding and basic morphological operations. Second, a body region of interest is identified using chain codes and a method for collecting adjacent contours. Next, the background non-regions of interest are identified using an entropy algorithm. Finally, the human body segment is identified using a GrabCut algorithm. According to the visual evaluation results, the segmentation of the human body from the computed tomography images was precise and accurate. The analysis provided evidence that the human body segmentation method could be applied to segmenting other organs, registering different image modalities or speeding up the generation of digitally reconstructed radiographs.
Title : Automatic body segmentation from computed tomography image
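The first step described in the abstract above, masking by thresholding followed by basic morphological operations, can be sketched as follows. The abstract does not give the threshold, the structuring element or the exact sequence of operations, so the 3x3 square element, the opening operation and all function names here are assumptions for illustration.

```python
def threshold_mask(image, t):
    """Binary mask: 1 where the pixel value exceeds t, else 0."""
    return [[1 if v > t else 0 for v in row] for row in image]


def _morph(mask, rule):
    """Apply `rule` (all or any) over each pixel's 3x3 neighbourhood."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Pixels outside the image are treated as background (0)
            window = [
                mask[j][i] if 0 <= j < h and 0 <= i < w else 0
                for j in range(y - 1, y + 2)
                for i in range(x - 1, x + 2)
            ]
            out[y][x] = 1 if rule(window) else 0
    return out


def erode(mask):
    return _morph(mask, all)   # keep a pixel only if its whole neighbourhood is set


def dilate(mask):
    return _morph(mask, any)   # set a pixel if any neighbour is set


def opening(mask):
    """Erosion followed by dilation: removes specks smaller than the 3x3 element."""
    return dilate(erode(mask))


# Hypothetical 6x6 slice: a 3x3 bright block plus one isolated bright pixel
image = [[0] * 6 for _ in range(6)]
for y in range(1, 4):
    for x in range(1, 4):
        image[y][x] = 100
image[5][5] = 100
body_mask = opening(threshold_mask(image, 50))
```

In this toy example the opening keeps the 3x3 block and discards the isolated pixel, which is the kind of clean-up a mask usually needs before contour tracing.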
Pub Date : 2017-05-01 DOI: 10.1109/ATSIP.2017.8075541
Hinda Dridi, K. Ouni
In this paper we describe a systematic procedure for implementing a two-stage keyword spotting (KWS) system. In the first stage, a phonetic decoding of continuous speech is obtained using a CD-DNN-HMM model built with the Kaldi toolkit. In the second stage, the resulting phonetic transcriptions are used to construct a system that searches for keywords embedded in continuous speech using a classification and regression tree (CART) implemented in MATLAB. The work is carried out on the TIMIT database.
Title : Hybrid context dependent CD-DNN-HMM keywords spotting on continuous speech
Pub Date : 2017-05-01 DOI: 10.1109/ATSIP.2017.8075532
Z. Nejim, Makrem Mestiri, H. Amiri
In this paper, a new approach for 3D skeleton-based human motion recognition is discussed. First, the movement is represented as a set of body joint trajectories. Those trajectories are then converted into ropes histograms. The motion records are obtained using the Kinect motion sensor. The classification phase consists of comparing those histograms with the ropes histograms of a set of reference motions. The method is then tested on a random dataset of recorded motions and achieves an accuracy rate of 85%.
Title : Use of ropes histograms as joints trajectories representation for human motion recognition
Pub Date : 2017-05-01 DOI: 10.1109/ATSIP.2017.8075562
I. Bakkouri, K. Afdel
This paper presents a novel deep learning approach for classifying tumors in mammograms as malignant or benign. Deep learning is a modern machine learning method that promises to create models that learn from large datasets and make accurate predictions. In this study, we propose a discriminative objective for supervised feature learning by training a Convolutional Neural Network (CNN). A CNN requires input images of a fixed size; consequently, we equip our networks with a scaling process based on Gaussian pyramids to obtain regions of interest of normalized size. The dataset used in this research is augmented by applying geometric transformations in order to prevent overfitting and create a robust deep learning model. Classification is performed with a softmax layer, which is also used to train the CNN. We evaluate our methodology on two publicly available datasets, DDSM and BCDR. In comparison with current state-of-the-art methods, the experiments show that our proposed system provides good results, achieving an accuracy of 97.28%, which can assist radiologists in making diagnostic decisions without increasing false negatives.
Title : Breast tumor classification based on deep convolutional neural networks
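The softmax layer mentioned in the abstract above maps the network's raw output scores to class probabilities. The sketch below shows the standard numerically stable form for a two-class case. The score values and the class ordering (benign, malignant) are hypothetical and not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1.

    Subtracting the maximum first avoids overflow in exp() for large scores
    and does not change the result.
    """
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical output scores for the classes (benign, malignant)
scores = [0.3, 2.1]
probs = softmax(scores)
predicted = max(range(len(probs)), key=probs.__getitem__)  # index of the most likely class
```

During training, these probabilities are typically compared with the true label through a cross-entropy loss; at test time only the most likely class is reported.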
Pub Date : 2017-05-01 DOI: 10.1109/ATSIP.2017.8075578
A. Elhanaoui, G. Maze, E. Aassif, D. Decultot
In this paper, the smoothed pseudo Wigner-Ville (SPWD) and reassigned spectrogram (RSPD) time-frequency distributions were developed and used to analyze an acoustic signal scattered from a thin cylindrical metallic tube immersed in water. The studied tube is made of two parts. The results suggest that time-frequency methods are suitable for finding the various resonances of circumferential waves that propagate around the shell, and that they give better agreement with the theoretical results.
Title : Usefulness of time-frequency treatment to an acoustic signal
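To give a concrete sense of what a time-frequency distribution computes, the sketch below builds a plain spectrogram from a windowed discrete Fourier transform. It is neither the smoothed pseudo Wigner-Ville distribution nor the reassigned spectrogram used in the paper, only the basic spectrogram that the reassigned version starts from. The frame length, hop size and test tone are hypothetical.

```python
import cmath
import math


def spectrogram(signal, frame_len=32, hop=16):
    """Return one list of squared DFT magnitudes (bins 0..frame_len//2) per frame."""
    # Hann window to reduce spectral leakage at the frame edges
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT of the windowed frame at frequency bin k
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc) ** 2)
        frames.append(row)
    return frames


# Hypothetical test tone: 4 cycles per 32-sample frame, so energy should peak at bin 4
tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
S = spectrogram(tone)
peak_bins = [row.index(max(row)) for row in S]
```

For a scattered acoustic signal, ridges in such a map trace how the energy of each resonance evolves over time. The distributions used in the paper aim to make those ridges sharper than a plain spectrogram does.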