Pub Date: 2015-12-28 DOI: 10.1109/SPA.2015.7365144
Sung-muk Kang, H. Cho, Seung-ho Kim, Jong-Hak Kim, Jun-Dong Cho
This paper presents a method for reducing visual fatigue in 3D image viewing. Based on epipolar geometry, we compare our approach with the conventional method, focusing on adjusting the baseline of the image. Our key idea is to scale the baseline in order to adjust the disparity between the left and right images and thereby reduce visual fatigue. Experimental validation indicates that the proposed method can be adopted as a reliable way to reduce visual fatigue in 3D images.
Title: Predicting visual fatigue in 3D image viewing by adjusting the baseline positioning. Published in: 2015 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA).
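As a rough illustration of why scaling the baseline changes the disparity, the sketch below uses the standard rectified pinhole stereo relation d = f * B / Z. The function names, the focal length, and the 40-pixel comfort limit are assumptions made for this example and are not taken from the paper.

```python
def disparity_px(focal_px: float, baseline_m: float, depth_m: float) -> float:
    """Disparity (pixels) of a point at depth_m for a rectified stereo pair."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m


def baseline_scale_for_limit(focal_px, baseline_m, nearest_depth_m, max_disparity_px):
    """Scale factor s <= 1 that keeps the nearest point's disparity within the limit."""
    d = disparity_px(focal_px, baseline_m, nearest_depth_m)
    return min(1.0, max_disparity_px / d)


# Hypothetical numbers: 1000 px focal length, 65 mm baseline, nearest object at 1 m.
s = baseline_scale_for_limit(1000.0, 0.065, 1.0, max_disparity_px=40.0)
scaled = disparity_px(1000.0, 0.065 * s, 1.0)  # disparity after scaling the baseline
```

Because disparity is linear in the baseline under this model, scaling the baseline by s scales every disparity in the scene by the same factor.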
Pub Date: 2015-12-28 DOI: 10.1109/SPA.2015.7365142
J. Kotus, P. Szczuko, M. Szczodrak, A. Czyżewski
A hardware and software solution for measuring guitar string vibrations with fast cameras is described. An orthogonal setup for 3D image acquisition is proposed, capable of capturing several thousand image frames per second. A dedicated image processing algorithm was developed and is described in the paper, aimed at tracking the movement of selected points along the string. Fast and accurate tracking provided detailed information about the vibrations, which was transformed into sound samples. The described sound processing methods were applied to enable a comparison of the captured string vibrations with the sound recorded using a microphone. An analysis of the obtained results, conclusions, and future work plans are included.
Title: Application of fast cameras to string vibrations recording
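The abstract does not say how the tracked displacement is turned into sound samples. One plausible minimal sketch, assuming the tracker yields one displacement value per frame, is to remove the resting offset and peak-normalise the track. The frame rate, vibration frequency, and function name below are hypothetical.

```python
import math


def displacement_to_samples(displacement):
    """Remove the DC offset and peak-normalise a displacement track to [-1, 1]."""
    mean = sum(displacement) / len(displacement)
    centred = [d - mean for d in displacement]
    peak = max(abs(c) for c in centred)
    if peak == 0:
        return [0.0] * len(centred)
    return [c / peak for c in centred]


# Hypothetical track: a 110 Hz vibration filmed at 4000 frames per second,
# offset by 120 px (the resting string position in the image).
FPS = 4000
track = [120.0 + 3.0 * math.sin(2 * math.pi * 110 * n / FPS) for n in range(FPS)]
samples = displacement_to_samples(track)
```

In this sketch the camera frame rate acts as the audio sampling rate, so comparing against a microphone recording would additionally require resampling to a common rate.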
Pub Date: 2015-12-28 DOI: 10.1109/SPA.2015.7365138
Hugo Cordeiro, José Fonseca, I. Guimarães, C. Meneses
Voice pathology identification using speech processing methods can serve as a preliminary diagnosis. This study implements a set of identification systems to screen for voice pathologies using voice signal features from the sustained vowel /a/ and from continuous speech. The two speech tasks are evaluated using three acoustic features applied to four classifiers. Three main classes are identified: physiological disorders, neuromuscular disorders, and healthy subjects. The main objective of this work is to evaluate which voice signal is more reliable for voice pathology diagnosis, which acoustic feature carries more pathology information, and which classifier is best suited to the task. The best overall system accuracy is 77.9%, obtained with the Mel-Line Spectrum Frequencies (MLSF) feature extracted from continuous speech and applied to a Gaussian Mixture Model (GMM) classifier.
Title: Voice pathologies identification speech signals, features and classifiers evaluation
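To make the GMM classification step concrete, here is a minimal sketch of maximum-likelihood classification with one mixture model per class. It assumes a single scalar feature and hand-picked model parameters, both hypothetical. The study itself uses MLSF feature vectors and trained models, which are not reproduced here.

```python
import math


def gmm_log_likelihood(x, components):
    """Log-likelihood of scalar x under a 1-D GMM given as (weight, mean, std) tuples."""
    total = 0.0
    for weight, mean, std in components:
        z = (x - mean) / std
        total += weight * math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))
    return math.log(total)


def classify(frames, class_models):
    """Pick the class whose GMM gives the highest summed log-likelihood over frames."""
    scores = {
        label: sum(gmm_log_likelihood(x, gmm) for x in frames)
        for label, gmm in class_models.items()
    }
    return max(scores, key=scores.get)


# Hypothetical, hand-picked models over a single scalar feature.
MODELS = {
    "healthy": [(0.6, 0.0, 1.0), (0.4, 1.0, 0.5)],
    "physiological": [(0.5, 4.0, 1.0), (0.5, 5.0, 1.0)],
    "neuromuscular": [(0.7, -4.0, 1.0), (0.3, -6.0, 1.5)],
}
label = classify([0.2, 0.8, -0.1], MODELS)
```

Summing per-frame log-likelihoods treats the frames as independent, which is a common simplification in frame-based speech classification.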
Pub Date: 2015-12-28 DOI: 10.1109/SPA.2015.7365136
J. Oska, J. Wojtun, K. Wodecki, Z. Piotrowski
This article presents the results of research on the influence of lossy compression, as used in the G.711, G.723.1 and iLBC codecs, on the efficiency of isolated speech phrase recognition. The robustness against degrading factors of two audio signal parameterisation methods, LPCC and MFCC (Linear Prediction Cepstral Coefficients, Mel Frequency Cepstral Coefficients), is compared. The research is based on a Gaussian mixture classifier improved with a Universal Background Model, GMM-UBM (Gaussian Mixture Model - Universal Background Model). The research was conducted on a database of 3000 isolated speech phrases.
Title: Robustness analysis of automatic speech signal recognition system against factors degrading speech signal
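To give a feel for the kind of degradation a lossy codec introduces, the sketch below applies mu-law companding (the companding law behind one variant of G.711) followed by uniform 8-bit quantisation. It uses the continuous companding formula rather than the bit-exact segmented encoding of the standard, and the function names are assumptions for illustration. G.723.1 and iLBC work very differently and are not modelled here.

```python
import math

MU = 255.0  # companding constant used by mu-law G.711


def mulaw_compress(x):
    """Map x in [-1, 1] to [-1, 1] with logarithmic (mu-law) companding."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y):
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def codec_round_trip(x, bits=8):
    """Compress, quantise uniformly to `bits` bits, then expand again."""
    levels = 2 ** (bits - 1) - 1  # 127 steps per polarity for 8 bits
    q = round(mulaw_compress(x) * levels) / levels
    return mulaw_expand(q)
```

Passing each sample of a phrase through `codec_round_trip` before feature extraction would give a simple way to compare how LPCC and MFCC features shift under quantisation noise.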
Pub Date: 2015-12-28 DOI: 10.1109/SPA.2015.7365148
Wassim Alexan, Ahmed El Mahdy
Applying network coding to a wireless cooperative communication network always proves beneficial in terms of gains in spatial diversity, improved coverage and channel capacity. In this paper, a relay-selection method for a bidirectional (two-way) wireless cooperative communication system is proposed. A single best relay node, or a set of relay nodes, is selected to jointly forward the combined data streams from two users. The method is based on the value of the log-likelihood ratio of the received signal at each relay node in the system. Performance is measured in terms of bit error rate (BER) curves and outage probability curves. A comparison with opportunistic relaying is carried out.
Title: A relay selection method for bidirectional wireless cooperative networks based on the log-likelihood ratio
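A minimal sketch of LLR-based relay selection follows. It assumes BPSK symbols over a real-valued AWGN channel, where the log-likelihood ratio is 2hy/sigma^2, and it assumes the relay with the largest LLR magnitude is chosen. The abstract does not state the modulation, the exact selection rule, or the network coding operation, so these choices and the XOR combiner are illustrative assumptions.

```python
def bpsk_llr(received, channel_gain, noise_var):
    """LLR of a BPSK symbol (+1/-1) seen through a real gain in AWGN."""
    return 2.0 * channel_gain * received / noise_var


def select_relay(observations, noise_var):
    """Return the index of the relay with the largest |LLR|.

    observations: list of (received_sample, channel_gain) pairs, one per relay.
    """
    magnitudes = [abs(bpsk_llr(y, h, noise_var)) for y, h in observations]
    return max(range(len(magnitudes)), key=magnitudes.__getitem__)


def network_code(bit_a, bit_b):
    """Combine the two users' bits with XOR before forwarding (assumed combiner)."""
    return bit_a ^ bit_b


# Hypothetical observations at three relays.
best = select_relay([(0.2, 1.0), (-0.9, 1.2), (0.5, 0.3)], noise_var=0.5)
```

A large LLR magnitude indicates a confident decision at that relay, which is why it is a natural reliability measure for selection.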
Pub Date: 2015-12-28 DOI: 10.1109/SPA.2015.7365106
A. Konieczka, J. Balcerek, Agata Chmielewska, A. Dabrowski
This article presents a simple and fully automatic approach to local contrast enhancement that maintains reasonable computational complexity and can be computed in parallel on multiple processors. The authors propose an algorithm that produces images with an artificially increased dynamic range. The resulting images do not contain unnatural artifacts and are close to the images perceived by humans. Experimental results show that the described method performs very well even for images obtained with a large range of exposure differences.
Title: Approach to local contrast enhancement
Pub Date: 2015-12-28 DOI: 10.1109/SPA.2015.7365140
Michael Kröger, M. Rosenbaum, W. Sauer-Greff, R. Urbansky, M. Lorang, M. Siegrist
X-ray images can be simulated efficiently using raytracing, a technique well established in 3D computer graphics and rendering. Since raytracing is a discrete technique, it is prone to aliasing artefacts. However, irregular sampling can mitigate this problem. This paper describes the influence of the probability density function of the sampling process on the reconstructed spectral density. It is demonstrated that irregular sampling can be used in X-ray imaging simulation to reduce the impact of aliasing.
Title: Irregular sampling for X-ray imaging simulation
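The effect can be sketched in one dimension. Below, a 9 Hz sinusoid is sampled on a 10 Hz grid, once regularly and once with uniformly distributed jitter, and the strength of the 1 Hz alias is measured in both cases. The uniform jitter density, the frequencies, and the function name are assumptions for this example. The paper studies the influence of the sampling density in general, in the context of raytracing.

```python
import cmath
import math
import random


def alias_magnitude(sample_times, signal_hz, probe_hz, grid_step):
    """Magnitude of the probe_hz component when samples are placed on a regular grid."""
    n = len(sample_times)
    acc = 0j
    for k, t in enumerate(sample_times):
        value = math.sin(2 * math.pi * signal_hz * t)  # sample the continuous signal
        acc += value * cmath.exp(-2j * math.pi * probe_hz * k * grid_step)
    return abs(acc) / n


N, STEP = 1000, 0.1  # 10 Hz grid, so the Nyquist limit is 5 Hz
rng = random.Random(0)
regular = [k * STEP for k in range(N)]
jittered = [k * STEP + rng.uniform(-STEP / 2, STEP / 2) for k in range(N)]

# A 9 Hz signal folds down to 1 Hz on the regular grid.
regular_alias = alias_magnitude(regular, 9.0, 1.0, STEP)
jittered_alias = alias_magnitude(jittered, 9.0, 1.0, STEP)
```

With regular sampling the alias shows up as a coherent 1 Hz component, while jittering the sample positions attenuates that component and spreads its energy into broadband noise.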
Pub Date: 2015-12-28 DOI: 10.1109/SPA.2015.7365112
Katarzyna Bugajska, A. Skalski, Janusz Gajda, T. Drewniak
In this article we propose several image processing techniques that enable the extraction of 3D tumor-affected renal vascularity from CT scans in order to facilitate partial nephrectomy. Knowing which vessels supply the tumor is crucial for avoiding ischemic injury and allows the use of the selective clamping method. However, until now renal vascularity has been analyzed only on the basis of visualization, with its limitations. Our novel method consists of the following steps: binarization based on the image intensity histogram; erosion, to eliminate connections between different structures; segmentation by a proposed locally adaptive region growing algorithm; and finally segmentation by the level set method, using a variational approach that incorporates the Chan-Vese model and image gradient information into the energy functional. The proposed set of image processing techniques allowed us to obtain 3D segmentations of the renal vessels and to identify target vessels. The results were validated on manually segmented, randomly chosen slices from the computed tomography scans of ten different patients. The segmentation effectiveness, measured by the Dice coefficient, is 0.838.
Title: The renal vessel segmentation for facilitation of partial nephrectomy
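For reference, the Dice coefficient used for validation is twice the overlap of two masks divided by the sum of their sizes. The sketch below computes it for flattened binary masks. The example masks are made up, and the convention of returning 1.0 for two empty masks is an assumption.

```python
def dice_coefficient(predicted, reference):
    """Dice = 2 * |A and B| / (|A| + |B|) for two equally sized binary masks."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    total = sum(map(bool, predicted)) + sum(map(bool, reference))
    return 1.0 if total == 0 else 2.0 * overlap / total


# Hypothetical flattened masks (1 = vessel, 0 = background).
auto = [1, 1, 1, 0, 0, 1, 0, 0]
manual = [1, 1, 0, 0, 1, 1, 0, 0]
score = dice_coefficient(auto, manual)  # 3 shared voxels, 4 + 4 labelled voxels
```

A value of 1.0 means the automatic and manual segmentations coincide exactly, and 0.0 means they do not overlap at all.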
Pub Date: 2015-12-28 DOI: 10.1109/SPA.2015.7365139
T. Maka, Miroslaw Lazoryszczak
This study describes an approach to analysing the properties of spectral peaks of simultaneously talking speakers in a monophonic audio signal. We propose a technique based on spectral peak tracking and on attributes calculated from a histogram of the peaks. Spectral peaks are estimated using a linear prediction-based spectral envelope for each frame of the source signal. The features are computed from the histogram in different frequency bands. The statistical properties of the obtained features are used to determine their relationship with the number of speech sources. The proposed approach has been tested using a dedicated database featuring sentences spoken by same-gender and mixed-gender groups, where the number of speakers varies from two to twelve. Different configuration parameters, such as the frame size, the bin width of the histogram and the linear prediction order, were used in the experiments. The results show that the trends of the statistical descriptors are directly connected with the number of voice sources. The proposed descriptors and the regression analysis performed can serve as a basis for estimating the number of speakers in a single audio stream.
Title: Influence of simultaneous spoken sentences on the properties of spectral peaks
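The peak histogram idea can be sketched as follows, assuming the spectral envelope of each frame is already available as a list of magnitudes. The linear prediction step is omitted, and the function names, example envelopes, and choice of descriptors (mean and variance of the bin counts) are hypothetical.

```python
def find_peaks(envelope):
    """Indices of strict local maxima in a sampled spectral envelope."""
    return [i for i in range(1, len(envelope) - 1)
            if envelope[i - 1] < envelope[i] > envelope[i + 1]]


def peak_histogram(frames, bin_width, num_bins):
    """Count peak positions over all frames in bins of `bin_width` samples."""
    counts = [0] * num_bins
    for envelope in frames:
        for idx in find_peaks(envelope):
            counts[min(idx // bin_width, num_bins - 1)] += 1
    return counts


def mean_and_variance(counts):
    """Simple statistical descriptors of the histogram."""
    mean = sum(counts) / len(counts)
    return mean, sum((c - mean) ** 2 for c in counts) / len(counts)


# Hypothetical envelopes for two frames, 8 frequency samples each.
frames = [[0, 2, 1, 0, 3, 1, 0, 0],
          [0, 3, 1, 0, 0, 0, 2, 0]]
hist = peak_histogram(frames, bin_width=2, num_bins=4)
```

Descriptors like these could then be regressed against the known number of speakers, which is the kind of relationship the study investigates.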
Pub Date: 2015-12-28 DOI: 10.1109/SPA.2015.7365143
H. Cho, Sang-Hyeop Song, Jong-Hak Kim, Solima, Jun-Dong Cho
In this paper, we propose a simple object tracking method based on background modeling and histogram matching. In contrast to existing block-based background modeling methods, most research focuses on background subtraction, which has problems with visual object tracking and yields unreliable coordinate information. In this work, we implement background modeling and generate a 'background map' to reduce the processing time for labeling. The 'background map' is then used by the labeling algorithm to find the foreground and obtain object coordinate information in each frame. In addition, we track moving objects between the previous frame and the current frame using histogram matching. In real-time processing of a 640x480 video, the processing time is within 19 ms using parallel studio.
Title: Simple object coordination tracking based on background modeling
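The frame-to-frame matching step can be sketched as below, assuming each labeled object is available as a flat list of gray levels and that histogram intersection is the similarity measure. The abstract does not name the measure, so this choice, the bin count, and the example patches are assumptions.

```python
def gray_histogram(pixels, bins=8, max_value=256):
    """Normalised intensity histogram of a flat list of gray levels."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    return [c / len(pixels) for c in counts]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def match_object(previous_patch, current_patches):
    """Index of the current-frame patch most similar to the previous-frame object."""
    ref = gray_histogram(previous_patch)
    scores = [intersection(ref, gray_histogram(p)) for p in current_patches]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical patches: a mostly dark object and two candidates in the next frame.
prev = [10, 20, 30, 40, 200, 15, 25, 35]
candidates = [[220, 230, 240, 250, 210, 225, 235, 245],
              [12, 22, 28, 45, 190, 18, 27, 33]]
best = match_object(prev, candidates)
```

Because histograms ignore pixel positions, this kind of matching tolerates small shape changes between frames, at the cost of confusing objects with similar intensity distributions.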