
2015 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA): Latest Papers

Predicting visual fatigue in 3D image viewing by adjusting the baseline positioning
Sung-muk Kang, H. Cho, Seung-ho Kim, Jong-Hak Kim, Jun-Dong Cho
This paper presents a method for reducing visual fatigue in 3D image viewing. Based on epipolar geometry, we compare our method with the conventional method, which focuses on adjusting the baseline of the image. Our key idea is to scale the baseline in order to adjust the disparity between the left and right images and thereby reduce visual fatigue. Experimental validation indicates that the proposed method can be adopted as a reliable way to reduce visual fatigue in 3D images.
DOI: https://doi.org/10.1109/SPA.2015.7365144 (published 2015-12-28)
Citations: 0
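The abstract above says that scaling the baseline adjusts the disparity between the left and right images. A minimal sketch of that relationship follows, assuming the standard rectified pinhole stereo model (disparity = focal length × baseline / depth). This model, the function names, and all numbers are illustrative assumptions and are not taken from the paper.

```python
def disparity_px(focal_px: float, baseline_m: float, depth_m: float) -> float:
    """Disparity in pixels of a point at depth_m for a rectified stereo pair.

    Assumes the textbook pinhole model d = f * B / Z (not the paper's own formula).
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m


def scaled_disparity_px(focal_px: float, baseline_m: float,
                        depth_m: float, scale: float) -> float:
    """Disparity after multiplying the baseline by `scale`."""
    return disparity_px(focal_px, scale * baseline_m, depth_m)


# Hypothetical camera: 1000 px focal length, 65 mm baseline, point 2 m away.
full = disparity_px(1000.0, 0.065, 2.0)
reduced = scaled_disparity_px(1000.0, 0.065, 2.0, 0.5)
print(f"full baseline: {full:.2f} px, half baseline: {reduced:.2f} px")
```

Under this model disparity is linear in the baseline, so halving the baseline halves every disparity in the scene regardless of depth.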
Application of fast cameras to string vibrations recording
J. Kotus, P. Szczuko, M. Szczodrak, A. Czyżewski
A hardware and software solution for measuring guitar string vibrations with fast cameras is described. An orthogonal setup for 3D image acquisition is proposed, capable of capturing several thousand image frames per second. A dedicated image processing algorithm was developed and is described in the paper; it tracks the movement of selected points along the string. Fast and accurate tracking provided detailed information about the vibrations, which was transformed into sound samples. The described sound processing methods were applied to enable a comparison of the captured string vibrations with the sound recorded using a microphone. An analysis of the results, conclusions, and future work plans are included.
DOI: https://doi.org/10.1109/SPA.2015.7365142 (published 2015-12-28)
Citations: 8
Voice pathologies identification speech signals, features and classifiers evaluation
Hugo Cordeiro, José Fonseca, I. Guimarães, C. Meneses
Voice pathology identification using speech processing methods can serve as a preliminary diagnosis. This study implements a set of identification systems that screen for voice pathologies using voice signal features from the sustained vowel /a/ and from continuous speech. The two signal types are evaluated using three acoustic features applied to four classifiers. Three main classes are identified: physiological disorders, neuromuscular disorders, and healthy subjects. The main objective of this work is to evaluate which voice signal is more reliable for voice pathology diagnosis, which acoustic feature carries more pathology information, and which classifier is best suited to the task. The best overall system accuracy is 77.9%, obtained with Mel-Line Spectrum Frequencies (MLSF) features extracted from continuous speech and applied to a Gaussian Mixture Models (GMM) classifier.
DOI: https://doi.org/10.1109/SPA.2015.7365138 (published 2015-12-28)
Citations: 14
Robustness analysis of automatic speech signal recognition system against factors degrading speech signal
J. Oska, J. Wojtun, K. Wodecki, Z. Piotrowski
This article presents the results of research on the influence of the lossy compression used in the G.711, G.723.1 and iLBC codecs on the efficiency of isolated speech phrase recognition. The robustness against degrading factors of two audio signal parameterisation methods, LPCC (Linear Prediction Cepstral Coefficients) and MFCC (Mel Frequency Cepstral Coefficients), is compared. The research uses an improved Gaussian mixture classifier that incorporates a Universal Background Model, GMM-UBM (Gaussian Mixtures Model - Universal Background Model). The experiments were conducted on a database of 3000 isolated speech phrases.
DOI: https://doi.org/10.1109/SPA.2015.7365136 (published 2015-12-28)
Citations: 0
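To give a feel for the kind of degradation a codec such as G.711 introduces, the sketch below applies the continuous mu-law companding curve with mu = 255 and a uniform 8-bit quantiser. This is an illustration only: the actual G.711 mu-law encoder uses a segmented piecewise-linear approximation of this curve, and the function names and the 127-level quantiser here are assumptions, not part of the paper or the standard.

```python
import math

MU = 255.0  # companding parameter of the mu-law curve


def mu_compress(x: float) -> float:
    """Map a sample in [-1, 1] onto the continuous mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_expand(y: float) -> float:
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def roundtrip_8bit(x: float) -> float:
    """Compress, quantise to 127 levels per sign (illustrative), then expand."""
    levels = 127
    quantised = round(mu_compress(x) * levels) / levels
    return mu_expand(quantised)


for sample in (0.01, 0.1, 0.5):
    decoded = roundtrip_8bit(sample)
    print(f"in={sample:.4f} out={decoded:.4f} error={decoded - sample:+.5f}")
```

The quantisation step is what makes the round trip lossy; the companding curve only redistributes that error so that quiet samples are represented more finely than loud ones.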
A relay selection method for bidirectional wireless cooperative networks based on the log-likelihood ratio
Wassim Alexan, Ahmed El Mahdy
Applying network coding to a wireless cooperative communication network always proves beneficial in terms of spatial diversity gains, improved coverage and channel capacity. In this paper, a relay selection method for a bidirectional (two-way) wireless cooperative communication system is proposed. A single best relay node or a set of relay nodes is selected to jointly forward the combined data streams from two users. The method is based on the value of the log-likelihood ratio of the received signal at each relay node in the system. Performance is measured in terms of bit error rate (BER) curves and outage probability curves. A comparison with opportunistic relaying is carried out.
DOI: https://doi.org/10.1109/SPA.2015.7365148 (published 2015-12-28)
Citations: 1
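The abstract above bases relay selection on the log-likelihood ratio (LLR) of the signal received at each relay. The sketch below shows one plausible reading of that idea under simplifying assumptions that are not stated in the abstract: BPSK symbols, a real-valued channel gain per relay, additive Gaussian noise, and "best" meaning the largest LLR magnitude. All function names are hypothetical.

```python
def bpsk_llr(y: float, gain: float, noise_var: float) -> float:
    """LLR of a BPSK symbol s in {+1, -1} observed as y = gain * s + n.

    With n ~ N(0, noise_var), ln p(y|s=+1)/p(y|s=-1) = 2 * gain * y / noise_var.
    """
    return 2.0 * gain * y / noise_var


def select_relay(received, gains, noise_var):
    """Return (index of the relay with the largest |LLR|, list of all LLRs).

    Picking the largest magnitude is an assumed selection rule for illustration.
    """
    llrs = [bpsk_llr(y, h, noise_var) for y, h in zip(received, gains)]
    best = max(range(len(llrs)), key=lambda i: abs(llrs[i]))
    return best, llrs


# Hypothetical received samples at three relays with unit channel gains.
best, llrs = select_relay([0.2, -1.5, 0.9], [1.0, 1.0, 1.0], noise_var=0.5)
print("selected relay:", best, "LLRs:", llrs)
```

A large LLR magnitude means the relay is confident about the bit it received, which is why it is a natural reliability measure for choosing which relay forwards the data.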
Approach to local contrast enhancement
A. Konieczka, J. Balcerek, Agata Chmielewska, A. Dabrowski
This article presents a simple and fully automatic approach to local contrast enhancement that maintains reasonable computational complexity and can be computed in parallel on multiple processors. The authors propose an algorithm that produces images with an artificially increased dynamic range. The resulting images do not contain unnatural artifacts and are close to images as perceived by humans. Experimental results show that the described method performs very well even for images obtained with a large range of exposure differences.
DOI: https://doi.org/10.1109/SPA.2015.7365106 (published 2015-12-28)
Citations: 3
Irregular sampling for X-ray imaging simulation
Michael Kröger, M. Rosenbaum, W. Sauer-Greff, R. Urbansky, M. Lorang, M. Siegrist
Simulated X-ray images can be computed efficiently using raytracing, a technique well established in 3D computer graphics and rendering. Since raytracing is a discrete technique, it is prone to aliasing artefacts. However, irregular sampling is able to mitigate this problem. This paper describes the influence of the probability density function of the sampling process on the reconstructed spectral density. It is demonstrated that irregular sampling can be used in X-ray imaging simulation to reduce the impact of aliasing.
DOI: https://doi.org/10.1109/SPA.2015.7365140 (published 2015-12-28)
Citations: 2
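The claim in the abstract above, that irregular sampling mitigates aliasing, can be illustrated with a one-dimensional toy experiment. The sketch samples a cosine whose frequency equals the sampling rate: on a regular grid it aliases to a constant (DC), while with uniformly jittered sample positions the estimated DC level stays close to the true value of zero. The jitter distribution, signal, and names are assumptions chosen for illustration and do not reproduce the paper's X-ray simulation.

```python
import math
import random


def mean_sample(times, freq_hz):
    """Average of cos(2*pi*f*t) over the given sample times (a DC estimate)."""
    return sum(math.cos(2.0 * math.pi * freq_hz * t) for t in times) / len(times)


n, fs = 10_000, 1.0
signal_freq = 1.0  # equal to the sampling rate, i.e. far above Nyquist (fs / 2)

rng = random.Random(0)  # fixed seed so the run is repeatable
regular = [k / fs for k in range(n)]
# Jittered grid: one sample placed uniformly at random inside each sampling cell.
jittered = [(k + rng.random()) / fs for k in range(n)]

dc_regular = mean_sample(regular, signal_freq)
dc_jittered = mean_sample(jittered, signal_freq)
print(f"regular grid DC estimate:  {dc_regular:.4f}")
print(f"jittered grid DC estimate: {dc_jittered:.4f}")
```

The regular grid reports a strong, coherent but false constant component, whereas jittering trades that structured alias for a small noise-like error, which is usually far less objectionable in an image.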
The renal vessel segmentation for facilitation of partial nephrectomy
Katarzyna Bugajska, A. Skalski, Janusz Gajda, T. Drewniak
In this article we propose several image processing techniques that enable the extraction of 3D tumor-affected renal vascularity from CT scans in order to facilitate partial nephrectomy. Knowing which vessels supply the tumor is crucial for avoiding ischemic injury and allows the use of the selective clamping method. Until now, however, renal vascularity has been analyzed only on the basis of visualization, with its limitations. Our method consists of the following steps: binarization based on the image intensity histogram; erosion, to eliminate connections between different structures; segmentation by a proposed locally adaptive region growing algorithm; and finally segmentation by the level set method, using a variational approach that incorporates the Chan-Vese model and image gradient information into the energy functional. The proposed set of image processing techniques allowed us to obtain 3D renal vessel segmentations and to identify target vessels. The results were validated on manually segmented, randomly chosen slices from the computed tomography scans of ten different patients. Segmentation effectiveness, measured by the Dice coefficient, is 0.838.
DOI: https://doi.org/10.1109/SPA.2015.7365112 (published 2015-12-28)
Citations: 6
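The renal vessel abstract above reports its result as a Dice coefficient. For readers unfamiliar with the measure, here is a minimal sketch of how it is computed for two binary masks, using the standard definition 2|A ∩ B| / (|A| + |B|). The flat-list mask representation and the convention of returning 1.0 for two empty masks are assumptions made for this example.

```python
def dice_coefficient(pred, truth) -> float:
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    pred_bits = [bool(p) for p in pred]
    truth_bits = [bool(t) for t in truth]
    intersection = sum(p and t for p, t in zip(pred_bits, truth_bits))
    total = sum(pred_bits) + sum(truth_bits)
    # Two empty masks agree perfectly; treat that case as full overlap.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical 2x2 masks flattened row by row.
automatic = [1, 1, 0, 0]
manual = [1, 0, 1, 0]
print(dice_coefficient(automatic, manual))
```

The value ranges from 0 (no overlap) to 1 (identical masks), so a reported 0.838 indicates substantial but not perfect agreement with the manual segmentation.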
Influence of simultaneous spoken sentences on the properties of spectral peaks
T. Maka, Miroslaw Lazoryszczak
This study describes an approach to analysing the properties of spectral peaks of simultaneously talking speakers in a monophonic audio signal. We propose a technique based on spectral peak tracking and attributes calculated from a histogram of the peaks. Spectral peaks are estimated from a linear prediction-based spectral envelope for each frame of the source signal. The features are computed from the histogram in different frequency bands. The statistical properties of the obtained features are used to determine their relationship with the number of speech sources. The proposed approach was tested on a dedicated database of sentences spoken by same-gender and mixed-gender groups, with the number of speakers varying from two to twelve. Different configuration parameters, such as frame size, histogram bin width and linear prediction order, were used in the experiments. The results show that the trends of the statistical descriptors are directly connected with the number of voice sources. The proposed descriptors and the regression analysis performed can serve as a basis for estimating the number of speakers in a single audio stream.
DOI: https://doi.org/10.1109/SPA.2015.7365139 (published 2015-12-28)
Citations: 3
Simple object coordination tracking based on background modeling
H. Cho, Sang-Hyeop Song, Jong-Hak Kim, Solima, Jun-Dong Cho
In this paper, we propose a simple object tracking method based on background modeling and histogram matching. Unlike the existing block-based background modeling methods, most research focuses on background subtraction. However, that approach has problems with visual object tracking and yields unreliable coordinate information. In this work, we implement background modeling and generate a 'background map' to reduce the processing time for labeling. The 'background map' is then used by the labeling algorithm to find the foreground and obtain object coordinate information in each frame. In addition, we track moving objects between the previous frame and the current frame using histogram matching. In real-time processing of a 640*480 resolution video, the processing time is within 19 ms using Parallel Studio.
DOI: https://doi.org/10.1109/SPA.2015.7365143 (published 2015-12-28)
Citations: 1
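The tracking abstract above associates objects across frames by histogram matching. A minimal sketch of that association step follows, assuming grayscale pixel lists per object, 16-bin normalised histograms, histogram intersection as the similarity measure, and a greedy best-match rule. None of these choices is confirmed by the abstract; they are hypothetical stand-ins to show the idea.

```python
def gray_histogram(pixels, bins: int = 16):
    """Normalised histogram of 8-bit grayscale values (0..255)."""
    if not pixels:
        raise ValueError("object region has no pixels")
    hist = [0] * bins
    for value in pixels:
        hist[min(value * bins // 256, bins - 1)] += 1
    return [count / len(pixels) for count in hist]


def intersection(h1, h2) -> float:
    """Histogram intersection: 1.0 for identical normalised histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def match_objects(prev_objects, curr_objects, bins: int = 16):
    """For each current object, return the index of the most similar previous one."""
    prev_hists = [gray_histogram(obj, bins) for obj in prev_objects]
    matches = []
    for obj in curr_objects:
        hist = gray_histogram(obj, bins)
        best = max(range(len(prev_hists)),
                   key=lambda i: intersection(hist, prev_hists[i]))
        matches.append(best)
    return matches


# Hypothetical pixel values of two labelled objects in consecutive frames.
previous = [[10, 12, 14, 200], [240, 250, 245, 30]]
current = [[245, 248, 252, 28], [11, 13, 15, 205]]
print(match_objects(previous, current))
```

Comparing histograms rather than raw pixels makes the match tolerant to small shifts and deformations of the object between frames, at the cost of ignoring spatial layout.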