2016 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT): latest articles

Advanced mapping techniques for digital signal processors
T. Fryza, R. Mego
This paper focuses on hardware modeling and algorithm mapping for digital signal processors (DSPs) with a very long instruction word (VLIW) architecture, such as the TMS320C6000. Efficient applications for a target processor are generally developed by combining high- and/or low-level programming languages. Although the capabilities of current processors and compilers keep increasing, programmers still commonly hand-optimize critical parts of digital signal processing algorithms in low-level assembly code. The paper presents the benefit of an auxiliary tool for generating semi-optimal code for the DSP. The tool was used to produce functions for basic vector operations (addition, multiplication, and dot product), and their computing performance was compared with that of the corresponding functions from the TMS320C6000 DSP Library (DSPLIB). In terms of function duration, the proposed routines are faster by 24 CPU cycles on average.
DOI: https://doi.org/10.1109/ISSPIT.2016.7886037 (published 2016-12-01)
Citations: 1
Classical Arabic phoneme contextual analysis using HMM classifiers
Y. Alotaibi, A. Meftah, S. Selouani
This paper presents a phonetic analysis of Arabic phonemes using hidden Markov model (HMM) classifiers and their confusion matrices. For this purpose, a new classical Arabic speech corpus was planned and designed. The corpus is based on recitations of specific scripts from The Holy Quran. Semi-manual labeling and segmentation of the audio files were prepared, along with other language resources such as a word dictionary. Recitations from The Holy Quran are highly indicative of the pronunciation of Arabic phonemes. The classifier results show that the phonemes with the lowest frequencies generally have the highest error rates. Overall, the rates of correct classification are 76.04%, 93.01%, 93.59%, and 92.81% for the monophone, left-context biphone, right-context biphone, and triphone systems, respectively.
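
The relationship between a confusion matrix, the overall rate of correct classification, and per-phoneme error rates can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `classification_rates`, the three labels, and all counts are hypothetical and are not taken from the paper.

```python
def classification_rates(confusion, labels):
    """Return overall accuracy and per-label error rates.

    confusion[i][j] counts tokens whose true label is labels[i]
    and whose predicted label is labels[j].
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(labels)))
    accuracy = correct / total
    error_rates = {}
    for i, label in enumerate(labels):
        row_total = sum(confusion[i])
        if row_total:
            error_rates[label] = 1.0 - confusion[i][i] / row_total
        else:
            # A label that never occurs in the test set has no defined rate.
            error_rates[label] = float("nan")
    return accuracy, error_rates


# Hypothetical counts for three phonemes (assumption, not paper data).
labels = ["a", "b", "q"]
confusion = [
    [90, 5, 5],
    [4, 45, 1],
    [3, 2, 5],
]
accuracy, error_rates = classification_rates(confusion, labels)
```

In this made-up example the rarest label ("q", 10 tokens) also has the highest error rate, which is the kind of pattern the abstract reports.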
DOI: https://doi.org/10.1109/ISSPIT.2016.7886001 (published 2016-12-01)
Citations: 1
Robust Generalized Low Rank Approximation of Matrices for image recognition
H. Nakouri, M. Limam
For a set of 2D objects such as image representations, a 2DPCA approach that computes principal components of row-row and column-column covariance matrices would be more appropriate. The Generalized Low Rank Approximation of Matrices (GLRAM) approach has proved more efficient in computation time and compression ratio than 1D principal component analysis approaches. However, GLRAM fails to account efficiently for noise and outliers. To address this problem, a robust version of GLRAM, called RGLRAM, is proposed. To weaken the effect of noise, we propose a non-greedy iterative approach for GLRAM that maximizes data covariance in the projection subspace and minimizes the construction error. The proposed method is applied to face image recognition and handles noisy data more efficiently than GLRAM. Experiments on three benchmark face databases show that the proposed method achieves substantial results in terms of recognition accuracy, numerical stability, convergence and speed.
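
For orientation, the quantity that standard (non-robust) GLRAM minimizes can be written down directly. The sketch below assumes that the "construction error" in the abstract refers to the usual GLRAM objective, the sum over images of ||A - L (L^T A R) R^T||^2 in the Frobenius norm, with L and R having orthonormal columns. It only evaluates that objective for given L and R; it is not the robust RGLRAM procedure, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(col) for col in zip(*a)]


def glram_error(images, left, right):
    """Sum of squared Frobenius errors of the two-sided projection."""
    total = 0.0
    left_t, right_t = transpose(left), transpose(right)
    for a in images:
        core = matmul(matmul(left_t, a), right)        # M = L^T A R
        approx = matmul(matmul(left, core), right_t)   # L M R^T
        total += sum(
            (a[i][j] - approx[i][j]) ** 2
            for i in range(len(a))
            for j in range(len(a[0]))
        )
    return total


# Toy 2x2 "image" and rank-1 projections (assumed values).
images = [[[1.0, 2.0], [3.0, 4.0]]]
rank1_error = glram_error(images, [[1.0], [0.0]], [[1.0], [0.0]])
identity = [[1.0, 0.0], [0.0, 1.0]]
full_rank_error = glram_error(images, identity, identity)
```

An iterative GLRAM solver would alternate updates of L and R to drive this value down; the abstract's robust variant changes how that iteration treats noise and outliers, which this sketch does not attempt to reproduce.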
DOI: https://doi.org/10.1109/ISSPIT.2016.7886035 (published 2016-12-01)
Citations: 3
Designing an optimal digital bandpass filter for 3D USCT II
M. Zapf, R. Chhabra
3D-USCT-II is a novel imaging method aimed at detecting breast cancer at an early stage by using Synthetic Aperture Focusing Technique (SAFT). The excitation signal (Coded Excitation), used as an input signal, goes to the receiver transducer, and is then fed into a signal processing chain where a digital filter is used with a bandwidth from 1.66 MHz to 3.33 MHz, which is also defined as its digital bandwidth. The analog bandwidth of the signal, however, begins below 1.66 MHz. Therefore, there is considerable loss of bandwidth with the usage of this digital filter. A solution presented here makes use of modulation to assist the bandwidth increase of the digital filter. Results are then compared with metrics defined by SNR, increase in bandwidth, and increase in signal fidelity. The results show an increase in bandwidth by 15.06%, increase in SNR by 7.72% and increase in signal fidelity by 5.76%.
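
As a point of reference for the baseline filter described above, a bandpass FIR filter with a 1.66 MHz to 3.33 MHz passband can be sketched with a windowed-sinc design. Everything beyond the two band edges is an assumption made for illustration: the 10 MHz sampling rate, the 101 taps, and the Hamming window are not stated in the abstract, and the modulation-based extension proposed in the paper is not shown.

```python
import cmath
import math


def bandpass_fir(f_low, f_high, fs, num_taps):
    """Hamming-windowed-sinc bandpass taps (num_taps should be odd)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        if t == 0:
            ideal = 2 * (f_high - f_low) / fs
        else:
            # Difference of two ideal low-pass responses.
            ideal = (
                math.sin(2 * math.pi * f_high * t / fs)
                - math.sin(2 * math.pi * f_low * t / fs)
            ) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def gain(taps, freq, fs):
    """Magnitude of the frequency response at freq (Hz)."""
    return abs(
        sum(h * cmath.exp(-2j * math.pi * freq * n / fs) for n, h in enumerate(taps))
    )


FS = 10e6  # assumed sampling rate in Hz, not given in the abstract
taps = bandpass_fir(1.66e6, 3.33e6, FS, 101)
passband_gain = gain(taps, 2.5e6, FS)
stopband_gain = gain(taps, 0.5e6, FS)
```

Signal content below 1.66 MHz falls in the stopband of such a filter, which is the bandwidth loss the abstract sets out to reduce.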
DOI: https://doi.org/10.1109/ISSPIT.2016.7886030 (published 2016-12-01)
Citations: 0
A speech enhancement algorithm using computational auditory scene analysis with spectral subtraction
Cong Guo, Like Hui, Weiqiang Zhang, Jia Liu
Computational auditory scene analysis (CASA) systems have been widely used for speech enhancement in recent years. We propose a new system that combines CASA and spectral subtraction to obtain better enhanced speech. The CASA part is built on deep neural networks (DNNs). The original way to reconstruct the denoised signal is to apply the estimated masks with a direct overlap-add method, ignoring the noise information within the frames. In our system, we use a Gaussian mixture model to estimate self-adapted thresholds for each channel from the estimated ratio masks (ERMs), in order to separate the noise and speech of each channel. In this way, we make full use of the information within frames. The results show improvements in both objective and subjective evaluation.
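
The two building blocks named in the abstract, ratio masking and spectral subtraction, can each be written in a few lines for a single frame. This is a textbook-style sketch under assumptions: the spectral floor of 0.01, the function names, and the per-bin list representation are illustrative choices, and the DNN mask estimator and GMM-based thresholds of the paper are not reproduced.

```python
def spectral_subtract(noisy_power, noise_power, floor=0.01):
    """Subtract a noise power estimate per frequency bin.

    A small fraction of the noisy power is kept as a floor so that
    no bin becomes negative.
    """
    return [max(y - n, floor * y) for y, n in zip(noisy_power, noise_power)]


def apply_ratio_mask(noisy_magnitude, mask):
    """Scale each bin by a ratio mask value in [0, 1]."""
    return [m * y for y, m in zip(noisy_magnitude, mask)]


# Toy two-bin frame (assumed values).
cleaned_power = spectral_subtract([4.0, 1.0], [1.0, 2.0])
masked_magnitude = apply_ratio_mask([2.0, 4.0], [0.5, 0.25])
```

In the second bin the noise estimate exceeds the noisy power, so the floor takes over; handling such bins sensibly is one reason noise information within each frame matters.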
DOI: https://doi.org/10.1109/ISSPIT.2016.7886000 (published 2016-12-01)
Citations: 3
PCA-Kalman based load forecasting of electric power demand
Lucas D. X. Ribeiro, Jayme Milanezi, J. Costa, W. Giozza, R. K. Miranda, M. Vieira
Electricity demand time series are stochastic processes related to climate, social and economic variables. By predicting the evolution of such time series, electrical load forecasting can be performed in order to support the electrical grid planning. In this paper, we propose a Kalman based load forecasting system for daily demand forecasting. Our proposed approach incorporates a Principal Component Analysis (PCA) of the input variables obtained from linear and nonlinear transformations of the candidate time series. In order to validate our predicting scheme, data collected from Brasília distribution company has been used. Our proposed approach outperforms state-of-the-art approaches based on state space and artificial neural networks.
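
To make the Kalman part of such a forecaster concrete, here is a minimal scalar sketch. It assumes a random-walk model for daily demand, which is a simplification chosen for illustration; the PCA step over transformed input variables and the actual state-space model of the paper are not shown, and all parameter values are hypothetical.

```python
def kalman_forecast(observations, process_var, measurement_var,
                    initial_estimate, initial_var):
    """One-step-ahead forecasts from a scalar random-walk Kalman filter."""
    estimate, variance = initial_estimate, initial_var
    forecasts = []
    for z in observations:
        # Predict: a random walk keeps the estimate and inflates uncertainty.
        variance += process_var
        forecasts.append(estimate)
        # Update with the newly observed load.
        gain = variance / (variance + measurement_var)
        estimate += gain * (z - estimate)
        variance *= 1.0 - gain
    return forecasts, estimate


# Constant synthetic demand of 100 units (assumed data).
forecasts, final_estimate = kalman_forecast(
    [100.0] * 50,
    process_var=1.0,
    measurement_var=4.0,
    initial_estimate=0.0,
    initial_var=1000.0,
)
```

Each forecast is made before the corresponding observation is seen, so `forecasts[0]` is just the initial estimate and later entries track the demand level.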
DOI: https://doi.org/10.1109/ISSPIT.2016.7886010 (published 2016-12-01)
Citations: 4
Bl-GESPAR: A fast SAR imaging algorithm for phase noise mitigation
Qing Zhang, Xunchao Cong, Keyu Long, Yue Yang, Jiangbo Liu, Q. Wan
The quality of synthetic aperture radar (SAR) imagery is often significantly degraded by random phase noise arising from atmospheric turbulence or frequency jitter of the transmit signal during SAR observations. The computation time of traditional phase-retrieval-based SAR autofocus algorithms increases sharply with the size of the scene. In this paper, we recast SAR imaging from phase-corrupted data as a special case of the block-based quadratic compressed sensing (BBQCS) problem. We propose a novel fast SAR imaging algorithm that recovers a well-focused SAR image from the phase-corrupted data and reduces the computation time and memory requirement by several orders of magnitude. Experimental results show that the proposed algorithm not only reduces the computational complexity but also provides satisfactory reconstruction performance.
DOI: https://doi.org/10.1109/ISSPIT.2016.7886043 (published 2016-12-01)
Citations: 2
Integrating machine learning in embedded sensor systems for Internet-of-Things applications
Jongmin Lee, M. Stanley, A. Spanias, C. Tepedelenlioğlu
Interpreting sensor data in Internet-of-Things applications is a challenging problem particularly in embedded systems. We consider sensor data analytics where machine learning algorithms can be fully implemented on an embedded processor/sensor board. We develop an efficient real-time realization of a Gaussian mixture model (GMM) for execution on the NXP FRDM-K64F embedded sensor board. We demonstrate the design of a customized program and data structure that generates real-time sensor features, and we show details and training/classification results for select IoT applications. The integrated hardware/software system enables real-time data analytics and continuous training and re-training of the machine learning (ML) algorithm. The real-time ML platform can accommodate several applications with lower sensor data traffic.
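
The classification step of a GMM-based system is small enough to fit on an embedded target: score a feature under each class model and pick the most likely one. The sketch below uses scalar features and hand-picked parameters purely for illustration; the class names, weights, means, and variances are hypothetical, and the paper's firmware and training procedure for the FRDM-K64F board are not reproduced.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of scalar x under a one-dimensional Gaussian mixture."""
    density = 0.0
    for w, mu, var in zip(weights, means, variances):
        density += (
            w * math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)
        )
    return math.log(density)


def classify(x, models):
    """Return the name of the class model with the highest log-likelihood."""
    return max(models, key=lambda name: gmm_log_likelihood(x, *models[name]))


# Hypothetical per-class models: (weights, means, variances).
models = {
    "idle": ([0.5, 0.5], [0.0, 0.2], [0.01, 0.01]),
    "moving": ([1.0], [1.0], [0.25]),
}
label_low = classify(0.1, models)
label_high = classify(1.2, models)
```

For features far from every component the density can underflow to zero, so a production version would typically work with log-sum-exp instead of summing raw densities.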
DOI: https://doi.org/10.1109/ISSPIT.2016.7886051 (published 2016-12-01)
Citations: 42
Handwriting image enhancement using local learning windowing, Gaussian Mixture Model and k-means clustering
H. Kusetogullari, Håkan Grahn, Niklas Lavesson
In this paper, a new approach is proposed to enhance handwriting images by using learning-based windowing contrast enhancement and a Gaussian Mixture Model (GMM). A fixed-size window moves over the handwriting image, and two quantitative measures, discrete entropy (DE) and edge-based contrast measure (EBCM), are used to estimate the quality of each patch. The obtained results are passed to an unsupervised learning step that uses k-means clustering to label the quality of the handwriting as bad (low contrast) or good (high contrast). If a patch is estimated to be low contrast, a contrast enhancement method is applied to the window to enhance the handwriting. As a final step, the GMM is used to smoothly exchange information between the original and enhanced images and to discard artifacts in the final image. The proposed method has been compared with other contrast enhancement methods on several datasets: Swedish historical documents, DIBCO2010, DIBCO2012 and DIBCO2013. The results illustrate that the proposed method enhances handwriting well compared with existing contrast enhancement methods.
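
One of the two patch quality measures, discrete entropy, can be illustrated on a small patch. The sketch assumes that DE means the Shannon entropy of the patch's gray-level histogram, which is the usual definition in the contrast enhancement literature but is not spelled out in the abstract; the patch values are made up.

```python
import math
from collections import Counter


def discrete_entropy(patch):
    """Shannon entropy in bits of the gray-level histogram of a 2-D patch."""
    pixels = [value for row in patch for value in row]
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A flat patch carries no gray-level variation; a two-level patch carries one bit.
flat_patch = [[128, 128], [128, 128]]
two_level_patch = [[0, 255], [255, 0]]
flat_entropy = discrete_entropy(flat_patch)
two_level_entropy = discrete_entropy(two_level_patch)
```

Per-patch values like these, together with an edge-based measure, are the kind of features a two-cluster k-means step could separate into low-contrast and high-contrast groups.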
DOI: https://doi.org/10.1109/ISSPIT.2016.7886054 (published 2016-12-01)
Citations: 3
A time scale modification with large and varying scaling factors
Kevin Struwe
When it comes to time- and pitch-scale modification algorithms, Phase Vocoder based approaches are widely used. However, a problem with the standard approach is the limited scaling range from 50% to 200%. For an adaptive pitch transposition method for cochlear implants, larger scaling range values that are varying on a by-frame-basis are needed. This paper shows a solution for that problem with adjusted modification and synthesis stages. The developed algorithm has a constant analysis hop size and a varying synthesis hop size. Effectively, interpolated intermediate frames are introduced to compensate for missing signal information. These frames are spaced equidistant with the reciprocal scaling factor and have linear interpolated amplitudes. As a result, the desired characteristics could be achieved by a high quality implementation of the new method. The sound quality could be evaluated at an informal listening test.
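
The idea of interpolated intermediate frames can be sketched for the amplitude part alone. The code below places output frames at positions k / scale along the analysis frame index and linearly interpolates the magnitudes of the two neighbouring analysis frames. This is one reading of the abstract, offered as an assumption: phase handling, windowing, overlap-add, and per-frame variation of the scaling factor are all omitted, and the function name is hypothetical.

```python
import math


def interpolate_frames(magnitudes, scale):
    """Linearly interpolate magnitude frames at positions k / scale."""
    last = len(magnitudes) - 1
    count = int(math.floor(last * scale)) + 1
    frames = []
    for k in range(count):
        position = min(k / scale, last)
        # Clamp so that the upper neighbour always exists.
        lo = min(int(position), last - 1) if last > 0 else 0
        hi = min(lo + 1, last)
        frac = position - lo
        frames.append(
            [(1 - frac) * a + frac * b for a, b in zip(magnitudes[lo], magnitudes[hi])]
        )
    return frames


# Two analysis frames with two bins each, stretched by a factor of 2 (assumed data).
stretched = interpolate_frames([[0.0, 4.0], [2.0, 0.0]], 2.0)
```

With a scale of 2 one intermediate frame appears halfway between the two analysis frames, so the output has three frames for two inputs.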
DOI: https://doi.org/10.1109/ISSPIT.2016.7886033 (published 2016-12-01)
Citations: 1