
2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP): Latest Papers

Object classification with convolution neural network based on the time-frequency representation of their echo
Mariia Dmitrieva, Matias Valdenegro-Toro, K. Brown, G. Heald, D. Lane
This paper presents the classification of spherical objects with different physical properties. The classification is based on the energy distribution in wideband pulses that have been scattered from the objects. The echo is represented in the Time-Frequency Domain (TFD), using the Short Time Fourier Transform (STFT) with different window lengths, and is fed into a Convolution Neural Network (CNN) for classification. The results for different window lengths are analysed to study the influence of time and frequency resolution on classification. The CNN achieves its best results, an accuracy of (98.44 ± 0.8)% over 5 object classes, when trained on grayscale TFD images with an STFT window length of 0.1 ms. The CNN is compared with a Multilayer Perceptron classifier, a Support Vector Machine, and Gradient Boosting.
DOI: 10.1109/MLSP.2017.8168134 (https://doi.org/10.1109/MLSP.2017.8168134). Published: 2017-09-01. Pages: 1-6.
Citations: 13
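The trade-off between time and frequency resolution that the abstract above studies can be made concrete with a little arithmetic. The following is a minimal sketch, not the authors' code: the function name `stft_resolution` and the 1 MHz sampling rate are assumptions chosen for illustration, since the abstract does not state the sampling rate. It shows that a shorter STFT window localises the echo better in time but widens each frequency bin.

```python
def stft_resolution(window_s, sample_rate_hz):
    """Return (samples per window, time resolution in s, frequency bin width in Hz).

    For a window of N samples at sampling rate fs, the window spans N / fs
    seconds and the DFT bins are spaced fs / N Hz apart.
    """
    n_samples = max(1, round(window_s * sample_rate_hz))
    time_res_s = n_samples / sample_rate_hz
    freq_res_hz = sample_rate_hz / n_samples
    return n_samples, time_res_s, freq_res_hz


# Hypothetical sampling rate; the abstract does not give one.
SAMPLE_RATE_HZ = 1_000_000

for window_ms in (0.05, 0.1, 0.5):
    n, dt, df = stft_resolution(window_ms / 1000.0, SAMPLE_RATE_HZ)
    print(f"window {window_ms} ms: {n} samples, bin width {df:.0f} Hz")
```

Under these assumptions, halving the window length doubles the frequency bin width, which is why the choice of window length changes what the classifier can see in the TFD image.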
Speech recognition features based on deep latent Gaussian models
Andros Tjandra, S. Sakti, Satoshi Nakamura
This paper constructs speech features based on a generative model using a deep latent Gaussian model (DLGM), which is trained using the stochastic gradient variational Bayes (SGVB) algorithm and performs efficient approximate inference and learning with a directed probabilistic graphical model. The trained DLGM then generates latent variables based on a Gaussian distribution, which are used as new features for a deep neural network (DNN) acoustic model. Here we compare our results with and without features transformed by the DLGM and also observe the benefits of combining both the proposed and original features into a single DNN. Our experimental results show that the proposed features using the DLGM improved the ASR performance. Furthermore, the DNN acoustic model that combined the proposed and original features gave the best performance.
DOI: 10.1109/MLSP.2017.8168174 (https://doi.org/10.1109/MLSP.2017.8168174). Published: 2017-09-01. Pages: 1-6.
Citations: 0
A linear stochastic state space model for electrocardiograms
Kimmo Suotsalo, S. Särkkä
This paper proposes a linear stochastic state space model for electrocardiogram signal processing and analysis. The model is obtained as a discretized version of the Wiener process acceleration model. The model is combined with a fixed-lag Rauch-Tung-Striebel smoother to perform on-line signal denoising, feature extraction, and beat classification. The results indicate that the proposed approach outperforms a conventional FIR filter in terms of improved signal-to-noise ratio, and that the approach can be used for highly accurate online classification of normal beats and premature ventricular contractions. The benefits of the model include the possibility to use closed-form solutions to the optimal filtering and smoothing problems, quick adaptation to sudden changes in beat morphology and heart rate, simple and fast initialization, preprocessing-free operation, intuitive interpretation of the system state, and more.
DOI: 10.1109/MLSP.2017.8168126 (https://doi.org/10.1109/MLSP.2017.8168126). Published: 2017-09-01. Pages: 1-6.
Citations: 3
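The discretized Wiener process acceleration model mentioned in the abstract above has a well-known closed form. The sketch below is an illustration under stated assumptions, not the paper's implementation: it assumes a scalar signal with state [value, first derivative, second derivative], driven by white noise of spectral density `q` on the second derivative. The function names, the step size, and the value of `q` are all hypothetical.

```python
def wpa_discretization(dt, q):
    """Transition matrix A and process noise covariance Q for a scalar
    Wiener process acceleration model sampled every dt seconds.

    State: [value, first derivative, second derivative].
    Q is the integral over [0, dt] of q * g(s) g(s)^T with g(s) = [s^2/2, s, 1].
    """
    A = [
        [1.0, dt, dt ** 2 / 2.0],
        [0.0, 1.0, dt],
        [0.0, 0.0, 1.0],
    ]
    Q = [
        [q * dt ** 5 / 20.0, q * dt ** 4 / 8.0, q * dt ** 3 / 6.0],
        [q * dt ** 4 / 8.0, q * dt ** 3 / 3.0, q * dt ** 2 / 2.0],
        [q * dt ** 3 / 6.0, q * dt ** 2 / 2.0, q * dt],
    ]
    return A, Q


def predict_mean(A, state):
    """One-step prediction of the state mean: A @ state."""
    return [sum(a * s for a, s in zip(row, state)) for row in A]


# Hypothetical values: 2 ms sampling interval, unit noise density.
A, Q = wpa_discretization(dt=0.002, q=1.0)
print(predict_mean(A, [0.5, 10.0, 0.0]))
```

Because both matrices are available in closed form, a Kalman filter and Rauch-Tung-Striebel smoother built on them need no numerical integration, which matches the abstract's point about closed-form filtering and smoothing solutions.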
Learning embeddings for speaker clustering based on voice equality
Y. X. Lukic, Carlo Vogt, Oliver Durr, Thilo Stadelmann
Recent work has shown that convolutional neural networks (CNNs) trained in a supervised fashion for speaker identification are able to extract features from spectrograms which can be used for speaker clustering. These features are represented by the activations of a certain hidden layer and are called embeddings. However, previous approaches require plenty of additional speaker data to learn the embedding, and although the clustering results are then on par with more traditional approaches using MFCC features etc., room for improvement stems from the fact that these embeddings are trained with a surrogate task that is rather far away from segregating unknown voices, namely identifying a few specific speakers. We address both problems by training a CNN to extract embeddings that are similar for equal speakers (regardless of their specific identity) using weakly labeled data. We demonstrate our approach on the well-known TIMIT dataset that has often been used for speaker clustering experiments in the past. We exceed the clustering performance of all previous approaches, but require just 100 instead of 590 unrelated speakers to learn an embedding suited for clustering.
DOI: 10.1109/MLSP.2017.8168166 (https://doi.org/10.1109/MLSP.2017.8168166). Published: 2017-09-01. Pages: 1-6.
Citations: 13
Neonatal seizure detection using convolutional neural networks
Alison O'Shea, G. Lightbody, G. Boylan, A. Temko
This study presents a novel end-to-end architecture that learns hierarchical representations from raw EEG data using fully convolutional deep neural networks for the task of neonatal seizure detection. The deep neural network acts as both feature extractor and classifier, allowing for end-to-end optimization of the seizure detector. The designed system is evaluated on a large dataset of continuous unedited multichannel neonatal EEG totaling 835 hours and comprising 1389 seizures. The proposed deep architecture, with sample-level filters, achieves an accuracy that is comparable to the state-of-the-art SVM-based neonatal seizure detector, which operates on a set of carefully designed hand-crafted features. The fully convolutional architecture allows for the localization of EEG waveforms and patterns that result in high seizure probabilities for further clinical examination.
DOI: 10.1109/MLSP.2017.8168193 (https://doi.org/10.1109/MLSP.2017.8168193). Published: 2017-09-01. Pages: 1-6.
Citations: 40
Detecting malignant ventricular arrhythmias in electrocardiograms by Gaussian process classification
Kimmo Suotsalo, S. Särkkä
Ventricular tachycardia, ventricular flutter, and ventricular fibrillation are malignant forms of cardiac arrhythmias, whose occurrence may be a life-threatening event. Several methods exist for detecting these arrhythmias in the electrocardiogram. However, the use of Gaussian process classifiers in this context has not been reported in the current literature. In comparison to the popular support vector machines, Gaussian processes have the advantage of being fully probabilistic, they can be recast in a state-space form compatible with Bayesian filtering, and they can be flexibly combined with first-principles physical models. In this paper we use Gaussian process classification to detect malignant ventricular arrhythmias in the electrocardiogram. We describe how Gaussian process classifiers can be used to solve the detection problem, and show that the proposed classifiers achieve a performance that is comparable to that of the state-of-the-art methods, thereby laying promising foundations for a more general electrocardiogram-based arrhythmia detection framework.
DOI: 10.1109/MLSP.2017.8168160 (https://doi.org/10.1109/MLSP.2017.8168160). Published: 2017-09-01. Pages: 1-5.
Citations: 5
Predicting individualized intelligence quotient scores using brainnetome-atlas based functional connectivity
R. Jiang, S. Qi, Yuhui Du, Weizheng Yan, V. Calhoun, T. Jiang, J. Sui
Variation in several brain regions and neural parameters is associated with intelligence. In this study, we adopted functional connectivity (FC) based on the Brainnetome atlas to predict intelligence quotient (IQ) scores quantitatively with a prediction framework incorporating advanced feature selection and regression methods. We compared the prediction performance of five regression models and evaluated the effectiveness of feature selection. The best prediction performance was achieved by ReliefF+LASSO, by which correlations of r=0.72 and r=0.46 between predicted and true values were obtained for 174 female and 186 male subjects respectively in a leave-one-out cross-validation, suggesting that for female subjects, a better prediction of IQ scores can be achieved using precise FCs. Further, weight analysis revealed the most predictive FCs and the relevant regions. Results support the hypothesis that intelligence is characterized by interaction between multiple brain regions, especially the areas implicated by the parieto-frontal integration theory. This study facilitates our understanding of the biological basis of intelligence by individualized prediction.
DOI: 10.1109/MLSP.2017.8168150 (https://doi.org/10.1109/MLSP.2017.8168150). Published: 2017-09-01. Pages: 1-6.
Citations: 5
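The evaluation protocol in the abstract above (leave-one-out cross-validation scored by the correlation between predicted and true values) can be sketched in a few lines. This is a simplified illustration only: it swaps the paper's ReliefF+LASSO pipeline for a one-feature least-squares line, and the helper names and toy data are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loocv_predictions(xs, ys):
    """Predict each subject with a model fitted on all other subjects."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)
    return predictions


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Toy data standing in for one connectivity feature and IQ scores.
feature = [0.10, 0.25, 0.30, 0.45, 0.50, 0.70]
scores = [95.0, 102.0, 99.0, 110.0, 108.0, 121.0]
print(pearson_r(loocv_predictions(feature, scores), scores))
```

The key property is that each subject's score is predicted by a model that never saw that subject, so the reported correlation reflects out-of-sample behaviour.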
Discriminating bipolar disorder from major depression based on kernel SVM using functional independent components
Shuang Gao, E. Osuch, M. Wammes, J. Théberge, T. Jiang, V. Calhoun, J. Sui
Bipolar disorder (BD) and major depressive disorder (MDD) share depressive symptoms, so discriminating between them in early depressive episodes is a major clinical challenge. Independent components (ICs) extracted from fMRI data have been shown to carry distinguishing information and can be used for classification. Here we extend a previous method that makes use of multiple fMRI ICs to build a linear subspace for each individual, which is then used as input for classifiers. The similarity matrix between different subjects is first calculated using a distance metric based on principal angles, and is then projected into kernel space for support vector machine (SVM) classification among 37 BD and 36 MDD patients. In practice, we adopt a forward selection technique on 20 ICs and nested 10-fold cross-validation to select the most discriminative combinations of fMRI ICs, and determine the final diagnosis by a majority voting mechanism. The results on human data demonstrate that the proposed method achieves much better performance than its initial version [8] (93% vs. 75%), and identifies 5 discriminative fMRI components for distinguishing BD and MDD patients, which are mainly located in the prefrontal cortex, default mode network, thalamus, etc. This work provides a new framework for helping diagnose new patients with overlapping symptoms of BD and MDD. It not only adds to our understanding of functional deficits in mood disorders, but the identified components may also serve as potential biomarkers for their differential diagnosis.
DOI: 10.1109/MLSP.2017.8168110 (https://doi.org/10.1109/MLSP.2017.8168110). Published: 2017-09-01. Pages: 1-6.
Citations: 14
Gaussian density guided deep neural network for single-channel speech enhancement
Li Chai, Jun Du, Yannan Wang
Recently, the minimum mean squared error (MMSE) has been a benchmark optimization criterion for deep neural network (DNN) based speech enhancement. In this study, a probabilistic learning framework to estimate the DNN parameters for single-channel speech enhancement is proposed. First, the statistical analysis shows that the prediction error vector at the DNN output closely follows a unimodal density for each log-power spectral component. Accordingly, we present a maximum likelihood (ML) approach to DNN parameter learning by characterizing the prediction error vector as a multivariate Gaussian density with a zero mean vector and an unknown covariance matrix. It is demonstrated that the proposed learning approach can achieve a better generalization capability than MMSE-based DNN learning for unseen noise types, and can significantly reduce the speech distortions in low SNR environments.
DOI: 10.1109/MLSP.2017.8168116 (https://doi.org/10.1109/MLSP.2017.8168116). Published: 2017-09-01. Pages: 1-6.
Citations: 11
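The difference between the MMSE criterion and the Gaussian maximum likelihood criterion in the abstract above can be illustrated with a small sketch. It is not the paper's training objective: it assumes a diagonal covariance for simplicity (the abstract speaks of a general unknown covariance matrix), and the function names and toy error vectors are hypothetical.

```python
import math


def mse_loss(errors):
    """Mean squared error over all frames and spectral components."""
    count = sum(len(frame) for frame in errors)
    return sum(v * v for frame in errors for v in frame) / count


def ml_variances(errors):
    """Closed-form ML estimate of the per-component variance
    for zero-mean Gaussian prediction errors."""
    n_frames = len(errors)
    n_dims = len(errors[0])
    return [sum(frame[d] ** 2 for frame in errors) / n_frames
            for d in range(n_dims)]


def diag_gaussian_nll(errors, variances):
    """Negative log-likelihood of the errors under a zero-mean Gaussian
    with diagonal covariance (assumed here for simplicity)."""
    nll = 0.0
    for frame in errors:
        for value, var in zip(frame, variances):
            nll += 0.5 * (math.log(2.0 * math.pi * var) + value * value / var)
    return nll


# Toy prediction errors: 3 frames, 2 log-power spectral components.
errors = [[0.2, -1.0], [-0.1, 0.8], [0.3, -1.2]]
variances = ml_variances(errors)
print(mse_loss(errors), diag_gaussian_nll(errors, variances))
```

With equal, fixed variances the likelihood criterion differs from the squared error only by a scale and a constant. Letting the variances be estimated reweights each spectral component by its own error spread, which is the intuition behind replacing the MMSE criterion with a likelihood-based one.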
An accelerated Newton's method for projections onto the ℓ1-ball
P. Rodríguez
We present a simple and computationally efficient algorithm, based on the accelerated Newton's method, to solve the root-finding problem associated with the projection onto the ℓ1-ball. Considering an interpretation of Michelot's algorithm as a Newton method, our algorithm can be understood as an accelerated version of Michelot's algorithm that needs significantly fewer major iterations to converge to the solution. Although the worst-case performance of the proposed algorithm is O(n^2), it exhibits O(n) performance in practice, and it is empirically demonstrated to be competitive with or faster than existing methods.
DOI: 10.1109/MLSP.2017.8168161 (https://doi.org/10.1109/MLSP.2017.8168161). Published: 2017-09-01. Pages: 1-6.
Citations: 7
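For context on the abstract above, the baseline it builds on can be sketched briefly. The code below is a plain version of Michelot's algorithm for projecting a vector onto the ℓ1-ball, not the accelerated method the paper proposes. The function name and the example vector are hypothetical.

```python
import math


def project_l1_ball(y, radius=1.0):
    """Euclidean projection of y onto {x : sum(|x_i|) <= radius}
    using Michelot's iterative thresholding."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    abs_y = [abs(v) for v in y]
    if sum(abs_y) <= radius:
        return list(y)  # already inside the ball

    # Repeatedly drop entries that the current threshold would zero out,
    # then recompute the threshold on the survivors.
    active = abs_y
    tau = (sum(active) - radius) / len(active)
    while True:
        kept = [v for v in active if v > tau]
        if len(kept) == len(active):
            break
        active = kept
        tau = (sum(active) - radius) / len(active)

    # Soft-threshold the magnitudes and restore the original signs.
    return [math.copysign(max(a - tau, 0.0), v) for a, v in zip(abs_y, y)]


x = project_l1_ball([3.0, 1.0, -2.0], radius=2.0)
print(x, sum(abs(v) for v in x))
```

Each pass of the loop updates the threshold `tau` from the surviving entries. This is the step the abstract interprets as a Newton iteration on the root-finding problem, and the step the paper's accelerated variant aims to need fewer of.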