
Latest publications from 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Constant False Alarm Rate for Online one Class Svm Learning
Yongjian Xue, P. Beauseroy
Many one-class SVM applications require online learning techniques when time series data are encountered. Most existing methods for online SVM learning are based on the C-SVM formulation and do not adapt the constraint parameter dynamically as the number of training samples increases. In that case, the false alarm rate of the one-class SVM gradually decreases while the missed alarm rate increases. In most applications we prefer relatively stable performance, especially a stable false alarm rate. To solve that problem, we propose an online version of v-OeSVM. Experiments on toy and real datasets show that v-OeSVM is a good means to target a given false alarm rate, while the AUC increases slowly as the number of new samples grows.
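The authors' v-OeSVM solver is not reproduced here; as a rough, assumed illustration of how a ν-parameterised one-class SVM pins the false alarm rate while the training set grows, the sketch below refits scikit-learn's OneClassSVM (a batch stand-in, not an online solver) on an expanding window of simulated nominal data and reports the empirical false alarm rate at each step.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
target_far = 0.05                      # desired false alarm rate on nominal data

# Hypothetical "nominal" time-series chunks arriving online.
stream = [rng.normal(0.0, 1.0, size=(200, 2)) for _ in range(5)]

seen = np.empty((0, 2))
for t, chunk in enumerate(stream):
    seen = np.vstack([seen, chunk])
    # nu upper-bounds the fraction of training points flagged as outliers,
    # so it approximately fixes the false alarm rate as the sample count grows.
    model = OneClassSVM(kernel="rbf", gamma="scale", nu=target_far).fit(seen)
    far = np.mean(model.predict(seen) == -1)       # empirical false alarm rate
    print(f"step {t}: n={len(seen)}, empirical FAR={far:.3f}")
```

Unlike a C-parameterised formulation, the ν parameter keeps the flagged-outlier fraction roughly constant as samples accumulate, which is the stability property the abstract targets.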
{"title":"Constant False Alarm Rate for Online one Class Svm Learning","authors":"Yongjian Xue, P. Beauseroy","doi":"10.1109/ICASSP.2018.8462022","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462022","url":null,"abstract":"Many one class SVM applications require online learning technique when time series data are encountered. Most of the existing methods for online SVM learning are based on C SVM without adapting the constraint parameter dynamically as the number of training samples increases. In such case the false alarm rate decreases while the miss alarm rate increases gradually for one class SVM. In most applications we prefer a relatively stable performance, especially the false alarm rate. In order to solve that problem, we propose an online version of v-OeSVM. Experiments on toy and real datasets show that v-OeSVM is a good mean to target a given false alarm rate while the AUC increases slowly as the number of new samples increases.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"30 1","pages":"2821-2825"},"PeriodicalIF":0.0,"publicationDate":"2018-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81231986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Low-Overhead Receiver-Side Channel Tracking for Mmwave Mimo
Karthik Upadhya, S. Vorobyov, R. Heath
Millimeter wave (mmWave) multiple-input multiple-output (MIMO) transceivers employ narrow beams to obtain a large array-gain, rendering them sensitive to changes in the angles of arrival and departure of the paths. Since the singular vectors that span the channel subspace are used to design the precoder and combiner, we propose a method to track the receiver-side channel subspace during data transmission using a separate radio frequency (RF) chain dedicated for channel tracking. Under certain conditions on the transmit precoder, we show that the receiver-side channel subspace can be estimated during data transmission without knowing the structure of the precoder or the transmitted data. The performance of the proposed method is evaluated through simulations.
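As a simplified, assumed illustration of receiver-side subspace estimation (not the paper's dedicated-RF-chain scheme, and without the stated precoder conditions), the snippet below takes the dominant left singular vectors of a block of received samples collected during data transmission and checks how well they align with the true channel subspace.

```python
import numpy as np

rng = np.random.default_rng(1)
nr, nt, L, T = 16, 8, 2, 200          # rx/tx antennas, paths, snapshots (toy sizes)

# Toy geometric channel: L paths with random rx/tx steering vectors.
a_rx = (rng.standard_normal((nr, L)) + 1j * rng.standard_normal((nr, L))) / np.sqrt(2)
a_tx = (rng.standard_normal((nt, L)) + 1j * rng.standard_normal((nt, L))) / np.sqrt(2)
H = a_rx @ np.diag(rng.standard_normal(L) + 1j * rng.standard_normal(L)) @ a_tx.conj().T

# Received samples during data transmission: y_t = H x_t + noise,
# where x_t is the (unknown) precoded symbol vector.
X = rng.standard_normal((nt, T)) + 1j * rng.standard_normal((nt, T))
Y = H @ X + 0.05 * (rng.standard_normal((nr, T)) + 1j * rng.standard_normal((nr, T)))

# Receiver-side subspace estimate: dominant left singular vectors of the samples.
U_hat = np.linalg.svd(Y, full_matrices=False)[0][:, :L]
U_true = np.linalg.svd(H, full_matrices=False)[0][:, :L]
# Alignment via principal angles (1.0 means the subspaces coincide).
s = np.linalg.svd(U_true.conj().T @ U_hat, compute_uv=False)
print("subspace alignment (1.0 = perfect):", float(np.min(s)))
```

Because only the receive-side samples are used, neither the precoder structure nor the transmitted data enter the estimate, mirroring the property claimed in the abstract.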
{"title":"Low-Overhead Receiver-Side Channel Tracking for Mmwave Mimo","authors":"Karthik Upadhya, S. Vorobyov, R. Heath","doi":"10.1109/ICASSP.2018.8461320","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8461320","url":null,"abstract":"Millimeter wave (mmWave) multiple-input multiple-output (MIMO) transceivers employ narrow beams to obtain a large array-gain, rendering them sensitive to changes in the angles of arrival and departure of the paths. Since the singular vectors that span the channel subspace are used to design the precoder and combiner, we propose a method to track the receiver-side channel subspace during data transmission using a separate radio frequency (RF) chain dedicated for channel tracking. Under certain conditions on the transmit precoder, we show that the receiver-side channel subspace can be estimated during data transmission without knowing the structure of the precoder or the transmitted data. The performance of the proposed method is evaluated through simulations.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"36 1","pages":"3859-3863"},"PeriodicalIF":0.0,"publicationDate":"2018-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81508181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
On Compressive Sensing of Sparse Covariance Matrices Using Deterministic Sensing Matrices
Alihan Kaplan, V. Pohl, Dae Gwan Lee
This paper considers the problem of determining the sparse covariance matrix $\mathbf{X}$ of an unknown data vector $\pmb{x}$ by observing the covariance matrix $\mathbf{Y}$ of a compressive measurement vector $\pmb{y}=\mathbf{A}\pmb{x}$. We construct deterministic sensing matrices $\mathbf{A}$ for which the recovery of a $k$-sparse covariance matrix $\mathbf{X}$ from $m$ values of $\mathbf{Y}$ is guaranteed with high probability. In particular, we show that the number of measurements $m$ scales linearly with the sparsity $k$.
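As an assumed numerical check of the measurement model (a random A stands in for the paper's deterministic constructions), the snippet below verifies the Kronecker identity vec(Y) = (A ⊗ A) vec(X) implied by Y = A X Aᵀ, which is what turns sparse covariance recovery into a standard sparse linear inverse problem.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, k = 12, 5, 3                      # ambient dim, measurements, sparsity (toy sizes)

A = rng.standard_normal((m, n))         # stand-in for a deterministic sensing matrix

# k-sparse symmetric covariance-like matrix X (toy construction: sparse diagonal).
X = np.zeros((n, n))
idx = rng.choice(n, size=k, replace=False)
X[idx, idx] = rng.uniform(1.0, 2.0, size=k)

Y = A @ X @ A.T                         # covariance of the compressed vector y = A x

# Linear view used by sparse-recovery guarantees: vec(Y) = (A kron A) vec(X).
lhs = Y.reshape(-1, order="F")                      # vec(Y), column-major stacking
rhs = np.kron(A, A) @ X.reshape(-1, order="F")      # (A ⊗ A) vec(X)
print("max |vec(Y) - (A⊗A)vec(X)| =", np.max(np.abs(lhs - rhs)))   # ~0 up to round-off
```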
{"title":"On Compressive Sensing of Sparse Covariance Matrices Using Deterministic Sensing Matrices","authors":"Alihan Kaplan, V. Pohl, Dae Gwan Lee","doi":"10.1109/ICASSP.2018.8461312","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8461312","url":null,"abstract":"This paper considers the problem of determining the sparse covariance matrix <tex>$mathbf{X}$</tex> of an unknown data vector <tex>$pmb{x}$</tex> by observing the covariance matrix <tex>$mathbf{Y}$</tex> of a compressive measurement vector <tex>$pmb{y}=mathbf{A}pmb{x}$</tex>. We construct deterministic sensing matrices <tex>$mathbf{A}$</tex> for which the recovery of a <tex>$k$</tex> -sparse covariance matrix <tex>$mathbf{X}$</tex> from <tex>$m$</tex> values of <tex>$mathbf{Y}$</tex> is guaranteed with high probability. In particular, we show that the number of measurements <tex>$m$</tex> scales linearly with the sparsity <tex>$k$</tex>.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"28 1","pages":"4019-4023"},"PeriodicalIF":0.0,"publicationDate":"2018-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89595941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
Robust and Effective Hyperspectral Pansharpening Using Spatio-Spectral Total Variation
Saori Takeyama, Shunsuke Ono, I. Kumazawa
Acquiring high-resolution hyperspectral (HS) images is a very challenging task. To this end, hyperspectral pansharpening techniques have been widely studied, which estimate an HS image of high spatial and spectral resolution (high HS image) from a pair of an HS image of high spectral resolution but low spatial resolution (low HS image) and a high spatial resolution panchromatic (PAN) image. However, since these methods do not fully utilize the piecewise-smoothness of spectral information on HS images in estimation, they tend to produce spectral distortion when the low HS image contains noise. To tackle this issue, we propose a new hyperspectral pansharpening method using a spatio-spectral regularization. Our method not only effectively exploits observed information but also properly promotes the spatio-spectral piecewise-smoothness of the resulting high HS image, leading to high quality and robust estimation. The proposed method is reduced to a nonsmooth convex optimization problem, which is efficiently solved by a primal-dual splitting method. Our experiments demonstrate the advantages of our method over existing hyperspectral pansharpening methods.
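The paper's exact regularizer and primal-dual solver are not reproduced here; as a minimal assumed sketch, the function below evaluates a simple anisotropic spatio-spectral total variation, i.e. the ℓ1 norm of first-order differences along the two spatial axes and the spectral axis of an HS cube.

```python
import numpy as np

def spatio_spectral_tv(hs_cube: np.ndarray) -> float:
    """Anisotropic spatio-spectral TV of a hyperspectral cube shaped (H, W, B).

    Sums absolute first-order differences along rows, columns and bands;
    a simplified stand-in for the regularizer discussed in the abstract.
    """
    dx = np.diff(hs_cube, axis=0)   # vertical spatial differences
    dy = np.diff(hs_cube, axis=1)   # horizontal spatial differences
    db = np.diff(hs_cube, axis=2)   # spectral (band-to-band) differences
    return float(np.abs(dx).sum() + np.abs(dy).sum() + np.abs(db).sum())

# Toy usage: a piecewise-constant cube has low TV, a noisy one has high TV.
rng = np.random.default_rng(3)
smooth = np.ones((32, 32, 10))
noisy = smooth + 0.1 * rng.standard_normal(smooth.shape)
print(spatio_spectral_tv(smooth), spatio_spectral_tv(noisy))
```

Penalising the spectral differences alongside the spatial ones is what promotes the spatio-spectral piecewise-smoothness mentioned in the abstract.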
{"title":"Robust and Effective Hyperspectral Pansharpening Using Spatio-Spectral Total Variation","authors":"Saori Takeyama, Shunsuke Ono, I. Kumazawa","doi":"10.1109/ICASSP.2018.8462464","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462464","url":null,"abstract":"Acquiring high-resolution hyperspectral (HS) images is a very challenging task. To this end, hyperspectral pansharpening techniques have been widely studied, which estimate an HS image of high spatial and spectral resolution (high HS image) from a pair of an HS image of high spectral resolution but low spatial resolution (low HS image) and a high spatial resolution panchromatic (PAN) image. However, since these methods do not fully utilize the piecewise-smoothness of spectral information on HS images in estimation, they tend to produce spectral distortion when the low HS image contains noise. To tackle this issue, we propose a new hyperspectral pansharpening method using a spatio-spectral regularization. Our method not only effectively exploits observed information but also properly promotes the spatio-spectral piecewise-smoothness of the resulting high HS image, leading to high quality and robust estimation. The proposed method is reduced to a nonsmooth convex optimization problem, which is efficiently solved by a primal-dual splitting method. Our experiments demonstrate the advantages of our method over existing hyperspectral pansharpening methods.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"45 1","pages":"1603-1607"},"PeriodicalIF":0.0,"publicationDate":"2018-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86512694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
Comparison of Speech Tasks for Automatic Classification of Patients with Amyotrophic Lateral Sclerosis and Healthy Subjects
Aravind Illa, Deep Patel, B. Yamini, Meera ss, N. Shivashankar, P. Veeramani, Seena vengalii, Kiran Polavarapui, S. Nashi, A. Nalini, P. Ghosh
In this work, we consider the task of acoustic and articulatory feature based automatic classification of Amyotrophic Lateral Sclerosis (ALS) patients and healthy subjects using speech tasks. In particular, we compare the roles of different types of speech tasks, namely rehearsed speech, spontaneous speech and repeated words, for this purpose. Simultaneous articulatory and speech data were recorded from 8 healthy controls and 8 ALS patients using an AG501 for the classification experiments. In addition to typical acoustic and articulatory features, new articulatory features are proposed for classification. As classifiers, both Deep Neural Networks (DNN) and Support Vector Machines (SVM) are examined. Classification experiments reveal that the proposed articulatory features outperform the other acoustic and articulatory features with both the DNN and the SVM classifier; however, the SVM performs better than the DNN classifier on the proposed features. Among the three speech tasks considered, rehearsed speech was found to provide the highest F-score of 1, followed by an F-score of 0.92 when both repeated words and spontaneous speech are used for classification.
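As a generic, assumed illustration of the evaluation pipeline only (synthetic feature vectors, not the recorded AG501 data), the snippet below trains an SVM classifier and reports the F-score used to compare the speech tasks.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import f1_score

rng = np.random.default_rng(4)
# Hypothetical acoustic+articulatory feature vectors: controls (0) vs ALS patients (1).
X = np.vstack([rng.normal(0.0, 1.0, (80, 20)), rng.normal(0.8, 1.0, (80, 20))])
y = np.array([0] * 80 + [1] * 80)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0, stratify=y)
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_tr)   # SVM classifier
print("F-score:", f1_score(y_te, clf.predict(X_te)))            # metric used in the paper
```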
{"title":"Comparison of Speech Tasks for Automatic Classification of Patients with Amyotrophic Lateral Sclerosis and Healthy Subjects","authors":"Aravind Illa, Deep Patel, B. Yamini, Meera ss, N. Shivashankar, P. Veeramani, Seena vengalii, Kiran Polavarapui, S. Nashi, A. Nalini, P. Ghosh","doi":"10.1109/ICASSP.2018.8461836","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8461836","url":null,"abstract":"In this work, we consider the task of acoustic and articulatory feature based automatic classification of Amyotrophic Lateral Sclerosis (ALS) patients and healthy subjects using speech tasks. In particular, we compare the roles of different types of speech tasks, namely rehearsed speech, spontaneous speech and repeated words for this purpose. Simultaneous articulatory and speech data were recorded from 8 healthy controls and 8 ALS patients using AG501 for the classification experiments. In addition to typical acoustic and articulatory features, new articulatory features are proposed for classification. As classifiers, both Deep Neural Networks (DNN) and Support Vector Machines (SVM) are examined. Classification experiments reveal that the proposed articulatory features outperform other acoustic and articulatory features using both DNN and SVM classifier. However, SVM performs better than DNN classifier using the proposed feature. Among three different speech tasks considered, the rehearsed speech was found to provide the highest F-score of 1, followed by an F-score of 0.92 when both repeated words and spontaneous speech are used for classification.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"46 15 1","pages":"6014-6018"},"PeriodicalIF":0.0,"publicationDate":"2018-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73471803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 20
Faster and Still Safe: Combining Screening Techniques and Structured Dictionaries to Accelerate the Lasso
C. Dantas, R. Gribonval
Accelerating the solution of the Lasso problem becomes crucial when scaling to very high dimensional data. In this paper, we propose a way to combine two existing acceleration techniques: safe screening tests, which simplify the problem by eliminating useless dictionary atoms; and the use of structured dictionaries which are faster to operate with. A structured approximation of the true dictionary is used at the initial stage of the optimization, and we show how to define screening tests which are still safe despite the approximation error. In particular, we extend a state-of-the-art screening test, the GAP SAFE sphere test, to this new setting. The practical interest of the proposed methodology is demonstrated by considerable reductions in simulation time.
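As a simplified sketch of the GAP SAFE sphere test in its standard Lasso form (the textbook version, not the structured-dictionary extension proposed in the paper): an atom j can be safely discarded when |x_jᵀθ| + r·‖x_j‖₂ < 1, where θ is a dual-feasible point and r = sqrt(2·gap)/λ is the sphere radius derived from the duality gap.

```python
import numpy as np

def gap_safe_screen(X, y, w, lam):
    """Return a boolean mask of atoms that are provably inactive (w*_j = 0).

    Simplified GAP SAFE sphere test for the Lasso
        min_w 0.5 * ||y - X w||_2^2 + lam * ||w||_1
    built around the current iterate w (textbook form, hedged: not the
    structured-dictionary variant of the paper).
    """
    residual = y - X @ w
    corr = X.T @ residual
    theta = residual / max(lam, np.abs(corr).max())     # dual-feasible point
    primal = 0.5 * residual @ residual + lam * np.abs(w).sum()
    dual = 0.5 * y @ y - 0.5 * lam**2 * np.sum((theta - y / lam) ** 2)
    gap = max(primal - dual, 0.0)
    radius = np.sqrt(2.0 * gap) / lam                   # GAP SAFE sphere radius
    scores = np.abs(X.T @ theta) + radius * np.linalg.norm(X, axis=0)
    return scores < 1.0                                 # True -> atom can be removed

# Toy usage: normalized random dictionary, zero initial iterate, lambda near lambda_max.
rng = np.random.default_rng(5)
X = rng.standard_normal((100, 500)); X /= np.linalg.norm(X, axis=0)
y = X[:, :5] @ rng.standard_normal(5)
lam = 0.9 * np.abs(X.T @ y).max()
print("atoms screened out:", int(gap_safe_screen(X, y, np.zeros(500), lam).sum()))
```

Atoms removed this way can be dropped from subsequent iterations without changing the solution, which is the source of the speed-up that screening contributes.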
{"title":"Faster and Still Safe: Combining Screening Techniques and Structured Dictionaries to Accelerate the Lasso","authors":"C. Dantas, R. Gribonval","doi":"10.1109/ICASSP.2018.8461514","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8461514","url":null,"abstract":"Accelerating the solution of the Lasso problem becomes crucial when scaling to very high dimensional data. In this paper, we propose a way to combine two existing acceleration techniques: safe screening tests, which simplify the problem by eliminating useless dictionary atoms; and the use of structured dictionaries which are faster to operate with. A structured approximation of the true dictionary is used at the initial stage of the optimization, and we show how to define screening tests which are still safe despite the approximation error. In particular, we extend a state-of-the-art screening test, the GAP SAFE sphere test, to this new setting. The practical interest of the proposed methodology is demonstrated by considerable reductions in simulation time.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"9 1","pages":"4069-4073"},"PeriodicalIF":0.0,"publicationDate":"2018-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73267044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
An Investigation of Subband Wavenet Vocoder Covering Entire Audible Frequency Range with Limited Acoustic Features
T. Okamoto, Kentaro Tachibana, T. Toda, Y. Shiga, H. Kawai
Although a WaveNet vocoder can synthesize more natural-sounding speech waveforms than conventional vocoders at sampling frequencies of 16 and 24 kHz, it is difficult to directly extend the sampling frequency to 48 kHz, which covers the entire human audible frequency range for higher-quality synthesis, because the model becomes too large to train on a consumer GPU. To realize a 48 kHz WaveNet vocoder that can be trained on a consumer GPU, this paper introduces a subband WaveNet architecture into a speaker-dependent WaveNet vocoder and proposes a subband WaveNet vocoder. In the experiments, each conditional subband WaveNet with a sampling frequency of 8 kHz was well trained using a consumer GPU. The results of subjective evaluations with a Japanese male speech corpus indicate that the proposed subband WaveNet vocoder with 36-dimensional simple acoustic features significantly outperformed conventional source-filter model-based vocoders, including STRAIGHT with 86-dimensional features.
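The paper's subband analysis/synthesis and WaveNet details are not reproduced here; as an assumed minimal sketch of the rate structure, the snippet below splits a 48 kHz signal into six uniform 4 kHz-wide bands with FIR filters, so that each band could then be shifted to baseband and decimated to the 8 kHz rate at which the component networks would run.

```python
import numpy as np
from scipy import signal

fs, n_bands = 48_000, 6                 # 48 kHz input, six 4 kHz-wide bands (assumed split)
band_fs = fs // n_bands                 # each subband model would run at 8 kHz

# Toy 48 kHz test signal: a chirp sweeping the audible range.
t = np.arange(0, 1.0, 1.0 / fs)
x = signal.chirp(t, f0=100, f1=20_000, t1=1.0)

edges = np.linspace(0, fs / 2, n_bands + 1)       # 0, 4k, 8k, ..., 24k Hz
numtaps = 511                                     # odd, so the highpass band is valid
subbands = []
for k in range(n_bands):
    lo, hi = edges[k], edges[k + 1]
    if k == 0:                                    # lowest band: plain lowpass
        h = signal.firwin(numtaps, hi, fs=fs)
    elif k == n_bands - 1:                        # highest band: highpass up to Nyquist
        h = signal.firwin(numtaps, lo, pass_zero=False, fs=fs)
    else:                                         # interior bands: bandpass
        h = signal.firwin(numtaps, [lo, hi], pass_zero=False, fs=fs)
    subbands.append(signal.lfilter(h, 1.0, x))

# In a subband scheme each band would additionally be modulated to baseband and
# decimated by n_bands, giving every component network a band_fs-rate signal.
for k, b in enumerate(subbands):
    print(f"band {k} ({edges[k]:.0f}-{edges[k+1]:.0f} Hz): energy {np.sum(b**2):.1f}")
```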
{"title":"An Investigation of Subband Wavenet Vocoder Covering Entire Audible Frequency Range with Limited Acoustic Features","authors":"T. Okamoto, Kentaro Tachibana, T. Toda, Y. Shiga, H. Kawai","doi":"10.1109/ICASSP.2018.8462237","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462237","url":null,"abstract":"Although a WaveNet vocoder can synthesize more natural-sounding speech waveforms than conventional vocoders with sampling frequencies of 16 and 24 kHz, it is difficult to directly extend the sampling frequency to 48 kHz to cover the entire human audible frequency range for higher-quality synthesis because the model size becomes too large to train with a consumer GPU. For a WaveNet vocoder with a sampling frequency of 48 kHz with a consumer GPU, this paper introduces a subband WaveNet architecture to a speaker-dependent WaveNet vocoder and proposes a subband WaveNet vocoder. In experiments, each conditional subband WaveNet with a sampling frequency of 8 kHz was well trained using a consumer GPU. The results of subjective evaluations with a Japanese male speech corpus indicate that the proposed subband WaveNet vocoder with 36-dimensional simple acoustic features significantly outperformed the conventional source-filter model-based vocoders including STRAIGHT with 86-dimensional features.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"14 2 1","pages":"5654-5658"},"PeriodicalIF":0.0,"publicationDate":"2018-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78638817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 21
Online Education Evaluation for Signal Processing Course Through Student Learning Pathways
K. H. Ng, S. Tatinati, Andy W. H. Khong
The impact of online learning sequences on forecasting course outcomes for an undergraduate digital signal processing (DSP) course is studied in this work. A multi-modal learning schema based on deep-learning techniques, with learning sequences, psychometric measures, and personality traits as input features, is developed. The aim is to identify any underlying patterns in the learning sequences and subsequently forecast the learning outcomes. Experiments are conducted on data acquired for the DSP course taught over 13 teaching weeks to assess the forecasting efficacy of various deep-learning models. Results show that the proposed multi-modal schema yields better forecasting performance than the frequency-based methods in the existing literature. It is further observed that the psychometric measures incorporated in the proposed multi-modal schema enhance the ability to distinguish nuances in the input sequences when the forecasting task is highly dependent on human behavior.
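The abstract does not specify the architecture; as an assumed sketch of a multi-modal schema (hypothetical layer sizes and feature dimensions), the PyTorch model below encodes the learning-action sequence with an LSTM and fuses it with static psychometric/personality features before a prediction head.

```python
import torch
import torch.nn as nn

class MultiModalForecaster(nn.Module):
    """Hypothetical multi-modal model: LSTM over logged learning-action sequences,
    fused with static psychometric/personality features, predicting the course outcome."""

    def __init__(self, n_actions=50, emb_dim=32, hidden=64, n_static=10):
        super().__init__()
        self.embed = nn.Embedding(n_actions, emb_dim)        # learning-action vocabulary
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden + n_static, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, seq, static):
        _, (h, _) = self.lstm(self.embed(seq))               # last hidden state summarises the pathway
        fused = torch.cat([h[-1], static], dim=1)            # fuse sequence and psychometric modalities
        return self.head(fused).squeeze(1)                   # predicted outcome (e.g. final score)

# Toy forward pass: batch of 4 students, 20 logged actions each, 10 static features.
model = MultiModalForecaster()
seq = torch.randint(0, 50, (4, 20))
static = torch.randn(4, 10)
print(model(seq, static).shape)        # torch.Size([4])
```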
{"title":"Online Education Evaluation for Signal Processing Course Through Student Learning Pathways","authors":"K. H. Ng, S. Tatinati, Andy W. H. Khong","doi":"10.1109/ICASSP.2018.8461464","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8461464","url":null,"abstract":"Impact of online learning sequences to forecast course outcomes for an undergraduate digital signal processing (DSP) course is studied in this work. A multi-modal learning schema based on deep-learning techniques with learning sequences, psychometric measures, and personality traits as input features is developed in this work. The aim is to identify any underlying patterns in the learning sequences and subsequently forecast the learning outcomes. Experiments are conducted on the data acquired for the DSP course taught over 13 teaching weeks to underpin the forecasting efficacy of various deep-learning models. Results showed that the proposed multi-modal schema yields better forecasting performance compared to existing frequency-based methods in existing literature. It is further observed that the psychometric measures incorporated in the proposed multimodal schema enhance the ability of distinguishing nuances in the input sequences when the forecasting task is highly dependent on human behavior.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"6 1","pages":"6458-6462"},"PeriodicalIF":0.0,"publicationDate":"2018-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88017283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
Epileptic State Segmentation with Temporal-Constrained Clustering
Kang Lin, Yu Qi, Shaozhe Feng, Qi Lian, Gang Pan, Yueming Wang
Automatic seizure identification plays an important role in epilepsy evaluation. Most existing methods regard seizure identification as a classification problem and rely on a labelled training set. However, labelling seizure onsets is very expensive and the seizure data available for each individual are especially limited, so classifier-based methods are usually impractical in use. Clustering methods can learn useful information from unlabelled data, but they may produce unstable results on epileptic signals with high noise levels. In this paper, we propose a Gaussian temporal-constrained k-medoids method for seizure state segmentation. By using temporal information, the noise can be effectively suppressed and robust clustering performance is achieved. Besides, a new criterion called signed total variation (STV), which describes temporal integrity and consistency, is proposed for evaluating temporal-constrained clustering. Experimental results show that, compared with existing methods, the k-medoids method with the Gaussian temporal constraint achieves the best results on both F1-score and STV.
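The Gaussian temporal constraint and the STV criterion are defined in the paper; the sketch below is only one assumed reading, in which the pairwise feature distance is inflated by a Gaussian penalty on temporal separation before a plain k-medoids loop, so temporally distant windows are discouraged from sharing a medoid.

```python
import numpy as np

def temporal_kmedoids(features, times, k, lam=1.0, sigma=5.0, n_iter=20, seed=0):
    """Toy k-medoids with an assumed Gaussian temporal penalty added to the
    feature distance (a guess at the idea, not the paper's exact formulation)."""
    rng = np.random.default_rng(seed)
    n = len(features)
    feat_d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    dt = times[:, None] - times[None, :]
    temp_d = lam * (1.0 - np.exp(-dt**2 / (2.0 * sigma**2)))   # Gaussian temporal term
    D = feat_d + temp_d
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        # Move each medoid to the member minimising total within-cluster distance.
        medoids = np.array([
            np.where(labels == c)[0][np.argmin(D[np.ix_(labels == c, labels == c)].sum(axis=1))]
            if np.any(labels == c) else medoids[c]
            for c in range(k)
        ])
    return labels

# Toy EEG-like sequence: three consecutive "states" with different feature means.
rng = np.random.default_rng(6)
feats = np.vstack([rng.normal(m, 0.3, (40, 4)) for m in (0.0, 1.5, 0.5)])
print(temporal_kmedoids(feats, np.arange(len(feats), dtype=float), k=3))
```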
{"title":"Epileptic State Segmentation with Temporal-Constrained Clustering","authors":"Kang Lin, Yu Qi, Shaozhe Feng, Qi Lian, Gang Pan, Yueming Wang","doi":"10.1109/ICASSP.2018.8462070","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462070","url":null,"abstract":"Automatic seizure identification plays an important role in epilepsy evaluation. Most existing methods regard seizure identification as a classification problem and rely on labelled training set. However, labelling seizure onset is very expensive and seizure data for each individual is especially limited, classifier-based methods are usually impractical in use. Clustering methods could learn useful information from unlabelled data, while they may lead to unstable results given epileptic signals with high noises. In this paper, we propose to use Gaussian temporal-constrained k-medoids method for seizure state segmentation. Using temporal information, the noises could be effectively suppressed and robust clustering performance is achieved. Besides, a new criterion called signed total variation (STV) which describes temporal integrity and consistency is proposed for temporal-constrained clustering evaluation. Experimental results show that, compared with the existing methods, the k-medoids method with Gaussian temporal constraint achieves the best results on both F1-score and STV.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"1 1","pages":"881-885"},"PeriodicalIF":0.0,"publicationDate":"2018-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83501276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Deep Reinforcement Learning Framework for Identifying Funny Scenes in Movies
Haoqi Li, Naveen Kumar, Ruxin Chen, P. Georgiou
This paper presents a novel deep Reinforcement Learning (RL) framework for classifying movie scenes based on affect, using the face images detected in the video stream as input. Extracting affective information from video is a challenging task involving complex visual and temporal representations intertwined with the complex aspects of human perception and information integration. This also makes it difficult to collect a large annotated corpus, restricting the use of supervised learning methods. We present an alternative learning framework based on RL that is tolerant to label sparsity and can easily make use of any available ground truth in an online fashion. We employ this modified RL model for the binary classification of whether a scene is funny or not on a dataset of movie scene clips. The results show that our model predicts correctly 72.95% of the time on 2–3 minute long movie scenes, while on shorter scenes the accuracy obtained is 84.13%.
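As an assumed minimal sketch (not the authors' architecture), the snippet below casts the binary decision as a one-step RL problem: a policy network scores pooled scene-level face features, an action (funny / not funny) is sampled, and a REINFORCE update is applied only when a ground-truth label is available, which is how sparse labels can be consumed online.

```python
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))  # hypothetical feature dim
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def reinforce_step(scene_feats, label=None):
    """One-step REINFORCE update for funny/not-funny classification.
    scene_feats: (128,) pooled face features for a scene (assumed representation).
    label: 0/1 ground truth if available, else None (sparse labels are tolerated)."""
    logits = policy(scene_feats)
    dist = torch.distributions.Categorical(logits=logits)
    action = dist.sample()                        # predicted class treated as an RL action
    if label is not None:                         # reward only when ground truth exists
        reward = 1.0 if action.item() == label else -1.0
        loss = -dist.log_prob(action) * reward    # policy-gradient (REINFORCE) loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return int(action.item())

# Toy online stream: some scenes labelled, some not.
for step in range(5):
    feats = torch.randn(128)
    lab = step % 2 if step < 3 else None          # only the first scenes carry labels
    print(step, reinforce_step(feats, lab))
```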
{"title":"A Deep Reinforcement Learning Framework for Identifying Funny Scenes in Movies","authors":"Haoqi Li, Naveen Kumar, Ruxin Chen, P. Georgiou","doi":"10.1109/ICASSP.2018.8462686","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462686","url":null,"abstract":"This paper presents a novel deep Reinforcement Learning (RL) framework for classifying movie scenes based on affect using the face images detected in the video stream as input. Extracting affective information from the video is a challenging task modulating complex visual and temporal representations intertwined with the complex aspects of human perception and information integration. This also makes it difficult to collect a large annotated corpus restricting the use of supervised learning methods. We present an alternative learning framework based on RL that is tolerant to label sparsity and can easily make use of any available ground truth in an online fashion. We employ this modified RL model for the binary classification of whether a scene is funny or not on a dataset of movie scene clips. The results show that our model correctly predicts 72.95% of the time on the 2–3 minute long movie scenes while on shorter scenes the accuracy obtained is 84.13%.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"22 1","pages":"3116-3120"},"PeriodicalIF":0.0,"publicationDate":"2018-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80553371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 14