
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Papers

Distributed Non-Orthogonal Pilot Design for Multi-Cell Massive MIMO Systems
Yue Wu, Shaodan Ma, Yuantao Gu
In this work, a distributed non-orthogonal pilot design approach is proposed to tackle the pilot contamination problem in multi-cell massive multiple-input multiple-output (MIMO) systems. The pilot signals are designed under power constraints by minimizing the total mean square error (MSE) of the minimum mean square error (MMSE) channel estimators of all base stations (BSs). To solve this non-convex pilot design problem, the stochastic variance reduced gradient (SVRG) projection algorithm is introduced, where the pilot signals are optimized in a distributed way at individual BSs. The SVRG projection algorithm preserves the randomness of the transient gradient, which makes the solution more likely to jump out of local minima. Moreover, only part of the BSs are activated to perform the gradient descent operation during each iteration, producing a green and low-cost infrastructure. Numerical simulations demonstrate the superiority of the proposed approach in terms of channel estimation accuracy and uplink achievable sum rate.
DOI: https://doi.org/10.1109/ICASSP40776.2020.9053224 | Pages: 5195-5199 | Published: 2020-05-01
Citations: 5
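As a rough illustration of the SVRG projection idea mentioned in the abstract above, the following Python sketch runs a projected SVRG loop on a toy convex least-squares problem, with a norm bound standing in for the power constraint. It is not the authors' pilot design algorithm: the objective, the function names (`projected_svrg`, `project_l2_ball`, `grad_i`, `full_grad`, `objective`), the data, and all parameter values are assumptions chosen only for illustration.

```python
import random


def project_l2_ball(x, radius):
    """Project x onto {x : ||x||_2 <= radius} (hypothetical stand-in for a power constraint)."""
    norm = sum(v * v for v in x) ** 0.5
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]


def grad_i(x, a, b):
    """Gradient of the single term 0.5 * (a.x - b)^2 with respect to x."""
    residual = sum(ai * xi for ai, xi in zip(a, x)) - b
    return [residual * ai for ai in a]


def full_grad(x, A, b):
    """Average of all per-term gradients."""
    g = [0.0] * len(x)
    for a_i, b_i in zip(A, b):
        for j, gij in enumerate(grad_i(x, a_i, b_i)):
            g[j] += gij / len(A)
    return g


def objective(x, A, b):
    """Average of 0.5 * (a_i.x - b_i)^2 over all terms."""
    total = 0.0
    for a_i, b_i in zip(A, b):
        total += 0.5 * (sum(ai * xi for ai, xi in zip(a_i, x)) - b_i) ** 2
    return total / len(A)


def projected_svrg(A, b, radius, step=0.05, epochs=30, inner=20, seed=0):
    """Projected SVRG: variance-reduced stochastic steps, each followed by a projection."""
    rng = random.Random(seed)
    x = [0.0] * len(A[0])
    for _ in range(epochs):
        snapshot = list(x)                # reference point for variance reduction
        mu = full_grad(snapshot, A, b)    # full gradient at the snapshot
        for _ in range(inner):
            i = rng.randrange(len(A))     # only one randomly chosen term is used per step
            g_x = grad_i(x, A[i], b[i])
            g_s = grad_i(snapshot, A[i], b[i])
            v = [gx - gs + m for gx, gs, m in zip(g_x, g_s, mu)]
            x = project_l2_ball([xi - step * vi for xi, vi in zip(x, v)], radius)
    return x


# Toy data (assumed): the unconstrained minimizer [2, 1] lies outside the unit ball.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
b = [2.0, 1.0, 3.0, 1.0]
x_hat = projected_svrg(A, b, radius=1.0)
print(x_hat, objective(x_hat, A, b))
```

Each inner step uses a single randomly chosen term, which loosely mirrors the abstract's point that only part of the BSs are active in each iteration; the projection keeps every iterate inside the constraint set.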
Coupled Training of Sequence-to-Sequence Models for Accented Speech Recognition
Vinit Unni, Nitish Joshi, P. Jyothi
Accented speech poses significant challenges for state-of-the-art automatic speech recognition (ASR) systems. Accent is a property of speech that lasts throughout an utterance in varying degrees of strength. This makes it hard to isolate the influence of accent on individual speech sounds. We propose coupled training for encoder-decoder ASR models that acts on pairs of utterances corresponding to the same text spoken by speakers with different accents. This training regime introduces an L2 loss between the attention-weighted representations corresponding to pairs of utterances with the same text, thus acting as a regularizer and encouraging representations from the encoder to be more accent-invariant. We focus on recognizing accented English samples from the Mozilla Common Voice corpus. We obtain significant error rate reductions on accented samples from a large set of diverse accents using coupled training. We also show consistent improvements in performance on heavily accented samples (as determined by a standalone accent classifier).
DOI: https://doi.org/10.1109/ICASSP40776.2020.9052912 | Pages: 8254-8258 | Published: 2020-05-01
Citations: 3
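The coupling term described in the abstract above, an L2 loss between attention-weighted representations of two utterances with the same text, can be sketched in plain Python as follows. This is a minimal sketch under assumptions, not the authors' implementation: it assumes both utterances are decoded over the same number of output steps (plausible when they share a transcript), and the names `attention_context`, `coupled_l2_loss`, `total_loss`, and the weight `lam` are hypothetical.

```python
def attention_context(weights, encoder_states):
    """Attention-weighted sum of encoder states for one decoder step."""
    dim = len(encoder_states[0])
    return [sum(w * h[d] for w, h in zip(weights, encoder_states)) for d in range(dim)]


def coupled_l2_loss(contexts_a, contexts_b):
    """Mean squared L2 distance between per-step context vectors of two utterances."""
    if len(contexts_a) != len(contexts_b):
        raise ValueError("both utterances must cover the same decoder steps")
    total = 0.0
    for ca, cb in zip(contexts_a, contexts_b):
        total += sum((x - y) ** 2 for x, y in zip(ca, cb))
    return total / len(contexts_a)


def total_loss(ce_a, ce_b, contexts_a, contexts_b, lam=0.1):
    """Recognition losses of both utterances plus the weighted coupling term (lam is assumed)."""
    return ce_a + ce_b + lam * coupled_l2_loss(contexts_a, contexts_b)


# Two utterances with different numbers of encoder frames but one shared decoder step.
enc_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
enc_b = [[1.0, 1.0], [0.0, 0.0]]
ctx_a = [attention_context([0.5, 0.5, 0.0], enc_a)]
ctx_b = [attention_context([0.5, 0.5], enc_b)]
print(coupled_l2_loss(ctx_a, ctx_b))
```

Because the loss is computed on the attention-weighted contexts rather than on raw encoder frames, the two utterances may differ in length; only the number of decoder steps has to match.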
Non-Griffin–Lim Type Signal Recovery from Magnitude Spectrogram
Ryusei Nakatsu, D. Kitahara, A. Hirabayashi
Speech and audio signal processing frequently requires recovering a time-domain signal from the magnitude of a spectrogram. Conventional methods inversely transform the magnitude spectrogram with a phase spectrogram recovered by the Griffin–Lim algorithm or its accelerated versions. The short-time Fourier transform (STFT) perfectly matches this framework, while other useful spectrogram transforms, such as the constant-Q transform (CQT), do not, because their inverses cannot be computed easily. To make the best of such useful spectrogram transforms, we propose an algorithm which recovers the time-domain signal without the inverse spectrogram transforms. We formulate the signal recovery as a nonconvex optimization problem, which is difficult to solve exactly. To approximately solve the problem, we exploit a stochastic convex optimization technique. A well-organized block selection enables us both to avoid local minima and to achieve fast convergence. Numerical experiments show the effectiveness of the proposed method for both STFT and CQT cases.
DOI: https://doi.org/10.1109/ICASSP40776.2020.9053576 | Pages: 791-795 | Published: 2020-05-01
Citations: 3
Talker-Independent Speaker Separation in Reverberant Conditions
Masood Delfarah, Yuzhou Liu, Deliang Wang
Speaker separation refers to the task of separating a mixture signal comprising two or more speakers. Impressive advances have been made recently in deep learning based talker-independent speaker separation. But such advances are achieved in anechoic conditions. We address talker-independent speaker separation in reverberant conditions by exploring a recently proposed deep CASA approach. To effectively deal with speaker separation and speech dereverberation, we propose a two-stage strategy where reverberant utterances are first separated and then dereverberated. The two-stage deep CASA method outperforms other talker-independent separation methods. In addition, the deep CASA algorithm produces substantial speech intelligibility improvements for human listeners, with a particularly large benefit for hearing-impaired listeners.
DOI: https://doi.org/10.1109/ICASSP40776.2020.9054422 | Pages: 8723-8727 | Published: 2020-05-01
Citations: 4
Multi-Agent Deep Reinforcement Learning For Distributed Handover Management In Dense MmWave Networks
Mohamed Sana, A. Domenico, E. Strinati, Antonio Clemente
The dense deployment of millimeter wave small cells combined with directional beamforming is a promising solution to enhance the network capacity of the current generation of wireless communications. However, the reliability of millimeter wave communication links can be affected by severe pathloss, blockage, and deafness. As a result, mobile users are subject to frequent handoffs, which deteriorate the user throughput and the battery lifetime of mobile terminals. To tackle this problem, our paper proposes a deep multi-agent reinforcement learning framework for distributed handover management called RHando (Reinforced Handover). We model users as agents that learn how to perform handover to optimize the network throughput while taking into account the associated cost. The proposed solution is fully distributed, thus limiting signaling and computation overhead. Numerical results show that the proposed solution can provide higher throughput compared to conventional schemes while considerably limiting the frequency of the handovers.
DOI: https://doi.org/10.1109/ICASSP40776.2020.9052936 | Pages: 8976-8980 | Published: 2020-05-01
Citations: 14
Concentration-Based Polynomial Calculations on Nicked DNA
Tonglin Chen, Marc D. Riedel
In this paper, we introduce a novel scheme for computing polynomial functions on a substrate of nicked DNA. We first discuss a fractional encoding of data, based on the concentration of nicked double DNA strands. Then we show how to perform multiplication on this representation. Next we describe the read-out process, effected by releasing single strands. We show how to perform simple mathematical operations such as addition and subtraction, as well as how to scale constant values using probabilistic switches. We also describe two complex operations: calculating a vector dot product and computing a general polynomial function. We conclude by discussing potential applications of our scheme, practical challenges, and future research directions.
DOI: https://doi.org/10.1109/ICASSP40776.2020.9053353 | Pages: 8836-8840 | Published: 2020-05-01
Citations: 1
Optimal Window Design for Joint Spatial-Spectral Domain Filtering of Signals on the Sphere
Adeem Aslam, Z. Khalid
We present the optimal design of an azimuthally symmetric window signal for carrying out joint spatial-spectral domain filtering of a spherical (source) signal contaminated by a realization of an anisotropic noise process. The resulting window is used in the computation of the spatially localized spherical harmonic transform of the noise-contaminated signal. We formulate the window design problem using the joint spatial-spectral domain filtering framework and choose the optimality criterion that minimizes the mean square error between the (noise-free) source signal and its filtered estimate. The azimuthally symmetric optimal window signal is shown to be specified by the statistics of the source and noise processes. We illustrate the capability of the proposed window signal by applying the joint spatial-spectral domain filtering framework to the bandlimited Mars topography map and demonstrate improvements in the output signal-to-noise ratio (SNR) for different values of input SNR.
DOI: https://doi.org/10.1109/ICASSP40776.2020.9054085 | Pages: 5785-5789 | Published: 2020-05-01
Citations: 0
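The abstract above reports results as output SNR versus input SNR. The short Python sketch below shows one common way such a figure could be computed from samples of a reference signal and an estimate. The exact SNR definition used in the paper is not given in the abstract, so the energy-ratio definition here, the function name `snr_db`, the use of real-valued samples, and the sample values are all assumptions.

```python
import math


def snr_db(reference, estimate):
    """SNR in dB: reference energy over error energy (assumed definition).

    Raises ZeroDivisionError if the estimate equals the reference exactly.
    """
    signal_power = sum(r * r for r in reference)
    error_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_power / error_power)


# Hypothetical samples: a source signal, a noisy observation, and a filtered estimate.
source = [1.0, -2.0, 0.5, 3.0]
noisy = [1.4, -2.5, 0.9, 2.4]
filtered = [1.1, -2.1, 0.6, 2.9]

input_snr = snr_db(source, noisy)
output_snr = snr_db(source, filtered)
print(input_snr, output_snr, output_snr - input_snr)
```

The difference between the two values is the SNR gain attributable to filtering under this definition.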
FG2Seq: Effectively Encoding Knowledge for End-To-End Task-Oriented Dialog
Zhenhao He, Yuhong He, Qingyao Wu, Jian Chen
End-to-end task-oriented spoken dialog systems typically require modeling two types of inputs: the dialog history, which is a sequence of utterances, and the knowledge base (KB) associated with the dialog history. While modeling these inputs, current state-of-the-art models typically ignore the rich structure in the knowledge graph or its intrinsic association with the dialog history. In this paper, we propose a Flow-to-Graph seq2seq model (FG2Seq) which can effectively encode knowledge by considering inherent structural information of the knowledge graph and latent semantic information from the dialog history. Experiments on two publicly available task-oriented dialog datasets show that our proposed FG2Seq achieves robust performance on generating appropriate system responses and outperforms the baseline systems.
DOI: https://doi.org/10.1109/ICASSP40776.2020.9053667 | Pages: 8029-8033 | Published: 2020-05-01
Citations: 16
Clustering of Nonnegative Data and an Application to Matrix Completion
Christopher Strohmeier, D. Needell
In this article, we propose a simple algorithm to cluster nonnegative data lying in disjoint subspaces. We analyze its performance in relation to a certain measure of correlation between said subspaces. We use our clustering algorithm to develop a matrix completion algorithm which can outperform standard matrix completion algorithms on data matrices satisfying a certain natural low rank condition.
DOI: https://doi.org/10.1109/ICASSP40776.2020.9052980 | Pages: 8349-8353 | Published: 2020-05-01
Citations: 0
Revealing Hidden Drawings in Leonardo’s ‘The Virgin of the Rocks’ from Macro X-Ray Fluorescence Scanning Data through Element Line Localisation
Su Yan, Jun-Jie Huang, Nathan Daly, C. Higgitt, P. Dragotti
Macro X-Ray Fluorescence (XRF) scanning is an increasingly widely used imaging technique for the non-invasive detection and mapping of chemical elements in Old Master paintings. Existing approaches for XRF signal analysis require varying degrees of expert user input. They are mainly based on peak fitting at fixed energies associated with each element and require the target elements to be selected manually. In this paper, we propose a new method that can process macro XRF scanning data from paintings fully automatically. The method consists of two parts: 1) detecting pulses in an XRF spectrum using Finite Rate of Innovation (FRI) theory; 2) producing the distribution maps for each element automatically identified in the painting. The results presented show the ability of our method to detect weak or partially overlapping signals and, more excitingly, to visualise underdrawing in a masterpiece by Leonardo da Vinci.
DOI: https://doi.org/10.1109/ICASSP40776.2020.9054460 | Pages: 1444-1448 | Published: 2020-05-01
Citations: 5