
2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC): Latest Articles

Hierarchical Traffic Matrices: Axiomatic Foundations to Practical Traffic Matrix Synthesis
Paul Tune, M. Roughan, Chris Wiren
The traffic matrix of a network is useful in a variety of applications: network planning and forecasting, traffic engineering, and anomaly detection. Much work has focused on estimating traffic matrices, but methods are often tested on limited data. The datasets may therefore be unrepresentative, and the subsequent results may not generalize. Synthesis can help alleviate this problem. In this paper, we examine a fundamental question: what constitutes a good class of statistical models for traffic matrix synthesis? The result of our study is a set of axioms specifying structure on traffic matrix models, including the incorporation of organizational structure (hierarchies) in network traffic. We introduce the Hierarchical Traffic Matrix (HTM), which satisfies these requirements. We then study the hierarchical structure of the GEANT network, a research network based in Europe, to validate our ideas. Finally, we illustrate how structure in traffic matrices can affect network topology design.
DOI: https://doi.org/10.23919/APSIPA.2018.8659593 (published 2018-11-01)
Citations: 0
EEG Hyperscanning for Eight or more Persons - Feasibility Study for Emotion Recognition using Deep Learning Technique
Sunghan Lee, Sangjun Han, S. Jun
A multi-user electroencephalogram (EEG) system is necessary to study concurrent activity among many persons. It is difficult to find a system that measures EEG signals from more than three people simultaneously. We therefore propose a framework that can acquire EEG signals from more than eight persons at the same time, and we investigate its feasibility. Acquisition was performed using the OpenViBE software developed by INRIA. The wireless EEG devices for the proposed framework were manufactured by BioBrain, Corp. in Korea. Each device has eight channels measuring frontal EEG at a sampling rate of 1 kHz. Brain signals were acquired while participants wore the system and watched emotional videos as a group audience. To show the feasibility and efficacy of the system, we analyze our preliminary results using a deep learning technique.
DOI: https://doi.org/10.23919/APSIPA.2018.8659738 (published 2018-11-01)
Citations: 2
User's Intention Understanding in Question-Answering System Using Attention-based LSTM
Yukio Matsuyoshi, T. Takiguchi, Y. Ariki
A rule-based question-answering system is limited in its ability to understand a user's intention because its rules are inevitably incomplete. To address this problem, we propose a method that estimates the question type and the question keyword class from a user's question using an attention-based LSTM (Long Short-Term Memory) model. We also propose a joint model that estimates question type and question keyword class simultaneously. We evaluate the effectiveness of the proposed method experimentally, based on estimation rates. In addition, the proposed method for question type estimation is compared with a rule-based system, a support vector machine (SVM), and Random Forest. The method for question keyword class estimation is compared with a non-attention LSTM model and a conventional model.
DOI: https://doi.org/10.23919/APSIPA.2018.8659636 (published 2018-11-01)
Citations: 1
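The abstract above does not say which attention form the authors use. As a rough illustration only, the sketch below shows one common variant: dot-product attention pooling over a sequence of LSTM hidden states. The function names, the `query` vector (standing in for a learned parameter), and the toy hidden states are all assumptions made for this example, not details from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each hidden state by its dot product with `query`.

    hidden_states: list of per-token vectors (hypothetical LSTM outputs).
    query: vector of the same dimension (hypothetical learned parameter).
    Returns (weights, context), where context is the weighted sum.
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Toy example: three "tokens" with 2-dimensional hidden states.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_pool(states, query=[0.0, 2.0])
```

In a classifier of the kind the abstract describes, the `context` vector would typically feed a final layer that predicts the question type or keyword class; that last step is omitted here.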
A study on impulse response model of reflected path for THz band
Kazuhiro Tsujimura, K. Umebayashi, J. Kokkoniemi, Janne J. Lehtomäki
An impulse-response-based channel model is vital for wireless communication analysis and modeling. This paper considers the impulse response of the reflected path in the terahertz band (THz band: 0.1-10 THz) for short-range (1–100 cm) wireless communication. In indoor applications, the multipath channel must be considered. In the analysis of the reflected path, the rough surface of the reflector is accounted for with the Rayleigh roughness factor. The validity of the model is investigated with experimental THz band measurements (up to 2 THz).
DOI: https://doi.org/10.23919/APSIPA.2018.8659776 (published 2018-11-01)
Citations: 3
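The abstract above names the Rayleigh roughness factor without giving a formula. A minimal sketch, assuming the commonly cited form rho = exp(-g / 2) with g = (4 * pi * sigma * cos(theta) / lambda)^2, where sigma is the standard deviation of the surface height and theta the angle of incidence; the paper's exact model may differ. The function names and the 50 µm roughness value are illustrative assumptions.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def rayleigh_roughness_factor(freq_hz, sigma_m, incidence_rad):
    """Attenuation of the specular reflection caused by surface roughness.

    Assumed form: rho = exp(-g / 2),
    g = (4 * pi * sigma * cos(theta) / wavelength) ** 2.
    """
    wavelength = SPEED_OF_LIGHT / freq_hz
    g = (4.0 * math.pi * sigma_m * math.cos(incidence_rad) / wavelength) ** 2
    return math.exp(-g / 2.0)


def rough_reflection_coefficient(smooth_coeff, freq_hz, sigma_m, incidence_rad):
    """Scale a smooth-surface reflection coefficient by the roughness factor."""
    return smooth_coeff * rayleigh_roughness_factor(freq_hz, sigma_m, incidence_rad)


# Hypothetical surface with 50 micrometre height deviation, normal incidence.
rho_low = rayleigh_roughness_factor(0.5e12, 50e-6, 0.0)
rho_high = rayleigh_roughness_factor(2.0e12, 50e-6, 0.0)
```

Under this form the factor falls quickly as frequency rises, since the same physical roughness becomes large relative to the wavelength.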
XOR learning by spiking neural network with infrared communications
Kazuki Matsumoto, H. Torikai, H. Sekiya
A spiking neural network (SNN), which expresses information by spike trains, can process information with low energy, like a human brain. Hardware implementation of an SNN is an important research problem. If the neurons are linked by wireless communications, SNNs gain spatial degrees of freedom, which may extend their application area dramatically. Additionally, such SNNs can process information with low energy, owing to wireless communication by spike trains. Adding the functions of SNN neurons to wireless sensor nodes therefore yields what can be regarded as a low-power-consumption wireless sensor network (WSN). Such a “Wireless Neural Sensor Network” can distribute information processing across the WSN nodes, like a brain. This paper presents an SNN with infrared (IR) communications as the first step toward this concept. Neurons are implemented on field-programmable gate arrays and linked by IR communications. The implemented SNN succeeded in acquiring the XOR function through reinforcement learning.
DOI: https://doi.org/10.23919/APSIPA.2018.8659484 (published 2018-11-01)
Citations: 1
Graphical User Interface for Medical Deep Learning - Application to Magnetic Resonance Imaging
Sebastian Milde, Annika Liebgott, Ziwei Wu, Wenyi Feng, Jiahuan Yang, Lukas Mauch, P. Martirosian, F. Bamberg, K. Nikolaou, S. Gatidis, F. Schick, Bin Yang, Thomas Kustner
In clinical diagnostics, magnetic resonance imaging (MRI) is a valuable and versatile tool. The acquisition process is, however, susceptible to image distortions (artifacts), which may degrade image quality. Automated, reference-free localization and quantification of artifacts using convolutional neural networks (CNNs) is a promising route to early artifact detection. Training relies on a large amount of expert-labeled data, and producing it is time-consuming. Previous studies were based on global labels, i.e. a whole volume was automatically labeled as artifact-free or artifact-affected. However, artifact appearance is rather localized. We propose local labeling conducted via a graphical user interface (GUI). Moreover, the GUI provides easy handling of data viewing, preprocessing (labeling, patching, data augmentation), network parametrization and training, data and network evaluation, and deep visualization of the learned network content. The GUI is not limited to these features and will be extended in the future. The developed GUI is publicly available and has a modular design that targets different applications of machine learning and deep learning, such as artifact detection, classification, and segmentation.
DOI: https://doi.org/10.23919/APSIPA.2018.8659515 (published 2018-11-01)
Citations: 1
Age and Gender Prediction from Face Images Using Convolutional Neural Network
Koichi Ito, Hiroya Kawai, Takehisa Okano, T. Aoki
Attribute information such as age and gender improves the performance of face recognition. This paper proposes a method for predicting age and gender from face images using a convolutional neural network. Through a set of experiments on public face databases, we demonstrate that the proposed method performs efficiently on age and gender prediction compared with conventional methods.
DOI: https://doi.org/10.23919/APSIPA.2018.8659655 (published 2018-11-01)
Citations: 22
A DNN-based emotional speech synthesis by speaker adaptation
Hongwu Yang, Weizhao Zhang, Pengpeng Zhi
This paper proposes a deep neural network (DNN)-based emotional speech synthesis method that improves the quality of synthesized emotional speech through speaker adaptation with a multi-speaker, multi-emotion speech corpus. First, a text analyzer obtains contextual labels from the sentences, while the WORLD vocoder extracts acoustic features from the corresponding speech. Then a set of speaker-independent DNN average voice models is trained with the contextual labels and acoustic features of the multi-emotion speech corpus. Finally, speaker adaptation is used to train a set of speaker-dependent DNN voice models of the target emotion with the target emotional training speech. The target emotional speech is synthesized by the speaker-dependent DNN voice models. Subjective evaluations show that, compared with the traditional hidden Markov model (HMM)-based method, the proposed method achieves higher opinion scores. Objective tests demonstrate that the spectrum of the emotional speech synthesized by the proposed method is also closer to the original speech than that synthesized by the HMM-based method. The proposed method can therefore improve the emotional expressiveness and naturalness of synthesized emotional speech.
DOI: https://doi.org/10.23919/APSIPA.2018.8659599 (published 2018-11-01)
Citations: 5
Epileptic Focus Localization Based on iEEG by Using Positive Unlabeled (PU) Learning
Xuyang Zhao, Toshihisa Tanaka, Wanzeng Kong, Qibin Zhao, Jianting Cao, H. Sugano, Noboru Yoshida
Epilepsy is a chronic disorder of the brain. The intracranial electroencephalogram (iEEG) recorded from the cortex is the most popular measurement not only for diagnosing epilepsy but also for localizing the focus, which is crucial for surgery. In recent years, machine learning methods have developed rapidly and have been applied successfully to various real-world problems. Given a sufficient number of samples, powerful deep learning methods can achieve high performance in epileptic focus localization. However, obtaining a large amount of iEEG labeled by focal/non-focal channel is challenging, since the annotations must be made by multiple clinical experts through visual judgment of long-term iEEG signals. To reduce the number of labeled training samples required, we introduce the positive unlabeled (PU) learning method for classifying focal and non-focal epileptic iEEG signals. The proposed method learns a binary classifier from a small amount of labeled data containing only one class (i.e., focal signals) and unlabeled data containing both classes (i.e., focal and non-focal signals), which greatly reduces the annotation workload of clinical experts. Experimental results on the Bern dataset and on iEEG recorded at Juntendo University Hospital demonstrate the effectiveness of our method.
DOI: https://doi.org/10.23919/APSIPA.2018.8659747 (published 2018-11-01)
Citations: 6
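The abstract above does not state which PU objective the authors train with. As one well-known possibility, the sketch below computes the non-negative PU risk estimator (Kiryo et al., 2017) from classifier scores: the risk on the negative class is estimated from unlabeled data minus a class-prior-weighted correction from the labeled positives, and clipped at zero. The function names, the sigmoid surrogate loss, and the toy scores are assumptions for illustration, and `class_prior` (the fraction of positives among unlabeled data) is assumed known.

```python
import math


def sigmoid_loss(score, label):
    """Surrogate loss l(z, y) = 1 / (1 + exp(y * z)) for y in {+1, -1}."""
    return 1.0 / (1.0 + math.exp(label * score))


def mean(values):
    return sum(values) / len(values)


def nn_pu_risk(pos_scores, unl_scores, class_prior):
    """Non-negative PU risk from scores on labeled-positive and unlabeled data."""
    risk_pos_as_pos = mean([sigmoid_loss(s, +1) for s in pos_scores])
    risk_pos_as_neg = mean([sigmoid_loss(s, -1) for s in pos_scores])
    risk_unl_as_neg = mean([sigmoid_loss(s, -1) for s in unl_scores])
    # Estimated risk on true negatives; clipped so it can never go below zero.
    negative_part = risk_unl_as_neg - class_prior * risk_pos_as_neg
    return class_prior * risk_pos_as_pos + max(0.0, negative_part)


# Toy scores: positive means "focal". One of four unlabeled samples is focal.
good = nn_pu_risk([3.0, 4.0], [3.0, -3.0, -4.0, -3.0], class_prior=0.25)
bad = nn_pu_risk([-3.0, -4.0], [-3.0, 3.0, 4.0, 3.0], class_prior=0.25)
```

A classifier that scores focal signals high and the remaining unlabeled signals low gets a lower risk (`good`) than one with the signs flipped (`bad`), even though no non-focal labels were supplied.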
Significance of Teager Energy Operator Phase for Replay Spoof Detection
Prasad A. Tapkir, H. Patil
The increased use of voice biometrics for various security applications, motivated authors to investigate different countermeasures for the hazard of spoofing attacks, where the attacker tries to imitate the genuine speaker. The replay is the most accessible spoofing attack. Past studies have ignored phase information for various speech processing applications. In this paper, we explore the excitation source-like feature set, namely, Teager Energy Operator (TEO) phase and its significance in the replay spoof detection task. This feature set is further fused at score-level with magnitude spectrum-based features, such as Constant Q Cepstral Coefficients (CQCC), Mel Frequency Cepstral Coefficients (MFCC), and Linear Frequency Cepstral Coefficients (LFCC). The improvement in the results show that the TEO phase feature set contains the complementary information to the magnitude spectrum-based features. The experiments are performed on the ASV Spoof 2017 Challenge database. The systems are implemented with Gaussian Mixture Model (GMM) as a classifier. Our best system using TEO phase achieves the Equal Error Rate (EER) of 6.57% and 15.39% on the development and evaluation set, respectively.
在各种安全应用中越来越多地使用语音生物识别技术,促使作者研究针对欺骗攻击危险的不同对策,攻击者试图模仿真正的说话者。重放是最容易实现的欺骗攻击。以往的研究在各种语音处理应用中忽略了相位信息。在本文中,我们探讨了类激励源特征集,即Teager能量算子(TEO)相位及其在重放欺骗检测任务中的意义。该特征集在分数水平上进一步与基于幅度谱的特征融合,例如恒定Q倒谱系数(CQCC), Mel频率倒谱系数(MFCC)和线性频率倒谱系数(LFCC)。改进结果表明,TEO相位特征集包含了基于幅度谱特征的互补信息。实验在ASV Spoof 2017 Challenge数据库上进行。该系统采用高斯混合模型(GMM)作为分类器实现。我们使用TEO阶段的最佳系统在开发集和评估集上的平均错误率(EER)分别为6.57%和15.39%。
DOI: https://doi.org/10.23919/APSIPA.2018.8659664 | Published: 2018-11-01 | 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
Citations: 4