
International Conference on Signal Processing and Machine Learning: Latest Papers

Minimum Classification Error Training with Speech Synthesis-Based Regularization for Speech Recognition
Pub Date : 2019-11-27 DOI: 10.1145/3372806.3372819
Naoto Umezaki, Takumi Okubo, Hideyuki Watanabe, S. Katagiri, M. Ohsaki
To increase the utility of Regularization, which is a common framework for avoiding the underestimation of ideal Bayes error, for speech recognizer training, we propose a new classifier training concept that incorporates a regularization term that represents the speech synthesis ability of classifier parameters. To implement our new concept, we first introduce a speech recognizer that embeds Line Spectral Pairs-Conjugate Structure-Algebraic Code Excited Linear Prediction (LSP-CS-ACELP) in a Multi-Prototype State-Transition-Model (MP-STM) classifier, define a regularization term that represents the speech synthesis ability by the distance between a training sample and its nearest MP-STM word model, and formalize a new Minimum Classification Error (MCE) training method for jointly minimizing a conventional smooth classification error count loss and the newly defined regularization term. We evaluated the proposed training method in an isolated-word, closed-vocabulary, and speaker-independent speech recognition task whose Bayes error is estimated to be about 20% and found that our method successfully produced an estimate of Bayes error (about 18.4%) with a single training run over a training dataset without such data resampling as Cross-Validation or the assumptions of sample distribution. Moreover, we investigated the quality of the synthesized speech using LSP parameters derived from the trained prototypes and found that the quality of the Bayes error estimation is clearly supported by the speech synthesis ability preserved in the training.
Citations: 1
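The training objective described in the abstract above (a smooth classification error count loss plus a regularization term based on the distance to the nearest word model) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: prototypes are plain vectors rather than MP-STM word models, the distance is squared Euclidean, the regularizer uses the nearest prototype of the sample's own class, and the names `mce_with_regularizer`, `alpha`, and `lam` are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_dist(x, prototypes):
    """Distance from x to the closest prototype of one class."""
    return min(sq_dist(x, p) for p in prototypes)


def mce_with_regularizer(x, label, models, alpha=1.0, lam=0.1):
    """Smooth MCE loss plus a distance-based regularizer (illustrative only).

    models maps each class name to a list of prototype vectors.
    alpha (sigmoid steepness) and lam (regularizer weight) are assumed names.
    """
    # Discriminant score: negative distance to the nearest class prototype.
    g = {c: -nearest_dist(x, protos) for c, protos in models.items()}
    best_rival = max(score for c, score in g.items() if c != label)
    # Misclassification measure: positive when a rival class scores higher.
    d = -g[label] + best_rival
    # Sigmoid smoothing of the 0/1 error count.
    smooth_loss = 1.0 / (1.0 + math.exp(-alpha * d))
    # Regularizer (assumption): distance to the nearest own-class prototype.
    reg = -g[label]
    return smooth_loss + lam * reg
```

A sample that sits on its own class prototype yields a loss near zero, while the same sample labeled with the rival class is penalized by both terms.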
Multi-source Radar Data Fusion via Support Vector Regression
Pub Date : 2019-11-27 DOI: 10.1145/3372806.3372810
Zhanchun Gao, Y. Xiang
Because the measurement errors of surveillance sensors such as radars differ from one another when detecting the same target, multi-source radar data must be fused to estimate the target's true location and reduce the measurement error. Because the measurement error is uncertain, the key is to establish a nonlinear regression model. In this paper, Support Vector Regression (SVR) is adopted to estimate the true location of a target from the measurements of multiple radars. We uniquely identify a region by a sequence of radar IDs, meaning that a target in that region can be detected by the radars whose IDs are listed in the sequence. A separate regression model is established for each region, and the models are independent of one another. Since radar data and ADSB data use different coordinate systems, we map all the data into the same two-dimensional Cartesian coordinate system. Within each region, two regression models are established to estimate the aircraft's x-axis and y-axis values. After predicting the x and y coordinates of the target, we convert them back to the WGS84 format.
Citations: 0
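The pipeline in the radar fusion abstract above (region lookup by radar IDs, projection to a shared Cartesian frame, one model per axis, conversion back to latitude and longitude) can be sketched as follows. Everything here is an assumption made for illustration: the abstract does not name the projection, so a simple equirectangular approximation on a sphere stands in for it; the region key is taken to be the sorted set of radar IDs; and the per-axis models are arbitrary callables (a trained SVR would take their place). The names `region_key`, `to_local_xy`, `to_lat_lon`, and `fuse` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation


def region_key(radar_ids):
    """Order-independent key for the radars that currently see the target."""
    return tuple(sorted(set(radar_ids)))


def to_local_xy(lat_deg, lon_deg, lat0_deg, lon0_deg):
    """Equirectangular projection around a reference point (assumed here)."""
    scale = EARTH_RADIUS_M * math.cos(math.radians(lat0_deg))
    x = math.radians(lon_deg - lon0_deg) * scale
    y = math.radians(lat_deg - lat0_deg) * EARTH_RADIUS_M
    return x, y


def to_lat_lon(x, y, lat0_deg, lon0_deg):
    """Inverse of to_local_xy."""
    scale = EARTH_RADIUS_M * math.cos(math.radians(lat0_deg))
    lat = lat0_deg + math.degrees(y / EARTH_RADIUS_M)
    lon = lon0_deg + math.degrees(x / scale)
    return lat, lon


def fuse(measurements, models, lat0_deg, lon0_deg):
    """Fuse one target's measurements with the models of its region.

    measurements: {radar_id: (lat_deg, lon_deg)}
    models: {region_key: (model_x, model_y)}; each model maps a tuple of
            per-radar coordinates to one estimate (stand-in for SVR).
    """
    model_x, model_y = models[region_key(measurements)]
    ids = sorted(measurements)
    xs, ys = zip(*(to_local_xy(*measurements[i], lat0_deg, lon0_deg) for i in ids))
    return to_lat_lon(model_x(xs), model_y(ys), lat0_deg, lon0_deg)
```

Keeping one model pair per region key mirrors the abstract's statement that the regional models are independent; a target seen by a different set of radars simply selects a different pair.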
Method for Removing Motion Blur from Images of Harmful Biological Organisms in Power Places Based on Improved Cyclegan
Pub Date : 2019-11-27 DOI: 10.1145/3372806.3372820
Dongyang Ye, Shangping Zhong, Jiahao Zhuang, Li Chen
The automatic detection of harmful organisms in power places has attracted attention because such sites are now widely left unattended. However, harmful organisms move frequently and quickly, so surveillance pictures are prone to motion blur and the organisms cannot be detected effectively. Building on an improved Cycle-Consistent Adversarial Networks (CycleGAN) model, we propose a method for removing motion blur from images of harmful biological organisms in power places. The method does not require paired blurred and real sharp images for training, which is consistent with actual requirements. In addition, our method improves the classical CycleGAN model by combining cycle consistency and perceptual loss to enhance the detail authenticity of image texture restoration and improve the detection accuracy. The model uses Wasserstein GAN with gradient penalty (WGAN-GP) as the loss function to train the deep model. Given the nature of GANs, it is difficult for the generated image distribution to fill the entire real image distribution space. Experimental results show that the proposed method effectively improves the detection accuracy of harmful organisms in power places.
Citations: 1
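The loss terms named in the deblurring abstract above (WGAN-GP, cycle consistency, perceptual loss) are commonly written as below. This is the textbook form of each term, given only as a sketch; the abstract does not state the exact combination or weights, so the symbols are assumptions: B and S denote the blurred and sharp image domains, G_{B→S} and G_{S→B} the two generators, D a critic, φ a fixed pretrained feature extractor, x̂ a point sampled on the line between a real and a generated image, and the λ values weighting hyperparameters. Only one critic and one cycle direction of the perceptual term are shown for brevity.

```latex
\begin{aligned}
\mathcal{L}_{D} &= \mathbb{E}_{b}\!\left[D\big(G_{B\to S}(b)\big)\right]
  - \mathbb{E}_{s}\!\left[D(s)\right]
  + \lambda_{gp}\,\mathbb{E}_{\hat{x}}\!\left[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\right] \\
\mathcal{L}_{adv} &= -\,\mathbb{E}_{b}\!\left[D\big(G_{B\to S}(b)\big)\right] \\
\mathcal{L}_{cyc} &= \mathbb{E}_{b}\!\left[\lVert G_{S\to B}(G_{B\to S}(b)) - b \rVert_1\right]
  + \mathbb{E}_{s}\!\left[\lVert G_{B\to S}(G_{S\to B}(s)) - s \rVert_1\right] \\
\mathcal{L}_{perc} &= \mathbb{E}_{b}\!\left[\lVert \phi\big(G_{S\to B}(G_{B\to S}(b))\big) - \phi(b) \rVert_2^2\right] \\
\mathcal{L}_{G} &= \mathcal{L}_{adv} + \lambda_{cyc}\,\mathcal{L}_{cyc} + \lambda_{perc}\,\mathcal{L}_{perc}
\end{aligned}
```

Because the cycle and perceptual terms compare an image only with its own reconstruction, no paired blurred and sharp images are needed, which matches the unpaired training setting the abstract describes.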
The Development and Trend of ECG Diagnosis Assisted by Artificial Intelligence
Pub Date : 2019-11-27 DOI: 10.1145/3372806.3372807
Tongnan Xia, Mengyao Shu, Hongtao Fan, Lei Ma, Yaojie Sun
Because traditional manual interpretation and existing automated interpretation of ECGs have low accuracy and efficiency, misdiagnoses and missed diagnoses occur easily. Studies have shown that artificial intelligence technology is the future direction of ECG diagnosis. The wide application of artificial intelligence in ECG diagnostic systems will effectively promote the rapid development of electrocardiography and improve the level of clinical prevention, early warning, treatment, and prognosis evaluation. Based on the work of our research group, this paper summarizes and introduces research progress, at home and abroad, in using artificial intelligence technology to assist ECG diagnosis.
Citations: 1
Data Link Modeling and Simulation Based on DEVS
Pub Date : 2019-11-27 DOI: 10.1145/3372806.3374911
Zhenxing Luo, Lulu Zhao, Wanyong Tian, Dan Yang, Yiyuan Chen, Jiabin Yu, Jianjun Li
The Discrete Event System Specification (DEVS)[1] provides a reference standard for the model design and simulation development of complex discrete event state systems. It defines a formal mechanism for describing discrete event states, composed of a set of strictly abstract mathematical symbols, and provides a strict description mechanism and execution logic for the modeling and simulation of discrete event state systems. This ensures the normalization, reusability, and simulation operability of the model. As a special link system, a Data Link[2] differs from a general communication system. It is mainly used between different military platforms to ensure information sharing. By linking sensors, command and control systems, and weapon platforms according to a specified message format and communication protocol, a data link acts as an information system that automatically transmits formatted data on battlefield situation, command and guidance, tactical coordination, weapon control, and so on in real time, forming a close and efficient tactical link between different combat platforms. In this paper, DEVS is used to simulate the Reporting Responsibility for Air, Surface, and Land tracks in MIL-STD-6016B.
Citations: 2
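The formal mechanism that the DEVS abstract above refers to is, for an atomic model, a state plus four functions: time advance, external transition, output, and internal transition. The sketch below shows that structure on a deliberately generic example. It is not the paper's data link model: the `Processor` class, its `service_time` parameter, and the `simulate` loop are hypothetical, and ties between internal and external events are resolved by running the internal event first, which simplifies the confluent case.

```python
INFINITY = float("inf")


class Processor:
    """Atomic DEVS model: holds one job for service_time, then outputs it."""

    def __init__(self, service_time):
        self.service_time = service_time
        self.phase = "idle"
        self.job = None
        self.sigma = INFINITY  # time remaining until the next internal event

    def time_advance(self):
        return self.sigma

    def external_transition(self, elapsed, job):
        """React to an input that arrives `elapsed` after the last event."""
        if self.phase == "idle":
            self.phase, self.job, self.sigma = "busy", job, self.service_time
        else:
            # Busy: drop the new job and keep counting down the current one.
            self.sigma -= elapsed

    def output(self):
        """Called just before an internal transition."""
        return self.job

    def internal_transition(self):
        self.phase, self.job, self.sigma = "idle", None, INFINITY


def simulate(model, arrivals):
    """Run one atomic model; arrivals is a time-sorted list of (time, job)."""
    t_last, outputs, queue = 0.0, [], list(arrivals)
    while True:
        t_internal = t_last + model.time_advance()
        t_external = queue[0][0] if queue else INFINITY
        if t_internal == INFINITY and t_external == INFINITY:
            return outputs
        if t_internal <= t_external:
            outputs.append((t_internal, model.output()))
            model.internal_transition()
            t_last = t_internal
        else:
            t, job = queue.pop(0)
            model.external_transition(t - t_last, job)
            t_last = t
```

With a service time of 3.0 and arrivals at times 1.0, 2.0, and 5.0, the second job is dropped because the model is busy, and the other two leave at times 4.0 and 8.0.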
Maximum Bayes Boundary-Ness Training For Pattern Classification
Pub Date : 2019-11-27 DOI: 10.1145/3372806.3372817
Masahiro Senda, David Ha, Hideyuki Watanabe, S. Katagiri, M. Ohsaki
The ultimate goal of pattern classifier parameter training is to achieve its optimal status (value) that produces Bayes error or a corresponding Bayes boundary. To realize this goal without unrealistically long training repetitions and strict parameter assumptions, the Bayes Boundary-ness-based Selection (BBS) method was recently proposed and its effectiveness was clearly demonstrated. However, the BBS method remains cumbersome because it consists of two stages: the first generates many candidate sets of trained parameters by carefully controlling the training hyperparameters so that those candidate sets can include the optimal target parameter set; the second stage selects an optimal set from candidate sets. To resolve the BBS method's burden, we propose a new one-stage training method that directly optimizes a given classifier parameter set by maximizing its Bayes boundary-ness or increasing its accuracy during Bayes error estimation. We experimentally evaluate our proposed method in terms of its accuracy of Bayes error estimation over four synthetic or real-life datasets. Our experimental results clearly show that it successfully overcomes the drawbacks of the preceding BBS method and directly creates optimal classifier parameter status without generating too many candidate parameter sets.
Citations: 1
Multi-scale Fusion and Channel Weighted CNN for Acoustic Scene Classification
Pub Date : 2019-11-27 DOI: 10.1145/3372806.3372809
Liping Yang, Xinxing Chen, Lianjie Tao, Xiaohua Gu
Ensemble semantic features are useful for acoustic scene classification. In this paper, we propose a multi-scale fusion and channel weighted CNN framework. The framework consists of two stages: multi-scale feature fusion and channel weighting. The multi-scale feature fusion stage extracts hierarchical semantic feature maps using a CNN with a simplified Xception architecture and then integrates multi-scale semantic features through a top-down pathway. The channel weighting stage squeezes the feature maps into a channel descriptor and then transforms it into a set of channel weighting factors that reinforce the importance of each channel for acoustic scene classification. Experimental results on DCASE2018 acoustic scene classification subtask A and subtask B demonstrate the performance of the proposed framework.
Citations: 3
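The channel weighting stage in the acoustic scene abstract above (squeeze the feature maps into a channel descriptor, then turn it into per-channel factors) can be sketched with plain lists. The abstract does not specify the transform, so this sketch assumes global average pooling for the squeeze and a single fully connected layer with a sigmoid for the transform; the function names `squeeze`, `excite`, and `reweight` are hypothetical, and the weights would normally be learned.

```python
import math


def squeeze(feature_maps):
    """Global average pooling: one scalar per channel (the descriptor)."""
    return [
        sum(sum(row) for row in fm) / (len(fm) * len(fm[0]))
        for fm in feature_maps
    ]


def excite(descriptor, weights, biases):
    """One dense layer plus sigmoid, giving a factor in (0, 1) per channel."""
    factors = []
    for row, b in zip(weights, biases):
        z = sum(w * d for w, d in zip(row, descriptor)) + b
        factors.append(1.0 / (1.0 + math.exp(-z)))
    return factors


def reweight(feature_maps, factors):
    """Scale every value of a channel by that channel's factor."""
    return [
        [[v * f for v in row] for row in fm]
        for fm, f in zip(feature_maps, factors)
    ]
```

Channels whose descriptor drives the sigmoid toward 1 pass through almost unchanged, while the others are attenuated, which is how the factors reinforce the more informative channels.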
Multi-Scale Deep Convolutional Nets with Attention Model and Conditional Random Fields for Semantic Image Segmentation
Pub Date : 2019-11-27 DOI: 10.1145/3372806.3372811
Ming Liu, Caiming Zhang, Zhao Zhang
Although Convolutional Neural Networks are effective visual models that generate hierarchies of features, applying Deep Convolutional Neural Networks to semantic image segmentation still has some shortcomings. In this work, our algorithm incorporates multi-scale atrous convolution, an attention model, and Conditional Random Fields to tackle this problem. First, our method replaces deconvolutional layers with atrous convolutional layers to avoid reducing feature resolution when the Deep Convolutional Neural Network is employed in a fully convolutional fashion. Second, a multi-scale architecture and an attention model are used to extract features at multiple scales. Third, we use Conditional Random Fields to prevent the built-in invariance of Deep Convolutional Neural Networks from reducing localization accuracy. Moreover, our network fully integrates Conditional Random Field modelling with Deep Convolutional Neural Networks, making it possible to train the deep network end-to-end. We apply our method to semantic image segmentation and demonstrate its effectiveness with experiments on PASCAL VOC 2012.
Citations: 2
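Atrous (dilated) convolution, which the segmentation abstract above uses to keep feature resolution, spaces the kernel taps `rate` samples apart so the receptive field grows without extra parameters or downsampling. The one-dimensional sketch below illustrates only that idea; it is not the paper's network, the function name `atrous_conv1d` is hypothetical, and it uses the cross-correlation form without padding.

```python
def atrous_conv1d(signal, kernel, rate):
    """Valid 1-D dilated convolution (cross-correlation form, no padding).

    Kernel tap j reads signal[i + j * rate], so rate=1 is an ordinary
    convolution and larger rates widen the receptive field.
    """
    span = (len(kernel) - 1) * rate  # distance covered by the dilated kernel
    return [
        sum(k * signal[i + j * rate] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]
```

With the three-tap kernel [1, 0, -1], rate 1 compares samples two positions apart and rate 2 compares samples four positions apart, using the same three weights in both cases.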
Automated Detection of Sewer Pipe Defects Based on Cost-Sensitive Convolutional Neural Network
Pub Date : 2019-11-27 DOI: 10.1145/3372806.3372816
Yuhan Chen, Shangping Zhong, Kaizhi Chen, Shoulong Chen, Song Zheng
Regular inspection and repair of drainage pipes is an important part of urban construction. Many classification methods have been used to diagnose defects from images taken inside pipelines. However, most of these models train the classifier to maximize accuracy without considering the unequal misclassification costs in defect diagnosis. In this study, the authors analyze the characteristics of sewer pipeline defect detection and design an automated detection framework based on a cost-sensitive deep convolutional neural network (CNN). The method makes the CNN cost sensitive by introducing learning theories at the structural and loss levels of the network. To minimize misclassification costs, the authors propose a new auxiliary loss function, Cost-Mean Loss, which allows the model to obtain the original parameters of the network that maximize accuracy and to improve its performance by minimizing total misclassification costs in the learning process. Theoretical analysis shows that the new auxiliary loss function can be applied to the classification task to optimize the expected value of misclassification costs. The inspection images collected from multiple drainage pipes were used to train and test the network. Results show that after the cost-sensitive strategy was added, the defect detection rate decreased from 2.1% to 0.45%. Moreover, the model with Cost-Mean Loss has better performance than the original model.
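The idea of optimizing the expected value of misclassification costs, mentioned in the sewer defect abstract above, can be illustrated with a generic cost-weighted objective. This is not the authors' Cost-Mean Loss, whose definition the abstract does not give; it is the standard expected cost under the predicted class probabilities, and the names `softmax`, `expected_cost`, and `cost_matrix` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_cost(logits, true_class, cost_matrix):
    """Expected misclassification cost for one sample.

    cost_matrix[i][k] is the (assumed) cost of predicting class k when the
    true class is i; the diagonal is normally zero.
    """
    probs = softmax(logits)
    return sum(p * cost_matrix[true_class][k] for k, p in enumerate(probs))
```

For example, with the cost matrix [[0, 1], [10, 0]], where class 1 is a defect and missing it costs ten times a false alarm, an undecided prediction (equal logits) costs 0.5 on a normal image but 5.0 on a defective one, so training pushes harder against missed defects.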
Citations: 5
Learning How to Avoiding Obstacles for End-to-End Driving with Conditional Imitation Learning
Pub Date : 2019-11-27 DOI: 10.1145/3372806.3372808
Enwei Zhang, Hongtu Zhou, Yongchao Ding, Junqiao Zhao, Chen Ye
Obstacle avoidance is one of the most complex tasks for autonomous driving systems, yet it has been ignored by many cutting-edge end-to-end learning-based methods. The difficulty stems from the integrated process of detecting and interpreting the environment and obstacles and generating proper behaviors. We use CARLA, a simulator for autonomous driving research, to collect about 6 hours of human drivers' reactions to obstacles on the road, subject to given driving commands, i.e., follow, go straight, turn left, and turn right. A behavior-cloning neural network architecture is proposed with a modified loss that enlarges the effect of steering errors, which is shown to benefit accuracy. We found that image data augmentation is crucial to training the proposed network, and that a reasonable limit allows avoiding unexpected stops. The experiments demonstrate three obstacle avoidance cases: vehicles of the same type as in the training dataset, other automobiles, and two-wheeled vehicles. Finally, the method is also tested on the CARLA benchmark.
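A loss that enlarges the effect of steering errors can be sketched as a weighted squared error over the control outputs. The abstract does not specify the actual loss or its weights, so the `WEIGHTS` values and the function name below are hypothetical; only the control names (steer, throttle, brake) follow CARLA's vehicle control interface.

```python
# Hypothetical weights: steering errors count twice as much as
# throttle or brake errors. The abstract does not give the real values.
WEIGHTS = {"steer": 0.5, "throttle": 0.25, "brake": 0.25}

def weighted_control_loss(pred, target, weights=WEIGHTS):
    """Weighted squared error between predicted and recorded controls."""
    return sum(w * (pred[k] - target[k]) ** 2 for k, w in weights.items())

# The same absolute error costs more on steer than on throttle.
target = {"steer": 0.0, "throttle": 0.5, "brake": 0.0}
steer_off = {"steer": 0.2, "throttle": 0.5, "brake": 0.0}
throttle_off = {"steer": 0.0, "throttle": 0.7, "brake": 0.0}
loss_steer = weighted_control_loss(steer_off, target)
loss_throttle = weighted_control_loss(throttle_off, target)
```

In a conditional imitation learning setup, a loss of this kind would typically be applied to the output branch selected by the current driving command; that wiring is omitted here.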
Citations: 1