
Latest publications: 2022 International Joint Conference on Neural Networks (IJCNN)

Robust Cross-Modal Retrieval by Adversarial Training
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892637
Tao Zhang, Shiliang Sun, Jing Zhao
Cross-modal retrieval is usually implemented on top of cross-modal representation learning, which extracts semantic information from cross-modal data. Recent work shows that cross-modal representation learning is vulnerable to adversarial attacks, even when large-scale pre-trained networks are used. By attacking the representation, an adversary can easily compromise downstream tasks, especially cross-modal retrieval. Adversarial attacks on either modality readily lead to obvious retrieval errors, which makes improving the adversarial robustness of cross-modal retrieval a challenge. In this paper, we propose a robust cross-modal retrieval method (RoCMR), which generates adversarial examples for both the query modality and the candidate modality and performs adversarial training for cross-modal retrieval. Specifically, we generate adversarial examples for both image and text modalities and train the model with benign and adversarial examples in a contrastive learning framework. We evaluate RoCMR on two datasets and show its effectiveness in defending against gradient-based attacks.
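The described pipeline, generating adversarial examples and training contrastively on benign plus adversarial pairs, can be sketched as follows. This is an illustrative NumPy sketch, not the authors' code: the InfoNCE-style loss, the one-step FGSM-style attack, and the placeholder gradient are all assumptions (the paper may use iterative attacks and learned encoders with autograd).

```python
import numpy as np

def contrastive_loss(img_emb, txt_emb, temp=0.1):
    """InfoNCE-style loss: matched image/text pairs sit on the diagonal."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temp
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

def fgsm_attack(x, grad, eps=0.05):
    """One-step sign attack: move each input in the loss-increasing direction."""
    return x + eps * np.sign(grad)

# Toy batch: 4 paired 8-d embeddings standing in for encoder outputs.
rng = np.random.default_rng(0)
img = rng.normal(size=(4, 8))
txt = rng.normal(size=(4, 8))

# In a real model the gradient comes from autograd; here a placeholder
# gradient only demonstrates the perturbation step.
adv_img = fgsm_attack(img, grad=img - txt)
loss_benign = contrastive_loss(img, txt)
loss_mixed = 0.5 * (loss_benign + contrastive_loss(adv_img, txt))
```

In adversarial training, `loss_mixed` (benign plus adversarial terms) would be minimized; a symmetric attack on the text modality would follow the same pattern.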
{"title":"Robust Cross-Modal Retrieval by Adversarial Training","authors":"Tao Zhang, Shiliang Sun, Jing Zhao","doi":"10.1109/IJCNN55064.2022.9892637","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892637","url":null,"abstract":"Cross-modal retrieval is usually implemented based on cross-modal representation learning, which is used to extract semantic information from cross-modal data. Recent work shows that cross-modal representation learning is vulnerable to adversarial attacks, even using large-scale pre-trained networks. By attacking the representation, it can be simple to attack the downstream tasks, especially for cross-modal retrieval tasks. Adversarial attacks on any modality will easily lead to obvious retrieval errors, which brings the challenge to improve the adversarial robustness of cross-modal retrieval. In this paper, we propose a robust cross-modal retrieval method (RoCMR), which generates adversarial examples for both the query modality and candidate modality and performs adversarial training for cross-modal retrieval. Specifically, we generate adversarial examples for both image and text modalities and train the model with benign and adversarial examples in the framework of contrastive learning. We evaluate the proposed RoCMR on two datasets and show its effectiveness in defending against gradient-based attacks.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"131 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134460170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Exploring Attribute Space with Word Embedding for Zero-shot Learning
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892132
Zhaocheng Zhang, Gang Yang
To address the scarcity of attribute diversity in Zero-shot Learning (ZSL), we propose searching the embedding space for additional attributes that extend the class embedding, providing a more discriminative representation of the class prototype. Meanwhile, to tackle the inherent noise behind manually annotated attributes, we apply multi-layer convolutional processing to the semantic features for filtering, rather than a conventional linear transformation. Moreover, we employ Center Loss to assist the training stage, which helps the learned mapping be more accurate and more consistent with each class's prototype. Combining these modules, extensive experiments on several public datasets show that our method yields solid improvements. The proposed way of extending attributes can also be migrated to other models or tasks to obtain better results.
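The attribute-space search might look like the following nearest-neighbour sketch: pick the word embeddings most similar to a class prototype and append them as extra attribute dimensions. The function name, the mean-pooling, and the cosine criterion are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def extend_class_embedding(class_vec, vocab_embs, k=3):
    """Append the mean of the k vocabulary embeddings most cosine-similar
    to the class prototype, widening the class embedding."""
    cv = class_vec / np.linalg.norm(class_vec)
    ve = vocab_embs / np.linalg.norm(vocab_embs, axis=1, keepdims=True)
    sims = ve @ cv                      # cosine similarity to the prototype
    top = np.argsort(-sims)[:k]         # indices of the nearest attributes
    extra = vocab_embs[top].mean(axis=0)
    return np.concatenate([class_vec, extra]), top

# Tiny example: a 4-word vocabulary in a 3-d embedding space.
vocab = np.array([[1.0, 0.0, 0.0],
                  [0.9, 0.1, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
extended, picked = extend_class_embedding(np.array([1.0, 0.05, 0.0]), vocab, k=2)
```

Here the two vocabulary rows nearest the prototype are selected and the class embedding doubles in width; a real system would use pretrained word vectors (e.g. GloVe) as `vocab_embs`.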
{"title":"Exploring Attribute Space with Word Embedding for Zero-shot Learning","authors":"Zhaocheng Zhang, Gang Yang","doi":"10.1109/IJCNN55064.2022.9892132","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892132","url":null,"abstract":"With the purpose of addressing the scarcity of attribute diversity in Zero-shot Learning (ZSL), we propose to search for additional attributes in embedding space to extend the class embedding, providing a more discriminative representation of the class prototype. Meanwhile, to tackle the inherent noise behind manually annotated attributes, we apply multi-layer convolutional processing on semantic features rather than conventional linear transformation for filtering. Moreover, we employ Center Loss to assist the training stage, which helps the learned mapping be more accurate and consistent with the corresponding class's prototype. Combining these modules mentioned above, extensive experiments on several public datasets show that our method could yield decent improvements. This proposed way of extending attributes can also be migrated to other models or tasks and obtain better results.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134490477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Context-Dependent Spatial Representations in the Hippocampus using Place Cell Dendritic Computation
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892401
Adedapo Alabi, D. Vanderelst, A. Minai
The hippocampus in rodents encodes physical space using place cells that fire maximally in specific regions of space, their place fields. These place cells are reused across different contexts and environments with uncorrelated place fields. Though place fields are known to depend on distal sensory cues, even identical environments can have completely different place fields if the contexts differ. We propose a novel place cell network model for this feature that uses two frequently overlooked aspects of neural computation, dendritic morphology and the spatial co-location of spatiotemporally co-active afferent synapses, and show that these enable the reuse of place cells to encode different maps for environments with identical sensory cues.
{"title":"Context-Dependent Spatial Representations in the Hippocampus using Place Cell Dendritic Computation","authors":"Adedapo Alabi, D. Vanderelst, A. Minai","doi":"10.1109/IJCNN55064.2022.9892401","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892401","url":null,"abstract":"The hippocampus in rodents encodes physical space using place cells that show maximal firing in specific regions of space - their place fields. These place cells are reused across different contexts and environments with uncorrelated place fields. Though place fields are known to depend on distal sensory cues, even identical environments can have completely different place fields if the contexts are different. We propose a novel place cell network model for this feature using two frequently overlooked aspects of neural computation - dendritic morphology and the spatial co-location of spatiotemporally co-active afferent synapses - and show that these enable the reuse of place cells to encode different maps for environments with identical sensory cues.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131675731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Adaptive Spatial-Temporal Fusion Graph Convolutional Networks for Traffic Flow Forecasting
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892326
Senwen Li, Liang Ge, Yongquan Lin, Bo Zeng
Traffic flow forecasting is a significant issue in the field of transportation. Early works model temporal dependencies and spatial correlations separately. Recently, some models have been proposed to capture spatial-temporal dependencies simultaneously. However, these models have three defects. First, they use only the road network structure to construct the graph, which may not accurately reflect the spatial-temporal correlations among nodes. Second, each graph convolutional layer considers only correlations among nodes adjacent in time or space. Finally, they struggle to describe how future traffic flow is influenced by spatial-temporal information at different scales. In this paper, we propose a model called Adaptive Spatial-Temporal Fusion Graph Convolutional Networks to address these problems. First, the model finds cross-time, cross-space correlations among nodes and adjusts the spatial-temporal graph structure via a learnable adaptive matrix. Second, it helps nodes attain a larger spatiotemporal receptive field by constructing spatial-temporal graphs over different time spans. Finally, the outputs of graph convolutional layers at various spatial-temporal scales are fused to produce node embeddings for prediction, capturing how different spatial-temporal ranges influence each node. Experiments on real-world traffic datasets show that our model outperforms state-of-the-art baselines.
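A "learnable adaptive matrix" of this kind is often built from two trainable node-embedding tables whose product, after ReLU and row-softmax, yields a dense adjacency (the construction popularized by Graph WaveNet). The sketch below assumes that form; the paper's exact parameterization and normalization may differ.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_adjacency(E1, E2):
    """Dense adjacency learned from two node-embedding tables:
    ReLU keeps non-negative affinities, softmax row-normalizes them."""
    return softmax(np.maximum(E1 @ E2.T, 0.0), axis=1)

def gcn_layer(A, X, W):
    """One graph convolution: aggregate neighbor features, then project."""
    return np.maximum(A @ X @ W, 0.0)

rng = np.random.default_rng(1)
n_nodes, emb_dim, feat_dim, out_dim = 5, 4, 8, 6
E1 = rng.normal(size=(n_nodes, emb_dim))   # trainable in a real model
E2 = rng.normal(size=(n_nodes, emb_dim))
A = adaptive_adjacency(E1, E2)
H = gcn_layer(A, rng.normal(size=(n_nodes, feat_dim)),
              rng.normal(size=(feat_dim, out_dim)))
```

Because `E1` and `E2` would receive gradients during training, the adjacency adapts to correlations the fixed road-network graph misses.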
{"title":"Adaptive Spatial-Temporal Fusion Graph Convolutional Networks for Traffic Flow Forecasting","authors":"Senwen Li, Liang Ge, Yongquan Lin, Bo Zeng","doi":"10.1109/IJCNN55064.2022.9892326","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892326","url":null,"abstract":"Traffic flow forecasting is a significant issue in the field of transportation. Early works model temporal dependencies and spatial correlations, respectively. Recently, some models are proposed to capture spatial-temporal dependencies simultaneously. However, these models have three defects. Firstly, they only use the information of road network structure to construct graph structure. It may not accurately reflect the spatial-temporal correlations among nodes. Secondly, only the correlations among nodes adjacent in time or space are considered in each graph convolutional layer. Finally, it's challenging for them to describe that future traffic flow is influenced by different scale spatial-temporal information. In this paper, we propose a model called Adaptive Spatial-Temporal Fusion Graph Convolutional Networks to address these problems. Firstly, the model can find cross-time, cross-space correlations among nodes to adjust spatial-temporal graph structure by a learnable adaptive matrix. Secondly, it can help nodes attain a larger spatiotemporal receptive field through constructing spatial-temporal graphs of different time spans. At last, the results of various spatial-temporal scale graph convolutional layers are fused to produce node embedding for prediction. It helps find the different spatial-temporal ranges' influence for various nodes. 
Experiments are conducted on real-world traffic datasets, and results show that our model outperforms the state-of-the-art baselines.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131675966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Continual learning benefits from multiple sleep stages: NREM, REM, and Synaptic Downscaling
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9891965
Brian S. Robinson, Clare W. Lau, Alexander New, S. Nichols, Erik C. Johnson, M. Wolmetz, W. Coon
Learning new tasks and skills in succession without overwriting or interfering with prior learning (i.e., “catastrophic forgetting”) is a computational challenge for both artificial and biological neural networks, yet artificial systems struggle to achieve even rudimentary parity with the performance and functionality apparent in biology. One biological process that can be adapted for use in artificial systems is sleep, during which the brain deploys numerous neural operations relevant to continual learning and ripe for artificial adaptation. Here, we investigate how jointly modeling three distinct components of mammalian sleep affects continual learning in artificial neural networks: (1) a veridical memory replay process observed during non-rapid eye movement (NREM) sleep; (2) a generative memory replay process linked to REM sleep; and (3) a synaptic downscaling process proposed to tune signal-to-noise ratios and support neural upkeep. To create this tripartite artificial sleep, we modeled NREM veridical replay by training the network on intermediate representations of samples from the current task. We modeled REM by using a generator network to create intermediate representations of samples from previous tasks for training. Synaptic downscaling, a novel contribution, is modeled as a size-dependent downscaling of network weights. We find benefits from including all three sleep components when evaluating performance on a continual learning CIFAR-100 image classification benchmark. Maximum accuracy improved during training, and catastrophic forgetting was reduced during later tasks. While some catastrophic forgetting persisted over the course of network training, higher levels of synaptic downscaling led to better retention of early tasks and further facilitated the recovery of early-task accuracy during subsequent training. One key takeaway is the trade-off in choosing the level of synaptic downscaling: more aggressive downscaling better protects early tasks, while less downscaling enhances the ability to learn new tasks. Intermediate levels strike a balance, yielding the highest overall accuracies during training. Overall, our results provide insight into how sleep components can be adapted to enhance artificial continual learning systems and highlight areas for future neuroscientific sleep research to further such systems.
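A size-dependent downscaling step can be sketched as a multiplicative shrink whose strength grows with weight magnitude, so large weights lose more in absolute terms. The abstract does not give the exact rule, so the formula below is an illustrative assumption.

```python
import numpy as np

def synaptic_downscale(weights, rate=0.2):
    """Shrink each weight multiplicatively; larger-magnitude weights are
    scaled down harder (illustrative size-dependent rule)."""
    scale = 1.0 - rate * np.abs(weights) / (np.abs(weights).max() + 1e-12)
    return weights * scale

rng = np.random.default_rng(2)
W = rng.normal(size=(4, 4))        # stand-in for a layer's weight matrix
W_down = synaptic_downscale(W)
```

With `rate` in (0, 1) every weight keeps its sign and shrinks, and the largest-magnitude weight shrinks by the largest absolute amount; the `rate` knob mirrors the trade-off the abstract describes between protecting early tasks and learning new ones.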
{"title":"Continual learning benefits from multiple sleep stages: NREM, REM, and Synaptic Downscaling","authors":"Brian S. Robinson, Clare W. Lau, Alexander New, S. Nichols, Erik C. Johnson, M. Wolmetz, W. Coon","doi":"10.1109/IJCNN55064.2022.9891965","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9891965","url":null,"abstract":"Learning new tasks and skills in succession without overwriting or interfering with prior learning (i.e., “catastrophic forgetting”) is a computational challenge for both artificial and biological neural networks, yet artificial systems struggle to achieve even rudimentary parity with the performance and functionality apparent in biology. One of the processes found in biology that can be adapted for use in artificial systems is sleep, in which the brain deploys numerous neural operations relevant to continual learning and ripe for artificial adaptation. Here, we investigate how modeling three distinct components of mammalian sleep together affects continual learning in artificial neural networks: (1) a veridical memory replay process observed during non-rapid eye movement (NREM) sleep; (2) a generative memory replay process linked to REM sleep; and (3) a synaptic downscaling process which has been proposed to tune signal-to-noise ratios and support neural upkeep. To create this tripartite artificial sleep, we modeled NREM veridical replay by training the network using intermediate representations of samples from the current task. We modeled REM by utilizing a generator network to create intermediate representations of samples from previous tasks for training. Synaptic downscaling, a novel con-tribution, is modeled utilizing a size-dependent downscaling of network weights. We find benefits from the inclusion of all three sleep components when evaluating performance on a continual learning CIFAR-100 image classification benchmark. Maximum accuracy improved during training and catastrophic forgetting was reduced during later tasks. 
While some catastrophic forget-ting persisted over the course of network training, higher levels of synaptic downscaling lead to better retention of early tasks and further facilitated the recovery of early task accuracy during subsequent training. One key takeaway is that there is a trade-off at hand when considering the level of synaptic downscaling to use - more aggressive downscaling better protects early tasks, but less downscaling enhances the ability to learn new tasks. Intermediate levels can strike a balance with the highest overall accuracies during training. Overall, our results both provide insight into how to adapt sleep components to enhance artificial continual learning systems and highlight areas for future neuroscientific sleep research to further such systems.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115527247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DAPID: A Differential-adaptive PID Optimization Strategy for Neural Network Training
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892746
Yulin Cai, Haoqian Wang
Derived from automatic control theory, the PID optimizer for neural network training can effectively inhibit the overshoot phenomenon of conventional optimization algorithms such as SGD-Momentum. However, its differential term may grow unexpectedly large during iteration, which can amplify the inherent noise of input samples and degrade training. In this paper, we adopt a self-adaptive iterating rule for the PID optimizer's differential term, using both first-order and second-order moment estimation to approximate the differential's unbiased statistical value. This strategy prevents the differential term from diverging and accelerates iteration without much additional computational cost. Empirical results on several popular machine learning datasets demonstrate that the proposed optimization strategy achieves favorable convergence acceleration and competitive accuracy compared with other stochastic optimization approaches.
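The update described, a PID optimizer whose differential (D) term is tamed by Adam-style bias-corrected first/second moment estimates, can be sketched as below. The gains, buffer names, and exact update form are illustrative assumptions; the paper's rule may differ in detail.

```python
import numpy as np

class PIDOptimizer:
    """PID-style optimizer with a moment-stabilized differential term
    (a sketch of the DAPID idea, not the authors' implementation)."""
    def __init__(self, lr=0.01, kp=1.0, ki=0.9, kd=0.1,
                 beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.kp, self.ki, self.kd = lr, kp, ki, kd
        self.beta1, self.beta2, self.eps = beta1, beta2, eps
        self.v = self.m = self.s = self.prev_g = None
        self.t = 0

    def step(self, w, g):
        if self.v is None:
            self.v = np.zeros_like(w)       # integral (momentum) buffer
            self.m = np.zeros_like(w)       # 1st moment of the D term
            self.s = np.zeros_like(w)       # 2nd moment of the D term
            self.prev_g = np.zeros_like(w)
        self.t += 1
        d = g - self.prev_g                 # raw differential of the gradient
        self.prev_g = g
        # Bias-corrected moment estimates bound the D term's scale, so it
        # cannot blow up and amplify sample noise.
        self.m = self.beta1 * self.m + (1 - self.beta1) * d
        self.s = self.beta2 * self.s + (1 - self.beta2) * d * d
        m_hat = self.m / (1 - self.beta1 ** self.t)
        s_hat = self.s / (1 - self.beta2 ** self.t)
        d_adapt = m_hat / (np.sqrt(s_hat) + self.eps)
        self.v = self.ki * self.v + g       # integral/momentum accumulation
        return w - self.lr * (self.kp * g + self.v + self.kd * d_adapt)

# Minimize f(w) = 0.5 * w^2 (gradient = w) from w = 1.0.
opt, w = PIDOptimizer(), np.array(1.0)
for _ in range(400):
    w = opt.step(w, w)   # the gradient of 0.5*w^2 is w itself
```

On this toy quadratic the iterate converges toward the minimum at 0; the normalized `d_adapt` stays bounded near 1 in magnitude regardless of how large the raw gradient difference `d` gets.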
{"title":"DAPID: A Differential-adaptive PID Optimization Strategy for Neural Network Training","authors":"Yulin Cai, Haoqian Wang","doi":"10.1109/IJCNN55064.2022.9892746","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892746","url":null,"abstract":"Derived from automatic control theory, the PID optimizer for neural network training can effectively inhibit the overshoot phenomenon of conventional optimization algorithms such as SGD-Momentum. However, its differential term may unexpectedly have a relatively large scale during iteration, which may amplify the inherent noise of input samples and deteriorate the training process. In this paper, we adopt a self-adaptive iterating rule for the PID optimizer's differential term, which uses both first-order and second-order moment estimation to calculate the differential's unbiased statistical value approximately. Such strategy prevents the differential term from being divergent and accelerates the iteration without increasing much computational cost. Empirical results on several popular machine learning datasets demonstrate that the proposed optimization strategy achieves favorable acceleration of convergence as well as competitive accuracy compared with other stochastic optimization approaches.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115907499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Pixel Rows and Columns Relationship Modeling Network based on Transformer for Retinal Vessel Segmentation
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892650
Zekang Qiu, J. Zhao, Chudong Shan, Jianyong Huang, Zhiyong Yuan
Performing automatic retinal vessel segmentation on fundus images can quickly produce a clear retinal vessel structure, helping doctors improve the efficiency and reliability of diagnosis. Fundus images contain many small vessels, low-contrast regions, and possibly abnormal areas, so achieving high-performance automatic retinal vessel segmentation remains challenging. The retinal vessels in an image form a topological structure, so the distribution of vessel pixels in each pixel row (or column) should relate to that in other rows (or columns). Motivated by this observation, we propose the Pixel Rows and Columns Relationship Modeling Network (PRCRM-Net) to achieve high-performance retinal vessel segmentation. PRCRM-Net separately models the relationships among the pixel rows and among the pixel columns of a fundus image, and segments retinal vessels by classifying pixels in units of pixel rows and pixel columns. The input to PRCRM-Net is the feature map extracted by U-Net. PRCRM-Net first processes the input feature map into a row feature sequence and a column feature sequence. Second, it models the relationships among the elements of each sequence using a Transformer. Finally, the updated row and column feature sequences yield a row-based and a column-based segmentation result, respectively, and the final segmentation combines the two. To evaluate PRCRM-Net, we conduct comprehensive experiments on three representative datasets: DRIVE, STARE, and CHASE_DB1. The results show that PRCRM-Net achieves state-of-the-art performance.
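The reshaping of a feature map into row and column token sequences, plus the all-pairs interaction a Transformer layer provides, can be sketched as follows. The tokenization layout and the projection-free attention are simplifying assumptions; PRCRM-Net's actual encoder uses learned projections and multiple layers.

```python
import numpy as np

def to_row_col_sequences(feat):
    """Turn a (C, H, W) feature map into a row sequence (H tokens of size
    C*W) and a column sequence (W tokens of size C*H)."""
    C, H, W = feat.shape
    rows = feat.transpose(1, 0, 2).reshape(H, C * W)
    cols = feat.transpose(2, 0, 1).reshape(W, C * H)
    return rows, cols

def self_attention(seq):
    """Single-head scaled dot-product self-attention without learned
    projections, so every token attends to every other token."""
    d = seq.shape[-1]
    scores = seq @ seq.T / np.sqrt(d)
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ seq

rng = np.random.default_rng(3)
feat = rng.normal(size=(2, 5, 7))          # (channels, height, width)
rows, cols = to_row_col_sequences(feat)
rows_out = self_attention(rows)            # each row now sees all rows
```

Running the same attention over `cols` gives the column branch; per-token classifiers on the two updated sequences would then produce the row-based and column-based segmentation maps that are fused at the end.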
{"title":"Pixel Rows and Columns Relationship Modeling Network based on Transformer for Retinal Vessel Segmentation","authors":"Zekang Qiu, J. Zhao, Chudong Shan, Jianyong Huang, Zhiyong Yuan","doi":"10.1109/IJCNN55064.2022.9892650","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892650","url":null,"abstract":"Performing automatic retinal vessel segmentation on fundus image can obtain clear retinal vessel structure quickly, which will assist doctors to improve the efficiency and reliability of diagnosis. In fundus image, there are many small vessels and some areas with low contrast, and there may be abnormal areas. Therefore, achieving automatic retinal vessel segmentation with high performance is still challenging. The retinal vessel in the image is a topological structure, so the distribution of retinal vessel pixels in each pixel row (or column) should have some relationship to other rows (or columns). Motivated by this observation, we propose Pixel Rows and Columns Relationship Modeling Network (PRCRM-Net) to achieve high-performance retinal vessel segmentation. PRCRM-Net separately models the relationship between different pixel rows and pixel columns of fundus image, and achieves retinal vessel segmentation by classifying the pixels in units of pixel row and pixel column. The input of PRCRM-Net is the feature map extracted by U-Net. PRCRM-Net firstly processes the input feature map into row feature sequence and column feature sequence respectively. Secondly, it models the relationship between the elements in the row feature sequence and column feature sequence respectively based on Transformer. Finally, the updated row feature sequence and column feature sequence are used to obtain row-based segmentation result and column-based segmentation result respectively. And the final segmentation result is the combination of these two types of results. 
To evaluate the performance of PRCRM-Net, we conduct comprehensive experiments on three representative datasets, DRIVE, STARE and CHASE_DB1. The experiment results show that the proposed PRCRM-Net achieves state-of-the-art performance.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124252526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Multi-source Representation Enhancement for Wikipedia-style Entity Annotation
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892289
Kunyuan Pang, Shasha Li, Jintao Tang, Ting Wang
Entity annotation in Wikipedia (officially named wikilinks) greatly benefits human end-users. Human editors are required to select all mentions that are most helpful to end-users and link each mention to a Wikipedia page. We aim to design an automatic system that generates Wikipedia-style entity annotation for any plain text. However, existing research either relies heavily on a mention-entity map or is restricted to named entities only. Moreover, it neglects to select the appropriate mentions as Wikipedia requires, leaving out necessary annotations and introducing excessive distracting ones. Existing benchmarks also skirt the coverage and selection issues. We propose a new task, Mention Detection and Selection for entity annotation, along with a new benchmark, WikiC, to better reflect annotation quality. The task centers on the mentions specific to each position in high-quality human-annotated examples. We also propose a new framework, DrWiki, to fulfill the task. We adopt a deep pre-trained span selection model that infers directly from plain text via the tokens' context embeddings. It can cover all possible spans and avoids being limited to mention-entity maps. In addition, information from inarguable mention-entity pairs and from mention repetition is introduced as token-wise representation enhancement via FLAT attention and repeat embedding, respectively. Empirical results on WikiC show that, compared with widely adopted and state-of-the-art Entity Linking and Entity Recognition methods, our method improves on previous methods in overall performance. Additional experiments show that DrWiki gains improvement even with a low-coverage mention-entity map.
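A span-selection model of the kind described enumerates candidate spans and scores them directly from contextual token embeddings. The additive start/end scoring head below is a common pattern and an assumption here; DrWiki's actual head may differ.

```python
import numpy as np

def score_spans(token_embs, w_start, w_end, max_len=4):
    """Score each candidate span [i, j] as a learned start score for
    token i plus an end score for token j (illustrative scoring head)."""
    n = len(token_embs)
    start = token_embs @ w_start          # per-token start logits
    end = token_embs @ w_end              # per-token end logits
    spans = []
    for i in range(n):
        for j in range(i, min(i + max_len, n)):
            spans.append(((i, j), start[i] + end[j]))
    return spans

def best_span(spans):
    """Pick the highest-scoring candidate span."""
    return max(spans, key=lambda s: s[1])[0]

rng = np.random.default_rng(4)
tokens = rng.normal(size=(6, 8))          # 6 contextual token embeddings
spans = score_spans(tokens, rng.normal(size=8), rng.normal(size=8))
top = best_span(spans)
```

Because every span up to `max_len` is scored, the model is not limited to surface forms present in a mention-entity map; a selection threshold on the scores would implement the "select only helpful mentions" requirement.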
{"title":"Multi-source Representation Enhancement for Wikipedia-style Entity Annotation","authors":"Kunyuan Pang, Shasha Li, Jintao Tang, Ting Wang","doi":"10.1109/IJCNN55064.2022.9892289","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892289","url":null,"abstract":"Entity annotation in Wikipedia (officially named wikilinks) greatly benefits human end-users. Human editors are required to select all mentions that are most helpful to human end-users and link each mention to a Wikipedia page. We aim to design an automatic system to generate Wikipedia-style entity annotation for any plain text. However, existing research either rely heavily on mention-entity map or are restricted to named entities only. Besides, they neglect to select the appropriate mentions as Wikipedia requires. As a result, they leave out some necessary annotation and introduce excessive distracting annotation. Existing benchmarks also skirt around the coverage and selection issues. We propose a new task called Mention Detection and Se-lection for entity annotation, along with a new benchmark, WikiC, to better reflect annotation quality. The task is coined centering mentions specific to each position in high-quality human-annotated examples. We also proposed a new framework, DrWiki, to fulfill the task. We adopt a deep pre-trained span selection model inferring directly from plain text via tokens' context embedding. It can cover all possible spans and avoid limiting to mention-entity maps. In addition, information of both inarguable mention-entity pairs, and mention repeat has been introduced as token-wise representation enhancement by FLAT attention and repeat embedding respectively. Empirical results on WikiC show that, compared with often adopted and state-of-the-art Entity Linking and Entity Recognition methods, our method achieves improvement to previous methods in overall performance. 
Additional experiments show that DrWiki gains improvement even with a low-coverage mention-entity map.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114832372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
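The span-based mention detection and selection that DrWiki performs can be sketched at a toy scale: enumerate every candidate span, score it, and greedily keep non-overlapping mentions above a threshold. The gazetteer-based scorer below is a hypothetical stand-in for the paper's contextual-embedding scorer; all names, scores, and thresholds are illustrative assumptions, not details from WikiC or DrWiki.

```python
# Toy mention detection and selection: enumerate spans, score them,
# keep the best non-overlapping ones. The gazetteer replaces a learned
# span scorer purely for illustration.
GAZETTEER = {"neural network": 0.9, "wikipedia": 0.8, "entity annotation": 0.7}

def enumerate_spans(tokens, max_len=3):
    """Yield every candidate span (start, end) of up to max_len tokens."""
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            yield start, end

def score_span(tokens, start, end):
    """Toy scorer: look the surface form up in the gazetteer."""
    surface = " ".join(tokens[start:end]).lower()
    return GAZETTEER.get(surface, 0.0)

def select_mentions(tokens, threshold=0.5):
    """Greedily select non-overlapping spans scoring above the threshold."""
    scored = [(score_span(tokens, s, e), s, e) for s, e in enumerate_spans(tokens)]
    scored.sort(reverse=True)  # best-scoring spans first
    chosen, taken = [], set()
    for score, s, e in scored:
        if score < threshold:
            break  # remaining spans score even lower
        if all(i not in taken for i in range(s, e)):
            chosen.append((s, e, " ".join(tokens[s:e])))
            taken.update(range(s, e))
    return sorted(chosen)

mentions = select_mentions("Entity annotation in Wikipedia uses a neural network".split())
print(mentions)
# → [(0, 2, 'Entity annotation'), (3, 4, 'Wikipedia'), (6, 8, 'neural network')]
```

A real system would score spans from contextual token embeddings rather than a lookup table, but the enumerate-score-select loop is the same shape.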
A Comparative Study and a New Industrial Platform for Decentralized Anomaly Detection Using Machine Learning Algorithms
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892939
Fabian Gerz, Tolga Renan Bastürk, Julian Kirchhoff, Joachim Denker, L. Al-Shrouf, M. Jelali
The occurrence of anomalies and unexpected, process-related faults is a major problem for manufacturing systems, with a significant impact on product quality. Early detection of anomalies is therefore of central importance to create sufficient room for maneuver to take countermeasures and ensure product quality. This paper investigates the performance of machine learning (ML) algorithms for anomaly detection in sensor data streams. For this purpose, six ML algorithms (K-means, DBSCAN, Isolation Forest, OCSVM, LSTM network, and DeepAnt) are evaluated based on defined performance metrics. These methods are benchmarked on publicly available datasets, our own synthetic datasets, and novel industrial datasets; the latter include radar sensor datasets from a hot rolling mill. The results show high detection performance of the K-means algorithm, the DBSCAN algorithm, and the LSTM network for point, collective, and contextual anomalies. A decentralized strategy for (real-time) anomaly detection on sensor data streams is proposed, and an industrial (Cloud-Edge Computing) platform is developed and implemented for this purpose.
Citations: 3
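The decentralized streaming setup the paper describes can be sketched minimally: each edge node keeps only a small rolling window of recent readings and flags contextual outliers locally, without a central model. The z-score heuristic below stands in for the ML detectors the paper benchmarks (K-means, DBSCAN, LSTM, ...); the window size and threshold are illustrative assumptions, not values from the paper.

```python
# Minimal streaming check for contextual anomalies in a sensor stream:
# flag a reading that deviates strongly from the recent window.
from collections import deque
import math

class StreamingDetector:
    def __init__(self, window=5, z_threshold=3.0):
        self.window = deque(maxlen=window)  # recent readings only
        self.z_threshold = z_threshold

    def update(self, x):
        """Process one reading; return True if it looks anomalous."""
        anomaly = False
        if len(self.window) == self.window.maxlen:
            mean = sum(self.window) / len(self.window)
            var = sum((v - mean) ** 2 for v in self.window) / len(self.window)
            std = math.sqrt(var)
            if std > 0 and abs(x - mean) / std > self.z_threshold:
                anomaly = True
        # Note: the reading enters the window even when flagged, so a single
        # spike briefly contaminates the baseline; a real detector would
        # handle this (and collective anomalies) more carefully.
        self.window.append(x)
        return anomaly

detector = StreamingDetector(window=5, z_threshold=3.0)
stream = [10.0, 10.1, 9.9, 10.0, 10.2, 10.1, 25.0, 10.0]
print([detector.update(x) for x in stream])
# → [False, False, False, False, False, False, True, False]
```

Running one such detector per sensor node is the simplest form of the decentralized strategy; the cloud side would then only receive the flags.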
POPNASv2: An Efficient Multi-Objective Neural Architecture Search Technique
Pub Date : 2022-07-18 DOI: 10.1109/IJCNN55064.2022.9892073
Andrea Falanti, Eugenio Lomurno, Stefano Samele, D. Ardagna, Matteo Matteucci
Automating the search for the best neural network model is a task that has gained more and more relevance in the last few years. In this context, Neural Architecture Search (NAS) represents the most effective technique, with results that rival state-of-the-art hand-crafted architectures. However, this approach requires substantial computational capability as well as research time, which makes its usage prohibitive in many real-world scenarios. With its sequential model-based optimization strategy, Progressive Neural Architecture Search (PNAS) represents a possible step forward on this resource issue. Despite the quality of the network architectures found, this technique is still limited in research time. A significant step in this direction has been taken by Pareto-Optimal Progressive Neural Architecture Search (POPNAS), which expands PNAS with a time predictor to enable a trade-off between search time and accuracy, treating the search as a multi-objective optimization problem. This paper proposes a new version of Pareto-Optimal Progressive Neural Architecture Search, called POPNASv2. Our approach enhances the first version and improves its performance. We expanded the search space by adding new operators and improved the quality of both predictors to build more accurate Pareto fronts. Moreover, we introduced cell equivalence checks and enriched the search strategy with an adaptive greedy exploration step. These efforts allow POPNASv2 to achieve PNAS-like performance with an average 4x search-time speed-up. Code: https://doi.org/10.5281/zenodo.6574040
Citations: 6
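The multi-objective trade-off at the heart of POPNAS-style search can be illustrated with the core Pareto-front computation: among candidate architectures scored by predicted accuracy (higher is better) and predicted training time (lower is better), keep only the non-dominated ones. The candidate names and numbers below are made up for illustration, not POPNASv2 results.

```python
# Pareto front over (accuracy, training time): a candidate survives unless
# some other candidate is at least as good on both objectives and strictly
# better on one.
def pareto_front(candidates):
    """candidates: list of (name, accuracy, time); returns the non-dominated ones."""
    front = []
    for name, acc, time in candidates:
        dominated = any(
            (a >= acc and t <= time) and (a > acc or t < time)
            for _, a, t in candidates
        )
        if not dominated:
            front.append((name, acc, time))
    return front

candidates = [
    ("cell-A", 0.92, 120.0),  # most accurate but slow -> on the front
    ("cell-B", 0.90, 60.0),   # good trade-off        -> on the front
    ("cell-C", 0.85, 40.0),   # fastest               -> on the front
    ("cell-D", 0.88, 70.0),   # beaten by cell-B on both objectives
]
print(pareto_front(candidates))
# → [('cell-A', 0.92, 120.0), ('cell-B', 0.9, 60.0), ('cell-C', 0.85, 40.0)]
```

A search like POPNAS would recompute this front each step and only expand architectures on it, which is where the search-time savings come from.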
Journal
2022 International Joint Conference on Neural Networks (IJCNN)