Robust Cross-Modal Retrieval by Adversarial Training
Pub Date: 2022-07-18 | DOI: 10.1109/IJCNN55064.2022.9892637
Tao Zhang, Shiliang Sun, Jing Zhao
Cross-modal retrieval is usually built on cross-modal representation learning, which extracts semantic information from cross-modal data. Recent work shows that cross-modal representation learning is vulnerable to adversarial attacks, even when large-scale pre-trained networks are used. By attacking the representation, an adversary can easily compromise downstream tasks, especially cross-modal retrieval. Adversarial attacks on either modality readily cause obvious retrieval errors, which makes improving the adversarial robustness of cross-modal retrieval a challenge. In this paper, we propose a robust cross-modal retrieval method (RoCMR), which generates adversarial examples for both the query and candidate modalities and performs adversarial training for cross-modal retrieval. Specifically, we generate adversarial examples for both image and text modalities and train the model on benign and adversarial examples within a contrastive learning framework. We evaluate RoCMR on two datasets and show its effectiveness in defending against gradient-based attacks.
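The training recipe couples attack generation with contrastive learning. Below is a minimal PyTorch sketch of that pattern; the FGSM-style image attack, the symmetric InfoNCE loss, and all hyperparameters are our illustrative assumptions, not the authors' exact formulation (RoCMR also attacks the text modality, omitted here for brevity).

```python
import torch
import torch.nn.functional as F

def info_nce(img_emb, txt_emb, tau=0.07):
    # Symmetric InfoNCE over a batch of aligned image/text pairs.
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / tau
    labels = torch.arange(logits.size(0), device=logits.device)
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2

def fgsm_images(img_encoder, txt_emb, images, eps=4 / 255):
    # One-step gradient attack on the image modality against the retrieval loss.
    images = images.clone().requires_grad_(True)
    info_nce(img_encoder(images), txt_emb.detach()).backward()
    return (images + eps * images.grad.sign()).clamp(0, 1).detach()

def train_step(img_encoder, txt_encoder, images, tokens, optimizer):
    # Adversarial training: optimize on benign and adversarial examples jointly.
    txt_emb = txt_encoder(tokens)
    adv_images = fgsm_images(img_encoder, txt_emb, images)
    loss = info_nce(img_encoder(images), txt_emb) + info_nce(img_encoder(adv_images), txt_emb)
    optimizer.zero_grad()  # also clears gradients left over from the attack step
    loss.backward()
    optimizer.step()
    return loss.item()
```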
{"title":"Robust Cross-Modal Retrieval by Adversarial Training","authors":"Tao Zhang, Shiliang Sun, Jing Zhao","doi":"10.1109/IJCNN55064.2022.9892637","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892637","url":null,"abstract":"Cross-modal retrieval is usually implemented based on cross-modal representation learning, which is used to extract semantic information from cross-modal data. Recent work shows that cross-modal representation learning is vulnerable to adversarial attacks, even using large-scale pre-trained networks. By attacking the representation, it can be simple to attack the downstream tasks, especially for cross-modal retrieval tasks. Adversarial attacks on any modality will easily lead to obvious retrieval errors, which brings the challenge to improve the adversarial robustness of cross-modal retrieval. In this paper, we propose a robust cross-modal retrieval method (RoCMR), which generates adversarial examples for both the query modality and candidate modality and performs adversarial training for cross-modal retrieval. Specifically, we generate adversarial examples for both image and text modalities and train the model with benign and adversarial examples in the framework of contrastive learning. We evaluate the proposed RoCMR on two datasets and show its effectiveness in defending against gradient-based attacks.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"131 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134460170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploring Attribute Space with Word Embedding for Zero-shot Learning
Pub Date: 2022-07-18 | DOI: 10.1109/IJCNN55064.2022.9892132
Zhaocheng Zhang, Gang Yang
To address the scarcity of attribute diversity in Zero-shot Learning (ZSL), we propose to search for additional attributes in the embedding space to extend the class embedding, providing a more discriminative representation of the class prototype. Meanwhile, to tackle the inherent noise in manually annotated attributes, we apply multi-layer convolutional processing to semantic features, rather than the conventional linear transformation, for filtering. Moreover, we employ Center Loss to assist the training stage, which helps the learned mapping be more accurate and consistent with the corresponding class's prototype. Combining the modules above, extensive experiments on several public datasets show that our method yields decent improvements. The proposed way of extending attributes can also be migrated to other models or tasks to obtain better results.
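Center Loss itself is a published objective (Wen et al., 2016), so its form can be sketched concretely; how it is weighted against the main ZSL objective below is our assumption.

```python
import torch
import torch.nn as nn

class CenterLoss(nn.Module):
    # Pulls each embedding toward a learnable center of its class,
    # making the learned mapping more consistent with class prototypes.
    def __init__(self, num_classes, feat_dim):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features, labels):
        diff = features - self.centers[labels]   # distance to own class center
        return 0.5 * (diff ** 2).sum(dim=1).mean()

# Hypothetical usage alongside the main objective:
#   loss = task_loss + lambda_center * center_loss(embeddings, class_ids)
```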
{"title":"Exploring Attribute Space with Word Embedding for Zero-shot Learning","authors":"Zhaocheng Zhang, Gang Yang","doi":"10.1109/IJCNN55064.2022.9892132","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892132","url":null,"abstract":"With the purpose of addressing the scarcity of attribute diversity in Zero-shot Learning (ZSL), we propose to search for additional attributes in embedding space to extend the class embedding, providing a more discriminative representation of the class prototype. Meanwhile, to tackle the inherent noise behind manually annotated attributes, we apply multi-layer convolutional processing on semantic features rather than conventional linear transformation for filtering. Moreover, we employ Center Loss to assist the training stage, which helps the learned mapping be more accurate and consistent with the corresponding class's prototype. Combining these modules mentioned above, extensive experiments on several public datasets show that our method could yield decent improvements. This proposed way of extending attributes can also be migrated to other models or tasks and obtain better results.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134490477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Context-Dependent Spatial Representations in the Hippocampus using Place Cell Dendritic Computation
Pub Date: 2022-07-18 | DOI: 10.1109/IJCNN55064.2022.9892401
Adedapo Alabi, D. Vanderelst, A. Minai
The hippocampus in rodents encodes physical space using place cells that fire maximally in specific regions of space, known as their place fields. These place cells are reused across different contexts and environments with uncorrelated place fields. Though place fields are known to depend on distal sensory cues, even identical environments can have completely different place fields if the contexts differ. We propose a novel place cell network model for this feature that uses two frequently overlooked aspects of neural computation, dendritic morphology and the spatial co-location of spatiotemporally co-active afferent synapses, and show that these enable the reuse of place cells to encode different maps for environments with identical sensory cues.
{"title":"Context-Dependent Spatial Representations in the Hippocampus using Place Cell Dendritic Computation","authors":"Adedapo Alabi, D. Vanderelst, A. Minai","doi":"10.1109/IJCNN55064.2022.9892401","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892401","url":null,"abstract":"The hippocampus in rodents encodes physical space using place cells that show maximal firing in specific regions of space - their place fields. These place cells are reused across different contexts and environments with uncorrelated place fields. Though place fields are known to depend on distal sensory cues, even identical environments can have completely different place fields if the contexts are different. We propose a novel place cell network model for this feature using two frequently overlooked aspects of neural computation - dendritic morphology and the spatial co-location of spatiotemporally co-active afferent synapses - and show that these enable the reuse of place cells to encode different maps for environments with identical sensory cues.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131675731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adaptive Spatial-Temporal Fusion Graph Convolutional Networks for Traffic Flow Forecasting
Pub Date: 2022-07-18 | DOI: 10.1109/IJCNN55064.2022.9892326
Senwen Li, Liang Ge, Yongquan Lin, Bo Zeng
Traffic flow forecasting is a significant problem in transportation. Early works model temporal dependencies and spatial correlations separately. Recently, models have been proposed to capture spatial-temporal dependencies simultaneously. However, these models have three shortcomings. First, they construct the graph using only road-network structure, which may not accurately reflect the spatial-temporal correlations among nodes. Second, each graph convolutional layer considers only correlations among nodes adjacent in time or space. Finally, they struggle to express how future traffic flow is influenced by spatial-temporal information at different scales. In this paper, we propose Adaptive Spatial-Temporal Fusion Graph Convolutional Networks to address these problems. First, the model discovers cross-time, cross-space correlations among nodes and adjusts the spatial-temporal graph structure through a learnable adaptive matrix. Second, it gives nodes a larger spatiotemporal receptive field by constructing spatial-temporal graphs over different time spans. Finally, the outputs of graph convolutional layers at various spatial-temporal scales are fused into node embeddings for prediction, which helps capture how spatial-temporal ranges of different sizes influence different nodes. Experiments on real-world traffic datasets show that our model outperforms state-of-the-art baselines.
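The abstract does not specify how the learnable adaptive matrix is parameterized. A common recipe in adaptive-graph models (e.g., Graph WaveNet) derives the graph from trainable node embeddings; the sketch below illustrates that reading and should not be taken as this paper's exact construction.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveAdjacency(nn.Module):
    # Learns one source and one target embedding per node; their pairwise
    # affinities define a data-driven graph that can capture cross-time,
    # cross-space correlations absent from the road network.
    def __init__(self, num_nodes, emb_dim=16):
        super().__init__()
        self.src = nn.Parameter(torch.randn(num_nodes, emb_dim))
        self.dst = nn.Parameter(torch.randn(num_nodes, emb_dim))

    def forward(self):
        # ReLU keeps affinities non-negative; softmax row-normalizes them.
        return F.softmax(F.relu(self.src @ self.dst.t()), dim=1)

# A graph convolution can then mix node features H with the learned graph:
#   H_next = adaptive_adj() @ H @ W
```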
{"title":"Adaptive Spatial-Temporal Fusion Graph Convolutional Networks for Traffic Flow Forecasting","authors":"Senwen Li, Liang Ge, Yongquan Lin, Bo Zeng","doi":"10.1109/IJCNN55064.2022.9892326","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892326","url":null,"abstract":"Traffic flow forecasting is a significant issue in the field of transportation. Early works model temporal dependencies and spatial correlations, respectively. Recently, some models are proposed to capture spatial-temporal dependencies simultaneously. However, these models have three defects. Firstly, they only use the information of road network structure to construct graph structure. It may not accurately reflect the spatial-temporal correlations among nodes. Secondly, only the correlations among nodes adjacent in time or space are considered in each graph convolutional layer. Finally, it's challenging for them to describe that future traffic flow is influenced by different scale spatial-temporal information. In this paper, we propose a model called Adaptive Spatial-Temporal Fusion Graph Convolutional Networks to address these problems. Firstly, the model can find cross-time, cross-space correlations among nodes to adjust spatial-temporal graph structure by a learnable adaptive matrix. Secondly, it can help nodes attain a larger spatiotemporal receptive field through constructing spatial-temporal graphs of different time spans. At last, the results of various spatial-temporal scale graph convolutional layers are fused to produce node embedding for prediction. It helps find the different spatial-temporal ranges' influence for various nodes. Experiments are conducted on real-world traffic datasets, and results show that our model outperforms the state-of-the-art baselines.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131675966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Continual learning benefits from multiple sleep stages: NREM, REM, and Synaptic Downscaling
Pub Date: 2022-07-18 | DOI: 10.1109/IJCNN55064.2022.9891965
Brian S. Robinson, Clare W. Lau, Alexander New, S. Nichols, Erik C. Johnson, M. Wolmetz, W. Coon
Learning new tasks and skills in succession without overwriting or interfering with prior learning (i.e., “catastrophic forgetting”) is a computational challenge for both artificial and biological neural networks, yet artificial systems struggle to achieve even rudimentary parity with the performance and functionality apparent in biology. One of the processes found in biology that can be adapted for use in artificial systems is sleep, during which the brain deploys numerous neural operations relevant to continual learning and ripe for artificial adaptation. Here, we investigate how modeling three distinct components of mammalian sleep together affects continual learning in artificial neural networks: (1) a veridical memory replay process observed during non-rapid eye movement (NREM) sleep; (2) a generative memory replay process linked to REM sleep; and (3) a synaptic downscaling process which has been proposed to tune signal-to-noise ratios and support neural upkeep. To create this tripartite artificial sleep, we modeled NREM veridical replay by training the network using intermediate representations of samples from the current task. We modeled REM by using a generator network to create intermediate representations of samples from previous tasks for training. Synaptic downscaling, a novel contribution, is modeled using a size-dependent downscaling of network weights. We find benefits from the inclusion of all three sleep components when evaluating performance on a continual learning CIFAR-100 image classification benchmark. Maximum accuracy improved during training, and catastrophic forgetting was reduced during later tasks. While some catastrophic forgetting persisted over the course of network training, higher levels of synaptic downscaling led to better retention of early tasks and further facilitated the recovery of early task accuracy during subsequent training. One key takeaway is that there is a trade-off when choosing the level of synaptic downscaling: more aggressive downscaling better protects early tasks, while less downscaling enhances the ability to learn new tasks. Intermediate levels can strike a balance, with the highest overall accuracies during training. Overall, our results provide insight into how to adapt sleep components to enhance artificial continual learning systems and highlight areas where future neuroscientific sleep research could further such systems.
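As a rough illustration of the size-dependent weight downscaling described above, a multiplicative shrink applied preferentially to large weights might look like the following (the threshold rule and both knobs are our guesses, not the paper's exact rule):

```python
import torch

@torch.no_grad()
def synaptic_downscale(model, rate=0.05, threshold=0.1):
    # Sleep-inspired homeostasis sketch: shrink large synapses between tasks,
    # leaving small weights untouched. 'rate' and 'threshold' are illustrative.
    for p in model.parameters():
        big = p.abs() > threshold   # only touch the largest synapses
        p[big] *= 1.0 - rate        # multiplicative downscaling
```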
{"title":"Continual learning benefits from multiple sleep stages: NREM, REM, and Synaptic Downscaling","authors":"Brian S. Robinson, Clare W. Lau, Alexander New, S. Nichols, Erik C. Johnson, M. Wolmetz, W. Coon","doi":"10.1109/IJCNN55064.2022.9891965","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9891965","url":null,"abstract":"Learning new tasks and skills in succession without overwriting or interfering with prior learning (i.e., “catastrophic forgetting”) is a computational challenge for both artificial and biological neural networks, yet artificial systems struggle to achieve even rudimentary parity with the performance and functionality apparent in biology. One of the processes found in biology that can be adapted for use in artificial systems is sleep, in which the brain deploys numerous neural operations relevant to continual learning and ripe for artificial adaptation. Here, we investigate how modeling three distinct components of mammalian sleep together affects continual learning in artificial neural networks: (1) a veridical memory replay process observed during non-rapid eye movement (NREM) sleep; (2) a generative memory replay process linked to REM sleep; and (3) a synaptic downscaling process which has been proposed to tune signal-to-noise ratios and support neural upkeep. To create this tripartite artificial sleep, we modeled NREM veridical replay by training the network using intermediate representations of samples from the current task. We modeled REM by utilizing a generator network to create intermediate representations of samples from previous tasks for training. Synaptic downscaling, a novel con-tribution, is modeled utilizing a size-dependent downscaling of network weights. We find benefits from the inclusion of all three sleep components when evaluating performance on a continual learning CIFAR-100 image classification benchmark. Maximum accuracy improved during training and catastrophic forgetting was reduced during later tasks. While some catastrophic forget-ting persisted over the course of network training, higher levels of synaptic downscaling lead to better retention of early tasks and further facilitated the recovery of early task accuracy during subsequent training. One key takeaway is that there is a trade-off at hand when considering the level of synaptic downscaling to use - more aggressive downscaling better protects early tasks, but less downscaling enhances the ability to learn new tasks. Intermediate levels can strike a balance with the highest overall accuracies during training. Overall, our results both provide insight into how to adapt sleep components to enhance artificial continual learning systems and highlight areas for future neuroscientific sleep research to further such systems.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115527247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DAPID: A Differential-adaptive PID Optimization Strategy for Neural Network Training
Pub Date: 2022-07-18 | DOI: 10.1109/IJCNN55064.2022.9892746
Yulin Cai, Haoqian Wang
Derived from automatic control theory, the PID optimizer for neural network training can effectively inhibit the overshoot phenomenon of conventional optimization algorithms such as SGD-Momentum. However, its differential term may unexpectedly grow to a relatively large scale during iteration, which can amplify the inherent noise of input samples and deteriorate the training process. In this paper, we adopt a self-adaptive update rule for the PID optimizer's differential term, which uses both first-order and second-order moment estimation to approximate the differential's unbiased statistical value. This strategy prevents the differential term from diverging and accelerates the iteration without adding much computational cost. Empirical results on several popular machine learning datasets demonstrate that the proposed optimization strategy achieves favorable convergence acceleration as well as competitive accuracy compared with other stochastic optimization approaches.
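One plausible reading of this rule, combining the PID optimizer's P/I/D structure with Adam-like moment estimates on the differential term, is sketched below; the gains, decay rates, and placement of the bias correction are assumptions, not the paper's published algorithm.

```python
import torch

class DAPIDSketch:
    # P term: raw gradient; I term: momentum buffer; D term: gradient
    # difference, normalized by its own first/second moment estimates so
    # it cannot blow up and amplify sample noise.
    def __init__(self, params, lr=1e-2, kp=1.0, ki=3.0, kd=0.1,
                 beta1=0.9, beta2=0.999, eps=1e-8):
        self.params = [p for p in params]
        self.lr, self.kp, self.ki, self.kd = lr, kp, ki, kd
        self.beta1, self.beta2, self.eps = beta1, beta2, eps
        self.state = {p: {k: torch.zeros_like(p) for k in ("I", "m", "v", "g_prev")}
                      for p in self.params}
        self.t = 0

    @torch.no_grad()
    def step(self):
        self.t += 1
        for p in self.params:
            if p.grad is None:
                continue
            g, s = p.grad, self.state[p]
            s["I"].mul_(self.beta1).add_(g)                            # integral buffer
            d = g - s["g_prev"]                                        # raw differential
            s["m"].mul_(self.beta1).add_(d, alpha=1 - self.beta1)      # 1st moment of D
            s["v"].mul_(self.beta2).add_(d * d, alpha=1 - self.beta2)  # 2nd moment of D
            m_hat = s["m"] / (1 - self.beta1 ** self.t)                # bias correction
            v_hat = s["v"] / (1 - self.beta2 ** self.t)
            d_adapt = m_hat / (v_hat.sqrt() + self.eps)                # bounded D term
            p.add_(-self.lr * (self.kp * g + self.ki * s["I"] + self.kd * d_adapt))
            s["g_prev"].copy_(g)
```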
{"title":"DAPID: A Differential-adaptive PID Optimization Strategy for Neural Network Training","authors":"Yulin Cai, Haoqian Wang","doi":"10.1109/IJCNN55064.2022.9892746","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892746","url":null,"abstract":"Derived from automatic control theory, the PID optimizer for neural network training can effectively inhibit the overshoot phenomenon of conventional optimization algorithms such as SGD-Momentum. However, its differential term may unexpectedly have a relatively large scale during iteration, which may amplify the inherent noise of input samples and deteriorate the training process. In this paper, we adopt a self-adaptive iterating rule for the PID optimizer's differential term, which uses both first-order and second-order moment estimation to calculate the differential's unbiased statistical value approximately. Such strategy prevents the differential term from being divergent and accelerates the iteration without increasing much computational cost. Empirical results on several popular machine learning datasets demonstrate that the proposed optimization strategy achieves favorable acceleration of convergence as well as competitive accuracy compared with other stochastic optimization approaches.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115907499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pixel Rows and Columns Relationship Modeling Network based on Transformer for Retinal Vessel Segmentation
Pub Date: 2022-07-18 | DOI: 10.1109/IJCNN55064.2022.9892650
Zekang Qiu, J. Zhao, Chudong Shan, Jianyong Huang, Zhiyong Yuan
Automatic retinal vessel segmentation on fundus images quickly yields a clear retinal vessel structure, which helps doctors improve the efficiency and reliability of diagnosis. Fundus images contain many small vessels, areas of low contrast, and possibly abnormal areas, so achieving high-performance automatic retinal vessel segmentation remains challenging. The retinal vessels in an image form a topological structure, so the distribution of vessel pixels in each pixel row (or column) should be related to that of other rows (or columns). Motivated by this observation, we propose the Pixel Rows and Columns Relationship Modeling Network (PRCRM-Net) for high-performance retinal vessel segmentation. PRCRM-Net separately models the relationships between the pixel rows and between the pixel columns of a fundus image, and performs segmentation by classifying pixels in units of pixel rows and columns. Its input is the feature map extracted by a U-Net. PRCRM-Net first processes the input feature map into a row feature sequence and a column feature sequence. It then models the relationships between the elements of each sequence with a Transformer. Finally, the updated row and column feature sequences produce a row-based and a column-based segmentation result, respectively, and the final segmentation combines the two. To evaluate PRCRM-Net, we conduct comprehensive experiments on three representative datasets: DRIVE, STARE, and CHASE_DB1. The results show that PRCRM-Net achieves state-of-the-art performance.
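A minimal sketch of the row-sequence half of this idea follows (columns can be handled symmetrically by first transposing the spatial axes); treating each pixel row as one Transformer token, and all layer sizes, are our illustrative choices rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class RowRelationModel(nn.Module):
    # Flattens each pixel row of a (B, C, H, W) feature map into one token,
    # then lets a Transformer encoder model relations between rows.
    def __init__(self, channels, width, heads=4, layers=2):
        super().__init__()
        enc_layer = nn.TransformerEncoderLayer(
            d_model=channels * width, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)

    def forward(self, fmap):                                  # (B, C, H, W)
        b, c, h, w = fmap.shape
        rows = fmap.permute(0, 2, 1, 3).reshape(b, h, c * w)  # one token per row
        rows = self.encoder(rows)                             # inter-row relations
        return rows.reshape(b, h, c, w).permute(0, 2, 1, 3)

# Example: 16-channel U-Net features on a 64x64 grid.
model = RowRelationModel(channels=16, width=64)
out = model(torch.randn(2, 16, 64, 64))   # same shape in, same shape out
```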
{"title":"Pixel Rows and Columns Relationship Modeling Network based on Transformer for Retinal Vessel Segmentation","authors":"Zekang Qiu, J. Zhao, Chudong Shan, Jianyong Huang, Zhiyong Yuan","doi":"10.1109/IJCNN55064.2022.9892650","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892650","url":null,"abstract":"Performing automatic retinal vessel segmentation on fundus image can obtain clear retinal vessel structure quickly, which will assist doctors to improve the efficiency and reliability of diagnosis. In fundus image, there are many small vessels and some areas with low contrast, and there may be abnormal areas. Therefore, achieving automatic retinal vessel segmentation with high performance is still challenging. The retinal vessel in the image is a topological structure, so the distribution of retinal vessel pixels in each pixel row (or column) should have some relationship to other rows (or columns). Motivated by this observation, we propose Pixel Rows and Columns Relationship Modeling Network (PRCRM-Net) to achieve high-performance retinal vessel segmentation. PRCRM-Net separately models the relationship between different pixel rows and pixel columns of fundus image, and achieves retinal vessel segmentation by classifying the pixels in units of pixel row and pixel column. The input of PRCRM-Net is the feature map extracted by U-Net. PRCRM-Net firstly processes the input feature map into row feature sequence and column feature sequence respectively. Secondly, it models the relationship between the elements in the row feature sequence and column feature sequence respectively based on Transformer. Finally, the updated row feature sequence and column feature sequence are used to obtain row-based segmentation result and column-based segmentation result respectively. And the final segmentation result is the combination of these two types of results. To evaluate the performance of PRCRM-Net, we conduct comprehensive experiments on three representative datasets, DRIVE, STARE and CHASE_DB1. The experiment results show that the proposed PRCRM-Net achieves state-of-the-art performance.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124252526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-source Representation Enhancement for Wikipedia-style Entity Annotation
Pub Date: 2022-07-18 | DOI: 10.1109/IJCNN55064.2022.9892289
Kunyuan Pang, Shasha Li, Jintao Tang, Ting Wang
Entity annotation in Wikipedia (officially named wikilinks) greatly benefits human end-users. Human editors are required to select all mentions that are most helpful to end-users and link each mention to a Wikipedia page. We aim to design an automatic system that generates Wikipedia-style entity annotation for any plain text. However, existing research either relies heavily on mention-entity maps or is restricted to named entities only. Moreover, it neglects to select appropriate mentions as Wikipedia requires. As a result, it leaves out some necessary annotations and introduces excessive distracting ones. Existing benchmarks also skirt these coverage and selection issues. We propose a new task called Mention Detection and Selection for entity annotation, along with a new benchmark, WikiC, to better reflect annotation quality. The task centers on the mentions selected at each position in high-quality human-annotated examples. We also propose a new framework, DrWiki, to fulfill the task. We adopt a deep pre-trained span selection model that infers directly from plain text via tokens' context embeddings. It can cover all possible spans and is not limited by mention-entity maps. In addition, information from inarguable mention-entity pairs and from mention repetition is introduced as token-wise representation enhancement, via FLAT attention and a repeat embedding, respectively. Empirical results on WikiC show that, compared with widely adopted, state-of-the-art Entity Linking and Entity Recognition methods, our method improves overall performance. Additional experiments show that DrWiki still achieves improvements even with a low-coverage mention-entity map.
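The abstract states that spans are scored directly from tokens' context embeddings; a common boundary-based scorer is sketched below as a hypothetical stand-in (the bilinear scoring form and the span-length cap are our choices, not necessarily DrWiki's).

```python
import torch
import torch.nn as nn

class SpanSelectionHead(nn.Module):
    # Scores every candidate span (i, j) from its boundary token embeddings,
    # so mention detection is not restricted to a precompiled mention map.
    def __init__(self, hidden, max_len=10):
        super().__init__()
        self.start = nn.Linear(hidden, hidden)
        self.end = nn.Linear(hidden, hidden)
        self.max_len = max_len                      # cap candidate span length

    def forward(self, tokens):                      # tokens: (B, T, H)
        s, e = self.start(tokens), self.end(tokens)
        scores = torch.einsum("bih,bjh->bij", s, e) # score for span (i, j)
        t = tokens.size(1)
        i = torch.arange(t, device=tokens.device).view(-1, 1)
        j = torch.arange(t, device=tokens.device).view(1, -1)
        invalid = (j < i) | (j - i >= self.max_len) # end before start / too long
        return scores.masked_fill(invalid, float("-inf"))
```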
{"title":"Multi-source Representation Enhancement for Wikipedia-style Entity Annotation","authors":"Kunyuan Pang, Shasha Li, Jintao Tang, Ting Wang","doi":"10.1109/IJCNN55064.2022.9892289","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892289","url":null,"abstract":"Entity annotation in Wikipedia (officially named wikilinks) greatly benefits human end-users. Human editors are required to select all mentions that are most helpful to human end-users and link each mention to a Wikipedia page. We aim to design an automatic system to generate Wikipedia-style entity annotation for any plain text. However, existing research either rely heavily on mention-entity map or are restricted to named entities only. Besides, they neglect to select the appropriate mentions as Wikipedia requires. As a result, they leave out some necessary annotation and introduce excessive distracting annotation. Existing benchmarks also skirt around the coverage and selection issues. We propose a new task called Mention Detection and Se-lection for entity annotation, along with a new benchmark, WikiC, to better reflect annotation quality. The task is coined centering mentions specific to each position in high-quality human-annotated examples. We also proposed a new framework, DrWiki, to fulfill the task. We adopt a deep pre-trained span selection model inferring directly from plain text via tokens' context embedding. It can cover all possible spans and avoid limiting to mention-entity maps. In addition, information of both inarguable mention-entity pairs, and mention repeat has been introduced as token-wise representation enhancement by FLAT attention and repeat embedding respectively. Empirical results on WikiC show that, compared with often adopted and state-of-the-art Entity Linking and Entity Recognition methods, our method achieves improvement to previous methods in overall performance. Additional experiments show that DrWiki gains improvement even with a low-coverage mention-entity map.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114832372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Comparative Study and a New Industrial Platform for Decentralized Anomaly Detection Using Machine Learning Algorithms
Pub Date: 2022-07-18 | DOI: 10.1109/IJCNN55064.2022.9892939
Fabian Gerz, Tolga Renan Bastürk, Julian Kirchhoff, Joachim Denker, L. Al-Shrouf, M. Jelali
The occurrence of anomalies and unexpected, process-related faults is a major problem for manufacturing systems and has a significant impact on product quality. Early detection of anomalies is therefore of central importance to create sufficient room for countermeasures and to ensure product quality. This paper investigates the performance of machine learning (ML) algorithms for anomaly detection in sensor data streams. For this purpose, the performance of six ML algorithms (K-means, DBSCAN, Isolation Forest, OCSVM, LSTM network, and DeepAnt) is evaluated with defined performance metrics. The methods are benchmarked on publicly available datasets, our own synthetic datasets, and novel industrial datasets; the latter include radar sensor datasets from a hot rolling mill. The results show high detection performance for the K-means and DBSCAN algorithms and the LSTM network on point, collective, and contextual anomalies. A decentralized strategy for (real-time) anomaly detection on sensor data streams is proposed, and an industrial (Cloud-Edge Computing) platform is developed and implemented for this purpose.
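Isolation Forest is one of the six algorithms compared, and its scikit-learn API is stable enough to sketch a toy detector for point anomalies in a windowed sensor stream (window length, contamination rate, and the injected fault are all illustrative):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic 1-D sensor stream with a short injected fault.
rng = np.random.default_rng(0)
stream = np.sin(np.linspace(0, 40, 2000)) + 0.05 * rng.standard_normal(2000)
stream[700:705] += 3.0

# Score overlapping windows so collective anomalies can also be caught.
win = 20
windows = np.lib.stride_tricks.sliding_window_view(stream, win)
model = IsolationForest(contamination=0.01, random_state=0).fit(windows)
flags = model.predict(windows)                 # -1 marks anomalous windows
print("anomalous window indices:", np.where(flags == -1)[0][:10])
```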
{"title":"A Comparative Study and a New Industrial Platform for Decentralized Anomaly Detection Using Machine Learning Algorithms","authors":"Fabian Gerz, Tolga Renan Bastürk, Julian Kirchhoff, Joachim Denker, L. Al-Shrouf, M. Jelali","doi":"10.1109/IJCNN55064.2022.9892939","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892939","url":null,"abstract":"The occurrence of anomalies and unexpected, process-related faults is a major problem for manufacturing systems, which has a significant impact on product quality. Early detection of anomalies is therefore of central importance in order to create sufficient room for maneuver to take countermeasures and ensure product quality. This paper investigates the performance of machine learning (ML) algorithms for anomaly detection in sensor data streams. For this purpose, the performance of six ML algorithms (K-means, DBSCAN, Isolation Forest, OCSVM, LSTM-Network, and DeepAnt) is evaluated based on defined performance metrics. These methods are benchmarked on publicly available datasets, own synthetic datasets, and novel industrial datasets. The latter include radar sensor datasets from a hot rolling mill. Research results show a high detection performance of K-means algorithm, DBSCAN algorithm and LSTM network for punctual, collective and contextual anomalies. A decentralized strategy for (real-time) anomaly detection using sensor data streams is proposed and an industrial (Cloud-Edge Computing) platform is developed and implemented for this purpose.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114931615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
POPNASv2: An Efficient Multi-Objective Neural Architecture Search Technique
Pub Date: 2022-07-18 | DOI: 10.1109/IJCNN55064.2022.9892073
Andrea Falanti, Eugenio Lomurno, Stefano Samele, D. Ardagna, Matteo Matteucci
Automating the search for the best neural network model is a task that has gained more and more relevance in the last few years. In this context, Neural Architecture Search (NAS) represents the most effective technique, with results that rival state-of-the-art hand-crafted architectures. However, this approach requires substantial computational capability as well as search time, which makes its usage prohibitive in many real-world scenarios. With its sequential model-based optimization strategy, Progressive Neural Architecture Search (PNAS) represents a possible step forward in addressing this resource issue. Despite the quality of the network architectures it finds, this technique is still limited by its search time. A significant step in this direction was made by Pareto-Optimal Progressive Neural Architecture Search (POPNAS), which expands PNAS with a time predictor to enable a trade-off between search time and accuracy, framed as a multi-objective optimization problem. This paper proposes a new version of Pareto-Optimal Progressive Neural Architecture Search, called POPNASv2. Our approach enhances the first version and improves its performance. We expanded the search space by adding new operators and improved the quality of both predictors to build more accurate Pareto fronts. Moreover, we introduced cell equivalence checks and enriched the search strategy with an adaptive greedy exploration step. These efforts allow POPNASv2 to achieve PNAS-like performance with an average 4x search-time speed-up. Code: https://doi.org/10.5281/zenodo.6574040
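POPNAS-style methods keep only architectures on the accuracy/time Pareto front when deciding what to expand next. A minimal filter expressing that idea (not the authors' implementation) is:

```python
from typing import List, Tuple

def pareto_front(candidates: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    # Keep (accuracy, time) points not dominated by any other candidate,
    # i.e. no rival is at least as accurate AND at least as fast, with one strict.
    front = []
    for acc, time in candidates:
        dominated = any(a >= acc and t <= time and (a > acc or t < time)
                        for a, t in candidates)
        if not dominated:
            front.append((acc, time))
    return front

# The slower of two equally accurate networks is pruned:
print(pareto_front([(0.91, 120.0), (0.91, 90.0), (0.88, 60.0), (0.85, 200.0)]))
# -> [(0.91, 90.0), (0.88, 60.0)]
```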
{"title":"POPNASv2: An Efficient Multi-Objective Neural Architecture Search Technique","authors":"Andrea Falanti, Eugenio Lomurno, Stefano Samele, D. Ardagna, Matteo Matteucci","doi":"10.1109/IJCNN55064.2022.9892073","DOIUrl":"https://doi.org/10.1109/IJCNN55064.2022.9892073","url":null,"abstract":"Automating the research for the best neural network model is a task that has gained more and more relevance in the last few years. In this context, Neural Architecture Search (NAS) represents the most effective technique whose results rival the state of the art hand-crafted architectures. However, this approach requires a lot of computational capabilities as well as research time, which make prohibitive its usage in many real-world scenarios. With its sequential model-based optimization strategy, Progressive Neural Architecture Search (PNAS) represents a possible step forward to face this resources issue. Despite the quality of the found network architectures, this technique is still limited in research time. A significant step in this direction has been done by Pareto-Optimal Progressive Neural Architecture Search (POPNAS), which expand PNAS with a time predictor to enable a trade-off between search time and accuracy, considering a multi-objective optimization problem. This paper proposes a new version of the Pareto-Optimal Progressive Neural Architecture Search, called POPNASv2. Our approach enhances its first version and improves its performance. We expanded the search space by adding new operators and improved the quality of both predictors to build more accurate Pareto fronts. Moreover, we introduced cell equivalence checks and enriched the search strategy with an adaptive greedy exploration step. Our efforts allow POPNASv2 to achieve PNAS-like performance with an average 4x factor search time speed-up. Code: https://doi.org/10.5281/zenodo.6574040","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115027760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}