Pub Date: 2021-12-05
DOI: 10.1109/SSCI50451.2021.9660009
Muhammad Shabbir Abbasi, Harith Al-Sahaf, I. Welch
Malice or severity scoring models are a technique for detecting maliciousness. A few ransomware detection studies use malice scoring models to detect ransomware-like behavior. These models rely on a weighted sum of features, where both the features and their weights are chosen manually by a domain expert. To automate the modelling of malice scoring for ransomware detection, we propose a method based on Genetic Programming (GP) that automatically evolves a behavior-based malice scoring model by selecting appropriate features and functions from the input feature and operator sets. The experimental results show that the best-evolved model correctly assigned a malice score below the threshold value to over 85% of the unseen goodware instances, and above the threshold value to more than 99% of the unseen ransomware instances.
Title: Automated Behavior-based Malice Scoring of Ransomware Using Genetic Programming
Venue: 2021 IEEE Symposium Series on Computational Intelligence (SSCI)
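The manually weighted baseline that this abstract contrasts with GP can be illustrated with a minimal sketch. It shows only the weighted-sum idea and the threshold test; the feature names, weights, and threshold below are hypothetical and are not taken from the paper, and the GP-evolved model itself is not reproduced here.

```python
from typing import Mapping

# Hypothetical behavioural features with expert-chosen weights (illustrative only).
WEIGHTS = {
    "files_renamed": 0.5,
    "files_encrypted": 2.0,
    "shadow_copies_deleted": 5.0,
}
THRESHOLD = 10.0  # hypothetical decision threshold


def malice_score(features: Mapping[str, float],
                 weights: Mapping[str, float] = WEIGHTS) -> float:
    """Weighted sum of observed feature values; absent features count as 0."""
    return sum(w * features.get(name, 0.0) for name, w in weights.items())


def is_ransomware_like(features: Mapping[str, float],
                       threshold: float = THRESHOLD) -> bool:
    """Flag a sample when its malice score exceeds the threshold."""
    return malice_score(features) > threshold


if __name__ == "__main__":
    sample = {"files_renamed": 4, "files_encrypted": 3, "shadow_copies_deleted": 1}
    print(malice_score(sample), is_ransomware_like(sample))
```

In this sketch a domain expert would have to pick every entry of `WEIGHTS` by hand, which is the manual step the abstract proposes to automate.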
Pub Date: 2021-12-05
DOI: 10.1109/SSCI50451.2021.9660105
Ritvik Sai Teegavarapu, Debojit Biswas
Object detection and classification tasks can be addressed effectively using machine learning (ML) methods based on convolutional neural networks (CNNs) and region-based convolutional neural networks (R-CNNs). In this study, the ability of R-CNNs to distinguish between digital images of artificial and real objects is evaluated. A single-shot detection (SSD) network is also developed to serve as a baseline for comparative evaluation. Experiments are designed using several images of real and artificial leaves as inputs to the R-CNNs, which are trained and tested with different proposal areas of the images. The performance of the R-CNNs and SSDs is evaluated using the mean average precision (mAP) measure. Results from this study indicate that trained R-CNNs perform well in classifying real and artificial leaves and are robust to changes in many of the experimental factors, including minimal training data and image resolution. R-CNNs also performed better than SSDs in the classification tasks, with higher values of mAP. The performance of R-CNNs is affected by the proposal area, that is, the number of subsections the R-CNNs use to determine distinct characteristics of the objects (i.e., leaves) presented. Results based on limited experiments from this study indicate that R-CNNs and their variants are ideally suited for object classification tasks with numerous real-world applications.
Title: Classification of Artificial and Real Objects Using Faster Region-Based Convolutional Neural Networks
Venue: 2021 IEEE Symposium Series on Computational Intelligence (SSCI)
Pub Date: 2021-12-05
DOI: 10.1109/SSCI50451.2021.9659994
Anil Arpaci, Jun Chen, J. Drake, Tim Glover
Increasing competition in today's telecommunication industry drives the need for more cost-effective services. To reduce the cost of designing a fibre network with low capital expenditure, automation and optimisation of network design have become crucial. British Telecom's network design software, BT NetDesign, has been developed for network design and optimisation using a rich set of network/graph-based heuristics and the simulated annealing (SA) search method. Although NetDesign provides several different ways of navigating the search space via different move heuristics, the existing search method (SA) does not consistently reach the near-global optimum as the size of the network increases. To deal with larger networks, this study utilises an intelligent approach based on the well-known Luby sequence to combine move heuristics, using two separate learning schemes: frequency-based and bigram statistics. These two strategies are rigorously evaluated on network instances of different sizes. Experimental results on real-world case studies indicate that a bigram scheme with a longer warm-up period to learn heuristic combinations can reach high-quality solutions for large networks.
Title: Intelligent Strategies to Combine Move Heuristics in Selection Hyper-heuristics for Real-World Fibre Network Design Optimisation
Venue: 2021 IEEE Symposium Series on Computational Intelligence (SSCI)
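For readers unfamiliar with the Luby sequence mentioned in this abstract, the sketch below generates its terms (1, 1, 2, 1, 1, 2, 4, ...). It shows only the standard sequence definition; how the study maps these terms onto combinations of move heuristics is not described in the abstract and is not assumed here. The function name `luby` is illustrative.

```python
def luby(i: int) -> int:
    """Return the i-th term (1-indexed) of the Luby sequence.

    Definition: if i == 2**k - 1 for some k, the term is 2**(k - 1);
    otherwise, with 2**(k - 1) <= i < 2**k - 1, it equals the term at
    position i - 2**(k - 1) + 1.
    """
    if i < 1:
        raise ValueError("the Luby sequence is 1-indexed")
    # Find the smallest k such that 2**k - 1 >= i.
    k = 1
    while (1 << k) - 1 < i:
        k += 1
    if (1 << k) - 1 == i:
        return 1 << (k - 1)
    return luby(i - (1 << (k - 1)) + 1)


if __name__ == "__main__":
    print([luby(i) for i in range(1, 16)])
```

The sequence repeats short values often and introduces longer ones only rarely, which is why it is commonly used to schedule run lengths or restart intervals.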
Pub Date: 2021-12-05
DOI: 10.1109/SSCI50451.2021.9659914
Maximilian Münch, Simon Heilig, Philipp Väth, Frank-Michael Schleif
Life science data analysis frequently encounters particular challenges that cannot be solved with classical techniques from the data analytics or machine learning domains. The complex inherent structure of the data, and especially its encoding in non-standard ways (e.g., as genome or protein sequences, graph structures, or histograms), often limits the development of appropriate classification models. To address these limitations, the application of domain-specific expert similarity measures has gained a lot of attention in the past. However, the use of such expert measures suffers from two major drawbacks: (a) no single similarity measure guarantees success in all application scenarios, and (b) such similarity functions often lead to indefinite data that cannot be processed by classical machine learning methods. To tackle both of these limitations, this paper presents a method to embed indefinite life science data, described by various similarity measures at the same time, into a complex-valued vector space. We test our approach on various life science data sets and evaluate its performance against other competitive methods to show its efficiency.
Title: Scalable embedding of multiple perspectives for indefinite life-science data analysis
Venue: 2021 IEEE Symposium Series on Computational Intelligence (SSCI)
Pub Date: 2021-12-05
DOI: 10.1109/SSCI50451.2021.9659868
Qihao Shan, Alexander Heck, Sanaz Mostaghim
The best-of-n problem has been a popular research topic for understanding collective decision-making in recent years. Researchers aim to enable a swarm of agents to collectively converge to a single opinion out of a series of potential options, using only local interactions. In this paper, we investigate the viability of decision-making via majority rule using ranked voting systems in multi-option scenarios where n > 2. We focus on two ranked voting systems: single transferable vote (STV) and Borda count (BC). The proposed algorithms are tested in a discrete collective estimation scenario and compared against two benchmark algorithms: direct comparison (DC) and majority rule using first-past-the-post voting (FPTP). We analyze the experimental results, focusing on the trade-off between accuracy and speed in decision-making. We conclude that ranked voting systems can significantly improve the performance of collective decision-making strategies in multi-option scenarios. Our experiments show that BC is the best-performing algorithm in the studied scenario.
Title: Discrete Collective Estimation in Swarm Robotics with Ranked Voting Systems
Venue: 2021 IEEE Symposium Series on Computational Intelligence (SSCI)
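To make the difference between ranked voting and first-past-the-post concrete, here is a minimal sketch of Borda count next to FPTP over a set of complete rankings. It illustrates the two voting rules only; the swarm's local interaction protocol is not modelled, and the function names and the tie-breaking rule (smallest label wins) are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, Sequence


def borda_count(rankings: Sequence[Sequence[str]]) -> Dict[str, int]:
    """Score options from complete rankings (most preferred first).

    With m options in a ranking, the option at 0-based position p earns
    m - 1 - p points.
    """
    scores: Counter = Counter()
    for ranking in rankings:
        m = len(ranking)
        for position, option in enumerate(ranking):
            scores[option] += m - 1 - position
    return dict(scores)


def borda_winner(rankings: Sequence[Sequence[str]]) -> str:
    """Option with the highest Borda score; ties go to the smallest label."""
    scores = borda_count(rankings)
    return max(sorted(scores), key=scores.__getitem__)


def fptp_winner(rankings: Sequence[Sequence[str]]) -> str:
    """Option ranked first by the most voters; ties go to the smallest label."""
    firsts = Counter(ranking[0] for ranking in rankings)
    return max(sorted(firsts), key=firsts.__getitem__)


if __name__ == "__main__":
    votes = (
        [("A", "B", "C")] * 3
        + [("B", "C", "A")] * 2
        + [("C", "B", "A")] * 2
    )
    print(borda_count(votes), borda_winner(votes), fptp_winner(votes))
```

In this example FPTP selects A on first preferences alone, while Borda count selects B because it also uses the lower-ranked preferences of every voter.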
Pub Date: 2021-12-05
DOI: 10.1109/SSCI50451.2021.9660092
Darius Scheepers, N. Pillay
The research presented in this paper investigates the use of transfer learning in a genetic programming generation constructive hyper-heuristic for discrete optimisation, namely, the one-dimensional bin packing problem (1BPP). The source hyper-heuristic solves easy and medium problem instances from the Scholl benchmark set, and the target hyper-heuristic solves the hard problem instances in the same benchmark set. Performance is assessed in terms of objective value (i.e., the number of bins), computational effort, and generality of the hyper-heuristic. This study first compares the performance of two transfer learning approaches, previously shown to be effective for generation constructive hyper-heuristics, on the one-dimensional bin packing problem. Both of these approaches performed better than not using transfer learning, with the approach that transfers the best elements from each generation of the source hyper-heuristic to the target hyper-heuristic (TL2) producing the best results. The study then investigated transferring knowledge about an area of the search space rather than a point in the search space. Three approaches were developed and evaluated for this purpose. Two of these approaches were able to improve on the performance of TL2 on three of the ten problem instances with respect to objective value.
Title: A Study of Transfer Learning in a Generation Constructive Hyper-Heuristic for One Dimensional Bin Packing
Venue: 2021 IEEE Symposium Series on Computational Intelligence (SSCI)
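As a point of reference for the objective value used in this abstract, the sketch below packs items with the classic first-fit rule and reports the number of bins. First-fit is a hand-written constructive heuristic swapped in purely for illustration; it is not one of the heuristics evolved by the genetic programming hyper-heuristic, and the item sizes and capacity are made up.

```python
from typing import List, Sequence


def first_fit(items: Sequence[int], capacity: int) -> List[List[int]]:
    """Place each item into the first bin with enough remaining room.

    Opens a new bin when no existing bin can take the item.
    """
    bins: List[List[int]] = []
    for size in items:
        if size > capacity:
            raise ValueError(f"item of size {size} exceeds capacity {capacity}")
        for contents in bins:
            if sum(contents) + size <= capacity:
                contents.append(size)
                break
        else:
            # No existing bin had room for this item.
            bins.append([size])
    return bins


if __name__ == "__main__":
    packing = first_fit([4, 8, 1, 4, 2, 1], capacity=10)
    print(packing, "objective value (bins used):", len(packing))
```

A generation constructive hyper-heuristic would, roughly speaking, evolve the rule that decides which bin receives each item instead of fixing it to "the first bin that fits".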
Pub Date: 2021-12-05
DOI: 10.1109/SSCI50451.2021.9660149
Fevzi Tugrul Varna, P. Husbands
This paper introduces a new particle swarm optimisation variant: the altruistic heterogeneous particle swarm optimisation algorithm (AHPSO). The algorithm conceptualises particles as energy-driven agents with bio-inspired altruistic behaviour. In our approach, particles possess a current energy level and an activation threshold, and are in one of two possible states (active or inactive) depending on their energy levels at time t. The idea of altruism is used to form lending-borrowing relationships among particles to change an agent's state from inactive to active, and the main search mechanism exploits this idea. Diversity in the swarm, which prevents premature convergence, is maintained via agent states and the level of altruistic behaviour that particles exhibit. The performance of AHPSO was compared with 11 metaheuristics and 12 state-of-the-art PSO variants using the CEC'17 and CEC'05 test suites at 30 and 50 dimensions. The AHPSO algorithm outperformed all 23 comparison algorithms on both benchmark test suites at both 30 and 50 dimensions.
Title: AHPSO: Altruistic Heterogeneous Particle Swarm Optimisation Algorithm for Global Optimisation
Venue: 2021 IEEE Symposium Series on Computational Intelligence (SSCI)
Pub Date: 2021-12-05
DOI: 10.1109/SSCI50451.2021.9660121
Mathias Minos-Stensrud, H. Moen, Jan Dyre Bjerknes
The impact of information sharing in the generalized problem of combined search and task allocation has not been studied in detail in the context of multi-agent systems. Thus, a simple swarm intelligence mechanism called call-out and a basic game-theoretic auction mechanism are compared and analyzed in terms of communication distance and information fault tolerance. Simulations show that the auction mechanism performs well under varying communication distances but has problems when the communication distance is low and when facing faulty information. The call-out mechanism, however, performs significantly better when communication distances are low and when information transfer between agents is uncertain. Furthermore, call-out performs almost as well as auction for intermittent communication distances but, due to the inherent property of “over-coordination” for large communication distances, agents become “over-committed” to solving tasks at the expense of searching for new tasks. This fundamental system behavior can only be studied in the combined search and task allocation problem,
Title: Information sharing in multi-agent search and task allocation problems
Venue: 2021 IEEE Symposium Series on Computational Intelligence (SSCI)
Pub Date: 2021-12-05
DOI: 10.1109/SSCI50451.2021.9659907
Oliver Chang, Christiana Marchese, Jared Mejia, A. Clark
Neural networks (NNs) are becoming an increasingly important part of mobile robot control systems. Compared with traditional methods, NNs (and other data-driven techniques) produce comparable, if not better, results while requiring less engineering know-how. Training NNs, however, still requires exploration of a significant number of architectural, optimization, and evaluation options. In this study, we build a simulation environment, generate three image datasets using distinct techniques, train 652 models (including replicates) using a variety of architectures and paradigms (e.g., classification and regression), and evaluate the navigation ability of each model in simulation. Our goal is to explore a large number of model possibilities so that we can select the most promising for future study with a physical device. Training datasets leading to the best-performing models were those that included a significant amount of noise from seemingly inefficient actions. The most promising models explicitly incorporated “memory”, wherein previous actions were included as an input in the next step. Such models performed as well as or better than conventional convolutional NNs, recurrent NNs, and custom architectures including two camera frames. Although trained models perform well in an environment matching the distribution of the training dataset, they fail when the simulation environment is altered in a seemingly insignificant manner. In robotics research it is often taken for granted that a model with good validation characteristics will perform well on the underlying task, but the results presented here show that there can often be only a loose relationship between validation metrics and performance.
Title: Investigating Neural Network Architectures, Techniques, and Datasets for Autonomous Navigation in Simulation
Venue: 2021 IEEE Symposium Series on Computational Intelligence (SSCI)
Pub Date: 2021-12-05
DOI: 10.1109/SSCI50451.2021.9659545
Hepeng Li, Zhenhua Wang, Lusi Li, Haibo He
Microgrids provide power systems with an effective way to integrate distributed energy resources, increase power supply reliability, and reduce operational cost. However, intermittent renewable energy resources (RESs) make it challenging to operate a microgrid safely and economically based on forecasting. To overcome this issue, we develop an online energy management approach for efficient microgrid operation using safe deep reinforcement learning (SDRL). By considering uncertainties and AC power flow, the proposed method formulates online microgrid energy management as a constrained Markov decision process (CMDP). The objective is to find a safety-guaranteed scheduling policy that minimizes the total operational cost. To achieve this, we use an SDRL method to learn a neural network-based policy based on constrained policy optimization (CPO). Unlike traditional DRL methods, which allow an agent to freely explore any behavior during training, the proposed method limits exploration to safe policies that satisfy AC power flow constraints during training. The proposed method is model-free and does not require predictive information or an explicit model of the microgrid. The proposed method is trained and tested on a medium-voltage distribution network with real-world power grid data from the California Independent System Operator (CAISO). Simulation results verify the effectiveness and superiority of the proposed method over traditional DRL approaches.
Title: Online Microgrid Energy Management Based on Safe Deep Reinforcement Learning
Venue: 2021 IEEE Symposium Series on Computational Intelligence (SSCI)
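The constrained Markov decision process named in this abstract can be written in a generic textbook form, sketched below. All symbols are assumptions introduced for illustration rather than the paper's notation: $c$ stands for the per-step operational cost, $d_i$ for a per-step cost measuring violation of the $i$-th safety constraint (for example, an AC power flow limit), $\delta_i$ for its tolerance, $\gamma$ for a discount factor, and $T$ for the scheduling horizon.

```latex
\begin{aligned}
\min_{\pi}\quad
  & J_C(\pi) = \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{T} \gamma^{t}\, c(s_t, a_t)\right] \\
\text{s.t.}\quad
  & J_{D_i}(\pi) = \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{T} \gamma^{t}\, d_i(s_t, a_t)\right] \le \delta_i,
  \qquad i = 1, \dots, m
\end{aligned}
```

Under this reading, an unconstrained DRL method would optimise $J_C$ alone, whereas a constrained method such as CPO also keeps every $J_{D_i}$ within its tolerance while the policy is being updated.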