Differentially private transferrable deep learning with membership-mappings
Pub Date: 2022-12-15 | DOI: 10.1007/s43674-022-00049-5
Mohit Kumar
Despite a recent surge of research interest in privacy and transferrable deep learning, optimizing the tradeoff between privacy requirements and the performance of machine learning models remains a challenge. This motivates the development of an approach that optimizes both the privacy-preservation mechanism and the learning of deep models to achieve robust performance. This paper considers the problem of semi-supervised transfer and multi-task learning under the differential privacy framework. An alternative conception of the deep autoencoder, referred to as the Conditionally Deep Membership-Mapping Autoencoder (CDMMA), is considered for transferrable deep learning. Under practice-oriented settings, an analytical solution for the learning of the CDMMA can be derived by means of variational optimization. The paper proposes a transfer and multi-task learning approach that combines the CDMMA with a tailored noise-adding mechanism to transfer knowledge from the source to the target domain in a privacy-preserving manner.
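The tailored noise-adding mechanism itself is not reproduced in the abstract; as a rough illustration only, the sketch below privatizes source-domain representations with a standard Gaussian mechanism before they are shared with the target domain. The function name, data shapes, and privacy parameters are assumptions for the example, not the authors' construction.

```python
import numpy as np

def gaussian_mechanism(features, epsilon, delta, l2_sensitivity):
    """Perturb a feature matrix with Gaussian noise calibrated to (epsilon, delta)-DP.

    The noise scale follows the classical analytic bound
    sigma >= sqrt(2 * ln(1.25 / delta)) * l2_sensitivity / epsilon.
    """
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) * l2_sensitivity / epsilon
    noise = np.random.normal(loc=0.0, scale=sigma, size=features.shape)
    return features + noise

# Example: privatize source-domain representations before transferring them
# to the target domain (shapes and values are illustrative only).
source_repr = np.random.rand(100, 32)          # 100 samples, 32-dim latent codes
private_repr = gaussian_mechanism(source_repr, epsilon=1.0, delta=1e-5,
                                  l2_sensitivity=1.0)
```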
{"title":"Differentially private transferrable deep learning with membership-mappings","authors":"Mohit Kumar","doi":"10.1007/s43674-022-00049-5","DOIUrl":"10.1007/s43674-022-00049-5","url":null,"abstract":"<div><p>Despite a recent surge of research interest in privacy and transferrable deep learning, optimizing the tradeoff between privacy requirements and performance of machine learning models remains a challenge. This motivates the development of an approach that optimizes both privacy-preservation mechanism and learning of the deep models for achieving a robust performance. This paper considers the problem of semi-supervised transfer and multi-task learning under differential privacy framework. An alternative conception of deep autoencoder, referred to as <i>Conditionally Deep Membership-Mapping Autoencoder (CDMMA)</i>, is considered for transferrable deep learning. Under practice-oriented settings, an analytical solution for the learning of CDMMA can be derived by means of variational optimization. The paper proposes a transfer and multi-task learning approach that combines CDMMA with a tailored noise adding mechanism to transfer knowledge from source to target domain in a privacy-preserving manner.</p></div>","PeriodicalId":72089,"journal":{"name":"Advances in computational intelligence","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50483727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A novel hybrid dimension reduction and deep learning-based classification for neuromuscular disorder
Pub Date: 2022-11-21 | DOI: 10.1007/s43674-022-00047-7
Babita Pandey, Devendra Kumar Pandey, Aditya Khamparia, Seema Shukla
Correct classification of neuromuscular disorders is essential for accurate diagnosis. Gene microarray technology is now a widely accepted way to monitor the expression levels of a large number of genes simultaneously. Gene microarray data are high dimensional, typically containing few samples but a very large number of genes, so dimension reduction is a crucial step for correct disease classification: it eliminates the less expressive genes and improves the efficiency of the classification model. In this paper, we develop a novel hybrid dimension reduction method and a deep learning-based classification model for neuromuscular disorders. The hybrid dimension reduction method proceeds in three phases: in the first phase, expressive genes are selected with both the F-test method and the mutual information method, and the better of the two selections is retained for further processing; in the second phase, the selected genes are projected to a lower dimension by PCA; in the third phase, the deep learning-based classification model is applied. For the experiments, two publicly available microarray data sets, a two-disease set and a multi-disease set, are used. With the hybrid dimension reduction (100 genes selected by the F-test followed by PCA with 50 principal components), a 50-100-50-25-13 deep learning architecture achieves the best accuracy of 89% on the NMD data set, and a 50-100-2 architecture achieves the best accuracy of 97% on the FSHD data set. The proposed hybrid method gives better classification accuracy and also reduces the search space and time complexity on both data sets.
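Assuming a standard scikit-learn realization of the three-phase pipeline described above (F-test or mutual-information gene selection, PCA to 50 components, then a multi-layer classifier), a minimal sketch might look as follows; the synthetic data, random labels, and cross-validation setup are placeholders, not the paper's NMD/FSHD experiments.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative data: 150 samples x 2000 genes with 13 disease classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 2000))
y = rng.integers(0, 13, size=150)

def build_pipeline(selector):
    # Phase 1: select 100 expressive genes; Phase 2: PCA to 50 components;
    # Phase 3: a deep (multi-layer) classifier on the reduced features.
    return make_pipeline(
        SelectKBest(selector, k=100),
        StandardScaler(),
        PCA(n_components=50),
        MLPClassifier(hidden_layer_sizes=(50, 100, 50, 25), max_iter=500),
    )

# Keep whichever filter (F-test or mutual information) scores better.
candidates = {"f_test": f_classif, "mutual_info": mutual_info_classif}
scores = {name: cross_val_score(build_pipeline(sel), X, y, cv=3).mean()
          for name, sel in candidates.items()}
best = max(scores, key=scores.get)
print(f"selected filter: {best}, CV accuracy: {scores[best]:.2f}")
```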
{"title":"A novel hybrid dimension reduction and deep learning-based classification for neuromuscular disorder","authors":"Babita Pandey, Devendra Kumar Pandey, Aditya Khamparia, Seema Shukla","doi":"10.1007/s43674-022-00047-7","DOIUrl":"10.1007/s43674-022-00047-7","url":null,"abstract":"<div><p>Correct classification of neuromuscular disorders is essential to provide accurate diagnosis. Presently, gene microarray technology is a widely accepted technology to monitor the expression level of a large number of genes simultaneously. The gene microarray data are a high dimensional data, which usually contains small samples having a large number of genes. Therefore, dimension reduction is a crucial task for correct classification of diseases. Dimension reduction eliminates those genes which are less expressive and enhances the efficiency of the classification model. In the present paper, we developed a novel hybrid dimension reduction method and a deep learning-based classification model for neuromuscular disorders. The hybrid dimension reduction method is deployed in three phase: in the first phase, the expressive genes are selected using <i>F</i> test method, and the mutual information method and the best one among them are selected for further processing. In second phase, the gene selected by the best model is further transformed to low dimension by PCA. In third phase, the deep learning-based classification model is deployed. For experimentation, two diseased and multi-diseased micro array data sets, which is publicly available, is used. The best accuracy by 50-100-50-25-13 deep learning architecture with hybrid dimension reduction, where 100 genes select by <i>F</i> test and PCA with 50 principal components is 89% for NMD data set. The best accuracy by 50-100-2 deep learning architecture with hybrid dimension reduction, where 100 genes select by <i>F</i> test and PCA with 50 principal components is 97% for FSHD data set. The proposed hybrid method gives better classification accuracy result and reduces the search space and time complexity as well for both two diseased and multi-diseased micro array data sets.</p></div>","PeriodicalId":72089,"journal":{"name":"Advances in computational intelligence","volume":"2 6","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50503646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An effective Reinforcement Learning method for preventing the overfitting of Convolutional Neural Networks
Pub Date: 2022-09-29 | DOI: 10.1007/s43674-022-00046-8
Ali Mahdavi-Hormat, Mohammad Bagher Menhaj, Ashkan Shakarami
Convolutional Neural Networks are machine learning models with proven abilities across many kinds of tasks, but they sometimes suffer from overfitting. This paper proposes a Reinforcement Learning-based method to address this problem. The parameters of a target layer in the Convolutional Neural Network are taken as the state of the Reinforcement Learning Agent, and the Agent's actions form the parameters of a hyperbolic secant function, whose form is thereby changed gradually and implicitly by the proposed method. The inputs of this function are the weights of the layer, and its outputs are multiplied by those same weights to update them. The method is inspired by the Deep Deterministic Policy Gradient model because the Agent's actions lie in a continuous domain. To show the method's effectiveness, a classification task with Convolutional Neural Networks is considered, and seven datasets are used for evaluation: MNIST, Extended MNIST, small-notMNIST, Fashion-MNIST, Sign Language MNIST, CIFAR-10, and CIFAR-100.
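A minimal sketch of the weight-modulation step described above, under the assumption that the Agent's continuous action supplies two parameters of a hyperbolic secant applied to the layer weights; the parameterization (a, b) is a guess for illustration, not the paper's definition.

```python
import numpy as np

def sech(x):
    # Hyperbolic secant: 1 / cosh(x).
    return 1.0 / np.cosh(x)

def modulate_weights(weights, action):
    """Rescale a layer's weights with a parameterized hyperbolic secant.

    `action` = (a, b) is the continuous action produced by the RL agent;
    the exact form used in the paper is not given in the abstract, so this
    is only one plausible choice: w <- w * sech(a * w + b).
    """
    a, b = action
    return weights * sech(a * weights + b)

# Illustrative usage with a random 3x3 convolution kernel.
kernel = np.random.randn(3, 3, 16, 32)   # H x W x in_channels x out_channels
updated_kernel = modulate_weights(kernel, action=(0.5, 0.0))
```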
{"title":"An effective Reinforcement Learning method for preventing the overfitting of Convolutional Neural Networks","authors":"Ali Mahdavi-Hormat, Mohammad Bagher Menhaj, Ashkan Shakarami","doi":"10.1007/s43674-022-00046-8","DOIUrl":"10.1007/s43674-022-00046-8","url":null,"abstract":"<div><p>Convolutional Neural Networks are machine learning models that have proven abilities in many variants of tasks. This powerful machine learning model sometimes suffers from overfitting. This paper proposes a method based on Reinforcement Learning for addressing this problem. In this research, the parameters of a target layer in the Convolutional Neural Network take as a state for the Agent of the Reinforcement Learning section. Then the Agent gives some actions as forming parameters of a hyperbolic secant function. This function’s form is changed gradually and implicitly by the proposed method. The inputs of the function are the weights of the layer, and its outputs multiply by the same weights to updating them. In this study, the proposed method is inspired by the Deep Deterministic Policy Gradient model because the actions of the Agent are into a continuous domain. To show the proposed method’s effectiveness, the classification task is considered using Convolutional Neural Networks. In this study, 7 datasets have been used for evaluating the model; MNIST, Extended MNIST, small-notMNIST, Fashion-MNIST, sign language MNIST, CIFAR-10, and CIFAR-100.\u0000</p></div>","PeriodicalId":72089,"journal":{"name":"Advances in computational intelligence","volume":"2 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50524501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards edge devices implementation: deep learning model with visualization for COVID-19 prediction from chest X-ray
Pub Date: 2022-09-28 | DOI: 10.1007/s43674-022-00044-w
Shaline Jia Thean Koh, Marwan Nafea, Hermawan Nugroho
Due to the global outbreak of COVID-19, countries around the world are facing shortages of resources (e.g., testing kits and medicine). Quick diagnosis of COVID-19 and isolation of patients are crucial in curbing the pandemic, especially in rural areas, because the disease is highly contagious and spreads easily. To assist doctors, several studies have proposed initial detection of COVID-19 cases from radiological images. In this paper, we propose an alternative method for analyzing chest X-ray images that provides an efficient and accurate diagnosis of COVID-19 and can run on edge devices, enabling the deep learning model to be deployed in practical applications. Convolutional neural network models are fine-tuned with transfer learning techniques to predict COVID-19 and pneumonia infection from chest X-ray images. The developed model yielded an accuracy of 98.13%, sensitivity of 97.7%, and specificity of 99.1%. To highlight the important regions in the X-ray images that direct the model to its prediction, we adopted Gradient-weighted Class Activation Mapping (Grad-CAM). The heat maps generated by Grad-CAM were then compared with X-ray images annotated by board-certified radiologists, and the findings strongly correlate with clinical evidence. For practical deployment, we implemented the trained model on an edge device (NCS2), achieving a 90% improvement in inference speed compared to a CPU. This shows that the developed model has the potential to be deployed on the edge, for example in primary care clinics and rural areas that are not well equipped or lack stable internet connections.
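The abstract does not state which CNN backbone was used; assuming a standard transfer-learning recipe with an ImageNet-pretrained ResNet-18, a minimal fine-tuning sketch could look as follows. The three output classes, dummy batch, and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained backbone and freeze its feature extractor.
model = models.resnet18(weights="IMAGENET1K_V1")
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head for three assumed classes:
# COVID-19, (non-COVID) pneumonia, and normal.
num_classes = 3
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only the new head is trained during fine-tuning.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch of chest X-ray tensors.
images = torch.randn(8, 3, 224, 224)           # batch of 8 normalized X-rays
labels = torch.randint(0, num_classes, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```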
{"title":"Towards edge devices implementation: deep learning model with visualization for COVID-19 prediction from chest X-ray","authors":"Shaline Jia Thean Koh, Marwan Nafea, Hermawan Nugroho","doi":"10.1007/s43674-022-00044-w","DOIUrl":"10.1007/s43674-022-00044-w","url":null,"abstract":"<div><p>Due to the outbreak of COVID-19 disease globally, countries around the world are facing shortages of resources (i.e. testing kits, medicine). A quick diagnosis of COVID-19 and isolating patients are crucial in curbing the pandemic, especially in rural areas. This is because the disease is highly contagious and can spread easily. To assist doctors, several studies have proposed an initial detection of COVID-19 cases using radiological images. In this paper, we propose an alternative method for analyzing chest X-ray images to provide an efficient and accurate diagnosis of COVID-19 which can run on edge devices. The approach acts as an enabler for the deep learning model to be deployed in practical application. Here, the convolutional neural network models which are fine-tuned to predict COVID-19 and pneumonia infection from chest X-ray images are developed by adopting transfer learning techniques. The developed model yielded an accuracy of 98.13%, sensitivity of 97.7%, and specificity of 99.1%. To highlight the important regions in the X-ray images which directs the model to its decision/prediction, we adopted the Gradient Class Activation Map (Grad-CAM). The generated heat maps from the Grad-CAM were then compared with the annotated X-ray images by board-certified radiologists. Results showed that the findings strongly correlate with clinical evidence. For practical deployment, we implemented the trained model in edge devices (NCS2) and this has achieved an improvement of 90% in inference speed compared to CPU. This shows that the developed model has the potential to be implemented on the edge, for example in primary care clinics and rural areas which are not well-equipped or do not have access to stable internet connections.</p></div>","PeriodicalId":72089,"journal":{"name":"Advances in computational intelligence","volume":"2 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s43674-022-00044-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40391902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Canola and soybean oil price forecasts via neural networks
Pub Date: 2022-09-15 | DOI: 10.1007/s43674-022-00045-9
Xiaojie Xu, Yun Zhang
Forecasts of commodity prices are vital to market participants and policy-makers, and cooking oil prices are no exception given their importance as a main food resource. In the present study, we address this forecasting problem using weekly wholesale price indices of canola and soybean oil in China from January 1, 2010 to January 3, 2020, employing the non-linear auto-regressive neural network as the forecasting tool. We evaluate forecast performance across model settings spanning training algorithms, delays, numbers of hidden neurons, and data-splitting ratios in arriving at the final models for the two commodities, which are relatively simple and yield accurate and stable results. In particular, the model for the canola oil price index produces relative root mean square errors of 2.66%, 1.46%, and 2.17% for training, validation, and testing, respectively, and the model for the soybean oil price index produces 2.33%, 1.96%, and 1.98%. The analysis demonstrates the usefulness of the neural network technique for commodity price forecasting. Our results may serve as technical forecasts on a standalone basis or be combined with fundamental forecasts to inform views of price trends and the corresponding policy analysis.
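A minimal sketch of a non-linear auto-regressive forecast of a weekly price index, using lagged values as inputs to a small neural network; the number of delays, the splitting ratio, the synthetic series, and the relative RMSE computation are assumptions for illustration, not the paper's exact settings.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.neural_network import MLPRegressor

def make_lagged(series, n_lags):
    """Turn a univariate series into (lagged inputs, next value) pairs."""
    X = np.column_stack([series[i:len(series) - n_lags + i] for i in range(n_lags)])
    y = series[n_lags:]
    return X, y

# Illustrative weekly price index (a random walk stands in for real data).
rng = np.random.default_rng(1)
prices = 100 + np.cumsum(rng.normal(0, 1, 520))   # ~10 years of weekly values

n_lags = 4                       # number of auto-regressive delays (assumed)
X, y = make_lagged(prices, n_lags)
split = int(0.85 * len(y))       # chronological train/test split (assumed ratio)

model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
model.fit(X[:split], y[:split])

pred = model.predict(X[split:])
rel_rmse = np.sqrt(mean_squared_error(y[split:], pred)) / np.mean(y[split:]) * 100
print(f"relative RMSE on the held-out period: {rel_rmse:.2f}%")
```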
{"title":"Canola and soybean oil price forecasts via neural networks","authors":"Xiaojie Xu, Yun Zhang","doi":"10.1007/s43674-022-00045-9","DOIUrl":"10.1007/s43674-022-00045-9","url":null,"abstract":"<div><p>Forecasts of commodity prices are vital issues to market participants and policy-makers. Those of cooking section oil are of no exception, considering its importance as one of main food resources. In the present study, we assess the forecast problem using weekly wholesale price indices of canola and soybean oil in China during January 1, 2010–January 3, 2020, by employing the non-linear auto-regressive neural network as the forecast tool. We evaluate forecast performance of different model settings over algorithms, delays, hidden neurons, and data splitting ratios in arriving at the final models for the two commodities, which are relatively simple and lead to accurate and stable results. Particularly, the model for the price index of canola oil generates relative root mean square errors of 2.66, 1.46, and 2.17% for training, validation, and testing, respectively, and the model for the price index of soybean oil generates relative root mean square errors of 2.33, 1.96, and 1.98% for training, validation, and testing, respectively. Through the analysis, we show usefulness of the neural network technique for commodity price forecasts. Our results might serve as technical forecasts on a standalone basis or be combined with other fundamental forecasts for perspectives of price trends and corresponding policy analysis.</p></div>","PeriodicalId":72089,"journal":{"name":"Advances in computational intelligence","volume":"2 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s43674-022-00045-9.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50485510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Attributed community search based on seed replacement and joint random walk
Pub Date: 2022-09-01 | DOI: 10.1007/s43674-022-00041-z
Ju Li, Huifang Ma
Community search enables personalized community discovery and has wide applications in real-life scenarios. Existing attributed community search algorithms use the personalized information provided by attributes to locate the desired community. Although they achieve promising results, existing works suffer from two major limitations: (i) the precision of the algorithm decreases significantly when the seed comes from the boundary regions of the community, and (ii) most attributed community search methods take attribute information merely as edge weights to reveal semantic strength (e.g., attribute similarity or attribute distance), largely ignoring that an attribute may serve as a heterogeneous vertex. To make up for these deficiencies, this paper proposes a novel two-stage attributed community search method with seed replacement and joint random walk (SRRW). Specifically, in the seed replacement stage, the initial query node is replaced with a core node; in the random walk stage, attributes are taken as heterogeneous nodes and an augmented graph is modeled, based on the affiliation of the attributes, via an overlapping clustering algorithm. Finally, a joint random walk is performed on the augmented graph to explore the desired local community. Extensive experiments on both synthetic and real-world benchmarks demonstrate the method's effectiveness for attributed community search.
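As a rough illustration of the random-walk stage, the sketch below builds an attribute-augmented graph (attributes as heterogeneous nodes) and approximates a seed-anchored walk with personalized PageRank; the toy graph, the `attr::` naming, and the substitution of PageRank for SRRW's specific joint walk are assumptions for the example.

```python
import networkx as nx

# Toy attributed graph: structural edges plus vertex attributes.
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
attributes = {1: ["ml"], 2: ["ml"], 3: ["ml", "db"], 4: ["db"], 5: ["db"], 6: ["db"]}

# Augmented graph: attributes become heterogeneous nodes linked to the
# vertices that carry them (a simplification of the paper's construction,
# which groups attributes via overlapping clustering first).
G = nx.Graph(edges)
for node, attrs in attributes.items():
    for a in attrs:
        G.add_edge(node, f"attr::{a}")

# A random walk with restart from the seed, approximated here with
# personalized PageRank; the highest-scoring structural nodes form the
# candidate local community.
seed = 2
scores = nx.pagerank(G, alpha=0.85, personalization={seed: 1.0})
community = sorted((n for n in scores if not str(n).startswith("attr::")),
                   key=scores.get, reverse=True)[:3]
print("candidate community around seed", seed, ":", community)
```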
{"title":"Attributed community search based on seed replacement and joint random walk","authors":"Ju Li, Huifang Ma","doi":"10.1007/s43674-022-00041-z","DOIUrl":"10.1007/s43674-022-00041-z","url":null,"abstract":"<div><p>Community search enables personalized community discovery and has wide applications in real-life scenarios. Existing attributed community search algorithms use personalized information provided by attributes to locate desired community. Though achieved promising results, existing works suffer from two major limitations: (i) the precision of the algorithm decreases significantly when the seed comes from the boundary regions of the community. (ii) Most attributed community search methods mainly take the attribute information as edge weights to reveal semantic strength (e.g., attribute similarity, attribute distance, etc.), but largely ignore that attribute may serve as heterogeneous vertex. To make up for these deficiencies, in this paper, we propose a novel two-stage attributed community search method with seed replacement and joint random walk (SRRW). Specifically, in the seed replacement stage, we replace the initial query node with a core node; in the random walk stage, attributes are taken as heterogeneous nodes and the augmented graph is modeled based on the affiliation of the attributes via an overlapping clustering algorithm. And finally, a joint random walk is performed on the augmented graph to explore the desired local community. We conduct extensive experiments on both synthetic and real-world benchmarks, demonstrating its effectiveness for attributed community search.</p></div>","PeriodicalId":72089,"journal":{"name":"Advances in computational intelligence","volume":"2 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50437271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detection of cyber attacks on smart grids
Pub Date: 2022-08-31 | DOI: 10.1007/s43674-022-00042-y
Aditi Kar Gangopadhyay, Tanay Sheth, Tanmoy Kanti Das, Sneha Chauhan
The paper analyzes observations using a logic-based numerical methodology implemented in Python. Logical Analysis of Data (LAD) specializes in selecting a minimal number of features and finding unique patterns within them to distinguish ‘positive’ from ‘negative’ observations. The Python implementation of the classification model is further improved by introducing adaptations to the pattern generation techniques. Finally, a case study of the Power Attack Systems Dataset, used to improve Smart Grid technology, is performed to explore real-life applications of the classification model and to analyze its performance against commonly used techniques.
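A minimal, assumption-laden sketch of two basic LAD ingredients, cut-point binarization and pattern coverage, is shown below; the toy data, cut-points, and the hand-written pattern are illustrative and do not reflect the authors' adapted pattern-generation techniques.

```python
import numpy as np

def binarize(X, cut_points):
    """LAD-style binarization: each numeric feature j is turned into
    indicator variables x_j >= c for every cut-point c of that feature."""
    columns = []
    for j, cuts in enumerate(cut_points):
        for c in cuts:
            columns.append((X[:, j] >= c).astype(int))
    return np.column_stack(columns)

def covers(pattern, binary_row):
    """A pattern is a set of (literal index, required value) pairs;
    it covers a row when every literal matches."""
    return all(binary_row[i] == v for i, v in pattern)

# Toy measurements (e.g., two sensor readings per observation).
X = np.array([[2.0, 7.1], [2.4, 6.8], [0.9, 3.2], [1.1, 2.9]])
y = np.array([1, 1, 0, 0])                 # 1 = attack, 0 = normal
cut_points = [[1.5], [5.0]]                # assumed thresholds per feature

B = binarize(X, cut_points)
pattern = [(0, 1), (1, 1)]                 # "reading1 >= 1.5 AND reading2 >= 5.0"
coverage = [covers(pattern, row) for row in B]
print("pattern covers:", coverage)         # True for the attack rows, False otherwise
```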
{"title":"Detection of cyber attacks on smart grids","authors":"Aditi Kar Gangopadhyay, Tanay Sheth, Tanmoy Kanti Das, Sneha Chauhan","doi":"10.1007/s43674-022-00042-y","DOIUrl":"10.1007/s43674-022-00042-y","url":null,"abstract":"<div><p>The paper analyzes observations using a logic-based numerical methodology in Python. The Logical Analysis of Data (LAD) specializes in selecting a minimal number of features and finding unique patterns within it to distinguish ‘positive’ from ‘negative’ observations. The Python implementation of the classification model is further improved by introducing adaptations to pattern generation techniques. Finally, a case study of the Power Attack Systems Dataset used to improvise Smart Grid technology is performed to explore real-life applications of the classification model and analyze its performance against commonly used techniques.</p></div>","PeriodicalId":72089,"journal":{"name":"Advances in computational intelligence","volume":"2 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50527969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
New models of classifier learning curves
Pub Date: 2022-07-16 | DOI: 10.1007/s43674-022-00040-0
Vincent Berthiaume
In machine learning, a classifier has a learning curve, i.e. the curve of the error/success probability as a function of the training set size. Finding the learning curve over a large interval of sizes takes a lot of processing time. A better method is to estimate the error probabilities only for a few minimal sizes and use the size-estimate pairs as data points to model the learning curve. Researchers have tested different models; these models have certain parameters and are conceived from curves that only have the general shape of a real learning curve. In this paper, we propose two new models that have more parameters and are conceived from real learning curves of nearest neighbour classifiers. These two differences increase the chance that the new models fit the learning curve better. We test the new models on one-input, two-class nearest neighbour classifiers.
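The paper's two new models are not given in the abstract; as an illustration of the general procedure of fitting a parametric learning-curve model to a few size-estimate pairs, the sketch below fits a classical three-parameter power law with SciPy. The data points and initial guess are made up for the example.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    """A classical three-parameter learning-curve model: error(n) = a * n**(-b) + c.
    This is a stand-in; the paper's own richer models are not reproduced here."""
    return a * np.power(n, -b) + c

# Error probabilities estimated at a few small training-set sizes (illustrative).
sizes = np.array([10, 20, 40, 80, 160])
errors = np.array([0.31, 0.24, 0.19, 0.16, 0.14])

params, _ = curve_fit(power_law, sizes, errors, p0=(1.0, 0.5, 0.1), maxfev=10000)
a, b, c = params
print(f"fitted model: error(n) ~= {a:.2f} * n^(-{b:.2f}) + {c:.2f}")
print(f"extrapolated error at n=1000: {power_law(1000, *params):.3f}")
```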
{"title":"New models of classifier learning curves","authors":"Vincent Berthiaume","doi":"10.1007/s43674-022-00040-0","DOIUrl":"10.1007/s43674-022-00040-0","url":null,"abstract":"<div><p>In machine learning, a classifier has a certain learning curve i.e. the curve of the error/success probability as a function of the training set size. Finding the learning curve for a large interval of sizes takes a lot of processing time. A better method is to estimate the error probabilities only for few minimal sizes and use the pairs size-estimate as data points to model the learning curve. Searchers have tested different models. These models have certain parameters and are conceived from curves that only have the general aspect of a real learning curve. In this paper, we propose two new models that have more parameters and are conceived from real learning curves of nearest neighbour classifiers. These two main differences increase the chance for these new models to fit better the learning curve. We test these new models on one-input and two-class nearest neighbour classifiers.</p></div>","PeriodicalId":72089,"journal":{"name":"Advances in computational intelligence","volume":"2 4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s43674-022-00040-0.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50487084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Resolvent and new activation functions for linear programming kernel sparse learning
Pub Date: 2022-06-29 | DOI: 10.1007/s43674-022-00038-8
Zhao Lu, Haoda Fu, William R. Prucka
The resolvent operator and the corresponding Green’s function occupy a central position in the realms of differential and integral equations, operator theory, and, in particular, modern physics. In machine learning, however, when confronted with complex and highly challenging real-world learning tasks, the power of the Green’s function of the resolvent is rarely explored or exploited. This paper aims to go beyond the conventional translation-invariant and rotation-invariant kernels through a theoretical investigation of a new way of constructing kernel functions by means of the resolvent operator and its Green’s function. From a practical perspective, the newly developed kernel functions are applied to robust signal recovery from noise-corrupted data in the scenario of linear programming support vector learning. In particular, monotonic and non-monotonic activation functions are used for kernel design to improve the representation capability. In this manner, a new dimension is given to kernel-based robust sparse learning in two respects: first, a new theoretical framework that bridges the gap between the mathematical subtleties of resolvent operator and Green’s function theory and kernel construction; second, a concretization of the fusion between activation function design in neural networks and nonlinear kernel design. Finally, an experimental study demonstrates the potential and superiority of the newly developed kernel functions in robust signal recovery and multiscale sparse modeling, as a step towards removing the apparent boundaries between modern signal processing and computational intelligence.
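As a loose illustration of designing kernels from activation functions (not the resolvent-based construction itself), the sketch below plugs a hyperbolic-secant kernel into a standard SVM via scikit-learn's callable-kernel interface; the kernel form, the toy data, and the use of SVC in place of a linear-programming formulation are assumptions for the example.

```python
import numpy as np
from sklearn.svm import SVC

def sech_kernel(X, Y, gamma=1.0):
    """An activation-inspired kernel, k(x, y) = sech(gamma * ||x - y||).

    This is one plausible illustration of building kernels from activation
    functions; it is not guaranteed to be strictly positive definite and is
    not the paper's resolvent/Green's-function construction.
    """
    # Pairwise Euclidean distances between rows of X and rows of Y.
    d = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    return 1.0 / np.cosh(gamma * d)

# Toy two-class data corrupted by noise.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 0.6, (50, 2)), rng.normal(1, 0.6, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# scikit-learn accepts a callable kernel; an L1-regularized linear-programming
# support vector formulation, as used in the paper, would need its own solver.
clf = SVC(kernel=sech_kernel).fit(X, y)
print("training accuracy:", clf.score(X, y))
```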
{"title":"Resolvent and new activation functions for linear programming kernel sparse learning","authors":"Zhao Lu, Haoda Fu, William R. Prucka","doi":"10.1007/s43674-022-00038-8","DOIUrl":"10.1007/s43674-022-00038-8","url":null,"abstract":"<div><p>The resolvent operator and the corresponding Green’s function occupy a central position in the realms of differential and integral equations, operator theory, and in particular the modern physics. However, in the field of machine learning, when confronted with the complex and highly challenging learning tasks from the real world, the prowess of Green’s function of resolvent is rarely explored and exploited. This paper aims at innovating the conventional translation-invariant kernels and rotation-invariant kernels, through theoretical investigation into a new view of constructing kernel functions by means of the resolvent operator and its Green’s function. From the practical perspective, the newly developed kernel functions are applied for robust signal recovery from noise corrupted data in the scenario of linear programming support vector learning. In particular, the monotonic and non-monotonic activation functions are used for kernel design to improve the representation capability. In this manner, a new dimension is given for kernel-based robust sparse learning from the following two aspects: firstly, a new theoretical framework by bridging the gap between the mathematical subtleties of resolvent operator and Green’s function theory and kernel construction; secondly, a concretization for the fusion between activation functions design in neural networks and nonlinear kernels design. Finally, the experimental study demonstrates the potential and superiority of the newly developed kernel functions in robust signal recovery and multiscale sparse modeling, as one step towards removing the apparent boundaries between the realms of modern signal processing and computational intelligence.</p></div>","PeriodicalId":72089,"journal":{"name":"Advances in computational intelligence","volume":"2 4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50523930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-agent-based dynamic railway scheduling and optimization: a coloured petri-net model
Pub Date: 2022-06-16 | DOI: 10.1007/s43674-022-00039-7
Poulami Dalapati, Kaushik Paul
This paper addresses the rescheduling of a static timetable in case of a disaster encountered in a large and complex railway network system. The proposed approach modifies the existing schedule to minimise the overall delay of trains. This is achieved by representing the rescheduling problem as a Petri net, while the highly uncertain disaster recovery time in this model is handled as a Markov decision process (MDP). To solve the rescheduling problem, a distributed constraint optimisation (DCOP)-based strategy involving autonomous agents is used to generate the desired schedule. The proposed approach is evaluated on a real-time data set from Eastern Railways, India, with various disaster scenarios constructed using the Java Agent DEvelopment Framework (JADE). Compared to existing approaches, the proposed framework substantially reduces the delay of trains after rescheduling.
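As a rough illustration of treating the uncertain recovery time as an MDP, the sketch below runs value iteration on a toy hold-or-reroute decision problem; the states, costs, recovery probability, and discount are invented for the example and are not the paper's coloured Petri-net/DCOP model.

```python
import numpy as np

# Toy MDP: each period, decide whether to hold a train or reroute it while a
# disrupted section recovers. States 0..N are the remaining recovery periods
# (0 = track restored). All numbers below are illustrative assumptions.
N = 5
hold_cost, reroute_cost = 1.0, 3.0       # delay incurred per decision
p_recover = 0.4                          # chance the section recovers this period
gamma = 0.95                             # discount factor

V = np.zeros(N + 1)                      # V[0] = 0: no further delay once restored
for _ in range(200):                     # value iteration until (near) convergence
    V_new = V.copy()
    for s in range(1, N + 1):
        # Holding: pay the waiting delay; recovery may or may not progress.
        q_hold = hold_cost + gamma * (p_recover * V[s - 1] + (1 - p_recover) * V[s])
        # Rerouting: pay a fixed detour delay and leave the disrupted section.
        q_reroute = reroute_cost
        V_new[s] = min(q_hold, q_reroute)
    V = V_new

policy = ["recovered"]
for s in range(1, N + 1):
    q_hold = hold_cost + gamma * (p_recover * V[s - 1] + (1 - p_recover) * V[s])
    policy.append("hold" if q_hold <= reroute_cost else "reroute")

print("expected delay by state:", np.round(V, 2))
print("policy by remaining recovery periods:", policy)
```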
{"title":"Multi-agent-based dynamic railway scheduling and optimization: a coloured petri-net model","authors":"Poulami Dalapati, Kaushik Paul","doi":"10.1007/s43674-022-00039-7","DOIUrl":"10.1007/s43674-022-00039-7","url":null,"abstract":"<div><p>This paper addresses the issues concerning the rescheduling of a static timetable in case of a disaster, encountered in a large and complex railway network system. The proposed approach tries to modify the existing schedule to minimise the overall delay of trains. This is achieved by representing the rescheduling problem in the form of a Petri-Net and the highly uncertain disaster recovery time in such a model is handled as Markov decision processes (MDP). For solving the rescheduling problem, a distributed constraint optimisation (DCOP)-based strategy involving the use of autonomous agents is used to generate the desired schedule. The proposed approach is evaluated on the real-time data set taken from the Eastern Railways, India by constructing various disaster scenarios using the Java Agent DEvelopment Framework (JADE). The proposed framework, when compared to the existing approaches, substantially reduces the delay of trains after rescheduling.</p></div>","PeriodicalId":72089,"journal":{"name":"Advances in computational intelligence","volume":"2 4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s43674-022-00039-7.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50486572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}