COVID-19 has become a worldwide pandemic and has significantly affected the world economy. The importance of early detection and treatment of the infection cannot be overstated, yet traditional diagnosis techniques are time-consuming. Although numerous deep learning-based automated solutions have recently been developed in this regard, the limited computational and battery power of resource-constrained devices makes it difficult to deploy trained models for real-time inference. In this paper, to detect the presence of COVID-19 in CT-scan images, an important weights-only transfer learning method is proposed for devices with limited run-time resources. In the proposed method, pre-trained models are made friendly to point-of-care devices by pruning the less important weight parameters of the model. Experiments were performed on two popular models, VGG16 and ResNet34, and the empirical results showed that the pruned ResNet34 model achieved 95.47% accuracy, 0.9216 sensitivity, 0.9567 F-score, and 0.9942 specificity with 41.96% fewer FLOPs and 20.64% fewer weight parameters on the SARS-CoV-2 CT-scan dataset. The results of our experiments showed that the proposed method significantly reduces the run-time resource requirements of computationally intensive models and makes them ready to be utilized on point-of-care devices.
Tejalal Choudhary, Shubham Gujar, Anurag Goswami, Vipul Mishra, Tapas Badal. Deep learning-based important weights-only transfer learning approach for COVID-19 CT-scan classification. Applied Intelligence (Dordrecht, Netherlands) 53(6): 7201-7215, 2023. DOI: 10.1007/s10489-022-03893-7. PMCID: PMC9289654.
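The abstract describes pruning the less important weight parameters of a pre-trained model, but does not spell out the importance criterion. The following is only a minimal sketch under the common assumption that weight magnitude serves as the importance score; `prune_by_magnitude` and its interface are illustrative, not the paper's implementation:

```python
import numpy as np

def prune_by_magnitude(weights, fraction):
    """Zero out roughly the given fraction of weights with smallest magnitude.

    Magnitude stands in for the importance score: weights closest to zero
    are treated as least important and set to 0, which is what reduces the
    effective parameter count (and, in structured variants, FLOPs).
    Ties at the threshold may remove slightly more than `fraction`.
    """
    flat = np.abs(weights).ravel()
    k = int(fraction * flat.size)
    if k == 0:
        return weights.copy()
    # the k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned
```

In practice the surviving weights would then be fine-tuned on the target dataset, which is the transfer learning step the title refers to.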
Pub Date: 2023-01-01. Epub Date: 2022-12-05. DOI: 10.1007/s10489-022-04294-6
Nima Pourkhodabakhsh, Mobina Mousapour Mamoudan, Ali Bozorgi-Amiri
Employee turnover is one of the most important issues in human resource management; it depends on a combination of soft and hard factors, which makes it difficult for managers to make decisions. To support better decisions, this article identifies the factors affecting employee turnover using feature selection approaches such as the Recursive Feature Elimination algorithm and Mutual Information, and meta-heuristic algorithms such as the Gray Wolf Optimizer and the Genetic Algorithm. Multi-Criteria Decision-Making techniques are a further approach used in this article to identify the factors affecting employee turnover; our expert used the Best-Worst Method to evaluate each of these variables. To assess the performance of each of the above methods and to identify the most significant factors in employee turnover, the selected factors are fed into several machine learning algorithms and their accuracy in predicting turnover is measured. These three methods were applied to the human resources dataset of a company, and the results show that the factors identified by the Mutual Information algorithm yield the best turnover predictions. The results also confirm that managers need a support tool for these decisions, because the likelihood of mistakes in their decisions is high. This approach can serve as a decision support tool, helping managers and organizations gain correct insight into the departure of their employees and adopt policies to retain and make the best use of them.
Effective machine learning, Meta-heuristic algorithms and multi-criteria decision making to minimizing human resource turnover. Applied Intelligence (Dordrecht, Netherlands) 53(12): 16309-16331, 2023. DOI: 10.1007/s10489-022-04294-6. PMCID: PMC9734781.
Pub Date: 2023-01-01. Epub Date: 2022-12-01. DOI: 10.1007/s10489-022-04283-9
Angelo Gaeta, Vincenzo Loia, Luigi Lomasto, Francesco Orciuoli
The paper presents and evaluates an approach based on Rough Set Theory, together with some variants and extensions of this theory, to analyze phenomena related to Information Disorder. The main concepts and constructs of Rough Set Theory, such as the lower and upper approximations of a target set and indiscernibility and neighborhood binary relations, are used to model and reason on groups of social media users and the sets of information that circulate in social media. Information-theoretic measures, such as roughness and entropy, are used to evaluate two concepts, Complexity and Milestone, borrowed from systems theory and contextualized for Information Disorder. The novelty of the results presented in this paper lies in the adoption of Rough Set Theory constructs and operators in this new and unexplored field of investigation and, specifically, in modeling key elements of Information Disorder, such as the message and the interpreters, and reasoning on the evolutionary dynamics of these elements. The added value of using these measures is an increased ability to interpret the effects of Information Disorder due to the circulation of news, expressed as the ratio between the cardinalities of the lower and upper approximations of a rough set, cardinality variations of parts, and increases in their fragmentation or cohesion. Such improved interpretative ability can benefit social media analysts and providers. Four algorithms based on Rough Set Theory and some of its variants or extensions are used to evaluate the results in a case study built with real data used to counter disinformation about COVID-19. The results demonstrate the superiority of the approaches based on Fuzzy Rough Sets for interpreting this phenomenon.
A novel approach based on rough set theory for analyzing information disorder. Applied Intelligence (Dordrecht, Netherlands) 53(12): 15993-16014, 2023. DOI: 10.1007/s10489-022-04283-9. PMCID: PMC9713159.
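The lower and upper approximations and the roughness measure mentioned above have a compact set-theoretic definition. A minimal sketch (function names are illustrative): given a partition of users or messages into indiscernibility classes, the lower approximation collects the classes fully contained in the target set, the upper approximation those that merely intersect it, and roughness measures how large the boundary between the two is:

```python
def approximations(partition, target):
    """Lower/upper approximation of `target` w.r.t. an indiscernibility partition."""
    lower, upper = set(), set()
    for block in partition:
        if block <= target:   # block entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return lower, upper

def roughness(lower, upper):
    """1 - |lower| / |upper|: 0 for a crisp (exactly definable) set,
    approaching 1 as the boundary region grows."""
    return 1.0 - len(lower) / len(upper)
```

Tracking how such ratios evolve as news circulates is the kind of interpretation of fragmentation and cohesion the paper builds on.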
Pub Date: 2023-01-01. DOI: 10.1007/s10489-022-03796-7
Essam H Houssein, Mohamed H Hassan, Mohamed A Mahdy, Salah Kamel
This paper proposes an enhanced version of the Equilibrium Optimizer (EO), called EEO, for solving global optimization and optimal power flow (OPF) problems. The proposed EEO algorithm adds a performance reinforcement strategy based on the Lévy Flight mechanism. It addresses the shortcomings of the original EO and aims to provide better solutions than EO to global optimization problems, especially OPF problems. The efficiency of EEO was confirmed by comparing its results on the ten functions of the CEC'20 test suite with those of other algorithms, including the high-performance algorithms CMA-ES, IMODE, AGSK and LSHADE_cnEpSin. Moreover, the statistical significance of these results was validated by Wilcoxon's rank-sum test. The proposed EEO was then applied to the OPF problem, which is formulated as a nonlinear optimization problem with conflicting objectives, subject to both equality and inequality constraints. The performance of this technique is evaluated on the standard IEEE 30-bus test system for different objectives. The results of the proposed EEO algorithm are compared with those of the original EO algorithm and of other techniques reported in the literature. These simulation results reveal that the proposed algorithm provides better optimized solutions than 20 published methods as well as the original EO algorithm. EEO's superiority was demonstrated in six cases involving the minimization of different objectives: fuel cost, fuel cost with the valve-point loading effect, emission, total active power losses, voltage deviation, and voltage instability. The comparison results also indicate that the EEO algorithm can provide robust, high-quality feasible solutions for different OPF problems.
Development and application of equilibrium optimizer for optimal power flow calculation of power system. Applied Intelligence (Dordrecht, Netherlands) 53(6): 7232-7253, 2023. DOI: 10.1007/s10489-022-03796-7. PMCID: PMC9289660.
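The Lévy Flight mechanism used to reinforce EO is typically implemented with Mantegna's algorithm, which builds heavy-tailed step lengths from two Gaussian draws. A sketch under that assumption (the paper may tune the stability parameter β or the scaling differently):

```python
import math
import random

def levy_step(beta=1.5, rng=random):
    """One Lévy-flight step length via Mantegna's algorithm.

    Most steps are small (local exploitation) but occasional very large
    jumps occur (global exploration), which is what helps an optimizer
    such as EEO escape local optima.
    """
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1)
    return u / abs(v) ** (1 / beta)
```

A candidate solution would then be perturbed as `x_new = x + step_size * levy_step() * (x - x_best)`, following the usual Lévy-flight update pattern.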
Pub Date: 2023-01-01. DOI: 10.1007/s10489-022-03945-y
Zhi-Fen He, Chun-Hua Zhang, Bin Liu, Bo Li
Multi-view multi-label learning (MVML) is an important paradigm in machine learning in which each instance is represented by several heterogeneous views and associated with a set of class labels. However, label incompleteness, and ignoring both the relationships among views and the correlations among labels, degrade the performance of MVML algorithms. Accordingly, a novel method, label recovery and label correlation co-learning for Multi-View Multi-Label classification with incoMplete Labels (MV2ML), is proposed in this paper. First, a kernel-based, label-correlation-guided binary classifier is constructed for each label. Then, we adopt a multi-kernel fusion method to effectively fuse the multi-view data, utilizing the individual and complementary information among the views and distinguishing the different contribution of each view. Finally, we propose a collaborative learning strategy that simultaneously considers the exploitation of asymmetric label correlations, the fusion of multi-view data, the recovery of the incomplete label matrix and the construction of the classification model. In this way, the recovery of the incomplete label matrix and the learning of label correlations interact with and boost each other to guide the training of the classifiers. Extensive experimental results demonstrate that MV2ML achieves highly competitive classification performance against state-of-the-art approaches on various real-world multi-view multi-label datasets in terms of six evaluation criteria.
Label recovery and label correlation co-learning for multi-view multi-label classification with incomplete labels. Applied Intelligence (Dordrecht, Netherlands) 53(8): 9444-9462, 2023. DOI: 10.1007/s10489-022-03945-y. PMCID: PMC9360669.
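The multi-kernel fusion step can be illustrated independently of the rest of MV2ML. A hedged sketch (the paper's actual view weights are learned, not fixed as here): each view contributes a kernel matrix, and the fused kernel is their weighted combination, so views judged more informative contribute more:

```python
import numpy as np

def fuse_kernels(kernels, weights):
    """Convex combination of per-view kernel matrices: K = sum_v w_v * K_v.

    A weighted sum of positive semi-definite kernels is again a valid
    kernel, so the fused matrix can be fed to any kernel-based classifier.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize view contributions
    return sum(w * K for w, K in zip(weights, kernels))
```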
This paper presents TextConvoNet, a novel Convolutional Neural Network (CNN) based architecture for binary and multi-class text classification problems. Most existing CNN-based models use one-dimensional convolutional filters, where each filter specializes in extracting n-gram features from a particular input word-embedding (sentence) matrix. These features can be termed intra-sentence n-gram features, and to the best of our knowledge, all existing CNN models for text classification are based on this concept. The presented TextConvoNet not only extracts the intra-sentence n-gram features but also captures the inter-sentence n-gram features in the input text data. It uses an alternative input-matrix representation and applies a two-dimensional multi-scale convolutional operation to the input. We perform an experimental study on five binary and multi-class classification datasets and evaluate the performance of TextConvoNet for text classification. The results are evaluated using eight performance measures: accuracy, precision, recall, F1-score, specificity, gmean1, gmean2, and the Matthews correlation coefficient (MCC). Furthermore, we extensively compare TextConvoNet with machine learning, deep learning, and attention-based models. The experimental results show that TextConvoNet outperforms the other models used for text classification.
Sanskar Soni, Satyendra Singh Chouhan, Santosh Singh Rathore. TextConvoNet: a convolutional neural network based architecture for text classification. Applied Intelligence (Dordrecht, Netherlands) 53(11): 14249-14268, 2023. DOI: 10.1007/s10489-022-04221-9. PMCID: PMC9589611.
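The key architectural idea, a two-dimensional convolution over a matrix whose rows correspond to sentences, can be sketched with a plain "valid" cross-correlation. This illustrates the principle only (TextConvoNet's real input rows hold embedding vectors and its filters are multi-scale): a kernel taller than one row mixes information across sentence rows, which is what yields inter-sentence n-gram features in addition to intra-sentence ones.

```python
import numpy as np

def conv2d_valid(doc, kernel):
    """'Valid' 2-D cross-correlation over a document matrix.

    Rows of `doc` represent sentences, columns word positions. A 1-row
    kernel sees only intra-sentence patterns; a taller kernel spans
    several sentences at once and captures inter-sentence patterns.
    """
    H, W = doc.shape
    kh, kw = kernel.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(doc[i:i + kh, j:j + kw] * kernel)
    return out
```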
Recent decades have witnessed rapid development in the field of medical image segmentation. Deep learning-based fully convolutional neural networks have played a significant role in the development of automated medical image segmentation models. Though immensely effective, such networks only take into account localized features and are unable to capitalize on the global context of the medical image. In this paper, two deep learning-based models are proposed, USegTransformer-P and USegTransformer-S. The proposed models capitalize on both local and global features by amalgamating transformer-based encoders and convolution-based encoders to segment medical images with high precision. Both models deliver promising results, performing better than previous state-of-the-art models in various segmentation tasks such as brain tumor, lung nodule, skin lesion and nuclei segmentation. The authors believe that the ability of USegTransformer-P and USegTransformer-S to perform segmentation with high precision could remarkably benefit medical practitioners and radiologists around the world.
Tashvik Dhamija, Anunay Gupta, Shreyansh Gupta, Anjum, Rahul Katarya, Ghanshyam Singh. Semantic segmentation in medical images through transfused convolution and transformer networks. Applied Intelligence (Dordrecht, Netherlands) 53(1): 1132-1148, 2023. DOI: 10.1007/s10489-022-03642-w. PMCID: PMC9035506.
Pub Date: 2023-01-01. DOI: 10.1007/s10489-022-04077-z
Dong-Qin Xu, Ming-Ai Li
Domain adaptation, an important branch of transfer learning, can be applied to cope with data insufficiency and high inter-subject variability in motor imagery electroencephalogram (MI-EEG) based brain-computer interfaces. Existing methods generally focus on aligning data and feature distributions; however, aligning each source domain with the informative samples of the target domain, and seeking the most appropriate source domains to enhance classification, has not been considered. In this paper, we propose a dual alignment-based multi-source domain adaptation framework, denoted DAMSDAF. Based on the continuous wavelet transform, all channels of the MI-EEG signals are converted and the generated time-frequency spectrum images are stitched together to construct the multi-source domains and the target domain. Then, the informative samples close to the decision boundary are found in the target domain by using entropy, and they are employed to align and reweight each source domain with normalized mutual information. Furthermore, a multi-branch deep network (MBDN) is designed, and the maximum mean discrepancy is embedded in each branch to realign the branch-specific feature distribution. Each branch is trained separately on an aligned source domain, and the single-branch transfer accuracies are arranged in descending order and used for the weighted prediction of the MBDN; the most suitable number of source domains, with the top weights, can thereby be determined automatically. Extensive experiments are conducted on three public MI-EEG datasets. DAMSDAF achieves classification accuracies of 92.56%, 69.45% and 89.57%, and statistical analysis is performed with the kappa value and the t-test. Experimental results show that DAMSDAF significantly improves transfer effects compared to existing methods, indicating that dual alignment can fully exploit differently weighted samples, and even source domains, at different levels, as well as realizing an optimal selection of multi-source domains.
{"title":"A dual alignment-based multi-source domain adaptation framework for motor imagery EEG classification.","authors":"Dong-Qin Xu, Ming-Ai Li","doi":"10.1007/s10489-022-04077-z","DOIUrl":"https://doi.org/10.1007/s10489-022-04077-z","url":null,"abstract":"<p><p>Domain adaptation, as an important branch of transfer learning, can be applied to cope with data insufficiency and high subject variabilities in motor imagery electroencephalogram (MI-EEG) based brain-computer interfaces. The existing methods generally focus on aligning data and feature distribution; however, aligning each source domain with the informative samples of the target domain and seeking the most appropriate source domains to enhance the classification effect has not been considered. In this paper, we propose a dual alignment-based multi-source domain adaptation framework, denoted DAMSDAF. Based on continuous wavelet transform, all channels of MI-EEG signals are converted respectively and the generated time-frequency spectrum images are stitched to construct multi-source domains and target domain. Then, the informative samples close to the decision boundary are found in the target domain by using entropy, and they are employed to align and reassign each source domain with normalized mutual information. Furthermore, a multi-branch deep network (MBDN) is designed, and the maximum mean discrepancy is embedded in each branch to realign the specific feature distribution. Each branch is separately trained by an aligned source domain, and all the single branch transfer accuracies are arranged in descending order and utilized for weighted prediction of MBDN. Therefore, the most suitable number of source domains with top weights can be automatically determined. Extensive experiments are conducted based on 3 public MI-EEG datasets. DAMSDAF achieves the classification accuracies of 92.56%, 69.45% and 89.57%, and the statistical analysis is performed by the kappa value and <i>t-</i>test. 
Experimental results show that DAMSDAF significantly improves the transfer effects compared to the present methods, indicating that dual alignment can sufficiently use the different weighted samples and even source domains at different levels as well as realizing optimal selection of multi-source domains.</p>","PeriodicalId":72260,"journal":{"name":"Applied intelligence (Dordrecht, Netherlands)","volume":"53 9","pages":"10766-10788"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9402410/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9500093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
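Two of the mechanisms the DAMSDAF abstract describes, selecting informative target samples near the decision boundary by entropy and aligning feature distributions with the maximum mean discrepancy, can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the function names, the selection ratio, and the RBF kernel choice are assumptions.

```python
import numpy as np

def prediction_entropy(probs):
    # Shannon entropy of each row of class probabilities;
    # high entropy means the sample lies near the decision boundary
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def select_informative(probs, ratio=0.5):
    # keep the top `ratio` fraction of samples by entropy
    ent = prediction_entropy(probs)
    k = max(1, int(len(ent) * ratio))
    return np.argsort(ent)[-k:]

def mmd_rbf(x, y, gamma=1.0):
    # squared maximum mean discrepancy between feature sets x and y
    # under an RBF kernel; zero when the two distributions coincide
    def kernel(a, b):
        d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d)
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()
```

In the paper's framework, a term like `mmd_rbf` would be embedded in each branch of the MBDN as a loss, while `select_informative` would pick the target samples used to align and reweight each source domain.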
Pub Date: 2023-01-01 | Epub Date: 2022-11-29 | DOI: 10.1007/s10489-022-04288-4
Thai-Vu Nguyen, Anh Nguyen, Nghia Le, Bac Le
Domain adaptation is a promising way to train a powerful deep neural network across different datasets. More precisely, domain adaptation methods train a model on one dataset and test that model on a completely separate dataset. Adversarial-based adaptation has become one of the most popular families of domain adaptation methods. Drawing on the idea of GANs, adversarial-based domain adaptation tries to minimize the discrepancy between the training and testing distributions through an adversarial learning process. We observe that a semi-supervised learning approach can be combined with the adversarial-based method to solve the domain adaptation problem. In this paper, we propose an improved adversarial domain adaptation method called Semi-Supervised Adversarial Discriminative Domain Adaptation (SADDA), which outperforms prior domain adaptation methods. We also show that SADDA has a wide range of applications and illustrate the promise of our method on image classification and sentiment classification problems.
{"title":"Semi-supervised adversarial discriminative domain adaptation.","authors":"Thai-Vu Nguyen, Anh Nguyen, Nghia Le, Bac Le","doi":"10.1007/s10489-022-04288-4","DOIUrl":"10.1007/s10489-022-04288-4","url":null,"abstract":"<p><p>Domain adaptation is a potential method to train a powerful deep neural network across various datasets. More precisely, domain adaptation methods train the model on training data and test that model on a completely separate dataset. The adversarial-based adaptation method became popular among other domain adaptation methods. Relying on the idea of GAN, the adversarial-based domain adaptation tries to minimize the distribution between the training and testing dataset based on the adversarial learning process. We observe that the semi-supervised learning approach can combine with the adversarial-based method to solve the domain adaptation problem. In this paper, we propose an improved adversarial domain adaptation method called Semi-Supervised Adversarial Discriminative Domain Adaptation (SADDA), which can outperform other prior domain adaptation methods. We also show that SADDA has a wide range of applications and illustrate the promise of our method for image classification and sentiment classification problems.</p>","PeriodicalId":72260,"journal":{"name":"Applied intelligence (Dordrecht, Netherlands)","volume":"53 12","pages":"15909-15922"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9707164/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9570425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
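The adversarial discriminative scheme that SADDA builds on trains a discriminator to tell source features from target features, while the target encoder is trained with inverted labels to fool it. A minimal sketch of the two objectives, assuming standard binary cross-entropy losses as in ADDA-style methods (function names are illustrative, not from the paper):

```python
import numpy as np

def bce(pred, label):
    # binary cross-entropy, averaged over the batch
    p = np.clip(pred, 1e-7, 1 - 1e-7)
    return -(label * np.log(p) + (1 - label) * np.log(1 - p)).mean()

def discriminator_loss(d_src, d_tgt):
    # the discriminator labels source features 1 and target features 0
    return bce(d_src, 1.0) + bce(d_tgt, 0.0)

def target_encoder_loss(d_tgt):
    # the target encoder is trained with inverted labels,
    # pushing the discriminator to output 1 on target features
    return bce(d_tgt, 1.0)
```

Alternating minimization of these two losses drives the target feature distribution toward the source one, which is the adversarial core that SADDA augments with a semi-supervised objective.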
Pub Date: 2023-01-01 | Epub Date: 2022-11-01 | DOI: 10.1007/s10489-022-04215-7
Qingbo Hao, Chundong Wang, Yingyuan Xiao, Hao Lin
In the application recommendation field, collaborative filtering (CF) is often considered one of the most effective methods. As the basis of CF-based recommendation, representation learning needs to capture two types of factors: attribute factors revealed by independent individuals (e.g., user attributes, application types) and interaction factors contained in collaborative signals (e.g., interactions influenced by others). However, existing CF-based methods fail to learn these two factors separately; it is therefore difficult to understand the deeper motivations behind user behaviors, resulting in suboptimal performance. From this point of view, we propose a multi-granularity coupled graph neural network recommendation method based on implicit relationships (IMGC-GNN). Specifically, we introduce contextual information (time and space) into user-application interactions and construct a three-layer coupled graph. A graph neural network is then used to learn the attribute and interaction factors separately. For attribute representation learning, we decompose the coupled graph into three homogeneous graphs with users, applications, and contexts as nodes, and use multilayer aggregation operations to learn features between users, between contexts, and between applications. For interaction representation learning, we construct a homogeneous graph with user-context-application interactions as nodes, and use node similarity and structural similarity to learn deep interaction features. Finally, according to the learned representations, IMGC-GNN makes accurate application recommendations to users in different contexts. To verify the validity of the proposed method, we conduct experiments on real-world interaction data from three cities and compare our model with seven baseline methods. The experimental results show that our method achieves the best performance in top-k recommendation.
{"title":"IMGC-GNN: A multi-granularity coupled graph neural network recommendation method based on implicit relationships.","authors":"Qingbo Hao, Chundong Wang, Yingyuan Xiao, Hao Lin","doi":"10.1007/s10489-022-04215-7","DOIUrl":"10.1007/s10489-022-04215-7","url":null,"abstract":"<p><p>In the application recommendation field, collaborative filtering (CF) method is often considered to be one of the most effective methods. As the basis of CF-based recommendation methods, representation learning needs to learn two types of factors: attribute factors revealed by independent individuals (e.g., user attributes, application types) and interaction factors contained in collaborative signals (e.g., interactions influenced by others). However, existing CF-based methods fail to learn these two factors separately; therefore, it is difficult to understand the deeper motivation behind user behaviors, resulting in suboptimal performance. From this point of view, we propose a multi-granularity coupled graph neural network recommendation method based on implicit relationships (IMGC-GNN). Specifically, we introduce contextual information (time and space) into user-application interactions and construct a three-layer coupled graph. Then, the graph neural network approach is used to learn the attribute and interaction factors separately. For attribute representation learning, we decompose the coupled graph into three homogeneous graphs with users, applications, and contexts as nodes. Next, we use multilayer aggregation operations to learn features between users, between contexts, and between applications. For interaction representation learning, we construct a homogeneous graph with user-context-application interactions as nodes. Next, we use node similarity and structural similarity to learn the deep interaction features. Finally, according to the learned representations, IMGC-GNN makes accurate application recommendations to users in different contexts. 
To verify the validity of the proposed method, we conduct experiments on real-world interaction data from three cities and compare our model with seven baseline methods. The experimental results show that our method has the best performance in the top-<i>k</i> recommendation.</p>","PeriodicalId":72260,"journal":{"name":"Applied intelligence (Dordrecht, Netherlands)","volume":"53 11","pages":"14668-14689"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9628402/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9618837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
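The multilayer aggregation that IMGC-GNN applies on each homogeneous graph can be sketched as repeated mean-aggregation hops over an adjacency matrix. This is a generic NumPy sketch under the assumption of mean pooling with self-loops; the paper's actual aggregator and weighting may differ.

```python
import numpy as np

def aggregate_layer(features, adj):
    # one GNN hop: mean-aggregate neighbour features, self-loop included
    a = adj + np.eye(adj.shape[0])
    deg = a.sum(axis=1, keepdims=True)
    return (a @ features) / deg

def multi_hop(features, adj, hops=2):
    # stacking hops widens each node's receptive field, which is what
    # "multilayer aggregation operations" refer to in the abstract
    h = features
    for _ in range(hops):
        h = aggregate_layer(h, adj)
    return h
```

Running such layers separately on the user, application, and context graphs would yield the attribute representations that IMGC-GNN later combines with the interaction representations for recommendation.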