Pub Date: 2023-05-31 | DOI: 10.1007/s00521-023-08644-4
Soumia Goumiri, Dalila Benboudjema, Wojciech Pieczynski
Convolutional neural networks (CNNs) have lately proven extremely effective in image recognition. Besides CNNs, hidden Markov chains (HMCs) are probabilistic models widely used in image processing. This paper presents a new hybrid model composed of both CNNs and HMCs: the CNN is used for feature extraction and dimensionality reduction, and the HMC for classification. In the new model, named CNN-HMC, the convolutional and pooling layers of the CNN are applied to extract feature maps, and a Peano scan is then applied to obtain several HMCs. The Expectation-Maximization (EM) algorithm is used to estimate the HMC parameters, which makes the Bayesian Maximum Posterior Mode (MPM) classification method unsupervised. The objective is to enhance the performance of CNN models on the image classification task. To evaluate our proposal, it is compared to six models in two series of experiments. In the first series, we consider two CNN-HMC models and compare them to two CNNs, 4Conv and Mini AlexNet, respectively. The results show that the CNN-HMC model outperforms the classical CNN model and significantly improves the accuracy of Mini AlexNet. In the second series, it is compared to four models, CNN-SVMs, CNN-LSTMs, CNN-RFs, and CNN-gcForests, which differ from CNN-HMC only in the second classification step. Based on five datasets and four metrics (recall, precision, F1-score, and accuracy), the results of these comparisons again show the interest of the proposed CNN-HMC. In particular, with a CNN model of 71% accuracy, CNN-HMC gives an accuracy ranging between 81.63% and 92.5%.
Title: A new hybrid model of convolutional neural networks and hidden Markov chains for image classification. (Neural Computing & Applications, pp. 1-16; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10230497/pdf/)
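As a rough illustration of the scan step in the abstract above, the sketch below linearizes a 2-D feature map into a 1-D sequence with a simple boustrophedon (snake) scan, so that consecutive samples remain spatially adjacent and can be modeled as a hidden Markov chain. This is a simplified stand-in for the Peano scan the paper actually uses, and the EM/MPM steps are not shown.

```python
def snake_scan(feature_map):
    """Linearize a 2-D feature map into a 1-D sequence by scanning rows
    alternately left-to-right and right-to-left, so that consecutive
    samples in the sequence stay spatially adjacent in the map -- the
    property an HMC needs to exploit spatial correlation."""
    sequence = []
    for i, row in enumerate(feature_map):
        sequence.extend(row if i % 2 == 0 else row[::-1])
    return sequence
```

For example, `snake_scan([[1, 2, 3], [4, 5, 6]])` yields `[1, 2, 3, 6, 5, 4]`: the jump from 3 to 6 crosses one pixel vertically instead of wrapping across the whole row.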
Pub Date: 2023-05-31 | DOI: 10.1007/s00521-023-08662-2
Panagiotis C Theocharopoulos, Anastasia Tsoukala, Spiros V Georgakopoulos, Sotiris K Tasoulis, Vassilis P Plagianakos
The Covid-19 pandemic had a significant impact on society, including the widespread implementation of lockdowns to prevent the spread of the virus. This measure led to a decrease in face-to-face social interactions and, correspondingly, an increase in the use of social media platforms such as Twitter. As part of Industry 4.0, sentiment analysis can be exploited to study public attitudes toward future pandemics and sociopolitical situations in general. This work presents an analysis framework that applies a combination of natural language processing techniques and machine learning algorithms to classify the sentiment of each tweet as positive or negative. Through extensive experimentation, we identify the best-performing model for this task and subsequently use its sentiment predictions to perform time series analysis over the course of the pandemic. In addition, a change point detection algorithm was applied to identify turning points in public attitudes toward the pandemic, which were validated by cross-referencing news reports from those periods. Finally, we study the relationship between sentiment trends on social media and news coverage of the pandemic, providing insights into the public's perception of the pandemic and its influence on the news.
Title: Analysing sentiment change detection of Covid-19 tweets. (Neural Computing & Applications, pp. 1-11; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10230484/pdf/)
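The change-point step described in this abstract can be sketched with a basic CUSUM statistic over a sentiment time series. This is a generic, single-change-point illustration, not the specific detection algorithm the authors used.

```python
def cusum_change_point(series):
    """Return the index at which the second segment starts, i.e. the most
    likely single mean-shift point, found by maximizing the absolute
    cumulative sum of deviations from the global mean (CUSUM statistic)."""
    mean = sum(series) / len(series)
    running, best_k, best_val = 0.0, 0, 0.0
    for k, x in enumerate(series):
        running += x - mean
        if abs(running) > best_val:
            best_val, best_k = abs(running), k
    return best_k + 1
```

Applied to a daily sentiment score that jumps from one level to another, the returned index marks the day public attitude shifted; multiple-change-point methods repeat this search recursively on each segment.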
In linear registration, a floating image is spatially aligned with a reference image by performing a series of linear metric transformations; linear registration is mainly considered a preprocessing step for nonrigid registration. To better accomplish the task of finding the optimal transformation in pairwise intensity-based medical image registration, we present an optimization algorithm called the normal vibration distribution search-based differential evolution algorithm (NVSA), modified from the Bernstein search-based differential evolution (BSD) algorithm. We redesign the search pattern of the BSD algorithm and introduce several control parameters as part of the fine-tuning process to reduce the difficulty of the algorithm. In this study, 23 classic optimization functions and 16 real-world patients (resulting in 41 multimodal registration scenarios) are used in experiments performed to statistically investigate the problem-solving ability of the NVSA, with nine metaheuristic algorithms used for comparison. Compared to commonly utilized registration methods such as ANTS, Elastix, and FSL, our method achieves better registration performance on the RIRE dataset. Moreover, we show that our method performs well with or without an initial spatial transformation in terms of different evaluation indicators, demonstrating its versatility and robustness for various clinical needs and applications. This study establishes the idea that metaheuristic-based methods can accomplish linear registration tasks better than the frequently used approaches, and the proposed method shows promise for solving real-world clinical problems encountered during nonrigid registration as a preprocessing approach. The source code of the NVSA is publicly available at https://github.com/PengGui-N/NVSA.
Title: Normal vibration distribution search-based differential evolution algorithm for multimodal biomedical image registration.
Authors: Peng Gui, Fazhi He, Bingo Wing-Kuen Ling, Dengyi Zhang, Zongyuan Ge
Pub Date: 2023-05-30 | DOI: 10.1007/s00521-023-08649-z (Neural Computing & Applications, pp. 1-23; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10227826/pdf/)
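For readers unfamiliar with the optimizer family underlying the NVSA, the following is a minimal classic DE/rand/1/bin loop on a toy objective. The NVSA itself redesigns the search pattern and control parameters in ways not reproduced here; this sketch only shows the baseline differential evolution scheme it departs from.

```python
import random

def differential_evolution(f, bounds, pop_size=20, F=0.8, CR=0.9,
                           iters=200, seed=0):
    """Classic DE/rand/1/bin minimizer: mutate with the scaled difference
    of two random individuals, binomially crossover with the parent, and
    keep the trial vector if it is no worse."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [f(x) for x in pop]
    for _ in range(iters):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            jr = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = [
                min(max(pop[a][j] + F * (pop[b][j] - pop[c][j]),
                        bounds[j][0]), bounds[j][1])
                if (rng.random() < CR or j == jr) else pop[i][j]
                for j in range(dim)
            ]
            ft = f(trial)
            if ft <= fitness[i]:  # greedy one-to-one selection
                pop[i], fitness[i] = trial, ft
    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best]
```

On a 3-D sphere function this converges to near zero within the default budget; in registration, `f` would instead score an intensity-based similarity metric between the transformed floating image and the reference.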
Pub Date: 2023-05-27 | DOI: 10.1007/s00521-023-08689-5
Simon Fong, Giancarlo Fortino, Dhanjoo Ghista, Francesco Piccialli
Title: Special issue on deep learning and big data analytics for medical e-diagnosis/AI-based e-diagnosis. (Neural Computing & Applications, pp. 1-5; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10224755/pdf/)
Pub Date: 2023-05-27 | DOI: 10.1007/s00521-023-08683-x
Javad Mozaffari, Abdollah Amirkhani, Shahriar B Shokouhi
The spread of COVID-19 began in 2019, and so far more than 4 million people around the world have lost their lives to this deadly virus and its variants. In view of the high transmissibility of the coronavirus, which has turned this disease into a global pandemic, artificial intelligence can be employed as an effective tool for earlier detection and treatment of the illness. In this review paper, we evaluate the performance of deep learning models in processing X-ray and CT scan images of COVID-19 patients' lungs and describe the changes made to these models to enhance their detection accuracy. To this end, we introduce well-known deep learning models such as VGGNet, GoogleNet, and ResNet, and, after reviewing the research works in which these models have been used for the detection of COVID-19, we compare the performance of newer models such as DenseNet, CapsNet, MobileNet, and EfficientNet. We then present the deep learning techniques of GANs, transfer learning, and data augmentation and examine the statistics of their use. We also describe the datasets introduced since the onset of COVID-19, which contain lung images of COVID-19 patients, healthy individuals, and patients with other pulmonary diseases. Lastly, we elaborate on the existing challenges in the use of artificial intelligence for COVID-19 detection and the prospective trends of using this method in similar situations and conditions.
Supplementary information: The online version contains supplementary material available at 10.1007/s00521-023-08683-x.
Title: A survey on deep learning models for detection of COVID-19. (Neural Computing & Applications, pp. 1-29; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10224665/pdf/)
A high-quality domain-oriented dataset is crucial for the domain-specific named entity recognition (NER) task. In this study, we introduce a novel education-oriented Chinese NER dataset (EduNER). To provide representative and diverse training data, we collect data from multiple sources, including textbooks, academic papers, and education-related web pages; the collected documents span ten years (2012-2021). A team of domain experts is invited to define the education NER schema, and a group of trained annotators is hired to complete the annotation, supported by a collaborative labeling platform built to accelerate human annotation. The constructed EduNER dataset includes 16 entity types, 11k+ sentences, and 35,731 entities. We conduct a thorough statistical analysis of EduNER and summarize its distinctive characteristics by comparing it with eight open-domain or domain-specific NER datasets. Sixteen state-of-the-art models are further evaluated on the dataset for NER task validation, and the experimental results can inform further exploration. To the best of our knowledge, EduNER is the first publicly available dataset for the NER task in the education domain, which may promote the development of education-oriented NER models.
Title: EduNER: a Chinese named entity recognition dataset for education research.
Authors: Xu Li, Chengkun Wei, Zhuoren Jiang, Wenlong Meng, Fan Ouyang, Zihui Zhang, Wenzhi Chen
Pub Date: 2023-05-20 | DOI: 10.1007/s00521-023-08635-5 (Neural Computing & Applications, pp. 1-15; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10199663/pdf/)
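For readers unfamiliar with NER annotation, the sketch below shows how entity spans are typically recovered from BIO-tagged tokens, the standard encoding for datasets like this one. The entity types in the example are illustrative and not necessarily drawn from EduNER's actual 16-type schema.

```python
def bio_to_entities(tokens, tags):
    """Recover (entity_type, surface_text) spans from BIO tags:
    B-X starts an entity of type X, I-X continues it, O is outside."""
    entities, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append(current)
            current = (tag[2:], [token])
        elif tag.startswith("I-") and current and tag[2:] == current[0]:
            current[1].append(token)
        else:  # "O" tag or an I- tag that does not continue the open span
            if current:
                entities.append(current)
            current = None
    if current:
        entities.append(current)
    return [(etype, " ".join(words)) for etype, words in entities]
```

Entity-level precision, recall, and F1 (the usual NER metrics) are then computed by comparing the predicted span set against the gold span set.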
Pub Date: 2023-05-20 | DOI: 10.1007/s00521-023-08647-1
Qian Chen, Xuan Wang, Zoe Lin Jiang, Yulin Wu, Huale Li, Lei Cui, Xiaozhen Sun
Mechanism design theory can be applied not only in the economy but also in many other fields, such as politics and military affairs, which gives it important practical and strategic significance for countries in periods of system innovation and transformation. As Nobel Laureate Paul said, the complexity of the real economy makes it difficult for "Unorganized Markets" to ensure supply-demand balance and the efficient allocation of resources. When traditional economic theory cannot explain and compute the complex scenarios of reality, we require a high-performance computing solution, grounded in traditional theory, to evaluate mechanisms while achieving better social welfare; mechanism design theory is undoubtedly the best option. Unlike other existing works, which are based on theoretical exploration of optimal solutions or single-perspective analysis of scenarios, this paper focuses on more realistic and complex markets and seeks to identify the common difficulties in, and feasible solutions for, these applications. First, we review the history of traditional mechanism design and algorithmic mechanism design. Subsequently, we present the main challenges in designing actual data-driven market mechanisms, including the challenges inherent in mechanism design theory, the challenges brought by new markets, and the common challenges faced by both. In addition, we survey and discuss theoretical support and computer-aided methods in detail. This paper guides cross-disciplinary researchers who wish to explore the resource allocation problem in real markets for the first time and offers a different perspective for researchers struggling to solve complex social problems. Finally, we discuss and propose new ideas and look to the future.
Title: Breaking the traditional: a survey of algorithmic mechanism design applied to economic and complex environments. (Neural Computing & Applications, pp. 1-30; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10199671/pdf/)
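As a self-contained illustration of the kind of object mechanism design studies (a textbook example, not one taken from the survey), here is the classic Vickrey second-price sealed-bid auction, in which the highest bidder wins but pays the second-highest bid, making truthful bidding a dominant strategy:

```python
def second_price_auction(bids):
    """Vickrey auction: highest bidder wins, pays the second-highest bid.
    `bids` maps bidder name -> bid amount; returns (winner, price)."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else ranked[0][1]
    return winner, price
```

Because the winner's payment does not depend on her own bid, overbidding or underbidding can only lose her the item or cost her surplus, which is the incentive-compatibility property that mechanism designers try to preserve in more complex, data-driven markets.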
Pub Date: 2023-05-10 | DOI: 10.1007/s00521-023-08461-9
Menghui Zhou, Yu Zhang, Tong Liu, Yun Yang, Po Yang
In this paper, we propose a novel, efficient multi-task learning formulation for the class of progression problems, in which the state of a system changes continuously over time. To use the knowledge shared between multiple tasks to improve performance, existing multi-task learning methods mainly focus on feature selection or on optimizing the task relation structure. Feature selection methods usually fail to explore the complex relationships between tasks and thus have limited performance, while methods centring on optimizing the relation structure of tasks cannot select meaningful features and have a bi-convex objective function, which results in high computational complexity for the associated optimization algorithm. Unlike these methods, and motivated by the simple and direct idea that the state of a system at the current time point should be related to all previous time points, we first propose a novel relation structure, termed the adaptive global temporal relation structure (AGTS). We then integrate the widely used sparse group Lasso and fused Lasso with AGTS to propose a novel convex multi-task learning formulation that not only performs feature selection but also adaptively captures global temporal task relatedness. Because the objective function contains three non-smooth penalties, it is challenging to solve. We first design an optimization algorithm based on the alternating direction method of multipliers (ADMM). Considering that the worst-case convergence rate of ADMM is only sub-linear, we then devise an efficient algorithm based on the accelerated gradient method, which has the optimal convergence rate among first-order methods. We show that the proximal operators of several non-smooth penalties can be solved efficiently due to the special structure of our formulation.
Experimental results on four real-world datasets demonstrate that our approach not only outperforms multiple baseline MTL methods in terms of effectiveness but also has high efficiency.
Title: Efficient multi-task learning with adaptive temporal structure for progression prediction. (Neural Computing & Applications, pp. 1-16; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10171734/pdf/)
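One concrete instance of the proximal operators this abstract refers to is soft-thresholding, the prox of the ℓ1 penalty that underlies Lasso-type terms such as those in the sparse group Lasso and fused Lasso. This is a textbook building block used inside ADMM and accelerated gradient methods, not the authors' full solver:

```python
def prox_l1(v, t):
    """Proximal operator of t * ||x||_1, applied element-wise
    (soft-thresholding): prox(v)_i = sign(v_i) * max(|v_i| - t, 0).
    Shrinks every coordinate toward zero and zeroes out small ones,
    which is what produces sparse feature selection."""
    return [max(abs(x) - t, 0.0) * (1 if x > 0 else -1 if x < 0 else 0)
            for x in v]
```

In a proximal gradient iteration one alternates a gradient step on the smooth loss with this shrinkage step on the penalty; group and fused penalties have analogous closed-form or near-closed-form proxes.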
Pub Date: 2023-05-08 | DOI: 10.1007/s00521-023-08629-3
George Manias, Argyro Mavrogiorgou, Athanasios Kiourtis, Chrysostomos Symvoulidis, Dimosthenis Kyriazis
Text categorization and sentiment analysis are two of the most typical natural language processing tasks, with various emerging applications implemented and utilized in different domains, such as health care and policy making. At the same time, the tremendous growth in the popularity and usage of social media such as Twitter has resulted in an immense increase in user-generated data, mainly represented by the texts of users' posts. However, analyzing these data and extracting actionable knowledge and added value from them is challenging due to their domain diversity and high multilingualism, which highlights the emerging need for domain-agnostic and multilingual solutions. To investigate a portion of these challenges, this research work performs a comparative analysis of multilingual approaches for classifying both the sentiment and the text of an examined multilingual corpus. In this context, four multilingual BERT-based classifiers and a zero-shot classification approach are utilized and compared in terms of their accuracy and applicability to classifying multilingual data. Their comparison has unveiled insightful outcomes with a twofold interpretation: multilingual BERT-based classifiers achieve high performance and transfer inference when trained and fine-tuned on multilingual data, while the zero-shot approach offers a faster, more efficient, and more scalable way of creating multilingual solutions. It can easily be fitted to new languages and new tasks while achieving relatively good results across many languages. However, when efficiency and scalability are less important than accuracy, this model, and zero-shot models in general, cannot match fine-tuned and trained multilingual BERT-based classifiers.
{"title":"Multilingual text categorization and sentiment analysis: a comparative analysis of the utilization of multilingual approaches for classifying twitter data.","authors":"George Manias, Argyro Mavrogiorgou, Athanasios Kiourtis, Chrysostomos Symvoulidis, Dimosthenis Kyriazis","doi":"10.1007/s00521-023-08629-3","DOIUrl":"10.1007/s00521-023-08629-3","url":null,"abstract":"<p><p>Text categorization and sentiment analysis are two of the most typical natural language processing tasks with various emerging applications implemented and utilized in different domains, such as health care and policy making. At the same time, the tremendous growth in the popularity and usage of social media, such as Twitter, has resulted on an immense increase in user-generated data, as mainly represented by the corresponding texts in users' posts. However, the analysis of these specific data and the extraction of actionable knowledge and added value out of them is a challenging task due to the domain diversity and the high multilingualism that characterizes these data. The latter highlights the emerging need for the implementation and utilization of domain-agnostic and multilingual solutions. To investigate a portion of these challenges this research work performs a comparative analysis of multilingual approaches for classifying both the sentiment and the text of an examined multilingual corpus. In this context, four multilingual BERT-based classifiers and a zero-shot classification approach are utilized and compared in terms of their accuracy and applicability in the classification of multilingual data. Their comparison has unveiled insightful outcomes and has a twofold interpretation. Multilingual BERT-based classifiers achieve high performances and transfer inference when trained and fine-tuned on multilingual data. While also the zero-shot approach presents a novel technique for creating multilingual solutions in a faster, more efficient, and scalable way. 
It can easily be fitted to new languages and new tasks while achieving relatively good results across many languages. However, when efficiency and scalability are less important than accuracy, it seems that this model, and zero-shot models in general, can not be compared to fine-tuned and trained multilingual BERT-based classifiers.</p>","PeriodicalId":49766,"journal":{"name":"Neural Computing & Applications","volume":" ","pages":"1-17"},"PeriodicalIF":6.0,"publicationDate":"2023-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10165589/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9715044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
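The accuracy comparison between fine-tuned and zero-shot multilingual classifiers described in the abstract above can be illustrated with a small sketch. The labels, predictions, and language breakdown below are fabricated for illustration, and the helpers (`accuracy`, `per_language_accuracy`) are assumed names, not code from the paper.

```python
# Illustrative per-language accuracy comparison between two hypothetical
# classifiers (a fine-tuned multilingual model vs. a zero-shot one).
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    assert len(y_true) == len(y_pred)
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_language_accuracy(samples):
    """samples: iterable of (language, gold_label, predicted_label)."""
    correct, total = Counter(), Counter()
    for lang, gold, pred in samples:
        total[lang] += 1
        correct[lang] += (gold == pred)
    return {lang: correct[lang] / total[lang] for lang in total}

# Fabricated gold labels and predictions for a tiny EN/DE/EL corpus.
gold = [("en", "pos"), ("en", "neg"), ("de", "pos"), ("de", "neg"), ("el", "pos")]
finetuned = ["pos", "neg", "pos", "neg", "pos"]  # fine-tuned model: all correct
zeroshot  = ["pos", "neg", "pos", "pos", "pos"]  # zero-shot: one German error

samples_ft = [(l, g, p) for (l, g), p in zip(gold, finetuned)]
samples_zs = [(l, g, p) for (l, g), p in zip(gold, zeroshot)]

print(per_language_accuracy(samples_ft))  # every language at 1.0
print(per_language_accuracy(samples_zs))  # German drops to 0.5
```

Breaking accuracy down per language, as above, is what reveals the trade-off the abstract describes: a zero-shot model may score well overall while lagging on specific languages where a fine-tuned model stays accurate.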
Pub Date : 2023-05-04DOI: 10.1007/s00521-023-08606-w
Tawsifur Rahman, Muhammad E H Chowdhury, Amith Khandakar, Zaid Bin Mahbub, Md Sakib Abrar Hossain, Abraham Alhatou, Eynas Abdalla, Sreekumar Muthiyal, Khandaker Farzana Islam, Saad Bin Abul Kashem, Muhammad Salman Khan, Susu M Zughaier, Maqsud Hossain
Nowadays, quick and accurate diagnosis of COVID-19 is a pressing need. This study presents a multimodal system to meet this need. The system employs a machine learning module that learns the required knowledge from datasets collected from 930 COVID-19 patients hospitalized in Italy during the first wave of COVID-19 (March-June 2020). The dataset consists of twenty-five biomarkers from electronic health records and Chest X-ray (CXR) images. The system can diagnose low- or high-risk patients with an accuracy, sensitivity, and F1-score of 89.03%, 90.44%, and 89.03%, respectively, and exhibits 6% higher accuracy than systems that employ either CXR images or biomarker data alone. In addition, the system can calculate the mortality risk of high-risk patients using a multivariate logistic-regression-based nomogram scoring technique. Interested physicians can use the presented system to predict the early mortality risk of COVID-19 patients via the web link Covid-severity-grading-AI. In this case, a physician needs to input the following information: CXR image file, Lactate Dehydrogenase (LDH), Oxygen Saturation (O2%), White Blood Cell Count, C-reactive protein, and Age. In this way, this study contributes to the management of COVID-19 patients by predicting early mortality risk.
Supplementary information: The online version contains supplementary material available at 10.1007/s00521-023-08606-w.
{"title":"BIO-CXRNET: a robust multimodal stacking machine learning technique for mortality risk prediction of COVID-19 patients using chest X-ray images and clinical data.","authors":"Tawsifur Rahman, Muhammad E H Chowdhury, Amith Khandakar, Zaid Bin Mahbub, Md Sakib Abrar Hossain, Abraham Alhatou, Eynas Abdalla, Sreekumar Muthiyal, Khandaker Farzana Islam, Saad Bin Abul Kashem, Muhammad Salman Khan, Susu M Zughaier, Maqsud Hossain","doi":"10.1007/s00521-023-08606-w","DOIUrl":"10.1007/s00521-023-08606-w","url":null,"abstract":"<p><p>Nowadays, quick, and accurate diagnosis of COVID-19 is a pressing need. This study presents a multimodal system to meet this need. The presented system employs a machine learning module that learns the required knowledge from the datasets collected from 930 COVID-19 patients hospitalized in Italy during the first wave of COVID-19 (March-June 2020). The dataset consists of twenty-five biomarkers from electronic health record and Chest X-ray (CXR) images. It is found that the system can diagnose low- or high-risk patients with an accuracy, sensitivity, and <i>F</i>1-score of 89.03%, 90.44%, and 89.03%, respectively. The system exhibits 6% higher accuracy than the systems that employ either CXR images or biomarker data. In addition, the system can calculate the mortality risk of high-risk patients using multivariate logistic regression-based nomogram scoring technique. Interested physicians can use the presented system to predict the early mortality risks of COVID-19 patients using the web-link: Covid-severity-grading-AI. In this case, a physician needs to input the following information: CXR image file, Lactate Dehydrogenase (LDH), Oxygen Saturation (O<sub>2</sub>%), White Blood Cells Count, C-reactive protein, and Age. 
This way, this study contributes to the management of COVID-19 patients by predicting early mortality risk.</p><p><strong>Supplementary information: </strong>The online version contains supplementary material available at 10.1007/s00521-023-08606-w.</p>","PeriodicalId":49766,"journal":{"name":"Neural Computing & Applications","volume":" ","pages":"1-23"},"PeriodicalIF":6.0,"publicationDate":"2023-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10157130/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9717265","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
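The logistic-regression-based nomogram scoring described in the abstract above can be sketched as a plain logistic model over the same five clinical inputs the web tool requests. All coefficients, the intercept, and the example patient values below are hypothetical placeholders, not the fitted values from the paper.

```python
# Minimal sketch of a logistic-regression-based mortality risk score.
# Coefficients are hypothetical log-odds per unit of each predictor.
import math

COEF = {"ldh": 0.004, "o2": -0.08, "wbc": 0.05, "crp": 0.01, "age": 0.03}
INTERCEPT = -1.0

def mortality_risk(ldh, o2, wbc, crp, age):
    """Return a mortality probability in (0, 1) from the logistic model."""
    z = (INTERCEPT
         + COEF["ldh"] * ldh   # Lactate Dehydrogenase (U/L)
         + COEF["o2"]  * o2    # Oxygen saturation (%) -- protective, so negative
         + COEF["wbc"] * wbc   # White blood cell count (10^9/L)
         + COEF["crp"] * crp   # C-reactive protein (mg/L)
         + COEF["age"] * age)  # Age (years)
    return 1.0 / (1.0 + math.exp(-z))

# Fabricated example patient: elevated LDH and CRP, reduced O2 saturation.
risk = mortality_risk(ldh=450, o2=88, wbc=9.0, crp=60, age=70)
print(f"predicted mortality risk: {risk:.2f}")
```

A nomogram is essentially a graphical rendering of such a model: each predictor contributes points proportional to its coefficient times its value, and the total maps through the sigmoid to a probability.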