Pub Date : 2024-09-09 DOI: 10.1007/s10462-024-10825-z
Maria Lymperaiou, Giorgos Stamou
Multimodal learning is a field of growing interest that aims to combine several modalities in a single joint representation. In visiolinguistic (VL) learning in particular, multiple models and techniques have been developed, targeting a variety of tasks that involve images and text. VL models have reached unprecedented performance by extending the idea of Transformers so that both modalities can learn from each other. Massive pre-training procedures enable VL models to acquire a certain level of real-world understanding, but many gaps remain: their limited comprehension of commonsense, factual, temporal and other everyday knowledge calls the extensibility of VL tasks into question. Knowledge graphs and other knowledge sources can fill those gaps by explicitly providing the missing information, unlocking novel capabilities of VL models. At the same time, knowledge graphs enhance the explainability, fairness and validity of decision making, issues of utmost importance for such complex implementations. This survey aims to unify the fields of VL representation learning and knowledge graphs, and provides a taxonomy and analysis of knowledge-enhanced VL models.
Title: A survey on knowledge-enhanced multimodal learning. Journal: Artificial Intelligence Review. PDF: https://link.springer.com/content/pdf/10.1007/s10462-024-10825-z.pdf
Pub Date : 2024-09-06 DOI: 10.1007/s10462-024-10860-w
Mohammed Atef, Sifeng Liu, Sarbast Moslem, Dragan Pamucar
To study in depth Zhan’s methodology for covering-based multigranulation fuzzy rough sets (\(\hbox{C}_{MG}\)FRSs), we build two families: the family of fuzzy \(\beta\)-minimum descriptions and the family of fuzzy \(\beta\)-maximum descriptions. Using these notions, we develop two variants of covering-based optimistic (pessimistic) multigranulation fuzzy rough set models (\(\hbox{CO(P)}_{MG}\)FRS) and examine their axiomatic properties. We then examine four models of covering-based variable precision multigranulation fuzzy rough sets (\(\hbox{CVP}_{MG}\)FRSs) and analyze their features. The interconnections between the proposed models are also elucidated. The study further explores algorithms for addressing multiattribute group decision-making (MAGDM) and multicriteria group decision-making (MCGDM) problems. Test examples are worked through to give a thorough picture of the efficacy of the proposed models. Finally, the differences between our methods and existing research are demonstrated.
Title: New covering techniques and applications utilizing multigranulation fuzzy rough sets. Journal: Artificial Intelligence Review. PDF: https://link.springer.com/content/pdf/10.1007/s10462-024-10860-w.pdf
Grey Wolf Optimization (GWO) is a highly effective meta-heuristic algorithm that leverages swarm intelligence to tackle real-world optimization problems. However, when confronted with large-scale problems, GWO struggles with convergence speed and problem-solving capability. To address this, we propose an Improved Adaptive Grey Wolf Optimization (IAGWO), which significantly enhances exploration of the search space through refined search mechanisms and an adaptive strategy. First, we incorporate velocity and the Inverse Multiquadratic Function (IMF) into the search mechanism, which accelerates convergence while maintaining accuracy. Second, we implement an adaptive strategy for population updates, dynamically enhancing the algorithm's search and optimization capabilities. The efficacy of the proposed IAGWO is demonstrated through comparative experiments on benchmark test sets, including the CEC 2017, CEC 2020, CEC 2022, and CEC 2013 large-scale global optimization suites. On CEC 2017, CEC 2020 (10/20 dimensions), CEC 2022 (10/20 dimensions), and CEC 2013, respectively, it outperformed the other compared algorithms by 88.2%, 91.5%, 85.4%, 96.2%, 97.4%, and 97.2%. The results affirm that our algorithm surpasses state-of-the-art approaches in addressing large-scale problems. Moreover, we showcase the broad application potential of the algorithm by successfully solving 19 real-world engineering challenges.
Title: Improved multi-strategy adaptive Grey Wolf Optimization for practical engineering applications and high-dimensional problem solving. Authors: Mingyang Yu, Jing Xu, Weiyun Liang, Yu Qiu, Sixu Bao, Lin Tang. Pub Date : 2024-09-05 DOI: 10.1007/s10462-024-10821-3. Journal: Artificial Intelligence Review. PDF: https://link.springer.com/content/pdf/10.1007/s10462-024-10821-3.pdf
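For readers unfamiliar with the baseline, the following is a minimal sketch of the standard GWO position update that the abstract takes as its starting point. It is not the IAGWO proposed in the paper: the velocity term, the IMF weighting and the adaptive population update are not specified in the abstract and are deliberately left out. The function name `gwo`, its parameters and the sphere test function are illustrative assumptions.

```python
import numpy as np

def gwo(objective, dim, bounds, n_wolves=20, n_iter=200, seed=0):
    """Baseline Grey Wolf Optimization (minimization). Hypothetical helper, not IAGWO."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    wolves = rng.uniform(lo, hi, size=(n_wolves, dim))
    fitness = np.apply_along_axis(objective, 1, wolves)
    best_idx = int(np.argmin(fitness))
    best_x, best_f = wolves[best_idx].copy(), float(fitness[best_idx])

    for t in range(n_iter):
        # The three fittest wolves (alpha, beta, delta) guide the pack.
        leaders = wolves[np.argsort(fitness)[:3]].copy()
        a = 2.0 * (1.0 - t / n_iter)  # decreases linearly from 2 towards 0
        new_pos = np.zeros_like(wolves)
        for leader in leaders:
            r1 = rng.random((n_wolves, dim))
            r2 = rng.random((n_wolves, dim))
            A = 2.0 * a * r1 - a           # |A| > 1 favours exploration
            C = 2.0 * r2
            D = np.abs(C * leader - wolves)
            new_pos += leader - A * D
        # Each wolf moves to the average of the three leader-guided positions.
        wolves = np.clip(new_pos / 3.0, lo, hi)
        fitness = np.apply_along_axis(objective, 1, wolves)
        idx = int(np.argmin(fitness))
        if fitness[idx] < best_f:
            best_x, best_f = wolves[idx].copy(), float(fitness[idx])
    return best_x, best_f

def sphere(x):
    """Simple convex test function with its minimum at the origin."""
    return float(np.sum(x ** 2))

if __name__ == "__main__":
    x, f = gwo(sphere, dim=5, bounds=(-10.0, 10.0))
    print(x, f)
```

The linearly decaying coefficient `a` is what shifts the pack from exploration to exploitation; the abstract's point is that this plain schedule converges slowly on large-scale problems.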
Pub Date : 2024-09-05 DOI: 10.1007/s10462-024-10875-3
Gang Kou, Hasan Dinçer, Dragan Pamucar, Serhat Yüksel, Muhammet Deveci, Gabriela Oana Olaru, Serkan Eti
Improvements are needed to increase the effectiveness of non-fungible tokens on the Metaverse platform without incurring extra costs. To handle this process more efficiently, the most important factors for a successful integration of non-fungible tokens into this platform need to be determined. Accordingly, this study aims to determine the appropriate identity management choices for non-fungible tokens in the Metaverse. The proposed novel fuzzy decision-making model has three stages. The first stage prioritizes the expert choices with an artificial intelligence-based decision-making methodology. Second, the criteria sets for managing non-fungible tokens are weighted using a Quantum picture fuzzy rough sets-based M-SWARA methodology. Finally, the identity management choices regarding non-fungible tokens in the Metaverse are ranked with Quantum picture fuzzy rough sets-oriented VIKOR. The main contribution of this study is that an artificial intelligence methodology is integrated into fuzzy decision-making modelling to differentiate the experts. This makes it possible to cluster the experts, so that the opinions of experts outside the selected group can be excluded from the scope. The results indicate that security must be ensured first to increase the use of non-fungible tokens on the Metaverse platform. Similarly, the technological infrastructure must be sufficient to achieve this objective. Moreover, biometrics for unique identification has the best ranking performance among the alternatives. Privacy with authentication also plays a critical role in the effectiveness of this process.
Title: Human–computer interaction using artificial intelligence-based expert prioritization and neuro quantum fuzzy picture rough sets for identity management choices of non-fungible tokens in the Metaverse. Journal: Artificial Intelligence Review. PDF: https://link.springer.com/content/pdf/10.1007/s10462-024-10875-3.pdf
Pub Date : 2024-09-05 DOI: 10.1007/s10462-024-10905-0
Gang Kou, Hasan Dinçer, Dragan Pamucar, Serhat Yüksel, Muhammet Deveci, Serkan Eti
Improvements are needed to increase the performance of Metaverse investments. However, businesses need to focus on the most important actions to remain cost-effective in this process. A new study is therefore needed that carries out a priority analysis of the performance indicators of Metaverse investments. Accordingly, this study aims to evaluate the main determinants of the performance of Metaverse investments. To this end, a novel model with four stages is created. The first stage prioritizes the experts with an artificial intelligence-based decision-making method. Second, missing evaluations are estimated by an expert recommendation system. Third, the criteria are weighted with Quantum picture fuzzy rough sets-based (QPFR) M-Step-wise Weight Assessment Ratio Analysis (SWARA). Finally, investment decision-making priorities are ranked by QPFR VIKOR (Vlse Kriterijumska Optimizacija Kompromisno Resenje). The main contribution of this study is the integration of an artificial intelligence methodology into the fuzzy decision-making approach in order to compute the weights of the decision makers. As a result, the experts' evaluations are weighted according to their qualifications, which leads to more effective evaluations. Organizational effectiveness is found to be the most important factor in improving the performance of Metaverse investments. Similarly, it is important for businesses to ensure technological improvements in the development of Metaverse investments. The ranking results, in turn, indicate that the regulatory framework is the most critical alternative in this regard.
Title: Artificial intelligence-based expert weighted quantum picture fuzzy rough sets and recommendation system for metaverse investment decision-making priorities. Journal: Artificial Intelligence Review. PDF: https://link.springer.com/content/pdf/10.1007/s10462-024-10905-0.pdf
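The ranking stage named in the two abstracts above relies on VIKOR. As a rough illustration only, the sketch below implements the classical crisp VIKOR ranking on a plain numeric decision matrix; the Quantum picture fuzzy rough set extension used by the authors is not described in the abstract and is not reproduced here. The function name `vikor`, its parameters and the toy matrix are assumptions made for this example.

```python
import numpy as np

def vikor(matrix, weights, benefit=None, v=0.5):
    """Classical (crisp) VIKOR. Rows are alternatives, columns are criteria.

    benefit[j] is True when larger values of criterion j are better.
    v weights group utility (S) against individual regret (R).
    Returns S, R, Q and the alternative indices sorted from best to worst.
    """
    X = np.asarray(matrix, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    if benefit is None:
        benefit = np.ones(X.shape[1], dtype=bool)
    benefit = np.asarray(benefit, dtype=bool)

    best = np.where(benefit, X.max(axis=0), X.min(axis=0))
    worst = np.where(benefit, X.min(axis=0), X.max(axis=0))
    # Avoid division by zero when all alternatives tie on a criterion.
    span = np.where(best == worst, 1.0, best - worst)

    regret = w * (best - X) / span
    S = regret.sum(axis=1)   # group utility (lower is better)
    R = regret.max(axis=1)   # worst single-criterion regret

    def scale(z):
        width = z.max() - z.min()
        return np.zeros_like(z) if width == 0 else (z - z.min()) / width

    Q = v * scale(S) + (1.0 - v) * scale(R)
    return S, R, Q, np.argsort(Q, kind="stable")

if __name__ == "__main__":
    # Three hypothetical alternatives scored on two benefit criteria.
    S, R, Q, order = vikor([[9, 9], [5, 5], [1, 1]], weights=[1, 1])
    print(Q, order)
```

In the papers, the criterion weights passed to this step would come from the M-SWARA stage; here they are supplied by hand.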
Pub Date : 2024-09-05 DOI: 10.1007/s10462-024-10924-x
Stella Ho, Ming Liu, Shang Gao, Longxiang Gao
Continual learning strives to ensure stability in solving previously seen tasks while demonstrating plasticity in a novel domain. Recent advances in continual learning are mostly confined to the supervised learning setting, especially in the NLP domain. In this work, we consider a few-shot continual active learning setting in which labeled data are scarce and unlabeled data are abundant, but the annotation budget is limited. We exploit meta-learning and propose a method called Meta-Continual Active Learning. This method sequentially queries the most informative examples from a pool of unlabeled data for annotation to enhance task-specific performance, and tackles continual learning problems through a meta-objective. Specifically, we employ meta-learning and experience replay to address inter-task confusion and catastrophic forgetting. We further incorporate textual augmentations to avoid the memory over-fitting caused by experience replay and sample queries, thereby ensuring generalization. We conduct extensive experiments on benchmark text classification datasets from diverse domains to validate the feasibility and effectiveness of meta-continual active learning. We also analyze the impact of different active learning strategies on various meta-continual learning models. The experimental results demonstrate that introducing randomness into sample selection is the best default strategy for maintaining generalization in a meta-continual learning framework.
Title: Learning to learn for few-shot continual active learning. Journal: Artificial Intelligence Review. PDF: https://link.springer.com/content/pdf/10.1007/s10462-024-10924-x.pdf
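To make the query step in the abstract above concrete, here is a small sketch contrasting one common informativeness criterion (predictive entropy) with plain random selection. The abstract does not say which active learning strategies the authors compared, so entropy sampling is an assumption chosen for illustration, and the helper names `entropy_query` and `random_query` are hypothetical.

```python
import numpy as np

def entropy_query(probs, budget):
    """Pick the `budget` pool items whose predicted class distribution is most uncertain.

    probs: array of shape (n_pool, n_classes), each row summing to 1.
    Returns pool indices ordered from most to least uncertain.
    """
    probs = np.asarray(probs, dtype=float)
    eps = 1e-12  # keeps log() finite for zero probabilities
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    order = np.argsort(-entropy, kind="stable")
    return order[:budget].tolist()

def random_query(n_pool, budget, seed=0):
    """Pick `budget` distinct pool items uniformly at random."""
    rng = np.random.default_rng(seed)
    return rng.choice(n_pool, size=budget, replace=False).tolist()

if __name__ == "__main__":
    pool_probs = [[0.5, 0.5], [0.9, 0.1], [0.6, 0.4]]
    print(entropy_query(pool_probs, budget=2))
    print(random_query(len(pool_probs), budget=2))
```

A purely uncertainty-driven query tends to return similar, hard examples, which is one plausible reason the abstract reports that adding randomness to selection generalizes better.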
Pub Date : 2024-09-05 DOI: 10.1007/s10462-024-10933-w
Jun Hoong Chan, Kai Liu, Yu Chen, A. S. M. Sharifuzzaman Sagar, Yong-Guk Kim
Machine learning has recently proved very useful for solving diverse drone tasks, such as autonomous navigation, visual surveillance, communication, disaster management, and agriculture. Two representative machine learning paradigms have been widely used in such applications: supervised learning and reinforcement learning. Researchers often prefer supervised learning, mostly based on convolutional neural networks, because of its robustness and ease of use, even though data labeling is laborious and time-consuming. On the other hand, when traditional reinforcement learning is combined with deep neural networks, it becomes a very powerful tool for problems with high-dimensional inputs such as images and video. With the fast development of reinforcement learning, many researchers now use it in drone applications, where it often outperforms supervised learning. However, it usually requires the agent to explore the environment by trial and error, which is costly and unrealistic in the real world. Recent advances in simulated environments allow an agent to learn by itself and so overcome these drawbacks, although the gap between the real environment and the simulator ultimately has to be minimized. In this sense, a realistic and reliable simulator is essential for reinforcement learning training. This paper investigates various drone simulators that work with diverse reinforcement learning architectures. The characteristics of reinforcement learning-based drone simulators are analyzed and compared for researchers who would like to employ them in their projects. Finally, we shed light on some challenges and potential directions for future drone simulators.
Title: Reinforcement learning-based drone simulators: survey, practice, and challenge. Journal: Artificial Intelligence Review. PDF: https://link.springer.com/content/pdf/10.1007/s10462-024-10933-w.pdf
Pub Date : 2024-09-05 DOI: 10.1007/s10462-024-10930-z
Bin Wang, Liwen Yu, Bo Zhang
As the degree of automotive intelligence increases, gesture recognition is gaining more attention in human-vehicle interaction. However, existing gesture recognition methods are computationally intensive and perform poorly in multi-modal sensor scenarios. This paper proposes a novel network structure, AL-MobileNet (MobileNet with Attention and Lightweight Modules), which can quickly and accurately estimate 2D gestures in RGB and infrared (IR) images. The contributions of this paper are as follows. First, to enhance the multi-modal data, we created a synthetic IR dataset based on real 2D gestures and employed a coarse-to-fine training approach. Second, to speed up the model's computation on edge devices, we introduced a new lightweight computational module called the Split Channel Attention Block (SCAB). Third, to ensure the model maintains accuracy on large datasets, we incorporated auxiliary networks and an Angle-Weighted Loss (AWL) into the backbone network. Experiments show that AL-MobileNet requires only 0.4 GFLOPs of computation and 1.2 million parameters. This makes it 1.5 times faster than MobileNet and allows quick execution on edge devices. AL-MobileNet achieved a running speed of up to 28 FPS on the Ambarella CV28. On both general datasets and our own dataset, our algorithm achieved an average PCK0.2 score of 0.95, which indicates that it can quickly generate accurate 2D gestures. The demonstration of the algorithm can be reviewed in gesturebaolong.
Title: AL-MobileNet: a novel model for 2D gesture recognition in intelligent cockpit based on multi-modal data. Journal: Artificial Intelligence Review. PDF: https://link.springer.com/content/pdf/10.1007/s10462-024-10930-z.pdf
Pub Date : 2024-09-05 DOI: 10.1007/s10462-024-10914-z
Maleika Heenaye-Mamode Khan, Pushtika Reesaul, Muhammad Muzzammil Auzine, Amelia Taylor
Due to progress in image processing and Artificial Intelligence (AI), it is now possible to develop automated tools for the early detection and diagnosis of Alzheimer’s Disease (AD). The handcrafted techniques developed so far lack generality, which has led to the development of deep learning (DL) techniques that can extract more relevant features. To cope with limited labelled datasets and high computational requirements, transfer learning models can be adopted as a baseline. In recent years, considerable research effort has been devoted to developing machine learning-based techniques for AD detection and classification using medical imaging data. This survey comprehensively reviews the existing literature on the methodologies and approaches employed for AD detection and classification, with a focus on neuroimaging techniques such as structural MRI, PET, and fMRI. Its main objective is to analyse the different transfer learning models that can be used to deploy deep convolutional neural networks for AD detection and classification. The phases involved in development, namely image capture, pre-processing, feature extraction, and feature selection, are also discussed with a view to shedding light on the challenges that need to be addressed in each phase. The research perspectives presented may provide directions for the development of automated applications for AD detection and classification.
{"title":"Detection of Alzheimer’s disease using pre-trained deep learning models through transfer learning: a review","authors":"Maleika Heenaye-Mamode Khan, Pushtika Reesaul, Muhammad Muzzammil Auzine, Amelia Taylor","doi":"10.1007/s10462-024-10914-z","DOIUrl":"10.1007/s10462-024-10914-z","url":null,"PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":null,"pages":null},"PeriodicalIF":10.7,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-024-10914-z.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cloud computing is an emerging technology composed of several key components that work together to create a seamless network of interconnected devices. These interconnected devices, such as sensors, routers, smartphones, and smart appliances, are the foundation of the Internet of Everything (IoE). Huge volumes of data generated by IoE devices are processed and accumulated in the cloud, allowing for real-time analysis and insights. As a result, there is a pressing need for load-balancing and task-scheduling techniques in cloud computing. The primary objective of these techniques is to divide the workload evenly across all available resources while addressing related goals such as reducing execution time and response time, increasing throughput, and improving fault detection. This systematic literature review (SLR) analyzes the optimization and machine learning algorithms used for load-balancing and task-scheduling problems in a cloud computing environment. To analyze load-balancing patterns and task-scheduling techniques, we chose a representative set of 63 research articles written in English between 2014 and 2024, selected using suitable inclusion and exclusion criteria. The SLR aims to minimize bias and increase objectivity by designing research questions about the topic. We focus on the technologies used, the merits and demerits of the different technologies, gaps in the research, insights into tools, forthcoming opportunities, performance metrics, and an in-depth investigation of ML-based optimization techniques.
{"title":"A systematic literature review for load balancing and task scheduling techniques in cloud computing","authors":"Nisha Devi, Sandeep Dalal, Kamna Solanki, Surjeet Dalal, Umesh Kumar Lilhore, Sarita Simaiya, Nasratullah Nuristani","doi":"10.1007/s10462-024-10925-w","DOIUrl":"10.1007/s10462-024-10925-w","url":null,"PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":null,"pages":null},"PeriodicalIF":10.7,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-024-10925-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}