While the classical KNN (k nearest neighbor) avoids the assumption that training and testing samples follow the same distribution and thereby achieves fast prediction, it still faces two challenges: (a) its generalization ability heavily depends on choosing an appropriate number k of nearest neighbors; (b) its prediction behavior lacks interpretability. To address these two challenges, a novel Bayes-decisive linear KNN with adaptive nearest neighbors (BLA-KNN) is proposed, offering three merits: (a) a diagonal matrix is introduced to adaptively select the nearest neighbors while improving the generalization capability of the method; (b) BLA-KNN owns a group effect, which inherits and extends the group property of the total sum of squared deviations by reflecting class-aware information about the training samples in a group-effect regularization term; (c) the prediction behavior of BLA-KNN can be interpreted from the Bayes-decision-rule perspective. To this end, we first use a diagonal matrix to weight each training sample and thus capture its importance, while constraining the importance weights so that the adaptive choice of k is carried out efficiently. Second, we introduce a class-aware information regularization term into the objective function to obtain the nearest-neighbor group effect of the samples. Finally, we introduce linear expression weights, related to the distance between testing and training samples, into the regularization term so that the Bayes-decision-rule interpretation can be carried out smoothly. The objective function is optimized with an alternating optimization strategy. We experimentally demonstrate the effectiveness of BLA-KNN by comparing it with 7 competing methods on 15 benchmark datasets.
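The paper's full objective is not given in the abstract, but the core idea of importance-weighted neighbors with an adaptive neighbor set can be sketched as follows. This is a minimal illustration, not the paper's method: the importance weights, the threshold `tau`, and the scoring rule are all assumptions for the sketch (BLA-KNN learns its diagonal weights via the optimization described above).

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x_test, weights, tau=1e-3):
    """Hedged sketch: KNN generalized with per-sample importance weights
    (playing the role of the paper's diagonal matrix). Samples whose
    importance-scaled closeness falls below `tau` are dropped, so the
    effective number of neighbors adapts per query."""
    d = np.linalg.norm(X_train - x_test, axis=1)   # distances to the query
    score = weights / (d + 1e-12)                  # closeness scaled by importance
    active = score > tau                           # adaptive neighbor set
    classes = np.unique(y_train)
    votes = [score[active][y_train[active] == c].sum() for c in classes]
    return classes[int(np.argmax(votes))]
```

With uniform weights this reduces to distance-weighted voting over all samples; a learned, sparse weight vector shrinks the active set to the informative neighbors.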
Bayes-Decisive Linear KNN with Adaptive Nearest Neighbors — Jin Zhang, Zekang Bian, Shitong Wang. International Journal of Intelligent Systems, 2024-01-24. doi:10.1155/2024/6664942
Granular computing (GrC) embraces a spectrum of concepts, methodologies, methods, and applications that dwell upon information granules and their processing. The fuzzy C-means (FCM)-based encoding-decoding (granulation-degranulation) mechanism plays a visible role in granular computing. The fuzzy decoding mechanism, also known as the reconstruction (degranulation) problem, has been studied intensively in recent years. This study focuses on improving the fuzzy decoding mechanism: an augmented version, achieved by constructing perturbation matrices of the prototypes, is put forward. Particle swarm optimization is employed to determine a group of optimal perturbation matrices that refine the prototype matrix and yield an optimal partition matrix. A series of experiments demonstrates the resulting enhancement. The experimental results are consistent with the theoretical analysis and show that the developed method outperforms the traditional FCM-based decoding mechanism.
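The baseline granulation-degranulation mechanism the paper improves upon uses the standard FCM formulas: membership of a sample in each prototype, and reconstruction as a membership-weighted combination of prototypes. A minimal sketch of that baseline (the paper's contribution — perturbing the prototypes — is not shown here):

```python
import numpy as np

def granulate(x, V, m=2.0):
    """FCM encoding: membership of sample x (n,) in prototypes V (c, n),
    u_i = 1 / sum_j (||x - v_i|| / ||x - v_j||)^(2/(m-1))."""
    d = np.linalg.norm(V - x, axis=1) + 1e-12
    return 1.0 / np.sum((d[:, None] / d[None, :]) ** (2.0 / (m - 1)), axis=1)

def degranulate(u, V, m=2.0):
    """FCM decoding: reconstruct x as sum_i u_i^m v_i / sum_i u_i^m."""
    w = u ** m
    return (w @ V) / w.sum()
```

The gap between a sample and its reconstruction is the degranulation error that the perturbation-matrix scheme, tuned by particle swarm optimization, aims to shrink.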
Constructing Perturbation Matrices of Prototypes for Enhancing the Performance of Fuzzy Decoding Mechanism — Kaijie Xu, Hanyu E, Junliang Liu, Guoyao Xiao, Xiaoan Tang, Mengdao Xing. International Journal of Intelligent Systems, 2024-01-24. doi:10.1155/2024/5780186
Data mining is the process of extracting hidden patterns from large databases using a variety of techniques. In supermarkets, for example, we can discover which items are often purchased together, knowledge that is hidden within the data and supports better business decisions. One technique for discovering frequent patterns in large databases is frequent itemset mining (FIM), a part of association rule mining (ARM). Among the algorithms for mining frequent itemsets, one of the most common is the Apriori algorithm, which deduces association rules describing how different objects are related. It is used in application areas such as market basket analysis, course selection in e-learning platforms, stock management, and medical applications. With today's explosion of data, the computational time of the Apriori algorithm grows rapidly, so data-intensive algorithms must run in a parallel, distributed environment to achieve acceptable performance. In this paper, an optimization of the Apriori algorithm using a Spark-based cuckoo filter structure (ASCF) is introduced. ASCF removes the candidate generation step from the Apriori algorithm to reduce computational complexity and avoid costly comparisons. It uses the cuckoo filter structure to prune transactions by reducing the number of items in each one. The proposed algorithm is implemented on the Spark in-memory distributed processing environment to reduce processing time. ASCF offers a great improvement in performance over other Apriori-based candidate algorithms: on the retail dataset with a minimum support of 0.75%, it runs in only 5.8% of the time of the state-of-the-art approach.
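For reference, the classical Apriori baseline — including the candidate generation step that ASCF eliminates — can be sketched in a few lines. This is the textbook algorithm, not the paper's Spark/cuckoo-filter variant:

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Textbook Apriori: returns a dict mapping each frequent itemset
    (as a frozenset) to its support. Includes the candidate-generation
    step that the ASCF variant removes."""
    n = len(transactions)
    tx = [frozenset(t) for t in transactions]
    current = {frozenset([i]) for t in tx for i in t}   # size-1 candidates
    freq, k = {}, 1
    while current:
        counts = {c: sum(c <= t for t in tx) for c in current}
        level = {c: v / n for c, v in counts.items() if v / n >= min_support}
        freq.update(level)
        # join step: merge frequent k-itemsets into (k+1)-candidates
        keys = list(level)
        current = {a | b for a, b in combinations(keys, 2) if len(a | b) == k + 1}
        k += 1
    return freq
```

Candidate generation is the expensive part: the number of (k+1)-candidates grows combinatorially with the frequent k-itemsets, which is exactly the cost ASCF's transaction pruning and filter lookups avoid.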
ASCF: Optimization of the Apriori Algorithm Using Spark-Based Cuckoo Filter Structure — Bana Ahmad Alrahwan, Mona Farouk. International Journal of Intelligent Systems, 2024-01-22. doi:10.1155/2024/8781318
Document representation is the basis of language modeling. Its goal is to turn free-flowing natural language text into a structured form that a computer can store and process. Most existing text-representation methods use the bag-of-words model, yet they ignore how phrases are used in the text, which hurts the performance of downstream natural language processing tasks. Representing the meaning of text by phrases is a promising research direction, but it is hard to do well because phrases are organized hierarchically and mining efficiency is low. In this paper, we put forward hierarchical text semantic representation using the knowledge graph (HTSRKG), which uses syntactic structure features to find hierarchical phrases and knowledge graphs to improve how phrases are evaluated. First, we use CKY and PCFG to build the syntax tree sentence by sentence. Second, we walk the parse tree using a hierarchical routing process to obtain mixed phrase semantics over passages. Finally, the introduction of the knowledge graph improves the efficiency of text semantic extraction and the accuracy of text representation. This provides a solid foundation for subsequent natural language processing tasks. Extensive testing on real datasets shows that HTSRKG surpasses baseline approaches in text semantic representation, a finding supported by a recent benchmarking study.
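The first step above — CKY parsing under a PCFG — is standard and can be illustrated with a toy grammar. The grammar below is invented for the example (the paper does not specify one); the chart-filling logic is the usual probabilistic CKY over a grammar in Chomsky normal form:

```python
from collections import defaultdict

# Toy PCFG in Chomsky normal form (illustrative, not from the paper).
LEX = {("NP", "dogs"): 0.5, ("NP", "cats"): 0.5, ("V", "chase"): 1.0}
BIN = {("S", ("NP", "VP")): 1.0, ("VP", ("V", "NP")): 1.0}

def cky(words):
    """Probabilistic CKY: chart[(i, j)] maps a nonterminal to the best
    probability of deriving words[i:j]. Returns the probability of S
    spanning the whole sentence (0.0 if unparsable)."""
    n = len(words)
    chart = defaultdict(dict)
    for i, w in enumerate(words):                  # fill lexical spans
        for (A, word), p in LEX.items():
            if word == w:
                chart[(i, i + 1)][A] = p
    for span in range(2, n + 1):                   # grow spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):              # try every split point
                for (A, (B, C)), p in BIN.items():
                    pb, pc = chart[(i, k)].get(B), chart[(k, j)].get(C)
                    if pb and pc and p * pb * pc > chart[(i, j)].get(A, 0.0):
                        chart[(i, j)][A] = p * pb * pc
    return chart[(0, n)].get("S", 0.0)
```

Each chart cell corresponds to a (possibly nested) phrase span, which is the hierarchical phrase structure that HTSRKG's routing process then walks.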
Knowledge Graph-Based Hierarchical Text Semantic Representation — Yongliang Wu, Xiao Pan, Jinghui Li, Shimao Dou, Jiahao Dong, Dan Wei. International Journal of Intelligent Systems, 2024-01-12. doi:10.1155/2024/5583270
Continuous sign-language videos convey dynamic signs in sentence form, with a series of frames depicting each sign or phrase. Most of these frames are noninformational and have little effect on sign recognition. By removing them from the frameset, the recognition algorithm needs only a minimal number of frames per sign, which reduces the time and space complexity of such systems. The proposed algorithm tackles the challenge of identifying tiny-motion frames, such as tapping, stroking, and caressing, as keyframes in continuous sign-language videos with a high reduction ratio and accuracy. Unlike previous studies, the method maintains the continuity of sign motion instead of isolating signs, and it supports the scalability and stability of the dataset. The algorithm measures angular displacements between adjacent frames to identify potential keyframes; noninformational frames are then discarded using a sequence check technique. Phoenix14, a German continuous sign-language benchmark dataset, has been reduced to 74.9% with an accuracy of 83.1%, and the American Sign Language (ASL) How2Sign dataset is reduced to 76.9% with 84.2% accuracy. A low word error rate (WER) is also achieved on the Phoenix14 dataset.
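The angular-displacement idea can be sketched as follows. Everything here is an assumption for illustration — the paper's actual displacement measure, threshold, and sequence check are not specified in the abstract; this sketch simply treats flattened frames as vectors and keeps a frame when its angle to the last kept frame exceeds a threshold:

```python
import numpy as np

def select_keyframes(frames, theta=0.08):
    """Sketch: keep frame t as a candidate keyframe when the angle (in
    radians) between the flattened vectors of frame t and the last kept
    frame exceeds `theta`. The threshold is illustrative."""
    keep = [0]                                     # always keep the first frame
    prev = frames[0].ravel().astype(float)
    for t in range(1, len(frames)):
        cur = frames[t].ravel().astype(float)
        cos = np.dot(prev, cur) / (np.linalg.norm(prev) * np.linalg.norm(cur) + 1e-12)
        if np.arccos(np.clip(cos, -1.0, 1.0)) > theta:
            keep.append(t)
            prev = cur                             # next comparisons use this frame
    return keep
```

A small `theta` retains tiny-motion frames (tapping, stroking) as keyframes while still discarding near-duplicate frames, which is the balance the paper targets.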
Keyframe Extraction Algorithm for Continuous Sign-Language Videos Using Angular Displacement and Sequence Check Metrics — M. S. Aiswarya, R. Arockia Xavier Annie. International Journal of Intelligent Systems, 2024-01-10. doi:10.1155/2024/4725216
Belief divergence is a significant measure for quantifying the discrepancy between pieces of evidence, which benefits conflict-information management in Dempster–Shafer evidence theory. In this article, three new concepts are given: the belief Bhattacharyya coefficient, an adjustment function, and an enhancement factor. Based on them, a novel enhanced belief divergence, called EBD, is proposed; it can assess the correlation of subsets and fully reflect the uncertainty of multielement sets. The important properties of the EBD are studied. In particular, a new EBD-based multisource information fusion method is designed to handle evidence conflict, where the weight of each piece of evidence is decided by the EBD between pieces of evidence and by the information volume of each one. Compared with other methods, the proposed method produces more rational and convincing outcomes in target recognition and iris classification when dealing with conflicting information. Finally, an application to risk priority evaluation of failure modes of aircraft turbine rotor blades validates that the proposed method has extensive applicability.
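As context, a plain Bhattacharyya coefficient over two basic probability assignments (BPAs) can be computed as below. This is the classical form, summing over shared focal elements; the paper's belief Bhattacharyya coefficient additionally accounts for subset correlation via its adjustment function and enhancement factor, which are not reproduced here:

```python
import math

def belief_bhattacharyya(m1, m2):
    """Plain Bhattacharyya coefficient between two BPAs, each given as a
    dict mapping a focal element (frozenset) to its mass:
    BC(m1, m2) = sum_A sqrt(m1(A) * m2(A)). Equal BPAs give 1.0."""
    focal = set(m1) | set(m2)
    return sum(math.sqrt(m1.get(A, 0.0) * m2.get(A, 0.0)) for A in focal)
```

A coefficient near 1 indicates agreeing evidence; near 0 indicates conflict, which a fusion scheme can use to down-weight discordant sources.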
A New Divergence Based on the Belief Bhattacharyya Coefficient with an Application in Risk Evaluation of Aircraft Turbine Rotor Blades — Zhu Yin, Xiaojian Ma, Hang Wang. International Journal of Intelligent Systems, 2024-01-10. doi:10.1155/2024/2140919
Hosam El-Sofany, Samir A. El-Seoud, Omar H. Karam, Yasser M. Abd El-Latif, Islam A. T. F. Taj-Eddin
With the increasing prevalence of diabetes in Saudi Arabia, there is a critical need for early detection and prediction of the disease to prevent long-term health complications. This study addresses this need by applying machine learning (ML) techniques to the Pima Indians dataset and private diabetes datasets through the implementation of a computerized system for predicting diabetes. In contrast to prior research, this study employs a semisupervised model combined with strong gradient boosting, effectively predicting diabetes-related features of the dataset. Additionally, the researchers employ the SMOTE technique to deal with the problem of imbalanced classes. Ten ML classification techniques, including logistic regression, random forest, KNN, decision tree, bagging, AdaBoost, XGBoost, voting, SVM, and Naive Bayes, are evaluated to determine the algorithm that produces the most accurate diabetes prediction. The proposed approach achieves impressive performance. For the private dataset, the XGBoost algorithm with SMOTE achieved an accuracy of 97.4%, an F1 score of 0.95, and an AUC of 0.87. For the combined datasets, it achieved an accuracy of 83.1%, an F1 score of 0.76, and an AUC of 0.85. To understand how the model predicts the final results, an explainable AI technique using SHAP methods is implemented. Furthermore, the study demonstrates the adaptability of the proposed system by applying a domain adaptation method. To further enhance accessibility, a mobile app has been developed for instant diabetes prediction based on user-entered features. This study contributes novel insights and techniques to the field of ML-based diabetes prediction, potentially aiding in the early detection and management of diabetes in Saudi Arabia.
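The class-balancing step can be illustrated with a minimal hand-rolled SMOTE. This is a sketch of the standard technique, not the imbalanced-learn implementation the authors may have used: each synthetic sample is a random interpolation between a minority sample and one of its k nearest minority neighbors.

```python
import numpy as np

def smote(X_min, n_new, k=3, rng=None):
    """Minimal SMOTE sketch: synthesize `n_new` minority-class samples.
    For each one, pick a random minority sample, pick one of its k
    nearest minority neighbors, and interpolate between them."""
    rng = rng or np.random.default_rng(0)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        nbrs = np.argsort(d)[1:k + 1]              # skip the sample itself
        j = rng.choice(nbrs)
        lam = rng.random()                         # interpolation factor in [0, 1)
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)
```

Because synthetic points lie on segments between real minority samples, the classifier sees a denser minority region instead of exact duplicates, which is why SMOTE usually beats plain oversampling on imbalanced data.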
A Proposed Technique Using Machine Learning for the Prediction of Diabetes Disease through a Mobile App — Hosam El-Sofany, Samir A. El-Seoud, Omar H. Karam, Yasser M. Abd El-Latif, Islam A. T. F. Taj-Eddin. International Journal of Intelligent Systems, 2024-01-09. doi:10.1155/2024/6688934
Action recognition (AR) has many applications, including surveillance, health and disability care, man-machine interaction, and video-content-based monitoring. Because human action videos contain a large number of frames, implemented models must minimize computation by reducing the number, size, and resolution of frames. We propose an improved method for detecting human actions in small, low-resolution videos by employing convolutional neural networks (CNNs) with channel attention mechanisms (CAMs) and autoencoders (AEs). By enhancing blocks with more representative features, the convolutional layers extract discriminating features from the various networks. Additionally, we randomly sample frames before the main processing to improve accuracy while using less data. The goal is to increase performance while overcoming challenges such as overfitting, computational complexity, and uncertainty by utilizing CNN-CAM and AE; identifying patterns and features associated with selective high-level performance is the next step. To validate the method, low-resolution, small video frames from the UCF50, UCF101, and HMDB51 datasets were used. The algorithm also has relatively low computational complexity. The proposed method performs satisfactorily compared with similar methods, achieving accuracies of 77.29%, 98.87%, and 97.16% on the HMDB51, UCF50, and UCF101 datasets, respectively. These results indicate that the method can effectively classify human actions. Furthermore, the proposed method can serve as a processing model for low-resolution, small video frames.
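The channel attention mechanism referred to above typically follows a squeeze-and-excitation pattern: global-average-pool each channel, pass the pooled vector through a small bottleneck of fully connected layers, and rescale the channels by the resulting sigmoid gates. A NumPy sketch of that pattern, with random (not learned) weights purely for illustration — the paper's exact CAM design is not specified in the abstract:

```python
import numpy as np

def channel_attention(x, W1, W2):
    """Squeeze-and-excitation-style channel attention sketch.
    x: feature map (C, H, W); W1: (C//r, C) and W2: (C, C//r) are the
    bottleneck weights (learned in a real model, assumed here)."""
    z = x.mean(axis=(1, 2))                        # squeeze: global average pool
    s = np.maximum(W1 @ z, 0.0)                    # excitation: FC + ReLU
    a = 1.0 / (1.0 + np.exp(-(W2 @ s)))            # FC + sigmoid -> per-channel gate
    return x * a[:, None, None]                    # rescale each channel
```

Channels whose pooled statistics drive the gate toward 1 pass through nearly unchanged; uninformative channels are suppressed, which is how CAMs emphasize the more representative feature blocks.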
Channel Attention-Based Approach with Autoencoder Network for Human Action Recognition in Low-Resolution Frames — Elaheh Dastbaravardeh, Somayeh Askarpour, Maryam Saberi Anari, Khosro Rezaee. International Journal of Intelligent Systems, 2024-01-04. doi:10.1155/2024/1052344
Federated learning (FL) has shown promise in smart industries as a means of training machine-learning models while preserving privacy. However, relying on the cloud to exchange information with data owners during model training contradicts FL's low-communication-latency requirement. Furthermore, data owners may not be willing to contribute their resources for free. To address this, we propose a single-contract-to-dual-contract approach that incentivizes both model owners and workers to participate in FL-based machine-learning tasks. The single contract incentivizes model owners to contribute their model parameters, and the dual contract incentivizes workers to use their latest data in the training task. Using the latest data brings out the trade-off between data quantity and data update frequency. Performance evaluation shows that our dual contract satisfies different preferences for data quantity and update frequency, and validates that the proposed incentive mechanism is incentive compatible and flexible.
{"title":"Hierarchical Incentive Mechanism for Federated Learning: A Single Contract to Dual Contract Approach for Smart Industries","authors":"Tao Wan, Tiantian Jiang, Weichuan Liao, Nan Jiang","doi":"10.1155/2024/6402026","DOIUrl":"10.1155/2024/6402026","url":null,"abstract":"<p>Federated learning (FL) has shown promise in smart industries as a means of training machine-learning models while preserving privacy. However, relying on the cloud to exchange information with data owners during model training contradicts FL’s low-communication-latency requirement. Furthermore, data owners may not be willing to contribute their resources for free. To address this, we propose a single-contract-to-dual-contract approach that incentivizes both model owners and workers to participate in FL-based machine-learning tasks. The single contract incentivizes model owners to contribute their model parameters, and the dual contract incentivizes workers to use their latest data in the training task. The latest-data requirement draws out the trade-off between data quantity and data update frequency. Performance evaluation shows that our dual contract satisfies different preferences for data quantity and update frequency, and validates that the proposed incentive mechanism is incentive compatible and flexible.</p>","PeriodicalId":14089,"journal":{"name":"International Journal of Intelligent Systems","volume":null,"pages":null},"PeriodicalIF":7.0,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139386250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
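The contract mechanism in the record above rests on self-selection: each worker picks the contract in the menu that maximizes its own utility, and the menu is incentive compatible when each worker type prefers the contract designed for it. A minimal sketch under an assumed linear cost model (the `Contract` fields, cost function, and numeric values are illustrative, not the paper's formulation):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Contract:
    quantity: int      # required data quantity per round
    frequency: int     # required data-update frequency
    reward: float      # payment offered to the worker

def worker_utility(contract: Contract, cost_per_unit: float) -> float:
    # Worker utility = reward minus the cost of supplying data at the
    # required quantity and frequency (linear cost, an assumption).
    effort = contract.quantity * contract.frequency
    return contract.reward - cost_per_unit * effort

def choose_contract(menu: list[Contract], cost_per_unit: float) -> Contract:
    # Each worker self-selects the utility-maximizing item from the menu;
    # incentive compatibility means a type picks the contract meant for it.
    return max(menu, key=lambda c: worker_utility(c, cost_per_unit))
```

With a menu of one high-effort/high-reward and one low-effort/low-reward contract, a low-cost worker selects the former and a high-cost worker the latter, which is exactly the self-selection property the evaluation checks.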
{"title":"A Recommendation Approach Based on Heterogeneous Network and Dynamic Knowledge Graph","authors":"Shanshan Wan, Yuquan Wu, Ying Liu, Linhu Xiao, Maozu Guo","doi":"10.1155/2024/4169402","DOIUrl":"10.1155/2024/4169402","url":null,"abstract":"<p>Besides data sparsity and cold start, recommender systems often face the problems of selection bias and exposure bias. These problems reduce the accuracy of recommendations and easily lead to overrecommendation. This paper proposes a recommendation approach based on a heterogeneous network and a dynamic knowledge graph (HN-DKG). The main steps are (1) determining users’ implicit preferences from their cross-domain and cross-platform behaviors to form multimodal nodes and build a heterogeneous knowledge graph; (2) applying an improved multihead attention mechanism of the graph attention network (GAT) to enhance the relationships among multimodal nodes and construct a dynamic knowledge graph; and (3) leveraging RippleNet to discover users’ layered potential interests and rate candidate items. Mechanisms such as user seed clusters, propagation blocking, and random seeds are designed to obtain more accurate and diverse recommendations. Public datasets are used to evaluate the algorithms, and the experimental results show that the proposed method performs well in both the effectiveness and the diversity of recommendations. On the MovieLens-1M dataset, the proposed model is 18%, 9%, and 2% higher than KGAT on <i>F</i>1, NDCG@10, and AUC and 20%, 2%, and 0.9% higher than RippleNet, respectively. On the Amazon Book dataset, the proposed model is 12%, 3%, and 2.5% higher than NFM on <i>F</i>1, NDCG@10, and AUC and 0.8%, 2.3%, and 0.35% higher than RippleNet, respectively.</p>","PeriodicalId":14089,"journal":{"name":"International Journal of Intelligent Systems","volume":null,"pages":null},"PeriodicalIF":7.0,"publicationDate":"2024-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139536298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
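RippleNet-style propagation, as used in step (3) of the record above, discovers layered potential interests: the hop-<i>h</i> interest set contains entities reachable from the user's seed items in exactly <i>h</i> steps over the knowledge graph. A minimal sketch assuming relations are collapsed into plain edges, propagation blocking is modeled as excluding already-visited entities, and the hop-decay weight is an illustrative parameter (none of this is the paper's exact formulation):

```python
def ripple_sets(kg: dict, seeds: set, hops: int = 2) -> list:
    """Layered interests: layer h holds entities first reached at hop h.
    kg maps a head entity to the set of its tail entities."""
    layers, frontier, seen = [], set(seeds), set(seeds)
    for _ in range(hops):
        nxt = set()
        for entity in frontier:
            nxt |= kg.get(entity, set())
        nxt -= seen                  # "propagation blocking": skip visited nodes
        layers.append(nxt)
        seen |= nxt
        frontier = nxt
    return layers

def score(layers: list, candidate, decay: float = 0.5) -> float:
    """Rate a candidate item by the hop layers it appears in;
    nearer hops (stronger interests) contribute more."""
    return sum(decay ** h for h, layer in enumerate(layers) if candidate in layer)
```

Candidates in hop-1 layers thus outrank those only reachable at deeper hops, which is how layered interests translate into a rating over candidate items.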