Leveraging local and global relationships for corrupted label detection
Pub Date: 2025-01-24 | DOI: 10.1016/j.future.2025.107729
Phong Lam, Ha-Linh Nguyen, Xuan-Truc Dao Dang, Van-Son Tran, Minh-Duc Le, Thu-Trang Nguyen, Son Nguyen, Hieu Dinh Vo
The performance of machine learning and deep learning models heavily depends on the quality and quantity of the training data. However, real-world datasets often contain a considerable percentage of noisy labels, ranging from 8.0% to 38.5%, which can significantly reduce model accuracy. To address the problem of corrupted labels, we propose Cola, a novel data-centric approach that leverages both local neighborhood similarities and global relationships across the entire dataset to detect corrupted labels. The main idea of our approach is that similar instances tend to share the same label, and that the relationships among clean data can be learned and used to distinguish correct from corrupted labels. Our experiments on four well-established image and text datasets demonstrate that Cola consistently outperforms state-of-the-art approaches, achieving improvements of 8% to 21% in F1-score for identifying corrupted labels across various noise types and rates. For visual data, Cola achieves improvements of up to 80% in F1-score, while for textual data the average improvement reaches about 17%, with a maximum of 91%. Furthermore, Cola is significantly more effective and efficient in detecting corrupted labels than advanced large language models such as Llama3, with improvements of up to 112% in Precision and a 300X reduction in execution time.
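To illustrate the local-neighborhood intuition (similar instances tend to share the same label), the sketch below flags labels that disagree with the majority label of their nearest neighbors in feature space. This is a minimal, generic heuristic, not the authors' Cola implementation; the feature matrix, label array, and the choice of k are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def flag_suspect_labels(features, labels, k=10):
    """Flag instances whose label disagrees with the majority label of their
    k nearest neighbors. A generic local-similarity heuristic, not Cola itself."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
    _, idx = nn.kneighbors(features)          # idx[:, 0] is the point itself
    suspects = []
    for i, neigh in enumerate(idx):
        neigh_labels = labels[neigh[1:]]      # drop self
        values, counts = np.unique(neigh_labels, return_counts=True)
        majority = values[np.argmax(counts)]
        if labels[i] != majority:
            suspects.append(i)
    return np.array(suspects)

# toy usage with random data (assumed shapes)
X = np.random.randn(200, 32)
y = np.random.randint(0, 5, size=200)
print(flag_suspect_labels(X, y, k=10)[:10])
```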
{"title":"Leveraging local and global relationships for corrupted label detection","authors":"Phong Lam, Ha-Linh Nguyen, Xuan-Truc Dao Dang, Van-Son Tran, Minh-Duc Le, Thu-Trang Nguyen, Son Nguyen, Hieu Dinh Vo","doi":"10.1016/j.future.2025.107729","DOIUrl":"10.1016/j.future.2025.107729","url":null,"abstract":"<div><div>The performance of the Machine learning and Deep learning models heavily depends on the quality and quantity of the training data. However, real-world datasets often contain a considerable percentage of noisy labels, ranging from 8.0% to 38.5%. This could significantly reduce model accuracy. To address the problem of corrupted labels, we propose <span>Cola</span>, a novel data-centric approach that leverages both <em>local</em> neighborhood similarities and <em>global</em> relationships across the entire dataset to detect corrupted labels. The main idea of our approach is that similar instances tend to share the same label, and the relationship between clean data can be learned and utilized to distinguish the correct and corrupted labels. Our experiments on four well-established datasets of image and text demonstrate that <span>Cola</span> consistently outperforms state-of-the-art approaches, achieving improvements of 8% to 21% in F1-score for identifying corrupted labels across various noise types and rates. For visual data, <span>Cola</span> achieves improvements of up to 80% in F1-score, while for textual data, the average improvement reaches about 17% with a maximum of 91%. Furthermore, <span>Cola</span> is significantly more effective and efficient in detecting corrupted labels than advanced large language models, such as <em>Llama3</em>, with improvements of up to 112% in Precision and a 300X reduction in execution time.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"166 ","pages":"Article 107729"},"PeriodicalIF":6.2,"publicationDate":"2025-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143077829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Smart contract-based public integrity auditing for cloud storage against malicious auditors
Pub Date: 2025-01-22 | DOI: 10.1016/j.future.2025.107709
Hui Tian, Nan Gan, Fang Peng, Hanyu Quan, Chin-Chen Chang, Athanasios V. Vasilakos
Cloud storage, a vital component of cloud computing, faces significant challenges in ensuring data integrity, which hinders its widespread adoption. Public auditing models, which rely on third-party auditors (TPAs), have been developed to address these issues by offloading computation from users. However, maintaining the consistent trustworthiness of TPAs remains a major challenge, especially in preventing dishonest behaviors such as collusion, procrastination, and forgery. In this paper, we propose a novel smart contract-based public integrity auditing scheme for cloud storage that introduces a transparent, non-black-box auditing process. The scheme adopts certificateless authentication, significantly reducing the overhead associated with traditional key management and certificate handling. To mitigate TPA dishonesty, we introduce a blockchain-based challenge generation algorithm and an auditing process preservation mechanism. The challenge generation algorithm ensures fair random sampling by leveraging the blockchain's immutability, reducing the risk of collusion between TPAs and cloud service providers (CSPs). The auditing process preservation mechanism prevents procrastination by recording task completion times and preserving metadata, ensuring full traceability and accountability. We also present a post-auditing validation mechanism that enhances the verifiability of auditing results, comprising two components: auditing computation proof, which verifies the correctness of computationally intensive steps, and auditing process replay, which replays the entire auditing process using the preserved metadata. Finally, we formally prove the security of our scheme and conduct a comprehensive performance comparison with existing solutions. The results demonstrate that our approach offers strong security, reduces computational overhead, and maintains communication overhead comparable to other schemes.
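To show how a blockchain value can seed fair, publicly verifiable sampling of file blocks for a challenge (one common way to limit TPA-CSP collusion), the sketch below derives challenge indices deterministically from a published block hash. The hash value, block count, and sample size are placeholders, and this is not the paper's exact challenge generation algorithm.

```python
import hashlib

def challenge_indices(block_hash_hex, total_blocks, sample_size):
    """Derive a deterministic, publicly verifiable set of challenged block indices
    from an (assumed) blockchain block hash. Anyone can recompute the same set."""
    indices, counter = set(), 0
    while len(indices) < sample_size:
        digest = hashlib.sha256(bytes.fromhex(block_hash_hex)
                                + counter.to_bytes(4, "big")).digest()
        indices.add(int.from_bytes(digest[:8], "big") % total_blocks)
        counter += 1
    return sorted(indices)

# hypothetical block hash and parameters
print(challenge_indices("ab" * 32, total_blocks=10_000, sample_size=5))
```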
{"title":"Smart contract-based public integrity auditing for cloud storage against malicious auditors","authors":"Hui Tian , Nan Gan , Fang Peng , Hanyu Quan , Chin-Chen Chang , Athanasios V. Vasilakos","doi":"10.1016/j.future.2025.107709","DOIUrl":"10.1016/j.future.2025.107709","url":null,"abstract":"<div><div>Cloud storage, a vital component of cloud computing, faces significant challenges in ensuring data integrity, which hinders its widespread adoption. Public auditing models, which rely on third-party auditors (TPAs), have been developed to address these issues by offloading computation from users. However, maintaining the consistent trustworthiness of TPAs remains a major challenge, especially in preventing dishonest behaviors, such as collusion, procrastination, and forgery. In this paper, we propose a novel smart contract-based public integrity auditing scheme for cloud storage, introducing a transparent, non-black-box auditing process. This scheme adopts certificateless authentication, significantly reducing the overhead associated with traditional key management and certificate handling. To mitigate TPA dishonesty, we introduce a blockchain-based challenge generation algorithm and an auditing process preservation mechanism. The challenge algorithm ensures fair random sampling by leveraging blockchain’s immutability, reducing the risk of collusion between TPAs and cloud service providers (CSPs). The auditing process preservation mechanism prevents procrastination by recording task completion times and preserving metadata, ensuring full traceability and accountability. We also present a post-auditing validation mechanism that enhances the verifiability of auditing results, comprising two components: auditing computation proof, which verifies the correctness of computationally intensive steps, and auditing process replay, which replays the entire auditing using preserved metadata. Finally, we formally prove the security of our scheme and conduct a comprehensive performance comparison with existing solutions. The results demonstrate that our approach offers strong security, reduces computational overhead, and maintains comparable communication overhead to other schemes.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"166 ","pages":"Article 107709"},"PeriodicalIF":6.2,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143049873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Generating hard Ising instances with planted solutions using post-quantum cryptographic protocols
Pub Date: 2025-01-22 | DOI: 10.1016/j.future.2025.107721
Salvatore Mandrà, Humberto Munoz-Bauza, Gianni Mossi, Eleanor G. Rieffel
In this paper, we present a novel method to generate hard instances with planted solutions based on the public–private McEliece post-quantum cryptographic protocol. Unlike other planting methods rooted in infinite-size statistical analysis, our cryptographic protocol generates instances which are all hard (in cryptographic terms), with the hardness tuned by the size of the private key, and with a guaranteed unique ground state. More importantly, because of the private–public key protocol, planted solutions cannot easily be recovered by direct inspection of the planted instances without knowledge of the private key used to generate them, making our protocol suitable for testing and evaluating quantum devices without the risk of “backdoors” being exploited.
Using binary hash tree-based encryption to secure a deep learning model and generated images for social media applications
Pub Date: 2025-01-22 | DOI: 10.1016/j.future.2025.107722
Soniya Rohhila, Amit Kumar Singh
Deep learning (DL) plays a vital role in identifying critical features and patterns in digital images. Deep learning models and the records they generate, particularly digital images, are highly effective in media and other applications but pose privacy and security challenges. For example, healthcare professionals must understand how Artificial Intelligence (AI) makes decisions to trust and fully incorporate its findings into medical practice. This research addresses the security and privacy challenges associated with a DL model and its generated records for social media applications. In this work, we propose a binary hash tree-based encryption scheme that encrypts a customised model and its generated images to minimise data leakage. The proposed method consists of three parts. The first is a customised autoencoder that reduces the size of digital images; the generated images are then secured with a Henon chaotic map and ephemeral keys derived from a binary hash tree (BHT), with encryption performed in a Galois field (GF). Further, we encrypt the fewest possible weight parameters of the customised model with the same ephemeral key to preserve privacy. By doing so, our method reduces data leakage and improves model security at the same time. Extensive experiments reveal that the proposed method is more secure against attacks than state-of-the-art methods, which could be helpful in media and several other applications. To the best of our knowledge, we are the first to explore a secure system that protects both the model and the generated media at the same time using an encryption technique.
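As a rough illustration of two ingredients mentioned above (ephemeral keys derived from a binary hash tree and a Henon chaotic map used as a keystream source), the sketch below derives a per-image key from the root of a small hash tree and XOR-encrypts pixel bytes with a Henon-map-driven keystream. It is a toy construction under assumed parameters, not the paper's scheme, and is not intended to be cryptographically secure.

```python
import hashlib
import numpy as np

def bht_root(leaves):
    """Root of a simple binary hash tree over byte-string leaves (pads odd levels)."""
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

def henon_keystream(length, x0, y0, a=1.4, b=0.3):
    """Byte keystream from the Henon map x_{n+1} = 1 - a*x_n^2 + y_n, y_{n+1} = b*x_n."""
    x, y = x0, y0
    out = np.empty(length, dtype=np.uint8)
    for i in range(length):
        x, y = 1 - a * x * x + y, b * x
        out[i] = int(abs(x) * 1e6) % 256
    return out

# ephemeral key from a (toy) hash tree; chaotic-map seeds come from the root
root = bht_root([b"session-nonce", b"model-id", b"image-id"])
x0 = int.from_bytes(root[:4], "big") / 2**32 * 0.2    # keep seeds in the map's stable region
y0 = int.from_bytes(root[4:8], "big") / 2**32 * 0.2
image = np.random.randint(0, 256, size=64 * 64, dtype=np.uint8)   # assumed flattened image
cipher = image ^ henon_keystream(image.size, x0, y0)
assert np.array_equal(cipher ^ henon_keystream(image.size, x0, y0), image)  # decrypts
```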
{"title":"Using binary hash tree-based encryption to secure a deep learning model and generated images for social media applications","authors":"Soniya Rohhila, Amit Kumar Singh","doi":"10.1016/j.future.2025.107722","DOIUrl":"10.1016/j.future.2025.107722","url":null,"abstract":"<div><div>Deep learning (DL) plays a vital role in identifying critical features and patterns in digital images. Deep learning models and generated records, particularly digital images, are highly effective in media and other applications but pose privacy and security challenges. For example, healthcare professionals must understand how Artificial Intelligence (AI) makes decisions to trust and fully incorporate its findings into medical practice. This research addresses the security and privacy challenges associated with a DL model and generated records for social media applications. In this work, we propose a binary hash tree-based encryption that encrypts a customised model and generated images to minimise data leakage. The proposed method includes three parts. First is a customised autoencoder that minimises the size of digital images ensuring the security of generated images with a Henon chaotic map and ephemeral keys derived from a binary hash tree (BHT) for encryption in a Galois field (GF). Further, we encrypt the fewest possible weight parameters of the customised model with the same ephemeral key to preserve privacy. By doing so, our method reduces data leakage and further improves the model security at the same time. Extensive experiments reveal that the proposed method is more secure against attacks than state-of-the-art methods, which could be helpful in media and several other applications. To the best of our knowledge, we are the first to explore a secure system that protects both the model and the generated media at the same time using an encryption technique.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"166 ","pages":"Article 107722"},"PeriodicalIF":6.2,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143077830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Distributed and heterogeneous tensor–vector contraction algorithms for high performance computing
Pub Date: 2025-01-21 | DOI: 10.1016/j.future.2024.107698
Pedro J. Martinez-Ferrer, Albert-Jan Yzelman, Vicenç Beltran
The tensor–vector contraction (TVC) is the most memory-bound operation of its class and a core component of the higher-order power method (HOPM). This paper brings distributed-memory parallelization to a native TVC algorithm for dense tensors that remains oblivious to contraction mode, tensor splitting, and tensor order. We also propose a novel distributed HOPM, namely dHOPM3, that can save up to one order of magnitude of streamed memory and is about twice as costly in terms of data movement as a distributed TVC operation (dTVC) when using task-based parallelization. The numerical experiments carried out in this work on three different architectures featuring multicore processors and accelerators confirm that the performance of dTVC and dHOPM3 remains relatively close to the peak system memory bandwidth (50%–80%, depending on the architecture) and on par with STREAM benchmark figures. In strong scalability scenarios, our native multicore implementations of these two algorithms can achieve similar and sometimes even greater performance than implementations based upon state-of-the-art CUDA batched kernels. Finally, we demonstrate that both computation and communication can benefit from mixed-precision arithmetic, even when the hardware does not support low-precision data types natively.
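For context, a tensor–vector contraction along mode k multiplies a dense tensor by a vector over that mode, producing a tensor of one lower order; the higher-order power method repeats such contractions over all modes to estimate a dominant rank-1 component. The single-node numpy sketch below shows only this mathematical operation, not the distributed, mode-oblivious algorithm proposed in the paper.

```python
import numpy as np

def tvc(tensor, vector, mode):
    """Contract `tensor` with `vector` along `mode`: the result has order N-1."""
    return np.tensordot(tensor, vector, axes=([mode], [0]))

# toy 3rd-order example
T = np.random.rand(30, 40, 50)
v = np.random.rand(40)
Y = tvc(T, v, mode=1)          # shape (30, 50)
print(Y.shape)

# one (simplified) higher-order power-method step: contract along every mode but one,
# then renormalise; HOPM repeats this over all modes until the factors converge
u0, u1, u2 = (np.random.rand(d) for d in T.shape)
new_u0 = tvc(tvc(T, u2, mode=2), u1, mode=1)
new_u0 /= np.linalg.norm(new_u0)
print(new_u0.shape)            # (30,)
```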
{"title":"Distributed and heterogeneous tensor–vector contraction algorithms for high performance computing","authors":"Pedro J. Martinez-Ferrer , Albert-Jan Yzelman , Vicenç Beltran","doi":"10.1016/j.future.2024.107698","DOIUrl":"10.1016/j.future.2024.107698","url":null,"abstract":"<div><div>The tensor–vector contraction (TVC) is the most memory-bound operation of its class and a core component of the higher-order power method (HOPM). This paper brings distributed-memory parallelization to a native TVC algorithm for dense tensors that overall remains oblivious to contraction mode, tensor splitting, and tensor order. Similarly, we propose a novel distributed HOPM, namely dHOPM<span><math><msub><mrow></mrow><mrow><mn>3</mn></mrow></msub></math></span>, that can save up to one order of magnitude of streamed memory and is about twice as costly in terms of data movement as a distributed TVC operation (dTVC) when using task-based parallelization. The numerical experiments carried out in this work on three different architectures featuring multicore and accelerators confirm that the performances of dTVC and dHOPM<span><math><msub><mrow></mrow><mrow><mn>3</mn></mrow></msub></math></span> remain relatively close to the peak system memory bandwidth (50%–80%, depending on the architecture) and on par with STREAM benchmark figures. On strong scalability scenarios, our native multicore implementations of these two algorithms can achieve similar and sometimes even greater performance figures than those based upon state-of-the-art CUDA batched kernels. Finally, we demonstrate that both computation and communication can benefit from mixed precision arithmetic also in cases where the hardware does not support low precision data types natively.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"166 ","pages":"Article 107698"},"PeriodicalIF":6.2,"publicationDate":"2025-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143077832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EC5: Edge–cloud collaborative computing framework with compressive communication
Pub Date: 2025-01-19 | DOI: 10.1016/j.future.2025.107715
Jingwei Tan, Fagui Liu, Bin Wang, Qingbo Wu, C.L. Philip Chen
With an increasing number of deep neural network (DNN)-based applications being deployed at the edge, edge–cloud collaborative computing has emerged as a promising way to alleviate the burden on resource-constrained edge devices through collaborative inference. However, simply offloading part of the DNN to the cloud introduces significant communication overhead during inference. In this paper, we propose EC5, an Edge–Cloud Collaborative Computing framework with Compressive Communication. The compression of the intermediate feature is formulated using information theory and jointly optimized with the DNN through end-to-end multi-task learning. By decomposing DNN parameters into a new space, EC5 enables efficient storage and updating of models across various compression levels. An Adaptive Exit scheme is designed to retain high-confidence inputs on the edge for fast inference, reducing reliance on the cloud. Experimental comparisons with baseline methods show that EC5 significantly conserves network bandwidth and reduces communication instances, with low latency and acceptable accuracy loss, and remains flexible across different communication scenarios.
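The adaptive-exit idea (keep high-confidence inputs on the edge, offload the rest to the cloud) can be illustrated with a simple confidence threshold on the edge model's softmax output, as sketched below. The models, threshold, and compression call are stand-ins, not EC5's actual components.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def infer(x, edge_model, cloud_model, compress, threshold=0.9):
    """Run the edge model first; only offload to the cloud when confidence is low."""
    logits = edge_model(x)                 # edge-side forward pass (assumed callable)
    probs = softmax(logits)
    if probs.max() >= threshold:           # confident: exit at the edge, no communication
        return int(probs.argmax()), "edge"
    payload = compress(logits)             # compress the intermediate feature before sending
    return int(cloud_model(payload).argmax()), "cloud"

# toy usage with stand-in models
edge_model = lambda x: np.array([2.5, 0.1, 0.2])      # pretend edge logits
cloud_model = lambda p: np.array([0.1, 0.8, 0.1])     # pretend cloud output
compress = lambda f: f.astype(np.float16)             # naive "compression"
print(infer(np.zeros(8), edge_model, cloud_model, compress))
```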
{"title":"EC5: Edge–cloud collaborative computing framework with compressive communication","authors":"Jingwei Tan , Fagui Liu , Bin Wang , Qingbo Wu , C.L. Philip Chen","doi":"10.1016/j.future.2025.107715","DOIUrl":"10.1016/j.future.2025.107715","url":null,"abstract":"<div><div>With an increasing number of deep neural network (DNN)-based applications being deployed at the edges, edge–cloud collaborative computing has emerged as a promising solution to alleviate the burden on resource-constrained edges by collaborative inference. However, simply offloading part of DNN to the cloud introduces significant communication overhead during inference. In this paper, we propose EC5, an Edge–Cloud Collaborative Computing framework with Compressive Communication. The compression of the intermediate feature is formulated using information theory and jointly optimized with the DNN through end-to-end multi-task learning. By decomposing DNN parameters into a new space, EC5 enables efficient storage and update of models across various compression levels. An Adaptive Exit scheme is designed to retain high-confidence inputs on the edge for fast inference, reducing reliance on the cloud. Experimental comparisons with baseline methods prove that EC5 significantly conserves network bandwidth and reduces communication instances, with low latency and acceptable accuracy loss, showing flexibility across different communication scenarios.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"166 ","pages":"Article 107715"},"PeriodicalIF":6.2,"publicationDate":"2025-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143049875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ENNigma: A framework for Private Neural Networks
Pub Date: 2025-01-18 | DOI: 10.1016/j.future.2025.107719
Pedro Barbosa, Ivone Amorim, Eva Maia, Isabel Praça
The widespread use of the Internet and digital services has significantly increased data collection and processing. Critical domains like healthcare rely on this data, but privacy and security concerns limit its usability, constraining the performance of intelligent systems, particularly those leveraging Neural Networks (NNs). NNs require high-quality data for optimal performance, but existing privacy-preserving methods, such as Federated Learning and Differential Privacy, often degrade model accuracy. While Homomorphic Encryption (HE) has emerged as a promising alternative, existing HE-based methods face challenges in computational efficiency and scalability, limiting their real-world application.
To address these issues, we introduce ENNigma, a novel framework employing state-of-the-art Fully Homomorphic Encryption (FHE) techniques. The framework introduces optimizations that significantly improve the speed and accuracy of encrypted NN operations. Experiments conducted on the CIC-DDoS2019 dataset, a benchmark for Distributed Denial of Service attack detection, demonstrate ENNigma’s effectiveness: classification performance stays within a maximum relative error of 1.01% of non-private models, while multiplication time is reduced by up to 59% compared to existing FHE-based approaches. These results highlight ENNigma’s potential for practical, privacy-preserving neural network applications.
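One standard trick in HE-based neural networks (since comparisons are unavailable under encryption) is to replace non-polynomial activations with low-degree polynomial approximations that an FHE scheme can evaluate using only additions and multiplications. The sketch below fits and evaluates such an approximation in plaintext; whether ENNigma uses this specific technique is an assumption, and no FHE library is invoked here.

```python
import numpy as np

# Fit a degree-3 polynomial approximation of ReLU on an assumed input range [-5, 5].
xs = np.linspace(-5, 5, 1001)
coeffs = np.polyfit(xs, np.maximum(xs, 0.0), deg=3)

def poly_relu(x):
    """Polynomial stand-in for ReLU: only additions and multiplications,
    hence evaluable homomorphically on CKKS-style ciphertexts."""
    return np.polyval(coeffs, x)

# Relative error of the approximation over the fitting range
approx = poly_relu(xs)
exact = np.maximum(xs, 0.0)
max_rel_err = np.max(np.abs(approx - exact)) / np.max(np.abs(exact))
print(f"max relative error over [-5, 5]: {max_rel_err:.3f}")
```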
{"title":"ENNigma: A framework for Private Neural Networks","authors":"Pedro Barbosa , Ivone Amorim , Eva Maia , Isabel Praça","doi":"10.1016/j.future.2025.107719","DOIUrl":"10.1016/j.future.2025.107719","url":null,"abstract":"<div><div>The widespread use of the Internet and digital services has significantly increased data collection and processing. Critical domains like healthcare rely on this data, but privacy and security concerns limit its usability, constraining the performance of intelligent systems, particularly those leveraging Neural Networks (NNs). NNs require high-quality data for optimal performance, but existing privacy-preserving methods, such as Federated Learning and Differential Privacy, often degrade model accuracy. While Homomorphic Encryption (HE) has emerged as a promising alternative, existing HE-based methods face challenges in computational efficiency and scalability, limiting their real-world application.</div><div>To address these issues, we introduce ENNigma, a novel framework employing state-of-the-art Fully Homomorphic Encryption techniques. This framework introduces optimizations that significantly improve the speed and accuracy of encrypted NN operations. Experiments conducted using the CIC-DDoS2019 dataset — a benchmark for Distributed Denial of Service attack detection — demonstrate ENNigma’s effectiveness. A classification performance with a maximum relative error of 1.01% was achieved compared to non-private models, while reducing multiplication time by up to 59% compared to existing FHE-based approaches. These results highlight ENNigma’s potential for practical, privacy-preserving neural network applications.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"166 ","pages":"Article 107719"},"PeriodicalIF":6.2,"publicationDate":"2025-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143077833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
WFE-Tab: Overcoming limitations of TabPFN in IIoT-MEC environments with a weighted fusion ensemble-TabPFN model for improved IDS performance
Pub Date: 2025-01-18 | DOI: 10.1016/j.future.2025.107707
Sergio Ruiz-Villafranca, José Roldán-Gómez, Javier Carrillo-Mondéjar, José Luis Martinez, Carlos H. Gañán
In recent years, we have seen the emergence of new industrial paradigms such as Industry 4.0/5.0 and the Industrial Internet of Things (IIoT). As the use of these paradigms continues to grow, so does the number of threats and exploits they face, which makes the IIoT a desirable target for cybercriminals. Furthermore, IIoT devices possess inherent limitations, primarily due to their constrained resources; as a result, it is often impossible to detect attacks using solutions designed for other environments. Recently, Intrusion Detection Systems (IDS) based on Machine Learning (ML) have emerged as a solution that takes advantage of the large amount of data generated by IIoT devices to achieve good detection performance, and the inclusion of the Multi-Access Edge Computing (MEC) paradigm in these environments provides the computational resources needed to deploy IDS effectively. TabPFN has also been considered an attractive option for solving classification problems without the need to preprocess the data. However, TabPFN has certain drawbacks regarding the number of training samples and the maximum number of classes it can handle, which makes it unsuitable when a dataset exceeds either of these limits. To overcome these limitations, this paper presents a Weighted Fusion-Ensemble-based TabPFN (WFE-Tab) model to improve IDS performance in IIoT-MEC scenarios. The study employs a novel weighted fusion method that preprocesses the data into multiple subsets, generating a family of TabPFN models that are combined in an ensemble. The resulting WFE-Tab pipeline comprises four stages: data collection, data preprocessing, model training, and model evaluation. The performance of WFE-Tab is evaluated using key metrics such as Accuracy, Precision, Recall, and F1-Score, and validated on the Edge-IIoTset public dataset. Its performance is then compared with baseline and modern methods to evaluate its effectiveness, achieving an F1-Score of 99.81%.
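A generic way to work around per-model limits on training samples and classes is to partition the data, fit one model per partition, and fuse the predicted probabilities with weights (for example, per-model validation accuracy). The sketch below shows that pattern with scikit-learn classifiers standing in for TabPFN; the partitioning and weighting rule are assumptions for illustration, not the exact WFE-Tab procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def weighted_fusion_fit(X, y, n_subsets=4, make_model=lambda: LogisticRegression(max_iter=1000)):
    """Fit one model per data subset and weight it by its held-out accuracy."""
    models, weights = [], []
    for Xs, ys in zip(np.array_split(X, n_subsets), np.array_split(y, n_subsets)):
        Xtr, Xval, ytr, yval = train_test_split(Xs, ys, test_size=0.2, random_state=0)
        m = make_model().fit(Xtr, ytr)
        models.append(m)
        weights.append(m.score(Xval, yval))          # validation accuracy as fusion weight
    return models, np.array(weights) / np.sum(weights)

def weighted_fusion_predict(models, weights, X):
    """Weighted average of per-model class probabilities."""
    probs = sum(w * m.predict_proba(X) for m, w in zip(models, weights))
    return probs.argmax(axis=1)

# toy usage with random data (a TabPFN classifier would replace LogisticRegression here)
X = np.random.randn(800, 10)
y = (X[:, 0] + X[:, 1] > 0).astype(int)
models, weights = weighted_fusion_fit(X, y)
print(weighted_fusion_predict(models, weights, X[:5]))
```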
{"title":"WFE-Tab: Overcoming limitations of TabPFN in IIoT-MEC environments with a weighted fusion ensemble-TabPFN model for improved IDS performance","authors":"Sergio Ruiz-Villafranca , José Roldán-Gómez , Javier Carrillo-Mondéjar , José Luis Martinez , Carlos H. Gañán","doi":"10.1016/j.future.2025.107707","DOIUrl":"10.1016/j.future.2025.107707","url":null,"abstract":"<div><div>In recent years we have seen the emergence of new industrial paradigms such as Industry 4.0/5.0 or the Industrial Internet of Things (IIoT). As the use of these new paradigms continues to grow, so do the number of threats and exploits that they face, which makes the IIoT a desirable target for cybercriminals. Furthermore, IIoT devices possess inherent limitations, primarily due to their limited resources. As a result, it is often impossible to detect attacks using solutions designed for other environments. Recently, Intrusion Detection Systems (IDS) based on Machine Learning (ML) have emerged as a solution that takes advantage of the large amount of data generated by IIoT devices to implement their functionality and achieve good performance, and the inclusion of the Multi-Access Edge Computing (MEC) paradigm in these environments provides the necessary computational resources to deploy IDS effectively. Furthermore, TabPFN has been considered as an attractive option for solving classification problems without the need to reprocess the data. However, TabPFN has certain drawbacks when it comes to the number of training samples and the maximum number of different classes that the model is capable of classifying. This makes TabPFN unsuitable for use when the dataset exceeds one of these limitations. In order to overcome such limitations, this paper presents a Weighted Fusion-Ensemble-based TabPFN (WFE-Tab) model to improve IDS performance in IIoT-MEC scenarios. The presented study employs a novel weighted fusion method to preprocess data into multiple subsets, generating different ensemble family TabPFN models. The resulting WFE-Tab model comprises four stages: data collection, data preprocessing, model training, and model evaluation. The performance of the WFE-Tab method is evaluated using key metrics such as Accuracy, Precision, Recall, and F1-Score, and validated using the Edge-IIoTset public dataset. The performance of the method is then compared with baseline and modern methods to evaluate its effectiveness, achieving an F1-Score performance of 99.81%.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"166 ","pages":"Article 107707"},"PeriodicalIF":6.2,"publicationDate":"2025-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143049874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fuzzy energy management strategies for energy harvesting IoT nodes based on a digital twin concept
Pub Date: 2025-01-17 | DOI: 10.1016/j.future.2025.107717
Michal Prauzek, Karolina Gaiova, Tereza Kucova, Jaromir Konecny
This study presents a cloud-assisted energy management strategy for energy harvesting Internet-of-Things (IoT) nodes, using a novel digital twin (DT) concept for dynamic optimization of IoT node behavior. The system is built upon a fuzzy-rule-based controller optimized through a differential evolution (DE) algorithm. DE is particularly well-suited for this application, as it is capable of optimizing the controller without requiring gradient information, allowing it to efficiently navigate the complex, nonlinear characteristics of IoT energy management problems. The optimization process tunes nine key fuzzy input coefficients to create an energy-efficient control strategy. The DT concept serves as a virtual replica of the physical IoT node, continuously synchronizing real-time data from sensors and other internal parameters, including energy harvesting rates and component health. Through this real-time feedback loop, the DT enables predictive adjustments to the control system, increasing the longevity and reliability of the IoT devices in harsh and changing environments. Compared to traditional energy management strategies, the proposed method improves energy utilization by 11%, leveraging four years of solar data collected from multiple geographical locations. Moreover, the system achieves a 12% increase in successful transmissions, ensuring greater data availability in the cloud while minimizing device failures and optimizing the use of available energy. The DT concept allows the system to simulate and predict IoT node behavior under various conditions, continuously refining the energy management strategy. This ensures not only optimal energy efficiency but also accounts for component degradation over time, offering long-term adaptability and minimizing the need for manual intervention. Thus, the synergy between the DT concept and DE optimization offers a powerful, scalable solution for managing energy-constrained IoT networks, surpassing conventional expert-designed strategies in both adaptability and performance.
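To make the optimization loop concrete, the sketch below uses scipy's differential evolution to tune nine coefficients of a toy fuzzy-style controller against a simple stored-energy simulation. The controller form, the simulation, and the objective are illustrative assumptions; only the overall pattern (DE tuning nine input coefficients without gradient information) mirrors the paper.

```python
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(1)
harvest = np.clip(rng.normal(0.5, 0.3, size=365), 0, None)   # assumed daily harvested energy

def simulate(coeffs):
    """Toy node simulation: nine coefficients shape a piecewise-linear (fuzzy-like) policy
    mapping (state of charge, recent harvest, time of year) to a duty cycle."""
    w = np.asarray(coeffs).reshape(3, 3)
    soc, utility = 0.5, 0.0
    for day, e in enumerate(harvest):
        inputs = np.array([soc, e, np.sin(2 * np.pi * day / 365)])
        duty = float(np.clip(inputs @ w @ np.ones(3) / 3, 0.05, 1.0))
        soc = np.clip(soc + e - duty * 0.8, 0.0, 1.0)      # charge minus consumption
        utility += duty if soc > 0.05 else -1.0             # reward work, punish brown-outs
    return -utility                                          # DE minimises

bounds = [(-1.0, 1.0)] * 9                                   # nine tunable coefficients
result = differential_evolution(simulate, bounds, maxiter=30, seed=0, polish=False)
print(result.x.round(2), -result.fun)
```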
{"title":"Fuzzy energy management strategies for energy harvesting IoT nodes based on a digital twin concept","authors":"Michal Prauzek, Karolina Gaiova, Tereza Kucova, Jaromir Konecny","doi":"10.1016/j.future.2025.107717","DOIUrl":"10.1016/j.future.2025.107717","url":null,"abstract":"<div><div>This study presents a cloud-assisted energy management strategy for energy harvesting Internet-of-Things (IoT) nodes, using a novel digital twin (DT) concept for dynamic optimization of IoT node behavior. The system is built upon a fuzzy-rule-based controller optimized through a differential evolution (DE) algorithm. DE is particularly well-suited for this application, as it is capable of optimizing the controller without requiring gradient information, allowing it to efficiently navigate the complex, nonlinear characteristics of IoT energy management problems. The optimization process tunes nine key fuzzy input coefficients to create an energy-efficient control strategy. The DT concept serves as a virtual replica of the physical IoT node, continuously synchronizing real-time data from sensors and other internal parameters, including energy harvesting rates and component health. Through this real-time feedback loop, the DT enables predictive adjustments to the control system, increasing the longevity and reliability of the IoT devices in harsh and changing environments. Compared to traditional energy management strategies, the proposed method improves energy utilization by 11%, leveraging four years of solar data collected from multiple geographical locations. Moreover, the system achieves a 12% increase in successful transmissions, ensuring greater data availability in the cloud while minimizing device failures and optimizing the use of available energy. The DT concept allows the system to simulate and predict IoT node behavior under various conditions, continuously refining the energy management strategy. This ensures not only optimal energy efficiency but also accounts for component degradation over time, offering long-term adaptability and minimizing the need for manual intervention. Thus, the synergy between the DT concept and DE optimization offers a powerful, scalable solution for managing energy-constrained IoT networks, surpassing conventional expert-designed strategies in both adaptability and performance.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"166 ","pages":"Article 107717"},"PeriodicalIF":6.2,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143049880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
QuantuneV2: Compiler-based local metric-driven mixed precision quantization for practical embedded AI applications
Pub Date: 2025-01-17 | DOI: 10.1016/j.future.2025.107718
Jeongseok Kim, Jemin Lee, Yongin Kwon, Daeyoung Kim
Mixed-precision quantization methods have been proposed to reduce model size while minimizing accuracy degradation. However, existing studies require retraining and do not consider the computational overhead and intermediate representations (IR) generated during the compilation process, limiting their application at the compiler level. This computational overhead refers to the runtime latency caused by frequent quantization and de-quantization operations during inference; performing these operations at the individual operator level causes significant runtime delays. To address these issues, we propose QuantuneV2, a compiler-based mixed-precision quantization method designed for practical embedded AI applications. QuantuneV2 performs inference only twice, once before quantization and once after quantization, and operates with a computational complexity of O(n) that increases linearly with the number of model parameters. We also make the sensitivity analysis more stable by using local metrics such as weights, activation values, the Signal-to-Quantization-Noise Ratio (SQNR), and the Mean Squared Error (MSE), and reduce computational overhead by choosing the best IR and applying operator fusion. Experimental results show that QuantuneV2 achieved up to a 10.28% improvement in accuracy and a 12.52% increase in speed compared to existing methods across five models: ResNet18v1, ResNet50v1, SqueezeNetv1, VGGNet, and MobileNetv2. This demonstrates that QuantuneV2 enhances model performance while maintaining computational efficiency, making it suitable for deployment in embedded AI environments.
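As background for the local-metric idea, the sketch below computes per-tensor SQNR after simulated int8 quantization and keeps lower-SQNR (more sensitive) tensors in higher precision. The threshold and the fake-quantization routine are assumptions for illustration; QuantuneV2's actual sensitivity analysis and IR selection happen at the compiler level and are more involved.

```python
import numpy as np

def fake_quant_int8(x):
    """Symmetric per-tensor int8 fake quantization (quantize, then dequantize)."""
    scale = np.max(np.abs(x)) / 127.0
    if scale == 0.0:
        scale = 1.0
    return np.clip(np.round(x / scale), -128, 127) * scale

def sqnr_db(x, x_q):
    """Signal-to-Quantization-Noise Ratio in dB."""
    noise = np.sum((x - x_q) ** 2) + 1e-12
    return 10.0 * np.log10(np.sum(x ** 2) / noise)

def assign_precisions(layer_weights, threshold_db=30.0):
    """Keep int8 where the tensor survives quantization well; fall back to fp16 otherwise."""
    return {name: ("int8" if sqnr_db(w, fake_quant_int8(w)) >= threshold_db else "fp16")
            for name, w in layer_weights.items()}

# toy layers: a well-spread tensor quantizes well, an outlier-heavy one does not
layers = {
    "conv1": np.random.randn(64, 3, 3, 3),
    "fc":    np.concatenate([np.random.randn(1000), np.array([50.0, -60.0])]),
}
print(assign_precisions(layers))
```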
{"title":"QuantuneV2: Compiler-based local metric-driven mixed precision quantization for practical embedded AI applications","authors":"Jeongseok Kim , Jemin Lee , Yongin Kwon , Daeyoung Kim","doi":"10.1016/j.future.2025.107718","DOIUrl":"10.1016/j.future.2025.107718","url":null,"abstract":"<div><div>Mixed-precision quantization methods have been proposed to reduce model size while minimizing accuracy degradation. However, existing studies require retraining and do not consider the computational overhead and intermediate representations (IR) generated during the compilation process, limiting their application at the compiler level. This computational overhead refers to the runtime latency caused by frequent quantization and de-quantization operations during inference. Performing these operations at the individual operator level causes significant runtime delays. To address these issues, we propose <span>QuantuneV2</span>, a compiler-based mixed-precision quantization method designed for practical embedded AI applications. <span>QuantuneV2</span> performs inference only twice – once before quantization and once after quantization – and operates with a computational complexity off <span><math><mrow><mi>O</mi><mrow><mo>(</mo><mi>n</mi><mo>)</mo></mrow></mrow></math></span> that increases linearly with the number of model parameters. We also made the sensitivity analysis more stable by using local metrics like weights, activation values, the Signal-to-Quantization-Noise Ratio (SQNR), and the Mean Squared Error (MSE). We also cut down on computational overhead by choosing the best IR and using operator fusion. Experimental results show that <span>QuantuneV2</span> achieved up to a 10.28% improvement in accuracy and a 12.52% increase in speed compared to existing methods across five models: ResNet18v1, ResNet50v1, SqueezeNetv1, VGGNet, and MobileNetv2. This demonstrates that <span>QuantuneV2</span> enhances model performance while maintaining computational efficiency, making it suitable for deployment in embedded AI environments.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"166 ","pages":"Article 107718"},"PeriodicalIF":6.2,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143049879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}