Pub Date: 2024-08-30, DOI: 10.1016/j.jksuci.2024.102178
Aytuğ Onan, Hesham A. Alhumyani
In the age of information overload, the ability to distill essential content from extensive texts is invaluable. DeepExtract introduces an advanced framework for extractive summarization, utilizing the groundbreaking capabilities of GPT-4 along with innovative hierarchical positional encoding to redefine information extraction. This manuscript details the development of DeepExtract, which integrates semantic-driven techniques to analyze and summarize complex documents effectively. The framework is structured around a novel hierarchical tree construction that categorizes sentences and sections not just by their physical placement within a text, but by their contextual and thematic significance, leveraging dynamic embeddings generated by GPT-4. We introduce a multi-faceted scoring system that evaluates sentences based on coherence, relevance, and novelty, ensuring that summaries are not only concise but rich with essential content. Further, DeepExtract employs optimized semantic clustering to group thematic elements, which enhances the representativeness of the summaries. This paper demonstrates through comprehensive evaluations that DeepExtract significantly outperforms existing extractive summarization models in terms of accuracy and efficiency, making it a potent tool for academic, professional, and general use. We conclude with a discussion on the practical applications of DeepExtract in various domains, highlighting its adaptability and potential in navigating the vast expanses of digital text.
Title: DeepExtract: Semantic-driven extractive text summarization framework using LLMs and hierarchical positional encoding. Journal of King Saud University-Computer and Information Sciences, vol. 36, no. 8, Article 102178. PDF: https://www.sciencedirect.com/science/article/pii/S1319157824002672/pdfft?md5=ee7790d3716e8b2a6454863f15695239&pid=1-s2.0-S1319157824002672-main.pdf
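The DeepExtract abstract above mentions a scoring system based on coherence, relevance, and novelty but does not give its formulas. The following is a minimal sketch of one way such a score could be combined, under stated assumptions: sentence embeddings are plain lists of floats (standing in for the GPT-4 embeddings), relevance is similarity to the document centroid, coherence is similarity to the preceding sentence, novelty is one minus the highest similarity to an already selected sentence, and the weights are hypothetical. None of these choices come from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def select_sentences(embeddings, k, weights=(0.4, 0.4, 0.2)):
    """Greedily pick up to k sentence indices by a weighted score.

    Assumed proxies (not taken from the paper):
      relevance = similarity to the document centroid
      coherence = similarity to the previous sentence in the document
      novelty   = 1 - max similarity to any already selected sentence
    """
    n = len(embeddings)
    dim = len(embeddings[0])
    centroid = [sum(e[d] for e in embeddings) / n for d in range(dim)]
    w_rel, w_coh, w_nov = weights
    selected = []
    while len(selected) < min(k, n):
        best_index, best_score = None, -math.inf
        for i in range(n):
            if i in selected:
                continue
            relevance = cosine(embeddings[i], centroid)
            coherence = cosine(embeddings[i], embeddings[i - 1]) if i > 0 else 0.0
            redundancy = max(
                (cosine(embeddings[i], embeddings[j]) for j in selected),
                default=0.0,
            )
            score = w_rel * relevance + w_coh * coherence + w_nov * (1.0 - redundancy)
            if score > best_score:
                best_index, best_score = i, score
        selected.append(best_index)
    # Return indices in document order so the extract reads naturally.
    return sorted(selected)
```

Because novelty is recomputed after every pick, a sentence that duplicates an already selected one loses its novelty term, which is what pushes the selection towards covering different themes.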
Pub Date: 2024-08-30, DOI: 10.1016/j.jksuci.2024.102179
Jian Ge, Qin Qin, Shaojing Song, Jinhua Jiang, Zhiwei Shen
In industrial detection scenarios, achieving high accuracy typically relies on extensive labeled datasets, which are costly and time-consuming to produce. This has motivated a shift towards semi-supervised learning (SSL), which leverages labeled and unlabeled data to improve learning efficiency and reduce annotation costs. This work proposes the unsupervised spectral clustering labeling (USCL) method to optimize SSL for industrial challenges like defect variability, rarity, and complex distributions. Integral to USCL, we employ the multi-task fusion self-supervised learning (MTSL) method to extract robust feature representations through multiple self-supervised tasks. Additionally, we introduce the Enhanced Spectral Clustering (ESC) method and a dynamic selecting function (DSF). ESC integrates both local and global similarity matrices, improving clustering accuracy. The DSF selects the most valuable instances for labeling, significantly enhancing the representativeness and diversity of the labeled data. USCL consistently improves various SSL methods compared to traditional instance selection methods. For example, it boosts Efficient Teacher by 5%, 6.6%, and 7.8% in mean Average Precision (mAP) on the Automotive Sealing Rings Defect Dataset, the Metallic Surface Defect Dataset, and the Printed Circuit Boards (PCB) Defect Dataset, respectively, with 10% labeled data. Our work sets a new benchmark for SSL in industrial settings.
Title: Unsupervised selective labeling for semi-supervised industrial defect detection. Journal of King Saud University-Computer and Information Sciences, vol. 36, no. 8, Article 102179. PDF: https://www.sciencedirect.com/science/article/pii/S1319157824002684/pdfft?md5=2e9ae7d3bfac3922191cefd8f900c5a6&pid=1-s2.0-S1319157824002684-main.pdf
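The USCL abstract above says that ESC integrates local and global similarity matrices, without stating how. As a rough illustration only, the sketch below blends a dense RBF similarity (global) with a k-nearest-neighbour sparsified copy of it (local). The blend weight `alpha`, the RBF kernel, and the function names are all assumptions, not the authors' formulation.

```python
import math

def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) similarity between two feature vectors."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)

def fused_similarity(points, k=1, alpha=0.5, gamma=1.0):
    """Blend a k-NN (local) and a dense (global) similarity matrix.

    alpha is a hypothetical mixing weight: 1.0 keeps only the local
    graph, 0.0 keeps only the global one.
    """
    n = len(points)
    global_sim = [[rbf(points[i], points[j], gamma) for j in range(n)]
                  for i in range(n)]
    local_sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Keep only the k most similar neighbours of point i.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: global_sim[i][j],
            reverse=True,
        )[:k]
        for j in neighbours:
            # Write both directions so the local graph stays symmetric.
            local_sim[i][j] = global_sim[i][j]
            local_sim[j][i] = global_sim[i][j]
    return [[alpha * local_sim[i][j] + (1 - alpha) * global_sim[i][j]
             for j in range(n)] for i in range(n)]
```

The fused matrix could then be handed to any standard spectral clustering routine as a precomputed affinity; that step is omitted here.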
Pub Date: 2024-08-29, DOI: 10.1016/j.jksuci.2024.102180
Riya Kalra, Tinku Singh, Suryanshi Mishra, Satakshi, Naveen Kumar, Taehong Kim, Manish Kumar
The stock market’s volatility, noise, and information overload necessitate efficient prediction methods. Forecasting index prices in this environment is complex due to the non-linear and non-stationary nature of time series data generated from the stock market. Machine learning and deep learning have emerged as powerful tools for identifying financial data patterns and generating predictions based on historical trends. However, updating these models in real-time is crucial for accurate predictions. Deep learning models require extensive computational resources and careful hyperparameter optimization, while incremental learning models struggle to balance stability and adaptability. This paper proposes a novel hybrid bidirectional-LSTM (H.BLSTM) model that combines incremental learning and deep learning techniques for real-time index price prediction, addressing these scalability and memory challenges. The method utilizes both univariate time series derived from historical index prices and multivariate time series incorporating technical indicators. Implementation within a real-time trading system demonstrates the method’s effectiveness in achieving more accurate price forecasts for major stock indices globally through extensive experimentation. The proposed model achieved an average mean absolute percentage error of 0.001 across nine stock indices, significantly outperforming traditional models. It has an average forecasting delay of 2 s, making it suitable for real-time trading applications.
Title: An efficient hybrid approach for forecasting real-time stock market indices. Journal of King Saud University-Computer and Information Sciences, vol. 36, no. 8, Article 102180. PDF: https://www.sciencedirect.com/science/article/pii/S1319157824002696/pdfft?md5=990fa1b67fa197073ed336d80589c08c&pid=1-s2.0-S1319157824002696-main.pdf
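The H.BLSTM abstract above refers to univariate series built from historical index prices and multivariate series that add technical indicators. A common way to feed either kind of series to a recurrent model is to slice it into fixed-length look-back windows. The sketch below shows that preprocessing step only; the function name, the `lookback` parameter, and the `target` selector are illustrative assumptions, and the paper's actual pipeline may differ.

```python
def make_windows(series, lookback, target=lambda row: row):
    """Slice a time series into (input window, next-step target) pairs.

    series   : list of scalars (univariate) or list of feature rows (multivariate)
    lookback : number of past steps in each input window
    target   : picks the value to predict from the step after the window
    """
    samples = []
    for start in range(len(series) - lookback):
        window = series[start:start + lookback]
        next_step = series[start + lookback]
        samples.append((window, target(next_step)))
    return samples

# Univariate: closing prices only.
closes = [100.0, 101.0, 103.0, 102.0]
univariate = make_windows(closes, lookback=2)

# Multivariate: (close, indicator) rows; predict the next close.
rows = [(100.0, 0.5), (101.0, 0.6), (103.0, 0.7), (102.0, 0.4)]
multivariate = make_windows(rows, lookback=2, target=lambda row: row[0])
```

In an incremental setting, each newly arrived price would add exactly one window to the end of this list, so only the newest samples need to be passed to the model update.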
Pub Date: 2024-08-28, DOI: 10.1016/j.jksuci.2024.102169
Jun Lu, Jiaxin Zhang, Dezhi An, Dawei Hao, Xiaokai Ren, Ruoyu Zhao
In the big data era, traditional image encryption algorithms take increasingly long to process the huge amount of data involved. Time cost needs to be reduced without compromising the security of the encryption algorithm. With this in mind, the paper proposes a low-time-consumption image encryption (LTC-IE) scheme combining a 2D parametric Pascal matrix chaotic system (2D-PPMCS) and elementary operations. First, the 2D-PPMCS, which exhibits robust and complex chaotic behavior, is adopted. Second, SHA-256 hash values are applied to the chaotic sequences generated by the 2D-PPMCS, which are then processed and used for image permutation and diffusion encryption. In the permutation stage, the pixel matrix is permuted according to a permutation matrix generated from the chaotic sequences. For diffusion encryption, the model is built from elementary operations such as exclusive or, modulo, and arithmetic operations (addition, subtraction, multiplication, and division). Security experiments indicate that the LTC-IE algorithm ensures security and robustness while reducing time cost.
Title: A low-time-consumption image encryption combining 2D parametric Pascal matrix chaotic system and elementary operation. Journal of King Saud University-Computer and Information Sciences, vol. 36, no. 8, Article 102169. PDF: https://www.sciencedirect.com/science/article/pii/S1319157824002581/pdfft?md5=db7fa2d27baba2dde9365c9407528c9f&pid=1-s2.0-S1319157824002581-main.pdf
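The LTC-IE abstract above outlines a permutation stage followed by a diffusion stage built from XOR, modulo, and arithmetic operations. The toy sketch below illustrates that two-stage structure only. It is not the paper's algorithm: a 1D logistic map stands in for the 2D-PPMCS, the SHA-256 digest of the key is used merely to seed that map, and the diffusion rule is an assumed example. It is not suitable for protecting real data.

```python
import hashlib

def _schedule(key: bytes, n: int):
    """Derive a permutation and a byte keystream for n pixels (toy construction)."""
    digest = hashlib.sha256(key).digest()
    # Map 48 digest bits to a seed strictly inside (0, 1).
    x = (int.from_bytes(digest[:6], "big") + 1) / (2 ** 48 + 2)
    values = []
    for _ in range(2 * n):
        x = 3.99 * x * (1.0 - x)  # logistic map, a stand-in for 2D-PPMCS
        values.append(x)
    # Sorting the first n chaotic values yields a pixel permutation.
    permutation = sorted(range(n), key=lambda i: values[i])
    keystream = [int(v * 256) % 256 for v in values[n:]]
    return permutation, keystream

def encrypt(pixels, key: bytes):
    """Permute the pixels, then diffuse with addition, modulo, and XOR chaining."""
    permutation, keystream = _schedule(key, len(pixels))
    shuffled = [pixels[src] for src in permutation]
    cipher, previous = [], 0
    for value, k in zip(shuffled, keystream):
        c = ((value + k) % 256) ^ previous
        cipher.append(c)
        previous = c
    return cipher

def decrypt(cipher, key: bytes):
    """Invert the diffusion, then undo the permutation."""
    permutation, keystream = _schedule(key, len(cipher))
    shuffled, previous = [], 0
    for c, k in zip(cipher, keystream):
        shuffled.append(((c ^ previous) - k) % 256)
        previous = c
    pixels = [0] * len(cipher)
    for position, src in enumerate(permutation):
        pixels[src] = shuffled[position]
    return pixels
```

Chaining each cipher byte into the next XOR means a change in one pixel propagates to every later cipher byte, which is the usual purpose of a diffusion stage.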
Pub Date: 2024-08-28, DOI: 10.1016/j.jksuci.2024.102166
Xinying Yu, Kejun Zhang, Zhufeng Suo, Jun Wang, Wenbin Wang, Bing Zou
Biometric recognition is widely used for user security authentication in the Industrial Internet of Things (IIoT). However, the potential leakage of biometric data has severe repercussions, such as identity theft or tracking. Existing authentication schemes primarily focus on protecting biometric templates but often overlook the “one-authentication multiple-access” mode. As a result, these schemes still face privacy leakage and low efficiency for users who frequently access the server. In this regard, this paper proposes an efficient authentication scheme syncretizing a physical unclonable function (PUF) and revocable biometrics in IIoT. Specifically, we design a revocable biometric template generation method that fuses the user’s biometric data with the device’s PUF to enhance the security and revocability of the dual identity information. Using the generated revocable biometric template and secret sharing, our scheme implements secure authentication and key negotiation between users and servers. Additionally, we establish an access boundary and an authentication validity period to permit multiple accesses following one authentication, thus significantly decreasing the computational cost of the user-side device. We leverage BAN logic and the ROR model to prove our scheme’s security. Informal security analysis and performance comparison demonstrate that our scheme satisfies more security features with higher authentication efficiency.
Title: An efficient authentication scheme syncretizing physical unclonable function and revocable biometrics in Industrial Internet of Things. Journal of King Saud University-Computer and Information Sciences, vol. 36, no. 8, Article 102166. PDF: https://www.sciencedirect.com/science/article/pii/S1319157824002556/pdfft?md5=bf447ec5a923cea7cdfc3e3a7567340f&pid=1-s2.0-S1319157824002556-main.pdf
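The authentication abstract above describes an access boundary and a validity period that allow multiple accesses after a single authentication. The sketch below illustrates that idea in isolation, assuming a session key has already been negotiated. The PUF, the revocable biometric template, and secret sharing are deliberately left out, and the `Session` fields, HMAC-SHA-256 request tags, and function names are hypothetical choices rather than the paper's protocol.

```python
import hashlib
import hmac
from dataclasses import dataclass

@dataclass
class Session:
    key: bytes          # session key agreed during the one full authentication
    expires_at: float   # end of the validity period (seconds, any clock)
    remaining: int      # access boundary: how many requests are still allowed

def request_tag(session_key: bytes, request: bytes) -> bytes:
    """Client side: authenticate a request with the session key."""
    return hmac.new(session_key, request, hashlib.sha256).digest()

def authorize(session: Session, request: bytes, tag: bytes, now: float) -> bool:
    """Server side: accept a request without repeating full authentication."""
    if now >= session.expires_at or session.remaining <= 0:
        return False  # outside the validity period or access boundary
    if not hmac.compare_digest(request_tag(session.key, request), tag):
        return False  # tag was not produced with this session key
    session.remaining -= 1
    return True
```

Each follow-up access costs the client one HMAC computation, which reflects the kind of saving the abstract attributes to the one-authentication multiple-access mode. A real protocol would also need replay protection, for example a nonce or counter inside the request, which this sketch omits.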
Pub Date: 2024-08-28, DOI: 10.1016/j.jksuci.2024.102170
Mehboob Hussain, Lian-Fu Wei, Amir Rehman, Abid Hussain, Muqadar Ali, Muhammad Hafeez Javed
The cloud computing platform has become a favorable destination for running cloud workflow applications. However, such applications are typically complex and computationally intensive. Task scheduling in cloud environments, when formulated as an optimization problem, is proven to be NP-hard. Thus, efficient task scheduling plays a decisive role in minimizing energy costs. Electricity prices fluctuate depending on the vending company, time, and location. Therefore, optimizing energy costs has become a serious issue that one must consider when scheduling workflow applications across geographically distributed cloud data centers (GD-CDCs). To tackle this issue, we propose a dual optimization approach, the electricity price and energy-efficient (EPEE) workflow scheduling algorithm, which simultaneously considers energy efficiency and fluctuating electricity prices across GD-CDCs and aims to minimize the electricity costs of workflow applications under deadline constraints. This novel integration of dynamic voltage and frequency scaling (DVFS) with energy and electricity price optimization is unique compared to existing methods. Moreover, our EPEE approach, which includes task prioritization, deadline partitioning, data center selection based on energy efficiency and price diversity, and dynamic task scheduling, provides a comprehensive solution that significantly reduces electricity costs and enhances resource utilization. In addition, the inclusion of both generated and original data transmission times further differentiates our approach, offering a more realistic and practical solution for cloud service providers (CSPs). The experimental results reveal that the EPEE model achieves higher success rates in meeting task deadlines, as well as better resource utilization, cost efficiency, and energy efficiency, than adapted state-of-the-art algorithms for similar problems.
Title: An electricity price and energy-efficient workflow scheduling in geographically distributed cloud data centers. Journal of King Saud University-Computer and Information Sciences, vol. 36, no. 8, Article 102170. PDF: https://www.sciencedirect.com/science/article/pii/S1319157824002593/pdfft?md5=8ba14b81a0951bd08637405a78b6250b&pid=1-s2.0-S1319157824002593-main.pdf
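The EPEE abstract above combines DVFS with location-dependent electricity prices under a deadline. The sketch below works through that trade-off for a single task: for every data center and every frequency level it computes runtime, energy, and electricity cost, then keeps the cheapest option that still meets the deadline. The data layout, the numbers, and the function names are invented for illustration; task prioritization, deadline partitioning, and data transmission times from the paper are not modeled.

```python
def energy_kwh(power_watts: float, seconds: float) -> float:
    """Energy in kWh for a constant power draw (1 kWh = 3.6e6 joules)."""
    return power_watts * seconds / 3.6e6

def cheapest_placement(task_cycles: float, deadline_s: float, centers):
    """Return (cost, center name, frequency) of the cheapest feasible option.

    centers: list of dicts with keys
      "name", "price_per_kwh", and "levels" = [(frequency_hz, power_watts), ...]
    Returns None when no frequency level at any center meets the deadline.
    """
    best = None
    for center in centers:
        for frequency_hz, power_watts in center["levels"]:
            runtime_s = task_cycles / frequency_hz
            if runtime_s > deadline_s:
                continue  # this DVFS level is too slow for the deadline
            cost = energy_kwh(power_watts, runtime_s) * center["price_per_kwh"]
            if best is None or cost < best[0]:
                best = (cost, center["name"], frequency_hz)
    return best

# Hypothetical data centers: A is pricier but draws less power than B.
CENTERS = [
    {"name": "A", "price_per_kwh": 0.20, "levels": [(1e9, 50.0), (2e9, 120.0)]},
    {"name": "B", "price_per_kwh": 0.10, "levels": [(1e9, 60.0), (2e9, 150.0)]},
]
```

With a 2 s deadline for a 2e9-cycle task, the slow level at the cheaper center B wins even though B draws more power. Tightening the deadline to 1.5 s forces the faster, more power-hungry level, which is the tension between DVFS and price that the abstract describes.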
Pub Date: 2024-08-27, DOI: 10.1016/j.jksuci.2024.102164
Siraj Uddin Qureshi, Jingsha He, Saima Tunio, Nafei Zhu, Ahsan Nazir, Ahsan Wajahat, Faheem Ullah, Abdul Wadud
The swift proliferation of Internet of Things (IoT) devices has presented considerable challenges in maintaining cybersecurity. As IoT ecosystems expand, they increasingly attract malware attacks, necessitating advanced detection and forensic analysis methods. This systematic review explores the application of deep learning techniques for malware detection and forensic analysis within IoT environments. The literature is organized into four distinct categories: IoT Security, Malware Forensics, Deep Learning, and Anti-Forensics. Each category was analyzed individually to identify common methodologies, techniques, and outcomes. A combined analysis was then conducted to synthesize the findings across these categories, highlighting overarching trends and insights. This systematic review identifies several research gaps, including the need for comprehensive IoT-specific datasets, the integration of interdisciplinary methods, scalable real-time detection solutions, and advanced countermeasures against anti-forensic techniques. The primary issue addressed is the complexity of IoT malware and the limitations of current forensic methodologies. By identifying critical areas for future investigation, this review contributes to the advancement of cybersecurity in IoT environments, offering a comprehensive framework to guide future research and practice in developing more robust and effective security solutions.
Title: Systematic review of deep learning solutions for malware detection and forensic analysis in IoT. Journal of King Saud University-Computer and Information Sciences, vol. 36, no. 8, Article 102164. PDF: https://www.sciencedirect.com/science/article/pii/S1319157824002532/pdfft?md5=c4b7c6c2f5d782febc67d0ed9dc92f16&pid=1-s2.0-S1319157824002532-main.pdf
Pub Date: 2024-08-27, DOI: 10.1016/j.jksuci.2024.102162
Taner Uçkan
Financial data such as stock prices are rich time series data that contain valuable information for investors and financial professionals. Analysis of such data is critical to understanding market behaviour and predicting future price movements. However, stock price predictions are complex and difficult due to the intense noise, non-linear structures, and high volatility contained in this data. While this situation increases the difficulty of making accurate predictions, it also creates an important area for investors and analysts to identify opportunities in the market. One of the effective methods used in predicting stock prices is technical analysis, which relies on multiple indicators. These indicators formulate past stock price movements in different ways and produce signals such as buy, sell, and hold. In this study, the ten most frequently used indicators were analyzed with PCA (Principal Component Analysis). This study aims to investigate the integration of PCA and deep learning models for the Turkish stock market using indicator values and to assess the effect of this integration on market prediction performance. The most effective indicators used as input for market prediction were selected with the PCA method, and then four different models were created using different deep learning architectures (LSTM, CNN, BiLSTM, GRU). The performance of the proposed models was evaluated with the MSE, MAE, MAPE, and R² metrics. The results obtained show that using the indicators selected by PCA together with deep learning models improves market prediction performance. In particular, it was observed that one of the proposed models, the PCA-LSTM-CNN model, produced very successful results.
Title: Integrating PCA with deep learning models for stock market Forecasting: An analysis of Turkish stocks markets
Authors: Taner Uçkan
Journal: Journal of King Saud University-Computer and Information Sciences, vol. 36, no. 8, Article 102162 (impact factor 5.2)
Published: 2024-08-27 DOI: 10.1016/j.jksuci.2024.102162
PDF: https://www.sciencedirect.com/science/article/pii/S1319157824002519/pdfft?md5=58ab4242a07a2504cdd39efa0bbba182&pid=1-s2.0-S1319157824002519-main.pdf
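The four evaluation metrics named in the abstract (MSE, MAE, MAPE, and R²) can be illustrated with a minimal sketch. This is not the author's code: the function name `regression_metrics` and the sample values are assumptions made for illustration, and only the standard textbook definitions of the metrics are used.

```python
from math import fsum


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, MAPE (in percent) and R^2 for two equal-length sequences.

    Hypothetical helper for illustration; it is not taken from the paper.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = fsum(e * e for e in errors)          # residual sum of squares
    mse = ss_res / n                              # mean squared error
    mae = fsum(abs(e) for e in errors) / n        # mean absolute error

    # MAPE is undefined when a true value is zero; stock prices are
    # assumed to be strictly positive here.
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined for true values equal to zero")
    mape = 100.0 * fsum(abs(e / t) for e, t in zip(errors, y_true)) / n

    mean_true = fsum(y_true) / n
    ss_tot = fsum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for a constant series")
    r2 = 1.0 - ss_res / ss_tot

    return {"mse": mse, "mae": mae, "mape": mape, "r2": r2}


if __name__ == "__main__":
    # Made-up closing prices and predictions, purely for demonstration.
    actual = [10.0, 20.0, 30.0, 40.0]
    predicted = [12.0, 18.0, 33.0, 39.0]
    print(regression_metrics(actual, predicted))
```

For the made-up series in the demo, the errors are -2, 2, -3, and 1, which gives an MSE of 4.5, an MAE of 2.0, a MAPE of 10.625 %, and an R² of 0.964. MSE penalizes large misses more heavily than MAE, while MAPE and R² are scale-free, which is one reason the four are often reported together.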
Entity alignment (EA), which aims to match entities with the same meaning across different knowledge graphs (KGs), is a critical step in knowledge fusion. Existing EA methods usually encode the multi-aspect features of entities as embeddings and learn to align the embeddings with supervised learning. Although these methods have achieved remarkable results, two issues have not been well addressed. First, they require pre-aligned entity pairs to perform EA, which limits their applicability in practice. Second, they overlook the unique contribution of digital attributes to EA when using attribute information to enhance entity features. In this paper, we propose a self-supervised entity alignment framework via attribute correction. Specifically, we first design a highly effective seed pair generator based on the multi-aspect features of entities, which removes the labour-intensive step of obtaining pre-aligned entity pairs. We then propose a novel alignment mechanism via attribute correction to account for the fact that different types of attributes contribute differently to the EA task. Extensive experiments on real-world datasets with semantic features demonstrate that our framework outperforms state-of-the-art (SOTA) EA methods.
Title: A self-supervised entity alignment framework via attribute correction
Authors: Xin Zhang, Yu Liu, Hongkui Wei, Shimin Shan, Zhehuan Zhao
Journal: Journal of King Saud University-Computer and Information Sciences, vol. 36, no. 8, Article 102167 (impact factor 5.2)
Published: 2024-08-26 DOI: 10.1016/j.jksuci.2024.102167
PDF: https://www.sciencedirect.com/science/article/pii/S1319157824002568/pdfft?md5=cbabc3cd71250bf4b823be664eeec76d&pid=1-s2.0-S1319157824002568-main.pdf
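The abstract does not say how the seed pair generator works, so the sketch below should not be read as the authors' method. It shows one common label-free heuristic, mutual nearest neighbours under cosine similarity, as an example of how seed pairs can be produced without pre-aligned data. The function names, the `min_sim` threshold, and the toy embeddings are all assumptions.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mutual_nearest_seed_pairs(emb_a, emb_b, min_sim=0.9):
    """Return (entity_a, entity_b) pairs that are each other's nearest neighbour.

    emb_a, emb_b: dicts mapping entity id -> feature vector, one per knowledge graph.
    min_sim: hypothetical similarity threshold below which a pair is discarded.
    """
    if not emb_a or not emb_b:
        return []
    # Nearest entity in graph B for every entity in graph A, and vice versa.
    best_for_a = {a: max(emb_b, key=lambda b: cosine(va, emb_b[b]))
                  for a, va in emb_a.items()}
    best_for_b = {b: max(emb_a, key=lambda a: cosine(emb_a[a], vb))
                  for b, vb in emb_b.items()}
    pairs = []
    for a, b in best_for_a.items():
        # Keep the pair only if the choice is mutual and sufficiently similar.
        if best_for_b[b] == a and cosine(emb_a[a], emb_b[b]) >= min_sim:
            pairs.append((a, b))
    return sorted(pairs)


if __name__ == "__main__":
    # Toy 2-D embeddings, invented for demonstration only.
    kg_a = {"a1": [1.0, 0.0], "a2": [0.0, 1.0], "a3": [1.0, 1.0]}
    kg_b = {"b1": [0.9, 0.1], "b2": [0.1, 0.9], "b3": [-1.0, 0.0]}
    print(mutual_nearest_seed_pairs(kg_a, kg_b))
```

In the toy data, `a1` and `b1` choose each other, as do `a2` and `b2`, while `a3` and `b3` have no mutual partner and are left unaligned. Requiring mutual agreement trades coverage for precision, which is usually the desired bias when the pairs are used as pseudo-labels.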
Pub Date : 2024-08-22 DOI: 10.1016/j.jksuci.2024.102161
Shengrui Zhang, Ling He, Dan Liu, Chuan Jia, Dechao Zhang
Lower limb rehabilitation training often involves assistive standing devices. However, elderly individuals frequently experience reduced exercise effectiveness or suffer muscle injuries when using these devices. Recognizing abnormal lower limb postures can significantly improve training efficiency and reduce the risk of injury. To this end, we propose a model based on dynamic threshold detection of spatial gait features to identify such abnormal postures. A human-assisted standing rehabilitation device platform was developed to build a lower limb gait depth dataset. RGB data are used for keypoint detection, enabling a 3D lower limb posture recognition model that extracts gait, temporal, and spatial features as well as keypoints. The predicted joint angles, stride length, and step frequency show errors of 4%, 8%, and 1.3%, respectively, with an average confidence of 0.95 for the 3D keypoints. We use a WOA-BP neural network to develop a dynamic threshold algorithm based on gait features and propose a model for recognizing abnormal postures. The model achieves 96% accuracy in recognizing abnormal postures, with a recall of 83% and an F1 score of 90%. ROC analysis shows that the WOA-BP curve lies farthest from the chance line, with the highest AUC (0.89) among the compared models. Experimental results demonstrate that the model recognizes abnormal lower limb postures reliably, which can prompt patients to correct these postures, thereby reducing muscle injuries and improving exercise effectiveness.
Title: Abnormal lower limb posture recognition based on spatial gait feature dynamic threshold detection
Authors: Shengrui Zhang, Ling He, Dan Liu, Chuan Jia, Dechao Zhang
Journal: Journal of King Saud University-Computer and Information Sciences, vol. 36, no. 8, Article 102161 (impact factor 5.2)
Published: 2024-08-22 DOI: 10.1016/j.jksuci.2024.102161
PDF: https://www.sciencedirect.com/science/article/pii/S1319157824002507/pdfft?md5=27cec39130c542af88b8b1f0132833cd&pid=1-s2.0-S1319157824002507-main.pdf
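The classification figures quoted in the abstract (accuracy, recall, F1, AUC) follow standard definitions, which the sketch below spells out. It is an illustration only: the function names and toy labels are assumptions, and nothing here reproduces the paper's data or its WOA-BP model. The last helper rearranges the F1 formula to show what precision a given recall and F1 imply.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = abnormal posture)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_from_recall_f1(recall, f1):
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    return f1 * recall / (2 * recall - f1)


if __name__ == "__main__":
    # Toy labels and scores, invented for demonstration only.
    labels = [1, 1, 1, 0, 0, 0, 0, 1]
    preds = [1, 1, 0, 0, 0, 1, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
    print(classification_metrics(labels, preds))
    print(roc_auc(labels, scores))
    print(precision_from_recall_f1(0.83, 0.90))
```

Applied to the rounded figures in the abstract, a recall of 0.83 and an F1 of 0.90 imply a precision of roughly 0.98. The abstract does not report precision, so this is an inference from rounded values, not a published result.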