Rice production is pivotal for ensuring global food security. In Pakistan, rice is not only the dominant Kharif crop but also a significant export commodity that substantially impacts the country’s economy. However, Pakistan faces challenges such as abrupt climate change and the COVID-19 pandemic, which affect rice production and underscore the need for predictive models that support informed decisions aimed at improving productivity and, ultimately, the economy. This article presents an innovative deep learning-based hybrid predictive model, ResNet50-LSTM, designed to forecast rice yields in the Gujranwala district, Pakistan, using multi-modal data. The model incorporates MODIS satellite imagery capturing the EVI, LAI, and FPAR indices along with meteorological and soil data. Google Earth Engine is used for the collection and preprocessing of the satellite imagery, with preprocessing steps involving data filtering, applying region geometry, interpolation, and aggregation; the same steps were applied manually to the meteorological and soil data. Following feature extraction from the imagery using ResNet50, three LSTM model configurations with distinct layer architectures are presented. The findings of this study show that the configuration featuring two LSTM layers with interconnected cells outperforms the other proposed configurations in prediction performance. Analysis of various feature combinations reveals that the selected feature set (EVI, FPAR, climate, and soil variables) yields highly accurate results, with R2 = 0.9903, RMSE = 0.1854, MAPE = 0.62%, MAE = 0.1384, MRE = 0.0062, and Willmott’s index of agreement = 0.9536. Moreover, the combination of EVI and FPAR is identified as particularly effective. These findings reveal the potential of the framework for estimating crop yields globally using publicly available multi-source data.
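As a side note on the reported scores, the evaluation metrics listed above are all standard and can be computed directly from paired observed and predicted yields; a minimal pure-Python sketch (with made-up numbers, not the study's data):

```python
import math

def yield_metrics(y_true, y_pred):
    """Standard regression metrics for yield prediction (illustrative helper)."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    rel_err = [e / t for e, t in zip(abs_err, y_true)]
    # Willmott's index of agreement: 1 - SS_res / sum((|p - mean| + |t - mean|)^2)
    d_denom = sum((abs(p - mean_y) + abs(t - mean_y)) ** 2
                  for t, p in zip(y_true, y_pred))
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs_err) / n,
        "MRE": sum(rel_err) / n,
        "MAPE": 100 * sum(rel_err) / n,  # MAPE is MRE expressed as a percentage
        "d": 1 - ss_res / d_denom,
    }
```

Note that, as defined here, MAPE and MRE differ only by the factor of 100, which matches the reported pair (MAPE = 0.62%, MRE = 0.0062).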
Aqsa Aslam, Saima Farhan. “Enhancing rice yield prediction: a deep fusion model integrating ResNet50-LSTM with multi-source data.” PeerJ Computer Science, 2024-08-13. doi:10.7717/peerj-cs.2219
With the rapid increase in vehicle numbers, efficient traffic management has become a critical challenge for society. Traditional methods of vehicle detection and classification often struggle with the diverse characteristics of vehicles, such as varying shapes, colors, edges, shadows, and textures. To address this, we propose an innovative ensemble method that combines two state-of-the-art deep learning models, EfficientDet and YOLOv8. The proposed work leverages the Forward-Looking Infrared (FLIR) dataset, which provides both thermal and RGB images. To enhance model performance and address class imbalance, we applied several data augmentation techniques. Experimental results demonstrate that the proposed ensemble model achieves a mean average precision (mAP) of 95.5% on thermal images, outperforming the individual EfficientDet and YOLOv8 models, which achieved mAPs of 92.6% and 89.4%, respectively. Additionally, the ensemble model attained an average recall (AR) of 0.93 and an optimal localization recall precision (oLRP) of 0.08 on thermal images. For RGB images, the ensemble model achieved an mAP of 93.1%, an AR of 0.91, and an oLRP of 0.10, consistently surpassing its constituent models. These findings highlight the effectiveness of the proposed ensemble approach in improving vehicle detection and classification. The integration of thermal imaging further enhances detection capabilities under various lighting conditions, making the system robust for real-world applications in intelligent traffic management.
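The abstract does not specify how the two detectors' outputs are fused; one common baseline is to pool the boxes from both models and suppress duplicates greedily by IoU. A sketch under that assumption, with hypothetical box tuples:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def ensemble_detections(dets_a, dets_b, iou_thr=0.5):
    """Pool detections from two models; keep the highest-scoring box per overlap.

    Each detection is (box, score, label). This is a generic NMS-style fusion
    sketch, not the paper's stated fusion rule.
    """
    pooled = sorted(dets_a + dets_b, key=lambda d: d[1], reverse=True)
    kept = []
    for box, score, label in pooled:
        # Suppress a box only if a kept box of the same class overlaps enough.
        if all(label != kl or iou(box, kb) < iou_thr for kb, ks, kl in kept):
            kept.append((box, score, label))
    return kept
```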
Caixia Lv, Usha Mittal, Vishu Madaan, Prateek Agrawal. “Vehicle detection and classification using an ensemble of EfficientDet and YOLOv8.” PeerJ Computer Science, 2024-08-13. doi:10.7717/peerj-cs.2233
Thabatta Moreira Alves de Araujo, Carlos André de Mattos Teixeira, Carlos Renato Lisboa Francês
Most natural disasters result from geodynamic events such as landslides and slope collapse. These failures cause catastrophes that directly impact the environment and cause financial and human losses. Visual inspection is the primary method for detecting failures in geotechnical structures, but on-site visits can be risky due to unstable soil; moreover, the structures’ design and their hostile, remote installation conditions can make monitoring them infeasible. When a fast and safe evaluation is required, analysis by computational methods becomes feasible. In this study, a convolutional neural network (CNN) approach to computer vision is applied to identify defects on the surface of geotechnical structures, aided by unmanned aerial vehicles (UAVs) and mobile devices, aiming to reduce the reliance on human-led on-site inspections. Computer vision algorithms remain underexplored in this field, however, owing to particularities of geotechnical engineering such as limited public datasets and redundant images. We therefore obtained images of surface failure indicators from slopes near a Brazilian national road, assisted by a UAV and mobile devices, and propose a custom, low-complexity CNN architecture for an image-aided binary classifier that detects faults on geotechnical surfaces. The model achieved a satisfactory average accuracy of 94.26%. An AUC of 0.99 from the receiver operating characteristic (ROC) curve and the confusion matrix on a testing dataset also show satisfactory results, suggesting that the model’s capability to distinguish between the ‘damage’ and ‘intact’ classes is excellent and enables the identification of failure indicators.
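The reported AUC summarizes the ROC curve of the binary ‘damage’/‘intact’ classifier and can be computed from labels and scores alone. A pairwise (Mann–Whitney) sketch with illustrative values:

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison (Mann-Whitney U), an O(n^2) sketch.

    labels: 1 for the positive class ('damage'), 0 for 'intact';
    scores: the classifier's predicted probability of 'damage'.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # AUC equals the probability that a random positive outscores a random
    # negative, counting ties as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```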
“Enhancing geotechnical damage detection with deep learning: a convolutional neural network approach.” PeerJ Computer Science, 2024-08-12. doi:10.7717/peerj-cs.2052
Muhammad Zunnurain Hussain, Zurina Mohd Hanapi, Azizol Abdullah, Masnida Hussin, Mohd Izuan Hafez Ninggal
In a modern digital landscape flooded with nearly endless cyber-security hazards, sophisticated intrusion detection systems (IDS) are invaluable in defending against intricate security threats. The Sybil-Free Metric-based RPL Trustworthiness Scheme (SF-MRTS) addresses the biggest threat to the routing protocol for low-power and lossy networks (RPL): the Sybil attack. Sybil attacks pose a significant security challenge for RPL networks, as an attacker can distort two or more hop paths and disrupt network processes. We introduce a new way of calculating node reliability that evaluates parameters beyond routing metrics, such as energy conservation and timeliness. SF-MRTS works towards a trusted network by applying these trust metrics along secure paths, making the network more likely to withstand attacks. Simulations of SF-MRTS show that its security risk management features also support the network’s performance and stability. These mechanisms are based on the principles of game theory: they reward nodes that cooperate and penalize nodes that do not, discouraging damage to the network and encouraging collaboration between nodes. SF-MRTS is thus a security technology suited to attacks on emerging industrial Internet of Things (IoT) networks; it effectively guarantees reliability and improves network resilience across different scenarios.
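The game-theoretic reward/penalty mechanism can be pictured as simple per-node trust bookkeeping. The sketch below uses invented reward, penalty, and threshold values purely for illustration; the paper's actual trust metrics and parameters are not given in the abstract:

```python
class TrustTable:
    """Reward cooperation, penalize defection, distrust low-scoring nodes.

    All parameter values here are hypothetical, not taken from SF-MRTS.
    """
    def __init__(self, reward=0.05, penalty=0.2, threshold=0.3):
        self.reward, self.penalty, self.threshold = reward, penalty, threshold
        self.scores = {}  # node id -> trust score in [0, 1]

    def observe(self, node, cooperated):
        # New nodes start at a neutral 0.5; scores are clamped to [0, 1].
        s = self.scores.get(node, 0.5)
        s = s + self.reward if cooperated else s - self.penalty
        self.scores[node] = min(1.0, max(0.0, s))

    def trusted(self, node):
        return self.scores.get(node, 0.5) >= self.threshold
```

A node that repeatedly defects (as a Sybil identity would) quickly falls below the threshold and is excluded from routing, while cooperative nodes retain their standing.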
“An efficient secure and energy resilient trust-based system for detection and mitigation of sybil attack detection (SAN).” PeerJ Computer Science, 2024-08-09. doi:10.7717/peerj-cs.2231
The Boolean satisfiability (SAT) problem exhibits different structural features across domains. Unlike traditional rule-based approaches, neural network models can serve as more general algorithms that learn to solve specific problems from domain data. Accurately identifying these structural features is crucial for neural networks to solve the SAT problem. Learning-based SAT solvers, whether end-to-end models or enhancements to traditional heuristic algorithms, have already achieved significant progress. In this article, we propose TG-SAT, an end-to-end framework based on the Transformer and a gated recurrent unit (GRU) for predicting the satisfiability of SAT problems. TG-SAT can learn the structural features of SAT problems in a weakly supervised setting. To capture the structural information of a SAT problem, we encode it as an undirected graph and integrate the GRU into the Transformer structure to update the node embeddings. By computing cross-attention scores between literals and clauses, a weighted representation of the nodes is obtained. The model is finally trained as a classifier to predict the satisfiability of the SAT problem. Experimental results demonstrate that TG-SAT achieves a 2%–5% improvement in accuracy on random 3-SAT problems compared to NeuroSAT. It also performs better on SR(N), especially on more complex SAT problems, where our model achieves higher prediction accuracy.
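The undirected-graph encoding of a SAT instance can be sketched as a bipartite literal–clause structure; the helper below is a NeuroSAT-style illustration (positive/negative integers as literals, with complement edges linking each variable's two polarities), not TG-SAT's exact encoding:

```python
def cnf_to_graph(clauses):
    """Encode a CNF formula as a bipartite literal-clause graph (sketch).

    clauses: list of clauses, each a list of ints; n means variable n,
    -n its negation. Returns the literal set, literal-clause edges, and
    complement edges between opposite polarities of the same variable.
    """
    literal_clause_edges = set()
    literals = set()
    for ci, clause in enumerate(clauses):
        for lit in clause:
            literals.add(lit)
            literal_clause_edges.add((lit, ci))
    # Link x and -x when both polarities occur, as NeuroSAT-style models do.
    complement_edges = {(l, -l) for l in literals if l > 0 and -l in literals}
    return literals, literal_clause_edges, complement_edges
```

A message-passing or attention model would then update embeddings along these edges, which is where TG-SAT's literal–clause cross-attention operates.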
Wenjing Chang, Mengyu Guo, Junwei Luo. “Predicting the satisfiability of Boolean formulas by incorporating gated recurrent unit (GRU) in the Transformer framework.” PeerJ Computer Science, 2024-08-08. doi:10.7717/peerj-cs.2169
Background Bacterial image analysis plays a vital role in various fields, providing valuable information and insights for studying bacterial structural biology, diagnosing and treating infectious diseases caused by pathogenic bacteria, and discovering and developing drugs that can combat bacterial infections, among others. This has prompted efforts to automate bacterial image analysis tasks. By automating analysis and leveraging advanced computational techniques such as deep learning (DL) algorithms, bacterial image analysis can become rapid, more accurate, efficient, reliable, and standardised, leading to enhanced understanding, diagnosis, and control of bacteria-related phenomena. Methods Three DL object detection networks, namely SSD-MobileNetV2, EfficientDet, and YOLOv4, were developed to automatically detect Escherichia coli (E. coli) bacteria in microscopic images. A multi-task DL framework classifies the bacteria according to their growth stage: rod-shaped cells, dividing cells, or microcolonies. Data preprocessing steps, including image augmentation, image annotation, and data splitting, were carried out before training the object detection models. The performance of the DL techniques was evaluated quantitatively using mean average precision (mAP), precision, recall, and F1-score, and the metrics of the models were compared and analysed. The best DL model was then selected to perform multi-task object detection, identifying rod-shaped cells, dividing cells, and microcolonies. Results The test images generated by the three proposed DL models displayed high detection accuracy, with YOLOv4 achieving the highest range of detection confidence scores and producing differently coloured bounding boxes for the different growth stages of E. coli. In terms of statistical analysis, YOLOv4 demonstrates superior performance among the three proposed models, achieving the highest mAP of 98% with the highest precision, recall, and F1-score of 86%, 97%, and 91%, respectively. Conclusions This study has demonstrated the effectiveness, potential, and applicability of DL approaches to multi-task bacterial image analysis, focusing on automating the detection and classification of bacteria in microscopic images. The proposed models output images with bounding boxes surrounding each detected E. coli bacterium, labelled with its growth stage and detection confidence. All proposed object detection models achieved promising results, with YOLOv4 outperforming the others.
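The mAP figures above are built from per-class average precision over a precision–recall curve. A sketch of the classic 11-point interpolated AP (the Pascal VOC convention; the study does not state which interpolation scheme it uses):

```python
def average_precision(recalls, precisions):
    """11-point interpolated AP (Pascal VOC style), one ingredient of mAP.

    recalls/precisions: parallel lists tracing a precision-recall curve.
    """
    ap = 0.0
    for r in (i / 10 for i in range(11)):
        # At each recall level, take the best precision achievable at or
        # beyond that recall, then average over the 11 levels.
        p_max = max((p for rec, p in zip(recalls, precisions) if rec >= r),
                    default=0.0)
        ap += p_max / 11
    return ap
```

mAP is then simply the mean of these AP values over all classes (here, the three growth stages).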
Shuang Yee Chin, Jian Dong, Khairunnisa Hasikin, Romano Ngui, Khin Wee Lai, Pauline Shan Qing Yeoh, Xiang Wu. “Bacterial image analysis using multi-task deep learning approaches for clinical microscopy.” PeerJ Computer Science, 2024-08-08. doi:10.7717/peerj-cs.2180
Ahmad M. Nagm, Mona M. Moussa, Rasha Shoitan, Ahmed Ali, Mohamed Mashhour, Ahmed S. Salama, Hamada I. AbdulWakel
The exponential progress of image editing software has contributed to a rapid rise in the production of fake images. Consequently, various techniques and approaches have been developed to detect manipulated images. These methods aim to discern genuine from altered images, combating the proliferation of deceptive visual content; however, further advancements are needed to improve their accuracy and precision. This research therefore proposes an image forgery algorithm that integrates error level analysis (ELA) and a convolutional neural network (CNN) to detect manipulation, focusing on copy-move and splicing forgeries. The input image is fed to the ELA algorithm to identify regions within the image that have different compression levels. The resulting ELA images are then used to train the proposed CNN model, which is constructed from two consecutive convolution layers followed by one max pooling layer and two dense layers; two dropout layers are inserted between the layers to improve generalization. Experiments on the CASIA 2 dataset show that the proposed algorithm achieves remarkable performance: a training accuracy of 99.05%, testing accuracy of 94.14%, precision of 94.1%, and recall of 94.07%. Notably, it outperforms state-of-the-art techniques in both accuracy and precision.
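ELA rests on comparing an image against a recompressed copy of itself and amplifying the per-pixel differences, since spliced regions recompress differently. A minimal sketch of that amplification step on flat intensity lists (a real pipeline would produce the recompressed copy by re-saving the JPEG, e.g. with Pillow, at a fixed quality):

```python
def error_level(original, recompressed):
    """Amplify per-pixel differences between an image and its recompressed
    copy -- the core of ELA. Inputs are flat 0-255 intensity lists.
    """
    diff = [abs(a - b) for a, b in zip(original, recompressed)]
    peak = max(diff) or 1          # avoid division by zero on identical images
    scale = 255 / peak             # stretch the strongest error to full range
    return [min(255, int(d * scale)) for d in diff]
```

The scaled difference image is what the CNN consumes: regions whose error levels stand out from their surroundings are candidate forgeries.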
{"title":"Detecting image manipulation with ELA-CNN integration: a powerful framework for authenticity verification","authors":"Ahmad M. Nagm, Mona M. Moussa, Rasha Shoitan, Ahmed Ali, Mohamed Mashhour, Ahmed S. Salama, Hamada I. AbdulWakel","doi":"10.7717/peerj-cs.2205","DOIUrl":"https://doi.org/10.7717/peerj-cs.2205","url":null,"abstract":"The exponential progress of image editing software has contributed to a rapid rise in the production of fake images. Consequently, various techniques and approaches have been developed to detect manipulated images. These methods aim to discern between genuine and altered images, effectively combating the proliferation of deceptive visual content. However, additional advancements are necessary to enhance their accuracy and precision. Therefore, this research proposes an image forgery algorithm that integrates error level analysis (ELA) and a convolutional neural network (CNN) to detect the manipulation. The system primarily focuses on detecting copy-move and splicing forgeries in images. The input image is fed to the ELA algorithm to identify regions within the image that have different compression levels. Afterward, the created ELA images are used as input to train the proposed CNN model. The CNN model is constructed from two consecutive convolution layers, followed by one max pooling layer and two dense layers. Two dropout layers are inserted between the layers to improve model generalization. The experiments are applied to the CASIA 2 dataset, and the simulation results show that the proposed algorithm demonstrates remarkable performance metrics, including a training accuracy of 99.05%, testing accuracy of 94.14%, precision of 94.1%, and recall of 94.07%. 
Notably, it outperforms state-of-the-art techniques in both accuracy and precision.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"22 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141936167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
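The ELA preprocessing step described in the abstract can be sketched as follows. This is a minimal illustration using Pillow; the recompression quality (90) and the amplification of the residual are assumptions, since the abstract does not state the paper's exact parameters.

```python
import io

from PIL import Image, ImageChops


def error_level_analysis(image: Image.Image, quality: int = 90) -> Image.Image:
    """Recompress the image as JPEG and return the amplified pixel-wise difference.

    Regions saved at a different original compression level (e.g., a spliced
    patch) tend to show a stronger residual than the rest of the image.
    """
    buf = io.BytesIO()
    image.convert("RGB").save(buf, "JPEG", quality=quality)
    buf.seek(0)
    recompressed = Image.open(buf)

    diff = ImageChops.difference(image.convert("RGB"), recompressed)

    # Amplify the residual so subtle compression differences become visible.
    extrema = diff.getextrema()
    max_diff = max(channel_max for _, channel_max in extrema) or 1
    scale = 255.0 / max_diff
    return diff.point(lambda p: min(255, int(p * scale)))


# The resulting ELA image would then be resized and fed to the CNN classifier.
ela_image = error_level_analysis(Image.new("RGB", (64, 64), (120, 60, 30)))
```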
Background The continuous increase in carbon dioxide (CO2) emissions from fuel vehicles generates a greenhouse effect in the atmosphere, which has a negative impact on global warming and climate change and raises serious concerns about environmental sustainability. Therefore, research on estimating and reducing vehicle CO2 emissions is crucial in promoting environmental sustainability and reducing greenhouse gas emissions in the atmosphere. Methods This study performed a comparative regression analysis using 18 different regression algorithms based on machine learning, ensemble learning, and deep learning paradigms to evaluate and predict CO2 emissions from fuel vehicles. The performance of each algorithm was evaluated using metrics including R2, Adjusted R2, root mean square error (RMSE), and runtime. Results The findings revealed that ensemble learning methods have higher prediction accuracy and lower error rates. Ensemble learning algorithms that included Extreme Gradient Boosting (XGB), Random Forest, and Light Gradient-Boosting Machine (LGBM) demonstrated high R2 and low RMSE values. As a result, these ensemble learning-based algorithms were discovered to be the most effective methods of predicting CO2 emissions. Although deep learning models with complex structures, such as the convolutional neural network (CNN), deep neural network (DNN) and gated recurrent unit (GRU), achieved high R2 values, it was discovered that they take longer to train and require more computational resources. The methodology and findings of our research provide a number of important implications for the different stakeholders striving for environmental sustainability and an ecological world.
{"title":"Forecasting CO2 emissions of fuel vehicles for an ecological world using ensemble learning, machine learning, and deep learning models","authors":"Fatih Gurcan","doi":"10.7717/peerj-cs.2234","DOIUrl":"https://doi.org/10.7717/peerj-cs.2234","url":null,"abstract":"Background The continuous increase in carbon dioxide (CO2) emissions from fuel vehicles generates a greenhouse effect in the atmosphere, which has a negative impact on global warming and climate change and raises serious concerns about environmental sustainability. Therefore, research on estimating and reducing vehicle CO2 emissions is crucial in promoting environmental sustainability and reducing greenhouse gas emissions in the atmosphere. Methods This study performed a comparative regression analysis using 18 different regression algorithms based on machine learning, ensemble learning, and deep learning paradigms to evaluate and predict CO2 emissions from fuel vehicles. The performance of each algorithm was evaluated using metrics including R2, Adjusted R2, root mean square error (RMSE), and runtime. Results The findings revealed that ensemble learning methods have higher prediction accuracy and lower error rates. Ensemble learning algorithms that included Extreme Gradient Boosting (XGB), Random Forest, and Light Gradient-Boosting Machine (LGBM) demonstrated high R2 and low RMSE values. As a result, these ensemble learning-based algorithms were discovered to be the most effective methods of predicting CO2 emissions. Although deep learning models with complex structures, such as the convolutional neural network (CNN), deep neural network (DNN) and gated recurrent unit (GRU), achieved high R2 values, it was discovered that they take longer to train and require more computational resources. 
The methodology and findings of our research provide a number of important implications for the different stakeholders striving for environmental sustainability and an ecological world.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"45 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141936232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
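The comparison methodology above can be sketched with scikit-learn: fit a linear baseline and two ensemble regressors, then score each with R2 and RMSE. The data here is synthetic and the model list is truncated to three for illustration; the study used a real fuel-vehicle dataset and 18 algorithms.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for vehicle features (engine size, fuel consumption, ...).
X, y = make_regression(n_samples=500, n_features=6, noise=15.0, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "LinearRegression": LinearRegression(),
    "RandomForest": RandomForestRegressor(n_estimators=200, random_state=42),
    "GradientBoosting": GradientBoostingRegressor(random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    results[name] = {
        "R2": r2_score(y_te, pred),
        "RMSE": float(np.sqrt(mean_squared_error(y_te, pred))),
    }

for name, metrics in results.items():
    print(f"{name}: R2={metrics['R2']:.3f} RMSE={metrics['RMSE']:.2f}")
```

Runtime, the fourth metric in the study, would be measured by timing each `fit` call (e.g., with `time.perf_counter`).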
Georgiana Tucudean, Marian Bucos, Bogdan Dragulescu, Catalin Daniel Caleanu
Natural language processing (NLP) tasks can be addressed with several deep learning architectures, and many different approaches have proven to be efficient. This study aims to briefly summarize the use cases for NLP tasks along with the main architectures. This research presents transformer-based solutions for NLP tasks, including architectures such as Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-Training (GPT). To achieve that, we conducted a step-by-step review strategy: identify the recent studies that include Transformers, apply filters to extract the most consistent studies, identify and define inclusion and exclusion criteria, assess the strategy proposed in each study, and finally discuss the methods and architectures presented in the resulting articles. These steps facilitated the systematic summarization and comparative analysis of NLP applications based on Transformer architectures. The primary focus is the current state of the NLP domain, particularly regarding its applications, language models, and data set types. The results provide insights into the challenges encountered in this research domain.
{"title":"Natural language processing with transformers: a review","authors":"Georgiana Tucudean, Marian Bucos, Bogdan Dragulescu, Catalin Daniel Caleanu","doi":"10.7717/peerj-cs.2222","DOIUrl":"https://doi.org/10.7717/peerj-cs.2222","url":null,"abstract":"Natural language processing (NLP) tasks can be addressed with several deep learning architectures, and many different approaches have proven to be efficient. This study aims to briefly summarize the use cases for NLP tasks along with the main architectures. This research presents transformer-based solutions for NLP tasks such as Bidirectional Encoder Representations from Transformers (BERT), and Generative Pre-Training (GPT) architectures. To achieve that, we conducted a step-by-step process in the review strategy: identify the recent studies that include Transformers, apply filters to extract the most consistent studies, identify and define inclusion and exclusion criteria, assess the strategy proposed in each study, and finally discuss the methods and architectures presented in the resulting articles. These steps facilitated the systematic summarization and comparative analysis of NLP applications based on Transformer architectures. The primary focus is the current state of the NLP domain, particularly regarding its applications, language models, and data set types. 
The results provide insights into the challenges encountered in this research domain.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"43 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141936235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The combination of memory forensics and deep learning for malware detection has achieved certain progress, but most existing methods convert process dumps to images for classification, which is still based on process byte features. After malware is loaded into memory, the original byte features change. Compared with byte features, function call features represent the behavior of malware more robustly. Therefore, this article proposes the ProcGCN model, a deep learning model based on DGCNN (Deep Graph Convolutional Neural Network), to detect malicious processes in memory images. First, the process dump is extracted from the whole-system memory image; then, the Function Call Graph (FCG) of the process is extracted, and feature vectors for the function nodes in the FCG are generated based on the bag-of-words model; finally, the FCG is input to the ProcGCN model for classification and detection. Using a public dataset for experiments, the ProcGCN model achieved an accuracy of 98.44% and an F1 score of 0.9828.
{"title":"ProcGCN: detecting malicious process in memory based on DGCNN","authors":"Heyu Zhang, Binglong Li, Shilong Yu, Chaowen Chang, Jinhui Li, Bohao Yang","doi":"10.7717/peerj-cs.2193","DOIUrl":"https://doi.org/10.7717/peerj-cs.2193","url":null,"abstract":"The combination of memory forensics and deep learning for malware detection has achieved certain progress, but most existing methods convert process dump to images for classification, which is still based on process byte feature classification. After the malware is loaded into memory, the original byte features will change. Compared with byte features, function call features can represent the behaviors of malware more robustly. Therefore, this article proposes the ProcGCN model, a deep learning model based on DGCNN (Deep Graph Convolutional Neural Network), to detect malicious processes in memory images. First, the process dump is extracted from the whole system memory image; then, the Function Call Graph (FCG) of the process is extracted, and feature vectors for the function node in the FCG are generated based on the word bag model; finally, the FCG is input to the ProcGCN model for classification and detection. Using a public dataset for experiments, the ProcGCN model achieved an accuracy of 98.44% and an F1 score of 0.9828. 
It shows a better result than the existing deep learning methods based on static features, and its detection speed is faster, which demonstrates the effectiveness of the method based on function call features and graph representation learning in memory forensics.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"57 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141936233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
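The node-featurization step described above (bag-of-words vectors for FCG function nodes) can be sketched as follows. The function names, instruction mnemonics, and vocabulary here are invented for illustration; in the paper the FCG and function bodies come from the extracted process dump, and the graph plus features are then fed to the DGCNN.

```python
from collections import Counter

# Toy FCG: caller -> callees. In practice this is recovered from the process dump.
fcg_edges = {
    "main": ["read_input", "process"],
    "process": ["write_output"],
    "read_input": [],
    "write_output": [],
}

# Instruction mnemonics found in each function body (illustrative).
func_mnemonics = {
    "main": ["push", "call", "call", "ret"],
    "read_input": ["mov", "syscall", "ret"],
    "process": ["mov", "xor", "call", "ret"],
    "write_output": ["mov", "syscall", "ret"],
}

# Fixed vocabulary over all mnemonics (would be built from the training corpus).
vocab = sorted({m for body in func_mnemonics.values() for m in body})


def node_feature(name: str) -> list:
    """Bag-of-words count vector for one function node, ordered by vocab."""
    counts = Counter(func_mnemonics[name])
    return [counts[m] for m in vocab]


features = {name: node_feature(name) for name in fcg_edges}
# `features` plus the adjacency implied by fcg_edges form the DGCNN input.
```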