Pub Date: 2024-11-29; eCollection Date: 2024-01-01; DOI: 10.7717/peerj-cs.2364
Ala Saleh Alluhaidan, Mashael Maashi, Noha Negm, Shoayee Dlaim Alotaibi, Ibrahim R Alzahrani, Ahmed S Salama
In recent years, the Internet of Things has played a dominant role in various real-time problems and provided solutions via sensor signals. Monitoring patient health status through the Internet of Medical Things (IoMT) enables communication between wearable sensor devices and patients over a wireless network. Heart disease is one of the leading causes of the rising death rate worldwide, and it is diagnosed by fusing signals from multiple sensor devices. Much research has been done on predicting the disease and treating it correctly; however, existing approaches still struggle with accuracy, long processing times, and inefficiency. To overcome these issues, this paper proposes an efficient algorithm that fuses multi-sensor signals from wearable devices, classifies the medical signal data, and predicts heart disease using a hybrid technique of kernel random forest with the Black Hole Optimization algorithm (KRF-BHO). KRF-BHO is used for sensor data fusion, while XGBoost is used to classify echocardiogram images. On the multi-sensor data fusion dataset, the proposed KRF-BHO with XGBoost classifier achieves 94.12% accuracy in the training phase and 95.89% in the testing phase. Similarly, on the Cleveland dataset, it achieves 95.78% in the training phase and 96.21% in the testing phase.
{"title":"Kernel random forest with black hole optimization for heart diseases prediction using data fusion.","authors":"Ala Saleh Alluhaidan, Mashael Maashi, Noha Negm, Shoayee Dlaim Alotaibi, Ibrahim R Alzahrani, Ahmed S Salama","doi":"10.7717/peerj-cs.2364","DOIUrl":"10.7717/peerj-cs.2364","url":null,"abstract":"<p><p>In recent years, the Internet of Things has played a dominant role in various real-time problems and given solutions via sensor signals. Monitoring the patient health status of Internet of Medical Things (IoMT) facilitates communication between wearable sensor devices and patients through a wireless network. Heart illness is one of the reasons for the increasing death rate in the world. Diagnosing the disease is done by the fusion of multi-sensor device signals. Much research has been done in predicting the disease and treating it correctly. However, the issues are accuracy, consumption time, and inefficiency. To overcome these issues, this paper proposed an efficient algorithm for fusing the multi-sensor signals from wearable sensor devices, classifying the medical signal data and predicting heart disease using the hybrid technique of kernel random forest with the Black Hole Optimization algorithm (KRF-BHO). This KRF-BHO is used for sensor data fusion, while XG-Boost is used to classify echocardiogram images. Accuracy in the training phase with multi-sensor data fusion data set of proposed work KRF-BHO with XGBoost classifier is 94.12%; in the testing phase, the accuracy rate is 95.89%. 
Similarly, for the Cleveland Dataset, the proposed work KRF-BHO with XGBoost classifier is 95.78%; in the testing phase, the accuracy rate is 96.21%.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2364"},"PeriodicalIF":3.5,"publicationDate":"2024-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11622926/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
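The paper does not spell out the KRF-BHO fusion internals here, but feature-level fusion of wearable-sensor signals is commonly done by normalizing each channel and concatenating the results into one feature vector for the downstream classifier. A minimal sketch under that assumption (the channel names `ecg`, `spo2`, and `temp` are illustrative, not from the paper):

```python
from statistics import mean, pstdev

def zscore(xs):
    # Normalize one sensor channel to zero mean, unit variance.
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s if s else 0.0 for x in xs]

def fuse(channels):
    # Feature-level fusion: normalize each channel, then concatenate
    # into a single feature vector for the downstream classifier.
    fused = []
    for xs in channels.values():
        fused.extend(zscore(xs))
    return fused

# Hypothetical readings from three wearable sensor channels.
readings = {"ecg": [0.9, 1.1, 1.0], "spo2": [97, 98, 99], "temp": [36.5, 36.6, 36.7]}
vec = fuse(readings)
```

Normalizing per channel keeps high-magnitude sensors (e.g., SpO2 percentages) from dominating low-magnitude ones in the fused vector.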
Pub Date: 2024-11-29; eCollection Date: 2024-01-01; DOI: 10.7717/peerj-cs.2459
G Narayanee Nimeshika, Subitha D
In the rapidly evolving healthcare sector, using advanced technologies to improve medical classification systems has become crucial for enhancing patient care, diagnosis, and treatment planning. Two main challenges arise in this domain: (i) the imbalanced distribution of medical data, which leads to biased model performance, and (ii) the need to preserve patient privacy and comply with data protection regulations. The primary goal of this project is to develop a medical classification model for Alzheimer's disease detection that can effectively learn from decentralized and imbalanced datasets without compromising data privacy. The proposed system addresses these challenges by combining split federated learning (SFL) with conditional generative adversarial networks (cGANs). SFL enables a set of distributed agents to collaboratively train learning models without sharing their data, improving privacy, while the conditional GANs improve the model's ability to generalize across imbalanced classes by generating realistic synthetic samples for the minority classes. The proposed system achieved an accuracy of approximately 83.54% on the Alzheimer's disease classification dataset.
{"title":"Enhancing Alzheimer's disease classification through split federated learning and GANs for imbalanced datasets.","authors":"G Narayanee Nimeshika, Subitha D","doi":"10.7717/peerj-cs.2459","DOIUrl":"10.7717/peerj-cs.2459","url":null,"abstract":"<p><p>In the rapidly evolving healthcare sector, using advanced technologies to improve medical classification systems has become crucial for enhancing patient care, diagnosis, and treatment planning. There are two main challenges faced in this domain (i) imbalanced distribution of medical data, leading to biased model performance and (ii) the need to preserve patient privacy and comply with data protection regulations. The primary goal of this project is to develop a medical classification model for Alzheimer's disease detection that can effectively learn from decentralized and imbalanced datasets without compromising on data privacy. The proposed system aims to address these challenges by employing an approach that combines split federated learning (SFL) with conditional generative adversarial networks (cGANs) to enhance medical classification models. SFL enables efficient set of distributed agents that collaboratively train learning models without sharing their data, thus improving data privacy and the integration of conditional GANs aims to improve the model's ability to generalize across imbalanced classes by generating realistic synthetic samples for minority classes. 
The proposed system provided an accuracy of approximately 83.54 percentage for the Alzheimer's disease classification dataset.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2459"},"PeriodicalIF":3.5,"publicationDate":"2024-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623002/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
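The SFL details above are not reproducible from the abstract alone, but one core ingredient of any federated scheme, aggregating client model parameters without sharing raw data, can be sketched FedAvg-style (the two-client weight vectors and dataset sizes below are invented for illustration):

```python
def aggregate(client_weights, client_sizes):
    # Weighted average of per-client parameter vectors; weights are
    # proportional to each client's local dataset size, so clients
    # with more data pull the global model harder.
    total = sum(client_sizes)
    dim = len(client_weights[0])
    agg = [0.0] * dim
    for w, n in zip(client_weights, client_sizes):
        for i in range(dim):
            agg[i] += (n / total) * w[i]
    return agg

# Hypothetical parameter vectors from two hospitals.
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [1, 3]
global_w = aggregate(clients, sizes)
```

Only the parameter vectors cross the network; the patient records that produced them stay on each client, which is the privacy property the paper relies on.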
Pub Date: 2024-11-29; eCollection Date: 2024-01-01; DOI: 10.7717/peerj-cs.2478
Shabana Ramzan, Basharat Ali, Ali Raza, Ibrar Hussain, Norma Latif Fitriyani, Yeonghyeon Gu, Muhammad Syafrudin
A thriving agricultural system is the cornerstone of an expanding economy in agricultural countries. Farmers' crop productivity is significantly reduced when they choose a crop without considering environmental factors and soil characteristics. Crop prediction enables farmers to select crops that maximize yield and earnings, and accurate crop prediction is a central concern of agricultural research. Recently, recommender systems (RS) have gained much attention and are being utilized in various fields such as e-commerce, music, health, text, and movies, and machine learning techniques can help predict the crop accurately. We propose an innovative artificial neural network (ANN)-based crop prediction system (CPS) to address this issue. The parameters considered during sensor-based soil data collection for this study are nitrogen, phosphorus, potassium, temperature, humidity, pH, rainfall, electrical conductivity, and soil texture. The Python programming language is used to design and validate the proposed system, and its accuracy and reliability are assessed using accuracy, precision, recall, and F1-score. We also optimized the proposed CPS by performing a hyperparameter optimization analysis of the applied learning methods. The proposed CPS model reaches 99% accuracy on both the real-time collected and state-of-the-art datasets. The experimental results show that our proposed solution assists farmers in selecting the right crop and producing at their best, increasing their profit.
{"title":"An innovative artificial neural network model for smart crop prediction using sensory network based soil data.","authors":"Shabana Ramzan, Basharat Ali, Ali Raza, Ibrar Hussain, Norma Latif Fitriyani, Yeonghyeon Gu, Muhammad Syafrudin","doi":"10.7717/peerj-cs.2478","DOIUrl":"10.7717/peerj-cs.2478","url":null,"abstract":"<p><p>A thriving agricultural system is the cornerstone of an expanding economy of agricultural countries. Farmers' crop productivity is significantly reduced when they choose the crop without considering environmental factors and soil characteristics. Crop prediction enables farmers to select crops that maximize crop yield and earnings. Accurate crop prediction is mainly concerned with agricultural research, which plays a major role in selecting accurate crops based on environmental factors and soil characteristics. Recently, recommender systems (RS) have gained much attention and are being utilized in various fields such as e-commerce, music, health, text, movies etc. Machine learning techniques can help predict the crop accurately. We proposed an innovative artificial neural network (ANN) based crop prediction system (CPS) to address the farmer's issue. The parameters considered during sensor-based soil data collection for this study are nitrogen, phosphorus, potassium, temperature, humidity, pH, rainfall, electrical conductivity, and soil texture. Python programming language is used to design and validate the proposed system. The accuracy and reliability of the proposed CPS are assessed by using accuracy, precision, recall, and F1-score. We also optimized the proposed CPS by performing a hyperparameter Optimization analysis of applied learning methods. The proposed CPS model accuracy for both real-time collected and state-of-the-art datasets is 99%. 
The experimental results show that our proposed solution assists farmers in selecting the accurate crop and producing at their best, increasing their profit.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2478"},"PeriodicalIF":3.5,"publicationDate":"2024-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623066/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
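The evaluation metrics named above (precision, recall, F1-score) have standard definitions; for a single positive class they reduce to the following sketch, with toy labels for illustration:

```python
def prf1(y_true, y_pred, positive=1):
    # Confusion-matrix counts for the chosen positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy ground truth vs. predictions for one crop class.
p, r, f = prf1([1, 0, 1, 1], [1, 0, 0, 1])
```

For a multi-class CPS these would be computed per crop class and then macro- or weighted-averaged.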
The efficiency of machine learning (ML) algorithms plays a critical role in their deployment across various applications, particularly those with resource constraints or real-time requirements. This article presents a comprehensive framework for evaluating ML algorithm efficiency by incorporating metrics such as training time, prediction time, memory usage, and computational resource utilization. The proposed methodology involves a multistep process: collecting raw metrics, normalizing them, applying the Analytic Hierarchy Process (AHP) to determine weights, and computing a composite efficiency score. We applied this framework to two distinct datasets: medical image data and agricultural crop prediction data. The results demonstrate that our approach effectively differentiates algorithm performance based on the specific demands of each application. For medical image analysis, the framework highlights strengths in robustness and adaptability, whereas for agricultural crop prediction, it emphasizes scalability and resource management. This study provides valuable insights into optimizing ML algorithms and offers a versatile tool for practitioners to assess and enhance algorithmic efficiency across diverse domains.
{"title":"A simplified approach for efficiency analysis of machine learning algorithms.","authors":"Muthuramalingam Sivakumar, Sudhaman Parthasarathy, Thiyagarajan Padmapriya","doi":"10.7717/peerj-cs.2418","DOIUrl":"10.7717/peerj-cs.2418","url":null,"abstract":"<p><p>The efficiency of machine learning (ML) algorithms plays a critical role in their deployment across various applications, particularly those with resource constraints or real-time requirements. This article presents a comprehensive framework for evaluating ML algorithm efficiency by incorporating metrics, such as training time, prediction time, memory usage, and computational resource utilization. The proposed methodology involves a multistep process: collecting raw metrics, normalizing them, applying the Analytic Hierarchy Process (AHP) to determine weights, and computing a composite efficiency score. We applied this framework to two distinct datasets: medical image data and agricultural crop prediction data. The results demonstrate that our approach effectively differentiates algorithm performance based on the specific demands of each application. For medical image analysis, the framework highlights strengths in robustness and adaptability, whereas for agricultural crop prediction, it emphasizes scalability and resource management. 
This study provides valuable insights into optimizing ML algorithms, and offers a versatile tool for practitioners to assess and enhance algorithmic efficiency across diverse domains.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2418"},"PeriodicalIF":3.5,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623197/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
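The multistep process described above (normalize raw metrics, derive AHP weights, compute a composite score) can be sketched end to end. This uses the geometric-mean approximation of the AHP principal eigenvector; the two metrics, the pairwise judgment (training time 3x as important as memory), and the raw numbers are all illustrative, not taken from the paper:

```python
import math

def minmax(xs, lower_is_better=False):
    # Min-max normalize to [0, 1]; flip so that 1 is always "best".
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return [1.0] * len(xs)
    norm = [(x - lo) / (hi - lo) for x in xs]
    return [1 - v for v in norm] if lower_is_better else norm

def ahp_weights(pairwise):
    # Geometric-mean-of-rows approximation of the principal
    # eigenvector of the AHP reciprocal comparison matrix.
    gm = [math.prod(row) ** (1 / len(row)) for row in pairwise]
    s = sum(gm)
    return [g / s for g in gm]

# Illustrative judgment: training time 3x as important as memory usage.
P = [[1, 3],
     [1/3, 1]]
w = ahp_weights(P)

train_time = [10.0, 40.0]    # seconds per algorithm (lower is better)
memory     = [200.0, 100.0]  # MB per algorithm (lower is better)
t = minmax(train_time, lower_is_better=True)
m = minmax(memory, lower_is_better=True)

# Composite efficiency score: weighted sum of normalized metrics.
scores = [w[0] * ti + w[1] * mi for ti, mi in zip(t, m)]
```

With these toy numbers the first algorithm wins: it is much faster, and speed carries three-quarters of the weight.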
Pub Date: 2024-11-28; eCollection Date: 2024-01-01; DOI: 10.7717/peerj-cs.2502
Krishnakumar Vaithianathan, Julian Benadit Pernabas, Latha Parthiban, Mamoon Rashid, Sultan S Alshamrani
Several deep learning networks have been developed to identify the complex atrophic patterns of Alzheimer's disease (AD). Among the activation functions used in deep neural networks, the rectified linear unit is the most common. Although these functions have been analyzed individually, group activations and their interpretations remain unexplored for neuroimaging analysis. In this study, a feature extraction technique based on normalized group activations is proposed that can be applied to both structural MRI and resting-state fMRI (rs-fMRI). The method has two phases: multi-trait condensed feature extraction networks and regional association networks. The first phase extracts features from various brain regions using different multi-layered convolutional networks. Then, multiple regional association networks with normalized group activations are trained for all regional pairs, and their outputs are fed to a classifier. To provide an unbiased estimate, an automated diagnosis system equipped with the proposed feature extraction is designed and analyzed on multi-cohort Alzheimer's Disease Neuroimaging Initiative (ADNI) data to predict multiple stages of AD. The system is also trained and tested on heterogeneous features such as non-transformed features, curvelets, wavelets, shearlets, textures, and scattering operators. Baseline scans of 185 rs-fMRIs and 1,442 MRIs from the ADNI-1, ADNI-2, and ADNI-GO datasets are used for validation. For MCI (mild cognitive impairment) classification, performance increases by 1-4%. These outcomes demonstrate the good discriminatory behaviour of the proposed features and their efficiency in classifying multiple stages of AD from rs-fMRI time-series and MRI data.
{"title":"Normalized group activations based feature extraction technique using heterogeneous data for Alzheimer's disease classification.","authors":"Krishnakumar Vaithianathan, Julian Benadit Pernabas, Latha Parthiban, Mamoon Rashid, Sultan S Alshamrani","doi":"10.7717/peerj-cs.2502","DOIUrl":"10.7717/peerj-cs.2502","url":null,"abstract":"<p><p>Several deep learning networks are developed to identify the complex atrophic patterns of Alzheimer's disease (AD). Among various activation functions used in deep neural networks, the rectifier linear unit is the most used one. Even though these functions are analyzed individually, group activations and their interpretations are still not explored for neuroimaging analysis. In this study, a unique feature extraction technique based on normalized group activations that can be applied to both structural MRI and resting-state-fMRI (rs-fMRI) is proposed. This method is split into two phases: multi-trait condensed feature extraction networks and regional association networks. The initial phase involves extracting features from various brain regions using different multi-layered convolutional networks. Then, multiple regional association networks with normalized group activations for all the regional pairs are trained and the output of these networks is given as input to a classifier. To provide an unbiased estimate, an automated diagnosis system equipped with the proposed feature extraction is designed and analyzed on multi-cohort Alzheimer's Disease Neuroimaging Initiative (ADNI) data to predict multi-stages of AD. This system is also trained/tested on heterogeneous features such as non-transformed features, curvelets, wavelets, shearlets, textures, and scattering operators. Baseline scans of 185 rs-fMRIs and 1442 MRIs from ADNI-1, ADNI-2, and ADNI-GO datasets are used for validation. For MCI (mild cognitive impairment) classifications, there is an increase of 1-4% in performance. 
The outcome demonstrates the good discriminatory behaviour of the proposed features and its efficiency on rs-fMRI time-series and MRI data to classify multiple stages of AD.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2502"},"PeriodicalIF":3.5,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11622987/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
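The paper's exact normalization scheme is not given in this abstract; one plausible reading of "normalized group activations", scaling a group of ReLU outputs to unit L2 norm so groups are comparable across regions, can be sketched as follows (an assumption for illustration, not the authors' stated formula):

```python
import math

def relu(x):
    return max(0.0, x)

def normalized_group_activation(group):
    # Apply ReLU to each unit, then scale the group so its L2 norm
    # is 1; all-zero groups are returned unchanged to avoid 0/0.
    acts = [relu(x) for x in group]
    norm = math.sqrt(sum(a * a for a in acts))
    return acts if norm == 0 else [a / norm for a in acts]

g = normalized_group_activation([3.0, -1.0, 4.0])
```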
Pub Date: 2024-11-28; eCollection Date: 2024-01-01; DOI: 10.7717/peerj-cs.2395
Sergei Koltcov, Anton Surkov, Olessia Koltsova, Vera Ignatenko
Recent advancements in large language models (LLMs) have opened new possibilities for developing conversational agents (CAs) in various subfields of mental healthcare. However, this progress is hindered by limited access to high-quality training data, often due to privacy concerns and high annotation costs for low-resource languages. A potential solution is to create human-AI annotation systems that utilize extensive public-domain user-to-user and user-to-professional discussions on social media. These discussions, however, are extremely noisy, necessitating the adaptation of LLMs for fully automatic cleaning and pre-classification to reduce human annotation effort. To date, research on LLM-based annotation in the mental health domain is extremely scarce. In this article, we explore the potential of zero-shot classification using four LLMs to select and pre-classify texts into topics representing psychiatric disorders, in order to facilitate the future development of CAs for disorder-specific counseling. We use 64,404 Russian-language texts from online discussion threads labeled with the seven most commonly discussed disorders: depression, neurosis, paranoia, anxiety disorder, bipolar disorder, obsessive-compulsive disorder, and borderline personality disorder. Our research shows that while preliminary data filtering using zero-shot classification slightly improves results, LLM fine-tuning makes a far larger contribution to quality. Both standard and natural language inference (NLI) modes of fine-tuning increase classification accuracy by more than three times compared to non-fine-tuned training with preliminarily filtered data.
Additionally, we demonstrate that lemmatization does not affect classification quality and that multilingual models using texts in their original language perform slightly better than English-only models using automatically translated texts. Finally, we introduce our dataset and model as the first openly available Russian-language resource for developing conversational agents in the domain of mental health counseling.
{"title":"Using large language models for extracting and pre-annotating texts on mental health from noisy data in a low-resource language.","authors":"Sergei Koltcov, Anton Surkov, Olessia Koltsova, Vera Ignatenko","doi":"10.7717/peerj-cs.2395","DOIUrl":"10.7717/peerj-cs.2395","url":null,"abstract":"<p><p>Recent advancements in large language models (LLMs) have opened new possibilities for developing conversational agents (CAs) in various subfields of mental healthcare. However, this progress is hindered by limited access to high-quality training data, often due to privacy concerns and high annotation costs for low-resource languages. A potential solution is to create human-AI annotation systems that utilize extensive public domain user-to-user and user-to-professional discussions on social media. These discussions, however, are extremely noisy, necessitating the adaptation of LLMs for fully automatic cleaning and pre-classification to reduce human annotation effort. To date, research on LLM-based annotation in the mental health domain is extremely scarce. In this article, we explore the potential of zero-shot classification using four LLMs to select and pre-classify texts into topics representing psychiatric disorders, in order to facilitate the future development of CAs for disorder-specific counseling. We use 64,404 Russian-language texts from online discussion threads labeled with seven most commonly discussed disorders: depression, neurosis, paranoia, anxiety disorder, bipolar disorder, obsessive-compulsive disorder, and borderline personality disorder. Our research shows that while preliminary data filtering using zero-shot technology slightly improves classification, LLM fine-tuning makes a far larger contribution to its quality. Both standard and natural language inference (NLI) modes of fine-tuning increase classification accuracy by more than three times compared to non-fine-tuned training with preliminarily filtered data. 
Although NLI fine-tuning achieves slightly higher accuracy (0.64) than the standard approach, it is six times slower, indicating a need for further experimentation with NLI hypothesis engineering. Additionally, we demonstrate that lemmatization does not affect classification quality and that multilingual models using texts in their original language perform slightly better than English-only models using automatically translated texts. Finally, we introduce our dataset and model as the first openly available Russian-language resource for developing conversational agents in the domain of mental health counseling.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2395"},"PeriodicalIF":3.5,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623104/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
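In the NLI mode mentioned above, each candidate disorder label is rewritten as a hypothesis and paired with the text as a premise; the NLI model scores entailment for each pair and the top-scoring label wins. A sketch of that hypothesis engineering, using the paper's seven labels (the template wording and example text are illustrative):

```python
DISORDERS = ["depression", "neurosis", "paranoia", "anxiety disorder",
             "bipolar disorder", "obsessive-compulsive disorder",
             "borderline personality disorder"]

def nli_pairs(text, labels=DISORDERS, template="This text discusses {}."):
    # One (premise, hypothesis) pair per candidate label; an NLI model
    # scores each pair and the label with the highest entailment
    # probability is taken as the prediction.
    return [(text, template.format(lab)) for lab in labels]

pairs = nli_pairs("I can't sleep and feel hopeless every day.")
```

The "hypothesis engineering" the authors point to is exactly the choice of `template` here: different phrasings of the hypothesis can shift NLI accuracy noticeably.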
Pub Date: 2024-11-28; eCollection Date: 2024-01-01; DOI: 10.7717/peerj-cs.2536
Duy Ho Vo Hoang, Huy Vo Quoc, Bui Thanh Hung
Extracting information from scanned images is a critical task with far-reaching practical implications. Traditional methods often fall short by inadequately leveraging both image and text features, leading to less accurate and efficient outcomes. In this study, we introduce ConBGAT, a cutting-edge model that seamlessly integrates convolutional neural networks (CNNs), Transformers, and graph attention networks to address these shortcomings. Our approach constructs detailed graphs from text regions within images, utilizing advanced Optical Character Recognition to accurately detect and interpret characters. By combining features extracted by CNNs for images with Distilled Bidirectional Encoder Representations from Transformers (DistilBERT) for text, our model achieves a comprehensive and efficient data representation. Rigorous testing on real-world datasets shows that ConBGAT significantly outperforms existing methods, demonstrating its superior capability across multiple evaluation metrics. This advancement not only enhances accuracy but also sets a new benchmark for information extraction in scanned images.
{"title":"ConBGAT: a novel model combining convolutional neural networks, transformer and graph attention network for information extraction from scanned image.","authors":"Duy Ho Vo Hoang, Huy Vo Quoc, Bui Thanh Hung","doi":"10.7717/peerj-cs.2536","DOIUrl":"10.7717/peerj-cs.2536","url":null,"abstract":"<p><p>Extracting information from scanned images is a critical task with far-reaching practical implications. Traditional methods often fall short by inadequately leveraging both image and text features, leading to less accurate and efficient outcomes. In this study, we introduce ConBGAT, a cutting-edge model that seamlessly integrates convolutional neural networks (CNNs), Transformers, and graph attention networks to address these shortcomings. Our approach constructs detailed graphs from text regions within images, utilizing advanced Optical Character Recognition to accurately detect and interpret characters. By combining superior extracted features of CNNs for image and Distilled Bidirectional Encoder Representations from Transformers (DistilBERT) for text, our model achieves a comprehensive and efficient data representation. Rigorous testing on real-world datasets shows that ConBGAT significantly outperforms existing methods, demonstrating its superior capability across multiple evaluation metrics. 
This advancement not only enhances accuracy but also sets a new benchmark for information extraction in scanned image.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2536"},"PeriodicalIF":3.5,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11622835/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
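ConBGAT's exact graph construction is not detailed in this abstract; a common scheme for building graphs from OCR text regions, one node per detected box with edges between regions whose centers lie within a distance threshold, can be sketched as follows (the boxes and radius are illustrative):

```python
import math

def centers(boxes):
    # boxes: (x, y, w, h) per OCR-detected text region.
    return [(x + w / 2, y + h / 2) for x, y, w, h in boxes]

def build_edges(boxes, radius):
    # Undirected edge between any two regions whose centers are within
    # `radius`; these edges are what a graph attention layer would
    # aggregate over when combining image and text node features.
    c = centers(boxes)
    edges = []
    for i in range(len(c)):
        for j in range(i + 1, len(c)):
            if math.dist(c[i], c[j]) <= radius:
                edges.append((i, j))
    return edges

# Three hypothetical text boxes: two on the same line, one far below.
boxes = [(0, 0, 10, 4), (12, 0, 10, 4), (0, 50, 10, 4)]
edges = build_edges(boxes, radius=15)
```

Spatial proximity is only one choice of edge criterion; reading-order or same-line heuristics are common alternatives.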
Traditional methods for detecting seed germination rates often involve lengthy experiments that result in damaged seeds. This study selected the Zheng Dan-958 maize variety to predict germination rates using multi-source information fusion and a random forest (RF) algorithm. Images of the seeds and internal cracks were captured with a digital camera, and the dielectric constant of the seeds was measured using a flat capacitor and converted into voltage readings. Features such as color, shape, texture, crack count, and normalized voltage were used to form feature vectors. Various prediction algorithms, including random forest (RF), radial basis function (RBF) neural networks, support vector machine (SVM), and extreme learning machine (ELM), were developed and tested against standard germination experiments. The RF model stood out, with a training time of 5.18 s and the highest accuracy of 92.88%, along with a mean absolute error (MAE) of 0.913 and a root mean square error (RMSE) of 1.163. The study concluded that the RF model, combined with multi-source information fusion, offers a feasible and nondestructive method for quickly and accurately predicting maize seed germination rates.
"Optimizing maize germination forecasts with random forest and data fusion techniques." Lili Wu, Yuqing Xing, Kaiwen Yang, Wenqiang Li, Guangyue Ren, Debang Zhang, Huiping Fan. PeerJ Computer Science 10:e2468. DOI: 10.7717/peerj-cs.2468
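The multi-source fusion step described above, combining image-derived features with the normalized voltage into one feature vector per seed, can be sketched in a few lines. The field names and values below are illustrative assumptions, not data from the paper:

```python
import numpy as np

# Hypothetical per-seed measurements (names and values are illustrative):
# color (mean RGB), shape (area, eccentricity), texture (contrast),
# crack count, and the capacitor-derived voltage reading in volts.
seeds = [
    {"color": [112.0, 98.0, 45.0], "shape": [5.1, 0.62], "texture": [0.34], "cracks": 2, "voltage": 1.85},
    {"color": [120.0, 101.0, 50.0], "shape": [4.8, 0.58], "texture": [0.29], "cracks": 0, "voltage": 2.10},
    {"color": [99.0, 90.0, 41.0], "shape": [5.5, 0.70], "texture": [0.41], "cracks": 4, "voltage": 1.60},
]

def fuse_features(records):
    """Concatenate image-derived features with the voltage reading into one
    vector per seed, then min-max normalize each column to [0, 1]."""
    raw = np.array([r["color"] + r["shape"] + r["texture"] + [r["cracks"], r["voltage"]]
                    for r in records])
    lo, hi = raw.min(axis=0), raw.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return (raw - lo) / span

X = fuse_features(seeds)
print(X.shape)  # one 8-dimensional fused vector per seed: (3, 8)
```

The fused matrix `X` would then be the input to whichever regressor is trained (RF in the paper's best-performing configuration).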
Pub Date : 2024-11-28eCollection Date: 2024-01-01DOI: 10.7717/peerj-cs.2569
Yongyu Luo, Zhongqiang Luo
The purpose of infrared and visible image fusion is to obtain an image that includes both the infrared target and visible information. However, many existing infrared and visible image fusion methods prioritize the fusion effect through complex designs while ignoring the influence of attention mechanisms on deep features, so the fused image lacks visible-light texture information. To solve these problems, an infrared and visible image fusion method based on dense gradient attention residuals is proposed in this article. First, squeeze-and-excitation networks are integrated into the gradient convolutional dense block, and a new gradient attention residual dense block is designed to enhance the network's ability to extract important information. To retain more original image information, a feature gradient attention module is introduced to improve the retention of detail. In the fusion layer, an adaptive weighted energy attention network based on an energy fusion strategy further preserves infrared and visible details. In experimental comparisons on the TNO dataset, our method performs well on several evaluation indicators. Specifically, on average gradient (AG), information entropy (EN), spatial frequency (SF), mutual information (MI), and standard deviation (SD), our method reached 6.90, 7.46, 17.30, 2.62, and 54.99, respectively, improvements of 37.31%, 6.55%, 32.01%, 8.16%, and 10.01% over five commonly used comparison methods. These results demonstrate the effectiveness and superiority of our method.
"Infrared and visible image fusion algorithm based on gradient attention residuals dense block." PeerJ Computer Science 10:e2569.
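The evaluation indexes above have standard definitions that can be computed directly from the fused image. The implementations below follow the common textbook formulas for average gradient (AG), information entropy (EN), and spatial frequency (SF); they are a sketch, not the authors' exact evaluation code:

```python
import numpy as np

def average_gradient(img):
    """AG: mean magnitude of local gray-level change (larger = sharper).
    Uses forward differences on the interior of the image."""
    img = img.astype(float)
    dx = np.diff(img, axis=1)[:-1, :]
    dy = np.diff(img, axis=0)[:, :-1]
    return float(np.mean(np.sqrt((dx**2 + dy**2) / 2.0)))

def entropy(img, levels=256):
    """EN: Shannon entropy of the gray-level histogram, in bits."""
    hist = np.bincount(img.astype(np.uint8).ravel(), minlength=levels)
    p = hist[hist > 0] / hist.sum()
    return float(-np.sum(p * np.log2(p)) + 0.0)

def spatial_frequency(img):
    """SF: combined row-wise and column-wise gray-level variation."""
    img = img.astype(float)
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))  # row frequency
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))  # column frequency
    return float(np.sqrt(rf**2 + cf**2))

# A flat image scores zero on all three; adding structure raises the scores.
flat = np.full((32, 32), 128, dtype=np.uint8)
checker = (np.indices((32, 32)).sum(axis=0) % 2 * 255).astype(np.uint8)
print(average_gradient(checker))  # 255.0: every neighbor differs by 255
```

Higher AG, EN, and SF on the fused image indicate that more gradient, gray-level, and frequency content from the source images survived the fusion.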
Pub Date : 2024-11-27eCollection Date: 2024-01-01DOI: 10.7717/peerj-cs.2487
Kholoud Althobaiti, Nawal Alsufyani
The increased sophistication and frequency of phishing attacks that target organizations necessitate a comprehensive cyber security strategy that handles phishing from several perspectives, such as phishing detection and the testing of users' awareness. Through a systematic review of 163 research articles, we analyzed organization-oriented phishing research to categorize existing work and identify future opportunities. We find that a notable number of studies concentrate on phishing detection and awareness, while other layers of protection, such as the mitigation of phishing, are overlooked. In addition, we draw attention to shortcomings and challenges. We believe that this article will provide opportunities for future research on phishing in organizations.
"A review of organization-oriented phishing research." PeerJ Computer Science 10:e2487.