Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2417
Ming Xu, Jinwei Cui, Xiaoyu Ma, Zhiyi Zou, Zhisheng Xin, Muhammad Bilal
Graphic design, as a product of the burgeoning new media era, has seen its users' requirements for images continuously evolve. However, external factors such as light and noise often cause graphic design images to become distorted during acquisition. To enhance the definition of these images, this paper introduces a novel image enhancement model based on visual features. Initially, a histogram equalization (HE) algorithm is applied to enhance the graphic design images. Subsequently, image feature extraction is performed using a dual-flow network comprising convolutional neural network (CNN) and Transformer architectures. The CNN employs a residual dense block (RDB) to embed spatial local structure information with varying receptive fields. An improved attention mechanism module, attention feature fusion (AFF), is then introduced to integrate the image features extracted from the dual-flow network. Finally, through image perception quality guided adversarial learning, the model adjusts the initial enhanced image's color and recovers more details. Experimental results demonstrate that the proposed model achieves enhancement effects exceeding 90% on two large image datasets, a 5%-10% improvement over other models. Furthermore, the algorithm exhibits superior performance on the peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) image quality evaluation metrics. Our findings indicate that the fusion model significantly enhances image quality, thereby advancing the field of graphic design and showcasing its potential in cultural and creative product design.
{"title":"Image enhancement with art design: a visual feature approach with a CNN-transformer fusion model.","authors":"Ming Xu, Jinwei Cui, Xiaoyu Ma, Zhiyi Zou, Zhisheng Xin, Muhammad Bilal","doi":"10.7717/peerj-cs.2417","DOIUrl":"10.7717/peerj-cs.2417","url":null,"abstract":"<p><p>Graphic design, as a product of the burgeoning new media era, has seen its users' requirements for images continuously evolve. However, external factors such as light and noise often cause graphic design images to become distorted during acquisition. To enhance the definition of these images, this paper introduces a novel image enhancement model based on visual features. Initially, a histogram equalization (HE) algorithm is applied to enhance the graphic design images. Subsequently, image feature extraction is performed using a dual-flow network comprising convolutional neural network (CNN) and Transformer architectures. The CNN employs a residual dense block (RDB) to embed spatial local structure information with varying receptive fields. An improved attention mechanism module, attention feature fusion (AFF), is then introduced to integrate the image features extracted from the dual-flow network. Finally, through image perception quality guided adversarial learning, the model adjusts the initial enhanced image's color and recovers more details. Experimental results demonstrate that the proposed algorithm model achieves enhancement effects exceeding 90% on two large image datasets, which represents a 5%-10% improvement over other models. Furthermore, the algorithm exhibits superior performance in terms of peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) image quality evaluation metrics. Our findings indicate that the fusion model significantly enhances image quality, thereby advancing the field of graphic design and showcasing its potential in cultural and creative product design.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2417"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623052/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2479
Xiuzhe Wang
Anomalies are the existential abnormalities in data, the identification of which is known as anomaly detection. The absence of timely detection of anomalies may affect the key processes of decision-making, fraud detection, and automated classification. Most existing anomaly detection models rely on traditional tokenization and are computationally costly, particularly when outliers must be extracted from a large script. This research work proposes an unsupervised, all-MiniLM-L6-v2-based system for the detection of outliers. The method makes use of centroid embeddings to extract outliers in high-variety, large-volume data. To avoid mistakenly treating novelty as an outlier, a Minimum Covariance Determinant (MCD) based approach is followed to assess the novelty of the input script. The proposed method is implemented in a Python project, App. for Anomalies Detection (AAD). The system is evaluated on two unrelated datasets: the 20 newsgroups text dataset and the SMS spam collection dataset. The robust accuracy (94%) and F1 score (0.95) revealed that the proposed method can effectively trace anomalies in a comparatively large script. The process is applicable to extracting meaning from textual data, particularly in the domains of human resource management and security.
{"title":"EAD: effortless anomalies detection, a deep learning based approach for detecting outliers in English textual data.","authors":"Xiuzhe Wang","doi":"10.7717/peerj-cs.2479","DOIUrl":"10.7717/peerj-cs.2479","url":null,"abstract":"<p><p>Anomalies are the existential abnormalities in data, the identification of which is known as anomaly detection. The absence of timely detection of anomalies may affect the key processes of decision-making, fraud detection, and automated classification. Most of the existing models of anomaly detection utilize the traditional way of tokenizing and are computationally costlier, mainly if the outliers are to be extracted from a large script. This research work intends to propose an unsupervised, all-MiniLM-L6-v2-based system for the detection of outliers. The method makes use of centroid embeddings to extract outliers in high-variety, large-volume data. To avoid mistakenly treating novelty as an outlier, the Minimum Covariance Determinant (MCD) based approach is followed to count the novelty of the input script. The proposed method is implemented in a Python project, App. for Anomalies Detection (AAD). The system is evaluated by two non-related datasets-the 20 newsgroups text dataset and the SMS spam collection dataset. The robust accuracy (94%) and F1 score (0.95) revealed that the proposed method could effectively trace anomalies in a comparatively large script. The process is applicable in extracting meanings from textual data, particularly in the domains of human resource management and security.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2479"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623099/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2467
Yogesh N, Purohit Shrinivasacharya, Nagaraj Naik
Chronic kidney disease (CKD) involves numerous variables, but only a few significantly impact the classification task. The statistically equivalent signature (SES) method, inspired by constraint-based learning of Bayesian networks, is employed to identify essential features in CKD. Unlike conventional feature selection methods, which typically focus on a single set of features with the highest predictive potential, the SES method can identify multiple predictive feature subsets with similar performance. However, most feature selection (FS) classifiers perform suboptimally with strongly correlated data. The FS approach faces challenges in identifying crucial features and selecting the most effective classifier, particularly in high-dimensional data. This study proposes using the Least Absolute Shrinkage and Selection Operator (LASSO) in conjunction with the SES method for feature selection in CKD identification. Following this, an ensemble deep-learning model combining long short-term memory (LSTM) and gated recurrent unit (GRU) networks is proposed for CKD classification. The features selected by the hybrid feature selection method are fed into the ensemble deep-learning model. The model's performance is evaluated using accuracy, precision, recall, and F1 score metrics. The experimental results are compared with individual classifiers, including decision tree (DT), Random Forest (RF), logistic regression (LR), and support vector machine (SVM). The findings indicate a 2% improvement in classification accuracy when using the proposed hybrid feature selection method combined with the LSTM and GRU ensemble deep-learning model. Further analysis reveals that certain features, such as HEMO, POT, bacteria, and coronary artery disease, contribute minimally to the classification task. Future research could explore additional feature selection methods, including dynamic feature selection that adapts to evolving datasets and incorporates clinical knowledge to enhance CKD classification accuracy further.
{"title":"Novel statistically equivalent signature-based hybrid feature selection and ensemble deep learning LSTM and GRU for chronic kidney disease classification.","authors":"Yogesh N, Purohit Shrinivasacharya, Nagaraj Naik","doi":"10.7717/peerj-cs.2467","DOIUrl":"10.7717/peerj-cs.2467","url":null,"abstract":"<p><p>Chronic kidney disease (CKD) involves numerous variables, but only a few significantly impact the classification task. The statistically equivalent signature (SES) method, inspired by constraint-based learning of Bayesian networks, is employed to identify essential features in CKD. Unlike conventional feature selection methods, which typically focus on a single set of features with the highest predictive potential, the SES method can identify multiple predictive feature subsets with similar performance. However, most feature selection (FS) classifiers perform suboptimally with strongly correlated data. The FS approach faces challenges in identifying crucial features and selecting the most effective classifier, particularly in high-dimensional data. This study proposes using the Least Absolute Shrinkage and Selection Operator (LASSO) in conjunction with the SES method for feature selection in CKD identification. Following this, an ensemble deep-learning model combining long short-term memory (LSTM) and gated recurrent unit (GRU) networks is proposed for CKD classification. The features selected by the hybrid feature selection method are fed into the ensemble deep-learning model. The model's performance is evaluated using accuracy, precision, recall, and F1 score metrics. The experimental results are compared with individual classifiers, including decision tree (DT), Random Forest (RF), logistic regression (LR), and support vector machine (SVM). The findings indicate a 2% improvement in classification accuracy when using the proposed hybrid feature selection method combined with the LSTM and GRU ensemble deep-learning model. Further analysis reveals that certain features, such as HEMO, POT, bacteria, and coronary artery disease, contribute minimally to the classification task. Future research could explore additional feature selection methods, including dynamic feature selection that adapts to evolving datasets and incorporates clinical knowledge to enhance CKD classification accuracy further.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2467"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11639220/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142830834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2406
Ahmet Topal, Burcu Tunga, Erfan Babaee Tirkolaee
Plant diseases threaten agricultural sustainability by reducing crop yields. Rapid and accurate disease identification is crucial for effective management. Recent advancements in artificial intelligence (AI) have facilitated the development of automated systems for disease detection. This study focuses on enhancing the classification of diseases and estimating their severity in coffee leaf images. To do so, we propose a novel approach as the preprocessing step for the classification in which enhanced multivariance product representation (EMPR) is used to decompose the considered image into components, a new image is constructed using some of those components, and the contrast of the new image is enhanced by applying high-dimensional model representation (HDMR) to highlight the diseased parts of the leaves. Popular convolutional neural network (CNN) architectures, including AlexNet, VGG16, and ResNet50, are evaluated. Results show that VGG16 achieves the highest classification accuracy of approximately 96%, while all models perform well in predicting disease severity levels, with accuracies exceeding 85%. Notably, the ResNet50 model achieves accuracy levels surpassing 90%. This research contributes to the advancement of automated crop health management systems.
{"title":"DeepEMPR: coffee leaf disease detection with deep learning and enhanced multivariance product representation.","authors":"Ahmet Topal, Burcu Tunga, Erfan Babaee Tirkolaee","doi":"10.7717/peerj-cs.2406","DOIUrl":"10.7717/peerj-cs.2406","url":null,"abstract":"<p><p>Plant diseases threaten agricultural sustainability by reducing crop yields. Rapid and accurate disease identification is crucial for effective management. Recent advancements in artificial intelligence (AI) have facilitated the development of automated systems for disease detection. This study focuses on enhancing the classification of diseases and estimating their severity in coffee leaf images. To do so, we propose a novel approach as the preprocessing step for the classification in which enhanced multivariance product representation (EMPR) is used to decompose the considered image into components, a new image is constructed using some of those components, and the contrast of the new image is enhanced by applying high-dimensional model representation (HDMR) to highlight the diseased parts of the leaves. Popular convolutional neural network (CNN) architectures, including AlexNet, VGG16, and ResNet50, are evaluated. Results show that VGG16 achieves the highest classification accuracy of approximately 96%, while all models perform well in predicting disease severity levels, with accuracies exceeding 85%. Notably, the ResNet50 model achieves accuracy levels surpassing 90%. This research contributes to the advancement of automated crop health management systems.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2406"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623072/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2422
Atif Mahmood, Zati Hakim Azizul, Mohammed Zakariah, Samir Brahim Belhaouari, Ayman Altameem, Roziana Ramli, Abdulaziz S Almazyad, Miss Laiha Mat Kiah, Saaidal Razalli Azzuhri
Federated learning (FL) is a popular method in which edge devices work together to train machine learning models. This study introduces an efficient network for analyzing healthcare records that uses VPN technology and applies a federated learning approach over a wireless backhaul network. The study compares the effectiveness of different wireless backhaul channels, including terahertz (THz), E/V band (mmWave), and microwave. We closely examined the proposed FL network, which uses VPN technology over a wireless backhaul network, compared it with the standard method, and found that using the FedAvg algorithm with THz communication gave the best accuracy. Convergence time improved substantially, from 55 seconds to 38 seconds, underscoring how a faster communication link improves FL network performance. Furthermore, a three-step plan was executed to boost security, adopting a multi-layered method to safeguard the FL network and its confidential data. The first step integrates a private network into the existing telecom infrastructure, establishing an initial layer of security. To enhance security further, licensed frequency channels are introduced, providing an extra layer of protection. The highest level of security is achieved by combining a private network with licensed frequency channels, complemented by an additional layer of VPN-based protection. This comprehensive strategy ensures the application of strong security protocols.
{"title":"Implementing federated learning over VPN-based wireless backhaul networks for healthcare systems.","authors":"Atif Mahmood, Zati Hakim Azizul, Mohammed Zakariah, Samir Brahim Belhaouari, Ayman Altameem, Roziana Ramli, Abdulaziz S Almazyad, Miss Laiha Mat Kiah, Saaidal Razalli Azzuhri","doi":"10.7717/peerj-cs.2422","DOIUrl":"10.7717/peerj-cs.2422","url":null,"abstract":"<p><p>Federated learning (FL) is a popular method where edge devices work together to train machine learning models. This study introduces an efficient network for analyzing healthcare records. It uses VPN technology and applies a federated learning approach over a wireless backhaul network. The study compares different wireless backhaul channels, including terahertz (THz), E/V band (mmWave), and microwave, for their effectiveness. We looked closely at a suggested FL network that uses VPN technology over awireless backhaul network. We compared it with the standard method and found that using the FedAvg algorithm with Terahertz (THz) for communication gave the best accuracy. The time it took to reach a conclusion improved a lot, going from 55 seconds to an impressive 38 seconds. This emphasizes how having a faster communication link makes FL networks work much better. Furthermore, a three-step plan was executed to boost security, adopting a multi-layered method to safeguard the FL network and its confidential data. The first step involves integrating a private network into the current telecom infrastructure, establishing an initial layer of security. To enhance security further, licensed frequency channels are introduced, providing an extra layer of protection. The highest level of security is achieved by combining a private network with licensed frequency channels, complemented by an additional layer of security through VPN-based measures. This comprehensive strategy ensures the application of strong security protocols.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2422"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11622844/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The rise of the Internet of Things (IoT) and Industry 2.0 has spurred a growing need for extensive data computing, and Spark has emerged as a promising Big Data platform owing to its distributed in-memory computing capabilities. However, practical heavy workloads often lead to memory bottlenecks in the Spark platform. This results in resilient distributed dataset (RDD) eviction and, in extreme cases, severe memory contention, causing a significant degradation in Spark computational efficiency. To tackle this issue, we propose an adaptive memory reservation (AMR) strategy in this article, specifically designed for heavy workloads in the Spark environment. Specifically, we model optimal task parallelism by minimizing the disparity between the number of tasks completed without blocking and the number completed in regular rounds. Optimal memory for task parallelism is determined to establish an efficient execution memory space for computational parallelism. Subsequently, through adaptive execution memory reservation and dynamic adjustments, such as compression or expansion based on task progress, the strategy ensures dynamic task parallelism in the Spark parallel computing process. Considering the cost of RDD cache location and real-time memory space usage, we select suitable storage locations for different RDD types to alleviate execution memory pressure. Finally, we conduct extensive laboratory experiments to validate the effectiveness of AMR. Results indicate that, compared to existing memory management solutions, AMR reduces execution time by approximately 46.8%.
{"title":"Adaptive memory reservation strategy for heavy workloads in the Spark environment.","authors":"Bohan Li, Xin He, Junyang Yu, Guanghui Wang, Yixin Song, Shunjie Pan, Hangyu Gu","doi":"10.7717/peerj-cs.2460","DOIUrl":"10.7717/peerj-cs.2460","url":null,"abstract":"<p><p>The rise of the Internet of Things (IoT) and Industry 2.0 has spurred a growing need for extensive data computing, and Spark emerged as a promising Big Data platform, attributed to its distributed in-memory computing capabilities. However, practical heavy workloads often lead to memory bottleneck issues in the Spark platform. This results in resilient distributed datasets (RDD) eviction and, in extreme cases, violent memory contentions, causing a significant degradation in Spark computational efficiency. To tackle this issue, we propose an adaptive memory reservation (AMR) strategy in this article, specifically designed for heavy workloads in the Spark environment. Specifically, we model optimal task parallelism by minimizing the disparity between the number of tasks completed without blocking and the number completed in regular rounds. Optimal memory for task parallelism is determined to establish an efficient execution memory space for computational parallelism. Subsequently, through adaptive execution memory reservation and dynamic adjustments, such as compression or expansion based on task progress, the strategy ensures dynamic task parallelism in the Spark parallel computing process. Considering the cost of RDD cache location and real-time memory space usage, we select suitable storage locations for different RDD types to alleviate execution memory pressure. Finally, we conduct extensive laboratory experiments to validate the effectiveness of AMR. Results indicate that, compared to existing memory management solutions, AMR reduces the execution time by approximately 46.8%.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2460"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11639302/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142830724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2441
Halim Wildan Awalurahman, Indra Budi
Background: Multiple-choice questions (MCQs) are one of the most used assessment formats. However, creating MCQs is a challenging task, particularly when formulating the distractor. Numerous studies have proposed automatic distractor generation. However, there has been no literature review to summarize and present the current state of research in this field. This study aims to perform a systematic literature review to identify trends and the state of the art of automatic distractor generation studies.
Methodology: We conducted a systematic literature review following the Kitchenham framework. The relevant literature was retrieved from the ACM Digital Library, IEEE Xplore, ScienceDirect, and Scopus databases.
Results: A total of 60 relevant studies from 2009 to 2024 were identified and extracted to answer three research questions regarding the data sources, methods, types of questions, evaluation, languages, and domains used in the automatic distractor generation research. The results of the study indicated that automatic distractor generation has been growing with improvement and expansion in many aspects. Furthermore, trends and the state of the art in this topic were observed.
Conclusions: Nevertheless, we identified potential research gaps, including the need to explore further data sources, methods, languages, and domains. This study can serve as a reference for future studies proposing research within the field of automatic distractor generation.
{"title":"Automatic distractor generation in multiple-choice questions: a systematic literature review.","authors":"Halim Wildan Awalurahman, Indra Budi","doi":"10.7717/peerj-cs.2441","DOIUrl":"10.7717/peerj-cs.2441","url":null,"abstract":"<p><strong>Background: </strong>Multiple-choice questions (MCQs) are one of the most used assessment formats. However, creating MCQs is a challenging task, particularly when formulating the distractor. Numerous studies have proposed automatic distractor generation. However, there has been no literature review to summarize and present the current state of research in this field. This study aims to perform a systematic literature review to identify trends and the state of the art of automatic distractor generation studies.</p><p><strong>Methodology: </strong>We conducted a systematic literature following the Kitchenham framework. The relevant literature was retrieved from the ACM Digital Library, IEEE Xplore, Science Direct, and Scopus databases.</p><p><strong>Results: </strong>A total of 60 relevant studies from 2009 to 2024 were identified and extracted to answer three research questions regarding the data sources, methods, types of questions, evaluation, languages, and domains used in the automatic distractor generation research. The results of the study indicated that automatic distractor generation has been growing with improvement and expansion in many aspects. Furthermore, trends and the state of the art in this topic were observed.</p><p><strong>Conclusions: </strong>Nevertheless, we identified potential research gaps, including the need to explore further data sources, methods, languages, and domains. This study can serve as a reference for future studies proposing research within the field of automatic distractor generation.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2441"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623049/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2354
Yu Zhu, Shifan Xie
The creation of 3D animation increasingly prioritizes the enhancement of character effects, narrative depth, and audience engagement to address the growing demands for visual stimulation, cultural enrichment, and interactive experiences. The advancement of virtual reality (VR) animation is anticipated to require sustained collaboration among researchers, animation experts, and hardware developers over an extended period to achieve full maturity. This article explores the use of Virtual Reality Modeling Language (VRML) in generating 3D stereoscopic forms and environments, applying texture mapping, optimizing lighting effects, and establishing interactive user responses, thereby enriching the 3D animation experience. VRML's functionality is further expanded through the integration of script programs in languages such as Java, JavaScript, and VRML Script via the Script node. The implementation of fuzzy model recognition within 3D animation simulations enhances the identification of textual, musical, and linguistic elements, resulting in improved frame rates. This study also analyzes the real-time correlation between the number of polygons and frame rates in a virtual museum animation scene. The findings demonstrate that the frame rate of the 3D animation within this virtual setting consistently exceeds 40 frames per second, thereby ensuring robust real-time performance, preserving the quality of 3D models, and optimizing rendering speed and visual effects without affecting the system's responsiveness to additional functions.
{"title":"Simulation methods realized by virtual reality modeling language for 3D animation considering fuzzy model recognition.","authors":"Yu Zhu, Shifan Xie","doi":"10.7717/peerj-cs.2354","DOIUrl":"10.7717/peerj-cs.2354","url":null,"abstract":"<p><p>The creation of 3D animation increasingly prioritizes the enhancement of character effects, narrative depth, and audience engagement to address the growing demands for visual stimulation, cultural enrichment, and interactive experiences. The advancement of virtual reality (VR) animation is anticipated to require sustained collaboration among researchers, animation experts, and hardware developers over an extended period to achieve full maturity. This article explores the use of Virtual Reality Modeling Language (VRML) in generating 3D stereoscopic forms and environments, applying texture mapping, optimizing lighting effects, and establishing interactive user responses, thereby enriching the 3D animation experience. VRML's functionality is further expanded through the integration of script programs in languages such as Java, JavaScript, and VRML Script <i>via</i> the Script node. The implementation of fuzzy model recognition within 3D animation simulations enhances the identification of textual, musical, and linguistic elements, resulting in improved frame rates. This study also analyzes the real-time correlation between the number of polygons and frame rates in a virtual museum animation scene. The findings demonstrate that the frame rate of the 3D animation within this virtual setting consistently exceeds 40 frames per second, thereby ensuring robust real-time performance, preserving the quality of 3D models, and optimizing rendering speed and visual effects without affecting the system's responsiveness to additional functions.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2354"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623235/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2485
Chao Li, Rugui Liu, Zhe Quan, Pengpeng Hu, Jun Sun
In recent years, deep learning models have become the predominant methods for computer vision tasks, but the large computation and storage requirements of many models make them challenging to deploy on devices with limited resources. Knowledge distillation (KD) is a widely used approach for model compression. However, when applied to object detection problems, existing KD methods either directly apply the feature map or simply separate the foreground from the background using a binary mask, aligning attention between the teacher and student models. Unfortunately, these methods either completely overlook noise or fail to eliminate it thoroughly, resulting in unsatisfactory accuracy for student models. To address this issue, we propose a foreground separation distillation (FSD) method in this paper. The FSD method enables student models to distinguish between foreground and background using Gaussian heatmaps, reducing irrelevant information in the learning process. Additionally, FSD extracts the channel feature by converting the spatial feature maps into probabilistic forms to fully utilize the knowledge in each channel of a well-trained teacher. Experimental results demonstrate that the YOLOX detector enhanced with our distillation method achieved superior performance on both the fall detection and the VOC2007 datasets. For example, YOLOX with FSD achieved 73.1% mean average precision (mAP) on the Fall Detection dataset, which is 1.6% higher than the baseline. The code of FSD is accessible via https://doi.org/10.5281/zenodo.13829676.
{"title":"Foreground separation knowledge distillation for object detection.","authors":"Chao Li, Rugui Liu, Zhe Quan, Pengpeng Hu, Jun Sun","doi":"10.7717/peerj-cs.2485","DOIUrl":"10.7717/peerj-cs.2485","url":null,"abstract":"<p><p>In recent years, deep learning models have become predominant methods for computer vision tasks, but the large computation and storage requirements of many models make them challenging to deploy on devices with limited resources. Knowledge distillation (KD) is a widely used approach for model compression. However, when applied in the object detection problems, the existing KD methods either directly applies the feature map or simply separate the foreground from the background by using a binary mask, aligning the attention between the teacher and the student models. Unfortunately, these methods either completely overlook or fail to thoroughly eliminate noise, resulting in unsatisfactory model accuracy for student models. To address this issue, we propose a foreground separation distillation (FSD) method in this paper. The FSD method enables student models to distinguish between foreground and background using Gaussian heatmaps, reducing irrelevant information in the learning process. Additionally, FSD also extracts the channel feature by converting the spatial feature maps into probabilistic forms to fully utilize the knowledge in each channel of a well-trained teacher. Experimental results demonstrate that the YOLOX detector enhanced with our distillation method achieved superior performance on both the fall detection and the VOC2007 datasets. For example, YOLOX with FSD achieved 73.1% mean average precision (mAP) on the Fall Detection dataset, which is 1.6% higher than the baseline. The code of FSD is accessible via https://doi.org/10.5281/zenodo.13829676.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2485"},"PeriodicalIF":3.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623026/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-12 | eCollection Date: 2024-01-01 | DOI: 10.7717/peerj-cs.2469
Quan Cheng, Jingyi Cheng, Jian Chen, Shaojun Liu
In the context of high-quality economic development, technological innovation has emerged as a fundamental driver of socio-economic progress. The consequent proliferation of science and technology news, which acts as a vital medium for disseminating technological advancements and policy changes, has attracted considerable attention from technology management agencies and innovation organizations. Nevertheless, online science and technology news has historically exhibited characteristics such as limited scale, disorderliness, and multi-dimensionality, which makes deeper use of it highly inconvenient. While single-label classification techniques can effectively categorize textual information, they face challenges in science and technology news classification due to the lack of a hierarchical knowledge framework and an insufficient capacity to reveal knowledge-integration features. This study proposes a hierarchical multi-label classification model for science and technology news, enhanced by heterogeneous graph semantics. The model captures multi-dimensional themes and hierarchical structural features within science and technology news through a hierarchical transmission module. It integrates graph convolutional networks to extract node information and hierarchical relationships from heterogeneous graphs, while also incorporating prior knowledge from domain knowledge graphs to address data scarcity. This approach enhances the understanding and classification of the semantics of science and technology news. Experimental results demonstrate that the model achieves precision, recall, and F1 scores of 84.21%, 88.89%, and 86.49%, respectively, significantly surpassing baseline models. This research presents an innovative solution for hierarchical multi-label classification tasks, demonstrating significant application potential in addressing data scarcity and complex thematic classification challenges.
{"title":"Hierarchical multi-label classification model for science and technology news based on heterogeneous graph semantic enhancement.","authors":"Quan Cheng, Jingyi Cheng, Jian Chen, Shaojun Liu","doi":"10.7717/peerj-cs.2469","DOIUrl":"10.7717/peerj-cs.2469","url":null,"abstract":"<p><p>In the context of high-quality economic development, technological innovation has emerged as a fundamental driver of socio-economic progress. The consequent proliferation of science and technology news, which acts as a vital medium for disseminating technological advancements and policy changes, has attracted considerable attention from technology management agencies and innovation organizations. Nevertheless, online science and technology news has historically exhibited characteristics such as limited scale, disorderliness, and multi-dimensionality, which is extremely inconvenient for users of deep application. While single-label classification techniques can effectively categorize textual information, they face challenges in leading science and technology news classification due to a lack of a hierarchical knowledge framework and insufficient capacity to reveal knowledge integration features. This study proposes a hierarchical multi-label classification model for science and technology news, enhanced by heterogeneous graph semantics. The model captures multi-dimensional themes and hierarchical structural features within science and technology news through a hierarchical transmission module. It integrates graph convolutional networks to extract node information and hierarchical relationships from heterogeneous graphs, while also incorporating prior knowledge from domain knowledge graphs to address data scarcity. This approach enhances the understanding and classification capabilities of the semantics of science and technology news. Experimental results demonstrate that the model achieves precision, recall, and F1 scores of 84.21%, 88.89%, and 86.49%, respectively, significantly surpassing baseline models. This research presents an innovative solution for hierarchical multi-label classification tasks, demonstrating significant application potential in addressing data scarcity and complex thematic classification challenges.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2469"},"PeriodicalIF":3.5,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623068/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}