The software maintenance process is costly, accounting for up to 70% of the total cost in the software development life cycle (SDLC). The difficulty of maintaining software increases with its size and complexity, requiring significant time and effort. One way to alleviate these costs is to automate parts of the maintenance process. This research focuses on the automation of the classification phase using decision trees (DT) to sort, rank, and accept/reject maintenance requests (MRs) for mobile applications. Our dataset consisted of 1,656 MRs. We found that DTs could automate sorting and accepting/rejecting MRs with accuracies of 71.08% and 64.15%, respectively, though ranking accuracy was lower at 50%. While DTs can reduce costs, effort, and time, human verification is still necessary.
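The classification phase described above can be pictured as a pair of decision rules over MR attributes. The sketch below is illustrative only: the feature names (`has_crash_report`, `targets_new_os`, `affected_users`) and thresholds are assumptions, not the attributes or trained trees from the study.

```python
# Minimal hand-rolled decision tree sketch of the MR classification phase.
# Feature names and thresholds are hypothetical illustrations.

def sort_mr(mr):
    """Sort a maintenance request (MR) into a maintenance category."""
    if mr.get("has_crash_report"):
        return "corrective"      # something is broken
    if mr.get("targets_new_os"):
        return "adaptive"        # the environment changed
    return "perfective"          # enhancement request

def accept_mr(mr):
    """Toy accept/reject rule: accept crash fixes or wide-impact changes."""
    return sort_mr(mr) == "corrective" or mr.get("affected_users", 0) > 100
```

A learned decision tree replaces these hand-written branches with splits induced from the 1,656 labeled MRs; the human-verification step the authors recommend would review the tree's accept/reject output.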
Sahar Alturki, Sarah Almoaiqel (2024-08-30). Towards an automated classification phase in the software maintenance process using decision tree. PeerJ Computer Science. https://doi.org/10.7717/peerj-cs.2228
Jing Shang, Zhihui Wu, Zhiwen Xiao, Yifei Zhang, Jibin Wang
Cache plays a crucial role in improving system response time, alleviating server pressure, and achieving load balancing in various aspects of modern information systems. The data prefetch and cache replacement algorithms are significant factors influencing caching performance. Due to the inability to learn user interests and preferences accurately, existing rule-based and data mining caching algorithms fail to capture the unique features of the user access behavior sequence, resulting in low cache hit rates. In this article, we introduce BERT4Cache, an end-to-end bidirectional Transformer model with attention for data prefetch in cache. BERT4Cache enhances cache hit rates and ultimately improves cache performance by predicting the user’s imminent future requested objects and prefetching them into the cache. In our thorough experiments, we show that BERT4Cache achieves superior results in hit rates and other metrics compared to generic reactive and advanced proactive caching strategies.
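The prefetch idea above — predict the user's next requested object from the access sequence and load it into the cache ahead of time — can be sketched with simple bigram statistics standing in for the bidirectional Transformer (the class name and interface here are assumptions, not BERT4Cache's API):

```python
from collections import defaultdict, Counter

class BigramPrefetcher:
    """Frequency-based stand-in for BERT4Cache's sequence model: it
    predicts the next requested object from bigram statistics of past
    access sequences and returns the top-k candidates to prefetch.
    The real model replaces these counts with an attention-based
    bidirectional Transformer over the user access behavior sequence."""

    def __init__(self):
        self.next_counts = defaultdict(Counter)

    def train(self, sequence):
        # count which object tends to follow which
        for cur, nxt in zip(sequence, sequence[1:]):
            self.next_counts[cur][nxt] += 1

    def prefetch(self, obj, k=1):
        # objects most likely to be requested right after `obj`
        return [o for o, _ in self.next_counts[obj].most_common(k)]
```

The cache-hit-rate gain comes from the same mechanism in both cases: requests predicted correctly are already resident when they arrive.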
Jing Shang, Zhihui Wu, Zhiwen Xiao, Yifei Zhang, Jibin Wang (2024-08-29). BERT4Cache: a bidirectional encoder representations for data prefetching in cache. PeerJ Computer Science. https://doi.org/10.7717/peerj-cs.2258
In this article, compensation algorithms for zero padding are proposed to enhance the performance of deep convolutional neural networks. By considering the characteristics of the convolving filters, the proposed methods efficiently compensate for convolutional output errors caused by zero-padded inputs. The algorithms are developed primarily for a patch-based SRResNet for single-image super-resolution, and the performance comparison is carried out using the SRResNet model; owing to the generalized nature of the padding algorithms, their efficacy is also tested in a U-Net for lung CT image segmentation. The proposed algorithms outperform the recently developed partial convolution based padding (PCP) algorithm.
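The error being compensated is easy to see in one dimension: border outputs are too small because part of the window covered padded zeros. The PCP-style correction rescales each output by how much of the window saw real data; the compensation algorithms proposed in the article refine this idea, so the sketch below illustrates the baseline correction, not the paper's exact method.

```python
def conv1d_zero_pad(x, w):
    """'Same'-size 1-D correlation with zero padding at the borders."""
    k, pad = len(w), len(w) // 2
    xp = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(xp[i + j] * w[j] for j in range(k)) for i in range(len(x))]

def conv1d_compensated(x, w):
    """Rescale each output by k / (window positions that saw real data),
    the partial-convolution-style border correction."""
    k, pad, n = len(w), len(w) // 2, len(x)
    out = conv1d_zero_pad(x, w)
    comp = []
    for i, y in enumerate(out):
        # count of window taps that landed on real input, not padding
        valid = min(i + pad, n - 1) - max(i - pad, 0) + 1
        comp.append(y * k / valid)
    return comp
```

On a constant signal the correction is exact: the attenuated border outputs are restored to the interior value.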
Safi Ullah, Seong-Ho Song (2024-08-29). Design of compensation algorithms for zero padding and its application to a patch based deep neural network. PeerJ Computer Science. https://doi.org/10.7717/peerj-cs.2287
This survey rigorously explores contemporary clustering algorithms within the machine learning paradigm, focusing on five primary methodologies: centroid-based, hierarchical, density-based, distribution-based, and graph-based clustering. Through the lens of recent innovations such as deep embedded clustering and spectral clustering, we analyze the strengths, limitations, and the breadth of application domains—ranging from bioinformatics to social network analysis. Notably, the survey introduces novel contributions by integrating clustering techniques with dimensionality reduction and proposing advanced ensemble methods to enhance stability and accuracy across varied data structures. This work uniquely synthesizes the latest advancements and offers new perspectives on overcoming traditional challenges like scalability and noise sensitivity, thus providing a comprehensive roadmap for future research and practical applications in data-intensive environments.
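Of the five methodologies surveyed, centroid-based clustering is the simplest to state concretely: alternate between assigning points to their nearest centroid and recomputing each centroid as its cluster mean. A minimal deterministic 1-D k-means (initial centroids supplied by the caller) makes the loop explicit:

```python
def kmeans_1d(xs, centroids, iters=20):
    """Centroid-based clustering in its simplest form: 1-D k-means with
    caller-supplied initial centroids, so the run is deterministic."""
    for _ in range(iters):
        # assignment step: each point joins its nearest centroid
        clusters = [[] for _ in centroids]
        for x in xs:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(x)
        # update step: each centroid moves to its cluster mean
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids
```

The sensitivity to initialization visible here (results depend on the starting centroids) is one of the traditional limitations the survey's ensemble methods aim to mitigate.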
Aasim Ayaz Wani (2024-08-29). Comprehensive analysis of clustering algorithms: exploring limitations and innovative solutions. PeerJ Computer Science. https://doi.org/10.7717/peerj-cs.2286
Omar Bin Samin, Nasir Ahmed Abdulkhader Algeelani, Ammar Bathich, Maryam Omar, Musadaq Mansoor, Amir Khan
The integration of Internet of Things (IoT) and artificial intelligence (AI) technologies into modern agriculture has profound implications for data collection, management, and decision-making processes. However, ensuring the security of agricultural data has consistently posed a significant challenge. This study presents a novel evaluation metric, the Latency Aware Accuracy Index (LAAI), for optimizing data security in the agricultural sector. The LAAI draws on the combined capacities of IoT and AI while also accounting for latency. Using IoT tools for data collection and AI algorithms for analysis makes farming operations more productive, and the LAAI offers a more holistic way to assess data accuracy under latency constraints. This ensures that farmers and other end-users receive trustworthy information in a timely manner. The unified measure not only makes the data more secure but also gives farmers information that helps them make informed decisions, thereby supporting healthier farming and food security.
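One plausible shape for such an index is accuracy discounted by how much of a latency budget a reading consumed. The actual LAAI formula is defined in the paper; the functional form, the 0.5 discount factor, and the 100 ms budget below are all assumptions made purely for illustration.

```python
def laai(accuracy, latency_ms, latency_budget_ms=100.0):
    """Hypothetical latency-aware accuracy index: accuracy is discounted
    in proportion to the fraction of the latency budget consumed.
    This is an illustrative sketch, not the paper's definition."""
    penalty = min(latency_ms / latency_budget_ms, 1.0)
    return accuracy * (1.0 - 0.5 * penalty)
```

Under this shape, a perfectly fast reading keeps its full accuracy score, while one that exhausts the budget keeps only half, capturing the trade-off the metric is meant to expose.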
Omar Bin Samin, Nasir Ahmed Abdulkhader Algeelani, Ammar Bathich, Maryam Omar, Musadaq Mansoor, Amir Khan (2024-08-28). Optimizing agricultural data security: harnessing IoT and AI with Latency Aware Accuracy Index (LAAI). PeerJ Computer Science. https://doi.org/10.7717/peerj-cs.2276
Facial expressions reflect a person's emotion, cognition, and even physiological or mental state to a large extent. They have important application value in medical treatment, business, criminal investigation, education, and human-computer interaction, and automatic facial expression recognition has become an important research topic in computer vision. To solve the problems of insufficient feature extraction, loss of local key information, and low accuracy in facial expression recognition, this article proposes a facial expression recognition network based on attention double branch enhanced fusion. Two parallel branches capture global enhancement features and local attention semantics respectively, and the fusion and complementarity of global and local information are realized through decision-level fusion. The experimental results show that fusing and enhancing the global and local features makes the extracted features more complete. The proposed method achieves 89.41% and 88.84% expression recognition accuracy on the natural-scene facial expression datasets RAF-DB and FERPlus, respectively, which compares favorably with many current methods and demonstrates the effectiveness of the proposed network model.
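Decision-level fusion, as opposed to fusing intermediate feature maps, combines the two branches only at the final prediction. A minimal sketch (the equal weighting is an assumption; the paper's fusion weights may differ):

```python
def decision_level_fusion(global_probs, local_probs, w=0.5):
    """Decision-level fusion: a weighted average of the class-probability
    vectors produced by the global-enhancement branch and the
    local-attention branch. Equal weights are assumed here."""
    return [w * g + (1.0 - w) * l
            for g, l in zip(global_probs, local_probs)]
```

Because each branch votes with a full probability vector, a class that only one branch recognizes strongly can still dominate the fused decision, which is how the global and local views complement each other.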
Wenming Wang, Min Jia (2024-08-28). A facial expression recognition network based on attention double branch enhanced fusion. PeerJ Computer Science. https://doi.org/10.7717/peerj-cs.2266
The Internet of Things (IoT) is revolutionizing diverse sectors like business, healthcare, and the military, but its widespread adoption has also led to significant security challenges. IoT networks, in particular, face increasing vulnerabilities due to the rapid proliferation of connected devices within smart infrastructures. Wireless sensor networks (WSNs) comprise software, gateways, and small sensors that wirelessly transmit and receive data. WSNs consist of two types of nodes: generic nodes with sensing capabilities and gateway nodes that manage data routing. These sensor nodes operate under constraints of limited battery power, storage capacity, and processing capabilities, exposing them to various threats, including wormhole attacks. This study focuses on detecting wormhole attacks by analyzing the connectivity details of network nodes. Machine learning (ML) techniques are proposed as effective solutions to address these modern challenges in wormhole attack detection within sensor networks. The base station employs two ML models, a support vector machine (SVM) and a deep neural network (DNN), to classify traffic data and identify malicious nodes in the network. The effectiveness of these algorithms is validated using traffic generated by the NS3.37 simulator and tested against real-world scenarios. Evaluation metrics such as average recall, false positive rates, latency, end-to-end delay, response time, throughput, energy consumption, and CPU utilization are used to assess the performance of the proposed models. Results indicate that the proposed model outperforms existing methods in terms of efficacy and efficiency.
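The connectivity analysis can be made concrete with a degree-anomaly heuristic: wormhole endpoints tend to show abnormally high connectivity because the tunnel makes distant nodes look like neighbors. The rule below is a crude stand-in for the SVM/DNN classifiers trained at the base station, shown only to illustrate what "analyzing connectivity details" means; the actual models learn from richer traffic features.

```python
import statistics

def flag_suspect_nodes(adjacency):
    """Flag nodes whose degree exceeds mean + 2*stdev of all degrees,
    a simple connectivity-based stand-in for the trained SVM/DNN
    wormhole detectors."""
    degrees = {node: len(nbrs) for node, nbrs in adjacency.items()}
    mu = statistics.mean(degrees.values())
    sd = statistics.pstdev(degrees.values())
    return [n for n, d in degrees.items() if sd > 0 and d > mu + 2 * sd]
```

A node sitting at one end of a wormhole tunnel aggregates links from two regions at once, which is exactly the statistical outlier this rule catches.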
Asma Hassan Alshehri (2024-08-28). Wormhole attack detection and mitigation model for Internet of Things and WSN using machine learning. PeerJ Computer Science. https://doi.org/10.7717/peerj-cs.2257
The large number of machine learning methods with condensed names poses great challenges for researchers selecting a suitable approach for a target dataset in academic research. Although graph neural networks based on knowledge graphs have proven helpful in recommending a machine learning method for a given dataset, the issues of inadequate entity representation and over-smoothing of embeddings still need to be addressed. This article proposes a recommendation framework that integrates a feature-enhanced graph neural network and an anti-smoothing aggregation network. In the proposed framework, in addition to utilizing the textual descriptions of the target entities, each node is enhanced with its neighborhood information before participating in the higher-order propagation process. In addition, an anti-smoothing aggregation network is designed to reduce the influence of central nodes in each information aggregation via an exponential decay function. Extensive experiments on a public dataset demonstrate that the proposed approach exhibits substantial advantages over strong baselines in recommendation tasks.
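The anti-smoothing idea can be sketched in a few lines: when aggregating neighbor embeddings, down-weight each neighbor by an exponential decay in its degree, so highly central nodes dominate less in every aggregation step. The exact decay form and the rate `lam` below are illustrative assumptions, not the paper's formulation.

```python
import math

def aggregate(node_emb, neighbor_embs, neighbor_degrees, lam=0.5):
    """Anti-smoothing aggregation sketch: each neighbor contributes
    its embedding scaled by exp(-lam * degree), so central
    (high-degree) nodes are exponentially down-weighted."""
    out = list(node_emb)
    for emb, deg in zip(neighbor_embs, neighbor_degrees):
        weight = math.exp(-lam * deg)
        out = [o + weight * e for o, e in zip(out, emb)]
    return out
```

Without such a decay, hub nodes appear in most neighborhoods and repeated propagation pulls every embedding toward them, which is precisely the over-smoothing the framework is designed to counter.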
Xin Zhang, Junjie Guo (2024-08-28). A feature-enhanced knowledge graph neural network for machine learning method recommendation. PeerJ Computer Science. https://doi.org/10.7717/peerj-cs.2284
Tianhui Li, Yanwei Xia, Xianhai Pang, Jihong Zhu, Hui Fan, Li Zhen, Chaomin Gu, Chi Dong, Shijie Lu
A high voltage circuit breaker (HVCB) plays a crucial role in current smart power systems. However, research on HVCBs has mainly focused on the convenience and efficiency of their mechanical structures, neglecting fault diagnosis, even though ensuring that a circuit breaker operates in a normal state is very important. According to field statistics, most defects and faults in high voltage circuit breakers are caused by mechanical faults such as contact faults, mechanism seizure, bolt loosening, and spring fatigue. In this study, vibration sensors were placed at four different locations in the HVCB system to detect four common mechanical faults from vibration signals. In our approach, a convolutional attention network (CANet) was introduced to extract features and determine which mechanical fault occurs within a fixed period of time. The results indicate a mechanical fault diagnosis accuracy of up to 94.2%, surpassing traditional methods that rely solely on vibration signals from a single location.
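The multi-location setup amounts to building one feature vector from several synchronized vibration signals. A hand-crafted analogue of what CANet learns (using RMS energy as the per-sensor feature, an assumption chosen for simplicity) looks like this:

```python
def rms(signal):
    """Root-mean-square energy of one vibration signal."""
    return (sum(s * s for s in signal) / len(signal)) ** 0.5

def fused_features(sensor_signals):
    """Concatenate a per-sensor feature (here just RMS) from the four
    mounting locations into one multimodal feature vector; CANet
    replaces this hand-crafted step with learned convolutional
    features and attention over the fused signals."""
    return [rms(sig) for sig in sensor_signals]
```

Fusing all four locations is what lets the classifier separate faults whose vibration signatures look similar at any single sensor.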
Tianhui Li, Yanwei Xia, Xianhai Pang, Jihong Zhu, Hui Fan, Li Zhen, Chaomin Gu, Chi Dong, Shijie Lu (2024-08-26). Mechanical fault diagnosis of high voltage circuit breaker using multimodal data fusion. PeerJ Computer Science. https://doi.org/10.7717/peerj-cs.2248
Guoqing Jiang, Saiya Li, Ziyu Huang, Guorong Cai, Jinhe Su
Point clouds are highly regarded in the field of 3D object detection for their superior geometric properties and versatility. However, object occlusion and defects in scanning equipment frequently result in sparse and missing data within point clouds, adversely affecting the final prediction. Recognizing the synergistic potential between the rich semantic information present in images and the geometric data in point clouds for scene representation, we introduce a two-stage fusion framework (TSFF) for 3D object detection. To address the issue of corrupted geometric information in point clouds caused by object occlusion, we augment point features with image features, thereby enhancing the reference factor of the point cloud during the voting bias phase. Furthermore, we implement a constrained fusion module to selectively sample voting points using a 2D bounding box, integrating valuable image features while reducing the impact of background points in sparse scenes. Our methodology was evaluated on the SUN RGB-D dataset, where it achieved a 3.6-point mean average precision (mAP) improvement under the mAP@0.25 evaluation criterion over the baseline, and it performs particularly well on several object categories compared with other strong 3D object detection methods.
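The constrained sampling step reduces to a geometric filter: after projecting voting points onto the image plane, keep only those falling inside a 2D bounding box. The sketch below shows that filter in isolation (the tuple-based interface is an assumption; projection itself is omitted):

```python
def points_in_bbox(points_2d, bbox):
    """Constrained-fusion sampling sketch: keep only voting points whose
    image-plane projection lies inside the 2D bounding box
    (x0, y0, x1, y1), discarding likely background points."""
    x0, y0, x1, y1 = bbox
    return [(x, y) for x, y in points_2d
            if x0 <= x <= x1 and y0 <= y <= y1]
```

In sparse scenes this is the mechanism that prevents background points from diluting the image features being fused into the vote.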
Guoqing Jiang, Saiya Li, Ziyu Huang, Guorong Cai, Jinhe Su (2024-08-23). TSFF: a two-stage fusion framework for 3D object detection. PeerJ Computer Science. https://doi.org/10.7717/peerj-cs.2260