Agile Effort Estimation: Comparing the Accuracy and Efficiency of Planning Poker, Bucket System, and Affinity Estimation methods
Marko Pozenel, Luka Furst, Damjan Vavpotic, Tomaz Hovelja
Pub Date: 2023-10-27. DOI: 10.1142/s021819402350064x
DeepMultiple: A Deep Learning Model for RFID-based Multi-object Activity Recognition
Shunwen Shen, Mulan Yang, Lvqing Yang, Sien Chen, Wensheng Dong, Bo Yu, Qingkai Wang
Pub Date: 2023-10-27. DOI: 10.1142/s0218194023410073
On the measures of success in replication of controlled experiments with STRIDE
Winnie Mbaka, Katja Tuma
Pub Date: 2023-10-27. DOI: 10.1142/s0218194023500651
SHAMROQ: A Software Engineering Methodology to Extract Deontic Expressions from the Code of Federal Regulations - A Single-Case, Embedded Case Study
Patrick D. Cook, Susan A. Mengel, Siva Parameswaran
Pub Date: 2023-10-27. DOI: 10.1142/s021819402341005x
This research provides a comprehensive analysis of deontic expressions within the Code of Federal Regulations (CFR) Title 48, Federal Acquisition Regulations System, focusing on obligations, permissions, prohibitions, and dispensations. Using SHAMROQ, a systematic and rigorous methodology, the authors extract, classify, and analyze these expressions, quantify their prevalence, and identify common linguistic patterns within the legal text. The results show that obligations (71.3%) account for most deontic expressions in CFR 48, indicating the heavily prescriptive nature of the document. Permissions also form a significant share (21.9%), suggesting that liberties and allowances are embedded within the regulatory framework. In contrast, prohibitions (5.4%) and dispensations (1.4%) are less frequent, indicating that the document leans more toward defining what is required or allowed than what is explicitly forbidden or exempted. The research also highlights the challenges encountered during the extraction process, providing insight into the complexities of parsing legal texts and the intricacies of deontic language. These challenges range from the technical difficulties of parsing a complex hierarchical document to the conceptual challenges of defining precise rulesets for regulations and provisions. In summary, the results deepen the understanding of regulatory compliance in software engineering and contribute to the development of more effective and efficient automated extraction tools.
GTFP: Network Fault Prediction Based on Graph and Time Series
Zhongliang Li, Junjun Ding, Zongming Ma
Pub Date: 2023-10-18. DOI: 10.1142/s0218194023500560
As 5G networks explode in scale, their structure becomes increasingly complex, and the probability of anomalies or faults during network device operation rises accordingly. Network faults may cause important information to be lost and lead to unpredictable losses, so predicting them can improve the quality of network services and reduce economic loss. In this paper, we propose the concept of 4D features: we use the BERT algorithm to extract semantic features, a graph neural network to extract network topology information, and the Temporal Convolutional Network (TCN) algorithm to extract time-series features. On this basis, we propose GTFP (Fault Prediction based on GraphSage and TCN), an end-to-end solution for network fault alarm prediction built on GTCN, a hybrid of the GraphSage graph neural network and the TCN model. Our solution takes historical alarm data as input. First, we filter out alarm noise irrelevant to faults through data cleaning. Then, we apply feature engineering to extract valid alarm features, including the statistical features of the network alarm information, the semantic features of the alarm texts, the sequential features of the alarms, and the network topology features of the nodes where the alarms are located. Finally, we use GTCN to predict future fault alarms from the extracted features. Experiments on the alarm data of a real service system show that GTFP outperforms state-of-the-art fault alarm prediction algorithms.
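The TCN half of the hybrid rests on dilated causal convolution. The NumPy sketch below (`causal_conv1d` is an illustrative helper, not code from GTFP) shows that core operation: the output at time t depends only on inputs at t and earlier, so no future alarm information leaks into a prediction.

```python
import numpy as np

def causal_conv1d(x, weights, dilation=1):
    """Dilated causal 1D convolution: output[t] combines x[t], x[t-d],
    x[t-2d], ... -- past values only, as in one TCN layer."""
    k = len(weights)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])  # left-pad so output stays causal
    return np.array([
        sum(weights[j] * xp[t + pad - j * dilation] for j in range(k))
        for t in range(len(x))
    ])

# Stacking layers with growing dilation widens the receptive field
# exponentially -- the property that lets a TCN cover long alarm histories.
x = np.arange(8, dtype=float)
h = causal_conv1d(x, [0.5, 0.5], dilation=1)  # averages x[t] and x[t-1]
y = causal_conv1d(h, [0.5, 0.5], dilation=2)  # now sees 4 past steps
```

A real TCN layer adds learned kernels, nonlinearities, and residual connections, but the causal, dilated indexing above is the defining ingredient.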
A Dynamic Drilling Sampling Method and Evaluation Model for Big Streaming Data
Zhaohui Zhang, Pei Zhang, Peng Zhang, Fujuan Xu, Chaochao Hu, Pengwei Wang
Pub Date: 2023-10-18. DOI: 10.1142/s0218194023410036
Big data sampling methods for real-time, high-speed streaming data tend to lose the value and information carried by large amounts of discrete data, making it difficult to evaluate the value characteristics of streaming data efficiently and accurately. The SDSLA sampling method, inspired by mineral drilling exploration, can evaluate in real time the valuable information of streaming data containing many discrete values, but its sampling accuracy for discrete data is low when the range of that data is irregular. Building on the SDSLA algorithm, we propose SDDS, a dynamic drilling sampling method that takes the well as its unit of analysis, dynamically adjusts the size and position of each well, and accurately locates the position and range of discrete data. We further propose SDVEM, a new data valuation model that evaluates the sample set along discrete, centralized, and overall dimensions. Experiments show that, compared with the SDSLA algorithm, samples drawn by SDDS yield higher evaluation accuracy and a probability distribution closer to the original streaming data, with the AOCV indicator nearly 10% higher. In addition, at small sampling rates the SDDS algorithm achieves over 90% accuracy, recall, and F1 score when its samples are used to train and test neural networks, all higher than with SDSLA. In summary, the SDDS algorithm not only accurately evaluates the value characteristics of streaming data but also facilitates the training of neural network models, which gives it important research significance for big data estimation.
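The drilling metaphor can be sketched in miniature. This is an illustrative simplification, not the authors' SDDS algorithm: `well_sample` and its variance-based budget heuristic are hypothetical stand-ins for the paper's dynamic well sizing and positioning.

```python
import random
import statistics

def well_sample(stream, n_wells=4, budget=8, seed=0):
    """Illustrative well-based sampling: split the stream into 'wells'
    (contiguous windows) and give high-variance wells a larger share of
    the sampling budget, so scattered discrete values are less likely
    to be missed than under uniform sampling."""
    rng = random.Random(seed)
    size = len(stream) // n_wells
    wells = [stream[i * size:(i + 1) * size] for i in range(n_wells)]
    # Variance as a crude proxy for "this well contains discrete data".
    weights = [statistics.pvariance(w) + 1e-9 for w in wells]
    total = sum(weights)
    sample = []
    for w, wt in zip(wells, weights):
        k = min(len(w), max(1, round(budget * wt / total)))
        sample.extend(rng.sample(w, k))
    return sample

# A mostly flat stream with one burst of discrete outliers: the burst's
# well receives most of the budget, so 50.0, 2.0, and 80.0 all survive.
stream = [1.0] * 12 + [50.0, 2.0, 80.0, 1.0]
print(sorted(well_sample(stream)))
```

Uniform sampling at the same budget would likely miss one or more of the outliers; concentrating effort where the signal is irregular is the intuition the drilling analogy captures.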
Demystifying Practices, Challenges and Expected Features of Using GitHub Copilot
Beiqi Zhang, Peng Liang, Xiyu Zhou, Aakash Ahmad, Muhammad Waseem
Pub Date: 2023-10-17. DOI: 10.1142/s0218194023410048
With the advances in machine learning, there is growing interest in AI-enabled tools for autocompleting source code. GitHub Copilot, also referred to as the “AI Pair Programmer”, has been trained on billions of lines of open-source GitHub code and is one such tool that has seen increasing use since its launch in June 2021. However, little effort has been devoted to understanding, from the practitioners' point of view, the practices, challenges, and expected features of using Copilot for code auto-completion. To this end, we conducted an empirical study by collecting and analyzing data from Stack Overflow (SO) and GitHub Discussions. More specifically, we searched and manually collected 303 SO posts and 927 GitHub discussions related to the usage of Copilot. We identified the programming languages, Integrated Development Environments (IDEs), and technologies used with Copilot, the functions implemented, and the benefits, limitations, and challenges of using it. The results show that when practitioners use Copilot: (1) the major programming languages used with Copilot are JavaScript and Python, (2) the main IDE used with Copilot is Visual Studio Code, (3) the most commonly used technology with Copilot is Node.js, (4) the leading function implemented by Copilot is data processing, (5) users' main purpose in using Copilot is to help generate code, (6) the most significant benefit of using Copilot is useful code generation, (7) the main limitation practitioners encounter is difficulty of integration, and (8) the most commonly expected feature is integration with more IDEs. Our results suggest that Copilot is a double-edged sword that requires developers to weigh various aspects carefully when deciding whether or not to use it. Our study provides empirically grounded foundations that can inform software developers and practitioners, and a basis for future investigations of Copilot's role as an AI pair programmer in software development.
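Findings (1)-(4) boil down to frequency counts over the manually labeled posts. A minimal sketch, with a hypothetical miniature of the dataset (the `posts` list is invented for illustration, not the study's actual data):

```python
from collections import Counter

# Hypothetical labeled records: each mined SO post or GitHub discussion
# tagged with the programming language it mentions.
posts = [
    {"source": "SO", "language": "JavaScript"},
    {"source": "GitHub", "language": "Python"},
    {"source": "SO", "language": "JavaScript"},
    {"source": "GitHub", "language": "Go"},
]

language_counts = Counter(p["language"] for p in posts)
# most_common() surfaces the leading categories, mirroring how the paper
# reports JavaScript and Python as the most frequent languages.
print(language_counts.most_common(2))  # [('JavaScript', 2), ('Python', 1)]
```

The same counting applies per category (IDE, technology, function), which is why the study's headline findings read as a ranked list of most-frequent labels.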