{"title":"CodeLabeller:一个基于web的Java设计模式和摘要代码注释工具","authors":"Najam Nazar, Norman Chen, Chun Yong Chong","doi":"10.1142/s0218194023500213","DOIUrl":null,"url":null,"abstract":"While constructing supervised learning models, we require labeled examples to build a corpus and train a machine learning model. However, most studies have built the labeled dataset manually, which, on many occasions, is a daunting task. To mitigate this problem, we have built an online tool called CodeLabeller. CodeLabeller is a web-based tool that aims to provide an efficient approach to handling the process of labeling source code files for supervised learning methods at scale by improving the data collection process throughout. CodeLabeller is tested by constructing a corpus of over a thousand source files obtained from a large collection of open source Java projects and labeling each Java source file with their respective design patterns and summaries. Twenty-five experts in the field of software engineering participated in a usability evaluation of the tool using the standard User Experience Questionnaire online survey. The survey results demonstrate that the tool achieves the Good standard on hedonic and pragmatic quality standards, is easy to use and meets the needs of annotating the corpus for supervised classifiers. Apart from assisting researchers in crowdsourcing a labeled dataset, the tool has practical applicability in software engineering education and assists in building expert ratings for software artefacts.","PeriodicalId":50288,"journal":{"name":"International Journal of Software Engineering and Knowledge Engineering","volume":null,"pages":null},"PeriodicalIF":0.6000,"publicationDate":"2023-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"CodeLabeller: A Web-Based Code Annotation Tool for Java Design Patterns and Summaries\",\"authors\":\"Najam Nazar, Norman Chen, Chun Yong Chong\",\"doi\":\"10.1142/s0218194023500213\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"While constructing supervised learning models, we require labeled examples to build a corpus and train a machine learning model. However, most studies have built the labeled dataset manually, which, on many occasions, is a daunting task. To mitigate this problem, we have built an online tool called CodeLabeller. CodeLabeller is a web-based tool that aims to provide an efficient approach to handling the process of labeling source code files for supervised learning methods at scale by improving the data collection process throughout. CodeLabeller is tested by constructing a corpus of over a thousand source files obtained from a large collection of open source Java projects and labeling each Java source file with their respective design patterns and summaries. Twenty-five experts in the field of software engineering participated in a usability evaluation of the tool using the standard User Experience Questionnaire online survey. The survey results demonstrate that the tool achieves the Good standard on hedonic and pragmatic quality standards, is easy to use and meets the needs of annotating the corpus for supervised classifiers. Apart from assisting researchers in crowdsourcing a labeled dataset, the tool has practical applicability in software engineering education and assists in building expert ratings for software artefacts.\",\"PeriodicalId\":50288,\"journal\":{\"name\":\"International Journal of Software Engineering and Knowledge Engineering\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.6000,\"publicationDate\":\"2023-06-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Software Engineering and Knowledge Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1142/s0218194023500213\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Software Engineering and Knowledge Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1142/s0218194023500213","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
CodeLabeller: A Web-Based Code Annotation Tool for Java Design Patterns and Summaries
While constructing supervised learning models, we require labeled examples to build a corpus and train a machine learning model. However, most studies have built the labeled dataset manually, which, on many occasions, is a daunting task. To mitigate this problem, we have built an online tool called CodeLabeller. CodeLabeller is a web-based tool that aims to provide an efficient approach to handling the process of labeling source code files for supervised learning methods at scale by improving the data collection process throughout. CodeLabeller is tested by constructing a corpus of over a thousand source files obtained from a large collection of open source Java projects and labeling each Java source file with their respective design patterns and summaries. Twenty-five experts in the field of software engineering participated in a usability evaluation of the tool using the standard User Experience Questionnaire online survey. The survey results demonstrate that the tool achieves the Good standard on hedonic and pragmatic quality standards, is easy to use and meets the needs of annotating the corpus for supervised classifiers. Apart from assisting researchers in crowdsourcing a labeled dataset, the tool has practical applicability in software engineering education and assists in building expert ratings for software artefacts.
期刊介绍:
The International Journal of Software Engineering and Knowledge Engineering is intended to serve as a forum for researchers, practitioners, and developers to exchange ideas and results for the advancement of software engineering and knowledge engineering. Three types of papers will be published:
Research papers reporting original research results
Technology trend surveys reviewing an area of research in software engineering and knowledge engineering
Survey articles surveying a broad area in software engineering and knowledge engineering
In addition, tool reviews (no more than three manuscript pages) and book reviews (no more than two manuscript pages) are also welcome.
A central theme of this journal is the interplay between software engineering and knowledge engineering: how knowledge engineering methods can be applied to software engineering, and vice versa. The journal publishes papers in the areas of software engineering methods and practices, object-oriented systems, rapid prototyping, software reuse, cleanroom software engineering, stepwise refinement/enhancement, formal methods of specification, ambiguity in software development, impact of CASE on software development life cycle, knowledge engineering methods and practices, logic programming, expert systems, knowledge-based systems, distributed knowledge-based systems, deductive database systems, knowledge representations, knowledge-based systems in language translation & processing, software and knowledge-ware maintenance, reverse engineering in software design, and applications in various domains of interest.