Stratos Kourtzanidis, A. Chatzigeorgiou, Apostolos Ampatzoglou
{"title":"RepoSkillMiner","authors":"Stratos Kourtzanidis, A. Chatzigeorgiou, Apostolos Ampatzoglou","doi":"10.1145/3324884.3415305","DOIUrl":null,"url":null,"abstract":"A GitHub profile is becoming an essential part of a developer's resume enabling HR departments to extract someone's expertise, through automated analysis of his/her contribution to open-source projects. At the same time, having clear insights on the technologies used in a project can be very beneficial for resource allocation and project maintainability planning. In the literature, one can identify various approaches for identifying expertise on programming languages, based on the projects that developer contributed to. In this paper, we move one step further and introduce an approach (ac-companied by a tool) to identify low-level expertise on particular software frameworks and technologies apart, relying solely on GitHub data, using the GitHub API and Natural Language Processing (NLP)-using the Microsoft Language Understanding Intelligent Service (LUIS). In particular, we developed an NLP model in LUIS for named-entity recognition for three (3). NET technologies and two (2) front-end frameworks. Our analysis is based upon specific commit contents, in terms of the exact code chunks, which the committer added or changed. We evaluate the precision, recall and f-measure for the derived technologies/frameworks, by conducting a batch test in LUIS and report the results. The proposed approach is demonstrated through a fully functional web application named RepoSkillMiner.","PeriodicalId":267160,"journal":{"name":"Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering","volume":"46 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3324884.3415305","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8
Abstract
A GitHub profile is becoming an essential part of a developer's resume enabling HR departments to extract someone's expertise, through automated analysis of his/her contribution to open-source projects. At the same time, having clear insights on the technologies used in a project can be very beneficial for resource allocation and project maintainability planning. In the literature, one can identify various approaches for identifying expertise on programming languages, based on the projects that developer contributed to. In this paper, we move one step further and introduce an approach (ac-companied by a tool) to identify low-level expertise on particular software frameworks and technologies apart, relying solely on GitHub data, using the GitHub API and Natural Language Processing (NLP)-using the Microsoft Language Understanding Intelligent Service (LUIS). In particular, we developed an NLP model in LUIS for named-entity recognition for three (3). NET technologies and two (2) front-end frameworks. Our analysis is based upon specific commit contents, in terms of the exact code chunks, which the committer added or changed. We evaluate the precision, recall and f-measure for the derived technologies/frameworks, by conducting a batch test in LUIS and report the results. The proposed approach is demonstrated through a fully functional web application named RepoSkillMiner.