{"title":"\"May the fork be with you\": novel metrics to analyze collaboration on GitHub","authors":"Marco Biazzini, B. Baudry","doi":"10.1145/2593868.2593875","DOIUrl":null,"url":null,"abstract":"Multi-repository software projects are becoming more and more popular, thanks to web-based facilities such as Github. Code and process metrics generally assume a single repository must be analyzed, in order to measure the characteristics of a codebase. Thus they are not apt to measure how much relevant information is hosted in multiple repositories contributing to the same codebase. Nor can they feature the characteristics of such a distributed development process. We present a set of novel metrics, based on an original classification of commits, conceived to capture some interesting aspects of a multi-repository development process. We also describe an efficient way to build a data structure that allows to compute these metrics on a set of Git repositories. Interesting outcomes, obtained by applying our metrics on a large sample of projects hosted on Github, show the usefulness of our contribution.","PeriodicalId":103819,"journal":{"name":"Workshop on Emerging Trends in Software Metrics","volume":"99 2","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"33","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Workshop on Emerging Trends in Software Metrics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2593868.2593875","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 33
Abstract
Multi-repository software projects are becoming more and more popular, thanks to web-based facilities such as Github. Code and process metrics generally assume a single repository must be analyzed, in order to measure the characteristics of a codebase. Thus they are not apt to measure how much relevant information is hosted in multiple repositories contributing to the same codebase. Nor can they feature the characteristics of such a distributed development process. We present a set of novel metrics, based on an original classification of commits, conceived to capture some interesting aspects of a multi-repository development process. We also describe an efficient way to build a data structure that allows to compute these metrics on a set of Git repositories. Interesting outcomes, obtained by applying our metrics on a large sample of projects hosted on Github, show the usefulness of our contribution.