{"title":"On the Application of Inequality Indices in Comparative Software Analysis","authors":"O. Goloshchapova, M. Lumpe","doi":"10.1109/ASWEC.2013.23","DOIUrl":null,"url":null,"abstract":"Socio-economic inequality indices, like the Gini coefficient or the Theil index, offer us a viable alternative to central tendency statistics when being used to aggregate software metrics data. The specific value of these inequality indices lies in their ability to capture changes in the distribution of metrics data more effectively than, say, average or median. Knowing whether the distribution of one metrics is more unequal than that of another one or whether its distribution becomes more or less unequal over time is the crucial element here. There are, however, challenges in the application of these indices that can result in ecological fallacies. The first issue relates to occurrences of zeros in metrics data, and not all inequality indices cope well with this event. The second problem arises from applying a macro-level inference to a micro-level analysis of a changing population. The Gini coefficient works for the former, whereas the decomposable Theil index serves the latter. 
Nevertheless, when used with care, and usually in combination, both indices can provide us with a powerful tool not only to analyze software, but also to assess its organizational health and maintainability over time.","PeriodicalId":394020,"journal":{"name":"2013 22nd Australian Software Engineering Conference","volume":"53 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 22nd Australian Software Engineering Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASWEC.2013.23","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 9
Abstract
Socio-economic inequality indices, like the Gini coefficient or the Theil index, offer a viable alternative to central tendency statistics when used to aggregate software metrics data. The specific value of these inequality indices lies in their ability to capture changes in the distribution of metrics data more effectively than, say, the average or the median. The crucial element is knowing whether the distribution of one metric is more unequal than that of another, or whether its distribution becomes more or less unequal over time. There are, however, challenges in the application of these indices that can result in ecological fallacies. The first issue concerns zeros in metrics data, which not all inequality indices handle well. The second problem arises from applying a macro-level inference to a micro-level analysis of a changing population. The Gini coefficient copes with the former, whereas the decomposable Theil index addresses the latter. Nevertheless, when used with care, and usually in combination, both indices can provide a powerful tool not only to analyze software, but also to assess its organizational health and maintainability over time.
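The zero-value issue can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of the Gini coefficient (rank-weighted form) and the Theil T index, which are assumed here and not taken from the paper itself. The function names `gini` and `theil_t`, the `zero_convention` flag, and the sample data are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def gini(values: Sequence[float]) -> float:
    """Gini coefficient of non-negative values; zeros are allowed."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted form: G = 2*sum(i*x_i) / (n*sum(x)) - (n+1)/n, i = 1..n.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def theil_t(values: Sequence[float], zero_convention: bool = False) -> float:
    """Theil T index: mean of (x/mu) * ln(x/mu) over all values.

    With zero_convention=False, a zero value reaches math.log(0.0) and
    raises ValueError. With zero_convention=True, each zero contributes 0,
    following the limit r*ln(r) -> 0 as r -> 0 (an assumed workaround).
    """
    n, total = len(values), sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive value")
    mu = total / n
    acc = 0.0
    for x in values:
        if x == 0 and zero_convention:
            continue  # zero term under the limit convention
        r = x / mu
        acc += r * math.log(r)
    return acc / n


if __name__ == "__main__":
    # Hypothetical metric: methods per class, with three empty classes.
    methods_per_class = [0, 0, 0, 10]
    print("Gini:", gini(methods_per_class))
    print("Theil T (limit convention):",
          theil_t(methods_per_class, zero_convention=True))
    try:
        theil_t(methods_per_class)
    except ValueError as exc:
        print("Theil T (naive) failed:", exc)
```

In this sketch the Gini coefficient is computed directly on data containing zeros, while the naive Theil evaluation fails on the logarithm unless a convention for zero terms is adopted. Which convention, if any, is appropriate for a given software metric is a modelling decision that the sketch does not settle.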