Patrick Kreutzer, Georg Dotzler, M. Ring, B. Eskofier, M. Philippsen
Several research tools and projects require groups of similar code changes as input. Examples are recommendation and bug finding tools that can provide valuable information to developers based on such data. With the help of similar code changes they can simplify the application of bug fixes and code changes to multiple locations in a project. But despite their benefit, the practical value of existing tools is limited, as users need to manually specify the input data, i.e., the groups of similar code changes. To overcome this drawback, this paper presents and evaluates two syntactical similarity metrics, one of which is specifically designed to run fast, in combination with two carefully selected and self-tuning clustering algorithms to automatically detect groups of similar code changes. We evaluate the combinations of metrics and clustering algorithms by applying them to several open source projects and also publish the detected groups of similar code changes online as a reference dataset. The automatically detected groups of similar code changes work well when used as input for LASE, a recommendation system for code changes.
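The general idea of the abstract above (a pairwise similarity metric followed by clustering) can be illustrated with a minimal sketch. It does not reproduce the paper's syntactical metrics or its self-tuning clustering algorithms: Jaccard similarity over hypothetical edit-operation tokens and single-link threshold clustering are swapped in as assumptions, and the threshold is hand-picked rather than tuned.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_changes(changes: dict, threshold: float = 0.5) -> list:
    """Single-link clustering: changes whose similarity reaches the
    threshold end up in the same group (tracked with union-find)."""
    parent = {name: name for name in changes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(changes, 2):
        if jaccard(changes[a], changes[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in changes:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical changes, each described by a set of edit-operation tokens.
example_changes = {
    "c1": {"insert:if", "insert:null-check", "update:call"},
    "c2": {"insert:if", "insert:null-check", "update:call", "delete:return"},
    "c3": {"rename:field", "update:assignment"},
}
groups = cluster_changes(example_changes, threshold=0.5)
```

Here `c1` and `c2` share three of four tokens and fall into one group, while `c3` stays alone. A fixed threshold is exactly the kind of manual parameter that a self-tuning algorithm would avoid.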
{"title":"Automatic Clustering of Code Changes","authors":"Patrick Kreutzer, Georg Dotzler, M. Ring, B. Eskofier, M. Philippsen","doi":"10.1145/2901739.2901749","DOIUrl":"https://doi.org/10.1145/2901739.2901749","url":null,"abstract":"Several research tools and projects require groups of similar code changes asinput. Examples are recommendation and bug finding tools that can providevaluable information to developers based on such data. With the help ofsimilar code changes they can simplify the application of bug fixes and codechanges to multiple locations in a project. But despite their benefit, thepractical value of existing tools is limited, as users need to manually specifythe input data, i.e., the groups of similar code changes.To overcome this drawback, this paper presents and evaluates two syntacticalsimilarity metrics, one of them is specifically designed to run fast, incombination with two carefully selected and self-tuning clustering algorithmsto automatically detect groups of similar code changes.We evaluate the combinations of metrics and clustering algorithms by applyingthem to several open source projects and also publish the detected groups ofsimilar code changes online as a reference dataset. The automatically detectedgroups of similar code changes work well when used as input for LASE, arecommendation system for code changes.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"298 1","pages":"61-72"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86759131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
María Gómez, Romain Rouvoy, Bram Adams, L. Seinturier
The reputation of a mobile app vendor is crucial for surviving amid ever-increasing competition. However, this reputation largely depends on the quality of the apps, both functional and non-functional. One major non-functional requirement of mobile apps is to guarantee smooth UI interactions, since choppy scrolling or navigation caused by performance problems on a mobile device’s limited hardware resources is highly annoying for end-users. The main research challenge of automatically identifying UI performance problems on mobile devices is that the performance of an app varies widely depending on its context, i.e., the hardware and software configurations on which it runs. This paper presents DUNE, an approach to automatically detect UI performance degradations in Android apps while taking into account context differences. First, DUNE builds an ensemble model of the UI performance metrics of an app from a repository of historical test runs that are known to be acceptable, for different configurations of context. Then, DUNE uses this model to flag UI performance deviations (regressions and optimizations) in new test runs. We empirically evaluate DUNE on real UI performance defects reported in two Android apps, and one manually injected defect in a third app. We demonstrate that this toolset can be successfully used to spot UI performance regressions at a fine granularity.
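The two steps described above (build a baseline from acceptable historical runs per context, then flag deviations in new runs) can be sketched as follows. This is not DUNE's ensemble model: a single hypothetical metric (janky frames per run), a context key made of device and OS version, and Tukey's interquartile-range fences are all assumptions chosen for illustration.

```python
from statistics import quantiles


def build_baseline(history: dict) -> dict:
    """For each context, derive an acceptable range from historical runs
    using Tukey's fences (Q1 - 1.5*IQR, Q3 + 1.5*IQR).
    Each context needs at least two historical values."""
    baseline = {}
    for context, values in history.items():
        q1, _, q3 = quantiles(values, n=4)
        iqr = q3 - q1
        baseline[context] = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return baseline


def flag_run(baseline: dict, context: tuple, value: float) -> str:
    """Classify a new run against the baseline of the same context."""
    if context not in baseline:
        return "unknown-context"
    low, high = baseline[context]
    if value > high:
        return "regression"    # more janky frames than acceptable
    if value < low:
        return "optimization"  # noticeably fewer janky frames
    return "ok"


# Hypothetical history: janky frames per test run, keyed by (device, OS).
history = {("device-A", "os-5.0"): [10, 12, 11, 13, 12, 11, 10, 12]}
baseline = build_baseline(history)
```

Keying the baseline by context keeps a run on slow hardware from being compared against runs on fast hardware, which is the point the abstract makes about context differences.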
{"title":"Mining Test Repositories for Automatic Detection of UI Performance Regressions in Android Apps","authors":"María Gómez, Romain Rouvoy, Bram Adams, L. Seinturier","doi":"10.1145/2901739.2901747","DOIUrl":"https://doi.org/10.1145/2901739.2901747","url":null,"abstract":"The reputation of a mobile app vendor is crucial to survive amongst the ever increasing competition. However this reputation largely depends on the quality of the apps, both functional and non-functional. One major non-functional requirement of mobile apps is to guarantee smooth UI interactions, since choppy scrolling or navigation caused by performance problems on a mobile device’s limited hardware resources, is highly annoying for end-users. The main research challenge of automatically identifying UI performance problems on mobile devices is that the performance of an app highly varies depending on its context—i.e., the hardware and software configurations on which it runs. This paper presents DUNE, an approach to automatically detect UI performance degradations in Android apps while taking into account context differences. First, DUNE builds an ensemble model of the UI performance metrics of an app from a repository of historical test runs that are known to be acceptable, for different configurations of context. Then, DUNE uses this model to flag UI performance deviations (regressions and optimizations) in new test runs. We empirically evaluate DUNE on real UI performance defects reported in two Android apps, and one manually injected defect in a third app. 
We demonstrate that this toolset can be successfully used to spot UI performance regressions at a fine granularity.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"27 1","pages":"13-24"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86843543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Suhas Kabinna, C. Bezemer, Weiyi Shang, A. Hassan
Developers leverage logs for debugging, performance monitoring and load testing. The increased dependence on logs has led to the development of numerous logging libraries which help developers in logging their code. As new libraries emerge and current ones evolve, projects often migrate from an older library to another one. In this paper we study logging library migrations within Apache Software Foundation (ASF) projects. From our manual analysis of JIRA issues, we find that 33 out of 223 (i.e., 14%) ASF projects have undergone at least one logging library migration. We find that the five main drivers for logging library migration are: 1) to increase flexibility (i.e., the ability to use different logging libraries within a project), 2) to improve performance, 3) to reduce effort spent on code maintenance, 4) to reduce dependence on other libraries and 5) to obtain specific features from the new logging library. We find that over 70% of the migrated projects encounter on average two post-migration bugs due to the new logging library. Furthermore, our findings suggest that performance (traditionally one of the primary drivers for migrations) is rarely improved after a migration.
{"title":"Logging Library Migrations: A Case Study for the Apache Software Foundation Projects","authors":"Suhas Kabinna, C. Bezemer, Weiyi Shang, A. Hassan","doi":"10.1145/2901739.2901769","DOIUrl":"https://doi.org/10.1145/2901739.2901769","url":null,"abstract":"Developers leverage logs for debugging, performance monitoring and load testing. The increased dependence on logs has lead to the development of numerous logging libraries which help developers in logging their code. As new libraries emerge and current ones evolve, projects often migrate from an older library to another one.In this paper we study logging library migrations within Apache Software Foundation (ASF) projects. From our manual analysis of JIRA issues, we find that 33 out of 223 (i.e., 14%) ASF projects have undergone at least one logging library migration. We find that the five main drivers for logging library migration are: 1) to increase flexibility (i.e., the ability to use different logging libraries within a project) 2) to improve performance, 3) to reduce effort spent on code maintenance, 4) to reduce dependence on other libraries and 5) to obtain specific features from the new logging library. We find that over 70% of the migrated projects encounter on average two post-migration bugs due to the new logging library. 
Furthermore, our findings suggest that performance (traditionally one of the primary drivers for migrations) is rarely improved after a migration.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"50 1","pages":"154-164"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81211408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Thomas Rolfsnes, L. Moonen, Stefano Di Alesio, Razieh Behjati, D. Binkley
Past research has proposed association rule mining as a means to uncover the evolutionary coupling from a system’s change history. These couplings have various applications, such as improving system decomposition and recommending related changes during development. The strength of the coupling can be characterized using a variety of interestingness measures. Existing recommendation engines typically use only the rule with the highest interestingness value in situations where more than one rule applies. In contrast, we argue that multiple applicable rules indicate increased evidence, and hypothesize that the aggregation of such rules can be exploited to provide more accurate recommendations. To investigate this hypothesis we conduct an empirical study on the change histories of two large industrial systems and four large open source systems. As aggregators we adopt three cumulative gain functions from information retrieval. The experiments evaluate the three using 39 different rule interestingness measures. The results show that aggregation has a significant impact on the values of most measures and, furthermore, leads to a significant improvement in the resulting recommendation.
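The contrast between using only the best rule and aggregating all applicable rules can be sketched as below. The rule representation, the file names, and the scores are hypothetical, and only two textbook aggregators from information retrieval (cumulative gain and discounted cumulative gain) are shown; how the paper defines its three functions is not stated in the abstract.

```python
import math
from collections import defaultdict


def cumulative_gain(scores):
    """CG: plain sum of the scores."""
    return sum(scores)


def discounted_cumulative_gain(scores):
    """DCG: scores sorted in descending order, each divided by
    log2(rank + 1) with ranks starting at 1."""
    ranked = sorted(scores, reverse=True)
    return sum(s / math.log2(rank + 1) for rank, s in enumerate(ranked, start=1))


def recommend(rules, changed, aggregate=None):
    """Rank candidate files. Each rule is (antecedent, consequent, score).
    With aggregate=None only the best applicable rule per consequent counts."""
    applicable = defaultdict(list)
    for antecedent, consequent, score in rules:
        if antecedent <= changed and consequent not in changed:
            applicable[consequent].append(score)
    combine = aggregate or max
    ranking = {c: combine(s) for c, s in applicable.items()}
    return sorted(ranking, key=ranking.get, reverse=True)


# Hypothetical mined rules: "if a.c changes, x.c tends to change too", etc.
rules = [
    (frozenset({"a.c"}), "x.c", 0.6),
    (frozenset({"a.c"}), "y.c", 0.5),
    (frozenset({"b.c"}), "y.c", 0.4),
]
changed = {"a.c", "b.c"}
best_rule_only = recommend(rules, changed)
aggregated = recommend(rules, changed, aggregate=cumulative_gain)
```

With the best rule only, `x.c` ranks first; once the two weaker rules pointing at `y.c` are aggregated, `y.c` overtakes it, which is the "increased evidence" effect the abstract argues for.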
{"title":"Improving Change Recommendation using Aggregated Association Rules","authors":"Thomas Rolfsnes, L. Moonen, Stefano Di Alesio, Razieh Behjati, D. Binkley","doi":"10.1145/2901739.2901756","DOIUrl":"https://doi.org/10.1145/2901739.2901756","url":null,"abstract":"Past research has proposed association rule mining as a means to uncover the evolutionary coupling from a system’s change history. These couplings have various applications, such as improving system decomposition and recommending related changes during development. The strength of the coupling can be characterized using a variety of interestingness measures. Existing recommendation engines typically use only the rule with the highest interestingness value in situations where more than one rule applies. In contrast, we argue that multiple applicable rules indicate increased evidence, and hypothesize that the aggregation of such rules can be exploited to provide more accurate recommendations.To investigate this hypothesis we conduct an empirical study on the change histories of two large industrial systems and four large open source systems. As aggregators we adopt three cumulative gain functions from information retrieval. The experiments evaluate the three using 39 different rule interestingness measures. 
The results show that aggregation provides a significant impact on most measure’s value and, furthermore, leads to a significant improvement in the resulting recommendation.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"10 1","pages":"73-84"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84287716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Themistoklis G. Diamantopoulos, Klearchos Thomopoulos, A. Symeonidis
Contemporary software development processes involve finding reusable software components from online repositories and integrating them into the source code, both to reduce development time and to ensure that the final software project is of high quality. Although several systems have been designed to automate this procedure by recommending components that cover the desired functionality, the reusability of these components is usually not assessed by these systems. In this work, we present QualBoa, a recommendation system for source code components that covers both the functional and the quality aspects of software component reuse. Upon retrieving components, QualBoa provides a ranking that involves not only functional matching to the query, but also a reusability score based on configurable thresholds of source code metrics. The evaluation of QualBoa indicates that it can be effective for recommending reusable source code.
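A ranking that combines functional matching with a threshold-based reusability score could look like the sketch below. The metric names, threshold values, and the lexicographic combination of the two scores are assumptions made for illustration; the abstract does not specify how QualBoa computes or combines them.

```python
def reusability_score(metrics: dict, thresholds: dict) -> float:
    """Fraction of configured metrics whose value does not exceed its threshold."""
    checked = [name for name in thresholds if name in metrics]
    if not checked:
        return 0.0
    passed = sum(1 for name in checked if metrics[name] <= thresholds[name])
    return passed / len(checked)


def rank_components(components: list, thresholds: dict) -> list:
    """Order by functional score first, then by reusability score."""
    return sorted(
        components,
        key=lambda c: (c["functional"], reusability_score(c["metrics"], thresholds)),
        reverse=True,
    )


# Hypothetical, configurable upper bounds for source code metrics.
thresholds = {"cyclomatic_complexity": 10, "methods": 20, "lines": 300}

# Hypothetical retrieved components with a functional matching score in [0, 1].
components = [
    {"name": "StackA", "functional": 0.9,
     "metrics": {"cyclomatic_complexity": 15, "methods": 30, "lines": 500}},
    {"name": "StackB", "functional": 0.9,
     "metrics": {"cyclomatic_complexity": 4, "methods": 8, "lines": 120}},
    {"name": "StackC", "functional": 0.7,
     "metrics": {"cyclomatic_complexity": 3, "methods": 5, "lines": 80}},
]
ranked_names = [c["name"] for c in rank_components(components, thresholds)]
```

In this sketch the reusability score only breaks ties between components that match the query equally well, so `StackB` is preferred over `StackA` while the weaker functional match `StackC` stays last.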
{"title":"QualBoa: Reusability-aware Recommendations of Source Code Components","authors":"Themistoklis G. Diamantopoulos, Klearchos Thomopoulos, A. Symeonidis","doi":"10.1145/2901739.2903492","DOIUrl":"https://doi.org/10.1145/2901739.2903492","url":null,"abstract":"Contemporary software development processes involve finding reusable software components from online repositories and integrating them to the source code, both to reduce development time and to ensure that the final software project is of high quality. Although several systems have been designed to automate this procedure by recommending components that cover the desired functionality, the reusability of these components is usually not assessed by these systems. In this work, we present QualBoa, a recommendation system for source code components that covers both the functional and the quality aspects of software component reuse. Upon retrieving components, QualBoa provides a ranking that involves not only functional matching to the query, but also a reusability score based on configurable thresholds of source code metrics. The evaluation of QualBoa indicates that it can be effective for recommending reusable source code.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"17 1","pages":"488-491"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77819998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lingfeng Bao, D. Lo, Xin Xia, Xinyu Wang, Cong Tian
As the Android platform becomes more and more popular, a large number of Android applications have been developed. When developers design and implement Android applications, power consumption management is an important factor to consider since it affects the usability of the applications. Thus, it is important to help developers adopt proper strategies to manage power consumption. Interestingly, today, there is a large number of Android application repositories made publicly available on sites such as GitHub. These repositories can be mined to help crystallize common power management activities that developers perform. These in turn can be used to help other developers perform similar tasks to improve their own Android applications. In this paper, we present an empirical study of power management commits in Android applications. Our study extends that of Moura et al., who perform an empirical study on energy-aware commits; however, they do not focus on Android applications and only a few of the commits that they study come from Android applications. Android applications are often different from other applications (e.g., those running on a server) due to the issue of limited battery life and the use of specialized APIs. As subjects of our empirical study, we obtain a list of open source Android applications from F-Droid and crawl their commits from GitHub. We get 468 power management commits after we filter the commits using a set of keywords and by performing manual analysis. These 468 power management commits are from 154 different Android applications and belong to 15 different application categories. Furthermore, we use open card sort to categorize these power management commits and we obtain 6 groups which correspond to different power management activities. Our study also reveals that for different kinds of Android applications (e.g., Games, Connectivity, Navigation, etc.), the dominant power management activities differ. For example, the percentage of power management commits belonging to the Power Adaptation activity is larger for Navigation applications than for applications in other categories.
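The keyword-filtering step mentioned above can be sketched as a simple pre-filter over commit messages. The keyword list and the commit records are hypothetical, since the abstract does not enumerate the keywords used in the study.

```python
import re

# Hypothetical keyword list; the abstract does not enumerate the keywords used.
KEYWORDS = ("power", "battery", "energy", "wakelock")
PATTERN = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)


def candidate_commits(commits):
    """Return commits whose message mentions any keyword.
    The result still needs manual inspection to weed out false positives."""
    return [c for c in commits if PATTERN.search(c["message"])]


# Hypothetical commit records.
commits = [
    {"sha": "a1", "message": "Release WakeLock when screen turns off"},
    {"sha": "b2", "message": "Fix typo in README"},
    {"sha": "c3", "message": "Use PowerMock in unit tests"},
]
candidates = [c["sha"] for c in candidate_commits(commits)]
```

The third commit matches the keyword "power" although it concerns a mocking library, not power consumption. False positives of this kind are why a keyword filter is typically followed by manual analysis, as the abstract describes.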
{"title":"How Android App Developers Manage Power Consumption? - An Empirical Study by Mining Power Management Commits","authors":"Lingfeng Bao, D. Lo, Xin Xia, Xinyu Wang, Cong Tian","doi":"10.1145/2901739.2901748","DOIUrl":"https://doi.org/10.1145/2901739.2901748","url":null,"abstract":"As Android platform becomes more and more popular, a large amount of Android applications have been developed. When developers design and implement Android applications, power consumption management is an important factor to consider since it affects the usability of the applications. Thus, it is important to help developers adopt proper strategies to manage power consumption. Interestingly, today, there is a large number of Android application repositories made publicly available in sites such as GitHub. These repositories can be mined to help crystalize common power management activities that developers do. These in turn can be used to help other developers to perform similar tasks to improve their own Android applications.In this paper, we present an empirical study of power management commits in Android applications. Our study extends that of Moura et al. who perform an empirical studyon energy aware commits; however they do not focus on Android applications and only a few of the commits that they study come from Android applications. Android applications are often different from other applications (e.g., those running on a server) due to the issue of limited battery life and the use of specialized APIs. As subjects of our empirical study, we obtain a list of open source Android applications from F-Droid and crawl their commits from Github. We get 468 power management commits after we filter the commits using a set of keywords and by performing manual analysis. These 468 power management commits are from 154 different Android applications and belong to 15 different application categories. 
Furthermore, we use open card sort to categorize these power management commits and we obtain 6 groups which correspond to different power management activities. Our study also reveals that for different kinds of Android application (e.g., Games, Connectivity, Navigation, etc.), the dominant power management activities differ.For example, the percentageof power management commits belonging to Power Adaptation activity is larger for Navigation applications than those belonging to other categories.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"65 1","pages":"37-48"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74474194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Erik Wittern, Philippe Suter, Shriram Rajagopalan
The node package manager (npm) serves as the frontend to a large repository of JavaScript-based software packages, which foster the development of currently huge amounts of server-side Node.js and client-side JavaScript applications. In a span of 6 years since its inception, npm has grown to become one of the largest software ecosystems, hosting more than 230,000 packages, with hundreds of millions of package installations every week. In this paper, we examine the npm ecosystem from two complementary perspectives: 1) we look at package descriptions, the dependencies among them, and download metrics, and 2) we look at the use of npm packages in publicly available applications hosted on GitHub. In both perspectives, we consider historical data, providing us with a unique view on the evolution of the ecosystem. We present analyses that provide insights into the ecosystem’s growth and activity, into conflicting measures of package popularity, and into the adoption of package versions over time. These insights help to understand the evolution of npm and to design better package recommendation engines, and they can help developers understand how their packages are being used.
{"title":"A Look at the Dynamics of the JavaScript Package Ecosystem","authors":"Erik Wittern, Philippe Suter, Shriram Rajagopalan","doi":"10.1145/2901739.2901743","DOIUrl":"https://doi.org/10.1145/2901739.2901743","url":null,"abstract":"The node package manager (npm) serves as the frontend to a large repository of JavaScript-based software packages, which foster the development of currently huge amounts of server-side Node.js and client-side JavaScript applications. In a span of 6 years since its inception, npm has grown to become one of the largest software ecosystems, hosting more than 230, 000 packages, with hundreds of millions of package installations every week. In this paper, we examine the npm ecosystem from two complementary perspectives: 1) we look at package descriptions, the dependencies among them, and download metrics, and 2) we look at the use of npm packages in publicly available applications hosted on GitHub. In both perspectives, we consider historical data, providing us with a unique view on the evolution of the ecosystem. We present analyses that provide insights into the ecosystem’s growth and activity, into conflicting measures of package popularity, and into the adoption of package versions over time. 
These insights help understand the evolution of npm, design better package recommendation engines, and can help developers understand how their packages are being used.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"7 1","pages":"351-361"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90216627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniel Rozenberg, Ivan Beschastnikh, Fabian Kosmale, Valerie Poser, Heiko Becker, Marc Palyart, G. Murphy
The availability of open source software projects has created an enormous opportunity for software engineering research. However, this availability requires that researchers judiciously select an appropriate set of evaluation targets and properly document this rationale. After all, the choice of targets may have a significant effect on evaluation. We developed a tool called RepoGrams to support researchers in qualitatively comparing and contrasting software projects over time using a set of software metrics. RepoGrams uses an extensible, metrics-based visualization model that can be adapted to a variety of analyses. Through a user study of 14 software engineering researchers we found that RepoGrams can assist researchers in filtering candidate software projects and in making more reasoned choices of targets for their evaluations. The tool is open source and is available online: http://repograms.net/
{"title":"Comparing Repositories Visually with RepoGrams","authors":"Daniel Rozenberg, Ivan Beschastnikh, Fabian Kosmale, Valerie Poser, Heiko Becker, Marc Palyart, G. Murphy","doi":"10.1145/2901739.2901768","DOIUrl":"https://doi.org/10.1145/2901739.2901768","url":null,"abstract":"The availability of open source software projects has created an enormous opportunity for software engineering research. However, this availability requires that researchers judiciously select an appropriate set of evaluation targets and properly document this rationale. After all, the choice of targets may have a significant effect on evaluation.We developed a tool called RepoGrams to support researchers in qualitatively comparing and contrasting software projects over time using a set of software metrics. RepoGrams uses an extensible, metrics-based, visualization model that can be adapted to a variety of analyses. Through a user study of 14 software engineering researchers we found that RepoGrams can assist researchers in filtering candidate software projects and make more reasoned choices of targets for their evaluations. The tool is open source and is available online: http://repograms.net/","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"122 1","pages":"109-120"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90919749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Chowdhury, Abram Hindle
The improvement in battery technology for battery-driven devices is insignificant compared to their computing ability. In spite of the overwhelming advances in processing ability, adoption of sophisticated applications is hindered by the fear of shorter battery life. This is one of the several reasons software developers are becoming conscious of writing energy-efficient code. Research has been conducted to model software energy consumption, to reduce energy drains, and to understand developers' expertise on energy efficiency. In this paper, however, we investigate the nature of energy-aware software projects. We observed that projects concerned with energy issues are larger and more popular than the projects that do not address energy consumption. Energy-related code changes are larger than others (e.g., bug fixes). In addition, our initial results suggest that energy efficiency is mostly addressed on certain platforms and applications.
{"title":"Characterizing Energy-Aware Software Projects: Are They Different?","authors":"S. Chowdhury, Abram Hindle","doi":"10.1145/2901739.2903494","DOIUrl":"https://doi.org/10.1145/2901739.2903494","url":null,"abstract":"The improvement in battery technology for battery-driven devices is insignificant compared to their computing ability. In spite of the overwhelming advances in processing ability, adoption of sophisticated applications is hindered bythe fear of shorter battery life. This is one of the several reasons software developers are becoming conscious of writing energy efficient code.Research has been conducted to model software energyconsumption, to reduce energy drains, and to understanddevelopers expertise on energy efficiency. In this paper,however, we investigate the nature of energy-aware softwareprojects. We observed that projects concerned with energyissues are larger and more popular than the projects thatdo not address energy consumption. Energy related codechanges are larger than others (e.g., bug fixes). In addition,our initial results suggest that energy efficiency is mostlyaddressed on certain platforms and applications.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"17 1","pages":"508-511"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81326945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Mäntylä, Bram Adams, Giuseppe Destefanis, D. Graziotin, Marco Ortu
Similar to other industries, the software engineering domain is plagued by psychological diseases such as burnout, which lead developers to lose interest, exhibit lower activity and/or feel powerless. Prevention is essential for such diseases, which in turn requires early identification of symptoms. The emotional dimensions of Valence, Arousal and Dominance (VAD) are able to derive a person’s interest (attraction), level of activation and perceived level of control for a particular situation from textual communication, such as emails. As an initial step towards identifying symptoms of productivity loss in software engineering, this paper explores the VAD metrics and their properties on 700,000 Jira issue reports containing over 2,000,000 comments, since issue reports keep track of a developer’s progress on addressing bugs or new features. Using a general-purpose lexicon of 14,000 English words with known VAD scores, our results show that issue reports of different types (e.g., Feature Request vs. Bug) have a fair variation of Valence, while an increase in issue priority (e.g., from Minor to Critical) typically increases Arousal. Furthermore, we show that as an issue’s resolution time increases, so does the arousal of the individual the issue is assigned to. Finally, the resolution of an issue increases valence, especially for the issue Reporter and for quickly addressed issues. The existence of such relations between VAD and issue report activities shows promise that text mining in the future could offer an alternative way for work health assessment surveys.
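Lexicon-based VAD scoring of a comment can be sketched as follows. The four-word lexicon and its scores are made up for illustration, and averaging the scores of the matched words is an assumed aggregation; the abstract does not state how word-level scores are combined per comment.

```python
import re
from statistics import mean

# Hypothetical mini-lexicon: word -> (valence, arousal, dominance).
# The scores are made up for illustration.
LEXICON = {
    "great": (8.0, 6.0, 7.0),
    "broken": (2.5, 5.5, 3.5),
    "urgent": (4.0, 7.5, 5.0),
    "thanks": (7.5, 4.5, 6.5),
}


def vad_score(text, lexicon=LEXICON):
    """Average V, A and D over the words found in the lexicon.
    Returns None when no word of the text is covered."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[w] for w in words if w in lexicon]
    if not hits:
        return None
    return tuple(mean(dim) for dim in zip(*hits))


positive = vad_score("Thanks, great fix!")
negative = vad_score("Build is broken, this is urgent")
```

Returning `None` for comments without any covered word keeps them out of later statistics instead of silently counting them as neutral.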
{"title":"Mining Valence, Arousal, and Dominance - Possibilities for Detecting Burnout and Productivity?","authors":"M. Mäntylä, Bram Adams, Giuseppe Destefanis, D. Graziotin, Marco Ortu","doi":"10.1145/2901739.2901752","DOIUrl":"https://doi.org/10.1145/2901739.2901752","url":null,"abstract":"Similar to other industries, the software engineering domain is plagued by psychological diseases such as burnout, which lead developers to lose interest, exhibit lower activity and/or feel powerless. Prevention is essential for such diseases, which in turn requires early identification of symptoms. The emotional dimensions of Valence, Arousal and Dominance (VAD) are able to derive a person’s interest (attraction), level of activation and perceived level of control for a particular situation from textual communication, such as emails. As an initial step towards identifying symptoms of productivity loss in software engineering, this paper explores the VAD metrics and their properties on 700,000 Jira issue reports containing over 2,000,000 comments, since issue reports keep track of a developer’s progress on addressing bugs or new features. Using a general-purpose lexicon of 14,000 English words with known VAD scores, our results show that issue reports of different type (e.g., Feature Request vs. Bug) have a fair variation of Valence, while increase in issue priority (e.g., from Minor to Critical) typically increases Arousal. Furthermore, we show that as an issue’s resolution time increases, so does the arousal of the individual the issue is assigned to. Finally, the resolution of an issue increases valence, especially for the issue Reporter and for quickly addressed issues. 
The existence ofsuch relations between VAD and issue report activities shows promise that text mining in the future could offer an alternative way for work health assessment surveys.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"291 1","pages":"247-258"},"PeriodicalIF":0.0,"publicationDate":"2016-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78502048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}