B. Yost, Michael J. Coblenz, B. Myers, Joshua Sunshine, Jonathan Aldrich, Sam Weber, M. Patron, M. Heeren, S. Krueger, M. Pfaff
Context: Critical software systems developed for the government continue to be of lower quality than expected, despite extensive literature describing best practices in software engineering. Goal: We wanted to better understand the extent of certain issues in the field and the relationship to software quality. Method: We surveyed fifty software development professionals and asked about practices and barriers in the field and the resulting software quality. Results: There is evidence of certain problematic issues for developers and specific quality characteristics that seem to be affected. Conclusions: This motivates future work to address the most problematic barriers and issues impacting software quality.
"Software Development Practices, Barriers in the Field and the Relationship to Software Quality." B. Yost, Michael J. Coblenz, B. Myers, Joshua Sunshine, Jonathan Aldrich, Sam Weber, M. Patron, M. Heeren, S. Krueger, M. Pfaff. Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 2016-09-08. DOI: https://doi.org/10.1145/2961111.2962614
Software energy consumption has emerged as a growing concern in recent years. Managing the energy consumed by software is, however, a difficult challenge due to the large number of factors affecting it -- namely, features of the processor, memory, cache, and other hardware components; characteristics of the program and the workload running; OS routines; and compiler optimisations, among others. In this paper we study the relevance of numerous architectural and program features (static and dynamic) to the energy consumed by software. The motivation behind the study is to gain an understanding of the features affecting software energy and to provide recommendations on features to optimise for energy efficiency. In our study we used 58 subject desktop programs, each with its own workload, drawn from different application domains. We collected over 100 hardware and software metrics, statically and dynamically, using existing tools for program analysis, instrumentation, and run-time monitoring. We then performed statistical feature selection to extract the features relevant to energy consumption, and we discuss potential optimisations for the selected features. We also examine whether the energy-relevant features differ from those known to affect software performance. The features commonly selected in our experiments were execution time, cache accesses, memory instructions, context switches, CPU migrations, and program length (the Halstead metric). All of these features are known to affect software performance, in terms of running time, power consumed, and latency.
"A Study on the Influence of Software and Hardware Features on Program Energy." A. Rajan, Adel Noureddine, Panagiotis Stratis. Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 2016-09-08. DOI: https://doi.org/10.1145/2961111.2962593
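The abstract does not detail the study's exact feature-selection procedure, so as a rough illustration of the idea, the sketch below ranks candidate hardware/software metrics by the strength of their correlation with measured energy. All metric names and values are invented for illustration, not taken from the study.

```python
# Correlation-based feature-selection sketch (illustrative only).

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def select_features(metrics, energy, k=3):
    """Rank candidate metrics by |correlation| with measured energy."""
    ranked = sorted(metrics, key=lambda name: -abs(pearson(metrics[name], energy)))
    return ranked[:k]

# Toy profile of 5 program runs: each metric is a list of per-run values.
metrics = {
    "exec_time":      [1.0, 2.0, 3.0, 4.0, 5.0],
    "cache_accesses": [10, 19, 31, 42, 48],
    "ctx_switches":   [5, 5, 4, 6, 5],        # nearly constant -> weak signal
}
energy = [2.1, 4.0, 6.2, 8.1, 9.9]            # joules, roughly linear in time

top = select_features(metrics, energy, k=2)
print(top)
```

A real pipeline would also need to handle correlated predictors (execution time and cache accesses move together here), which is one reason statistical feature selection rather than simple ranking is used in practice.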
The software engineering community has always sought to build great software and continues to seek out ways and approaches for doing this. The UX movement emphasizes the usability of the developed product. Agile approaches like scrum focus on aligning the functionality and features of the final product more closely with user/customer/market requirements. The recent interest in DevOps has brought to the fore the need to address the challenges once software goes into production. Despite this, in an enterprise environment, great software does not necessarily translate into real business benefits; few investments fail because the software didn't work [1], [2]. The overwhelming evidence points to the need to actively manage to achieve the business benefits being sought [3], [4], [5], [6]. This keynote presentation introduces the concepts and practices of benefits management and benefits realization that have emerged over the last 25 years. It highlights the issues and challenges in deploying software to deliver expected business outcomes. It suggests that this is a missing perspective in software engineering. Suggestions for how this perspective might be more closely integrated with software engineering are proposed.
"What about the Benefits?: A Missing Perspective in Software Engineering." J. Peppard. Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 2016-09-08. DOI: https://doi.org/10.1145/2961111.2962642
Background. When estimating whether a software module is faulty based on the value of a measure X for a software internal attribute (e.g., size, structural complexity, cohesion, coupling), it is sensible to set a threshold on fault-proneness first and then induce a threshold on X by using a fault-proneness model where X plays the role of the independent variable. However, some modules cannot be estimated as either faulty or non-faulty with confidence: they belong to a "grey zone," and estimating them either way would be largely arbitrary and may result in several erroneous decisions. Objective. We propose and evaluate an approach to setting thresholds on X to identify which modules can be confidently estimated faulty or non-faulty, and which ones cannot be estimated either way. Method. Suppose that we do not know whether the modules belonging to a subset of a set of modules are faulty, as happens in practical cases with the modules whose faultiness needs to be estimated. We build two fault-proneness models by using the set of modules as the training set. The "pessimistic" model is built by assuming that all modules whose faultiness is unknown are actually faulty, and the "optimistic" model by assuming that they are actually non-faulty. The optimistic and pessimistic models can be used to set two thresholds, an optimistic and a pessimistic one. A module is estimated faulty by the optimistic (resp., pessimistic) model with the optimistic (resp., pessimistic) threshold if its fault-proneness is above the threshold, and non-faulty otherwise. A module that is estimated faulty (resp., non-faulty) by both the optimistic model with the optimistic threshold and the pessimistic model with the pessimistic threshold is estimated faulty (resp., non-faulty). Modules for which the estimates of the two models with their associated thresholds conflict are in the "grey zone," i.e., no reliable faultiness estimation can be made for them. Results. We applied our approach to datasets from the PROMISE repository, carried out cross-validations, and assessed accuracy via commonly used indicators. We also compared our results with those obtained with the conventional approach that uses one Binary Logistic Regression model. Our results show that our approach is effective in identifying the grey zone of values of X in which modules cannot be reliably estimated as either faulty or non-faulty and, conversely, the intervals in which modules can be estimated faulty or non-faulty. Our approach turns out to be more accurate, in terms of F-measure, than the conventional one in the majority of cases. In addition, it provides F-measure values that are very concentrated, i.e., it consistently identifies the intervals in which modules can be estimated faulty or non-faulty. Conclusions. Our method can be practically used for identifying "grey zones" in which it does not make much sense to estimate modules' faultiness based on measure X and, therefore, the zones in which modules' faultiness can be estimated.
"Identifying Thresholds for Software Faultiness via Optimistic and Pessimistic Estimations." L. Lavazza, S. Morasca. Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 2016-09-08. DOI: https://doi.org/10.1145/2961111.2962595
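The agreement rule described above, classifying a module only when the optimistic and pessimistic threshold models agree, can be sketched as follows. The threshold values are illustrative, not taken from the paper; the pessimistic model (trained assuming unknown modules are faulty) would typically induce a lower threshold on X than the optimistic one.

```python
# Grey-zone decision rule: classify only when both threshold models agree.

def classify(x, t_optimistic, t_pessimistic):
    """Classify a module by measure X against the two induced thresholds."""
    above_opt = x > t_optimistic
    above_pes = x > t_pessimistic
    if above_opt and above_pes:
        return "faulty"          # both models say faulty
    if not above_opt and not above_pes:
        return "non-faulty"      # both models say non-faulty
    return "grey zone"           # the two models disagree

# Example: X could be, say, structural complexity; thresholds are invented.
t_pes, t_opt = 10.0, 18.0
for x in (5.0, 14.0, 25.0):
    print(x, classify(x, t_opt, t_pes))
```

Values between the two thresholds land in the grey zone, which is exactly the interval the method deliberately refuses to classify.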
Background: A survey is a research method that aims to gather data from a large population of interest. Despite being extensively used in software engineering, survey-based research faces several challenges, such as selecting a representative population sample and designing the data collection instruments. Objective: This article aims to summarize the existing guidelines, supporting instruments, and recommendations on how to conduct and evaluate survey-based research. Methods: A systematic search using manual search and snowballing techniques was used to identify primary studies supporting survey research in software engineering. We used an annotated review to present the findings, describing the references of interest in the research topic. Results: The summary describes 15 available articles addressing survey methodology, based upon which we derived a set of recommendations on how to conduct survey research, and their impact on the community. Conclusion: Survey-based research in software engineering has its particular challenges, as illustrated by several articles in this review. The annotated review can contribute by raising awareness of such challenges and by presenting recommendations to overcome them.
"Survey Guidelines in Software Engineering: An Annotated Review." J. Molléri, K. Petersen, E. Mendes. Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 2016-09-08. DOI: https://doi.org/10.1145/2961111.2962619
Context: To survive in a highly competitive software market, product managers strive for frequent, incremental releases in ever shorter cycles. Release decisions are characterized by high complexity and have a high impact on project success. Under such conditions, using the experience from past releases could help product managers make more informed decisions. Goal and research objectives: To make decisions about when to release more operational, we formulated release readiness (RR) as a binary classification problem. The goal of the research presented in this paper is twofold: (i) to propose a machine learning approach called RC* (Release readiness Classification applying predictive techniques) with two approaches for defining the training set, called incremental and sliding window, and (ii) to empirically evaluate the applicability of RC* for varying project characteristics. Methodology: In the form of explorative case study research, we applied the RC* method to four OSS projects under the Apache Software Foundation, retrospectively covering a period of 82 months, 90 releases, and 3722 issues. We use Random Forest as the classification technique, along with eight independent variables, to classify release readiness in individual weeks. Predictive performance was measured in terms of precision, recall, F-measure, and accuracy. Results: The incremental and sliding-window approaches achieve an overall accuracy of 76% and 79%, respectively, in classifying RR for the four analyzed projects. The incremental approach outperforms the sliding-window approach in terms of the stability of the predictive performance. Predictive performance for both approaches is significantly influenced by three project characteristics: (i) release duration, (ii) the number of issues in a release, and (iii) the size of the initial training dataset. Conclusion: Our initial observation is that the incremental approach achieves higher accuracy when releases have long durations and few issues and when classifiers are trained with a large training set. The sliding-window approach, on the other hand, achieves higher accuracy when releases have short durations and classifiers are trained with a small training set.
"Release Readiness Classification: An Explorative Case Study." S. Alam, Dietmar Pfahl, G. Ruhe. Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 2016-09-08. DOI: https://doi.org/10.1145/2961111.2962629
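The two training-set policies compared in the study can be sketched as simple slicing functions over the weekly history; the integer week list below is a stand-in for the per-week feature vectors the classifier would actually be trained on.

```python
# The two training-set policies for weekly release-readiness classification.

def incremental_window(history, current):
    """Incremental policy: all weeks before `current` form the training set."""
    return history[:current]

def sliding_window(history, current, w):
    """Sliding-window policy: only the last `w` weeks before `current`."""
    return history[max(0, current - w):current]

weeks = list(range(10))                   # stand-in for weekly feature vectors
print(incremental_window(weeks, 6))       # trains on weeks 0..5
print(sliding_window(weeks, 6, 3))        # trains on weeks 3..5
```

The trade-off the study reports follows directly from the slicing: the incremental set keeps growing (more data, but potentially stale), while the sliding window stays small and recent.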
A. Al-Subaihin, Federica Sarro, S. Black, L. Capra, M. Harman, Yue Jia, Yuanyuan Zhang
Context: Categorising software systems according to their functionality yields many benefits to both users and developers. Goal: In order to uncover the latent clustering of mobile apps in app stores, we propose a novel technique that measures app similarity based on claimed behaviour. Method: Features are extracted using information retrieval augmented with ontological analysis and used as attributes to characterise apps. These attributes are then used to cluster the apps using agglomerative hierarchical clustering. We empirically evaluate our approach on 17,877 apps mined from the BlackBerry and Google app stores in 2014. Results: The results show that our approach dramatically improves the existing categorisation quality for both the BlackBerry (from 0.02 to 0.41 on average) and Google (from 0.03 to 0.21 on average) stores. We also find a strong Spearman rank correlation (ρ = 0.96 for Google and ρ = 0.99 for BlackBerry) between the number of apps and the ideal granularity within each category, indicating that ideal granularity increases with category size, as expected. Conclusions: The current categorisations in the app stores studied do not exhibit good classification quality in terms of the claimed feature space. However, better quality can be achieved using a good feature extraction technique and a traditional clustering method.
"Clustering Mobile Apps Based on Mined Textual Features." A. Al-Subaihin, Federica Sarro, S. Black, L. Capra, M. Harman, Yue Jia, Yuanyuan Zhang. Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 2016-09-08. DOI: https://doi.org/10.1145/2961111.2962600
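As a minimal sketch of the clustering step, the code below runs agglomerative clustering over sets of claimed features using Jaccard distance and single linkage. The linkage choice, stopping distance, app names, and feature sets are assumptions for illustration; the paper extracts its features from app descriptions via information retrieval and ontological analysis.

```python
# Single-linkage agglomerative clustering over claimed-feature sets
# with Jaccard distance (illustrative sketch).

def jaccard_dist(a, b):
    """Jaccard distance between two feature sets."""
    return 1.0 - len(a & b) / len(a | b)

def agglomerate(items, stop_dist):
    """Repeatedly merge the two closest clusters (single linkage) until
    the closest remaining pair is farther apart than stop_dist."""
    clusters = [[name] for name in items]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(jaccard_dist(items[x], items[y])
                        for x in clusters[i] for y in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > stop_dist:
            break                          # no pair close enough to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Invented apps with invented claimed features.
apps = {
    "RouteFinder":  {"map", "gps", "navigation"},
    "CityNavigate": {"map", "gps", "traffic"},
    "PhotoFilterX": {"camera", "filter", "share"},
}
print(agglomerate(apps, stop_dist=0.6))
```

The two navigation apps merge (Jaccard distance 0.5) while the photo app stays apart, which is the kind of functionality-based grouping the paper evaluates at app-store scale.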
Modern computer systems are highly configurable, which complicates the testing and debugging process. The sheer size of the configuration space makes software quality even harder to achieve. Performance is one of the key non-functional qualities: performance bugs can cause significant performance degradation and lead to poor user experience. However, performance bugs are difficult to expose, primarily because detecting them requires specific inputs as well as a specific execution environment (e.g., configurations). While researchers have developed techniques to analyze, quantify, detect, and fix performance bugs, we conjecture that many of these techniques may not be effective in highly configurable systems. In this paper, we study the challenges that configurability creates for handling performance bugs. We study 113 real-world performance bugs, randomly sampled from three highly configurable open-source projects: Apache, MySQL, and Firefox. The findings of this study provide a set of lessons learned and guidance to aid practitioners and researchers in better handling performance bugs in highly configurable software systems.
"An Empirical Study on Performance Bugs for Highly Configurable Software Systems." Xue Han, Tingting Yu. Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 2016-09-08. DOI: https://doi.org/10.1145/2961111.2962602
{"title":"Is there a Future for Empirical Software Engineering?","authors":"C. Wohlin","doi":"10.1145/2961111.2962641","DOIUrl":"https://doi.org/10.1145/2961111.2962641","url":null,"abstract":"Empirical studies of different kinds are nowadays regularly published in software engineering journals and conferences. Many empirical studies have been published, but is this sufficient? Individual studies are important, but their full potential for evidence-based software engineering [1] is not yet exploited. As a discipline we must go further to make our individual studies more useful. Other research should be able to build on these studies, and industry should be able to make informed decisions based on the empirical research. There are several challenges in making individual empirical studies useful in a broader context. Anyone who has conducted a systematic literature review [2] has most likely experienced the difficulty of synthesizing the relevant studies. In all too many cases, we end up with a systematic mapping study [3], or at best something on the borderline between a review and a mapping study. This illustrates the need to write for synthesis [4], and in particular to include sufficient contextual information to allow for synthesis [4]. Evidence-based software engineering [1] through the use of systematic literature studies (reviews and maps) has emerged. Methodological support and guidelines (e.g., [2], [3], [6] and [7]) for conducting systematic literature studies have been formulated, and they should be carefully followed. However, more is needed; we still need to improve. The keynote focuses on the needs for the future as seen by the presenter. Synthesis has proven hard, and improvements are needed in both primary and secondary studies. It has been shown that the reliability of secondary studies can be challenged [8]. However, if we do manage to publish high-quality primary studies, and we truly manage to conduct strong systematic literature reviews, we have a good basis both for building theories in software engineering and for enabling industry to make informed decisions using scientific evidence. Unfortunately, this is not the situation today. Theories are mostly based on our own research, as exemplified by [9]. This is fine, but much more can be done if we can more easily build on the research done by others to develop theories. Furthermore, industry often makes decisions related to processes, methods, techniques and tools before we manage to obtain sufficient evidence for recommendations. The points made above are highlighted using personal experiences from conducting systematic literature studies, collaborating with industry, and research on developing an empirically based software engineering theory.","PeriodicalId":208212,"journal":{"name":"Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127198735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a Substantive Theory of Decision-Making in Software Project Management: Preliminary Findings from a Qualitative Study","authors":"J. A. O. G. Cunha, F. Silva, H. Moura, Francisco J. S. Vasconcellos","doi":"10.1145/2961111.2962604","DOIUrl":"https://doi.org/10.1145/2961111.2962604","url":null,"abstract":"Context: In software project management, decision-making is a complex set of tasks based largely on human relations and on individual knowledge and cultural background. The factors that affect the decisions of software project managers (SPMs), as well as their potential consequences, deserve attention because project delays and failures may stem from a series of poor decisions. Goals: To understand how SPMs make decisions based on how they interpret their experiences in the workplace, and to identify antecedents and consequences of those decisions in order to increase the effectiveness of project management. We also aim to refine the research design for future investigations. Method: Semi-structured interviews were carried out with SPMs at a large Brazilian governmental organization and a large Brazilian private organization. Results: We found that decision-making in software project management is based on knowledge sharing, in which the SPM acts as a facilitator. This phenomenon is influenced by individual factors, such as experience, knowledge, personality, organizational ability, communication, negotiation, interpersonal relationships and a systemic vision of the project, and by situational factors such as the autonomy of the SPM, constant feedback and team members' technical competence. Conclusions: Due to the uncertainty and dynamism inherent in software projects, SPMs focus on making, monitoring and adjusting decisions in an argument-driven way. Based on the initial relationships among the identified factors, the research design was refined.","PeriodicalId":208212,"journal":{"name":"Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124937201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}