M. Rapoport, Philippe Suter, Erik Wittern, O. Lhoták, Julian T Dolby
Relying on ubiquitous Internet connectivity, applications on mobile devices frequently perform web requests during their execution. They fetch data for users to interact with, invoke remote functionality, or send user-generated content or metadata. Collectively, these requests reveal common practices of mobile application development, such as which external services are used and how, and they point to possible negative effects, such as security and privacy violations or impacts on battery life. In this paper, we assess different ways to analyze what web requests Android applications make. We start by presenting dynamic data collected from running 20 randomly selected Android applications and observing their network activity. Next, we present a static analysis tool, Stringoid, that analyzes string concatenations in Android applications to estimate constructed URL strings. Using Stringoid, we extract URLs from 30,000 Android applications and compare its performance with that of a simpler constant extraction analysis. Finally, we discuss the advantages and limitations of dynamic and static analyses for extracting URLs, comparing the data extracted by Stringoid from the same 20 applications with the dynamically collected data.
{"title":"Who you gonna call?: analyzing web requests in Android applications","authors":"M. Rapoport, Philippe Suter, Erik Wittern, O. Lhoták, Julian T Dolby","doi":"10.1109/MSR.2017.11","DOIUrl":"https://doi.org/10.1109/MSR.2017.11","url":null,"abstract":"Relying on ubiquitous Internet connectivity, applications on mobile devices frequently perform web requests during their execution. They fetch data for users to interact with, invoke remote functionalities, or send user-generated content or meta-data. These requests collectively reveal common practices of mobile application development, like what external services are used and how, and they point to possible negative effects like security and privacy violations, or impacts on battery life. In this paper, we assess different ways to analyze what web requests Android applications make. We start by presenting dynamic data collected from running 20 randomly selected Android applications and observing their network activity. Next, we present a static analysis tool, Stringoid, that analyzes string concatenations in Android applications to estimate constructed URL strings. Using Stringoid, we extract URLs from 30, 000 Android applications, and compare the performance with a simpler constant extraction analysis. Finally, we present a discussion of the advantages and limitations of dynamic and static analyses when extracting URLs, as we compare the data extracted by Stringoid from the same 20 applications with the dynamically collected data.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"60 1","pages":"80-90"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84554111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
T. Ishio, R. Kula, Tetsuya Kanda, D. Germán, Katsuro Inoue
A software product often depends on a large number of third-party components. To assess potential risks, such as security vulnerabilities and license violations, a list of the components and their versions in a product is important for release engineers and security analysts. Since such a list is not always available, a code comparison technique named Software Bertillonage has been proposed to test whether a product likely includes a copy of a particular component. Although the technique can extract candidate reused components, a user still has to manually identify the original components among the candidates. In this paper, we propose a method to automatically select the most likely origin of components reused in a product, based on the assumption that a product tends to include an entire copy of a component rather than a partial copy. More concretely, given a Java product and a repository of jar files of existing components, our method greedily selects jar files that can provide Java classes to the product. To compare the method with the existing technique, we conducted an evaluation using randomly created jar files including up to 1,000 components. The Software Bertillonage technique reports many candidates; its precision and recall are 0.357 and 0.993, respectively. Our method reports a list of original components with a precision of 0.998 and a recall of 0.997.
{"title":"Software Ingredients: Detection of Third-Party Component Reuse in Java Software Release","authors":"T. Ishio, R. Kula, Tetsuya Kanda, D. Germán, Katsuro Inoue","doi":"10.1145/2901739.2901773","DOIUrl":"https://doi.org/10.1145/2901739.2901773","url":null,"abstract":"A software product is often dependent on a large number of third-party components.To assess potential risks, such as security vulnerabilities and license violations, a list of components and their versions in a product is important for release engineers and security analysts.Since such a list is not always available, a code comparison technique named Software Bertillonage has been proposed to test whether a product likely includes a copy of a particular component or not.Although the technique can extract candidates of reused components, a user still has to manually identify the original components among the candidates.In this paper, we propose a method to automatically select the most likely origin of components reused in a product, based on an assumption that a product tends to include an entire copy of a component rather than a partial copy.More concretely, given a Java product and a repository of jar files of existing components, our method selects jar files that can provide Java classes to the product in a greedy manner.To compare the method with the existing technique, we have conducted an evaluation using randomly created jar files including up to 1,000 components.The Software Bertillonage technique reports many candidates; the precision and recall are 0.357 and 0.993, respectively.Our method reports a list of original components whose precision and recall are 0.998 and 0.997.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"373 1","pages":"339-350"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74871173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
During software evolution, the source code of a system frequently changes due to bug fixes or new feature requests. Some of these changes may accidentally degrade the performance of a newly released software version. A notable problem in regression testing is how to find the problematic changes, out of a large number of committed changes, that may be responsible for performance regressions under certain test inputs. We propose a novel recommendation system, coined PerfImpact, for automatically identifying code changes that may potentially be responsible for performance regressions, using a combination of search-based input profiling and change impact analysis techniques. PerfImpact independently sends the same input values to two releases of the application under test and uses a genetic algorithm to mine execution traces and explore a large space of input value combinations to find specific inputs that take longer to execute in the new release. Since these input values are likely to expose performance regressions, PerfImpact automatically mines the corresponding execution traces to evaluate the impact of each code change on performance and ranks the changes by their estimated contribution to performance regressions. We implemented PerfImpact and evaluated it on different releases of two open-source web applications. The results demonstrate that PerfImpact effectively detects input value combinations that expose performance regressions and mines the code changes that are likely responsible for them.
{"title":"Mining Performance Regression Inducing Code Changes in Evolving Software","authors":"Qi Luo, D. Poshyvanyk, M. Grechanik","doi":"10.1145/2901739.2901765","DOIUrl":"https://doi.org/10.1145/2901739.2901765","url":null,"abstract":"During software evolution, the source code of a system frequently changes due to bug fixes or new feature requests. Some of these changes may accidentally degrade performance of a newly released software version. A notable problem of regression testing is how to find problematic changes (out of a large number of committed changes) that may be responsible for performance regressions under certain test inputs.We propose a novel recommendation system, coined as PefImpact, for automatically identifying code changes that may potentially be responsible for performance regressions using a combination of search-based input profiling and change impact analysis techniques. PefImpact independently sends the same input values to two releases of the application under test, and uses a genetic algorithm to mine execution traces and explore a large space of input value combinations to find specific inputs that take longer time to execute in a new release. Since these input values are likely to expose performance regressions, PefImpact automatically mines the corresponding execution traces to evaluate the impact of each code change on the performance and ranks the changes based on their estimated contribution to performance regressions. We implemented PefImpact and evaluated it on different releases of two open-source web applications. The results demonstrate that PefImpact effectively detects input value combinations to expose performance regressions and mines the code changes are likely to be responsible for these performance regressions.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"20 1","pages":"25-36"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85055323","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enriched by natural language texts, Stack Overflow code snippets are an invaluable code-centric knowledge base of small units of source code. Besides being useful for software developers, these annotated snippets can potentially serve as the basis for automated tools that provide working code solutions to specific natural language queries. With the goal of developing automated tools with the Stack Overflow snippets and surrounding text, this paper investigates the following questions: (1) How usable are the Stack Overflow code snippets? and (2) When using text search engines for matching on the natural language questions and answers around the snippets, what percentage of the top results contain usable code snippets? A total of 3M code snippets are analyzed across four languages: C#, Java, JavaScript, and Python. Python and JavaScript proved to be the languages for which the most code snippets are usable. Conversely, Java and C# proved to be the languages with the lowest usability rate. Further qualitative analysis on usable Python snippets shows the characteristics of the answers that solve the original question. Finally, we use Google search to investigate the alignment of usability and the natural language annotations around code snippets, and explore how to make snippets in Stack Overflow an adequate base for future automatic program generation.
{"title":"From Query to Usable Code: An Analysis of Stack Overflow Code Snippets","authors":"Di Yang, Aftab Hussain, C. Lopes","doi":"10.1145/2901739.2901767","DOIUrl":"https://doi.org/10.1145/2901739.2901767","url":null,"abstract":"Enriched by natural language texts, Stack Overflow code snippets arean invaluable code-centric knowledge base of small units ofsource code. Besides being useful for software developers, theseannotated snippets can potentially serve as the basis for automatedtools that provide working code solutions to specific natural languagequeries. With the goal of developing automated tools with the Stack Overflowsnippets and surrounding text, this paper investigates the followingquestions: (1) How usable are the Stack Overflow code snippets? and(2) When using text search engines for matching on the naturallanguage questions and answers around the snippets, what percentage ofthe top results contain usable code snippets?A total of 3M code snippets are analyzed across four languages: C#,Java, JavaScript, and Python. Python and JavaScript proved to be thelanguages for which the most code snippets are usable. Conversely,Java and C# proved to be the languages with the lowest usabilityrate. Further qualitative analysis on usable Python snippets showsthe characteristics of the answers that solve the original question. Finally,we use Google search to investigate the alignment ofusability and the natural language annotations around code snippets, andexplore how to make snippets in Stack Overflow anadequate base for future automatic program generation.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"14 1","pages":"391-401"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90206597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Marco Ortu, Alessandro Murgia, Giuseppe Destefanis, Parastou Tourani, R. Tonelli, M. Marchesi, Bram Adams
Issue tracking systems store valuable data for testing hypotheses concerning maintenance, building statistical prediction models and (recently) investigating developer affectiveness. For the latter, issue tracking systems can be mined to explore developers' emotions, sentiments and politeness (affects for short). However, research on affect detection in software artefacts is still in its early stage due to the lack of manually validated data and tools. In this paper, we contribute to the research on affects in software artefacts by providing a labeling of the emotions present in issue comments. We manually labeled 2,000 issue comments and 4,000 sentences written by developers with emotions such as love, joy, surprise, anger, sadness and fear. Labeled comments and sentences are linked to software artefacts reported in our previously published dataset (containing more than 1K projects, more than 700K issue reports and more than 2 million issue comments). The enriched dataset presented in this paper allows the investigation of the role of affects in software development.
{"title":"The Emotional Side of Software Developers in JIRA","authors":"Marco Ortu, Alessandro Murgia, Giuseppe Destefanis, Parastou Tourani, R. Tonelli, M. Marchesi, Bram Adams","doi":"10.1145/2901739.2903505","DOIUrl":"https://doi.org/10.1145/2901739.2903505","url":null,"abstract":"ABSTRACTIssue tracking systems store valuable data for testing hy-potheses concerning maintenance, building statistical pre-diction models and (recently) investigating developer affec-tiveness. For the latter, issue tracking systems can be minedto explore developers emotions, sentiments and politeness, affects for short. However, research on affect detection insoftware artefacts is still in its early stage due to the lack ofmanually validated data and tools.In this paper, we contribute to the research of affectson software artefacts by providing a labeling of emotionspresent on issue comments.We manually labeled 2,000 issue comments and 4,000 sen-tences written by developers with emotions such as love,joy, surprise, anger, sadness and fear. Labeled commentsand sentences are linked to software artefacts reported inour previously published dataset (containing more than 1Kprojects, more than 700K issue reports and more than 2million issue comments). The enriched dataset presented inthis paper allows the investigation of the role of affects insoftware development.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"2017 1","pages":"480-483"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79690806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
C. A. Thompson, G. Murphy, Marc Palyart, Marko Gasparic
Software developers use issues as a means to describe a range of activities to be undertaken on a software system, including features to be added and defects that require fixing. When creating issues, software developers expend manual effort to specify relationships between issues, such as one issue blocking another or one issue being a sub-task of another. In particular, developers use a variety of relationships to express how work is to be broken down on a project. To better understand how software developers use work breakdown relationships between issues, we manually coded a sample of work breakdown relationships from three open source systems. We report on our findings and describe how the recognition of work breakdown relationships opens up new ways to improve software development techniques.
{"title":"How Software Developers Use Work Breakdown Relationships in Issue Repositories","authors":"C. A. Thompson, G. Murphy, Marc Palyart, Marko Gasparic","doi":"10.1145/2901739.2901779","DOIUrl":"https://doi.org/10.1145/2901739.2901779","url":null,"abstract":"Software developers use issues as a means to describe a range of activities to be undertaken on a software system, including features to be added and defects that require fixing. When creating issues, software developers expend manual effort to specify relationships between issues, such as one issue blocking another or one issue being a sub-task of another. In particular, developers use a variety of relationships to express how work is to be broken down on a project. To better understand how software developers use work breakdown relationships between issues, we manually coded a sample of work breakdown relationships from three open source systems. We report on our findings and describe how the recognition of work breakdown relationships opens up new ways to improve software development techniques.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"1 1","pages":"281-285"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75584873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kostadin Damevski, Hui Chen, D. Shepherd, L. Pollock
Using IDE usage data to analyze the behavior of software developers in the field, during the course of their daily work, can lend support to (or dispute) laboratory studies of developers. This paper describes a technique that leverages Hidden Markov Models (HMMs) as a means of mining high-level developer behavior from low-level IDE interaction traces of many developers in the field. HMMs use dual stochastic processes to model higher-level hidden behavior using observable input sequences of events. We propose an interactive approach to mining interpretable HMMs, based on guiding a human expert in building a high-quality HMM in an iterative, one-state-at-a-time manner. The final result is a model that is both representative of the field data and captures the field phenomena of interest. We apply our HMM construction approach to study debugging behavior, using a large IDE interaction dataset collected from nearly 200 developers at ABB, Inc. Our results highlight the different modes and constituent actions in debugging exhibited by the developers in our dataset.
{"title":"Interactive Exploration of Developer Interaction Traces using a Hidden Markov Model","authors":"Kostadin Damevski, Hui Chen, D. Shepherd, L. Pollock","doi":"10.1145/2901739.2901741","DOIUrl":"https://doi.org/10.1145/2901739.2901741","url":null,"abstract":"Using IDE usage data to analyze the behavior of software developers in the field, during the course of their daily work, can lend support to (or dispute) laboratory studies of devel- opers. This paper describes a technique that leverages Hidden Markov Models (HMMs) as a means of mining high-level developer behavior from low-level IDE interaction traces of many developers in the field. HMMs use dual stochastic processes to model higher-level hidden behavior using observable input sequences of events. We propose an interactive approach of mining interpretable HMMs, based on guiding a human expert in building a high quality HMM in an iterative, one state at a time, manner. The final result is a model that is both representative of the field data and captures the field phenomena of interest. We apply our HMM construction approach to study debugging behavior, using a large IDE interaction dataset collected from nearly 200 developers at ABB, Inc. Our results highlight the different modes and constituent actions in debugging, exhibited by the developers in our dataset.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"19 1","pages":"126-136"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73342221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jacob G. Barnett, Charles K. Gathuru, Luke S. Soldano, Shane McIntosh
Just-In-Time (JIT) defect prediction models aim to predict the commits that will introduce defects in the future. Traditionally, JIT defect prediction models are trained using metrics that are primarily derived from aspects of the code change itself (e.g., the size of the change, the author's prior experience). In addition to the code that is submitted during a commit, authors write commit messages, which describe the commit for archival purposes. It is our position that the level of detail in these commit messages can provide additional explanatory power to JIT defect prediction models. Hence, in this paper, we analyze the relationship between the defect proneness of commits and commit message volume (i.e., the length of the commit message) and commit message content (approximated using spam filtering technology). Through analysis of JIT models that were trained using 342 GitHub repositories, we find that our JIT models outperform random guessing models, achieving AUC and Brier scores ranging from 0.63 to 0.96 and from 0.01 to 0.21, respectively. Furthermore, our metrics derived from commit message detail provide a statistically significant boost to the explanatory power of the JIT models in 43%–80% of the studied systems, accounting for up to 72% of the explanatory power. Future JIT studies should consider adding commit message detail metrics.
{"title":"The Relationship between Commit Message Detail and Defect Proneness in Java Projects on GitHub","authors":"Jacob G. Barnett, Charles K. Gathuru, Luke S. Soldano, Shane McIntosh","doi":"10.1145/2901739.2903496","DOIUrl":"https://doi.org/10.1145/2901739.2903496","url":null,"abstract":"Just-In-Time (JIT) defect prediction models aim to predict the commits that will introduce defects in the future. Traditionally, JIT defect prediction models are trained using metrics that are primarily derived from aspects of the code change itself (e.g., the size of the change, the author’s prior experience). In addition to the code that is submitted during a commit, authors write commit messages, which describe the commit for archival purposes. It is our position that the level of detail in these commit messages can provide additional explanatory power to JIT defect prediction models. Hence, in this paper, we analyze the relationship between the defect proneness of commits and commit message volume (i.e., the length of the commit message) and commit message content (approximated using spam filtering technology). Through analysis of JIT models that were trained using 342 GitHub repositories, we find that our JIT models outperform random guessing models, achieving AUC and Brier scores that range between 0.63-0.96 and 0.01-0.21, respectively. Furthermore, our metrics that are derived from commit message detail provide a statistically significant boost to the explanatory power to the JIT models in 43%-80% of the studied systems, accounting for up to 72% of the explanatory power. Future JIT studies should consider adding commit message detail metrics.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"120 1","pages":"496-499"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84064789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}