Pub Date: 2018-01-16; DOI: 10.1109/SANER.2018.8330214
Aline Brito, Laerte Xavier, André C. Hora, M. T. Valente
Modern software development depends on APIs to reuse code and increase productivity. Like most software systems, these libraries and frameworks also evolve, which may break existing clients. However, the main reasons for introducing breaking changes in APIs are unclear. Therefore, in this paper, we report the results of a field study, lasting almost four months, with the developers of 400 popular Java libraries and frameworks. We configured an infrastructure to observe all changes in these libraries and to detect breaking changes shortly after their introduction in the code. After identifying breaking changes, we asked the developers to explain the reasons behind their decision to change the APIs. During the study, we identified 59 breaking changes, confirmed by the developers of 19 projects. By analyzing the developers' answers, we report that breaking changes are mostly motivated by the need to implement new features, by the desire to make the APIs simpler and with fewer elements, and by the need to improve maintainability. We conclude by providing suggestions to language designers, tool builders, software engineering researchers and API developers.
Title: Why and how Java developers break APIs. In: 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 255-265. https://doi.org/10.1109/SANER.2018.8330214
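The abstract above speaks of detecting breaking changes shortly after they are introduced. As a rough illustration of what such a check involves, the sketch below compares two snapshots of a public API and flags removed or re-typed elements as breaking. It is not the authors' infrastructure: the `Change` record, the `diff_api` function, the name-to-signature representation, and the `com.example.Cache` snapshots are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    element: str   # fully qualified name of the API element
    kind: str      # "removed", "signature_changed", or "added"
    breaking: bool


def diff_api(old: dict[str, str], new: dict[str, str]) -> list[Change]:
    """Compare two snapshots of a public API (name -> signature)."""
    changes = []
    for name in sorted(old.keys() | new.keys()):
        if name not in new:
            # Clients calling this element no longer compile or link.
            changes.append(Change(name, "removed", True))
        elif name not in old:
            # Additions do not affect existing clients in this simple model.
            changes.append(Change(name, "added", False))
        elif old[name] != new[name]:
            changes.append(Change(name, "signature_changed", True))
    return changes


# Hypothetical snapshots of a library's public API before and after a commit.
before = {
    "com.example.Cache.get": "Object get(String)",
    "com.example.Cache.put": "void put(String, Object)",
    "com.example.Cache.clear": "void clear()",
}
after = {
    "com.example.Cache.get": "Optional<Object> get(String)",
    "com.example.Cache.put": "void put(String, Object)",
    "com.example.Cache.size": "int size()",
}

breaking = [c for c in diff_api(before, after) if c.breaking]
for c in breaking:
    print(f"{c.kind}: {c.element}")
```

In this model, the removal of `clear` and the changed return type of `get` are reported, while the new `size` method is not. A real detector would also have to account for visibility, inheritance, and deprecation, which this sketch ignores.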
Pub Date: 2018-01-01; DOI: 10.1109/SANER.2018.8330215
Md. Arafat Hossain, Steven Versteeg, Jun Han, M. A. Kabir, Jiaojiao Jiang, Jean-Guy Schneider
APIs play a significant role in the sharing, utilization and integration of information and service assets for enterprises, delivering significant business value. However, the documentation of service APIs can often be incomplete, ambiguous, or even non-existent, hindering API-based application development efforts. In this paper, we introduce an approach to automatically mine the fine-grained message formats required in defining the APIs of services and applications from their interaction traces, without assuming any prior knowledge. Our approach includes three major steps with corresponding techniques: (1) classifying the interaction messages of a service into clusters corresponding to message types, (2) identifying the keywords of messages in each cluster, and (3) extracting the format of each message type. We have applied our approach to network traces collected from four real services that used the following application protocols: REST, SOAP, LDAP and SIP. The results show that our approach achieves much greater accuracy in extracting message formats for service APIs than current state-of-the-art approaches.
Title: Mining accurate message formats for service APIs. In: 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 266-276. https://doi.org/10.1109/SANER.2018.8330215
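The three steps named in the abstract above (clustering messages, identifying keywords, extracting formats) can be pictured with a deliberately naive sketch. The abstract does not describe the actual techniques, so everything here is an assumption: messages are split on whitespace, clusters are formed by first token and token count, a keyword is any token that is identical at the same position in every message of a cluster, and the trace itself is invented.

```python
from collections import defaultdict


def cluster_messages(messages: list[str]) -> dict[tuple[str, int], list[list[str]]]:
    """Step 1 (simplified): group messages by first token and token count."""
    clusters = defaultdict(list)
    for msg in messages:
        tokens = msg.split()
        if not tokens:
            continue  # skip empty lines in the trace
        clusters[(tokens[0], len(tokens))].append(tokens)
    return dict(clusters)


def find_keywords(cluster: list[list[str]]) -> dict[int, str]:
    """Step 2: positions whose token is identical in every message."""
    keywords = {}
    for pos, token in enumerate(cluster[0]):
        if all(tokens[pos] == token for tokens in cluster):
            keywords[pos] = token
    return keywords


def extract_format(cluster: list[list[str]]) -> str:
    """Step 3: keep keywords, replace variable positions with a placeholder."""
    keywords = find_keywords(cluster)
    width = len(cluster[0])
    return " ".join(keywords.get(pos, "<field>") for pos in range(width))


# Invented interaction trace, for illustration only.
trace = [
    "GET /users/17 HTTP/1.1",
    "GET /users/42 HTTP/1.1",
    "DELETE /users/17 HTTP/1.1",
    "DELETE /users/99 HTTP/1.1",
]

formats = sorted(extract_format(c) for c in cluster_messages(trace).values())
for fmt in formats:
    print(fmt)
```

A cluster containing a single message yields a format made entirely of keywords, which shows why the quality of the clustering step matters so much for the accuracy of the extracted formats.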
Pub Date: 2018-01-01; DOI: 10.1109/SANER.2018.8330258
D. Dams, A. Mooij, Pepijn Kramer, A. Radulescu, Jaromir Vanhara
The high-tech industry is faced with ever-growing amounts of software to be maintained and extended. To keep the associated costs under control, there is a demand for more human overview and for large-scale code restructurings. Language technology such as parsing can assist in this, but classical restructuring tools are typically not flexible enough to accommodate the needs of specific cases. In our research we investigate ways to make software restructuring tools customizable by software developers at Thermo Fisher Scientific as well as at other high-tech companies. We report on an industry-as-lab project, in which we have collaborated on cleaning up the compilation of COM interfaces of a large industrial software component. As a generic result, we have identified a method that we call model-based software restructuring. The approach taken is to extract high-level models from the code and use these to specify and visualize the restructuring, which is then translated into low-level code transformations. To implement this approach, we integrate generic technology to develop custom solutions. We aim for semi-automation and incrementally automate recurring restructuring patterns. The COM clean-up affected 72 type libraries and 1310 client projects with (one or more) dependencies on these type libraries. We have addressed these one type library at a time, and delivered all changes without blocking regular software development. Software developers in neighboring projects immediately noticed the very low defect rate of our restructuring. Moreover, as a spin-off, we have observed that the developed tools also start to contribute to regular software development.
Title: Model-based software restructuring: Lessons from cleaning up COM interfaces in industrial legacy code. In: 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 552-556. https://doi.org/10.1109/SANER.2018.8330258
Accurate software defect prediction could help software practitioners allocate test resources to defect-prone modules effectively and efficiently. In recent decades, much effort has been devoted to building accurate defect prediction models, including developing quality defect predictors and modeling techniques. However, widely used defect predictors such as code metrics and process metrics cannot adequately describe how software modules change over the project evolution, which we believe is important for defect prediction. To deal with this problem, in this paper, we propose to use the Historical Version Sequence of Metrics (HVSM) in continuous software versions as defect predictors. Furthermore, we leverage Recurrent Neural Network (RNN), a popular modeling technique, to take HVSM as the input to build software prediction models. The experimental results show that, in most cases, the proposed HVSM-based RNN model has significantly better effort-aware ranking effectiveness than the commonly used baseline models.
Title: Connecting software metrics across versions to predict defects. Authors: Yibin Liu, Yanhui Li, Jianbo Guo, Yuming Zhou, Baowen Xu. Pub Date: 2017-12-28. In: 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 232-243. https://doi.org/10.1109/SANER.2018.8330212
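To make the idea of feeding a historical version sequence of metrics into a recurrent model more concrete, the sketch below collects a module's metric vectors across versions and runs them through a one-unit Elman-style recurrence. It is only an illustration under stated assumptions: the file names, the two metrics (lines of code and cyclomatic complexity), the functions `build_hvsm` and `rnn_score`, and the untrained, hand-picked weights are all hypothetical and are not taken from the paper.

```python
import math


def build_hvsm(versions: list[dict[str, list[float]]], module: str) -> list[list[float]]:
    """Collect a module's metric vectors over the versions in which it exists, oldest first."""
    return [metrics[module] for metrics in versions if module in metrics]


def rnn_score(sequence: list[list[float]], w_in: list[float], w_rec: float, w_out: float) -> float:
    """Forward pass of a one-unit Elman-style RNN; returns a score in (0, 1)."""
    h = 0.0  # hidden state carried from one version to the next
    for x in sequence:
        h = math.tanh(sum(w * xi for w, xi in zip(w_in, x)) + w_rec * h)
    return 1.0 / (1.0 + math.exp(-w_out * h))


# Hypothetical metrics per version: [lines of code, cyclomatic complexity].
versions = [
    {"A.java": [120.0, 4.0], "B.java": [30.0, 1.0]},
    {"A.java": [150.0, 6.0], "B.java": [30.0, 1.0]},
    {"A.java": [210.0, 9.0], "B.java": [31.0, 1.0], "C.java": [10.0, 0.0]},
]

# Arbitrary, untrained weights chosen only so that the example runs.
W_IN, W_REC, W_OUT = [0.01, 0.1], 0.5, 1.5

for module in ["A.java", "B.java", "C.java"]:
    seq = build_hvsm(versions, module)
    print(module, len(seq), round(rnn_score(seq, W_IN, W_REC, W_OUT), 3))
```

Sequences differ in length because modules appear at different versions (here `C.java` has a single entry), which is one reason a recurrent model is a natural fit for this kind of input.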
Pub Date: 2017-10-17; DOI: 10.1109/SANER.2018.8330208
M. Leemans, Wil M.P. van der Aalst, M. Brand
This paper presents 1) a novel hierarchy and recursion extension to the process tree model; and 2) the first recursion-aware process model discovery technique that leverages hierarchical information in event logs, typically available for software systems. This technique allows us to analyze the operational processes of software systems under real-life conditions at multiple levels of granularity. The work can be positioned between reverse engineering and process mining. An implementation of the proposed approach is available as a ProM plugin. Experimental results based on real-life (software) event logs demonstrate the feasibility and usefulness of the approach and show the huge potential to speed up discovery by exploiting the available hierarchy.
Title: Recursion aware modeling and discovery for hierarchical software event log analysis. In: 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 185-196. https://doi.org/10.1109/SANER.2018.8330208
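The hierarchical information that the abstract above refers to can be illustrated with a small sketch that turns a flat log of start and end events into a call tree and marks activities that occur nested inside themselves. This is not the paper's discovery technique or the ProM plugin; the `Node` class, the `(kind, name)` event encoding, and the sample log are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list["Node"] = field(default_factory=list)
    recursive: bool = False  # True if an ancestor has the same name


def build_hierarchy(events: list[tuple[str, str]]) -> Node:
    """Turn a flat start/end event log into a call tree rooted at '<root>'."""
    root = Node("<root>")
    stack = [root]
    for kind, name in events:
        if kind == "start":
            # An activity is recursive if it is already open further up the stack.
            node = Node(name, recursive=any(a.name == name for a in stack))
            stack[-1].children.append(node)
            stack.append(node)
        elif kind == "end":
            if len(stack) == 1 or stack[-1].name != name:
                raise ValueError(f"unbalanced end event for {name!r}")
            stack.pop()
    if len(stack) != 1:
        raise ValueError("log ended with unfinished activities")
    return root


def recursive_activities(node: Node) -> set[str]:
    """Names of activities that occur nested inside themselves."""
    found = {node.name} if node.recursive else set()
    for child in node.children:
        found |= recursive_activities(child)
    return found


# Invented software event log: 'parse' calls itself once.
log = [
    ("start", "main"),
    ("start", "parse"),
    ("start", "parse"),
    ("end", "parse"),
    ("end", "parse"),
    ("start", "render"),
    ("end", "render"),
    ("end", "main"),
]

tree = build_hierarchy(log)
print(recursive_activities(tree))
```

Once the nesting is explicit, each level of the tree can be analyzed on its own, which hints at why exploiting the hierarchy can reduce the amount of work a discovery algorithm has to do.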