J. M. D. Silva, Pablo Dall'Oglio, S.C.C.S. Pinto, I. Bittencourt, S. Mergen
According to the Capability Maturity Model Integration (CMMI), the purpose of Process and Product Quality Assurance (PPQA) is to provide staff and management with objective insights into processes and their associated work products. This purpose is usually achieved through software inspections. Although software inspection is a common practice, it is time-consuming and expensive, which makes it infeasible for small and medium-sized teams. To improve software inspection, this paper proposes a domain ontology for representing the concepts of quality assurance inspection, which is independent, extensive, shareable and semantically strong. The ontology provides a formal structure to support the development of quality software engineering solutions. To support quality assurance inspections, we developed an agent-based prototype that encapsulates the ontology model. The prototype is able to generate inspection checklists and automatically allocate noncompliance issues. We validated the approach through a case study that shows an increase in inspection coverage and process adherence.
{"title":"OntoQAI: An Ontology to Support Quality Assurance Inspections","authors":"J. M. D. Silva, Pablo Dall'Oglio, S.C.C.S. Pinto, I. Bittencourt, S. Mergen","doi":"10.1109/SBES.2015.15","DOIUrl":"https://doi.org/10.1109/SBES.2015.15","journal":{"name":"2015 29th Brazilian Symposium on Software Engineering"},"publicationDate":"2015-09-21"}
I. Wiese, R. Ré, Igor Steinmacher, R. T. Kuroda, G. Oliva, M. Gerosa
Change propagation occurs when a change in an artifact leads to changes in other artifacts. Previous research has used the frequency of past changes between artifacts and different types of artifact coupling to build prediction models of change propagation. To improve prediction accuracy, we explored the combination of different data from software development repositories, such as change requests, communication data, and artifact modifications. This information can capture different dimensions of software development, which can lead to improvements in the accuracy of the models. We conducted an empirical study of four open source projects, namely Cassandra, Camel, Hadoop, and Lucene. Classifiers were constructed for each pair of artifacts that change together to predict whether change propagation between the two files occurs in a given change request. The models obtained an area under the curve (AUC) of 0.849 on average. Furthermore, the sensitivity (recall) obtained is almost 4 times higher (57.06% vs. 15.70%) when comparing our models to a baseline model built using association rules. With a reduced number of false positives, the models could be used in practice to help developers during software evolution.
{"title":"Predicting Change Propagation from Repository Information","authors":"I. Wiese, R. Ré, Igor Steinmacher, R. T. Kuroda, G. Oliva, M. Gerosa","doi":"10.1109/SBES.2015.21","DOIUrl":"https://doi.org/10.1109/SBES.2015.21","journal":{"name":"2015 29th Brazilian Symposium on Software Engineering"},"publicationDate":"2015-09-21"}
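The recall comparison reported in the abstract above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the two percentages come from the abstract, while the function name `recall` and the confusion-matrix counts in the example are hypothetical and are not data from the study.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Sensitivity (recall): share of actual change propagations that were predicted."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        raise ValueError("recall is undefined without positive instances")
    return true_positives / actual_positives


# Percentages reported in the abstract.
MODEL_RECALL = 57.06
BASELINE_RECALL = 15.70

# Ratio between the classifiers' recall and the association-rule baseline.
recall_ratio = MODEL_RECALL / BASELINE_RECALL

# Hypothetical counts, chosen only to illustrate the formula.
example_recall = recall(true_positives=57, false_negatives=43)

if __name__ == "__main__":
    print(f"recall ratio: {recall_ratio:.2f}")
    print(f"example recall: {example_recall:.2f}")
```

With the reported figures the ratio works out to roughly 3.63, which is the basis for the "almost 4 times" statement.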
Josenildo Melo, Aêda Sousa, C. Agra, J. Júnior, J. Castro, F. Alencar
Due to tough competition, companies must build solutions (or maintain existing ones) quickly and effectively, covering the needs of customers without neglecting quality requirements. Several standards exist for modeling these solutions, and UML (Unified Modeling Language) is one of the most widely used. However, UML is not prepared to capture domain requirements for quality. To achieve this goal, models based on GORE (Goal-Oriented Requirements Engineering), such as i* (iStar), are used. This paper presents a formalization of the mapping rules from i* to class diagrams in the context of MDD (Model-Driven Development), aiming to create class diagrams that show requirements with quality. An example is used to illustrate how these formalized rules can be applied.
{"title":"Formalization of Mapping Rules from iStar to Class Diagram in UML","authors":"Josenildo Melo, Aêda Sousa, C. Agra, J. Júnior, J. Castro, F. Alencar","doi":"10.1109/SBES.2015.25","DOIUrl":"https://doi.org/10.1109/SBES.2015.25","journal":{"name":"2015 29th Brazilian Symposium on Software Engineering"},"publicationDate":"2015-09-21"}
Interest in promoting Academia-Industry partnership in the software development field has grown, encouraged by research approaches that support cooperation between researchers and practitioners. The main focus is collaborative work in which scientific research meets the real needs of Industry. Aiming to contribute to this effort, we present an approach called Soft Coder (Software Cooperative Design Research) that combines CMD (Cooperative Method Development), a method of Action Research, with concepts of DSR (Design Science Research). Our proposal supports the conduct of projects that integrate the viewpoints of Industry and Academia, aiming to add user experience (UX) methods to agile practices. We carried out two studies applying the Soft Coder approach in a software company, working in close cooperation with UX and SCRUM teams to build and evaluate artifacts based on UX methods that support the practical activities of both teams.
{"title":"Towards an Approach Matching CMD and DSR to Improve the Academia-Industry Software Development Partnership: A Case of Agile and UX Integration","authors":"Joelma Choma, L. Zaina, T. Silva","doi":"10.1109/SBES.2015.18","DOIUrl":"https://doi.org/10.1109/SBES.2015.18","journal":{"name":"2015 29th Brazilian Symposium on Software Engineering"},"publicationDate":"2015-09-21"}
Ivan M. Lessa, G. Carneiro, M. Monteiro, Fernando Brito e Abreu
The literature has pointed out the need to focus efforts on better supporting the comprehension of MATLAB and Octave programs. Although these languages are widely used in industry and academia in the engineering domain, programs and routines written in them still lack approaches and tools that support their understanding. Given that crosscutting concerns (CCCs) have been used to support the comprehension of object-oriented programs, there is room for their use in the context of MATLAB and Octave programs. The literature offers proposals and examples in this direction. Considering this scenario, we propose the use of visualization enriched with CCC representation to support the comprehension of such programs. This paper discusses the use of a multiple-view interactive environment called OctMiner in two case studies to characterize how collected information about crosscutting concerns can foster the comprehension of MATLAB and GNU/Octave programs. As a result of these case studies, we propose strategies based on OctMiner and tailored to support different comprehension activities for programs written in MATLAB and Octave.
{"title":"A Concern Visualization Approach for Improving MATLAB and Octave Program Comprehension","authors":"Ivan M. Lessa, G. Carneiro, M. Monteiro, Fernando Brito e Abreu","doi":"10.1109/SBES.2015.19","DOIUrl":"https://doi.org/10.1109/SBES.2015.19","journal":{"name":"2015 29th Brazilian Symposium on Software Engineering"},"publicationDate":"2015-09-21"}
J. Hauck, Igor Almeida, Ricardo Araújo, Júnior Dymow, Moacyr Franco M. Neto
Many different Software Process Reference Models (PRMs) and standards have been developed in recent years, leading several software organizations to align their processes with two or more models or standards because of legal obligations or market opportunities. How to deploy multiple PRMs within the same organization without implementing parallel processes and incurring rework is a problem still under investigation. CERTICS is one of those PRMs, created with the specific purpose of assessing whether the software products of a given organization are the result of Brazilian national innovation, with the further aim of granting benefits in government procurement. The MPS reference model for software (MR-MPS-SW) is another well-established PRM that has been widely used by software organizations in Brazil. This paper presents an experience of harmonizing these two models and implementing them in a case study in a software organizational unit that already holds an official MR-MPS-SW level "F" rating and is seeking a complementary CERTICS certification. The methodological approach starts with a conceptual comparison of the models based on the literature, followed by the harmonization and implementation of the two models in the context of a case study. The results observed in the case study and the successful official CERTICS certification provide preliminary indications that the harmonization was conducted without generating overlapping processes and with acceptable organizational effort.
{"title":"Harmonizing MPS.BR and CERTICS: A Case Study in a Maturity Level F Organization","authors":"J. Hauck, Igor Almeida, Ricardo Araújo, Júnior Dymow, Moacyr Franco M. Neto","doi":"10.1109/SBES.2015.22","DOIUrl":"https://doi.org/10.1109/SBES.2015.22","journal":{"name":"2015 29th Brazilian Symposium on Software Engineering"},"publicationDate":"2015-09-21"}
B. Ferreira, T. Conte, Simone Diniz Junqueira Barbosa
Beyond meeting functional requirements, software is expected to provide a good user experience. However, to achieve an adequate degree of user experience, it is necessary to understand users, their needs and their expectations very well. Currently, part of the software industry focuses on developing mobile Web applications. A significant difficulty in designing this type of application is that the development team often does not have a clear group of target users available with whom it could use traditional requirements elicitation techniques such as interviews and questionnaires. Therefore, it is necessary to use different techniques that allow immersion in the needs and characteristics of users, even when the users are not directly involved in the development. One technique that helps teams think about the target users of the application is the Personas technique. An important criticism of the Personas technique is that it does not always help design the solution for real users' needs. This perception of a low degree of usefulness limits the acceptance and adoption of the technique. To address these concerns, this paper presents the PATHY (Persona empathy) technique, which assists designers and developers in thinking about the target users' goals and characteristics, as well as the application characteristics that help users achieve their goals. This paper also discusses the results of a preliminary study, presenting the analysis of personas generated using the PATHY technique and a new version of the technique improved on the basis of this analysis.
{"title":"Eliciting Requirements Using Personas and Empathy Map to Enhance the User Experience","authors":"B. Ferreira, T. Conte, Simone Diniz Junqueira Barbosa","doi":"10.1109/SBES.2015.14","DOIUrl":"https://doi.org/10.1109/SBES.2015.14","journal":{"name":"2015 29th Brazilian Symposium on Software Engineering"},"publicationDate":"2015-09-21"}
C. A. O. Araújo, M. Delamaro, J. Maldonado, A. Vincenzi
This paper provides evidence on the correspondence between mutations and static warnings. We used mutation operators as a fault model to evaluate the direct correspondence between mutations and static warnings. The main advantage of using mutation operators is that they generate a large number of programs containing faults of different types, which can be used to determine which faults are most likely to be detected by static analyzers. Since static analyzers, in general, report a substantial number of false positive warnings, the intention of this study is to define an approach for prioritizing static warnings based on the probability that they correspond to a true positive and lead to the detection of software faults. The results obtained for a set of open-source programs indicate that a correspondence exists for specific mutation operators, such that static warnings may be prioritized based on their level of correspondence with mutations.
{"title":"Investigating the Correspondence between Mutations and Static Warnings","authors":"C. A. O. Araújo, M. Delamaro, J. Maldonado, A. Vincenzi","doi":"10.1109/SBES.2015.23","DOIUrl":"https://doi.org/10.1109/SBES.2015.23","journal":{"name":"2015 29th Brazilian Symposium on Software Engineering"},"publicationDate":"2015-09-21"}
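One way to picture prioritization by correspondence level is sketched below. It assumes, hypothetically, that each warning type is scored by the fraction of analyzed mutants on which the analyzer raised that warning at the mutated code. The warning names, the counts, and the scoring rule itself are illustrative assumptions, not the measure or data used in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarningStats:
    """Counts for one static-warning type (all values below are made up)."""
    warning_type: str
    mutants_flagged: int  # mutants where the warning pointed at the mutated code
    mutants_total: int    # mutants analyzed for this warning type

    @property
    def correspondence(self) -> float:
        """Fraction of mutants flagged; 0.0 when no mutants were analyzed."""
        if self.mutants_total == 0:
            return 0.0
        return self.mutants_flagged / self.mutants_total


def prioritize(stats: list[WarningStats]) -> list[str]:
    """Return warning types ordered from highest to lowest correspondence."""
    ranked = sorted(stats, key=lambda s: s.correspondence, reverse=True)
    return [s.warning_type for s in ranked]


# Hypothetical sample data for illustration only.
sample = [
    WarningStats("unused-variable", 5, 100),
    WarningStats("null-dereference", 40, 80),
    WarningStats("suspicious-comparison", 30, 120),
]
ranking = prioritize(sample)
```

Under this assumed scoring rule, warning types with a higher share of flagged mutants would be reviewed first, on the expectation that they are more likely to be true positives.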
R. D. Rodrigues, M. Barros, K. Revoredo, L. Azevedo, H. Leopold
It is widely accepted that presenting data in the form of pictures or models can enhance comprehension, decision making and communication of the underlying information. However, there are few systematic studies that examine whether graphical models are more effective than other representations (such as textual descriptions). Process models provide an abstract graphical view of organizational procedures by reducing the complex reality of the work performed by a company to its most important activities. Such models are useful for training new employees and for documenting and improving organizational procedures and policies. This paper describes an experiment to assess whether there are significant differences in process understandability depending on whether textual work instructions or process models are used to represent a business process. We compared a control group of subjects that received textual work instructions to a second group, which received process models, in terms of their ability to understand the process. We found empirical support that the choice between textual work instructions and process models does not influence process understandability for non-expert users but does for experienced users.
{"title":"An Experiment on Process Model Understandability Using Textual Work Instructions and BPMN Models","authors":"R. D. Rodrigues, M. Barros, K. Revoredo, L. Azevedo, H. Leopold","doi":"10.1109/SBES.2015.12","DOIUrl":"https://doi.org/10.1109/SBES.2015.12","journal":{"name":"2015 29th Brazilian Symposium on Software Engineering"},"publicationDate":"2015-09-21"}
Christoph Treude, C. Prolo, Fernando Marques Figueira Filho
Many tools that automatically analyze, summarize, or transform software artifacts rely on natural language processing tooling to interpret natural language text produced by software developers, such as documentation, code comments, commit messages, or bug reports. Processing such text is challenging because of unique characteristics not found in other texts, such as the presence of code terms and the systematic use of incomplete sentences. In addition, texts produced by Portuguese-speaking developers mix languages, since many keywords and programming concepts are referred to by their English names. In this paper, we provide empirical insights into the challenges of analyzing software artifacts written in Portuguese. We analyzed 100 question titles from the Portuguese version of Stack Overflow with two Portuguese language tools and identified multiple problems, which resulted in very few sentences being tagged completely correctly. Based on these results, we propose heuristics to improve the analysis of natural language text produced by software developers in Portuguese.
{"title":"Challenges in Analyzing Software Documentation in Portuguese","authors":"Christoph Treude, C. Prolo, Fernando Marques Figueira Filho","doi":"10.1109/SBES.2015.27","DOIUrl":"https://doi.org/10.1109/SBES.2015.27","journal":{"name":"2015 29th Brazilian Symposium on Software Engineering"},"publicationDate":"2015-09-01"}