Most log files share one format: a flat file in which execution events are recorded one after another. Each line contains at least a timestamp, one or more event identifiers, and the log message itself, which states which event was executed and the values of that event's dynamic parameters. Because log files carry this trace information, they can be used for many purposes, such as operational profiling and detecting anomalous execution paths. However, the flat file format makes it hard to see repeating patterns. In this paper we propose transforming the serial format of a log file into a directed cyclic graph (such as a non-finite state machine), and we show how the operational profile of a system can be built from this representation. We built a tool in C++ that transforms a log file containing a set of log events in serial order into an adjacency matrix for the resulting graph. Existing graph-theoretic algorithms can then be applied to the adjacency matrix to analyze the system's log file. The directed cyclic graph and its analysis can be visualized by rendering the adjacency matrix with graph visualization tools such as Graphviz.
Creating operational profiles of software systems by transforming their log files to directed cyclic graphs. M. Nagappan, Brian P. Robinson. TEFSE '11, 2011-05-23. https://doi.org/10.1145/1987856.1987869
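The log-to-graph transformation described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than the authors' C++ and is not their tool. It assumes each log line has the layout `<timestamp> <event_id> <message>` with a single-token timestamp; the function names `parse_event_id`, `build_adjacency`, and `to_dot` are hypothetical. Each cell of the matrix counts how often one event was immediately followed by another.

```python
from collections import defaultdict


def parse_event_id(line):
    # Assumed layout: "<timestamp> <event_id> <message...>".
    parts = line.split(maxsplit=2)
    return parts[1] if len(parts) >= 2 else None


def build_adjacency(lines):
    """Return (events, matrix); matrix[i][j] counts transitions events[i] -> events[j]."""
    events = []                 # event ids in order of first appearance
    index = {}                  # event id -> row/column number
    counts = defaultdict(int)   # (from_index, to_index) -> number of transitions
    prev = None
    for line in lines:
        event = parse_event_id(line)
        if event is None:
            continue            # skip lines that do not fit the assumed layout
        if event not in index:
            index[event] = len(events)
            events.append(event)
        if prev is not None:
            counts[(index[prev], index[event])] += 1
        prev = event
    size = len(events)
    matrix = [[0] * size for _ in range(size)]
    for (i, j), count in counts.items():
        matrix[i][j] = count
    return events, matrix


def to_dot(events, matrix):
    """Render the non-zero cells as Graphviz DOT text, one labelled edge per cell."""
    out = ["digraph log {"]
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            if count:
                out.append(f'  "{events[i]}" -> "{events[j]}" [label="{count}"];')
    out.append("}")
    return "\n".join(out)
```

Dividing each row by its sum would turn the counts into transition frequencies, which is one plausible starting point for an operational profile. The DOT text can be passed to Graphviz for rendering.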
Analyzing candidate traceability links is a difficult, time-consuming, and error-prone task, as it usually requires a detailed study of a long list of software artifacts of various kinds. One way to alleviate this problem is to select the most important features of the software artifacts that developers would investigate. In this position paper we discuss how text summarization techniques could be used to address this problem. Summaries could improve both the time required for and the correctness of the traceability link recovery process.
Improving traceability link recovery methods through software artifact summarization. Jairo Aponte, Andrian Marcus. TEFSE '11, 2011-05-23. https://doi.org/10.1145/1987856.1987867
Traceability is considered an important activity during the development of software systems. Despite the various classifications that have been proposed for different types of traceability relations, there is still a lack of standard semantic definitions for traceability relations. In this paper, we present an ontology-based formalism for semantic representation of various types of traceability relations for product line systems and associations between these various types of traceability relations.
Formalizing traceability relations for product lines. L. Lamb, Waraporn Jirapanthong, A. Zisman. TEFSE '11, 2011-05-23. https://doi.org/10.1145/1987856.1987866
J. Cleland-Huang, Adam Czauderna, Alex Dekhtyar, O. Gotel, J. Hayes, E. Keenan, Greg Leach, Jonathan I. Maletic, D. Poshyvanyk, Yonghee Shin, A. Zisman, G. Antoniol, B. Berenbach, Alexander Egyed, Patrick Mäder
The challenges of implementing successful and cost-effective traceability have created a compelling research agenda that has addressed a broad range of traceability-related issues, ranging from qualitative studies of traceability users in industry to very technical and quantitative studies. Unfortunately, advances are hampered by the significant time and effort needed to establish a traceability research environment and to perform comparative evaluations of new results against existing baselines. In this panel we discuss ongoing efforts by members of the Center of Excellence for Software Traceability (CoEST) to define the Grand Challenges of Traceability, develop benchmarks, and construct TraceLab, an extensible and scalable visual environment for designing and executing a broad range of traceability experiments.
Grand challenges, benchmarks, and TraceLab: developing infrastructure for the software traceability research community. J. Cleland-Huang, Adam Czauderna, Alex Dekhtyar, O. Gotel, J. Hayes, E. Keenan, Greg Leach, Jonathan I. Maletic, D. Poshyvanyk, Yonghee Shin, A. Zisman, G. Antoniol, B. Berenbach, Alexander Egyed, Patrick Mäder. TEFSE '11, 2011-05-23. https://doi.org/10.1145/1987856.1987861
This paper posits that a theoretical model of analyst effort in tracing tasks is necessary to assist with the study of the analyst. Specifically, prior work by numerous research groups makes clear that the important factors in such a model are the time it takes an analyst to vet a given candidate link and the time it takes an analyst to find a missing link. This paper introduces a theoretical model of analyst effort as well as a simplified model. A number of simulations were undertaken to build effort curves that assist in evaluating numerous tracing scenarios, such as determining at what point an analyst should switch from vetting candidate links to manually searching for links not in the candidate list.
Towards a model of analyst effort for traceability research. Alex Dekhtyar, J. Hayes, Matty Smith. TEFSE '11, 2011-05-23. https://doi.org/10.1145/1987856.1987870
Requirements-to-source-code traceability employs information retrieval (IR) methods to automatically link requirements to the source code that implements them. A crucial step in this process is indexing, where partial and important information from the software artifacts is converted into a representation that is compatible with the underlying IR model. Source code demands special attention in the indexing process. In this paper, we investigate source code indexing for supporting automatic traceability. We introduce a feature diagram that captures the key components and their relationships in the domain of source code indexing. We then present an experiment to examine the features of the diagram and their dependencies. Results show that utilizing comments has a significant effect on traceability link generation, and stemming is required when comments are considered.
Source code indexing for automated tracing. Anas Mahmoud, Nan Niu. TEFSE '11, 2011-05-23. https://doi.org/10.1145/1987856.1987859
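The indexing step discussed in this abstract, including the choice of whether to use comments and whether to stem, can be sketched as follows. This is an illustration under assumptions, not the authors' indexer. It assumes Java-style `//` and `/* */` comments, splits identifiers on camelCase boundaries, and uses `crude_stem`, a deliberately simple suffix stripper standing in for a real stemmer such as Porter's. All function names are hypothetical.

```python
import re

# Java-style comments (assumption): line comments and block comments.
COMMENT_RE = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)
WORD_RE = re.compile(r"[A-Za-z]+")
# Splits camelCase words and keeps acronym runs together (HTTPServer -> HTTP, Server).
CAMEL_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


def crude_stem(term):
    # Stand-in for a real stemmer: strip one common suffix if at least 4 letters remain.
    for suffix in ("ing", "ed", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 4:
            return term[: -len(suffix)]
    return term


def tokens(text):
    """Lower-cased terms from text, with identifiers split on camelCase boundaries."""
    out = []
    for word in WORD_RE.findall(text):
        out.extend(part.lower() for part in CAMEL_RE.findall(word))
    return out


def index_source(source, use_comments=True, stem=True):
    """Return index terms: code terms first, then comment terms if requested."""
    comments = " ".join(COMMENT_RE.findall(source))
    code = COMMENT_RE.sub(" ", source)
    terms = tokens(code)
    if use_comments:
        terms += tokens(comments)
    if stem:
        terms = [crude_stem(t) for t in terms]
    return terms
```

Without stemming, the comment word `Checks` and the identifier part `check` stay distinct terms, which hints at why stemming matters once comments are indexed. The sketch does no keyword or stop-word filtering, and a `//` inside a string literal would be mistaken for a comment.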
Adam Czauderna, M. Gibiec, Greg Leach, Yubin Li, Yonghee Shin, E. Keenan, J. Cleland-Huang
Numerous trace retrieval algorithms incorporate the standard tf-idf (term frequency, inverse document frequency) technique to weight various terms. In this paper we address Grand Challenge C-GC1 by comparing the effectiveness of computing idf based only on the local terms in the query, versus computing it based on general term usage as documented in the American National Corpus. We also address Grand Challenges L-GC1 and L-GC2 by setting ourselves the additional task of designing and conducting the experiments using the alpha-release of TraceLab. TraceLab is an experimental workbench which allows researchers to graphically model and execute a traceability experiment as a workflow of components. Results of the experiment show that the local idf approach exceeds or matches the global approach in all of the cases studied.
Traceability challenge 2011: using TraceLab to evaluate the impact of local versus global IDF on trace retrieval. Adam Czauderna, M. Gibiec, Greg Leach, Yubin Li, Yonghee Shin, E. Keenan, J. Cleland-Huang. TEFSE '11, 2011-05-23. https://doi.org/10.1145/1987856.1987874
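The difference between a local and a global idf comes down to where the idf table is taken from. The following is a minimal sketch of that idea, not the TraceLab experiment itself. It assumes documents are already tokenized into lists of terms and uses the common definition idf = log(N / df). `PLACEHOLDER_GLOBAL_IDF` is a hypothetical stand-in whose numbers are invented for illustration and are not drawn from the American National Corpus.

```python
import math
from collections import Counter


def local_idf(docs):
    """idf computed from the artifact collection itself: log(N / document frequency)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) for term, count in df.items()}


def tfidf(doc, idf, default=0.0):
    """Weight each term by raw term frequency times the idf from the given table."""
    tf = Counter(doc)
    return {term: count * idf.get(term, default) for term, count in tf.items()}


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Placeholder values only: a real global table would be derived from a
# reference corpus; these numbers are invented for illustration.
PLACEHOLDER_GLOBAL_IDF = {"user": 2.0, "login": 5.0, "logout": 5.5}
```

Swapping `local_idf(docs)` for a global table in the call to `tfidf` is the only change needed to compare the two weighting schemes on the same query and documents.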
Maurício Serrano, Julio Cesar Sampaio do Prado Leite
In 1993, Goguen published a research note addressing the social issues in Requirements Engineering. He identified three major social groups in the requirements process: the client organization, the requirements team, and the development team. However, there is currently little technological support for tracing requirements to social issues within the requirements team or the development team. From early published traceability metamodels to the current requirements traceability literature, the client organization and the stakeholders are first-class citizens, but the software engineers and the interactions between these groups are not. In this paper we present a partially formalized RichPicture traceability model to fill this gap. ITrace is a flexible model that weaves together the social network graph, the information sources graph, the social interactions graph, and the Requirements Engineering artifacts evolution graph. We developed our traceability model empirically by tracking the evolution of a Transparency catalogue. We also compare our model structure to Contribution Structures.
A rich traceability model for social interactions. Maurício Serrano, Julio Cesar Sampaio do Prado Leite. TEFSE '11, 2011-05-23. https://doi.org/10.1145/1987856.1987871
Christopher S. Corley, Nicholas A. Kraft, L. Etzkorn, Stacy K. Lukins
Traceability links can be recovered using data mined from a revision control system, such as CVS, and an issue tracking system, such as Bugzilla. Existing approaches to recover links between a bug and the methods changed to fix the bug rely on the presence of the bug's identifier in a CVS log message. In this paper we present an approach that relies instead on the presence of a patch in the issue report for the bug. That is, rather than analyzing deltas retrieved from CVS to recover links, our approach analyzes patches retrieved from Bugzilla. We use BugTrace, the tool implementing our approach, to conduct a case study in which we compare the links recovered by our approach to links recovered by manual inspection. The results of the case study support the efficacy of our approach. After describing the limitations of our case study, we conclude by reviewing closely related work and suggesting possible future work.
Recovering traceability links between source code and fixed bugs via patch analysis. Christopher S. Corley, Nicholas A. Kraft, L. Etzkorn, Stacy K. Lukins. TEFSE '11, 2011-05-23. https://doi.org/10.1145/1987856.1987863
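To make the idea of analyzing a patch instead of a repository delta more concrete, here is a minimal sketch. It is not BugTrace and makes no claim about how that tool works. It assumes the patch attached to an issue report is in unified diff format, and that hunk headers carry the enclosing function text after the second `@@`, as produced by options such as `diff -p`. The function name `parse_patch` is hypothetical.

```python
import re

# Unified diff hunk header: "@@ -old_start,old_len +new_start,new_len @@ optional context"
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@ ?(.*)$")


def parse_patch(patch_text):
    """Return one record per hunk: the changed file, new start line, and header context."""
    changes = []
    current_file = None
    for line in patch_text.splitlines():
        if line.startswith("+++ "):
            # New-file header; drop a trailing tab-separated timestamp and a "b/" prefix.
            path = line[4:].split("\t")[0].strip()
            current_file = path[2:] if path.startswith("b/") else path
            continue
        match = HUNK_RE.match(line)
        if match and current_file is not None:
            changes.append({
                "file": current_file,
                "new_start": int(match.group(3)),
                "context": match.group(5).strip(),  # enclosing function, if present
            })
    return changes
```

Each record could then serve as a candidate link from the bug that owns the patch to a file and, where the header context is present, to a method. Mapping hunks to methods reliably would need the source itself, since the header context is only a heuristic, and an added line whose content begins with `++ ` would be misread as a file header by this sketch.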
The paper advocates introducing eye tracking technology into software traceability and takes the position that eye tracking metrics can contribute to several software traceability tasks. The authors posit that the role of eye tracking is not restricted to that of an instrument for empirical studies, but could also extend to providing a foundation for a new software traceability methodology. Several scenarios in which eye tracking metrics could be meaningful are presented. The specific research directions include conducting empirical studies with eye tracking metrics, replicating previously reported empirical studies, developing an eye tracking enabled methodology for traceability link recovery and management, and providing visualization support.
On the use of eye tracking in software traceability. Bonita Sharif, Huzefa H. Kagdi. TEFSE '11, 2011-05-23. https://doi.org/10.1145/1987856.1987872