On using a benchmark to evaluate C++ extractors
S. Sim, R. Holt, S. Easterbrook
Proceedings 10th International Workshop on Program Comprehension, 2002-06-27
DOI: 10.1109/WPC.2002.1021331 (https://doi.org/10.1109/WPC.2002.1021331)
Citations: 49
Abstract
In this paper, we take the concept of benchmarking, as used extensively in computing, and apply it to the evaluation of C++ fact extractors. We demonstrate the efficacy of this approach by developing a prototype benchmark, CppETS 1.0 (C++ Extractor Test Suite, pronounced 'see-pets') and collecting feedback in a workshop setting. The CppETS benchmark characterises C++ extractors along two dimensions: accuracy and robustness. It consists of a series of test buckets that contain small C++ programs and related questions that pose different challenges to the extractors. As with other research areas, benchmarks are best developed through technical work and consultation with a community, so we invited researchers to apply CppETS to their extractors and report on their results in a workshop. Four teams participated in this effort, evaluating the four extractors Ccia, cppx, the Rigi C++ parser and TkSee/SN. They found that CppETS gave results that were consistent with their experience with these tools and therefore had good external validity. Workshop participants agreed that CppETS was an important contribution to fact extractor development and testing. Further efforts to make CppETS a widely-accepted benchmark will involve technical improvements and collaboration with the broader community.