Another nail to the coffin of faceted controlled-vocabulary component classification and retrieval

ACM SIGSOFT Symposium on Software Reusability, Pub Date: 1997-05-01, DOI: 10.1145/258366.258393

H. Mili, Estelle Ah-Ki, R. Godin, H. Mcheick


Citations: 54

Abstract



Our research centers around exploring methodologies for developing reusable software, and developing methods and tools for building with reusable software. In this paper, we focus on reusable software component retrieval methods that were developed and tested in the context of ClassServer, an experimental library tool developed at the University of Québec at Montréal to explore issues in software reuse [15]. The methods discussed in this paper fall into two categories: 1) string search-based retrieval methods, and 2) keyword-based retrieval methods. Both kinds of methods have been implemented and tested by researchers, both in the context of software repositories (see e.g. [6,9]) and in the context of more traditional document libraries (see e.g. [2,25]). Experiments have shown that keyword-based methods, which require some manual, labor-intensive pre-processing, performed only marginally better than the entirely mechanical string search methods (see e.g. [6,25]), raising the issue of cost-effectiveness of keyword-based methods as compared to string search based methods. In this paper, we describe an implementation and experiments which attempt to bring the two kinds of methods to a level playing field by: 1) automating as much of the pre-processing involved in controlled vocabulary-based methods as possible to address the costs issue, and 2) using a realistic experimental setting in which queries consist of problem statements rather than component specifications, in which query results are aggregated over several trials, and in which recall measures take into account overlapping components. Our experiments showed that string search based methods performed better than semi-controlled vocabulary-based methods, which goes further in the direction of more recent component retrieval experiments which challenged the superiority of controlled vocabulary based classification and retrieval of components (see e.g. [6]).
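To make the two categories of retrieval methods concrete, the following is a minimal sketch of the contrast the abstract draws. It is not ClassServer's implementation: the `Component` class, the toy `LIBRARY`, and all function names are hypothetical, and the recall shown is the plain set-based measure, not the overlap-aware measure the paper mentions.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    description: str  # free text, searched mechanically
    keywords: set = field(default_factory=set)  # hand-assigned index terms


# Hypothetical toy library for illustration; not taken from ClassServer.
LIBRARY = [
    Component("Stack", "last-in first-out collection of elements", {"collection", "lifo"}),
    Component("Queue", "first-in first-out collection of elements", {"collection", "fifo"}),
    Component("Parser", "reads source text and builds a syntax tree", {"parsing"}),
]


def string_search(query: str) -> set:
    """Return names of components whose description contains any query word."""
    words = query.lower().split()
    return {c.name for c in LIBRARY if any(w in c.description.lower() for w in words)}


def keyword_search(terms: set) -> set:
    """Return names of components indexed under at least one of the given terms."""
    return {c.name for c in LIBRARY if c.keywords & terms}


def recall(retrieved: set, relevant: set) -> float:
    """Fraction of the relevant components that were retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def precision(retrieved: set, relevant: set) -> float:
    """Fraction of the retrieved components that are relevant."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0
```

In this sketch, string search needs no preparation beyond the component descriptions, whereas keyword search depends on someone having assigned the `keywords` sets in advance, which is the manual pre-processing cost the abstract refers to.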
