Another nail to the coffin of faceted controlled-vocabulary component classification and retrieval

H. Mili, Estelle Ah-Ki, R. Godin, H. Mcheick
ACM SIGSOFT Symposium on Software Reusability. Published 1997-05-01. DOI: 10.1145/258366.258393 (https://doi.org/10.1145/258366.258393)
Citations: 54

Abstract

Our research centers around exploring methodologies for developing reusable software, and developing methods and tools for building with reusable software. In this paper, we focus on reusable software component retrieval methods that were developed and tested in the context of ClassServer, an experimental library tool developed at the University of Québec at Montréal to explore issues in software reuse [15]. The methods discussed in this paper fall into two categories: 1) string search-based retrieval methods, and 2) keyword-based retrieval methods. Both kinds of methods have been implemented and tested by researchers, both in the context of software repositories (see e.g. [6,9]) and in the context of more traditional document libraries (see e.g. [2,25]). Experiments have shown that keyword-based methods, which require some manual, labor-intensive pre-processing, performed only marginally better than the entirely mechanical string search methods (see e.g. [6,25]), raising the issue of the cost-effectiveness of keyword-based methods as compared to string search-based methods. In this paper, we describe an implementation and experiments which attempt to bring the two kinds of methods to a level playing field by: 1) automating as much of the pre-processing involved in controlled vocabulary-based methods as possible to address the cost issue, and 2) using a realistic experimental setting in which queries consist of problem statements rather than component specifications, in which query results are aggregated over several trials, and in which recall measures take into account overlapping components. Our experiments showed that string search-based methods performed better than semi-controlled vocabulary-based methods, a result that goes further in the direction of more recent component retrieval experiments which challenged the superiority of controlled vocabulary-based classification and retrieval of components (see e.g. [6]).
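
As a rough illustration of the two retrieval styles contrasted in the abstract, the following Python sketch runs a plain string search over free-text descriptions and a keyword search over hand-assigned index terms, then scores each result set with precision and recall. The toy library, the stop-word list, the "share at least one word" matching rule, and all function names are assumptions made for illustration only; they are not taken from ClassServer or from the paper's experiments, and the recall computed here does not model overlapping components.

```python
import re

# Hypothetical toy library: component name -> (free-text description, hand-assigned keywords).
LIBRARY = {
    "Stack": ("last-in first-out container with push and pop operations", {"container", "lifo"}),
    "Queue": ("first-in first-out container with enqueue and dequeue operations", {"container", "fifo"}),
    "Parser": ("reads source text and builds a syntax tree", {"parsing", "compiler"}),
}

# Assumed stop-word list; a real system would use a much larger one.
STOP_WORDS = {"a", "an", "and", "the", "with", "in", "out", "of", "to"}


def tokenize(text):
    """Lowercase the text, split it into alphabetic words, and drop stop words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOP_WORDS


def string_search(query):
    """Entirely mechanical: match components whose description shares a word with the query."""
    words = tokenize(query)
    return {name for name, (desc, _) in LIBRARY.items() if words & tokenize(desc)}


def keyword_search(query_keywords):
    """Relies on manual indexing: match components filed under any of the given terms."""
    return {name for name, (_, keywords) in LIBRARY.items() if query_keywords & keywords}


def precision_recall(retrieved, relevant):
    """Return (precision, recall) of a retrieved set against a set judged relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

In this sketch, a problem-statement query such as "push and pop elements" is passed to `string_search` as-is, whereas `keyword_search` needs the searcher to pick terms from the indexing vocabulary first. That extra manual step is the cost the abstract weighs against any gain in retrieval quality.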