The JavaScript Package Selection Task: A Comparative Experiment Using an LLM-based Approach

CLEI Electronic Journal (Q4, Mathematics); Pub Date: 2024-07-21; DOI: 10.19153/cleiej.27.2.4

Andres Diaz Pace, Antonela Tommasel, H. Vázquez



Abstract

When developing JavaScript (JS) applications, assessing and selecting JS packages is challenging for developers because of the growing number of technology options available. Given a technology-related task, a common strategy among developers is to query Web repositories (e.g., GitHub) via a search engine (e.g., NPM, Google) and then shortlist candidate JS packages. However, this search might return a long list of results, not all of which are relevant. Thus, the results often need to be (re-)ordered according to the developer’s criteria. To address these problems, in prior work we developed a recommender system called AIDT that assists developers in the package selection task. AIDT relies on meta-search and machine learning techniques to infer the relevant packages for a query. An initial evaluation of AIDT showed good search effectiveness, but the tool was unable to explain its choices to the developer. Research on Large Language Models (LLMs) has recently opened new opportunities for this kind of recommender system. Even so, human developers should judge whether the recommendations (e.g., JS packages) of these tools (either AIDT or LLMs) are fit for purpose. In this paper, we propose a Retrieval Augmented Generation (RAG) architecture for using LLMs in the domain of technology selection, which enhances the original AIDT design. Furthermore, we report on a user study that applied both AIDT and different LLM-based variants (ChatGPT, Cohere, Llama2) to a sample of JS-related queries, in which we compared their results and validated them against developers’ criteria for the task. Our findings show that, although the ranking capabilities of LLMs are not yet on par with AIDT or human efforts, the RAG architecture can achieve decent performance and is good at providing explanations for the package choices in the rankings. The latter feature makes it more transparent than AIDT and, thus, potentially more flexible in supporting developers’ tasks.
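As a rough illustration of the retrieve-then-generate idea mentioned in the abstract, the following Python sketch shortlists candidate packages for a query and assembles a prompt asking an LLM to rank them and explain each choice. It is not the AIDT or RAG implementation from the paper: the `Package` type, the word-overlap scoring, the prompt wording, and the package names are all hypothetical, and no actual LLM call is made.

```python
from dataclasses import dataclass


@dataclass
class Package:
    # Hypothetical record for a candidate JS package.
    name: str
    description: str


def retrieve(query: str, candidates: list[Package], k: int = 3) -> list[Package]:
    """Keep the k candidates whose descriptions share the most words with the query.

    Word overlap is a stand-in for a real retrieval step (assumption).
    """
    terms = set(query.lower().split())

    def overlap(pkg: Package) -> int:
        return len(terms & set(pkg.description.lower().split()))

    # sorted() is stable, so tied candidates keep their original order.
    return sorted(candidates, key=overlap, reverse=True)[:k]


def build_prompt(query: str, retrieved: list[Package]) -> str:
    """Place the retrieved packages in the prompt so the LLM ranks only those."""
    lines = [f"- {p.name}: {p.description}" for p in retrieved]
    return (
        f"Task: {query}\n"
        "Candidate packages:\n" + "\n".join(lines) + "\n"
        "Rank the candidates for the task and explain each choice."
    )


# Made-up packages, for illustration only.
candidates = [
    Package("chart-lib", "draw charts in the browser"),
    Package("http-lite", "small http client for requests"),
    Package("date-tool", "parse and format dates"),
]
query = "http client for api requests"
top = retrieve(query, candidates, k=2)
prompt = build_prompt(query, top)
print(prompt)
```

The resulting prompt would then be sent to whichever LLM is being evaluated. Asking for an explanation alongside the ranking reflects the transparency benefit the abstract attributes to the RAG variant.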


