Improving Web Element Localization by Using a Large Language Model

Software Testing, Verification and Reliability. Pub Date: 2024-08-16. DOI: 10.1002/stvr.1893

Michel Nass, Emil Alégroth, Robert Feldt


Citation count: 0

Abstract

Web‐based test automation heavily relies on accurately finding web elements. Traditional methods compare attributes but do not grasp the context and meaning of elements and words. The emergence of large language models (LLMs) like GPT‐4, which can show human‐like reasoning abilities on some tasks, offers new opportunities for software engineering and web element localization. This paper introduces and evaluates VON Similo LLM, an enhanced web element localization approach. Using an LLM, it selects the most likely web element from the top‐ranked ones identified by the existing VON Similo method, ideally aiming to get closer to human‐like selection accuracy. An experimental study was conducted using 804 web element pairs from 48 real‐world web applications. We measured the number of correctly identified elements as well as the execution times, comparing the effectiveness and efficiency of VON Similo LLM against the baseline algorithm. In addition, motivations from the LLM were recorded and analysed for 140 instances. VON Similo LLM demonstrated improved performance, reducing failed localizations from 70 to 40 (out of 804), a 43% reduction. Despite its slower execution time and additional costs of using the GPT‐4 model, the LLM's human‐like reasoning showed promise in enhancing web element localization. LLM technology can enhance web element localization in GUI test automation, reducing false positives and potentially lowering maintenance costs. However, further research is necessary to fully understand LLMs' capabilities, limitations and practical use in GUI testing.
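The two-stage idea described in the abstract, plus the arithmetic behind the reported figures, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `pick_element`, the `choose` callback (standing in for the LLM call), the fallback to the baseline's top candidate, and the shortlist size `top_n=10` are all hypothetical. Only the counts 70, 40 and 804 come from the abstract.

```python
from typing import Callable, Sequence


def pick_element(
    ranked: Sequence[str],
    choose: Callable[[Sequence[str]], int],
    top_n: int = 10,  # assumed shortlist size; the abstract does not state it
) -> str:
    """Return one candidate from the baseline's top-ranked shortlist.

    `ranked` is the baseline ranking (best first). `choose` is a
    hypothetical stand-in for the LLM: it receives the shortlist and
    returns the index of the candidate it considers most likely.
    """
    if not ranked:
        raise ValueError("no candidates to choose from")
    shortlist = ranked[:top_n]
    index = choose(shortlist)
    # Assumed safeguard: fall back to the baseline's top candidate
    # when the chooser returns an index outside the shortlist.
    if not 0 <= index < len(shortlist):
        return shortlist[0]
    return shortlist[index]


def failure_reduction(before: int, after: int) -> float:
    """Relative reduction in failed localizations, in percent."""
    return (before - after) / before * 100


def success_rate(failures: int, total: int) -> float:
    """Share of correctly localized elements, in percent."""
    return (total - failures) / total * 100


if __name__ == "__main__":
    # Figures reported in the abstract: 70 -> 40 failures out of 804.
    print(round(failure_reduction(70, 40)))  # 43
    print(round(success_rate(70, 804), 1))   # 91.3 (baseline)
    print(round(success_rate(40, 804), 1))   # 95.0 (with the LLM step)
```

Read this way, the 43% figure is a relative reduction in failures; in terms of successful localizations, the same counts correspond to a move from roughly 91.3% to 95.0% of the 804 pairs.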

