Authors: Michel Nass, Emil Alégroth, Robert Feldt
Journal: Software Testing, Verification and Reliability
Published: 2024-08-16
DOI: 10.1002/stvr.1893 (https://doi.org/10.1002/stvr.1893)
Improving Web Element Localization by Using a Large Language Model
Web‐based test automation heavily relies on accurately finding web elements. Traditional methods compare attributes but do not grasp the context and meaning of elements and words. The emergence of large language models (LLMs) like GPT‐4, which can show human‐like reasoning abilities on some tasks, offers new opportunities for software engineering and web element localization. This paper introduces and evaluates VON Similo LLM, an enhanced web element localization approach. Using an LLM, it selects the most likely web element from the top‐ranked ones identified by the existing VON Similo method, ideally aiming to get closer to human‐like selection accuracy. An experimental study was conducted using 804 web element pairs from 48 real‐world web applications. We measured the number of correctly identified elements as well as the execution times, comparing the effectiveness and efficiency of VON Similo LLM against the baseline algorithm. In addition, motivations from the LLM were recorded and analysed for 140 instances. VON Similo LLM demonstrated improved performance, reducing failed localizations from 70 to 40 (out of 804), a 43% reduction. Despite its slower execution time and additional costs of using the GPT‐4 model, the LLM's human‐like reasoning showed promise in enhancing web element localization. LLM technology can enhance web element localization in GUI test automation, reducing false positives and potentially lowering maintenance costs. However, further research is necessary to fully understand LLMs' capabilities, limitations and practical use in GUI testing.
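The selection step described in the abstract, where an LLM picks one element from the top-ranked candidates of a baseline localizer, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` shape, the prompt wording, the `ask_llm` callable (a stand-in for a call to a model such as GPT-4), the fallback to the baseline's top choice, and the default `top_k` value are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Candidate:
    """A candidate web element with a baseline similarity score (hypothetical shape)."""
    element_id: str
    description: str
    score: float


def build_prompt(target: str, shortlist: Sequence[Candidate]) -> str:
    """Render the target and the shortlisted candidates as an indexed list."""
    lines = [f"Target element: {target}", "Candidates:"]
    lines += [f"{i}: {c.description}" for i, c in enumerate(shortlist)]
    lines.append("Answer with the index of the best match.")
    return "\n".join(lines)


def rerank_with_llm(
    target: str,
    candidates: Sequence[Candidate],
    ask_llm: Callable[[str], str],  # hypothetical wrapper around an LLM call
    top_k: int = 10,                # assumed shortlist size, not taken from the abstract
) -> Candidate:
    """Let the LLM choose among the baseline's top-k candidates."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    # Keep only the highest-scoring candidates from the baseline ranking.
    shortlist = sorted(candidates, key=lambda c: c.score, reverse=True)[:top_k]
    reply = ask_llm(build_prompt(target, shortlist)).strip()
    try:
        index = int(reply)
    except ValueError:
        return shortlist[0]  # unparsable reply: fall back to the baseline's top choice
    if 0 <= index < len(shortlist):
        return shortlist[index]
    return shortlist[0]  # out-of-range reply: same fallback


def relative_reduction(before: int, after: int) -> float:
    """Fraction by which a failure count dropped."""
    return (before - after) / before
```

With the figures reported in the abstract, `relative_reduction(70, 40)` evaluates to roughly 0.43, matching the stated 43% reduction in failed localizations. Falling back to the baseline ranking whenever the model's reply cannot be used is one possible design choice; it keeps the combined approach from failing outright on a malformed answer.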