LLMs for science: Usage for code generation and data analysis

Journal of Software-Evolution and Process (IF 1.7; JCR Q3, Computer Science, Software Engineering; Zone 4, Computer Science) | Pub Date: 2024-09-13 | DOI: 10.1002/smr.2723
Mohamed Nejjar, Luca Zacharias, Fabian Stiehle, Ingo Weber

Abstract

Large language models (LLMs) have been touted as enabling increased productivity in many areas of today's work life. Scientific research is no exception: the potential of LLM-based tools to assist scientists in their daily work has become a highly discussed topic across disciplines. However, this subject of study is still at its very onset, and it remains unclear how the potential of LLMs will materialize in research practice. With this study, we provide first empirical evidence on the use of LLMs in the research process. We investigated a set of use cases for LLM-based tools in scientific research and conducted a first study to assess the degree to which current tools are helpful. In this position paper, we report on the use cases related to software engineering, specifically on generating application code and developing scripts for data analytics and visualization. Although we studied seemingly simple use cases, results differ significantly across tools. Our results highlight the promise of LLM-based tools in general, yet we also observe various issues, particularly regarding the integrity of the output these tools provide.