使用 ChatGPT 作为科学工作流程开发的大型语言模型的定性评估。

IF 11.8 2区生物学 Q1 MULTIDISCIPLINARY SCIENCES GigaScience Pub Date : 2024-01-02 DOI:10.1093/gigascience/giae030

Mario Sänger, Ninon De Mecquenem, Katarzyna Ewa Lewińska, Vasilis Bountris, Fabian Lehmann, Ulf Leser, Thomas Kosch

{"title":"使用 ChatGPT 作为科学工作流程开发的大型语言模型的定性评估。","authors":"Mario Sänger, Ninon De Mecquenem, Katarzyna Ewa Lewińska, Vasilis Bountris, Fabian Lehmann, Ulf Leser, Thomas Kosch","doi":"10.1093/gigascience/giae030","DOIUrl":null,"url":null,"abstract":"Background: Scientific workflow systems are increasingly popular for expressing and executing complex data analysis pipelines over large datasets, as they offer reproducibility, dependability, and scalability of analyses by automatic parallelization on large compute clusters. However, implementing workflows is difficult due to the involvement of many black-box tools and the deep infrastructure stack necessary for their execution. Simultaneously, user-supporting tools are rare, and the number of available examples is much lower than in classical programming languages.Results: To address these challenges, we investigate the efficiency of large language models (LLMs), specifically ChatGPT, to support users when dealing with scientific workflows. We performed 3 user studies in 2 scientific domains to evaluate ChatGPT for comprehending, adapting, and extending workflows. Our results indicate that LLMs efficiently interpret workflows but achieve lower performance for exchanging components or purposeful workflow extensions. We characterize their limitations in these challenging scenarios and suggest future research directions.Conclusions: Our results show a high accuracy for comprehending and explaining scientific workflows while achieving a reduced performance for modifying and extending workflow descriptions. These findings clearly illustrate the need for further research in this area.","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":null,"pages":null},"PeriodicalIF":11.8000,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11186067/pdf/","citationCount":"0","resultStr":"{\"title\":\"A qualitative assessment of using ChatGPT as large language model for scientific workflow development.\",\"authors\":\"Mario Sänger, Ninon De Mecquenem, Katarzyna Ewa Lewińska, Vasilis Bountris, Fabian Lehmann, Ulf Leser, Thomas Kosch\",\"doi\":\"10.1093/gigascience/giae030\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Background: Scientific workflow systems are increasingly popular for expressing and executing complex data analysis pipelines over large datasets, as they offer reproducibility, dependability, and scalability of analyses by automatic parallelization on large compute clusters. However, implementing workflows is difficult due to the involvement of many black-box tools and the deep infrastructure stack necessary for their execution. Simultaneously, user-supporting tools are rare, and the number of available examples is much lower than in classical programming languages.Results: To address these challenges, we investigate the efficiency of large language models (LLMs), specifically ChatGPT, to support users when dealing with scientific workflows. We performed 3 user studies in 2 scientific domains to evaluate ChatGPT for comprehending, adapting, and extending workflows. Our results indicate that LLMs efficiently interpret workflows but achieve lower performance for exchanging components or purposeful workflow extensions. We characterize their limitations in these challenging scenarios and suggest future research directions.Conclusions: Our results show a high accuracy for comprehending and explaining scientific workflows while achieving a reduced performance for modifying and extending workflow descriptions. These findings clearly illustrate the need for further research in this area.\",\"PeriodicalId\":12581,\"journal\":{\"name\":\"GigaScience\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":11.8000,\"publicationDate\":\"2024-01-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11186067/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"GigaScience\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1093/gigascience/giae030\",\"RegionNum\":2,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MULTIDISCIPLINARY SCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"GigaScience","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/gigascience/giae030","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}

引用次数: 0

摘要

背景：科学工作流系统通过在大型计算集群上自动并行化，提供了分析的可重复性、可靠性和可扩展性，因此在表达和执行大型数据集上的复杂数据分析管道方面越来越受欢迎。然而，由于涉及许多黑盒工具和执行工作流程所需的深层基础设施堆栈，工作流程的实施非常困难。同时，用户支持工具很少，可用示例的数量也远低于经典编程语言：为了应对这些挑战，我们研究了大型语言模型（LLM），特别是 ChatGPT，在处理科学工作流时为用户提供支持的效率。我们在 2 个科学领域进行了 3 项用户研究，以评估 ChatGPT 在理解、调整和扩展工作流方面的能力。我们的研究结果表明，LLM 可以有效地解释工作流，但在交换组件或有目的的工作流扩展方面性能较低。我们描述了它们在这些具有挑战性的场景中的局限性，并提出了未来的研究方向：我们的研究结果表明，在理解和解释科学工作流方面具有很高的准确性，而在修改和扩展工作流描述方面却表现不佳。这些发现清楚地说明了在这一领域开展进一步研究的必要性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

A qualitative assessment of using ChatGPT as large language model for scientific workflow development.

Background: Scientific workflow systems are increasingly popular for expressing and executing complex data analysis pipelines over large datasets, as they offer reproducibility, dependability, and scalability of analyses by automatic parallelization on large compute clusters. However, implementing workflows is difficult due to the involvement of many black-box tools and the deep infrastructure stack necessary for their execution. Simultaneously, user-supporting tools are rare, and the number of available examples is much lower than in classical programming languages.

Results: To address these challenges, we investigate the efficiency of large language models (LLMs), specifically ChatGPT, to support users when dealing with scientific workflows. We performed 3 user studies in 2 scientific domains to evaluate ChatGPT for comprehending, adapting, and extending workflows. Our results indicate that LLMs efficiently interpret workflows but achieve lower performance for exchanging components or purposeful workflow extensions. We characterize their limitations in these challenging scenarios and suggest future research directions.

Conclusions: Our results show a high accuracy for comprehending and explaining scientific workflows while achieving a reduced performance for modifying and extending workflow descriptions. These findings clearly illustrate the need for further research in this area.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

GigaScience MULTIDISCIPLINARY SCIENCES-

CiteScore

15.50

自引率

1.10%

发文量

119

审稿时长

1 weeks

期刊介绍： GigaScience seeks to transform data dissemination and utilization in the life and biomedical sciences. As an online open-access open-data journal, it specializes in publishing "big-data" studies encompassing various fields. Its scope includes not only "omic" type data and the fields of high-throughput biology currently serviced by large public repositories, but also the growing range of more difficult-to-access data, such as imaging, neuroscience, ecology, cohort data, systems biology and other new types of large-scale shareable data.