ChatGPT’s scorecard after the performance in a series of tests conducted at the multi-country level: A pattern of responses of generative artificial intelligence or large language models

Current Research in Biotechnology (IF 3.6; JCR Q2, Biotechnology & Applied Microbiology) Pub Date: 2024-01-01 DOI: 10.1016/j.crbiot.2024.100194
Manojit Bhattacharya, Soumen Pal, Srijan Chatterjee, Abdulrahman Alshammari, Thamer H. Albekairi, Supriya Jagga, Elijah Ige Ohimain, Hatem Zayed, Siddappa N. Byrareddy, Sang-Soo Lee, Zhi-Hong Wen, Govindasamy Agoramoorthy, Prosun Bhattacharya, Chiranjib Chakraborty

Abstract

Recently, researchers have raised concerns about ChatGPT-derived answers. Here, we conducted a series of tests using ChatGPT, run by individual researchers at the multi-country level, to understand the pattern of its answer accuracy, reproducibility, answer length, plagiarism, and depth, using two questionnaires (the first with 15 MCQs and the second with 15 KBQs). Among the 15 MCQ-generated answers, 13 ± 70 were correct (median: 82.5; coefficient of variation: 4.85), 3 ± 0.77 were incorrect (median: 3; coefficient of variation: 25.81), and 1 to 10 were reproducible, while 11 to 15 were not. Among the 15 KBQs, the answer length for each question (in words) was about 294.5 ± 97.60 (means ranging from 138.7 to 438.09), and the mean similarity index (in words) was about 29.53 ± 11.40 (coefficient of variation: 38.62) for each question. Statistical models were also developed from the analyzed answer parameters. The study shows a pattern of correct and incorrect ChatGPT-derived answers and urges the development of an error-free, next-generation LLM to avoid misguiding users.
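
The coefficient reported alongside each mean ± SD appears consistent with the usual coefficient of variation, that is, the standard deviation divided by the mean and expressed as a percentage. The sketch below is a reader's sanity check under that assumption, not the authors' analysis code; the function name `coefficient_of_variation` and the `reported` dictionary are illustrative, and only the two figure sets that the abstract gives with both an SD and a coefficient are used.

```python
def coefficient_of_variation(mean: float, sd: float) -> float:
    """Return the coefficient of variation as a percentage (SD / mean * 100)."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return sd / mean * 100.0


# (mean, SD, coefficient stated in the abstract); labels are illustrative.
reported = {
    "incorrect MCQ answers": (3.0, 0.77, 25.81),
    "KBQ similarity index": (29.53, 11.40, 38.62),
}

for label, (mean, sd, stated) in reported.items():
    cv = coefficient_of_variation(mean, sd)
    print(f"{label}: computed {cv:.2f}%, stated {stated}%")
```

The computed values land within a fraction of a percentage point of the stated ones, which is what one would expect if the published coefficients were derived from unrounded means and standard deviations.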

Source journal
Current Research in Biotechnology
Subject area: Biochemistry, Genetics and Molecular Biology - Biotechnology
CiteScore: 6.70
Self-citation rate: 3.60%
Articles published: 50
Review time: 38 days
Journal description: Current Research in Biotechnology (CRBIOT) is a new primary research, gold open access journal from Elsevier. CRBIOT publishes original papers, reviews, and short communications (including viewpoints and perspectives) resulting from research in biotechnology and biotech-associated disciplines. It is a peer-reviewed gold open access (OA) journal, and upon acceptance all articles are permanently and freely available. It is a companion to the highly regarded review journal Current Opinion in Biotechnology (2018 CiteScore 8.450) and is part of the Current Opinion and Research (CO+RE) suite of journals. All CO+RE journals leverage the Current Opinion legacy of editorial excellence, high impact, and global reach to ensure they are a widely read resource that is integral to scientists' workflow.