Medical Education and artificial intelligence: Responsible and effective practice requires human oversight

Medical Education · Impact factor 4.9 · CAS Zone 1 (Education) · JCR Q1 (Education, Scientific Disciplines) · Published 2024-08-23 · DOI: 10.1111/medu.15495

Kevin W. Eva

Medical Education, volume 58, issue 11, pages 1260-1261. Full text: https://onlinelibrary.wiley.com/doi/10.1111/medu.15495

Abstract

I have a confession to make. I have been slow to generate an official policy statement for Medical Education about artificial intelligence (AI) because I find the discussion terribly boring. Don't confuse that statement with lack of interest—I consider the technology exhilarating, use it routinely, and marvel at its potential.¹ Don't confuse it either with being dismissive—I recognise, appreciate, and wish to help guard against the ethical harms that could be done from, among other things, loss of intellectual property and reinforcement of systemic bias.² However, I find most discussion about the use of AI in publishing (be it about writing, enabling better and faster peer review, or the need to guard against unscrupulous practices) to boil down to the same basic sentiment: Responsible and effective practice requires human oversight.

With over 14 500 seemingly viable AI resources readily available,³ there is great risk of overgeneralization and I will not profess to having deep knowledge of the means through which each has been generated. I do, however, believe this class of technologies, as a whole, to best be conceived of as tools (that happen to be proliferating at unprecedented speed and with little empirical testing).⁴,⁵ Some of the panic the rate of development creates amounts to worry that we ourselves will become tools, used by the computers, but that is not the reality we are dealing with at the moment and there are very good reasons to not believe the futurists in that regard.⁶ As such, we must focus on what all tools require for responsible and effective practice: Human, or at least biological,⁷ oversight. So let's consider the role each group involved in journal publication has to play in that regard.

We encourage authors to use AI if and when it helps strengthen their capacity to improve awareness of pre-existing literature,⁸ to formulate stronger research questions or to bolster research designs and analyses (i.e. any time it helps to make their scholarship better). We are not going to force disclosure of every way in which AI influenced their submissions because it would be impossible to craft a sufficiently detailed guideline (especially given that people are often unaware of how AI has been embedded in common software packages). Further, a dominant theme in our International Editorial Advisory Board's debate about this issue was that requiring such disclosure is likely to be increasingly nonsensical, tantamount to needing to disclose the use of Google, spell-check, a keyboard, or any other tool that is similarly omnipresent in academic work. If using AI was of fundamental importance to your project, then what made it so should be disclosed in the body of your paper. That standard, however, is the same as has always been applied to disclosing use of tools like NVivo or SPSS for primary analyses or any of countless databases for literature searches: authors are responsible for clearly describing aspects of their efforts that readers need to know to understand the rigour and replicability of the study. In doing so, it is of course important to keep in mind, just as we do routinely with other technologies, that any tool can be misused, requiring caution and investment in learning about the tool's strengths and limitations.¹,⁹,¹⁰ Responsible and effective practice requires human oversight.

Optimistically, we hope this position will improve equity in the field by reducing barriers for those who wish to publish in our pages despite their first language not being English. We understand the risk of other barriers being created by virtue of privileging those who can afford the technology, but remain hopeful that is a lesser challenge given the truism that computer technology gets cheaper with time.¹¹ There can be no doubt that AI hallucinates,¹⁰ requiring individuals to double check its claims about the world while recognising that the author, not the computer, is accountable for the final text.¹² For those reasons, I would never myself dream of submitting a paper in a language other than one in which I was fluent without careful and triangulated effort to confirm that the translation said exactly what I intended it to. Responsible and effective practice requires human oversight.

Whether authors have used AI or not, peer review remains the best tool we have at our disposal for improving the work we are collectively undertaking as a field of study.¹³ Given that AI is currently built on a corpus of knowledge that is predominantly English,² we need reviewers to raise questions about whether a project responsibly represents the state of knowledge in the world. That standard, however, is the same as has always been applied in attempts to judge the adequacy of a paper's framing. Similarly, while it would be inappropriate to submit a manuscript one received for peer review to an AI device without permission to do so, that aligns with the same confidentiality standard that has existed for decades. Asking a question of AI to clarify one's thinking or to contemplate clearer (or more courteous) ways of conveying one's concerns is encouraged if and when it improves the reviewer's capacity to offer feedback to authors or professional development for reviewers themselves.¹⁴ Out of curiosity, I once submitted some of my own writing to ChatGPT with a request to ‘write a rejection letter’ (to see if it could predict the objections peer reviewers might raise). After the first response largely parroted back the claims I had made in the abstract, I instructed the computer to try again, stressing that I wanted a rejection letter. Its response was informative: ‘I am not programmed for critical appraisal.’ Even the AI ‘knew’ that responsible and effective practice requires human oversight.

As curators and stewards of the journal, we too pledge to use AI if and when it helps to improve reader, author, and reviewer experience. For example, AI has enabled implementation of a ‘free format’ submission system at Medical Education, so our authors no longer have to go through the tedium of formatting references in a specific way; authors will also note that most of the effort involved in making a submission now amounts to uploading a manuscript and confirming that the software has accurately identified author names, the title, abstract and so on. AI has been used for years to try to detect unethical publication practices such as duplicate submission and plagiarism. Similarly, the software we use to manage peer review has AI embedded that suggests reviewers it thinks to be particularly well suited to the content of the manuscript under consideration. While these systems will undoubtedly continue to improve, each is far from perfect. Truly fraudulent behaviour cannot be caught by any existing software while far more common lesser transgressions are generally over-called. Further, our editors must remain cognizant of the value of hearing diverse voices to inform our peer reviews if we are to facilitate a truly inclusive academic community.¹⁵ As a result, we will not automate decision-making or empower AI to conduct it. Instead, these resources will continue to be used as tools, taking advantage of their greater capacity to flag potential issues and opportunities while continuing to investigate them with the care and thoughtfulness required to yield the best outcomes we can achieve. Responsible and effective practice requires human oversight.

While less of a policy than a perspective, I offer this editorial as proof that I did (eventually) conclude it necessary to share these views for the sake of transparency, however boring (i.e. non-reactive or reinforcing of the status quo) they happen to be. As technology changes, our policies will continue to evolve, but for now we encourage everyone involved in academic publishing to use whatever tools they have available for improving the field and its capacity to improve health through education. That, to my mind, defines responsible and effective oversight.

Source journal: Medical Education (Medicine: Health Care)

CiteScore: 8.40

Self-citation rate: 10.00%

Articles published: 279

Review time: 4-8 weeks

About the journal: Medical Education seeks to be the pre-eminent journal in the field of education for health care professionals, and publishes material of the highest quality, reflecting worldwide or provocative issues and perspectives. The journal welcomes high-quality papers on all aspects of health professional education, including undergraduate education, postgraduate training, continuing professional development, and interprofessional education.