Three versions of an atopic dermatitis case report written by humans, artificial intelligence, or both: Identification of authorship and preferences

Mara Giavina Bianchi MD, PhD; Andrew D’adario MSc; Pedro Giavina Bianchi MD, PhD; Birajara Soares Machado PhD

The Journal of Allergy and Clinical Immunology: Global, Volume 4, Issue 1, Article 100373, February 2025
DOI: 10.1016/j.jacig.2024.100373
Citations: 0
Abstract
Background
The use of artificial intelligence (AI) in scientific writing is rapidly increasing, raising concerns about authorship identification, content quality, and writing efficiency.
Objectives
This study investigates the real-world impact of ChatGPT, a large language model, on these aspects in a simulated publication scenario.
Methods
Forty-eight individuals representing 3 levels of medical expertise (medical students, residents, and experts in allergy or dermatology) evaluated 3 blinded versions of an atopic dermatitis case report: one human written (HUM), one AI generated (AI), and one written by human and AI combined (COM). The survey asked participants to identify the authorship of each text, rank the versions by preference, and grade each on 13 quality criteria. The time taken to generate each manuscript was also recorded.
Results
Overall authorship identification accuracy was no better than chance (33%). Expert participants (50.9%) demonstrated significantly higher accuracy than residents (27.7%) and students (19.6%; P < .001). Participants preferred the AI-assisted versions (AI and COM) over HUM (P < .001), with COM receiving the highest quality scores. Compared with HUM, COM and AI reduced writing time by 83.8% and 84.3%, respectively, while improving quality by 13.9% (P < .001) and 11.1% (P < .001), respectively. However, experts assigned the lowest score to the references of the AI manuscript, potentially hindering its publication.
Conclusion
AI can deceptively mimic human writing, particularly for less experienced readers. Although AI-assisted writing is appealing and offers significant time savings, human oversight remains crucial to ensure accuracy, ethical considerations, and optimal quality. These findings underscore the need for transparency in AI use and highlight the potential of human-AI collaboration in the future of scientific writing.