Battle of the authors: Comparing neurosurgery articles written by humans and AI

Mehmet Yigit Akgun, Melihcan Savasci, Caner Gunerbuyuk, Sezer Onur Gunara, Tunc Oktenoglu, Ali Fahir Ozer, Ozkan Ates

Journal of Clinical Neuroscience, Volume 135, Article 111152. Published 2025-02-25. DOI: 10.1016/j.jocn.2025.111152
Citations: 0
Abstract
Background
The advancement of artificial intelligence (AI) has led to its application in various fields, including the medical literature. This study compares the quality of neurosurgery articles written by human authors with those generated by ChatGPT, an advanced AI model. The objective was to determine whether AI-generated articles meet the standards of human-written academic papers.
Methods
A total of 10 neurosurgery articles, 5 written by humans and 5 by ChatGPT, were evaluated by a panel of blinded experts. The assessment parameters included overall impression, readability, criteria satisfaction, and degree of detail. Additionally, readability was quantified using the Lix score and the Flesch-Kincaid grade level. Preference and identification tests were also conducted to determine whether the experts could distinguish between the two types of articles.
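The abstract does not describe how the two readability indices were computed, but both follow standard published formulas: Lix is the average sentence length plus the percentage of words longer than six letters, and the Flesch-Kincaid grade level is 0.39 × (words/sentence) + 11.8 × (syllables/word) − 15.59. The sketch below is a minimal illustration of those conventional formulas, not the authors' code; the function names and the naive vowel-group syllable heuristic are assumptions for demonstration.

```python
import re

def count_syllables(word: str) -> int:
    """Naive English syllable estimate: count vowel groups, drop a trailing silent 'e'."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    n = len(groups)
    if word.lower().endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def readability(text: str) -> tuple[float, float]:
    """Return (Lix score, Flesch-Kincaid grade level) for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    long_words = [w for w in words if len(w) > 6]  # Lix counts words > 6 letters
    syllables = sum(count_syllables(w) for w in words)

    lix = len(words) / len(sentences) + 100 * len(long_words) / len(words)
    fk = 0.39 * (len(words) / len(sentences)) + 11.8 * (syllables / len(words)) - 15.59
    return lix, fk

# Hypothetical sample text, not taken from the study's corpus.
sample = ("The advancement of artificial intelligence has led to its "
          "application in various fields, including medical literature.")
print(readability(sample))
```

On both indices a higher value indicates longer sentences and longer or more polysyllabic words, i.e. text pitched at a higher reading level.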
Results
The study found no significant differences in the overall quality parameters between human-written and ChatGPT-generated articles. Readability scores were higher for ChatGPT articles (Lix score: 35 vs. 26; Flesch-Kincaid grade level: 10 vs. 8). Experts correctly identified the authorship of the articles 61% of the time, with preferences almost evenly split (47% preferred ChatGPT, 44% preferred human, and 9% had no preference). The most statistically significant result was the higher readability scores of ChatGPT-generated articles, indicating that AI can produce more readable content than human authors.
Conclusion
ChatGPT is capable of generating neurosurgery articles that are comparable in quality to those written by humans. The higher readability scores of AI-generated articles suggest that ChatGPT can enhance the accessibility of scientific literature. This study supports the potential integration of AI in academic writing, offering a valuable tool for researchers and medical professionals.
About the journal:
This international journal, the Journal of Clinical Neuroscience, publishes articles on clinical neurosurgery and neurology and the related neurosciences, such as neuropathology, neuroradiology, neuro-ophthalmology and neurophysiology.
The journal has a broad international perspective and emphasises the advances occurring in Asia, the Pacific Rim region, Europe and North America. It acts as a focus for the publication of major clinical and laboratory research, and also publishes solicited manuscripts on specific subjects from experts, case reports, and other information of interest to clinicians working in the clinical neurosciences.