Anna K. Fiedler BS, Kai Zhang PhD, Tia S. Lal BS, Xiaoqian Jiang PhD, Stuart M. Fraser MD
Generative Pre-trained Transformer for Pediatric Stroke Research: A Pilot Study
Pediatric Neurology, volume 160, pages 54-59. Journal Article. Published 2024-07-08.
DOI: 10.1016/j.pediatrneurol.2024.07.001
URL: https://www.sciencedirect.com/science/article/pii/S0887899424002522
Journal impact factor: 3.2 (JCR Q2, Clinical Neurology)
Abstract
Background
Pediatric stroke is an important cause of morbidity in children. Although research can be challenging, large amounts of data have been captured through collaborative efforts in the International Pediatric Stroke Study (IPSS). This study explores the use of an advanced artificial intelligence program, the Generative Pre-trained Transformer (GPT), to enter pediatric stroke data into the IPSS.
Methods
The most recent 50 clinical notes of patients with ischemic stroke or cerebral venous sinus thrombosis at the UTHealth Pediatric Stroke Clinic were deidentified. Domain-specific prompts were engineered for an offline artificial intelligence program (GPT) to answer IPSS questions. Responses from GPT were compared with those of the human rater. Percent agreement was assessed across 50 patients for each of the 114 queries developed from the IPSS database outcome questionnaire.
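The percent-agreement measure described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes agreement means an exact match between the model's answer and the human rater's answer for the same patient, and the query names and answer values are hypothetical toy data.

```python
from typing import Dict, List


def percent_agreement(model_answers: List[str], human_answers: List[str]) -> float:
    """Return the percentage of paired answers that match exactly.

    Assumption: answers are already normalized to a shared vocabulary,
    so plain string equality is a fair comparison.
    """
    if len(model_answers) != len(human_answers):
        raise ValueError("answer lists must have the same length")
    if not model_answers:
        raise ValueError("at least one pair of answers is required")
    matches = sum(m == h for m, h in zip(model_answers, human_answers))
    return 100.0 * matches / len(model_answers)


def agreement_by_query(
    model: Dict[str, List[str]], human: Dict[str, List[str]]
) -> Dict[str, float]:
    """Compute percent agreement separately for each query (one answer per patient)."""
    return {query: percent_agreement(model[query], human[query]) for query in human}


# Hypothetical toy data: two queries answered for four patients each.
human = {
    "stroke_type": ["AIS", "CSVT", "AIS", "AIS"],
    "seizures_at_followup": ["no", "yes", "no", "no"],
}
model = {
    "stroke_type": ["AIS", "CSVT", "AIS", "AIS"],
    "seizures_at_followup": ["no", "no", "no", "yes"],
}

scores = agreement_by_query(model, human)
for query, score in scores.items():
    print(f"{query}: {score:.1f}% agreement")
```

Reporting agreement per query, rather than only one pooled figure, makes it easier to see which prompts need further adjustment.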
Results
GPT demonstrated strong performance on several questions but showed variability overall. In its early iterations, it occasionally matched human judgment with an accuracy score of 1.00 (n = 20, 17.5%), but it scored as low as 0.26 in some patients. Prompts were adjusted in four subsequent iterations to increase accuracy. In the fourth iteration, overall agreement was 93.6%, with a maximum agreement of 100% and a minimum of 62%. Of 2400 individual items assessed, our model entered 2247 (93.6%) correctly and 153 (6.4%) incorrectly.
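The reported proportions can be checked with a few lines of arithmetic. This sketch uses only the counts stated in the Results. The helper name `percent` is illustrative, and reading n = 20 as 20 of the 114 queries is an assumption, made because it reproduces the reported 17.5%.

```python
def percent(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100.0 * part / total, 1)


# Counts taken from the Results paragraph.
total_items = 2400
correct_items = 2247
incorrect_items = 153

# The two categories should account for every assessed item.
items_sum_matches = correct_items + incorrect_items == total_items

correct_pct = percent(correct_items, total_items)
incorrect_pct = percent(incorrect_items, total_items)

# Assumption: n = 20 refers to 20 of the 114 queries.
perfect_query_pct = percent(20, 114)

print(items_sum_matches, correct_pct, incorrect_pct, perfect_query_pct)
```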
Conclusions
Although our tailored generative model with domain-specific prompt engineering and ontological guidance shows promise for research applications, further refinement is needed to enhance its accuracy. It cannot enter data entirely independently, but it can be employed in tandem with human oversight, contributing to a collaborative approach that reduces overall effort.
About the journal:
Pediatric Neurology publishes timely peer-reviewed clinical and research articles covering all aspects of the developing nervous system.
Pediatric Neurology features up-to-the-minute publication of the latest advances in the diagnosis, management, and treatment of pediatric neurologic disorders. The journal's editor, E. Steve Roach, in conjunction with the team of Associate Editors, heads an internationally recognized editorial board, ensuring the most authoritative and extensive coverage of the field. Among the topics covered are epilepsy, mitochondrial diseases, congenital malformations, chromosomopathies, peripheral neuropathies, perinatal and childhood stroke, and cerebral palsy, as well as other diseases affecting the developing nervous system.