An Accuracy Assessment: Responses to Mycoplasma Pneumoniae Pneumonia-Related Questions by Different Artificial Intelligence Tools

IF 1.9 · Zone 4 (Medicine) · JCR Q3 (Respiratory System) · Clinical Respiratory Journal · Pub Date: 2024-08-26 · DOI: 10.1111/crj.70005
Shuang Li
Citation count: 0

An Accuracy Assessment: Responses to Mycoplasma Pneumoniae Pneumonia-Related Questions by Different Artificial Intelligence Tools

With rapid socio-economic and technological development, artificial intelligence (AI) is increasingly used in daily life. Any discussion of AI inevitably turns to ChatGPT, a large language model. Pretrained on extensive language data, ChatGPT can perform a range of natural language processing tasks, including dialog generation. Similar large language models include Kimi, an intelligent assistant launched by Beijing Lunar Tech, and ERNIE, a chatbot developed by Baidu.

Mycoplasma pneumoniae pneumonia is an inflammation of the lungs caused by infection with Mycoplasma pneumoniae; it can involve the bronchi, bronchioles, alveoli, and interstitial tissue. It can occur at any age but is more common in children aged 5 and above and in immunocompromised individuals (such as the elderly, immunodeficient individuals, and patients undergoing immunosuppressive therapy). The course is generally 1–2 weeks, with a favorable prognosis and no sequelae in most cases. A small number of cases, however, become severe, presenting mainly with respiratory distress and respiratory failure. These severe cases are often associated with acute respiratory distress syndrome, plastic bronchitis affecting the large airways, diffuse bronchiolitis, and severe pulmonary embolism. In rare instances, severe extrapulmonary complications are the main manifestation.

Given the prevalence of Mycoplasma pneumoniae infection in China, we sought to determine whether ChatGPT and similar AI tools could contribute to a better understanding of Mycoplasma pneumoniae pneumonia. We selected the 13 questions that patients most commonly ask in clinical practice and posed them to ChatGPT-3.5, ChatGPT-4.0, Kimi, and ERNIE, running each question five times. We then invited nine experts with extensive clinical experience and knowledge of Mycoplasma pneumoniae pneumonia to rate the accuracy of the answers: four respiratory specialists and five pediatric specialists, five from our hospital and four from other hospitals. The scoring criteria were as follows: score = 0, completely incorrect; 0 < score < 6, inaccurate; 6 ≤ score < 8, mostly accurate; 8 ≤ score < 10, very accurate; score = 10, completely accurate. The final scores were 8.46 ± 0.80 for ChatGPT 3.5, 7.05 ± 1.16 for ChatGPT 4.0, 8.62 ± 0.68 for Kimi, and 9.33 ± 0.30 for ERNIE. In descending order of accuracy, the tools ranked ERNIE, Kimi, ChatGPT 3.5, and ChatGPT 4.0.
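
The scoring rubric and the ranking can be written out as a short Python sketch. This is only an illustration of the criteria quoted above, not the author's analysis code: the function name `accuracy_label` and the dictionary `reported` are hypothetical, the score pairs are copied from the text, and applying the rubric to each tool's overall score (rather than to individual expert ratings) is an assumption made here for illustration.

```python
def accuracy_label(score: float) -> str:
    """Map a 0-10 expert score to the rubric label quoted in the text."""
    if not 0 <= score <= 10:
        raise ValueError("score must lie between 0 and 10")
    if score == 0:
        return "completely incorrect"
    if score < 6:
        return "inaccurate"
    if score < 8:
        return "mostly accurate"
    if score < 10:
        return "very accurate"
    return "completely accurate"


# (score, spread) pairs exactly as reported in the text.
reported = {
    "ChatGPT 3.5": (8.46, 0.80),
    "ChatGPT 4.0": (7.05, 1.16),
    "Kimi": (8.62, 0.68),
    "ERNIE": (9.33, 0.30),
}

# Rank the tools by reported score, highest first.
ranking = sorted(reported, key=lambda name: reported[name][0], reverse=True)

# Assumption: the rubric is applied to the overall score of each tool.
labels = {name: accuracy_label(score) for name, (score, _spread) in reported.items()}

for name in ranking:
    score, spread = reported[name]
    print(f"{name}: {score:.2f} ± {spread:.2f} ({labels[name]})")
```

Under this reading, ChatGPT 4.0 is the only tool whose overall score falls in the "mostly accurate" band; the other three fall in "very accurate".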

Comparing the answers of ChatGPT 3.5, ChatGPT 4.0, ERNIE, and Kimi on Mycoplasma pneumoniae pneumonia, we found that ERNIE was the most accurate, followed by Kimi, ChatGPT 3.5, and ChatGPT 4.0. Although ERNIE performed best of the four AI models and gave more comprehensive answers, some of its answers still contained noticeable errors. For instance, in question 11, on treatment options, ERNIE suggested that penicillin antibiotics have some effectiveness against Mycoplasma pneumoniae pneumonia. Because Mycoplasma pneumoniae lacks a cell wall, antibiotics that target the cell wall are ineffective against it and are useful only when a bacterial co-infection is present. Similarly, in question 2, on how Mycoplasma pneumoniae spreads, ERNIE's answer suggested sexual transmission, which is not supported by the relevant literature in PubMed.

Surprisingly, although OpenAI claims that ChatGPT 4.0 surpasses ChatGPT 3.5 in language understanding, generation capabilities, and performance, ChatGPT 3.5 consistently received higher scores on our questions about Mycoplasma pneumoniae pneumonia; in terms of accuracy, it outperformed ChatGPT 4.0. Compared with ChatGPT 3.5, ChatGPT 4.0 was more vivid and engaging and used fewer technical terms, but its answers were often less comprehensive and less specific, which occasionally led to misunderstandings. For example, its answer to question 1, on what Mycoplasma pneumoniae pneumonia is, contained factual errors regarding chest radiographs.

Comparing the answers from the four AI tools with the latest guidelines revealed further inaccurate or incomplete answers beyond the issues mentioned above. For question 8, on the complications of Mycoplasma pneumoniae pneumonia, none of the four tools gave a comprehensive answer. The course of the disease is generally 1–2 weeks, with a favorable prognosis and no sequelae in most cases, but a small number of cases become severe, presenting mainly with respiratory distress and respiratory failure. These severe cases are often associated with acute respiratory distress syndrome, plastic bronchitis affecting the large airways, diffuse bronchiolitis, and severe pulmonary embolism; in rare instances, severe extrapulmonary complications are the main manifestation. Yet the answers from all four AI tools overlooked plastic bronchitis and pulmonary embolism [1]. Plastic bronchitis is a significant contributor to severe and fulminant Mycoplasma pneumoniae pneumonia, presenting with persistent high fever, respiratory distress, and physical examination findings such as tracheal casts, subcutaneous emphysema, and decreased or absent breath sounds in the lungs [1]. Pulmonary embolism may occur on its own or together with emboli in other locations; it is a cause of necrotizing pneumonia and a significant factor in residual lung atelectasis and organizing pneumonia [1].

For question 11, on the drug treatment of Mycoplasma pneumoniae pneumonia, the answers from the four AI tools were not sufficiently accurate. A systematic review and meta-analysis has shown a global rise in the proportion of macrolide-resistant Mycoplasma pneumoniae infections, from 18.2% in 2000 to 76.5% in 2019 [2]. The proportion is highest in the Western Pacific region, reaching 79.5% in mainland China [2]. In addition, a multicenter study in China found that macrolide resistance in Mycoplasma pneumoniae increased from 75.8% to 97.4% between 2013 and 2019 [3]. Even so, Chinese guidelines recommend macrolide antibiotics as the first-line treatment for Mycoplasma pneumoniae pneumonia in children [1]. For macrolide-resistant cases, children over 8 years old may be treated with newer tetracyclines; because these drugs can cause tooth discoloration and enamel hypoplasia, their use in children under 8 years old requires careful risk–benefit assessment. Fluoroquinolone antibiotics are alternative therapies for suspected or confirmed MUMPP (macrolide-unresponsive Mycoplasma pneumoniae pneumonia), RMPP (refractory Mycoplasma pneumoniae pneumonia), or SMPP (severe Mycoplasma pneumoniae pneumonia); given the risk of cartilage injury seen in animals and of tendon rupture in humans, their use in individuals under 18 years old requires thorough risk assessment. In adults [4], because macrolide resistance rates are high, oral doxycycline or minocycline is preferred for Mycoplasma infections. Macrolides may be used empirically in regions with low resistance rates. Respiratory fluoroquinolones are alternatives in areas with high drug resistance rates or for patients who are allergic to or intolerant of other medications. Besides antimicrobial therapy, symptomatic treatments such as antipyretics, cough suppressants, and expectorants are also used, and severe cases may require glucocorticoids or bronchoalveolar lavage. None of the four AI tools specified how antimicrobial treatment differs between children and adults, an omission that may lead to incorrect use, ineffective treatment, or worsening of the condition.

In summary, although AI technology is advancing rapidly with the emergence of large language models such as ChatGPT, Kimi, and ERNIE, their answers to questions about Mycoplasma pneumoniae pneumonia are still often incomplete, inaccurate, or even clearly erroneous. This may be because their training data have not been updated in line with the latest guidelines. Furthermore, prescribing must take into account factors such as patient age, medication history, and drug side effects, whereas AI tools typically list treatment medications without tailoring a plan to the individual patient. Therefore, while AI tools can help answer questions about Mycoplasma pneumoniae pneumonia, final decisions should still rest with healthcare professionals; indeed, every answer ended with “Consult a healthcare professional if you have related concerns.”

Despite these limitations, the overall accuracy of the AI tools' answers is relatively high. Such tools can explain common health issues to patients, which saves healthcare resources to some extent, and they can also promote medical knowledge and support doctor–patient communication and shared decision-making. We believe that, with ongoing efforts from researchers, AI may become more specialized and humane in its medical applications.

Shuang Li designed the study and completed the data collection and manuscript writing.

Permission to reproduce material from other sources has been obtained and is acknowledged accordingly.

The author has nothing to report.

The author declares no conflicts of interest.
