Background: Clinical practice settings have become increasingly dependent on digital or eHealth technologies such as electronic health records. It is vitally important to support nurses in adapting to digitalized health care systems; however, little is known about nursing graduates' experiences as they transition to the workplace.
Objective: This study aims to (1) describe newly qualified nurses' experiences with digital health in the workplace, and (2) identify strategies that could help support new graduates' transition and practice with digital health.
Methods: An exploratory descriptive qualitative design was used. A total of 14 nurses from Eastern and Western Canada participated in semistructured interviews, and data were analyzed using inductive content analysis.
Results: Three themes were identified: (1) experiences before becoming a registered nurse, (2) experiences upon joining the workplace, and (3) suggestions for bridging the gap in transition to digital health practice. Findings revealed more similarities than differences between participants with respect to gaps in digital health education, technology-related challenges, and their influence on nursing practice.
Conclusions: Digital health is the foundation of contemporary health care; therefore, comprehensive education during nursing school and throughout professional nursing practice, as well as organizational support and policy, are critical pillars. Health systems investing in digital health technologies must create supportive work environments so that nurses can thrive in technologically rich settings and strengthen their capacity to deliver the digital health care of the future.
Background: Although history taking is fundamental for diagnosing medical conditions, teaching and providing feedback on the skill can be challenging due to resource constraints. Virtual simulated patients and web-based chatbots have thus emerged as educational tools, with recent advancements in artificial intelligence (AI) such as large language models (LLMs) enhancing their realism and potential to provide feedback.
Objective: In our study, we aimed to evaluate the effectiveness of a Generative Pretrained Transformer 4 (GPT-4) model in providing structured feedback on medical students' performance in history taking with a simulated patient.
Methods: We conducted a prospective study involving medical students performing history taking with a GPT-powered chatbot. To that end, we designed a chatbot to simulate patients' responses and provide immediate feedback on the comprehensiveness of the students' history taking. Students' interactions with the chatbot were analyzed, and feedback from the chatbot was compared with feedback from a human rater. We measured interrater reliability and performed a descriptive analysis to assess the quality of feedback.
Results: Most of the study's participants were in their third year of medical school. A total of 1894 question-answer pairs from 106 conversations were included in our analysis. GPT-4's role-play and responses were medically plausible in more than 99% of cases. Interrater reliability between GPT-4 and the human rater showed "almost perfect" agreement (Cohen κ=0.832). Lower agreement (κ<0.6), observed for 8 of the 45 feedback categories, highlighted topics on which the model's assessments were overly specific or diverged from human judgment.
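To illustrate the agreement metric reported above, the following is a minimal Python sketch of how overall and per-category Cohen κ between GPT-4 and a human rater could be computed. The binary labels, category names, and data layout are illustrative assumptions, not the authors' actual pipeline.

```python
# Sketch: overall and per-category Cohen's kappa between GPT-4 and a
# human rater. All labels and category names below are hypothetical.
from sklearn.metrics import cohen_kappa_score

# Hypothetical binary judgments (1 = item covered, 0 = missed),
# one list per rater, aligned across the same feedback items.
gpt4_labels = [1, 0, 1, 1, 0, 1, 1, 0]
human_labels = [1, 0, 1, 0, 0, 1, 1, 0]

overall_kappa = cohen_kappa_score(gpt4_labels, human_labels)
print(f"overall kappa: {overall_kappa:.3f}")

# Per-category agreement: kappa < 0.6 would flag categories where the
# model's assessments diverge from human judgment, as in the study.
ratings_by_category = {
    "chief complaint": ([1, 1, 0, 1], [1, 1, 0, 1]),
    "medication history": ([1, 0, 1, 0], [0, 0, 1, 1]),
}
for category, (gpt4, human) in ratings_by_category.items():
    kappa = cohen_kappa_score(gpt4, human)
    flag = "  <-- low agreement" if kappa < 0.6 else ""
    print(f"{category}: kappa={kappa:.3f}{flag}")
```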
Conclusions: The GPT model was effective in providing structured feedback on history-taking dialogs conducted by medical students. Although we identified some limitations in the specificity of feedback for certain categories, the overall high agreement with human raters suggests that LLMs can be a valuable tool for medical education. Our findings thus advocate for the careful integration of AI-driven feedback mechanisms in medical training and highlight aspects that warrant attention when LLMs are used in that context.
China's secondary vocational medical education is essential for training primary health care personnel and enhancing public health responses. The system currently faces challenges, primarily because its emphasis on knowledge acquisition overshadows the development and application of skills, especially in the context of emerging artificial intelligence (AI) technologies. This article examines the impact of AI on medical practice and uses this analysis to suggest reforms for China's vocational medical education system. AI is found to significantly enhance diagnostic capabilities, therapeutic decision-making, and patient management; however, it also raises concerns such as potential job losses and requires medical professionals to adapt to new technologies. Proposed reforms include a greater focus on critical thinking, hands-on experience, skill development, and medical ethics, as well as integrating the humanities and AI into the curriculum. These reforms require ongoing evaluation and sustained research to effectively prepare medical students for future challenges in the field.
Background: Official conference hashtags are commonly used to promote tweeting and social media engagement. The reach and impact of introducing a new hashtag during an oncology conference have yet to be studied. The American Society of Clinical Oncology (ASCO) conducts an annual global meeting, which was held entirely virtually in 2020 and 2021 due to the COVID-19 pandemic.
Objective: This study aimed to assess the reach and impact (in the form of vertices and edges generated) and X (formerly Twitter) activity of the new hashtags #goASCO20 and #goASCO21 in the ASCO 2020 and 2021 virtual conferences.
Methods: New hashtags (#goASCO20 and #goASCO21) were created for the ASCO virtual conferences in 2020 and 2021 to help focus gynecologic oncology discussion at the ASCO meetings. Data were retrieved using these hashtags (#goASCO20 for 2020 and #goASCO21 for 2021). A social network analysis was performed using the NodeXL software application.
Results: The hashtags #goASCO20 and #goASCO21 had similar impacts on the social network. Analysis of the reach and impact of the individual hashtags found #goASCO20 to have 150 vertices and 2519 total edges and #goASCO21 to have 174 vertices and 2062 total edges. Mention and tweet counts were also similar between 2020 and 2021. The circles representing different users were spatially arranged in a more balanced way in 2021. Tweets using the #goASCO21 hashtag received significantly more responses than tweets using #goASCO20 (75 times in 2020 vs 360 times in 2021; z value=16.63, P<.001), indicating increased engagement in the subsequent year.
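For readers unfamiliar with the vertex and edge counts reported above, the sketch below shows how such a hashtag interaction network could be assembled and measured. The study used the NodeXL application; networkx is an analogous open-source alternative used here purely for illustration, and the interaction records are placeholders.

```python
# Sketch: building a hashtag interaction network and counting its
# vertices and edges. The study itself used NodeXL; this networkx
# version is an illustrative stand-in with made-up data.
import networkx as nx

# Each record: (author, mentioned_or_replied_user) for one tweet
# collected under the conference hashtag.
interactions = [
    ("user_a", "user_b"),
    ("user_a", "user_c"),
    ("user_b", "user_c"),
    ("user_d", "user_a"),
]

# A directed multigraph keeps parallel edges, so repeated mentions
# between the same pair of accounts each count toward the edge total.
graph = nx.MultiDiGraph()
graph.add_edges_from(interactions)

print(f"vertices: {graph.number_of_nodes()}")  # unique accounts
print(f"edges: {graph.number_of_edges()}")     # total interactions
```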
Conclusions: Introducing gynecologic oncology specialty-specific hashtags (#goASCO20 and #goASCO21) that were related to but distinct from the official conference hashtags (#ASCO20 and #ASCO21) helped focus discussion on topics of interest to gynecologic oncologists during a virtual pan-oncology meeting. This impact was visible in the social network analysis.
Background: With the increasing application of large language models (LLMs) such as ChatGPT across industries, their potential in the medical domain, especially in standardized examinations, has become a focal point of research.
Objective: The aim of this study is to assess the clinical performance of ChatGPT, focusing on its accuracy and reliability in the Chinese National Medical Licensing Examination (CNMLE).
Methods: The CNMLE 2022 question set, consisting of 500 single-answer multiple-choice questions, was reclassified into 15 medical subspecialties. Each question was tested 8 to 12 times in Chinese on the OpenAI platform from April 24 to May 15, 2023. Three key factors were considered: the model version (GPT-3.5 vs GPT-4.0), the prompt's designation of system roles tailored to medical subspecialties, and repetition for coherence. The passing accuracy threshold was set at 60%. χ2 tests and κ values were used to evaluate the model's accuracy and consistency.
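The protocol above (repeated queries per question, with a subspecialty-tailored system role) can be sketched as follows. This uses the current openai Python SDK, which postdates the study period; the prompt wording, model name, and answer parsing are assumptions for illustration, not the authors' exact setup.

```python
# Sketch: repeatedly posing one multiple-choice item with a
# subspecialty system role, tallying the answers across runs.
# Prompts and scoring here are illustrative, not the study protocol.
from collections import Counter
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_repeatedly(question: str, subspecialty: str, n_runs: int = 8) -> Counter:
    """Pose one single-answer question n_runs times; return answer counts."""
    answers = Counter()
    for _ in range(n_runs):
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system",
                 "content": f"You are an expert physician in {subspecialty}. "
                            "Answer with the letter of the single best option."},
                {"role": "user", "content": question},
            ],
        )
        # Keep only the leading option letter of the reply.
        answers[response.choices[0].message.content.strip()[:1]] += 1
    return answers

# Consistency can then be summarized as the share of runs agreeing with
# the modal answer, and accuracy as the modal answer vs the answer key.
```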
Results: GPT-4.0 achieved a passing accuracy of 72.7%, significantly higher than that of GPT-3.5 (54%; P<.001). The variability rate of repeated responses from GPT-4.0 was lower than that of GPT-3.5 (9% vs 19.5%; P<.001). Both models nonetheless showed relatively good response coherence, with κ values of 0.778 and 0.610, respectively. System roles numerically increased accuracy for both GPT-4.0 (by 0.3%-3.7%) and GPT-3.5 (by 1.3%-4.5%) and reduced variability by 1.7% and 1.8%, respectively (P>.05). In subgroup analysis, ChatGPT achieved comparable accuracy across different question types (P>.05). GPT-4.0 surpassed the accuracy threshold in 14 of 15 subspecialties, whereas GPT-3.5 did so in 7 of 15 on the first response.
Conclusions: GPT-4.0 passed the CNMLE and outperformed GPT-3.5 in key areas such as accuracy, consistency, and medical subspecialty expertise. Adding a system role enhanced the model's reliability and answer coherence, although not to a statistically significant degree. GPT-4.0 showed promising potential in medical education and clinical practice and merits further study.
Background: Evaluating the accuracy and educational utility of artificial intelligence-generated medical cases, especially those produced by large language models such as ChatGPT-4 (developed by OpenAI), is crucial yet underexplored.
Objective: This study aimed to assess the educational utility of ChatGPT-4-generated clinical vignettes and their applicability in educational settings.
Methods: Using a convergent mixed methods design, a web-based survey was conducted from January 8 to 28, 2024, to evaluate 18 medical cases generated by ChatGPT-4 in Japanese. The survey used 6 main question items to evaluate the quality of the generated clinical vignettes and their educational utility: information quality, information accuracy, educational usefulness, clinical match, terminology accuracy (TA), and diagnosis difficulty. Feedback was solicited from physicians specializing in general internal medicine or general medicine with experience in medical education. Chi-square and Mann-Whitney U tests were performed to identify differences among cases, and linear regression was used to examine trends associated with physicians' years of experience. Thematic analysis of the qualitative feedback was performed to identify areas for improvement and confirm the educational utility of the cases.
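As a rough illustration of the quantitative tests named above, the following scipy-based sketch compares hypothetical ratings between two cases and applies a Bonferroni correction. The rating arrays, contingency counts, and the number of comparisons are invented for the example and do not reflect the study's data.

```python
# Sketch: the kinds of tests described in the survey analysis, with
# made-up data. Uses scipy.stats; nothing here is the study's dataset.
from scipy.stats import mannwhitneyu, chi2_contingency

# Hypothetical 5-point Likert educational-usefulness ratings
# for two generated cases.
case_a = [4, 3, 5, 4, 4, 3, 5]
case_b = [2, 3, 2, 4, 3, 2, 3]
u_stat, p_value = mannwhitneyu(case_a, case_b)

# Hypothetical binary quality judgments (satisfactory yes/no) per case.
contingency = [[55, 16],   # case A: yes, no
               [38, 33]]   # case B: yes, no
chi2, p_chi, dof, _ = chi2_contingency(contingency)

# Bonferroni correction across k comparisons: multiply each raw
# P value by k (capped at 1) before comparing to alpha = .05.
k = 18
print(min(p_value * k, 1.0), min(p_chi * k, 1.0))
```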
Results: Of the 73 invited participants, 71 (97%) responded. The respondents, primarily male (64/71, 90%), spanned a broad range of practice years (beginning practice between 1976 and 2017) and represented hospitals of diverse sizes throughout Japan. The majority deemed the information quality (mean 0.77, 95% CI 0.75-0.79) and information accuracy (mean 0.68, 95% CI 0.65-0.71) to be satisfactory, with both items rated as binary responses. On a 5-point Likert scale, the average scores were 3.55 (95% CI 3.49-3.60) for educational usefulness, 3.70 (95% CI 3.65-3.75) for clinical match, 3.49 (95% CI 3.44-3.55) for TA, and 2.34 (95% CI 2.28-2.40) for diagnosis difficulty. Statistical analysis showed significant variability in content quality and relevance across the cases (P<.001 after Bonferroni correction). Participants suggested improvements in generating physical findings, using natural language, and enhancing medical TA. The thematic analysis highlighted the need for clearer documentation, consistency of clinical information, content relevance, and patient-centered case presentations.
Conclusions: ChatGPT-4-generated medical cases written in Japanese show considerable potential as medical education resources, with adequate quality and accuracy. Nevertheless, the precision and realism of case details need enhancement. This study underscores ChatGPT-4's value as an adjunctive educational tool in medicine, one that requires expert oversight for optimal application.