Pub Date: 2026-02-17; eCollection Date: 2026-02-01; DOI: 10.1002/hcs2.70049
Mengchun Gong, Jiao Li, Yonghui Ma, Bo Jin, Wei Chen, Yan Hou, Li Hong, Tianwen Lai, Bohan Zhang, Ge Wu, Zhirong Zeng
Background: Artificial intelligence (AI) is transforming healthcare, demanding reevaluation of medical education. China's "New Medical Education" initiative urgently requires a standardized AI literacy framework for medical students to address fragmented standards, rapid technological evolution, and insufficient localized ethical norms.
Objective: To establish a Chinese expert consensus defining core AI competencies and a multi-modal assessment framework for medical students.
Methods: A multidisciplinary expert group (n = 32) spanning medical education, clinical medicine, medical AI, public health, and medical ethics developed an initial competency list based on the "Knowledge-Skills-Attitude" Medical Competency Model. Two Delphi rounds (100% response rate; consensus threshold: mean ≥ 4.0, coefficient of variation [CV] ≤ 0.25) refined the framework. Core competencies were prioritized via the Analytic Hierarchy Process (AHP). The final consensus document was established after multiple expert group meetings.
Results: The consensus defines AI literacy for medical students as a comprehensive attribute for integrating AI into professional knowledge, clinical practice, research, and health management. It comprises a 21-item Competencies of AI Proficiency (CAIP) list across knowledge (eight indicators), skills (seven indicators), and attitude (six indicators) dimensions. Key competencies prioritized include understanding AI's role in multidisciplinary knowledge integration (CAIP3), identifying AI output biases (CAIP4), understanding health data governance (CAIP2), maintaining physician-led AI-assisted diagnosis (CAIP16), and identifying AI diagnostic biases (CAIP12). A multi-modal assessment framework is recommended, including paper-based/computerized tests for knowledge, situational judgment tests (SJTs) for attitudes, and objective structured clinical examinations (OSCEs) with a specific "AI Clinical Decision Conflict Scoring Scale" for skills. A multi-stage dynamic assessment system ("Pre-enrollment-Pre-clinical-Post-clinical") is proposed for longitudinal tracking. Educational integration pathways emphasize embedding AI literacy modularly from early undergraduate years, constructing an integrated curriculum covering fundamental principles, advanced large model applications (e.g., prompt engineering, agent development), and ethical considerations, supported by a "digital twin hospital platform."
Conclusion: This consensus provides authoritative, China-specific guidance for defining and assessing medical students' AI literacy, adhering to national policies and regulations. It offers a core action framework for optimizing AI integration into medical education, fostering future healthcare professionals proficient in both AI technology and medical humanism, with a commitment to dynamic updating to adapt to evolving AI advancements.
A Chinese Expert Consensus on the Artificial Intelligence Proficiency of Medical Students: Competencies and the Multi-Modal Assessment. Mengchun Gong, Jiao Li, Yonghui Ma, Bo Jin, Wei Chen, Yan Hou, Li Hong, Tianwen Lai, Bohan Zhang, Ge Wu, Zhirong Zeng. Health Care Science 5(1):49-57. Journal Article. Published 2026-02-17. Impact factor: 3.3. DOI: https://doi.org/10.1002/hcs2.70049. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12946706/pdf/
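The Delphi consensus rule quoted in the abstract above (mean ≥ 4.0, CV ≤ 0.25) can be illustrated with a minimal sketch. The two thresholds come from the abstract; everything else is an assumption made for illustration: the function names, the use of the sample standard deviation in the CV, and the example ratings, which are hypothetical and not taken from the study.

```python
from statistics import mean, stdev

MEAN_THRESHOLD = 4.0  # from the abstract: mean >= 4.0
CV_THRESHOLD = 0.25   # from the abstract: CV <= 0.25


def coefficient_of_variation(ratings):
    """Sample standard deviation divided by the mean (assumed CV definition)."""
    return stdev(ratings) / mean(ratings)


def reaches_consensus(ratings):
    """True when an item meets both thresholds."""
    return (mean(ratings) >= MEAN_THRESHOLD
            and coefficient_of_variation(ratings) <= CV_THRESHOLD)


# Hypothetical 5-point ratings from six experts (illustrative only).
agreed = [5, 4, 5, 4, 4, 5]  # high mean, low spread
split = [5, 2, 5, 3, 5, 4]   # mean of 4.0, but experts disagree

print(reaches_consensus(agreed))  # True
print(reaches_consensus(split))   # False
```

The second item shows why both conditions matter: its mean sits exactly on the threshold, yet the spread of ratings pushes the CV above 0.25, so it would not count as consensus under this rule.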
Introduction: Chemotherapy-induced gastrointestinal symptom clusters in breast cancer impair quality of life and treatment adherence, yet lack effective interventions. While acupuncture mitigates isolated chemotherapy-induced symptoms, its mechanisms for multi-symptom clusters remain unclear. This study evaluates electroacupuncture's efficacy and explores its biological mechanisms in managing these clusters.
Methods: This prospective, multicenter, block-randomized, double-blind, sham-controlled trial will enroll 388 patients with breast cancer undergoing neoadjuvant/adjuvant chemotherapy, to be randomly assigned (1:1) to electroacupuncture or sham electroacupuncture groups. Both groups will receive the standard quadruple antiemetic regimen combined with electroacupuncture or sham intervention. The primary endpoint is the incidence of chemotherapy-induced gastrointestinal symptom clusters within 120 h after chemotherapy. Secondary endpoints include improvement in gastrointestinal symptom clusters post-first chemotherapy cycle, nausea-free rates during acute and delayed phases, vomiting-free rates during overall, acute, and delayed phases, complete response rate, complete protection rate, and quality of life. Adverse events will be documented throughout the study.
Discussion: This study will assess the efficacy and safety of electroacupuncture in alleviating chemotherapy-induced gastrointestinal symptom clusters in patients with breast cancer. By integrating multi-omics analyses, we aim to elucidate the biological mechanisms underlying its therapeutic effects. The findings may offer a robust clinical foundation for optimizing symptom cluster management in cancer care. Trial Registration: Clinical Trials ID: NCT06952920. Date of registration: April 16, 2025. Prospectively registered. URL of Trial Registry Record: https://clinicaltrials.gov/study/NCT06952920cond=NCT06952920&rank=1.
Electroacupuncture for Managing Chemotherapy-Induced Gastrointestinal Symptom Clusters in Patients With Breast Cancer: Study Protocol for a Randomized Controlled Trial. Xinlong Tao, Zhen Liu, Miaozhou Wang, Dengfeng Ren, Fuxing Zhao, Hongbin Wang, Guowang Yang, Ganlin Zhang, Zitao Li, Zhilin Liu, Shifen Huang, Yongzhi Chen, Mengting Da, Xiaoyan Ma, Hongxia Liang, Yongxin Li, Yinyin Ye, Yonghui Zheng, Xiao Liang, Guoshuang Shen, Xiaorong Bai, Jiuda Zhao. Health Care Science 5(1):85-94. Journal Article. Published 2026-02-17. Impact factor: 3.3. DOI: https://doi.org/10.1002/hcs2.70056. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12946714/pdf/
Pub Date: 2026-02-15; eCollection Date: 2026-02-01; DOI: 10.1002/hcs2.70051
You Wu, Haibo Wang, Zongjiu Zhang
Three Shifts That Will Redefine Health Systems. You Wu, Haibo Wang, Zongjiu Zhang. Health Care Science 5(1):1-3. Journal Article. Published 2026-02-15. Impact factor: 3.3. DOI: https://doi.org/10.1002/hcs2.70051. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12946704/pdf/
Background: The World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0) is a popular tool for evaluating functioning and disability in a range of population demographics and medical situations. However, very little is known about the WHODAS 2.0's validity and reliability, particularly when dealing with potentially life-threatening maternal conditions (PLTCs). The aim of this study was to evaluate the validity of the WHODAS 2.0 Tigrigna version.
Methods: This cross-sectional study was conducted in Tigray, northern Ethiopia, from December 15 to 20, 2023. Following translation and back translation, the Tigrigna version of the 36-item WHODAS 2.0 was administered to women 6 months after childbirth who had experienced PLTCs during a recent pregnancy, childbirth, or postpartum period. In total, 121 women with a history of PLTCs participated. Cronbach's α was used to evaluate internal consistency in all six WHODAS 2.0 domains, while Spearman's correlation coefficient was used to evaluate convergent validity. Construct validity was also examined using confirmatory factor analysis.
Results: All domain scores of the Tigrigna version of the WHODAS 2.0 indicated excellent internal consistency (α = 0.917-0.978 for 36 items and α = 0.874-0.940 for 12 items), while the Cronbach's α coefficients for the summary score were 0.981 and 0.952 for 36 and 12 items, respectively. The convergent validity between the 36-item and 12-item WHODAS 2.0 showed a strong correlation between similar constructs (r = 0.909-0.981).
Conclusion: Despite the small sample size, the Tigrigna version of the WHODAS 2.0 showed acceptable reliability and validity and could therefore be applied to women with a history of PLTCs at 6 months postpartum.
Validation of the World Health Organization Disability Assessment Schedule-II for Measuring Women With a History of Potentially Life-Threatening Maternal Conditions at Six Months Postpartum in Tigray, Northern Ethiopia. Fitiwi Tinsae Baykemagn, Girmatsion Fisseha Abreha, Yibrah Berhe Zelelow, Alemayehu Bayray Kahsay. Health Care Science 5(1):29-39. Journal Article. Published 2026-02-11. Impact factor: 3.3. DOI: https://doi.org/10.1002/hcs2.70054. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12946705/pdf/
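The internal-consistency statistic reported in the WHODAS 2.0 abstract above, Cronbach's α, follows the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The sketch below is only an illustration of that formula: the function name and the data layout (one list of scores per item) are assumptions, and the example scores are hypothetical rather than drawn from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items.

    `items` is a list of k lists; each inner list holds one item's scores,
    ordered by respondent (assumed layout, for illustration only).
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    item_variance_sum = sum(variance(scores) for scores in items)
    # Total score per respondent: sum across items for each respondent.
    totals = [sum(per_respondent) for per_respondent in zip(*items)]
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores for three items answered by five respondents.
example_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(round(cronbach_alpha(example_items), 3))
```

When every item ranks respondents identically, the items' variances account for as little of the total variance as possible and α approaches 1, which is the direction the reported domain values (0.874 to 0.978) point in.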
Background: Large language models (LLMs) have shown considerable promise in supporting clinical decision-making. However, their adoption and evaluation in dermatology remain limited. This study aimed to explore the preferences of Chinese dermatologists regarding LLM-generated responses in clinical psoriasis scenarios and to assess how they prioritize key quality dimensions, including accuracy, traceability, and logicality.
Methods: A cross-sectional, web-based survey was conducted between December 25, 2024, and January 22, 2025, following the Checklist for Reporting Results of Internet E-Surveys guidelines. A total of 1247 valid responses were collected from practicing dermatologists across 33 of China's provincial-level administrative divisions. Participants evaluated responses to five categories of clinical questions (etiology, clinical presentation, differential diagnosis, treatment, and case study) generated by five LLMs: ChatGPT-4o, Kimi.ai, Doubao, ZuoYiGPT, and Lingyi-agent. Statistical associations between participant characteristics and model preferences were examined using chi-square tests.
Results: ChatGPT-4o (Model 1) emerged as the most preferred model across all clinical tasks, consistently receiving the highest number of votes in case study (n = 740), clinical presentation (n = 666), differential diagnosis (n = 707), etiology (n = 602), and treatment (n = 656). Significant variation in model preference by professional title was observed only for the differential diagnosis task (χ² = 21.13, df = 12, p = 0.0485), while no significant differences were found across hospital tiers (p > 0.05). In terms of evaluation dimensions, accuracy was most frequently rated as "very important" (n = 635). A significant association existed between hospital tier and the most valued dimension (χ² = 27.667, df = 9, p = 0.0011), with dermatologists in primary hospitals prioritizing traceability more than their peers in higher-tier hospitals. No significant associations were found across professional titles (p = 0.127).
Conclusions: Chinese dermatologists show a strong preference for ChatGPT-4o over domestic LLMs in psoriasis-related clinical tasks. While accuracy remains the primary criterion, traceability and logicality are also critical, particularly for clinicians in lower-tier hospitals. These findings suggest that future clinical LLMs should prioritize not only content accuracy but also source transparency and structural clarity to meet the diverse needs of different clinical settings.
Preferences of Chinese Dermatologists for Large Language Model Responses in Clinical Psoriasis Scenarios: A Nationwide Cross-Sectional Survey in China. Jungang Yang, Jingkai Xu, Xuejiao Song, Chengxu Li, Lili Chen, Lingbo Bi, Tingting Jiang, Xianbo Zuo, Yong Cui. Health Care Science 5(1):40-48. Journal Article. Published 2026-02-11. Impact factor: 3.3. DOI: https://doi.org/10.1002/hcs2.70057. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12946707/pdf/
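The chi-square result reported for the differential diagnosis task in the dermatology survey above (χ² = 21.13, df = 12, p = 0.0485) can be checked by hand, because the upper-tail probability of a chi-square distribution has a closed form when the degrees of freedom are even. The sketch below is such a check and not the authors' analysis code; the function name is hypothetical, and the closed form does not cover the odd-df result (df = 9) in the same abstract.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X >= x) for a chi-square variable with an even number of degrees of freedom.

    Uses the identity P(X >= x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!,
    which holds only for even df.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


p_value = chi2_upper_tail_even_df(21.13, 12)
print(round(p_value, 4))  # close to the reported 0.0485
```

The computed value sits just under 0.05, which is consistent with the abstract treating this as a borderline significant difference.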
Pub Date: 2026-02-09; eCollection Date: 2026-02-01; DOI: 10.1002/hcs2.70055
Muna Salahat, Ali Saleh
Background: The effective delivery of nursing care is crucial in hospital settings because it directly affects patient outcomes. However, nursing care can be missed because of various factors, including inadequate teamwork among nursing staff. Understanding the interplay between missed nursing care and nursing teamwork is essential for enhancing care quality in inpatient settings. This study therefore explored the relationship between missed nursing care and nursing teamwork among registered nurses in hospital inpatient units.
Methods: A descriptive, correlational, cross-sectional study was conducted, involving 375 registered nurses from four hospitals in three healthcare sectors in Jordan. Missed nursing care and nursing teamwork were measured using the Missed Nursing Care Survey and the Nursing Teamwork Survey. Data collection occurred between September and October 2024, with convenience sampling used for participant recruitment. Descriptive and inferential statistics, including mean, standard deviation, percentage, frequency, and Pearson's r correlation coefficient, were used to analyze the data.
Results: The overall average missed nursing care score was 2.35 out of 5, suggesting that nursing care is rarely missed. The most frequently missed care activities reported by registered nurses included attending interdisciplinary care conferences, providing mouth care, and ambulating patients three times daily or as ordered. Activities least often missed included medication administration within 30 min of the scheduled time, assessing vital signs as ordered, and performing patient assessments each shift. The overall mean score for nursing teamwork was 3.5 out of 5 (standard deviation = 1.06). There was a moderate but significant negative correlation between missed nursing care and nursing teamwork (r = -0.310, p < 0.001).
Conclusions: The results underscore the urgent need for targeted interventions to enhance resource allocation and teamwork, ultimately reducing missed nursing care and improving patient outcomes. Addressing these areas will foster a more effective healthcare system and enable nursing professionals to consistently deliver high-quality care.
The Relationship Between Missed Nursing Care and Nursing Teamwork in Jordan. Muna Salahat, Ali Saleh. Health Care Science 5(1):58-73. Journal Article. Published 2026-02-09. Impact factor: 3.3. DOI: https://doi.org/10.1002/hcs2.70055. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12946711/pdf/
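The correlation reported in the nursing teamwork abstract above (r = -0.310, p < 0.001) can be related to its significance level through the usual t statistic for a Pearson coefficient, t = r · sqrt((n − 2) / (1 − r²)). The sketch below assumes that all 375 nurses contributed to the correlation, which the abstract does not state explicitly, and the function name is hypothetical.

```python
import math


def t_statistic_for_r(r, n):
    """t statistic for testing a Pearson correlation against zero (n - 2 df)."""
    if n <= 2 or not -1.0 < r < 1.0:
        raise ValueError("need n > 2 and -1 < r < 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# r from the abstract; n = 375 is an assumption (full sample used).
t_value = t_statistic_for_r(-0.310, 375)
print(round(t_value, 2))  # -6.3
```

A t statistic of roughly -6.3 on 373 degrees of freedom lies far beyond the two-sided 0.001 critical value (about 3.3 in absolute terms), so a modest correlation of this size is still clearly distinguishable from zero at this sample size.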
Large language models (LLMs) show considerable potential to revolutionize healthcare through their performance across diverse clinical applications. Given the inherent constraints of LLMs and the critical nature of medical practice, a rigorous and systematic evaluation of their medical competence is imperative. This study presents a comprehensive review of the established methodologies and benchmarks for evaluating the medical competence of LLMs, encompassing a thorough analysis of current assessment practices across medical knowledge, clinical practice competence, and ethical-safety considerations. By integrating clinician competency assessment frameworks into LLM evaluation, we propose a structured tri-dimensional framework that systematically organizes existing evaluation approaches according to medical theoretical knowledge, clinical practice ability, and ethical-safety considerations. Furthermore, this research provides critical insights into future developmental trajectories while establishing foundational frameworks and standardization protocols for the integration of LLMs into medical practice.
A Survey on Medical Competence Evaluation Benchmarks for Large Language Models. Qiting Wang, Huiru Zou, Haobin Zhang, Yongshun Huang, Junzhang Tian, Weibin Cheng. Health Care Science 5(1):4-18. Journal Article. Published 2026-02-09. Impact factor: 3.3. DOI: https://doi.org/10.1002/hcs2.70050. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12946712/pdf/
Pub Date: 2026-01-28; eCollection Date: 2026-02-01; DOI: 10.1002/hcs2.70048
Ajibola B Bakare, Young Lee, Jhuree Hong, Claus-Peter Richter, Jonathan P Kuriakose
Background: This study assessed the effectiveness of ChatGPT and Bard in the initial identification of articles for Otolaryngology-Head and Neck Surgery systematic literature reviews.
Methods: Three PRISMA-based systematic reviews (Jabbour et al. 2017, Wong et al. 2018, and Wu et al. 2021) were replicated using ChatGPTv3.5 and Bard. Outputs (author, title, publication year, and journal) were compared to the original references and cross-referenced with medical databases for authenticity and recall.
Results: Several themes emerged when comparing Bard and ChatGPT across the three reviews. Bard generated more outputs and had greater recall in Wong et al.'s review, with a broader date range in Jabbour et al.'s review. In Wu et al.'s review, ChatGPT-2 had higher recall and identified more authentic outputs than Bard-2.
Conclusion: Large language models (LLMs) failed to fully replicate peer-reviewed methodologies, producing outputs with inaccuracies but identifying relevant, especially recent, articles missed by the references. While human-led PRISMA-based reviews remain the gold standard, refining LLMs for literature reviews shows potential.
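The recall and authenticity comparison described in the Methods can be illustrated with a minimal sketch. It assumes that each citation is reduced to a normalized title string, that recall is measured against the original review's reference list, and that an output counts as authentic when its title is found in a medical database. The function names, the matching rule, and the sample titles are hypothetical and are not taken from the study.

```python
import re


def normalize(title: str) -> str:
    """Lowercase and collapse punctuation so minor formatting differences do not block a match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def recall(llm_titles, reference_titles):
    """Fraction of the original review's references that the LLM output reproduced."""
    refs = {normalize(t) for t in reference_titles}
    if not refs:
        return 0.0
    found = {normalize(t) for t in llm_titles} & refs
    return len(found) / len(refs)


def authenticity_rate(llm_titles, database_titles):
    """Fraction of LLM outputs that correspond to a real database record."""
    outputs = [normalize(t) for t in llm_titles]
    if not outputs:
        return 0.0
    db = {normalize(t) for t in database_titles}
    return sum(t in db for t in outputs) / len(outputs)


# Hypothetical data: four references in the original review, three LLM outputs.
reference = ["Study A: Outcomes", "Study B", "Study C", "Study D"]
database = reference + ["Recent Study E"]
llm_output = ["study a - outcomes", "Study B", "Invented Study"]

print(recall(llm_output, reference))             # share of references recovered
print(authenticity_rate(llm_output, database))   # share of outputs that are real
```

Exact title matching is the simplest possible rule. A real comparison would likely also check authors, year, and journal, as the Methods indicate.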
Assessing Large Language Models for Early Article Identification in Otolaryngology-Head and Neck Surgery Systematic Reviews. Health Care Science 5(1):19-28. Published 2026-01-28. DOI: 10.1002/hcs2.70048. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12946713/pdf/
Pub Date: 2026-01-15; eCollection Date: 2026-02-01; DOI: 10.1002/hcs2.70047
Lan Lan, Jin Yin, Haohan Zhang, Hua Jiang, Rui Qin, Xia Zhao, Yu Zhang, Yilong Wang, Jiajun Qiu
Background: Studies have shown that heart rate variability (HRV) is a predictor of the prognosis of cardiovascular diseases. Contact heartbeat monitoring equipment is widely used, especially in hospitals, and benefits from the rapidity and accuracy of the detection of physiological health indicators. However, long-term contact with equipment has many adverse effects. The purpose of this study was to improve the accuracy of HRV detection via noncontact equipment, thus enabling HRV to be assessed in various scenarios.
Methods: A novel deep learning approach was proposed for measuring heartbeats through camera videos. First, we performed facial segmentation and divided the face into 16 grid cells with different light balance scores. After the trend was filtered with a Hamming window, a transformer-based neural network was used to further filter the signal. Finally, heart rate (HR) and HRV were estimated.
Results: We used 1 million synthetic data points for pretraining and a public dataset in combination with a dataset that we constructed for task training. The final results were obtained on a test dataset that we constructed. The accuracy for HR with a low light balance score (0.867-0.983) was greater than that with a high score (0.667-0.750). Our method estimated HR more accurately than traditional filtering methods (0.167-0.417) and state-of-the-art neural network filtering methods (0.783-0.917). Compared with two neural networks, our method yielded the lowest root mean square error for time-domain HRV and the highest correlation index score for frequency-domain HRV.
Conclusions: Light balance, large sample training, and two-stage training can improve the accuracy of HRV estimation.
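To make the reported quantities concrete, the sketch below computes HR and two common time-domain HRV measures from a list of inter-beat intervals, plus the root mean square error between estimated and reference values. The abstract does not say which time-domain HRV measures were used, so SDNN and RMSSD are assumptions chosen because they are standard. The function names and sample intervals are hypothetical.

```python
import math


def mean_hr_bpm(ibi_ms):
    """Mean heart rate in beats per minute from inter-beat intervals in milliseconds."""
    return 60000.0 / (sum(ibi_ms) / len(ibi_ms))


def sdnn(ibi_ms):
    """Sample standard deviation of the inter-beat intervals (ms)."""
    mean = sum(ibi_ms) / len(ibi_ms)
    return math.sqrt(sum((x - mean) ** 2 for x in ibi_ms) / (len(ibi_ms) - 1))


def rmssd(ibi_ms):
    """Root mean square of successive differences between adjacent intervals (ms)."""
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def rmse(estimated, reference):
    """Root mean square error between paired estimated and reference values."""
    if len(estimated) != len(reference):
        raise ValueError("estimated and reference must have the same length")
    return math.sqrt(
        sum((e - r) ** 2 for e, r in zip(estimated, reference)) / len(reference)
    )


# Hypothetical inter-beat intervals (ms), e.g. from a filtered video signal.
intervals = [800, 810, 790, 800]
print(mean_hr_bpm(intervals))  # 75.0
print(sdnn(intervals))
print(rmssd(intervals))
```

In an evaluation of this kind, `rmse` would be applied to HRV values estimated from video against values from a contact reference device, one pair per recording.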
A Deep Neural Network Based on Two-Stage Training for Estimating Heart Rate Variability From Camera Videos. Health Care Science 5(1):74-84. Published 2026-01-15. DOI: 10.1002/hcs2.70047. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12946708/pdf/
Jingya Zhang, Haoran Li, Ning Zhang, Ying Mao, Bin Zhu
A series of measures to implement national volume-based procurement (NVBP) and follow-on NVBP in China have significantly reduced insulin prices and increased patient affordability. However, NVBP may lead to a higher burden of insulin-related consumables (such as injection pens and needles), which might discourage patients from using insulin in the pooled list and increase the risk of needle reuse. This article emphasizes that NVBP must be implemented for both drugs and consumables to help achieve universal insulin access.
Insulin-Related Consumables Should not be Ignored While Pooling Insulin Purchases: Experience From China. Health Care Science 4(6):410-413. Published 2025-12-15. DOI: 10.1002/hcs2.70046. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12728673/pdf/