
JMIR Formative Research: Latest Articles

Longitudinal Changes in Pitch-Related Acoustic Characteristics of the Voice Throughout the Menstrual Cycle: Observational Study.
IF 2 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-01-09 DOI: 10.2196/65448
Jaycee Kaufman, Jouhyun Jeon, Jessica Oreskovic, Anirudh Thommandram, Yan Fossat

Background: Identifying subtle changes in the menstrual cycle is crucial for effective fertility tracking and understanding reproductive health.

Objective: The aim of the study is to explore how fundamental frequency features vary between menstrual phases using daily voice recordings.

Methods: This study analyzed smartphone-collected voice recordings from 16 naturally cycling female participants, collected every day for 1 full menstrual cycle. Fundamental frequency features (mean, SD, 5th percentile, and 95th percentile) were extracted from each voice recording. Ovulation was estimated using luteinizing hormone urine tests taken every morning. The analysis included comparisons of these features between the follicular and luteal phases and the application of changepoint detection algorithms to assess changes and pinpoint the day in which the shifts in vocal pitch occur.

Results: The fundamental frequency SD was 9.0% (SD 2.9%) lower in the luteal phase compared to the follicular phase (95% CI 3.4%-14.7%; P=.002), and the 5th percentile of the fundamental frequency was 8.8% (SD 3.6%) higher (95% CI 1.7%-16.0%; P=.01). No significant differences were found between phases in mean fundamental frequency or the 95th percentile of the fundamental frequency (P=.65 and P=.07). Changepoint detection, applied separately to each feature, identified the point in time when vocal frequency behaviors shifted. For the fundamental frequency SD and 5th percentile, 81% (n=13) of participants exhibited shifts within the fertile window (P=.03). In comparison, only 63% (n=10; P=.24) and 50% (n=8; P=.50) of participants had shifts in the fertile window for the mean and 95th percentile of the fundamental frequency, respectively.

Conclusions: These findings indicate that subtle variations in vocal pitch may reflect changes associated with the menstrual cycle, suggesting the potential for developing a noninvasive and convenient method for monitoring reproductive health. Changepoint detection may provide a promising avenue for future work in longitudinal fertility analysis.
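
The changepoint step described in the Methods can be illustrated with a minimal sketch. The abstract does not name the algorithm the authors used, so the code below assumes a generic least-squares, single mean-shift detector; the function name `single_changepoint` and the daily values are hypothetical and invented for illustration only.

```python
from typing import Sequence


def single_changepoint(values: Sequence[float]) -> int:
    """Return the index k that best splits values into two constant-mean segments.

    values[:k] and values[k:] each get their own mean, and k minimizes the
    total squared error. This is a generic mean-shift detector, not
    necessarily the algorithm used in the study.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")

    def sse(segment: Sequence[float]) -> float:
        # Sum of squared deviations from the segment mean.
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    best_k, best_cost = 1, float("inf")
    for k in range(1, n):  # both segments must be non-empty
        cost = sse(values[:k]) + sse(values[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Hypothetical daily fundamental frequency SD values (Hz); not study data.
daily_f0_sd = [22.1, 21.8, 22.4, 22.0, 21.9, 19.8, 20.1, 19.9, 20.2, 20.0]
shift_day = single_changepoint(daily_f0_sd)  # index of the first day after the shift
```

In a per-participant analysis, the returned index could then be compared with the fertile window estimated from the luteinizing hormone tests to check whether the shift falls inside it.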

JMIR Formative Research. 2025;9:e65448. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11737864/pdf/
Citations: 0
User Outcomes for an App-Delivered Hypnosis Intervention for Menopausal Hot Flashes: Retrospective Analysis.
IF 2 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-01-09 DOI: 10.2196/63948
Katherine Scheffrahn, Claire Hall, Vanessa Muñiz, Gary Elkins

Background: Hypnotherapy has been shown to be a safe, nonhormonal intervention effective for treating menopausal hot flashes. However, women experiencing hot flashes may face accessibility barriers to in-person hypnotherapy. To solve this issue, a smartphone app has been created to deliver hypnotherapy. The Evia app delivers audio-recorded hypnotherapy and has the potential to help individuals experiencing hot flashes.

Objective: This study aims to determine user outcomes in hot flash frequency and severity for users of the Evia app.

Methods: This study is a retrospective analysis of a dataset of Evia app users. Participants were divided into 2 groups for analysis. The first group reported daytime hot flashes and night sweats, while the second group was asked to report only daytime hot flashes. The participants in the first group (daytime hot flashes and night sweats) were 139 women with ≥3 daily hot flashes who downloaded the Evia app between November 6, 2021, and June 9, 2022, with a baseline mean of 8.330 (SD 3.977) daily hot flashes. The participants in the second group (daytime hot flashes) were 271 women with ≥3 daily hot flashes who downloaded the Evia app between June 10, 2022, and February 5, 2024, with a baseline mean of 6.040 (SD 3.282) daily hot flashes. The Evia program ran for 5 weeks for all participants, with daily tasks such as educational readings, hypnotic inductions, and daily hot-flash tracking. The app uses audio-recorded hypnosis and mental imagery for coolness, such as imagery for a cool breeze, snow, or calmness.

Results: A clinically significant reduction, defined as a 50% reduction, in daily hot flashes was experienced by 76.3% (106/139) of the women with hot flashes and night sweats and 56.8% (154/271) of the women with daily hot flashes from baseline to their last logged Evia app survey. On average, the women with hot flashes and night sweats experienced a reduction of 61.4% (SD 33.185%) in their daytime and nighttime hot flashes while using the Evia app, and the women with daily hot flashes experienced a reduction of 45.2% (SD 42.567%) in their daytime hot flashes. In both groups, there was a large, statistically significant difference in the average number of daily hot flashes from baseline to end point (women with hot flashes and night sweats: Cohen d=1.28; t(138)=15.055; P<.001; women with daily hot flashes: Cohen d=0.82; t(270)=13.555; P<.001).

Conclusions: Hypnotherapy is an efficacious intervention for hot flashes, with the potential to improve women's lives by reducing hot flashes without hormonal or pharmacological intervention. This study takes the first step in evaluating the efficacy of an app-delivered hypnosis intervention for menopausal hot flashes, demonstrating that the Evia app provides a promising app delivery of hypnotherapy with the potential to increase the accessibility of hypnotherapy.
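
The effect sizes and responder rates in the Results can be checked with simple arithmetic. The abstract does not say which Cohen d variant was used; the sketch below assumes the paired-samples form d = t / sqrt(n), which happens to reproduce the reported values. The function names are hypothetical.

```python
import math


def cohen_d_from_paired_t(t_stat: float, n_pairs: int) -> float:
    """Cohen d for paired data, assuming d = t / sqrt(n) (an assumption here)."""
    return t_stat / math.sqrt(n_pairs)


def percent(count: int, total: int) -> float:
    """Share of `count` in `total`, expressed as a percentage."""
    return 100.0 * count / total


# Values taken from the abstract: t(138)=15.055 with n=139, t(270)=13.555 with n=271.
d_group1 = round(cohen_d_from_paired_t(15.055, 139), 2)  # 1.28
d_group2 = round(cohen_d_from_paired_t(13.555, 271), 2)  # 0.82

# Responders with at least a 50% reduction in daily hot flashes.
responders_group1 = round(percent(106, 139), 1)  # 76.3
responders_group2 = round(percent(154, 271), 1)  # 56.8
```

Both groups lack a control arm, so these effect sizes describe within-person change from baseline rather than a treatment effect relative to a comparator.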
JMIR Formative Research. 2025;9:e63948. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11757980/pdf/
Citations: 0
Impact of a Virtual Care Navigation Service on Member-Reported Outcomes Among Lesbian, Gay, Bisexual, Transgender, and Queer Populations: Case Study.
IF 2 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-01-09 DOI: 10.2196/64137
Seul Ki Choi, Jaclyn Marshall, Patrina Sexton Topper, Andrew Pregnall, José Bauermeister

Background: While the significance of care navigation in facilitating access to health care within the lesbian, gay, bisexual, transgender, queer, and other (LGBTQ+) communities has been acknowledged, there is limited research examining how care navigation influences an individual's ability to understand and access the care they need in real-world settings. By analyzing private sector data, we can bridge the gap between theoretical research findings and practical applications, ultimately informing both business strategies and public policy with evidence grounded in real-world efficacy.

Objective: The objective of this study was to evaluate the impact of specialized virtual care navigation services on LGBTQ+ individuals' ability to comprehend and access necessary care within a national cohort of commercially insured members.

Methods: This case study is based on the experience of commercially insured members, aged 18 or older, who used the LGBTQ+ Health Care Navigation (LGBTQ+ Navigation) service by Included Health between January 26 and July 31, 2023. Care coordinators assisted members by connecting them with vetted identity-affirming in-network providers, helping them navigate and understand their LGBTQ+ health benefits, and providing education and advocacy for clinical and nonclinical needs. We examined the impact of navigation on 5 member-reported outcomes. In addition to reporting the proportion who agreed or strongly agreed, we calculated an impact score for each respondent by averaging the numerical values assigned to all 5 question responses (1=strongly disagree to 5=strongly agree). We used ANOVA with Tukey post hoc tests and t tests to explore the relationships between the impact score and member characteristics, including optional self-reported demographics.

Results: Out of 4703 LGBTQ+ Navigation cases, 7.53% (n=354) had member-reported outcomes. A large majority of LGBTQ+ members agreed or strongly agreed that care navigation resulted in less stress (315/354, 89%), less care avoidance (305/354, 86.2%), higher confidence in finding an identity-affirming provider (327/354, 92.4%), improved ability to comprehend health care information (312/354, 88.1%), and improved ability to engage with providers (308/354, 87%). The average impact score was 4.44 (SD 0.69), with statistically significant differences by gender identity (P=.003), race (P=.01), ethnicity (P=.008), and pronouns (P=.02). The scores were highest for members with multiple gender identities (mean 4.56, SD 0.37), and members who did not provide their race, ethnicity, or their pronouns (mean 4.55, SD 0.64). Impact scores were lowest for transgender members (mean 4.11, SD 0.95).

Conclusions: The LGBTQ+ Navigation service, by enhancing members' comprehension and use of necessary care, demonstrates potential public health utility and value. Continuous evaluation of navigation services can serve as a complementary tool for employers seeking to promote health equity and improve employees' sense of belonging. This is particularly important given the persistent discrimination and stigma against LGBTQ+ communities in the United States. Scalable, system-level changes using navigation services are therefore essential to reach a larger proportion of the LGBTQ+ population.
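
The impact score described in the Methods (the mean of 5 Likert responses per respondent) and the share who agreed or strongly agreed can be sketched as follows. The function names and the sample responses are hypothetical; only the 1 to 5 coding and the averaging rule come from the abstract.

```python
from statistics import mean
from typing import Sequence


def impact_score(responses: Sequence[int]) -> float:
    """Average of one respondent's 5 answers (1=strongly disagree, 5=strongly agree)."""
    if len(responses) != 5:
        raise ValueError("expected exactly 5 question responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("responses must be coded 1 to 5")
    return float(mean(responses))


def share_agreeing(item_responses: Sequence[int]) -> float:
    """Fraction of respondents who answered 4 (agree) or 5 (strongly agree) on one item."""
    return sum(1 for r in item_responses if r >= 4) / len(item_responses)


# Hypothetical respondents, invented for illustration; not study data.
respondents = [
    [5, 4, 5, 4, 4],
    [3, 4, 4, 5, 4],
    [5, 5, 5, 5, 5],
]
scores = [impact_score(r) for r in respondents]

# Share agreeing on the first question across the hypothetical respondents.
first_item = [r[0] for r in respondents]
first_item_agreement = share_agreeing(first_item)
```

Per-respondent scores computed this way are what a group comparison such as ANOVA would take as input, with the demographic category as the grouping factor.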
JMIR Formative Research. 2025;9:e64137. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11737804/pdf/
Citations: 0
Proficiency, Clarity, and Objectivity of Large Language Models Versus Specialists' Knowledge on COVID-19 Impacts in Pregnancy: A Cross-Sectional Pilot Study.
IF 2 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-01-09 DOI: 10.2196/56126
Nicola Bragazzi, Michèle Buchinger, Hisham Atwan, Ruba Tuma, Francesco Chirico, Lukasz Szarpak, Raymond Farah, Rola Khamisy-Farah

Background: The COVID-19 pandemic has significantly strained healthcare systems globally, leading to an overwhelming influx of patients and exacerbating resource limitations. Concurrently, an "infodemic" of misinformation, particularly prevalent in women's health, has emerged. This challenge has been pivotal for healthcare providers, especially gynecologists and obstetricians, in managing pregnant women's health. The pandemic heightened risks for pregnant women from COVID-19, necessitating balanced advice from specialists on vaccine safety versus known risks. Additionally, the advent of generative Artificial Intelligence (AI), such as large language models (LLMs), offers promising support in healthcare. However, these tools require rigorous testing.

Objective: To assess LLMs' proficiency, clarity, and objectivity regarding COVID-19 impacts in pregnancy.

Methods: This study evaluates four major AI prototypes (ChatGPT-3.5, ChatGPT-4, Microsoft Copilot, and Google Bard) using zero-shot prompts in a questionnaire validated among 159 Israeli gynecologists and obstetricians. The questionnaire assesses proficiency in providing accurate information on COVID-19 in relation to pregnancy. Text-mining, sentiment analysis, and readability (Flesch-Kincaid Grade Level and Flesch Reading Ease Score) analyses were also conducted.

Results: In terms of LLMs' knowledge, ChatGPT-4 and Microsoft Copilot each scored 97% (n=32/33), Google Bard 94% (n=31/33), and ChatGPT-3.5 82% (n=27/33). ChatGPT-4 incorrectly stated an increased risk of miscarriage due to COVID-19. Google Bard and Microsoft Copilot had minor inaccuracies concerning COVID-19 transmission and complications. In the sentiment analysis, Microsoft Copilot achieved the least negative score (-4), followed by ChatGPT-4 (-6) and Google Bard (-7), while ChatGPT-3.5 obtained the most negative score (-12). Finally, concerning the readability analysis, Flesch-Kincaid Grade Level and Flesch Reading Ease Score showed that Microsoft Copilot was the most accessible at 9.9 and 49, followed by ChatGPT-4 at 12.4 and 37.1, while ChatGPT-3.5 (12.9 and 35.6) and Google Bard (12.9 and 35.8) generated particularly complex responses.

Conclusions: The study highlights varying knowledge levels of LLMs in relation to COVID-19 and pregnancy. ChatGPT-3.5 showed the least knowledge and alignment with scientific evidence. Readability and complexity analyses suggest that each AI's approach was tailored to specific audiences, with ChatGPT versions being more suitable for specialized readers and Microsoft Copilot for the general public. Sentiment analysis revealed notable variations in the way LLMs communicated critical information, underscoring the essential role of neutral and objective healthcare communication in ensuring that pregnant women, particularly vulnerable during the COVID-19 pandemic, receive accurate and reassuring guidance. Overall, ChatGPT-4, Microsoft Copilot, and Google Bard generally provided accurate, up-to-date information on COVID-19 and vaccines in maternal and fetal health, in line with health guidelines. The study demonstrates the potential role of AI in supplementing healthcare knowledge, which requires continuous updating and verification of AI knowledge bases. The choice of AI tool should consider the target audience and the required level of information detail.
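
The two readability metrics named in the Methods follow standard published formulas, sketched below. The functions take precomputed counts because syllable counting is tool-dependent and the abstract does not say which software the authors used; the function names and the example counts are hypothetical.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease Score: higher values indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade needed to read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for one response: 100 words, 5 sentences, 150 syllables.
ease = flesch_reading_ease(100, 5, 150)
grade = flesch_kincaid_grade(100, 5, 150)

# Knowledge scores from the abstract, as rounded percentages of 33 items.
knowledge_pct = {name: round(100 * correct / 33) for name, correct in {
    "ChatGPT-4": 32, "Microsoft Copilot": 32, "Google Bard": 31, "ChatGPT-3.5": 27,
}.items()}
```

The two metrics move in opposite directions, which is why the most accessible model in the Results has the lowest grade level (9.9) and the highest reading ease score (49).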
Citations: 0
Comparative Evaluation of Consumer Wearable Devices for Atrial Fibrillation Detection: Validation Study.
IF 2 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-01-09 DOI: 10.2196/65139
Femke Wouters, Henri Gruwez, Christophe Smeets, Anessa Pijalovic, Wouter Wilms, Julie Vranken, Zoë Pieters, Hugo Van Herendael, Dieter Nuyens, Maximo Rivero-Ayerza, Pieter Vandervoort, Peter Haemers, Laurent Pison

Background: Consumer-oriented wearable devices (CWDs) such as smartphones and smartwatches have gained prominence for their ability to detect atrial fibrillation (AF) through proprietary algorithms using electrocardiography or photoplethysmography (PPG)-based digital recordings. Despite numerous individual validation studies, a direct comparison of interdevice performance is lacking.

Objective: This study aimed to evaluate and compare the ability of CWDs to distinguish between sinus rhythm and AF.

Methods: Patients exhibiting sinus rhythm or AF were enrolled through a cardiology outpatient clinic. The participants were instructed to perform heart rhythm measurements using a handheld 6-lead electrocardiogram (ECG) device (KardiaMobile 6L), a smartwatch-derived single-lead ECG (Apple Watch), and two PPG-based smartphone apps (FibriCheck and Preventicus) in a random sequence, with simultaneous 12-lead reference ECG as the gold standard.

Results: A total of 122 participants were included in the study: median age 69 (IQR 61-77) years, 63.9% (n=78) men, 25% (n=30) with AF, 9.8% (n=12) without prior smartphone experience, and 73% (n=89) without experience in using a smartwatch. The sensitivity to detect AF was 100% for all devices. The specificity to detect sinus rhythm was 96.4% (95% CI 89.5%-98.8%) for KardiaMobile 6L, 97.8% (95% CI 91.6%-99.5%) for Apple Watch, 98.9% (95% CI 92.5%-99.8%) for FibriCheck, and 97.8% (95% CI 91.5%-99.4%) for Preventicus (P=.50). Insufficient quality measurements were observed in 10.7% (95% CI 6.3%-17.5%) of cases for both KardiaMobile 6L and Apple Watch, 7.4% (95% CI 3.9%-13.6%) for FibriCheck, and 14.8% (95% CI 9.5%-22.2%) for Preventicus (P=.21). Participants preferred Apple Watch over the other devices to monitor their heart rhythm.

Conclusions: In this study population, the discrimination between sinus rhythm and AF using CWDs based on ECG or PPG was highly accurate, with no significant variations in performance across the examined devices.
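The sensitivity and specificity figures above are binomial proportions reported with 95% CIs. The abstract does not state which interval method was used; as a minimal sketch, assuming a Wilson score interval and purely illustrative counts (the function name and the 30-of-30 example are assumptions, not values taken from the study's analysis), such an interval could be computed as follows:

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (hypothetical helper)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Illustrative only: 30 of 30 AF recordings classified correctly.
low, high = wilson_interval(30, 30)
print(f"sensitivity 100%, 95% CI {low:.1%}-{high:.1%}")
```

Even when every recording is classified correctly, the lower bound stays clearly below 100% at this sample size, which is why the interval matters when devices are compared.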

Citations: 0
Measuring Bound Attention During Complex Liver Surgery Planning: Feasibility Study.
IF 2 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-01-08 DOI: 10.2196/62740
Tim Schneider, Timur Cetin, Stefan Uppenkamp, Dirk Weyhe, Thomas Muender, Anke V Reinschluessel, Daniela Salzmann, Verena Uslar

Background: The integration of advanced technologies such as augmented reality (AR) and virtual reality (VR) into surgical procedures has garnered significant attention. However, the introduction of these innovations requires thorough evaluation in the context of human-machine interaction. Despite their potential benefits, new technologies can complicate surgical tasks and increase the cognitive load on surgeons, potentially offsetting their intended advantages. It is crucial to evaluate these technologies not only for their functional improvements but also for their impact on the surgeon's workload in clinical settings. A surgical team today must increasingly navigate advanced technologies such as AR and VR, aiming to reduce surgical trauma and enhance patient safety. However, each innovation needs to be evaluated in terms of human-machine interaction. Even if an innovation appears to bring advancements to the field it is applied in, it may complicate the work and increase the surgeon's workload rather than benefiting the surgeon.

Objective: This study aims to establish, for the first time, a method for objectively determining the additional workload generated by using AR or VR glasses in a clinical context.

Methods: Electroencephalography (EEG) signals were recorded using a passive auditory oddball paradigm while 9 participants performed surgical planning for liver resection under 3 different conditions: (1) using AR glasses, (2) using VR glasses, and (3) using the conventional planning software on a computer (PC).

Results: The electrophysiological results, that is, the potentials evoked by the auditory stimulus, were compared with the subjectively perceived stress of the participants, as determined by the National Aeronautics and Space Administration-Task Load Index (NASA-TLX) questionnaire. The AR condition had the highest scores for mental demand (median 75, IQR 70-85), effort (median 55, IQR 30-65), and frustration (median 40, IQR 15-75) compared with the VR and PC conditions. The analysis of the EEG revealed a trend toward lower amplitudes of the N1 and P3 components at the central electrodes in the AR condition, suggesting a higher workload for participants when using AR glasses. In addition, EEG components in the VR condition did not reveal any noticeable differences compared with the EEG components in the conventional planning condition. For the P1 component, the VR condition elicited significantly earlier latencies at the Fz electrode (mean 75.3 ms, SD 25.8 ms) compared with the PC condition (mean 99.4 ms, SD 28.6 ms).

Conclusions: The results suggest a lower stress level when using VR glasses compared with AR glasses, likely due to the 3D visualization of the liver model. Additionally, the alignment between subjectively determined results and objectively determined results confirms the validity of the study design applied in this study.

Citations: 0
Postpartum Remote Health Coaching Intervention for Individuals With a Hypertensive Disorder of Pregnancy: Proof-of-Concept Study.
IF 2 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-01-08 DOI: 10.2196/65611
Jaclyn D Borrowman, Lucas J Carr, Gary L Pierce, William T Story, Bethany Barone Gibbs, Kara M Whitaker

Background: Cardiovascular disease (CVD) is the leading cause of death among women in America. Hypertensive disorders of pregnancy (HDP) negatively impact acute and long-term cardiovascular health, with approximately 16% of all pregnancies affected. Because CVD is 2-4 times more likely after HDP than after normotensive pregnancies, effective interventions to promote cardiovascular health are imperative.

Objective: Postpartum physical activity (PA) interventions after HDP are an underexplored preventive strategy. This study aimed to assess (1) the feasibility and acceptability of a remotely delivered PA intervention for individuals with HDP 3-6 months postpartum and (2) changes in average steps per day, skills related to PA behavior, and postpartum blood pressure (BP).

Methods: A remotely delivered 14-week health coaching intervention was designed based on prior formative work. The intervention, called the Hypertensive Disorders of Pregnancy Postpartum Exercise (HyPE) intervention, was tested for feasibility and acceptability with a single-arm proof-of-concept study design. A total of 19 women were enrolled who were 3-6 months postpartum after HDP; currently inactive; 18 years of age or older; residing in Iowa; and without diabetes, kidney disease, or CVD. Feasibility was assessed by the number of sessions attended and acceptability by self-reported satisfaction with the program. Changes in steps achieved per day were measured with an activPAL4 micro, PA behavior skills were assessed via validated online surveys, and BP was assessed remotely with a research-grade Omron Series 5 (Omron Corporation) BP monitor.

Results: Participants at enrollment were on average 30.3 years of age and 4.1 months postpartum; most self-identified as non-Hispanic White (14/17, 82%), were in a committed relationship (16/17, 94%), and had a bachelor's degree (9/17, 53%). Those who started the intervention (n=19) attended 140 of 152 possible health coaching sessions (92%). Intervention completers (n=17) indicated they were satisfied with the program (n=17, 100%) and would recommend it to others (n=17, 100%). No significant changes in activPAL-measured steps were observed from pre- to posttesting (mean 138.40, SD 129.40 steps/day; P=.75). Significant improvements were observed in PA behavior skills, including planning (mean 5.35, SD 4.97 vs mean 15.06, SD 3.09; P<.001) and monitoring of PA levels (mean 7.29, SD 3.44 vs mean 13.00, SD 2.45; P<.001). No significant decreases were observed for systolic BP (mean -1.28, SD 3.59 mm Hg; Hedges g=-0.26; P=.16) or diastolic BP (mean -1.80, SD 5.03 mm Hg; Hedges g=-0.44; P=.12).

Conclusions: While PA behaviors did not change, the intervention was found to be feasible and acceptable among this sample of at-risk women. After additional refinement, the intervention should be retested among a larger, more diverse, and less physically active sample.

Citations: 0
Development of Personas and Journey Maps for Artificial Intelligence Agents Supporting the Use of Health Big Data: Human-Centered Design Approach.
IF 2 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-01-08 DOI: 10.2196/67272
Yoon Heui Lee, Hanna Choi, Soo-Kyoung Lee

Background: The rapid proliferation of artificial intelligence (AI) requires new approaches for human-AI interfaces that are different from classic human-computer interfaces. In developing a system that is conducive to the analysis and use of health big data (HBD), reflecting the empirical characteristics of users who have performed HBD analysis is the most crucial aspect to consider. Recently, human-centered design methodology, a field of user-centered design, has been expanded and is used not only to develop types of products but also technologies and services.

Objective: This study was conducted to integrate and analyze users' experiences along the HBD analysis journey using the human-centered design methodology and reflect them in the development of AI agents that support future HBD analysis. This research aims to help accelerate the development of novel human-AI interfaces for AI agents that support the analysis and use of HBD, which will be urgently needed in the near future.

Methods: Using human-centered design methodology, we collected data through shadowing and in-depth interviews with 16 people with experience in analyzing and using HBD. We identified users' empirical characteristics, emotions, pain points, and needs related to HBD analysis and use and created personas and journey maps.

Results: The general characteristics of participants (n=16) were as follows: the largest age group was those in their 40s (n=6, 38%), and most held a PhD degree (n=10, 63%). Professors (n=7, 44%) and health care personnel (n=10, 63%) represented the largest professional groups. Participants' experience with big data analysis varied, with 25% (n=4) being beginners and 38% (n=6) having extensive experience. Common analysis methods included statistical analysis (n=7, 44%) and data mining (n=6, 38%). Qualitative findings from shadowing and in-depth interviews revealed key challenges: lack of knowledge on using analytical solutions, crisis management difficulties during errors, and inadequate understanding of health care data and clinical decision-making, especially among non-health care professionals. Three types of personas and journey maps were derived: health care professionals who are big data analysis beginners, health care professionals who have experience in big data analytics, and non-health care professionals who are experts in big data analytics. They showed a need for personalized platforms tailored to the user level, appropriate direction through a navigation function, a crisis management support system, communication and sharing among users, and an expert linkage service.

Conclusions: The knowledge obtained from this study can be leveraged in designing an AI agent to support future HBD analysis and use. This is expected to further increase the usability of HBD by helping users perform effective use of HBD more easily.

Citations: 0
A digital program for daily life management with endometriosis: Pilot study on symptoms and quality of life among participants.
IF 2 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-01-08 DOI: 10.2196/58262
Zélia Breton, Emilie Stern, Mathilde Pinault, Delphine Lhuillery, Erick Petit, Pierre Panel, Maïa Alexaline
Background: After suffering for an average of 7 years before diagnosis, patients with endometriosis are usually left with more questions than answers about managing their symptoms in the absence of a cure. To help women with endometriosis after their diagnosis, we developed an online support program combining user research, evidence-based medicine, and clinical expertise. Structured around cognitive behavioral therapy (CBT) and the quality-of-life metrics from the EHP score, the program is designed to guide participants over a 3-month period and is available in France.

Objective: This cohort study was designed to measure the impact of a digital health program on the symptom and quality-of-life levels of women with endometriosis.

Methods: Ninety-two participants were included in the pilot study, from a total of 146 program participants who volunteered and were assessed for eligibility for this research. They were recruited either free of charge through employer health insurance or via individual direct access. A control group of women with endometriosis who did not follow the program was recruited (n=404) through social media and a mailing campaign. Questionnaires assessing quality of life and symptom levels were sent by email to program participants and controls at baseline and at three months. The control group was sampled according to initial pain level to obtain a similar pain profile between controls and program participants (n=149). Descriptive statistics and statistical tests (chi-square, Fisher's exact, Wilcoxon, Mann-Whitney U, and Student t tests) were used to analyze intra- and inter-group differences, with Cohen's d measuring effect size for significant results.

Results: Over three months, global symptom burden, the general level of pain, anxiety, depression, dysmenorrhea, dysuria, chronic fatigue, neuropathic pain, and endobelly levels improved significantly among program participants. These improvements differed significantly from the control group for global symptom burden (mean±SD: participants=-0.7±1.6, controls=-0.3±1.3, P=.048, small d), anxiety (participants=-1.1±2.8, controls=0.2±2.5, P<.001, medium d), depression (participants=-0.9±2.5, controls=0.0±3.1, P=.04, small d), neuropathic pain (participants=-1.0±2.7, controls=-0.1±2.6, P=.004, small d), and endobelly (participants=-0.9±2.5, controls=-0.3±2.4, P=.03, small d). Participants' quality of life improved between baseline and three months, and the change differed significantly from the control group for the core part of the EHP-5 (participants=-5.9±21.0, controls=1.0±14.8, P=.03, small d) and the EQ-5D (participants=0.1±0.1, controls=-0.0±0.1, P=.001, medium d). Perceived knowledge of endometriosis was significantly greater at three months among participants than among controls (P<.001).

Conclusions: The results from this pilot study suggest that a digital health program providing medical and sci
Conclusions (translated from the Chinese version of the abstract): The results of this pilot study suggest that a digital health program providing medical and scientific information about endometriosis, together with multidisciplinary self-management tools, may help reduce global symptom burden, anxiety, depression, neuropathic pain, and endobelly, while improving participants' knowledge of endometriosis and their quality of life. Clinical Trial:
Citations: 0
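The Methods of the abstract above name Cohen's d as the effect-size measure for between-group differences. As an illustration only, the following Python sketch computes a pooled-SD Cohen's d from the rounded summary statistics quoted in the Results for global symptom burden. Several things here are assumptions rather than facts from the study: the abstract does not say which variant of the formula the authors used, the group sizes (92 participants, 149 controls) are assumed to apply to this particular outcome, and the function name `cohens_d` is hypothetical.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d for two independent groups, using the pooled standard deviation.

    Assumption: pooled-SD variant; the abstract does not specify the formula.
    """
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# Rounded change scores for global symptom burden, as quoted in the abstract.
# Group sizes are assumed to be the 92 participants and 149 sampled controls.
d = cohens_d(mean_a=-0.7, sd_a=1.6, n_a=92,
             mean_b=-0.3, sd_b=1.3, n_b=149)

print(f"Cohen's d = {d:.2f}")
```

Under these assumptions the magnitude of d falls between 0.2 and 0.5, the range conventionally described as a small effect, which agrees with the "small d" label reported for this outcome. Because the inputs are rounded summary values, the result need not match the authors' own calculation exactly.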
Public Health Discussions on Social Media: Evaluating Automated Sentiment Analysis Methods.
IF 2 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-01-08 DOI: 10.2196/57395
Lisa M Gandy, Lana V Ivanitskaya, Leeza L Bacon, Rodina Bizri-Baryak
Background: Sentiment analysis is one of the most widely used methods for mining and examining text. Social media researchers need guidance on choosing between manual and automated sentiment analysis methods.

Objective: Popular sentiment analysis tools based on natural language processing (NLP; VADER [Valence Aware Dictionary for Sentiment Reasoning], TEXT2DATA [T2D], and Linguistic Inquiry and Word Count [LIWC-22]), and a large language model (ChatGPT 4.0) were compared with manually coded sentiment scores, as applied to the analysis of YouTube comments on videos discussing the opioid epidemic. Sentiment analysis methods were also examined regarding ease of programming, monetary cost, and other practical considerations.

Methods: Evaluation methods included descriptive statistics, receiver operating characteristic (ROC) curve analysis, confusion matrices, Cohen κ, accuracy, specificity, precision, sensitivity (recall), F1-score harmonic mean, and the Matthews correlation coefficient (MCC). An inductive, iterative approach to content analysis of the data was used to obtain manual sentiment codes.

Results: A subset of comments was analyzed by a second coder, producing good agreement between the 2 coders' judgments (κ=0.734). YouTube social media content about the opioid crisis had many more negative comments (4286/4871, 88%) than positive comments (79/662, 12%), making it possible to evaluate the performance of sentiment analysis models in an unbalanced dataset. The tone summary measure from LIWC-22 performed better than other tools for estimating the prevalence of negative versus positive sentiment. According to the ROC curve analysis, VADER was best at classifying manually coded negative comments. A comparison of Cohen κ values indicated that NLP tools (VADER, followed by LIWC's tone and T2D) showed only fair agreement with manual coding. In contrast, ChatGPT 4.0 had poor agreement and failed to generate binary sentiment scores in 2 out of 3 attempts. Variations in accuracy, specificity, precision, sensitivity, F1-score, and MCC did not reveal a single superior model. F1-score harmonic means were 0.34-0.38 (SD 0.02) for NLP tools and very low (0.13) for ChatGPT 4.0. None of the MCCs reached a strong correlation level.

Conclusions: Researchers studying negative emotions, public worries, or dissatisfaction with social media face unique challenges in selecting models suitable for unbalanced datasets. We recommend VADER, the only cost-free tool we evaluated, due to its excellent discrimination, which can be further improved when the comments are at least 100 characters long. If estimating the prevalence of negative comments in an unbalanced dataset is important, we recommend the tone summary measure from LIWC-22. Researchers using T2D must know that it may only score some data and, compared with other methods, be more ti
Conclusions, final sentences (translated from the Chinese version of the abstract): Researchers using T2D must know that it may only score some data and that, compared with other methods, it is more time-consuming and costly. A general-purpose large language model, ChatGPT 4.0, has yet to surpass the performance of NLP models, at least for unbalanced datasets with highly prevalent (7:1) negative comments.
Citations: 0
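The Methods of the abstract above list several measures derived from a confusion matrix: accuracy, specificity, precision, sensitivity (recall), F1-score, the Matthews correlation coefficient (MCC), and Cohen κ. The Python sketch below shows how these are conventionally computed from the four cells of a binary confusion matrix. It is a minimal illustration, not the authors' code: the function name `binary_metrics` and the counts passed to it are hypothetical and were chosen only to resemble an unbalanced dataset; they are not data from the study. The sketch also assumes every denominator is nonzero.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard metrics from a 2x2 confusion matrix (all denominators assumed nonzero)."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)          # recall, true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Cohen kappa: observed agreement corrected for chance agreement.
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
        "kappa": kappa,
    }


# Hypothetical counts for an unbalanced dataset (NOT taken from the study):
# the minority class is treated as "positive".
m = binary_metrics(tp=40, fp=100, fn=20, tn=440)

for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy is 0.80 while F1, MCC, and κ are all well below it (about 0.40, 0.34, and 0.30). This illustrates why accuracy alone can look reassuring on an unbalanced dataset even when agreement beyond chance is modest, and why the abstract reports several complementary measures.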