
Latest publications from JMIR AI

Understanding the Long Haulers of COVID-19: Mixed Methods Analysis of YouTube Content.
Pub Date : 2024-06-03 DOI: 10.2196/54501
Alexis Jordan, Albert Park

Background: The COVID-19 pandemic had a devastating global impact. In the United States, there were >98 million COVID-19 cases and >1 million resulting deaths. One consequence of COVID-19 infection has been post-COVID-19 condition (PCC). People with this syndrome, colloquially called long haulers, experience symptoms that impact their quality of life. The root cause of PCC and effective treatments remain unknown. Many long haulers have turned to social media for support and guidance.

Objective: In this study, we sought to gain a better understanding of the long hauler experience by investigating what has been discussed and how information about long haulers is perceived on social media. We specifically investigated the following: (1) the range of symptoms that are discussed, (2) the ways in which information about long haulers is perceived, (3) informational and emotional support that is available to long haulers, and (4) discourse between viewers and creators. We selected YouTube as our data source due to its popularity and wide-ranging audience.

Methods: We systematically gathered data from 3 different types of content creators: medical sources, news sources, and long haulers. To computationally understand the video content and viewers' reactions, we used Biterm, a topic modeling algorithm created specifically for short texts, to analyze snippets of video transcripts and all top-level comments from the comment section. To triangulate our findings about viewers' reactions, we used the Valence Aware Dictionary and Sentiment Reasoner to conduct sentiment analysis on comments from each type of content creator. We grouped the comments into positive and negative categories and generated topics for these groups using Biterm. We then manually grouped resulting topics into broader themes for the purpose of analysis.
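To make the sentiment step concrete, the sketch below shows how top-level comments might be split into positive and negative groups with the Valence Aware Dictionary and Sentiment Reasoner before topic modeling; the example comments, package usage, and thresholds are illustrative assumptions, not the authors' code.

```python
# Minimal sketch (not the study's code): grouping comments by VADER sentiment
# before short-text topic modeling. Assumes the vaderSentiment package is
# installed and that `comments` holds top-level comment strings gathered separately.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

comments = [
    "Thank you, this video finally explained my symptoms.",
    "The health care system has completely failed long haulers.",
]

analyzer = SentimentIntensityAnalyzer()
positive, negative = [], []
for text in comments:
    # The compound score ranges from -1 (most negative) to +1 (most positive);
    # +/-0.05 is the conventional VADER threshold for treating a text as polar.
    compound = analyzer.polarity_scores(text)["compound"]
    if compound >= 0.05:
        positive.append(text)
    elif compound <= -0.05:
        negative.append(text)

# Each group would then be passed to a short-text topic model (eg, a Biterm
# topic model implementation) to generate topics like those described above.
print(len(positive), "positive;", len(negative), "negative")
```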

Results: We organized the resulting topics into 28 themes across all sources. Examples of medical source transcript themes were Explanations in layman's terms and Biological explanations. Examples of news source transcript themes were Negative experiences and Handling the long haul. The 2 long hauler transcript themes were Taking treatments into own hands and Changes to daily life. News sources received a greater share of negative comments. A few themes of these negative comments included Misinformation and disinformation and Issues with the health care system. Similarly, negative long hauler comments were organized into several themes, including Disillusionment with the health care system and Requiring more visibility. In contrast, positive medical source comments captured themes such as Appreciation of helpful content and Exchange of helpful information. In addition to this theme, one positive theme found in long hauler comments was Community building.

Conclusions: The results of this study could help public health agencies, policymakers, organizations, and health researchers understand the symptoms and experiences associated with PCC. These results could also help these groups develop communication strategies regarding PCC.

Citations: 0
Feasibility of Multimodal Artificial Intelligence Using GPT-4 Vision for the Classification of Middle Ear Disease: Qualitative Study and Validation.
Pub Date : 2024-05-31 DOI: 10.2196/58342
Masao Noda, Hidekane Yoshimura, Takuya Okubo, Ryota Koshu, Yuki Uchiyama, Akihiro Nomura, Makoto Ito, Yutaka Takumi

Background: The integration of artificial intelligence (AI), particularly deep learning models, has transformed the landscape of medical technology, especially in the field of diagnosis using imaging and physiological data. In otolaryngology, AI has shown promise in image classification for middle ear diseases. However, existing models often lack patient-specific data and clinical context, limiting their universal applicability. The emergence of GPT-4 Vision (GPT-4V) has enabled a multimodal diagnostic approach, integrating language processing with image analysis.

Objective: In this study, we investigated the effectiveness of GPT-4V in diagnosing middle ear diseases by integrating patient-specific data with otoscopic images of the tympanic membrane.

Methods: The design of this study was divided into two phases: (1) establishing a model with appropriate prompts and (2) validating the ability of the optimal prompt model to classify images. In total, 305 otoscopic images of 4 middle ear diseases (acute otitis media, middle ear cholesteatoma, chronic otitis media, and otitis media with effusion) were obtained from patients who visited Shinshu University or Jichi Medical University between April 2010 and December 2023. The optimized GPT-4V settings were established using prompts and patients' data, and the model created with the optimal prompt was used to verify the diagnostic accuracy of GPT-4V on 190 images. To compare the diagnostic accuracy of GPT-4V with that of physicians, 30 clinicians completed a web-based questionnaire consisting of 190 images.
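As an illustration of the kind of multimodal prompt such a design implies, the sketch below sends an otoscopic image plus brief patient context to a vision-capable GPT-4 model through the OpenAI Python SDK; the model name, prompt wording, file path, and patient details are placeholders, not the study's actual prompts or settings.

```python
# Minimal sketch (assumptions: OpenAI Python SDK >=1.x, a placeholder model
# name, an invented prompt and patient summary). Reads OPENAI_API_KEY from
# the environment.
import base64
from openai import OpenAI

client = OpenAI()

with open("otoscopic_image.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

patient_data = "Age 6, right ear pain for 2 days, fever 38.5 C, no prior ear surgery."

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # placeholder for a GPT-4V-class model
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": (
                        "Classify the middle ear condition in this otoscopic image as "
                        "acute otitis media, chronic otitis media, middle ear cholesteatoma, "
                        "or otitis media with effusion. Patient context: " + patient_data
                    ),
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```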

Results: The multimodal AI approach achieved an accuracy of 82.1%, which is superior to that of certified pediatricians at 70.6% but trails behind that of otolaryngologists at more than 95%. The model's disease-specific accuracy rates were 89.2% for acute otitis media, 76.5% for chronic otitis media, 79.3% for middle ear cholesteatoma, and 85.7% for otitis media with effusion, which highlights the need for disease-specific optimization. Comparisons with physicians revealed promising results, suggesting the potential of GPT-4V to augment clinical decision-making.

Conclusions: Despite its advantages, challenges such as data privacy and ethical considerations must be addressed. Overall, this study underscores the potential of multimodal AI for enhancing diagnostic accuracy and improving patient care in otolaryngology. Further research is warranted to optimize and validate this approach in diverse clinical settings.

Citations: 0
Identifying Patterns of Smoking Cessation App Feature Use That Predict Successful Quitting: Secondary Analysis of Experimental Data Leveraging Machine Learning
Pub Date : 2024-05-22 DOI: 10.2196/51756
L. N. Siegel, Kara P Wiseman, Alexandra Budenz, Yvonne M Prutzman
Background: Leveraging free smartphone apps can help expand the availability and use of evidence-based smoking cessation interventions. However, there is a need for additional research investigating how the use of different features within such apps impacts their effectiveness.

Objective: We used observational data collected from an experiment of a publicly available smoking cessation app to develop supervised machine learning (SML) algorithms intended to distinguish the app features that promote successful smoking cessation. We then assessed the extent to which patterns of app feature use accounted for variance in cessation that could not be explained by other known predictors of cessation (eg, tobacco use behaviors).

Methods: Data came from an experiment (ClinicalTrials.gov NCT04623736) testing the impacts of incentivizing ecological momentary assessments within the National Cancer Institute's quitSTART app. Participants' (N=133) app activity, including every action they took within the app and its corresponding time stamp, was recorded. Demographic and baseline tobacco use characteristics were measured at the start of the experiment, and short-term smoking cessation (7-day point prevalence abstinence) was measured at 4 weeks after baseline. Logistic regression SML modeling was used to estimate participants' probability of cessation from 28 variables reflecting participants' use of different app features, assigned experimental conditions, and phone type (iPhone [Apple Inc] or Android [Google]). The SML model was first fit in a training set (n=100) and then its accuracy was assessed in a held-aside test set (n=33). Within the test set, a likelihood ratio test (n=30) assessed whether adding individuals' SML-predicted probabilities of cessation to a logistic regression model that included demographic and tobacco use (eg, polyuse) variables explained additional variance in 4-week cessation.

Results: The SML model's sensitivity (0.67) and specificity (0.67) in the held-aside test set indicated that individuals' patterns of using different app features predicted cessation with reasonable accuracy. The likelihood ratio test showed that the logistic regression, which included the SML model-predicted probabilities, was statistically equivalent to the model that only included the demographic and tobacco use variables (P=.16).

Conclusions: Harnessing user data through SML could help determine the features of smoking cessation apps that are most useful. This methodological approach could be applied in future research focusing on smoking cessation app features to inform the development and improvement of smoking cessation apps.

Trial Registration: ClinicalTrials.gov NCT04623736; https://clinicaltrials.gov/study/NCT04623736
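The sketch below is a minimal, synthetic-data stand-in for this analysis structure: a logistic regression fit on a training set, scored on a held-aside test set, and compared against a nested baseline model with a likelihood ratio test. Feature values, covariates, and splits are invented; only the overall workflow follows the description above.

```python
# Minimal sketch (synthetic data, invented feature stand-ins); not the study's code.
import numpy as np
from scipy.stats import chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, n_features = 133, 28                       # sample sizes mirror the abstract
X = rng.normal(size=(n, n_features))          # stand-in for app feature-use variables
y = rng.integers(0, 2, size=n)                # 7-day point prevalence abstinence (0/1)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=33, random_state=0)

sml = LogisticRegression(max_iter=1000).fit(X_train, y_train)
p_hat = sml.predict_proba(X_test)[:, 1]       # SML-predicted probability of cessation

# Likelihood ratio test: does adding p_hat improve a model with baseline covariates?
baseline = rng.normal(size=(len(y_test), 3))  # stand-in demographic/tobacco-use covariates
reduced = sm.Logit(y_test, sm.add_constant(baseline)).fit(disp=0)
full = sm.Logit(y_test, sm.add_constant(np.column_stack([baseline, p_hat]))).fit(disp=0)
lr_stat = 2 * (full.llf - reduced.llf)
p_value = chi2.sf(lr_stat, df=1)
print(f"LR statistic={lr_stat:.2f}, P={p_value:.3f}")
```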
Citations: 0
Framework for Ranking Machine Learning Predictions of Limited, Multimodal, and Longitudinal Behavioral Passive Sensing Data: Combining User-Agnostic and Personalized Modeling
Pub Date : 2024-05-20 DOI: 10.2196/47805
Tahsin Mullick, Samy Shaaban, A. Radovic, Afsaneh Doryab
Background: Passive mobile sensing provides opportunities for measuring and monitoring health status in the wild and outside of clinics. However, longitudinal, multimodal mobile sensor data can be small, noisy, and incomplete. This makes processing, modeling, and prediction of these data challenging. The small size of the data set restricts it from being modeled using complex deep learning networks. The current state of the art (SOTA) tackles small sensor data sets following a singular modeling paradigm based on traditional machine learning (ML) algorithms. These approaches opt for either a user-agnostic modeling approach, which makes the model susceptible to a larger degree of noise, or a personalized approach, in which training on individual data means working with an even more limited data set, giving rise to overfitting; ultimately, a trade-off must be sought by choosing 1 of the 2 modeling approaches to reach predictions.

Objective: The objective of this study was to filter, rank, and output the best predictions for small, multimodal, longitudinal sensor data using a framework that is designed to tackle data sets that are limited in size (particularly targeting health studies that use passive multimodal sensors) and that combines both user-agnostic and personalized approaches, along with a combination of ranking strategies to filter predictions.

Methods: In this paper, we introduced a novel ranking framework for longitudinal multimodal sensors (FLMS) to address challenges encountered in health studies involving passive multimodal sensors. Using the FLMS, we (1) built a tensor-based aggregation and ranking strategy for final interpretation, (2) processed various combinations of sensor fusions, and (3) balanced user-agnostic and personalized modeling approaches with appropriate cross-validation strategies. The performance of the FLMS was validated with the help of a real data set of adolescents diagnosed with major depressive disorder for the prediction of change in depression in the adolescent participants.

Results: Predictions output by the proposed FLMS achieved a 7% increase in accuracy and a 13% increase in recall for the real data set. Experiments with existing SOTA ML algorithms showed an 11% increase in accuracy for the depression data set and demonstrated how overfitting and sparsity were handled.

Conclusions: The FLMS aims to fill the gap that currently exists when modeling passive sensor data with a small number of data points. It achieves this through leveraging both user-agnostic and personalized modeling techniques in tandem with an effective ranking strategy to filter predictions.
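The sketch below is a loose, synthetic-data stand-in for the idea of balancing user-agnostic and personalized modeling: it contrasts leave-one-subject-out training with per-participant training and then averages the two prediction views before ranking. It is not the published FLMS implementation, and the data, splits, and combination rule are assumptions.

```python
# Minimal sketch (synthetic data); illustrates user-agnostic vs personalized views only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(1)
n_participants, samples_each, n_features = 10, 30, 6
X = rng.normal(size=(n_participants * samples_each, n_features))  # passive-sensing features
y = rng.integers(0, 2, size=len(X))                                # depression change (0/1)
groups = np.repeat(np.arange(n_participants), samples_each)

# User-agnostic view: train on all other participants, predict for the held-out one.
agnostic_pred = np.zeros(len(X))
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    agnostic_pred[test_idx] = model.predict_proba(X[test_idx])[:, 1]

# Personalized view: train on the first half of each participant's own data.
personal_pred = np.full(len(X), np.nan)
for pid in range(n_participants):
    idx = np.where(groups == pid)[0]
    half = len(idx) // 2
    model = LogisticRegression(max_iter=1000).fit(X[idx[:half]], y[idx[:half]])
    personal_pred[idx[half:]] = model.predict_proba(X[idx[half:]])[:, 1]

# Combine the two views where both exist and rank samples by the combined score,
# a simple stand-in for the tensor-based aggregation and ranking step described above.
combined = np.nanmean(np.vstack([agnostic_pred, personal_pred]), axis=0)
ranking = np.argsort(-combined)
print(ranking[:5])
```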
Citations: 0
Improving Risk Prediction of Methicillin-Resistant Staphylococcus aureus Using Machine Learning Methods With Network Features: Retrospective Development Study.
Pub Date : 2024-05-16 DOI: 10.2196/48067
Methun Kamruzzaman, Jack Heavey, Alexander Song, Matthew Bielskas, Parantapa Bhattacharya, Gregory Madden, Eili Klein, Xinwei Deng, Anil Vullikanti

Background: Health care-associated infections due to multidrug-resistant organisms (MDROs), such as methicillin-resistant Staphylococcus aureus (MRSA) and Clostridioides difficile (CDI), place a significant burden on our health care infrastructure.

Objective: Screening for MDROs is an important mechanism for preventing spread but is resource intensive. The objective of this study was to develop automated tools that can predict colonization or infection risk using electronic health record (EHR) data, provide useful information to aid infection control, and guide empiric antibiotic coverage.

Methods: We retrospectively developed a machine learning model to detect MRSA colonization and infection in undifferentiated patients at the time of sample collection from hospitalized patients at the University of Virginia Hospital. We used clinical and nonclinical features derived from on-admission and throughout-stay information from the patient's EHR data to build the model. In addition, we used a class of features derived from contact networks in EHR data; these network features can capture patients' contacts with providers and other patients, improving model interpretability and accuracy for predicting the outcome of surveillance tests for MRSA. Finally, we explored heterogeneous models for different patient subpopulations (for example, those admitted to an intensive care unit or emergency department or those with specific testing histories), which perform better.

Results: We found that the penalized logistic regression performs better than other methods, and this model's performance, measured by the area under the receiver operating characteristic curve (ROC-AUC), improves by nearly 11% when we use a polynomial (second-degree) transformation of the features. Some significant features in predicting MDRO risk include antibiotic use, surgery, use of devices, dialysis, patient's comorbidity conditions, and network features. Among these, network features add the most value and improve the model's performance by at least 15%. The penalized logistic regression model with the same transformation of features also performs better than other models for specific patient subpopulations.
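The sketch below illustrates this modeling choice, a penalized logistic regression on second-degree polynomial features evaluated by ROC-AUC, using synthetic data and invented feature stand-ins rather than the study's EHR-derived features.

```python
# Minimal sketch (synthetic data); mirrors the penalized logistic regression with
# a second-degree polynomial feature transformation described above.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(7)
X = rng.normal(size=(2000, 12))  # stand-ins for clinical, antibiotic-use, and network features
y = (X[:, 0] * X[:, 1] + rng.normal(scale=0.5, size=2000) > 0).astype(int)  # interaction-driven outcome

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=7)

model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),         # second-degree transformation
    StandardScaler(),
    LogisticRegression(penalty="l2", C=0.1, max_iter=5000),   # penalized (ridge-style) logistic regression
)
model.fit(X_train, y_train)
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"ROC-AUC: {auc:.3f}")
```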

Conclusions: Our study shows that MRSA risk prediction can be conducted quite effectively by machine learning methods using clinical and nonclinical features derived from EHR data. Network features are the most predictive and provide significant improvement over prior methods. Furthermore, heterogeneous prediction models for different patient subpopulations enhance the model's performance.

Citations: 0
Sample Size Considerations for Fine-Tuning Large Language Models for Named Entity Recognition Tasks: Methodological Study
Pub Date : 2024-05-16 DOI: 10.2196/52095
Zoltan P. Majdik, S. S. Graham, Jade C Shiva Edward, Sabrina N Rodriguez, M. S. Karnes, Jared T Jensen, Joshua B Barbour, Justin F. Rousseau
Background: Large language models (LLMs) have the potential to support promising new applications in health informatics. However, practical data on sample size considerations for fine-tuning LLMs to perform specific tasks in biomedical and health policy contexts are lacking.

Objective: This study aims to evaluate sample size and sample selection techniques for fine-tuning LLMs to support improved named entity recognition (NER) for a custom data set of conflicts of interest disclosure statements.

Methods: A random sample of 200 disclosure statements was prepared for annotation. All "PERSON" and "ORG" entities were identified by each of the 2 raters, and once appropriate agreement was established, the annotators independently annotated an additional 290 disclosure statements. From the 490 annotated documents, 2500 stratified random samples in different size ranges were drawn. The 2500 training set subsamples were used to fine-tune a selection of language models across 2 model architectures (Bidirectional Encoder Representations from Transformers [BERT] and Generative Pre-trained Transformer [GPT]) for improved NER, and multiple regression was used to assess the relationship between sample size (sentences), entity density (entities per sentence [EPS]), and trained model performance (F1-score). Additionally, single-predictor threshold regression models were used to evaluate the possibility of diminishing marginal returns from increased sample size or entity density.

Results: Fine-tuned models ranged in topline NER performance from F1-score=0.79 to F1-score=0.96 across architectures. Two-predictor multiple linear regression models were statistically significant, with multiple R2 ranging from 0.6057 to 0.7896 (all P<.001). EPS and the number of sentences were significant predictors of F1-scores in all cases (P<.001), except for the GPT-2_large model, where EPS was not a significant predictor (P=.184). Model thresholds indicate points of diminishing marginal return from increased training data set sample size measured by the number of sentences, with point estimates ranging from 439 sentences for RoBERTa_large to 527 sentences for GPT-2_large. Likewise, the threshold regression models indicate a diminishing marginal return for EPS, with point estimates between 1.36 and 1.38.

Conclusions: Relatively modest sample sizes can be used to fine-tune LLMs for NER tasks applied to biomedical text, and training data entity density should representatively approximate entity density in production data. Training data quality and a model architecture's intended use (text generation vs text processing or classification) may be as important as, or more important than, training data volume and model parameter size.
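For orientation, the sketch below shows a minimal fine-tuning loop for a BERT-style token classifier with PERSON and ORG labels using the Hugging Face transformers library; the toy sentence, label scheme, and hyperparameters are assumptions, not the study's setup or corpus.

```python
# Minimal sketch (toy example, assumed BIO label scheme); not the study's pipeline.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

labels = ["O", "B-PERSON", "I-PERSON", "B-ORG", "I-ORG"]
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-cased", num_labels=len(labels)
)

# One toy training example: word-level tags aligned to word pieces.
words = ["Dr.", "Smith", "consults", "for", "Acme", "Pharma", "."]
word_tags = [1, 2, 0, 0, 3, 4, 0]            # indices into `labels`

enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
aligned = [
    -100 if wid is None else word_tags[wid]  # -100 marks special tokens, ignored by the loss
    for wid in enc.word_ids(batch_index=0)
]
enc["labels"] = torch.tensor([aligned])

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
model.train()
for _ in range(3):                            # a few illustrative optimization steps
    out = model(**enc)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
print(float(out.loss))
```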
Citations: 0
A Comparison of Personalized and Generalized Approaches to Emotion Recognition Using Consumer Wearable Devices: Machine Learning Study.
Pub Date : 2024-05-10 DOI: 10.2196/52171
Joe Li, Peter Washington

Background: There are a wide range of potential adverse health effects, ranging from headaches to cardiovascular disease, associated with long-term negative emotions and chronic stress. Because many indicators of stress are imperceptible to observers, the early detection of stress remains a pressing medical need, as it can enable early intervention. Physiological signals offer a noninvasive method for monitoring affective states and are recorded by a growing number of commercially available wearables.

Objective: We aim to study the differences between personalized and generalized machine learning models for 3-class emotion classification (neutral, stress, and amusement) using wearable biosignal data.

Methods: We developed a neural network for the 3-class emotion classification problem using data from the Wearable Stress and Affect Detection (WESAD) data set, a multimodal data set with physiological signals from 15 participants. We compared the results between a participant-exclusive generalized, a participant-inclusive generalized, and a personalized deep learning model.
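The sketch below shows one minimal way to set up such a 3-class classifier on windowed physiological features in PyTorch; the synthetic data, feature dimensions, and architecture are illustrative assumptions and do not reproduce the authors' network or the WESAD preprocessing.

```python
# Minimal sketch (synthetic signals); a small fully connected classifier for
# neutral vs stress vs amusement on per-window physiological features.
import torch
from torch import nn

rng = torch.Generator().manual_seed(0)
X = torch.randn(600, 16, generator=rng)          # stand-in per-window biosignal features
y = torch.randint(0, 3, (600,), generator=rng)   # 0=neutral, 1=stress, 2=amusement

model = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 3),                             # three emotion classes
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(20):
    logits = model(X)
    loss = loss_fn(logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

accuracy = (model(X).argmax(dim=1) == y).float().mean()
print(f"training accuracy: {accuracy:.2f}")
# A personalized variant would fit one such model per participant, whereas the
# generalized variants pool data across participants, with or without holding
# out the test participant entirely.
```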

Results: For the 3-class classification problem, our personalized model achieved an average accuracy of 95.06% and an F1-score of 91.71%; our participant-inclusive generalized model achieved an average accuracy of 66.95% and an F1-score of 42.50%; and our participant-exclusive generalized model achieved an average accuracy of 67.65% and an F1-score of 43.05%.

Conclusions: Our results emphasize the need for increased research in personalized emotion recognition models given that they outperform generalized models in certain contexts. We also demonstrate that personalized machine learning models for emotion classification are viable and can achieve high performance.

Citations: 0
Online Health Search Via Multidimensional Information Quality Assessment Based on Deep Language Models: Algorithm Development and Validation.
Pub Date : 2024-05-02 DOI: 10.2196/42630
Boya Zhang, Nona Naderi, Rahul Mishra, Douglas Teodoro

Background: Widespread misinformation in web resources can lead to serious implications for individuals seeking health advice. Despite this, information retrieval models often focus only on the query-document relevance dimension when ranking results.

Objective: We investigate a multidimensional information quality retrieval model based on deep learning to enhance the effectiveness of online health care information search results.

Methods: In this study, we simulated online health information search scenarios with a topic set of 32 different health-related inquiries and a corpus containing 1 billion web documents from the April 2019 snapshot of Common Crawl. Using state-of-the-art pretrained language models, we assessed the quality of the retrieved documents according to their usefulness, supportiveness, and credibility dimensions for a given search query on 6030 human-annotated, query-document pairs. We evaluated this approach using transfer learning and more specific domain adaptation techniques.

Results: In the transfer learning setting, the usefulness model provided the largest distinction between help- and harm-compatible documents, with a difference of +5.6%, leading to a majority of helpful documents in the top 10 retrieved. The supportiveness model achieved the best harm compatibility (+2.4%), while the combination of usefulness, supportiveness, and credibility models achieved the largest distinction between help- and harm-compatibility on helpful topics (+16.9%). In the domain adaptation setting, the linear combination of different models showed robust performance, with help-harm compatibility above +4.4% for all dimensions and going as high as +6.8%.
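The sketch below illustrates the idea of a linear combination of per-dimension quality scores for re-ranking retrieved documents; the document names, scores, and weights are invented for illustration and are not the study's learned models or values.

```python
# Minimal sketch (invented scores and weights): re-ranking documents by a weighted
# linear combination of relevance, usefulness, supportiveness, and credibility.
documents = {
    "doc_a": {"relevance": 0.91, "usefulness": 0.40, "supportiveness": 0.55, "credibility": 0.30},
    "doc_b": {"relevance": 0.86, "usefulness": 0.80, "supportiveness": 0.70, "credibility": 0.85},
    "doc_c": {"relevance": 0.88, "usefulness": 0.65, "supportiveness": 0.20, "credibility": 0.75},
}
weights = {"relevance": 0.4, "usefulness": 0.2, "supportiveness": 0.2, "credibility": 0.2}

def combined_score(scores: dict) -> float:
    """Weighted linear combination of per-dimension quality scores."""
    return sum(weights[dim] * scores[dim] for dim in weights)

ranking = sorted(documents, key=lambda d: combined_score(documents[d]), reverse=True)
print(ranking)  # eg, ['doc_b', 'doc_c', 'doc_a']
```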

Conclusions: These results suggest that integrating automatic ranking models created for specific information quality dimensions can increase the effectiveness of health-related information retrieval. Thus, our approach could be used to enhance searches made by individuals seeking online health information.

Citations: 0
Evaluating the Diagnostic Performance of Symptom Checkers: Clinical Vignette Study.
Pub Date : 2024-04-29 DOI: 10.2196/46875
Mohammad Hammoud, Shahd Douglas, Mohamad Darmach, Sara Alawneh, Swapnendu Sanyal, Youssef Kanbour

Background: Medical self-diagnostic tools (or symptom checkers) are becoming an integral part of digital health and our daily lives, whereby patients are increasingly using them to identify the underlying causes of their symptoms. As such, it is essential to rigorously investigate and comprehensively report the diagnostic performance of symptom checkers using standard clinical and scientific approaches.

Objective: This study aims to evaluate and report the accuracies of a few known and new symptom checkers using a standard and transparent methodology, which allows the scientific community to cross-validate and reproduce the reported results, a step much needed in health informatics.

Methods: We propose a 4-stage experimentation methodology that capitalizes on the standard clinical vignette approach to evaluate 6 symptom checkers. To this end, we developed and peer-reviewed 400 vignettes, each approved by at least 5 out of 7 independent and experienced primary care physicians. To establish a frame of reference and interpret the results of symptom checkers accordingly, we further compared the best-performing symptom checker against 3 primary care physicians with an average experience of 16.6 (SD 9.42) years. To measure accuracy, we used 7 standard metrics, including M1 as a measure of a symptom checker's or a physician's ability to return a vignette's main diagnosis at the top of their differential list, F1-score as a trade-off measure between recall and precision, and Normalized Discounted Cumulative Gain (NDCG) as a measure of a differential list's ranking quality, among others.
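
The metrics named above have standard formulations that a short sketch can make concrete. The functions below score a single hypothetical vignette using common definitions (top-1 hit rate for M1, harmonic mean of precision and recall for F1-score, binary-relevance NDCG); they may differ in detail from the study's exact implementation.

```python
# Illustrative implementations of three of the metrics; the vignette, reference
# diagnoses, and cut-off are made up for the example.
import math
from typing import List, Set


def m1(differential: List[str], main_diagnosis: str) -> float:
    """1.0 if the vignette's main diagnosis is ranked first, else 0.0 (averaged over vignettes to get M1)."""
    return 1.0 if differential and differential[0] == main_diagnosis else 0.0


def f1_score(differential: List[str], reference: Set[str]) -> float:
    """Harmonic mean of precision and recall of the returned differential list."""
    returned = set(differential)
    tp = len(returned & reference)
    if tp == 0 or not returned or not reference:
        return 0.0
    precision, recall = tp / len(returned), tp / len(reference)
    return 2 * precision * recall / (precision + recall)


def ndcg(differential: List[str], reference: Set[str], k: int = 10) -> float:
    """Binary-relevance NDCG@k of the ranked differential list."""
    gains = [1.0 if d in reference else 0.0 for d in differential[:k]]
    dcg = sum(g / math.log2(i + 2) for i, g in enumerate(gains))
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(reference), k)))
    return dcg / ideal if ideal > 0 else 0.0


differential = ["appendicitis", "gastroenteritis", "ovarian torsion"]
reference = {"appendicitis", "ovarian torsion"}
print(m1(differential, "appendicitis"), f1_score(differential, reference), ndcg(differential, reference))
# -> 1.0 0.8 ~0.92
```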

Results: The diagnostic accuracies of the 6 tested symptom checkers vary significantly. For instance, the differences in the M1, F1-score, and NDCG results between the best-performing and worst-performing symptom checkers or ranges were 65.3%, 39.2%, and 74.2%, respectively. The same was observed among the participating human physicians, whereby the M1, F1-score, and NDCG ranges were 22.8%, 15.3%, and 21.3%, respectively. When compared against each other, physicians outperformed the best-performing symptom checker by an average of 1.2% using F1-score, whereas the best-performing symptom checker outperformed physicians by averages of 10.2% and 25.1% using M1 and NDCG, respectively.

Conclusions: The performance variation between symptom checkers is substantial, suggesting that symptom checkers cannot be treated as a single entity. On a different note, the best-performing symptom checker was an artificial intelligence (AI)-based one, shedding light on the promise of AI in improving the diagnostic capabilities of symptom checkers, especially as AI keeps advancing exponentially.

{"title":"Evaluating the Diagnostic Performance of Symptom Checkers: Clinical Vignette Study.","authors":"Mohammad Hammoud, Shahd Douglas, Mohamad Darmach, Sara Alawneh, Swapnendu Sanyal, Youssef Kanbour","doi":"10.2196/46875","DOIUrl":"10.2196/46875","url":null,"abstract":"<p><strong>Background: </strong>Medical self-diagnostic tools (or symptom checkers) are becoming an integral part of digital health and our daily lives, whereby patients are increasingly using them to identify the underlying causes of their symptoms. As such, it is essential to rigorously investigate and comprehensively report the diagnostic performance of symptom checkers using standard clinical and scientific approaches.</p><p><strong>Objective: </strong>This study aims to evaluate and report the accuracies of a few known and new symptom checkers using a standard and transparent methodology, which allows the scientific community to cross-validate and reproduce the reported results, a step much needed in health informatics.</p><p><strong>Methods: </strong>We propose a 4-stage experimentation methodology that capitalizes on the standard clinical vignette approach to evaluate 6 symptom checkers. To this end, we developed and peer-reviewed 400 vignettes, each approved by at least 5 out of 7 independent and experienced primary care physicians. To establish a frame of reference and interpret the results of symptom checkers accordingly, we further compared the best-performing symptom checker against 3 primary care physicians with an average experience of 16.6 (SD 9.42) years. To measure accuracy, we used 7 standard metrics, including M1 as a measure of a symptom checker's or a physician's ability to return a vignette's main diagnosis at the top of their differential list, F<sub>1</sub>-score as a trade-off measure between recall and precision, and Normalized Discounted Cumulative Gain (NDCG) as a measure of a differential list's ranking quality, among others.</p><p><strong>Results: </strong>The diagnostic accuracies of the 6 tested symptom checkers vary significantly. For instance, the differences in the M1, F<sub>1</sub>-score, and NDCG results between the best-performing and worst-performing symptom checkers or ranges were 65.3%, 39.2%, and 74.2%, respectively. The same was observed among the participating human physicians, whereby the M1, F<sub>1</sub>-score, and NDCG ranges were 22.8%, 15.3%, and 21.3%, respectively. When compared against each other, physicians outperformed the best-performing symptom checker by an average of 1.2% using F<sub>1</sub>-score, whereas the best-performing symptom checker outperformed physicians by averages of 10.2% and 25.1% using M1 and NDCG, respectively.</p><p><strong>Conclusions: </strong>The performance variation between symptom checkers is substantial, suggesting that symptom checkers cannot be treated as a single entity. 
On a different note, the best-performing symptom checker was an artificial intelligence (AI)-based one, shedding light on the promise of AI in improving the diagnostic capabilities of symptom checkers, especially as AI keeps advancing exponentially.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"3 ","pages":"e46875"},"PeriodicalIF":0.0,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11091811/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141322100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Cost, Usability, Credibility, Fairness, Accountability, Transparency, and Explainability Framework for Safe and Effective Large Language Models in Medical Education: Narrative Review and Qualitative Study.
Pub Date : 2024-04-23 DOI: 10.2196/51834
Majdi Quttainah, Vinaytosh Mishra, Somayya Madakam, Yotam Lurie, Shlomo Mark

Background: The world has witnessed increased adoption of large language models (LLMs) in the last year. Although the products developed using LLMs have the potential to solve accessibility and efficiency problems in health care, there is a lack of available guidelines for developing LLMs for health care, especially for medical education.

Objective: The aim of this study was to identify and prioritize the enablers for developing successful LLMs for medical education. We further evaluated the relationships among these identified enablers.

Methods: A narrative review of the extant literature was first performed to identify the key enablers for LLM development. We additionally gathered the opinions of LLM users to determine the relative importance of these enablers using an analytical hierarchy process (AHP), which is a multicriteria decision-making method. Further, total interpretive structural modeling (TISM) was used to analyze the perspectives of product developers and ascertain the relationships and hierarchy among these enablers. Finally, the cross-impact matrix-based multiplication applied to a classification (MICMAC) approach was used to determine the relative driving and dependence powers of these enablers. A nonprobabilistic purposive sampling approach was used for recruitment of focus groups.
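
To make the AHP step concrete, the sketch below derives priority weights from a pairwise comparison matrix via its principal eigenvector and computes Saaty's consistency ratio, the statistic the authors report as 0.084. The comparison matrix here is invented for illustration and is not the set of judgments elicited from the study's focus groups.

```python
# Illustrative AHP calculation: priority weights from the principal eigenvector
# of a pairwise comparison matrix, plus Saaty's consistency ratio. The matrix
# below is a hypothetical 4-criterion example (credibility, accountability,
# fairness, usability), not the study's elicited judgments.
from typing import Tuple
import numpy as np

RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_weights(pairwise: np.ndarray) -> Tuple[np.ndarray, float]:
    """Return normalized priority weights and the consistency ratio CR = CI / RI."""
    n = pairwise.shape[0]
    eigenvalues, eigenvectors = np.linalg.eig(pairwise)
    principal = int(np.argmax(eigenvalues.real))
    lambda_max = eigenvalues.real[principal]
    weights = np.abs(eigenvectors[:, principal].real)
    weights /= weights.sum()
    consistency_index = (lambda_max - n) / (n - 1)
    return weights, consistency_index / RANDOM_INDEX[n]


pairwise = np.array([
    [1,   2,   4,   7],    # credibility compared with the other criteria
    [1/2, 1,   3,   6],    # accountability
    [1/4, 1/3, 1,   3],    # fairness
    [1/7, 1/6, 1/3, 1],    # usability
], dtype=float)
weights, cr = ahp_weights(pairwise)
print(np.round(weights, 3), round(cr, 3))  # judgments are usually accepted when CR < 0.1
```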

Results: The AHP demonstrated that the most important enabler for LLMs was credibility, with a priority weight of 0.37, followed by accountability (0.27642) and fairness (0.10572). In contrast, usability, with a priority weight of 0.04, showed negligible importance. The results of TISM concurred with the findings of the AHP. The only striking difference between expert perspectives and user preference evaluation was that the product developers indicated that cost has the least importance as a potential enabler. The MICMAC analysis suggested that cost has a strong influence on other enablers. The inputs of the focus group were found to be reliable, with a consistency ratio less than 0.1 (0.084).
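
The driving and dependence powers behind the MICMAC analysis are simply row and column sums of the final reachability matrix: driving power counts how many enablers an element influences, dependence power counts how many influence it. The sketch below applies this to an invented reachability matrix and assigns the usual four MICMAC clusters; the enablers, matrix, and threshold are illustrative, not the study's data.

```python
# Illustrative MICMAC step: driving power = row sum, dependence power = column
# sum of the final (binary) reachability matrix. The enablers, matrix, and
# threshold are invented for the sketch, not the study's data.
import numpy as np

enablers = ["cost", "usability", "credibility", "fairness"]
# reachability[i][j] == 1 means enabler i influences enabler j (directly or transitively).
reachability = np.array([
    [1, 1, 1, 1],   # a low-dependence enabler that drives the others
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 1],
])

driving = reachability.sum(axis=1)
dependence = reachability.sum(axis=0)
threshold = len(enablers) / 2  # simple midpoint cut-off for the four quadrants

for name, drv, dep in zip(enablers, driving, dependence):
    if drv > threshold and dep > threshold:
        cluster = "linkage"
    elif drv > threshold:
        cluster = "independent (driver)"
    elif dep > threshold:
        cluster = "dependent"
    else:
        cluster = "autonomous"
    print(f"{name}: driving={drv}, dependence={dep} -> {cluster}")
```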

Conclusions: This study is the first to identify, prioritize, and analyze the relationships of enablers of effective LLMs for medical education. Based on the results of this study, we developed a comprehendible prescriptive framework, named CUC-FATE (Cost, Usability, Credibility, Fairness, Accountability, Transparency, and Explainability), for evaluating the enablers of LLMs in medical education. The study findings are useful for health care professionals, health technology experts, medical technology regulators, and policy makers.

{"title":"Cost, Usability, Credibility, Fairness, Accountability, Transparency, and Explainability Framework for Safe and Effective Large Language Models in Medical Education: Narrative Review and Qualitative Study.","authors":"Majdi Quttainah, Vinaytosh Mishra, Somayya Madakam, Yotam Lurie, Shlomo Mark","doi":"10.2196/51834","DOIUrl":"10.2196/51834","url":null,"abstract":"<p><strong>Background: </strong>The world has witnessed increased adoption of large language models (LLMs) in the last year. Although the products developed using LLMs have the potential to solve accessibility and efficiency problems in health care, there is a lack of available guidelines for developing LLMs for health care, especially for medical education.</p><p><strong>Objective: </strong>The aim of this study was to identify and prioritize the enablers for developing successful LLMs for medical education. We further evaluated the relationships among these identified enablers.</p><p><strong>Methods: </strong>A narrative review of the extant literature was first performed to identify the key enablers for LLM development. We additionally gathered the opinions of LLM users to determine the relative importance of these enablers using an analytical hierarchy process (AHP), which is a multicriteria decision-making method. Further, total interpretive structural modeling (TISM) was used to analyze the perspectives of product developers and ascertain the relationships and hierarchy among these enablers. Finally, the cross-impact matrix-based multiplication applied to a classification (MICMAC) approach was used to determine the relative driving and dependence powers of these enablers. A nonprobabilistic purposive sampling approach was used for recruitment of focus groups.</p><p><strong>Results: </strong>The AHP demonstrated that the most important enabler for LLMs was credibility, with a priority weight of 0.37, followed by accountability (0.27642) and fairness (0.10572). In contrast, usability, with a priority weight of 0.04, showed negligible importance. The results of TISM concurred with the findings of the AHP. The only striking difference between expert perspectives and user preference evaluation was that the product developers indicated that cost has the least importance as a potential enabler. The MICMAC analysis suggested that cost has a strong influence on other enablers. The inputs of the focus group were found to be reliable, with a consistency ratio less than 0.1 (0.084).</p><p><strong>Conclusions: </strong>This study is the first to identify, prioritize, and analyze the relationships of enablers of effective LLMs for medical education. Based on the results of this study, we developed a comprehendible prescriptive framework, named CUC-FATE (Cost, Usability, Credibility, Fairness, Accountability, Transparency, and Explainability), for evaluating the enablers of LLMs in medical education. 
The study findings are useful for health care professionals, health technology experts, medical technology regulators, and policy makers.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"3 ","pages":"e51834"},"PeriodicalIF":0.0,"publicationDate":"2024-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11077408/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141322099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0