Journal of Healthcare Informatics Research: Latest Articles

With the unprecedented growth of biomedical publications, structured abstracts in bibliographic databases (e.g., PubMed) are important for facilitating information retrieval and knowledge synthesis by researchers. Here, we propose a few-shot prompt learning-based approach to classify sentences in medical abstracts of randomized clinical trials (RCT) and observational studies (OS) into the subsections Introduction, Background, Methods, Results, and Conclusion, using an existing corpus of RCT (PubMed 200k/20k RCT) and a newly built corpus of OS (PubMed 20k OS). Five manually designed templates, in combination with four BERT model variants, were tested and compared with a previous hierarchical sequential labeling network (HSLN) architecture and a traditional BERT-based sentence classification method. On the PubMed 200k and 20k RCT datasets, we achieved overall F1 scores of 0.9508 and 0.9401, respectively. Under few-shot settings, we demonstrated that only 20% of the training data is sufficient to achieve an F1 score comparable to that of the HSLN model (0.9266 for our method versus 0.9263 for HSLN). When trained on the RCT dataset, our method achieved an F1 score of 0.9065 on the OS dataset. When trained on the OS dataset, our method achieved an F1 score of 0.9203 on the RCT dataset. We show that the prompt learning-based method outperformed the existing method, even when fewer training samples were used. Moreover, the proposed method shows better generalizability across the two types of medical publications than the existing approach. We make the datasets and code publicly available at: https://github.com/YanHu-or-SawyerHu/prompt-learning-based-sentence-classifier-in-medical-abstracts.
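As a rough illustration of the prompt-learning idea described in this abstract, the sketch below wraps a sentence in a cloze template and maps the word predicted at the mask position to a section label. The template text, the verbalizer words, and the `mask_scores` input are all hypothetical: the abstract does not list the five templates, and a real system would obtain the scores from a BERT-style masked language model.

```python
from typing import Dict

# Hypothetical cloze template; the paper's actual templates are not given in the abstract.
TEMPLATE = "{sentence} This sentence belongs to the [MASK] section."

# Hypothetical verbalizer: label word predicted at [MASK] -> section label.
VERBALIZER: Dict[str, str] = {
    "introduction": "Introduction",
    "background": "Background",
    "methods": "Methods",
    "results": "Results",
    "conclusion": "Conclusion",
}


def build_prompt(sentence: str, template: str = TEMPLATE) -> str:
    """Insert the (whitespace-trimmed) sentence into the cloze template."""
    return template.format(sentence=sentence.strip())


def classify_from_mask_scores(mask_scores: Dict[str, float]) -> str:
    """Return the section whose label word scores highest at [MASK].

    Words that are not in the verbalizer are ignored.
    """
    candidates = {w: s for w, s in mask_scores.items() if w in VERBALIZER}
    if not candidates:
        raise ValueError("no verbalizer word found among the mask scores")
    best_word = max(candidates, key=candidates.get)
    return VERBALIZER[best_word]
```

Restricting the decision to verbalizer words is what lets a masked language model act as a classifier with little task-specific training, which is one plausible reading of why the approach copes with few labeled examples.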


Citation: Yan Hu, Yong Chen, Hua Xu. "Towards More Generalizable and Accurate Sentence Classification in Medical Abstracts with Less Data." Journal of Healthcare Informatics Research, 1(1), pages 542-556. IF 5.9. Pub Date: 2023-08-30. DOI: 10.1007/s41666-023-00141-6. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10620359/pdf/

Citations: 0

Extracting Complementary and Integrative Health Approaches in Electronic Health Records.

IF 5.9 Q1 Computer Science

Journal of Healthcare Informatics Research

Pub Date : 2023-08-17 eCollection Date: 2023-09-01 DOI: 10.1007/s41666-023-00137-2

Huixue Zhou, Greg Silverman, Zhongran Niu, Jenzi Silverman, Roni Evans, Robin Austin, Rui Zhang

Complementary and Integrative Health (CIH) has gained increasing popularity in the past decades. Although the evidence base supporting CIH approaches is growing, there is still a gap in understanding their effects and potential adverse events using real-world data. The overall goal of this study is to represent information pertinent to both psychological and physical CIH approaches (specifically, using the examples of music therapy, chiropractic, and aquatic exercise) in an electronic health record (EHR) system. We also aim to evaluate the ability of existing natural language processing (NLP) systems to identify CIH approaches. A total of 300 notes were randomly selected and manually annotated. Annotations were made for the status, symptom, and frequency of each approach. This set of annotations was used as a gold standard to evaluate the performance of the NLP systems used in this study (specifically BioMedICUS, MetaMap, and cTAKES) for extracting CIH concepts. A Venn diagram was used to investigate the consistency between medical record searches by Current Procedural Terminology (CPT) codes and searches by CIH approach keywords in SQL. Because CPT codes usually do not mention CIH approaches specifically, the records found by CPT codes overlapped little with those found in clinical notes for all three CIH therapies. The three NLP systems achieved an average lenient-match F1-score of 0.41 across the three CIH approaches. BioMedICUS achieved the best performance in aquatic exercise, with an F1-score of 0.66. This study contributes to the overall representation of CIH in clinical notes and lays a foundation for using EHRs in clinical research on CIH approaches.
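The lenient-match F1-score reported above can be illustrated with a small sketch. It assumes a common definition that the abstract does not spell out: a predicted span counts as correct if it overlaps any gold span, and a gold span counts as found if any prediction overlaps it. The `Span` type and function names are illustrative only.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open character offsets [start, end)


def overlaps(a: Span, b: Span) -> bool:
    """True if the two half-open spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def lenient_f1(gold: List[Span], pred: List[Span]) -> float:
    """F1 under lenient (overlap-based) matching; returns 0.0 when undefined."""
    matched_pred = sum(1 for p in pred if any(overlaps(p, g) for g in gold))
    matched_gold = sum(1 for g in gold if any(overlaps(g, p) for p in pred))
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, with gold spans `[(0, 5), (10, 20)]` and predictions `[(3, 8), (30, 35)]`, one of two predictions overlaps a gold span and one of two gold spans is found, so precision, recall, and F1 are all 0.5.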


Citation: Huixue Zhou, Greg Silverman, Zhongran Niu, Jenzi Silverman, Roni Evans, Robin Austin, Rui Zhang. "Extracting Complementary and Integrative Health Approaches in Electronic Health Records." Journal of Healthcare Informatics Research, 7(3), pages 277-290. IF 5.9. Pub Date: 2023-08-17. DOI: 10.1007/s41666-023-00137-2. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10449701/pdf/

Citations: 0

Introduction and Comparison of Novel Decentral Learning Schemes with Multiple Data Pools for Privacy-Preserving ECG Classification.

IF 5.9 Q1 Computer Science

Journal of Healthcare Informatics Research

Pub Date : 2023-08-17 eCollection Date: 2023-09-01 DOI: 10.1007/s41666-023-00142-5

Martin Baumgartner, Sai Pavan Kumar Veeranki, Dieter Hayn, Günter Schreier

Artificial intelligence and machine learning have led to prominent and spectacular innovations in various scenarios. Application in medicine, however, can be challenging due to privacy concerns and strict legal regulations. Methods that centralize knowledge instead of data could address this issue. In this work, six different decentralized machine learning algorithms are applied to 12-lead ECG classification and compared with conventional, centralized machine learning. The results show that state-of-the-art federated learning leads to reasonable losses of classification performance compared with a standard, central model (-0.054 AUROC) while providing a significantly higher level of privacy. A proposed weighted variant of federated learning (-0.049 AUROC) and an ensemble (-0.035 AUROC) outperformed the standard federated learning algorithm. Overall, considering multiple metrics, the novel batch-wise sequential learning scheme performed best (-0.036 AUROC relative to baseline). Although the technical aspects of implementing them in a real-world application must be carefully considered, the described algorithms constitute a way forward towards privacy-preserving AI in medicine.
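To make the phrase "centralize knowledge instead of data" concrete, the sketch below shows the aggregation step of standard federated averaging, in which each data pool trains locally and only model parameters are combined, weighted by local sample count. This is an assumption-laden illustration of the general technique, not the weighted variant proposed in the paper, whose weighting scheme the abstract does not describe; parameters are flattened to plain lists for simplicity.

```python
from typing import List, Sequence


def weighted_average(
    client_params: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine per-client parameter vectors, weighting each by its sample count."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client parameter vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    n_params = len(client_params[0])
    if any(len(p) != n_params for p in client_params):
        raise ValueError("all clients must share the same parameter layout")
    return [
        sum(p[i] * size for p, size in zip(client_params, client_sizes)) / total
        for i in range(n_params)
    ]
```

With two pools holding 1 and 3 samples and parameters `[1.0, 2.0]` and `[3.0, 4.0]`, the aggregate is `[2.5, 3.5]`: the larger pool pulls the shared model towards its own parameters, while no raw ECG data leaves either site.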


Citation: Martin Baumgartner, Sai Pavan Kumar Veeranki, Dieter Hayn, Günter Schreier. "Introduction and Comparison of Novel Decentral Learning Schemes with Multiple Data Pools for Privacy-Preserving ECG Classification." Journal of Healthcare Informatics Research, 7(3), pages 291-312. IF 5.9. Pub Date: 2023-08-17. DOI: 10.1007/s41666-023-00142-5. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10449753/pdf/

Citations: 0

Assessment of Prediction Tasks and Time Window Selection in Temporal Modeling of Electronic Health Record Data: a Systematic Review.

IF 5.9 Q1 Computer Science

Journal of Healthcare Informatics Research

Pub Date : 2023-08-14 eCollection Date: 2023-09-01 DOI: 10.1007/s41666-023-00143-4

Sarah Pungitore, Vignesh Subbian

Temporal electronic health record (EHR) data are often preferred for clinical prediction tasks because they offer more complete representations of a patient's pathophysiology than static data. A challenge when working with temporal EHR data is problem formulation, which includes defining the time windows of interest and the prediction task. Our objective was to conduct a systematic review that assessed the definition and reporting of concepts relevant to temporal clinical prediction tasks. We searched the PubMed® and IEEE Xplore® databases for studies published since January 1, 2010, that applied machine learning models to EHR data for patient outcome prediction. Publications applying time-series methods were selected for further review. We identified 92 studies and summarized them by clinical context and by the definition and reporting of the prediction problem. For the time windows of interest, 12 studies did not discuss window lengths, 57 used a single set of window lengths, and 23 evaluated the relationship between window length and model performance. We also found that 72 studies had appropriate reporting of the prediction task. However, evaluation of prediction problem formulation for temporal EHR data was complicated by heterogeneity in the assessment and reporting of these concepts. Even among studies modeling similar clinical outcomes, there were variations in the terminology used to describe the prediction problem, the rationale for window lengths, and the determination of the outcome of interest. As temporal modeling using EHR data expands, minimal reporting standards should include time-series-specific concerns to promote rigor and reproducibility in future studies and facilitate model implementation in clinical settings.

Supplementary information: The online version contains supplementary material available at 10.1007/s41666-023-00143-4.
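The "time windows of interest" discussed in this review can be sketched as follows. The terminology (observation window, gap, prediction window) and the half-open interval conventions are assumptions chosen for illustration; as the abstract notes, individual studies define and name these windows differently.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Event = Tuple[datetime, str]  # (timestamp, measurement name)


def observation_events(
    events: List[Event], index_time: datetime, obs_len: timedelta
) -> List[Event]:
    """Keep events inside the observation window [index_time - obs_len, index_time)."""
    start = index_time - obs_len
    return [e for e in events if start <= e[0] < index_time]


def outcome_label(
    outcome_time: Optional[datetime],
    index_time: datetime,
    gap: timedelta,
    pred_len: timedelta,
) -> int:
    """Return 1 if the outcome falls in [index_time + gap, index_time + gap + pred_len)."""
    if outcome_time is None:
        return 0
    start = index_time + gap
    return int(start <= outcome_time < start + pred_len)
```

Making `obs_len`, `gap`, and `pred_len` explicit parameters is one way to support the reporting the review calls for, since each value then has to be stated and can be varied to study its effect on model performance.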


Citation: Sarah Pungitore, Vignesh Subbian. "Assessment of Prediction Tasks and Time Window Selection in Temporal Modeling of Electronic Health Record Data: a Systematic Review." Journal of Healthcare Informatics Research, 7(3), pages 313-331. IF 5.9. Pub Date: 2023-08-14. DOI: 10.1007/s41666-023-00143-4. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10449760/pdf/

Citations: 0

Survival Prediction After Transarterial Chemoembolization for Hepatocellular Carcinoma: a Deep Multitask Survival Analysis Approach.

IF 5.9 Q1 Computer Science

Journal of Healthcare Informatics Research

Pub Date : 2023-07-31 eCollection Date: 2023-09-01 DOI: 10.1007/s41666-023-00139-0

Guo Huang, Huijun Liu, Shu Gong, Yongxin Ge

The accurate prediction of postoperative survival time of patients with Barcelona Clinic Liver Cancer (BCLC) stage B hepatocellular carcinoma (HCC) is important for postoperative health care. Survival analysis is a common method used to predict the occurrence time of events of interest in the medical field. At present, mainstream survival analysis models, such as the Cox proportional hazards model, must make strict assumptions about the underlying stochastic process in order to handle censored data, which potentially limits their application in clinical practice. In this paper, we propose a novel deep multitask survival model (DMSM) to analyze HCC survival data. Specifically, DMSM transforms the traditional survival time prediction problem for patients with HCC into a survival probability prediction problem at multiple time points and applies entropy regularization and a ranking loss to optimize a multitask neural network. Compared with traditional methods that delete censored data or rely on strong assumptions, DMSM makes full use of the information in the censored data without requiring such assumptions. In addition, we identify the risk factors affecting the prognosis of patients with HCC and visualize the importance ranking of these factors. Based on the analysis of a real dataset of patients with BCLC stage B HCC, experimental results on three different validation datasets show that the DMSM achieves competitive performance, with concordance indices of 0.779, 0.727, and 0.780 and integrated Brier scores (IBS) of 0.172, 0.138, and 0.135, respectively. Our DMSM has a comparatively small standard deviation of the IBS (0.002, 0.002, and 0.003) over 100 bootstrap replicates. The proposed DMSM can be used as an effective survival analysis model and provides an important means for the accurate prediction of postoperative survival time of patients with BCLC stage B HCC.
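The concordance index cited above measures how well predicted risks order patients by survival time while respecting censoring. The sketch below implements Harrell's C-index in its usual pairwise form; it is an illustrative assumption about the metric's definition, not the authors' evaluation code, and the argument names are hypothetical.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] == 1) and times[i] < times[j]. The pair is concordant when
    the earlier-failing subject i has the higher predicted risk; tied
    risks count as half.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering. Censored patients still contribute as the later member of a pair, which illustrates how censored records carry usable information rather than being discarded.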


Citation: Guo Huang, Huijun Liu, Shu Gong, Yongxin Ge. "Survival Prediction After Transarterial Chemoembolization for Hepatocellular Carcinoma: a Deep Multitask Survival Analysis Approach." Journal of Healthcare Informatics Research, 7(3), pages 332-358. IF 5.9. Pub Date: 2023-07-31. DOI: 10.1007/s41666-023-00139-0. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10449707/pdf/

Citations: 0
