Accurate blood glucose (BG) prediction is of great benefit to the treatment of diabetes. Generally, clinical physicians must comprehensively analyze various factors, such as a patient's body temperature, meals, sleep, insulin injections, continuous glucose monitoring (CGM) readings, and other information, to evaluate the fluctuation trend of blood glucose. To address this problem, this paper proposes a multivariate blood glucose prediction method based on mixed feature clustering. It clusters time series data with diverse or mixed features related to blood glucose, effectively leveraging their correlations and distribution characteristics. By combining incremental clustering of multivariate time series with transfer learning, the method achieves online prediction of blood glucose levels. The experimental results indicate that the proposed method decreases the prediction error (RMSE) by 4.2% at a prediction horizon (PH) of 30 min and by 5.9% at PH = 60 min. Compared with other prediction methods, the training time of the multivariate prediction method is reduced by 5.2% (PH = 30 min) and 4.7% (PH = 60 min). The method was also validated and compared with other approaches on a real dataset, showing lower prediction error and better prediction performance at PH = 30, 45, 60, 75, and 90 min. Compared with traditional univariate and multivariate time series prediction methods, the approach proposed in this paper significantly improves the accuracy and robustness of blood glucose prediction. According to the evaluation results on the OhioT1DM dataset and data from the Sixth People's Hospital of Shanghai, the proposed method has better generalization performance and clinical acceptability.
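The abstract does not spell out implementation details; purely as a rough Python sketch of the general idea, the snippet below incrementally clusters multivariate windows with MiniBatchKMeans and warm-starts a per-cluster regressor from a shared global model (the "transfer" step) before updating it online. The window layout, feature scaling, and choice of estimators are illustrative assumptions, not the authors' method.

```python
# Illustrative sketch only: incremental clustering of multivariate windows plus
# per-cluster models warm-started ("transferred") from a shared global model.
# Feature layout, window length, and estimators are assumptions, not the paper's method.
import copy
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
WINDOW = 12          # hypothetical: 12 past CGM samples (1 h at 5-min resolution)
N_CLUSTERS = 3       # hypothetical number of mixed-feature clusters

# Toy multivariate stream: [CGM window..., insulin dose, carbohydrate intake]
def make_sample():
    cgm = 120 + np.cumsum(rng.normal(0, 3, WINDOW))
    insulin, carbs = rng.uniform(0, 5), rng.uniform(0, 60)
    x = np.concatenate([cgm / 180.0, [insulin / 5.0, carbs / 60.0]])  # crude normalization
    y = (cgm[-1] + rng.normal(0, 5)) / 180.0        # future BG value, same scale
    return x, y

clusterer = MiniBatchKMeans(n_clusters=N_CLUSTERS, random_state=0)
global_model = SGDRegressor(random_state=0)
cluster_models = {}

# Warm-up phase: fit the clusterer and the global model on an initial batch.
X0, y0 = zip(*[make_sample() for _ in range(200)])
X0, y0 = np.array(X0), np.array(y0)
clusterer.partial_fit(X0)
global_model.partial_fit(X0, y0)

# Online phase: route each new window to a cluster; the cluster model starts as a
# copy of the global model (the "transfer") and is then updated incrementally.
for _ in range(500):
    x, y = make_sample()
    x = x.reshape(1, -1)
    clusterer.partial_fit(x)
    c = int(clusterer.predict(x)[0])
    if c not in cluster_models:
        cluster_models[c] = copy.deepcopy(global_model)
    y_hat = cluster_models[c].predict(x)[0]     # online prediction before updating
    cluster_models[c].partial_fit(x, [y])
    global_model.partial_fit(x, [y])
```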
{"title":"A new multivariate blood glucose prediction method with hybrid feature clustering and online transfer learning.","authors":"Fuqiang You, Guo Zhao, Xinyu Zhang, Ziheng Zhang, Jinli Cao, Hongru Li","doi":"10.1007/s13755-024-00313-7","DOIUrl":"10.1007/s13755-024-00313-7","url":null,"abstract":"<p><p>Accurate blood glucose (BG) prediction is greatly benefit for the treatment of diabetes. Generally, clinical physicians are required to comprehensively analyze various factors, such as patient's body temperature, meal, sleep, insulin injection, continuous glucose monitoring (CGM), and other information, to evaluate the fluctuation trend of blood glucose. To address this problem, this paper proposes a multivariate blood glucose prediction method based on mixed feature clustering. It clusters time series data with diverse or mixed features related to blood glucose, effectively leveraging correlations and distribution characteristics. By combining incremental clustering of multivariate time series with transfer learning, this method achieves online prediction of blood glucose levels. The experimental results indicate that the proposed method can decrease the prediction error RMSE by 4.2% (PH=30min) and 5.9% (PH=60min). Compared with other prediction methods, the training time of the multivariate prediction method is reduced by 5.2% (PH=30min) and 4.7% (PH=60min). It was also validated and compared with other methods in a real dataset. The proposed method in this study has lower prediction error and better prediction performance in the prediction horizon (PH) of PH=30, 45, 60, 75, and 90 min, respectively. Compared with the traditional unitary and multivariate time series prediction method, the approach proposed in this paper significantly improves the accuracy and robustness of blood glucose prediction. According to the evaluation results on the data set from OhioT1DM and the Sixth People's Hospital of Shanghai, the proposed method has better generalization performance and clinical acceptability.</p>","PeriodicalId":46312,"journal":{"name":"Health Information Science and Systems","volume":"12 1","pages":"57"},"PeriodicalIF":4.7,"publicationDate":"2024-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11570574/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142677071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-16. eCollection Date: 2024-12-01. DOI: 10.1007/s13755-024-00314-6
Xi Cao, Yong-Feng Ge, Kate Wang, Ying Lin
Purpose: Cognitive diagnostic tests (CDTs) assess cognitive skills at a fine-grained level, providing detailed insights into the mastery profiles of test-takers. Constructing CDTs that satisfy many practical requirements is challenging, and traditional construction algorithms address these challenges only partially, focusing on a limited number of constraints. This paper utilizes a meta-heuristic algorithm to produce high-quality tests while handling more constraints simultaneously.
Methods: This paper presents a memetic ant colony optimization (MACO) algorithm for constructing CDTs while considering multiple constraints. The MACO method utilizes pheromone trails to represent successful test constructions from the past. Additionally, it innovatively integrates item quality and constraint adherence into heuristic information to manage multiple constraints simultaneously. The method evaluates the assembled tests based on the diagnosis index and constraint satisfaction. Another innovation of MACO is the incorporation of a local search strategy to further enhance diagnostic accuracy by partially optimizing item selection. The optimal local search parameter settings are explored through a parameter investigation. A series of simulation experiments validate the effectiveness of MACO under various conditions.
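As an informal illustration of the construction scheme just described (pheromone-guided item selection plus a memetic local-search step), the following Python sketch assembles a fixed-length test from a toy item bank. The fitness function, constraint definition, and parameter values are simplifying assumptions rather than the paper's formulation.

```python
# Illustrative sketch of the general MACO idea (ant colony construction + local search)
# for assembling a fixed-length test from an item bank. The fitness, heuristic and
# constraint definitions here are simplified stand-ins, not the paper's formulation.
import numpy as np

rng = np.random.default_rng(1)
N_ITEMS, TEST_LEN, N_ANTS, N_ITER = 100, 20, 15, 30
quality = rng.uniform(0.2, 1.0, N_ITEMS)          # e.g., item discrimination
category = rng.integers(0, 4, N_ITEMS)            # content category per item
TARGET_PER_CAT = TEST_LEN // 4                     # constraint: balanced categories

def constraint_penalty(test):
    counts = np.bincount(category[test], minlength=4)
    return np.abs(counts - TARGET_PER_CAT).sum()

def fitness(test):
    # Reward average item quality, penalize constraint violations.
    return quality[test].mean() - 0.05 * constraint_penalty(test)

pheromone = np.ones(N_ITEMS)
heuristic = quality.copy()                         # heuristic information: item quality
best_test, best_fit = None, -np.inf

for _ in range(N_ITER):
    for _ in range(N_ANTS):
        # Probabilistic construction guided by pheromone and heuristic information.
        probs = pheromone * heuristic
        probs = probs / probs.sum()
        test = rng.choice(N_ITEMS, size=TEST_LEN, replace=False, p=probs)
        # Memetic step: simple local search that tries single-item swaps.
        for _ in range(10):
            out_pos = rng.integers(TEST_LEN)
            candidate = rng.integers(N_ITEMS)
            if candidate in test:
                continue
            trial = test.copy()
            trial[out_pos] = candidate
            if fitness(trial) > fitness(test):
                test = trial
        f = fitness(test)
        if f > best_fit:
            best_test, best_fit = test, f
    # Pheromone update: evaporation plus deposit on items of the best test so far.
    pheromone *= 0.9
    pheromone[best_test] += max(best_fit, 0.0)

print("best fitness:", round(best_fit, 3),
      "category counts:", np.bincount(category[best_test], minlength=4))
```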
Results: The results demonstrate the strong ability of meta-heuristic algorithms to handle multiple constraints and achieve high statistical performance. MACO exhibited superior performance in generating high-quality CDTs while meeting multiple constraints, particularly for mixed- and low-discrimination item banks. It achieved faster convergence than standard ant colony optimization in most scenarios.
Conclusions: MACO provides an effective solution for multi-constrained CDT construction, especially for shorter tests and item banks with mixed or lower discrimination. The experimental results also suggest that the suitability of different optimization approaches may depend on specific test conditions, such as the characteristics of the item bank and the length of the test.
{"title":"Memetic ant colony optimization for multi-constrained cognitive diagnostic test construction.","authors":"Xi Cao, Yong-Feng Ge, Kate Wang, Ying Lin","doi":"10.1007/s13755-024-00314-6","DOIUrl":"10.1007/s13755-024-00314-6","url":null,"abstract":"<p><strong>Purpose: </strong>Cognitive diagnostic tests (CDTs) assess cognitive skills at a more granular level, providing detailed insights into the mastery profile of test-takers. Traditional algorithms for constructing CDTs have partially addressed these challenges, focusing on a limited number of constraints. This paper intends to utilize a meta-heuristic algorithm to produce high-quality tests and handle more constraints simultaneously.</p><p><strong>Methods: </strong>This paper presents a memetic ant colony optimization (MACO) algorithm for constructing CDTs while considering multiple constraints. The MACO method utilizes pheromone trails to represent successful test constructions from the past. Additionally, it innovatively integrates item quality and constraint adherence into heuristic information to manage multiple constraints simultaneously. The method evaluates the assembled tests based on the diagnosis index and constraint satisfaction. Another innovation of MACO is the incorporation of a local search strategy to further enhance diagnostic accuracy by partially optimizing item selection. The optimal local search parameter settings are explored through a parameter investigation. A series of simulation experiments validate the effectiveness of MACO under various conditions.</p><p><strong>Results: </strong>The results demonstrate the great ability of meta-heuristic algorithms to handle multiple constraints and achieve high statistical performance. MACO exhibited superior performance in generating high-quality CDTs while meeting multiple constraints, particularly for mixed and low discrimination item banks. It achieved faster convergence than the ant colony optimization in most scenarios.</p><p><strong>Conclusions: </strong>MACO provides an effective solution for multi-constrained CDT construction, especially for shorter tests and item banks with mixed or lower discrimination. The experimental results also suggest that the suitability of different optimization approaches may depend on specific test conditions, such as the characteristics of the item bank and the length of the test.</p>","PeriodicalId":46312,"journal":{"name":"Health Information Science and Systems","volume":"12 1","pages":"56"},"PeriodicalIF":4.7,"publicationDate":"2024-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11569084/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142668916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Over the past few decades, a variety of significant scientific breakthroughs have been achieved in the fields of brain encoding and decoding using functional magnetic resonance imaging (fMRI). Many studies have examined how the human brain reacts to visual stimuli. However, the relationship between fMRI images and the video sequences viewed by humans remains complex and is often studied using large transformer models. In this paper, we investigate the correlation between videos presented to participants during an experiment and the resulting fMRI images. To achieve this, we propose a method for creating a linear model that predicts changes in fMRI signals based on video sequence images. A linear model is constructed for each individual voxel in the fMRI image, assuming that the image sequence satisfies the Markov property. Through comprehensive qualitative experiments, we demonstrate the relationship between the two time series. We hope that our findings contribute to a deeper understanding of the human brain's reaction to external stimuli and provide a basis for future research in this area.
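To make the modeling idea concrete, the following Python sketch fits one least-squares model per voxel that predicts the next fMRI value from the previous value (the Markov assumption) and features of the current video frame. The synthetic data and feature choices are assumptions for illustration only, not the paper's experimental setup.

```python
# Illustrative sketch: one linear model per voxel, predicting the next fMRI value
# from the previous value (Markov assumption) and features of the current video frame.
# Synthetic data and plain least squares are used here purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
T, N_VOXELS, N_FRAME_FEATURES = 200, 50, 10

video_feats = rng.normal(size=(T, N_FRAME_FEATURES))       # e.g., per-frame embeddings
true_w = rng.normal(size=(N_VOXELS, N_FRAME_FEATURES))
fmri = np.zeros((T, N_VOXELS))
for t in range(1, T):                                       # toy signal: AR(1) + stimulus drive
    fmri[t] = 0.6 * fmri[t - 1] + 0.1 * video_feats[t] @ true_w.T + rng.normal(0, 0.1, N_VOXELS)

# Design matrix for time t: [previous voxel value, current frame features, bias].
coeffs = np.zeros((N_VOXELS, N_FRAME_FEATURES + 2))
for v in range(N_VOXELS):
    X = np.column_stack([fmri[:-1, v], video_feats[1:], np.ones(T - 1)])
    y = fmri[1:, v]
    coeffs[v], *_ = np.linalg.lstsq(X, y, rcond=None)

# One-step-ahead prediction for the last time point of each voxel.
X_last = np.column_stack([fmri[-2:-1, :].T,
                          np.tile(video_feats[-1], (N_VOXELS, 1)),
                          np.ones((N_VOXELS, 1))])
pred = np.einsum("vf,vf->v", coeffs, X_last)
print("mean absolute one-step error:", np.abs(pred - fmri[-1]).mean())
```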
{"title":"Forecasting fMRI images from video sequences: linear model analysis.","authors":"Daniil Dorin, Nikita Kiselev, Andrey Grabovoy, Vadim Strijov","doi":"10.1007/s13755-024-00315-5","DOIUrl":"10.1007/s13755-024-00315-5","url":null,"abstract":"<p><p>Over the past few decades, a variety of significant scientific breakthroughs have been achieved in the fields of brain encoding and decoding using the functional magnetic resonance imaging (fMRI). Many studies have been conducted on the topic of human brain reaction to visual stimuli. However, the relationship between fMRI images and video sequences viewed by humans remains complex and is often studied using large transformer models. In this paper, we investigate the correlation between videos presented to participants during an experiment and the resulting fMRI images. To achieve this, we propose a method for creating a linear model that predicts changes in fMRI signals based on video sequence images. A linear model is constructed for each individual voxel in the fMRI image, assuming that the image sequence follows a Markov property. Through the comprehensive qualitative experiments, we demonstrate the relationship between the two time series. We hope that our findings contribute to a deeper understanding of the human brain's reaction to external stimuli and provide a basis for future research in this area.</p>","PeriodicalId":46312,"journal":{"name":"Health Information Science and Systems","volume":"12 1","pages":"55"},"PeriodicalIF":4.7,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11568086/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142648946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Purpose: Kidney stone disease (KSD) is a common urological disorder with an increasing incidence worldwide. The extensive knowledge about KSD is dispersed across multiple databases, making it difficult to visualize and represent its hierarchy and connections. This paper aims to construct a disease-specific knowledge graph for KSD to enhance the effective utilization of knowledge by medical professionals and to promote clinical research and discovery.
Methods: Text parsing and semantic analysis were conducted on KSD-related literature from PubMed, and concept annotation based on biomedical ontologies was used to generate semantic data in RDF format. Moreover, public databases were integrated to construct a large-scale knowledge graph for KSD. Additionally, case studies were carried out to demonstrate the practical utility of the developed knowledge graph.
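As a small illustration of how annotated concepts can be turned into RDF triples, the following Python sketch uses rdflib. The namespace, predicates, and identifiers are invented for the example and are not the KSDKG schema.

```python
# Illustrative sketch of turning annotated concepts from an article into RDF triples
# with rdflib. The namespace, predicate names, and identifiers are made up for the
# example and are not the KSDKG schema.
from rdflib import Graph, Literal, Namespace, RDF, URIRef

EX = Namespace("http://example.org/ksdkg/")        # hypothetical namespace
g = Graph()
g.bind("ex", EX)

# Suppose concept annotation of a PubMed abstract produced these (concept, type) pairs.
article = URIRef(EX["pmid_12345678"])              # hypothetical article identifier
annotations = [
    ("kidney_stone_disease", "Disease"),
    ("oxalobacter_formigenes", "Microbe"),
    ("potassium_citrate", "Drug"),
]

g.add((article, RDF.type, EX.Article))
g.add((article, EX.title, Literal("Example article about kidney stone disease")))
for concept_id, concept_type in annotations:
    concept = URIRef(EX[concept_id])
    g.add((concept, RDF.type, EX[concept_type]))
    g.add((article, EX.mentions, concept))         # article-to-concept edge

print(g.serialize(format="turtle"))
```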
Results: We proposed and implemented a Kidney Stone Disease Knowledge Graph (KSDKG) covering more than 90 million triples. The graph comprises semantic data extracted from 29,174 articles, integrating available data from UMLS, SNOMED CT, MeSH, DrugBank and the Microbe-Disease Knowledge Graph. Through three application cases, we retrieved and discovered information on microbes, drugs and diseases associated with KSD. The results illustrate that the KSDKG can integrate diverse medical knowledge and provide new clinical insights for identifying the underlying mechanisms of KSD.
Conclusion: The KSDKG efficiently uses the knowledge graph to reveal hidden knowledge associations, facilitating semantic search and query answering. As a blueprint for developing disease-specific knowledge graphs, it offers valuable contributions to medical research.
{"title":"KSDKG: construction and application of knowledge graph for kidney stone disease based on biomedical literature and public databases.","authors":"Jianping Man, Yufei Shi, Zhensheng Hu, Rui Yang, Zhisheng Huang, Yi Zhou","doi":"10.1007/s13755-024-00309-3","DOIUrl":"10.1007/s13755-024-00309-3","url":null,"abstract":"<p><strong>Purpose: </strong>Kidney stone disease (KSD) is a common urological disorder with an increasing incidence worldwide. The extensive knowledge about KSD is dispersed across multiple databases, challenging the visualization and representation of its hierarchy and connections. This paper aims at constructing a disease-specific knowledge graph for KSD to enhance the effective utilization of knowledge by medical professionals and promote clinical research and discovery.</p><p><strong>Methods: </strong>Text parsing and semantic analysis were conducted on literature related to KSD from PubMed, with concept annotation based on biomedical ontology being utilized to generate semantic data in RDF format. Moreover, public databases were integrated to construct a large-scale knowledge graph for KSD. Additionally, case studies were carried out to demonstrate the practical utility of the developed knowledge graph.</p><p><strong>Results: </strong>We proposed and implemented a Kidney Stone Disease Knowledge Graph (KSDKG), covering more than 90 million triples. This graph comprised semantic data extracted from 29,174 articles, integrating available data from UMLS, SNOMED CT, MeSH, DrugBank and Microbe-Disease Knowledge Graph. Through the application of three cases, we retrieved and discovered information on microbes, drugs and diseases associated with KSD. The results illustrated that the KSDKG can integrate diverse medical knowledge and provide new clinical insights for identifying the underlying mechanisms of KSD.</p><p><strong>Conclusion: </strong>The KSDKG efficiently utilizes knowledge graph to reveal hidden knowledge associations, facilitating semantic search and response. As a blueprint for developing disease-specific knowledge graphs, it offers valuable contributions to medical research.</p>","PeriodicalId":46312,"journal":{"name":"Health Information Science and Systems","volume":"12 1","pages":"54"},"PeriodicalIF":4.7,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11564440/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142648856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-12. eCollection Date: 2024-12-01. DOI: 10.1007/s13755-024-00311-9
Manlu He, Erwin M Bakker, Michael S Lew
Depression is one of the most prevalent mental conditions; it can impair people's productivity and lead to severe consequences. The diagnosis of this disease is complex, as it often relies on a physician's subjective interview-based screening. The aim of our work is to propose deep learning models for automatic depression detection using different data modalities, which could assist in the diagnosis of depression. Current work on automatic depression detection is mostly tested on a single dataset, which may limit robustness, flexibility and scalability. To alleviate this problem, we design a novel Graph Neural Network-enhanced Transformer model named DePressionDetect Net (DPD Net) that leverages textual, audio and visual features and can work under two different application settings: the clinical setting and the social media setting. The model consists of a unimodal encoder module for encoding single modalities, a multimodal encoder module for integrating the multimodal information, and a detection module for producing the final prediction. We also propose a model named DePressionDetect-with-EEG Net (DPD-E Net) to incorporate electroencephalography (EEG) signals and speech data for depression detection. Experiments across four benchmark datasets show that DPD Net and DPD-E Net can outperform the state-of-the-art models on three datasets (i.e., the E-DAIC dataset, the Twitter depression dataset and the MODMA dataset), and achieve competitive performance on the fourth one (i.e., the D-vlog dataset). Ablation studies demonstrate the advantages of the proposed modules and the effectiveness of combining diverse modalities for automatic depression detection.
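The abstract names only the module layout; the following PyTorch sketch mirrors that three-module structure (unimodal encoders, a multimodal encoder, a detection head) with placeholder layers. All dimensions and layer choices are assumptions, and the actual DPD Net uses GNN-enhanced Transformer components that are not reproduced here.

```python
# Minimal sketch of the three-module layout described in the abstract: unimodal
# encoders, a multimodal encoder, and a detection head. Layer choices and sizes are
# assumptions; the paper's model uses GNN-enhanced Transformer components.
import torch
import torch.nn as nn

class ToyDepressionDetector(nn.Module):
    def __init__(self, text_dim=768, audio_dim=128, visual_dim=256, d_model=128):
        super().__init__()
        # Unimodal encoder module: one projection per modality (stand-in for real encoders).
        self.text_enc = nn.Linear(text_dim, d_model)
        self.audio_enc = nn.Linear(audio_dim, d_model)
        self.visual_enc = nn.Linear(visual_dim, d_model)
        # Multimodal encoder module: a Transformer over the three modality tokens.
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)
        # Detection module: binary depressed / not-depressed prediction.
        self.head = nn.Linear(d_model, 2)

    def forward(self, text, audio, visual):
        tokens = torch.stack(
            [self.text_enc(text), self.audio_enc(audio), self.visual_enc(visual)], dim=1
        )                                    # (batch, 3 modality tokens, d_model)
        fused = self.fusion(tokens).mean(dim=1)
        return self.head(fused)

model = ToyDepressionDetector()
logits = model(torch.randn(4, 768), torch.randn(4, 128), torch.randn(4, 256))
print(logits.shape)  # torch.Size([4, 2])
```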
{"title":"DPD (DePression Detection) Net: a deep neural network for multimodal depression detection.","authors":"Manlu He, Erwin M Bakker, Michael S Lew","doi":"10.1007/s13755-024-00311-9","DOIUrl":"10.1007/s13755-024-00311-9","url":null,"abstract":"<p><p>Depression is one of the most prevalent mental conditions which could impair people's productivity and lead to severe consequences. The diagnosis of this disease is complex as it often relies on a physician's subjective interview-based screening. The aim of our work is to propose deep learning models for automatic depression detection by using different data modalities, which could assist in the diagnosis of depression. Current works on automatic depression detection mostly are tested on a single dataset, which might lack robustness, flexibility and scalability. To alleviate this problem, we design a novel Graph Neural Network-enhanced Transformer model named DePressionDetect Net (DPD Net) that leverages textual, audio and visual features and can work under two different application settings: the clinical setting and the social media setting. The model consists of a unimodal encoder module for encoding single modality, a multimodal encoder module for integrating the multimodal information, and a detection module for producing the final prediction. We also propose a model named DePressionDetect-with-EEG Net (DPD-E Net) to incorporate Electroencephalography (EEG) signals and speech data for depression detection. Experiments across four benchmark datasets show that DPD Net and DPD-E Net can outperform the state-of-the-art models on three datasets (i.e., E-DAIC dataset, Twitter depression dataset and MODMA dataset), and achieve competitive performance on the fourth one (i.e., D-vlog dataset). Ablation studies demonstrate the advantages of the proposed modules and the effectiveness of combining diverse modalities for automatic depression detection.</p>","PeriodicalId":46312,"journal":{"name":"Health Information Science and Systems","volume":"12 1","pages":"53"},"PeriodicalIF":4.7,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11557813/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142630203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-12. eCollection Date: 2024-12-01. DOI: 10.1007/s13755-024-00312-8
Ruichen Cong, Ou Deng, Shoji Nishimura, Atsushi Ogihara, Qun Jin
Purpose: Recent advancements in information technology and wearable devices have revolutionized healthcare through health data analysis. Identifying significant relationships in complex health data enhances healthcare and public health strategies. In health analytics, causal graphs are important for investigating the relationships among health features. However, they face challenges owing to the large number of features, complexity, and computational demands. Feature selection methods are useful for addressing these challenges. In this paper, we present a framework for multiple feature selection based on an optimization strategy for causal analysis of health data.
Methods: We select multiple health features based on an optimization strategy. First, we define a Weighted Total Score (WTS) index to assess feature importance after combining different feature selection methods. To explore an optimal set of weights for each method, we design a multiple feature selection algorithm integrated with a greedy algorithm. The features are then ranked according to their WTS, enabling selection of the most important ones. After that, causal graphs are constructed based on the selected features, and the statistical significance of the paths is assessed. Furthermore, evaluation experiments are conducted on an experimental dataset collected for this study and an open dataset for diabetes.
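The following Python sketch illustrates the general WTS idea under stated assumptions: several feature-scoring methods are normalized and combined with weights, a greedy loop adjusts the weights, and the top-ranked features are evaluated with a simple classifier. The scoring functions, weight steps, and evaluation model are placeholders, not the paper's exact algorithm.

```python
# Illustrative sketch of combining several feature selection methods into a single
# Weighted Total Score (WTS) and greedily searching for the weight of each method.
# The scoring functions, weight grid, and evaluation model are simplifying assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import f_classif, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=20, n_informative=5, random_state=0)

def normalize(s):
    return (s - s.min()) / (s.max() - s.min() + 1e-12)

# Per-method importance scores, normalized to [0, 1] so they can be weighted and summed.
scores = np.vstack([
    normalize(f_classif(X, y)[0]),                                           # ANOVA F-score
    normalize(mutual_info_classif(X, y, random_state=0)),                    # mutual information
    normalize(RandomForestClassifier(random_state=0).fit(X, y).feature_importances_),
])

def evaluate(weights, top_k=8):
    wts = weights @ scores                       # Weighted Total Score per feature
    selected = np.argsort(wts)[::-1][:top_k]
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X[:, selected], y, cv=5).mean()

# Greedy weight search: adjust one method's weight at a time, keeping improvements.
weights = np.ones(scores.shape[0]) / scores.shape[0]
best = evaluate(weights)
for _ in range(3):                               # a few greedy passes
    for i in range(len(weights)):
        for delta in (-0.1, 0.1):
            trial = weights.copy()
            trial[i] = max(trial[i] + delta, 0.0)
            if trial.sum() == 0:
                continue
            trial = trial / trial.sum()
            acc = evaluate(trial)
            if acc > best:
                weights, best = trial, acc

print("learned weights:", np.round(weights, 2), "CV accuracy:", round(best, 3))
```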
Results: The results demonstrate that our approach outperforms baseline models by reducing the number of features while improving model performance. Moreover, the statistical significance of the relationships between features uncovered through causal graphs is validated for both datasets.
Conclusion: By using the proposed framework for multiple feature selection based on an optimization strategy for causal analysis, the number of features is reduced and the causal relationships are uncovered and validated.
{"title":"Multiple feature selection based on an optimization strategy for causal analysis of health data.","authors":"Ruichen Cong, Ou Deng, Shoji Nishimura, Atsushi Ogihara, Qun Jin","doi":"10.1007/s13755-024-00312-8","DOIUrl":"10.1007/s13755-024-00312-8","url":null,"abstract":"<p><strong>Purpose: </strong>Recent advancements in information technology and wearable devices have revolutionized healthcare through health data analysis. Identifying significant relationships in complex health data enhances healthcare and public health strategies. In health analytics, causal graphs are important for investigating the relationships among health features. However, they face challenges owing to the large number of features, complexity, and computational demands. Feature selection methods are useful for addressing these challenges. In this paper, we present a framework for multiple feature selection based on an optimization strategy for causal analysis of health data.</p><p><strong>Methods: </strong>We select multiple health features based on an optimization strategy. First, we define a Weighted Total Score (WTS) index to assess the feature importance after the combination of different feature selection methods. To explore an optimal set of weights for each method, we design a multiple feature selection algorithm integrated with the greedy algorithm. The features are then ranked according to their WTS, enabling selection of the most important ones. After that, causal graphs are constructed based on the selected features, and the statistical significance of the paths is assessed. Furthermore, evaluation experiments are conducted on an experiment dataset collected for this study and an open dataset for diabetes.</p><p><strong>Results: </strong>The results demonstrate that our approach outperforms baseline models by reducing the number of features while improving model performance. Moreover, the statistical significance of the relationships between features uncovered through causal graphs is validated for both datasets.</p><p><strong>Conclusion: </strong>By using the proposed framework for multiple feature selection based on an optimization strategy for causal analysis, the number of features is reduced and the causal relationships are uncovered and validated.</p>","PeriodicalId":46312,"journal":{"name":"Health Information Science and Systems","volume":"12 1","pages":"52"},"PeriodicalIF":4.7,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11554952/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142630210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-11. eCollection Date: 2024-12-01. DOI: 10.1007/s13755-024-00310-w
Zhenzhen Du, Shuang Wang, Ouzhou Yang, Juan He, Yujie Yang, Jing Zheng, Honglei Zhao, Yunpeng Cai
Purpose: Dyslipidemia poses a significant risk for progression to cardiovascular disease (CVD). Despite the identification of numerous risk factors and the proposal of various risk scales, there is still an urgent need for effective models that predict the onset of CVD in the hyperlipidemic population, which are essential for CVD prevention.
Methods: We carried out a retrospective cohort study with 23,548 hyperlipidemia patients in the Shenzhen Health Information Big Data Platform, including 11,723 CVD onset cases over a 3-year follow-up. The population was randomly divided into 70% as an independent training dataset and the remaining 30% as the test set. Four distinct machine-learning algorithms were implemented on the training dataset with the aim of developing highly accurate predictive models, and their performance was subsequently benchmarked against conventional risk assessment scales. An ablation study was also carried out to analyze the impact of individual risk factors on model performance.
Results: The non-linear algorithm LightGBM excelled in forecasting the incidence of cardiovascular disease within 3 years, achieving an area under the receiver operating characteristic curve (AUROC) of 0.883. This performance surpassed that of the conventional logistic regression model, which had an AUROC of 0.725 on identical datasets. Concurrently, in direct comparative analyses, the machine-learning approaches notably outperformed three traditional risk assessment methods within their respective applicable populations: the Framingham cardiovascular disease risk score, the 2019 ESC/EAS guidelines for the management of dyslipidemia, and the 2016 Chinese recommendations for the management of dyslipidemia in adults. Further analysis of risk factors showed that the variability of blood lipid levels and remnant cholesterol played an important role in indicating an increased risk of CVD.
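As a rough, self-contained illustration of this kind of comparison (not the study's data, features, or tuning), the snippet below trains a LightGBM classifier and a logistic regression baseline on synthetic data and reports AUROC on a held-out 30% split.

```python
# Illustrative sketch: LightGBM versus a logistic regression baseline, scored by AUROC
# on a held-out 30% split. Synthetic data stand in for the hyperlipidemia cohort,
# which is not publicly available.
import lightgbm as lgb
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=30, n_informative=10,
                           weights=[0.6, 0.4], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

lgbm = lgb.LGBMClassifier(n_estimators=300, learning_rate=0.05, random_state=0)
lgbm.fit(X_tr, y_tr)
logreg = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

for name, model in [("LightGBM", lgbm), ("Logistic regression", logreg)]:
    auroc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUROC = {auroc:.3f}")
```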
Conclusions: We have shown that the application of machine-learning techniques significantly enhances the precision of cardiovascular risk forecasting among hyperlipidemic patients, addressing the heterogeneity and non-linearity inherent in disease prediction. Furthermore, some recently suggested biomarkers, including blood lipid variability and remnant cholesterol, are also important predictors of cardiovascular events, underscoring the importance of continuous lipid monitoring and healthcare profiling through big data platforms.
{"title":"Machine-learning-based prediction of cardiovascular events for hyperlipidemia population with lipid variability and remnant cholesterol as biomarkers.","authors":"Zhenzhen Du, Shuang Wang, Ouzhou Yang, Juan He, Yujie Yang, Jing Zheng, Honglei Zhao, Yunpeng Cai","doi":"10.1007/s13755-024-00310-w","DOIUrl":"10.1007/s13755-024-00310-w","url":null,"abstract":"<p><strong>Purpose: </strong>Dyslipidemia poses a significant risk for the progression to cardiovascular diseases. Despite the identification of numerous risk factors and the proposal of various risk scales, there is still an urgent need for effective predictive models for the onset of cardiovascular diseases in the hyperlipidemic population, which are essential for the prevention of CVD.</p><p><strong>Methods: </strong>We carried out a retrospective cohort study with 23,548 hyperlipidemia patients in Shenzhen Health Information Big Data Platform, including 11,723 CVD onset cases in a 3-year follow-up. The population was randomly divided into 70% as an independent training dataset and remaining 30% as test set. Four distinct machine-learning algorithms were implemented on the training dataset with the aim of developing highly accurate predictive models, and their performance was subsequently benchmarked against conventional risk assessment scales. An ablation study was also carried out to analyze the impact of individual risk factors to model performance.</p><p><strong>Results: </strong>The non-linear algorithm, LightGBM, excelled in forecasting the incidence of cardiovascular disease within 3 years, achieving an area under the 'receiver operating characteristic curve' (AUROC) of 0.883. This performance surpassed that of the conventional logistic regression model, which had an AUROC of 0.725, on identical datasets. Concurrently, in direct comparative analyses, machine-learning approaches have notably outperformed the three traditional risk assessment methods within their respective applicable populations. These include the Framingham cardiovascular disease risk score, 2019 ESC/EAS guidelines for the management of dyslipidemia and the 2016 Chinese recommendations for the management of dyslipidemia in adults. Further analysis of risk factors showed that the variability of blood lipid levels and remnant cholesterol played an important role in indicating an increased risk of CVD.</p><p><strong>Conclusions: </strong>We have shown that the application of machine-learning techniques significantly enhances the precision of cardiovascular risk forecasting among hyperlipidemic patients, addressing the critical issue of disease prediction's heterogeneity and non-linearity. 
Furthermore, some recently-suggested biomarkers, including blood lipid variability and remnant cholesterol are also important predictors of cardiovascular events, suggesting the importance of continuous lipid monitoring and healthcare profiling through big data platforms.</p>","PeriodicalId":46312,"journal":{"name":"Health Information Science and Systems","volume":"12 1","pages":"51"},"PeriodicalIF":4.7,"publicationDate":"2024-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11551092/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142630206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-23. eCollection Date: 2024-12-01. DOI: 10.1007/s13755-024-00308-4
Ramón Rueda, Esteban Fabello, Tatiana Silva, Samuel Genzor, Jan Mizera, Ladislav Stanke
Purpose: Chronic obstructive pulmonary disease (COPD) is a prevalent and preventable condition that typically worsens over time. Acute exacerbations of COPD significantly impact disease progression, underscoring the importance of prevention efforts. This observational study aimed to achieve two main objectives: (1) identify patients at risk of exacerbations using an ensemble of clustering algorithms, and (2) classify patients into distinct clusters based on disease severity.
Methods: Data from portable medical devices were analyzed post hoc to detect flare-ups, using hyperparameter optimization with Self-Organizing Map (SOM), Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Isolation Forest, and Support Vector Machine (SVM) algorithms. Principal Component Analysis (PCA) followed by KMeans clustering was applied to categorize patients by severity.
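A minimal Python sketch of this two-step pipeline is shown below, with outlier votes from DBSCAN, Isolation Forest, and a one-class SVM followed by PCA and KMeans on per-patient summaries. The feature construction, thresholds, and voting rule are assumptions, and the SOM component of the original ensemble is omitted for brevity.

```python
# Illustrative sketch of the two steps described above: (1) flag possible flare-ups by
# majority vote of several outlier detectors, and (2) group patients by severity with
# PCA followed by KMeans. Features and thresholds are assumptions; SOM is omitted.
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Toy device records: e.g., SpO2, heart rate, respiratory rate per patient-day.
records = np.column_stack([
    rng.normal(94, 2, 300), rng.normal(85, 8, 300), rng.normal(18, 3, 300),
])
X = StandardScaler().fit_transform(records)

# Step 1: ensemble flare-up (outlier) detection by majority vote.
votes = np.zeros(len(X), dtype=int)
votes += (DBSCAN(eps=0.8, min_samples=5).fit_predict(X) == -1).astype(int)   # noise points
votes += (IsolationForest(random_state=0).fit_predict(X) == -1).astype(int)
votes += (OneClassSVM(nu=0.05).fit_predict(X) == -1).astype(int)
flareup_days = np.where(votes >= 2)[0]
print("days flagged as possible flare-ups:", len(flareup_days))

# Step 2: severity clustering of per-patient summaries with PCA + KMeans.
patient_summaries = records.reshape(30, 10, 3).mean(axis=1)   # toy: 30 patients x 10 days
Z = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(patient_summaries))
severity_cluster = KMeans(n_clusters=3, random_state=0, n_init=10).fit_predict(Z)
print("patients per severity cluster:", np.bincount(severity_cluster))
```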
Results: Twenty-five patients were included in the study population; data from 17 of them had the required reliability. Five patients were identified in the highest deterioration group, with one clinically confirmed exacerbation accurately detected by our ensemble algorithm. PCA and KMeans clustering then grouped patients into three clusters based on severity: Cluster 0 started with the least severe characteristics but experienced decline, Cluster 1 consistently showed the most severe characteristics, and Cluster 2 showed slight improvement.
Conclusion: Our approach effectively identified patients at risk of exacerbations and classified them by disease severity. Although promising, the approach would need to be verified on a larger sample with a larger number of recorded clinically verified exacerbations.
{"title":"Machine learning approach to flare-up detection and clustering in chronic obstructive pulmonary disease (COPD) patients.","authors":"Ramón Rueda, Esteban Fabello, Tatiana Silva, Samuel Genzor, Jan Mizera, Ladislav Stanke","doi":"10.1007/s13755-024-00308-4","DOIUrl":"10.1007/s13755-024-00308-4","url":null,"abstract":"<p><strong>Purpose: </strong>Chronic obstructive pulmonary disease (COPD) is a prevalent and preventable condition that typically worsens over time. Acute exacerbations of COPD significantly impact disease progression, underscoring the importance of prevention efforts. This observational study aimed to achieve two main objectives: (1) identify patients at risk of exacerbations using an ensemble of clustering algorithms, and (2) classify patients into distinct clusters based on disease severity.</p><p><strong>Methods: </strong>Data from portable medical devices were analyzed post-hoc using hyperparameter optimization with Self-Organizing Maps (SOM), Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Isolation Forest, and Support Vector Machine (SVM) algorithms, to detect flare-ups. Principal Component Analysis (PCA) followed by KMeans clustering was applied to categorize patients by severity.</p><p><strong>Results: </strong>25 patients were included within the study population, data from 17 patients had the required reliability. Five patients were identified in the highest deterioration group, with one clinically confirmed exacerbation accurately detected by our ensemble algorithm. Then, PCA and KMeans clustering grouped patients into three clusters based on severity: Cluster 0 started with the least severe characteristics but experienced decline, Cluster 1 consistently showed the most severe characteristics, and Cluster 2 showed slight improvement.</p><p><strong>Conclusion: </strong>Our approach effectively identified patients at risk of exacerbations and classified them by disease severity. Although promising, the approach would need to be verified on a larger sample with a larger number of recorded clinically verified exacerbations.</p>","PeriodicalId":46312,"journal":{"name":"Health Information Science and Systems","volume":"12 1","pages":"50"},"PeriodicalIF":4.7,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11499475/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142516717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-13. eCollection Date: 2024-12-01. DOI: 10.1007/s13755-024-00306-6
Liutao Zhao, Haoran Xie, Lin Zhong, Yujue Wang
Artificial intelligence has immense potential for applications in smart healthcare. Nowadays, a large amount of medical data collected by wearable or implantable devices has accumulated in Body Area Networks, and unlocking the value of these data can open up further applications of artificial intelligence in smart healthcare. To utilize these dispersed data, this paper proposes an innovative Federated Learning scheme focusing on the challenges of explainability and security in smart healthcare. In the proposed scheme, the federated modeling process and the explainability analysis are independent of each other. By introducing post-hoc explanation techniques to analyze the global model, the scheme avoids the performance degradation caused by pursuing explainability while still providing insight into the mechanism of the model. In terms of security, firstly, a fair and efficient method for evaluating clients' private gradients is introduced, providing an explainable evaluation of gradient contributions, quantifying client contributions to federated learning, and filtering out the impact of low-quality data. Secondly, to address the privacy issues of medical and health data collected by wireless Body Area Networks, a multi-server model is proposed to solve the secure aggregation problem in federated learning. Furthermore, by employing homomorphic secret sharing and homomorphic hashing techniques, a non-interactive, verifiable secure aggregation protocol is proposed, ensuring that client data privacy is protected and that the aggregation results remain correct even in the presence of up to t colluding malicious servers. Experimental results demonstrate that the proposed scheme's explainability is consistent with that of centralized training scenarios and that it shows competitive performance in terms of security and efficiency.
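To illustrate the multi-server aggregation idea in the simplest possible terms, the sketch below uses plain additive secret sharing over a prime field: each client splits its fixed-point-encoded update into one share per server, and only the combination of all servers' partial sums reveals the aggregate. The verification layer based on homomorphic hashing is not reproduced here, and all values are toy examples rather than the paper's protocol.

```python
# Illustrative sketch of additive secret sharing for multi-server secure aggregation:
# each client splits its model update into random shares, one per server, so that no
# single server sees the update, yet the servers' summed shares reconstruct the sum of
# all updates. This toy omits the homomorphic hashing / verification layer of the paper.
import random

PRIME = 2**61 - 1          # arithmetic is done modulo a large prime
N_SERVERS = 3

def share(value, n=N_SERVERS, prime=PRIME):
    """Split an integer into n additive shares that sum to value mod prime."""
    shares = [random.randrange(prime) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % prime)
    return shares

def encode(x, scale=10**6):
    """Fixed-point encoding so float gradients can be shared as integers."""
    return int(round(x * scale)) % PRIME

def decode(x, scale=10**6):
    if x > PRIME // 2:      # map back to the signed range
        x -= PRIME
    return x / scale

# Each client produces one gradient value (a single scalar for simplicity).
client_gradients = [0.25, -0.10, 0.40]

# Clients send the i-th share of their encoded gradient to server i.
server_inboxes = [[] for _ in range(N_SERVERS)]
for g in client_gradients:
    for i, s in enumerate(share(encode(g))):
        server_inboxes[i].append(s)

# Each server aggregates locally; only the combination of all servers reveals the sum.
server_partials = [sum(box) % PRIME for box in server_inboxes]
aggregate = decode(sum(server_partials) % PRIME)
print("securely aggregated gradient sum:", aggregate)   # ~0.55
```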
{"title":"Explainable federated learning scheme for secure healthcare data sharing.","authors":"Liutao Zhao, Haoran Xie, Lin Zhong, Yujue Wang","doi":"10.1007/s13755-024-00306-6","DOIUrl":"10.1007/s13755-024-00306-6","url":null,"abstract":"<p><p>Artificial intelligence has immense potential for applications in smart healthcare. Nowadays, a large amount of medical data collected by wearable or implantable devices has been accumulated in Body Area Networks. Unlocking the value of this data can better explore the applications of artificial intelligence in the smart healthcare field. To utilize these dispersed data, this paper proposes an innovative Federated Learning scheme, focusing on the challenges of explainability and security in smart healthcare. In the proposed scheme, the federated modeling process and explainability analysis are independent of each other. By introducing post-hoc explanation techniques to analyze the global model, the scheme avoids the performance degradation caused by pursuing explainability while understanding the mechanism of the model. In terms of security, firstly, a fair and efficient client private gradient evaluation method is introduced for explainable evaluation of gradient contributions, quantifying client contributions in federated learning and filtering the impact of low-quality data. Secondly, to address the privacy issues of medical health data collected by wireless Body Area Networks, a multi-server model is proposed to solve the secure aggregation problem in federated learning. Furthermore, by employing homomorphic secret sharing and homomorphic hashing techniques, a non-interactive, verifiable secure aggregation protocol is proposed, ensuring that client data privacy is protected and the correctness of the aggregation results is maintained even in the presence of up to <i>t</i> colluding malicious servers. Experimental results demonstrate that the proposed scheme's explainability is consistent with that of centralized training scenarios and shows competitive performance in terms of security and efficiency.</p><p><strong>Graphical abstract: </strong></p>","PeriodicalId":46312,"journal":{"name":"Health Information Science and Systems","volume":"12 1","pages":"49"},"PeriodicalIF":4.7,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11399375/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142298293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-12. eCollection Date: 2024-12-01. DOI: 10.1007/s13755-024-00307-5
Ye Liang, Chonghui Guo, Hailin Li
Objective: The study aims to identify distinct population-specific comorbidity progression patterns, detect potential comorbidities in a timely manner, and gain a better understanding of the progression of comorbid conditions among patients.
Methods: This work presents a comorbidity progression analysis framework that utilizes temporal comorbidity networks (TCN) for patient stratification and comorbidity prediction. We propose an approach that uses patients' longitudinal, temporal diagnosis data to construct their TCN. Subsequently, we employ the TCN for patient stratification, conducting preliminary analysis and typical prescription analysis to uncover potential comorbidity progression patterns in different patient groups. Finally, we propose an innovative comorbidity prediction method based on the distance-matched temporal comorbidity network (TCN-DM). This method identifies similar patients according to disease prevalence and disease transition patterns and combines their diagnosis information with that of the current patient to predict potential comorbidity at the patient's next visit.
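As a toy illustration of TCN construction under assumed inputs, the sketch below counts consecutive diagnosis transitions in a few invented patient records and stores them as weighted directed edges with networkx; the paper's construction and the distance-matching step of TCN-DM are more involved than this.

```python
# Illustrative sketch of building a temporal comorbidity network (TCN) from ordered
# diagnosis sequences: nodes are diagnoses, and directed edge weights count how often
# one diagnosis is followed by another in a patient's record. The toy sequences and
# the simple counting scheme are assumptions, not the paper's full construction.
import networkx as nx

# Toy longitudinal records: each list is one patient's diagnoses in visit order.
patient_sequences = [
    ["hypertension", "diabetes", "heart_failure"],
    ["hypertension", "atrial_fibrillation", "heart_failure"],
    ["diabetes", "chronic_kidney_disease", "heart_failure"],
    ["hypertension", "diabetes", "chronic_kidney_disease"],
]

tcn = nx.DiGraph()
for seq in patient_sequences:
    for earlier, later in zip(seq, seq[1:]):          # consecutive diagnosis transitions
        if tcn.has_edge(earlier, later):
            tcn[earlier][later]["weight"] += 1
        else:
            tcn.add_edge(earlier, later, weight=1)

# The weighted out-edges of a diagnosis suggest likely next comorbidities.
for u, v, w in sorted(tcn.edges(data="weight"), key=lambda e: -e[2]):
    print(f"{u} -> {v}: {w}")
```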
Results: This study validated the capability of the framework using a real-world dataset, MIMIC-III, with heart failure (HF) as the disease of interest to investigate comorbidity progression in HF patients. With the TCN, the study identified four distinctive HF subgroups, revealing the progression of comorbidities in patients. Furthermore, compared to other methods, TCN-DM demonstrated better predictive performance, with F1-score values ranging from 0.454 to 0.612, showcasing its superiority.
Conclusions: This study can identify comorbidity patterns at both the individual and population level and offers promising predictions of future comorbidity development in patients.
{"title":"Comorbidity progression analysis: patient stratification and comorbidity prediction using temporal comorbidity network.","authors":"Ye Liang, Chonghui Guo, Hailin Li","doi":"10.1007/s13755-024-00307-5","DOIUrl":"10.1007/s13755-024-00307-5","url":null,"abstract":"<p><strong>Objective: </strong>The study aims to identify distinct population-specific comorbidity progression patterns, timely detect potential comorbidities, and gain better understanding of the progression of comorbid conditions among patients.</p><p><strong>Methods: </strong>This work presents a comorbidity progression analysis framework that utilizes temporal comorbidity networks (TCN) for patient stratification and comorbidity prediction. We propose a TCN construction approach that utilizes longitudinal, temporal diagnosis data of patients to construct their TCN. Subsequently, we employ the TCN for patient stratification by conducting preliminary analysis, and typical prescription analysis to uncover potential comorbidity progression patterns in different patient groups. Finally, we propose an innovative comorbidity prediction method by utilizing the distance-matched temporal comorbidity network (TCN-DM). This method identifies similar patients with disease prevalence and disease transition patterns and combines their diagnosis information with that of the current patient to predict potential comorbidity at the patient's next visit.</p><p><strong>Results: </strong>This study validated the capability of the framework using a real-world dataset MIMIC-III, with heart failure (HF) as interested disease to investigate comorbidity progression in HF patients. With TCN, this study can identify four significant distinctive HF subgroups, revealing the progression of comorbidities in patients. Furthermore, compared to other methods, TCN-DM demonstrated better predictive performance with F1-Score values ranging from 0.454 to 0.612, showcasing its superiority.</p><p><strong>Conclusions: </strong>This study can identify comorbidity patterns for individuals and population, and offer promising prediction for future comorbidity developments in patients.</p>","PeriodicalId":46312,"journal":{"name":"Health Information Science and Systems","volume":"12 1","pages":"48"},"PeriodicalIF":4.7,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11393239/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142298292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}