Neural network methods for diagnosing patient conditions from cardiopulmonary exercise testing data.

Biodata Mining (IF 4.0; Zone 3, Biology; JCR Q1, Mathematical & Computational Biology). Pub Date: 2022-08-13. DOI: 10.1186/s13040-022-00299-6

Donald E Brown, Suchetha Sharma, James A Jablonski, Arthur Weltman



Abstract

Background: Cardiopulmonary exercise testing (CPET) provides a reliable and reproducible approach to measuring fitness in patients and diagnosing their health problems. However, CPET data consist of multiple time series that require training to interpret. Part of this training teaches the use of flow charts, or nested decision trees, to interpret CPET results. This paper investigates two neural network-based machine learning techniques for predicting patient health conditions from CPET data and compares them with flow charts. The data for this investigation come from a small sample of patients with known health problems who had CPET results. The small sample size also allows us to investigate the use and performance of deep learning neural networks on health care problems with limited amounts of labeled training and testing data.

Methods: This paper compares flow charts, the current standard for interpreting and classifying CPET data, with two neural network techniques: autoencoders and convolutional neural networks (CNNs). The study also investigated the performance of principal component analysis (PCA) with logistic regression to provide an additional baseline for comparison with the neural network techniques.

Results: The patients in the sample had two primary diagnoses: heart failure and metabolic syndrome. All model-based testing was done with 5-fold cross-validation and the metrics of precision, recall, F1 score, and accuracy. As a baseline for comparison with our models, the highest-performing flow chart method achieved an accuracy of 77%. Both PCA with logistic regression and the CNN achieved an average accuracy of 90% and outperformed the flow chart methods on all metrics. The autoencoder with logistic regression performed best on every metric, with an average accuracy of 94%.
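The evaluation protocol named in the Results (5-fold cross-validation scored by precision, recall, F1 score, and accuracy) can be illustrated with a minimal standard-library Python sketch. This is not the authors' code: the function names `k_fold_indices` and `binary_metrics`, the choice of label 1 as the positive class, and the toy labels are all assumptions made for illustration, and the abstract does not say whether the folds were shuffled or stratified.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous (train, test) index folds.

    Hypothetical helper: no shuffling or stratification is applied here.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more item
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Precision, recall, F1, and accuracy for a two-class problem.

    Which diagnosis counts as `positive` is an assumption of this sketch.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Toy labels (invented, not from the paper): 4 TP, 1 FN, 1 FP, 4 TN.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
toy_scores = binary_metrics(toy_true, toy_pred)
toy_folds = k_fold_indices(len(toy_true), k=5)
```

In a full evaluation, a model would be fit on each fold's training indices and scored on its test indices, and the per-fold metrics would then be averaged, which is what an "average accuracy" in a cross-validated study usually refers to.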

Conclusions: This study suggests that machine learning techniques, and neural networks in particular, can provide higher levels of accuracy with CPET data than traditional flow chart methods. Further, the CNN performed well with a small data set, showing that these techniques can be designed to perform well on the small data problems often found in health care and the life sciences. Further testing with larger data sets is needed to continue evaluating the use of machine learning to interpret CPET data.




Source journal: Biodata Mining (Mathematical & Computational Biology)

CiteScore: 7.90

Self-citation rate: 0.00%

Review time: 23 weeks

Journal description: BioData Mining is an open access, open peer-reviewed journal encompassing research on all aspects of data mining applied to high-dimensional biological and biomedical data, focusing on computational aspects of knowledge discovery from large-scale genetic, transcriptomic, genomic, proteomic, and metabolomic data. Topical areas include, but are not limited to:
- Development, evaluation, and application of novel data mining and machine learning algorithms.
- Adaptation, evaluation, and application of traditional data mining and machine learning algorithms.
- Open-source software for the application of data mining and machine learning algorithms.
- Design, development, and integration of databases, software, and web services for the storage, management, retrieval, and analysis of data from large-scale studies.
- Pre-processing, post-processing, modeling, and interpretation of data mining and machine learning results for biological interpretation and knowledge discovery.