A Novel Support Vector Classifier for Longitudinal High-dimensional Data and Its Application to Neuroimaging Data.

Statistical Analysis and Data Mining, vol. 4, no. 6, pp. 604-611 | Pub Date: 2011-12-01 | DOI: 10.1002/sam.10141 | IF 2.1 | Zone 4 (Mathematics) | JCR Q3 (Computer Science, Artificial Intelligence)

Shuo Chen, F DuBois Bowman

Abstract

Recent technological advances have made it possible for many studies to collect high-dimensional data (HDD) longitudinally, for example, images collected during different scanning sessions. Such studies may yield temporal changes of selected features that, when combined with machine learning methods, are able to predict disease status or responses to a therapeutic treatment. Support vector machine (SVM) techniques are robust and effective tools well suited to the classification and prediction of HDD. However, current SVM methods for HDD analysis typically consider cross-sectional data collected during one time period or session (e.g., baseline). We propose a novel support vector classifier (SVC) for longitudinal HDD that allows simultaneous estimation of the SVM separating hyperplane parameters and temporal trend parameters, which determine the optimal means to combine the longitudinal data for classification and prediction. Our approach is based on an augmented reproducing kernel function and uses quadratic programming for optimization. We demonstrate the use and potential advantages of our proposed methodology using a simulation study and a data example from the Alzheimer's Disease Neuroimaging Initiative. The results indicate that our proposed method leverages the additional longitudinal information to achieve higher accuracy than methods using only cross-sectional data and methods that combine longitudinal data by naively expanding the feature space.
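
The abstract does not state the form of the augmented kernel, so the following is only a minimal sketch of the contrast it draws between combining visits through temporal trend parameters and naively expanding the feature space. It assumes a linear kernel and a fixed, hand-picked weight vector `beta`; the function names and toy data are hypothetical. In the paper the trend parameters are estimated jointly with the separating hyperplane by quadratic programming, and that step is not reproduced here.

```python
from typing import List, Sequence

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def combine_visits(visits: Sequence[Vector], beta: Sequence[float]) -> Vector:
    """Collapse one subject's visits into one vector: sum_t beta[t] * visits[t]."""
    n_features = len(visits[0])
    return [
        sum(b * visit[k] for b, visit in zip(beta, visits))
        for k in range(n_features)
    ]


def trend_kernel(a: Sequence[Vector], b: Sequence[Vector],
                 beta: Sequence[float]) -> float:
    """Linear kernel on the beta-weighted combination of visits (assumed form)."""
    return dot(combine_visits(a, beta), combine_visits(b, beta))


def trend_kernel_expanded(a: Sequence[Vector], b: Sequence[Vector],
                          beta: Sequence[float]) -> float:
    """Same value written as sum_t sum_u beta[t] * beta[u] * <a[t], b[u]>."""
    return sum(
        bt * bu * dot(a_t, b_u)
        for bt, a_t in zip(beta, a)
        for bu, b_u in zip(beta, b)
    )


def concat_kernel(a: Sequence[Vector], b: Sequence[Vector]) -> float:
    """Naive feature expansion: linear kernel on concatenated visits,
    which keeps only the same-visit terms sum_t <a[t], b[t]>."""
    return sum(dot(a_t, b_t) for a_t, b_t in zip(a, b))


# Toy data (hypothetical): two subjects, two visits, two features per visit.
subject_a = [[1.0, 2.0], [2.0, 1.0]]
subject_b = [[0.5, 1.0], [1.5, 0.0]]
beta = [1.0, 0.5]  # hand-picked here; the paper estimates trend parameters

if __name__ == "__main__":
    print(trend_kernel(subject_a, subject_b, beta))
    print(trend_kernel_expanded(subject_a, subject_b, beta))
    print(concat_kernel(subject_a, subject_b))
```

Under these assumptions the weighted kernel includes cross-visit inner products scaled by `beta`, while the concatenated kernel only compares visit 1 with visit 1 and visit 2 with visit 2. The feature dimension also stays fixed in the weighted version instead of growing with the number of visits.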

Source journal: Statistical Analysis and Data Mining

Categories: Computer Science, Artificial Intelligence; Computer Science, Interdisciplinary Applications

CiteScore: 3.20

Self-citation rate: 7.70%

Journal description: Statistical Analysis and Data Mining addresses the broad area of data analysis, including statistical approaches, machine learning, data mining, and applications. Topics include statistical and computational approaches for analyzing massive and complex datasets, novel statistical and/or machine learning methods and theory, and state-of-the-art applications with high impact. Of special interest are articles that describe innovative analytical techniques and discuss their application to real problems in a way that is accessible and beneficial to domain experts across science, engineering, and commerce. The journal focuses on papers that satisfy one or more of the following criteria: they solve data analysis problems associated with massive, complex datasets; they develop innovative statistical approaches, machine learning algorithms, or methods integrating ideas across disciplines (e.g., statistics, computer science, electrical engineering, operations research); they formulate and solve high-impact real-world problems that challenge existing paradigms via new statistical and/or computational models; or they provide surveys of prominent research topics.