Characteristic gene selection via L2,1-norm Sparse Principal Component Analysis

2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Pub Date: 2016-12-01. DOI: 10.1109/BIBM.2016.7822796

Yao Lu, Ying-Lian Gao, Jin-Xing Liu, Chang-Gang Wen, Yaxuan Wang, Jiguo Yu


Citations: 3

Abstract

Sparse Principal Component Analysis (SPCA) obtains sparse loadings for the principal components (PCs) by formulating PCA as a regression-type optimization problem with an elastic net penalty. However, the features it selects differ from one PC to another and are generally independent. A new method, L2,1SPCA, has been proposed to remove these defects by replacing the elastic net with an L2,1-norm penalty. Its performance on gene expression data is still unknown, so we test it in this paper. First, the method is applied to simulated data to obtain an optimal parameter. Second, L2,1SPCA is applied to gene expression data, namely the head and neck squamous carcinoma (HNSC) data. Third, the characteristic genes are selected according to the PCs. The results show very low P-values and very high hit counts, which indicates that L2,1SPCA achieves higher recognition accuracy and selects genes of higher relevance. Overall, the experimental results demonstrate that L2,1SPCA performs well on gene expression data.
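To make the L2,1-norm penalty concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the toy loading matrix, and the idea of ranking genes by the Euclidean norm of their loading rows are all assumptions made for illustration. The L2,1-norm of a loading matrix is the sum of the Euclidean norms of its rows, so penalizing it tends to push entire rows to zero. A gene is then either kept or dropped across all PCs at once, which is how the penalty addresses the problem that elastic-net SPCA selects different features for each PC.

```python
import numpy as np


def l21_norm(W):
    """L2,1-norm: sum over rows of the Euclidean norm of each row."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def top_genes_by_row_norm(W, k):
    """Indices of the k rows (genes) with the largest row norm, largest first.

    Ranking genes this way is an illustrative assumption, not the
    selection rule stated in the paper.
    """
    scores = np.linalg.norm(W, axis=1)
    order = np.argsort(-scores, kind="stable")
    return order[:k]


# Hypothetical loading matrix: 5 genes (rows) x 2 PCs (columns).
# Rows 1 and 3 are entirely zero, i.e. those genes are dropped for every PC.
W = np.array([
    [3.0, 4.0],
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 0.0],
    [0.0, 2.0],
])

print(l21_norm(W))                  # row norms are 5, 0, 1, 0, 2
print(top_genes_by_row_norm(W, 2))  # the two genes with the largest row norms
```

In this toy matrix the zero rows contribute nothing to the norm, and the two selected genes are the ones with the largest joint loading across both PCs.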


