A knowledge driven regression model for gene expression and microarray analysis.

Rong Jin, Luo Si, Shireesh Srivastava, Zheng Li, Christina Chan
DOI: 10.1109/IEMBS.2006.260347 (https://doi.org/10.1109/IEMBS.2006.260347)
Published in: Conference proceedings : ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual Conference, pp. 5326-9
Publication date: 2006-01-01
Citations: 7

Abstract

The linear regression model has been widely used in the analysis of gene expression and microarray data to identify a subset of genes that are important to a given metabolic function. One of the key challenges in applying the linear regression model to gene expression data analysis arises from the sparse data problem, in which the number of genes is significantly larger than the number of conditions. To resolve this problem, we present a knowledge driven regression model that incorporates the knowledge of genes from the Gene Ontology (GO) database into the linear regression model. It is based on the assumption that two genes are likely to be assigned similar weights when they share similar sets of GO codes. Empirical studies show that the proposed knowledge driven regression model is effective in reducing the regression errors, and furthermore effective in identifying genes that are relevant to a given metabolite.
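The abstract does not give the model's exact formulation, so the following is only a minimal sketch of one way the stated assumption (genes with similar GO code sets receive similar weights) could be expressed. It assumes a Jaccard overlap between GO code sets as the similarity measure and a graph-Laplacian penalty added to ridge-regularized least squares; the function names, the `lam` and `ridge` parameters, and the toy GO labels are all hypothetical and not taken from the paper.

```python
import numpy as np

def jaccard_similarity(go_sets):
    """Pairwise Jaccard overlap between genes' GO-code sets (assumed similarity measure)."""
    n = len(go_sets)
    S = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            union = go_sets[i] | go_sets[j]
            if union:
                S[i, j] = S[j, i] = len(go_sets[i] & go_sets[j]) / len(union)
    return S

def fit_knowledge_driven(X, y, S, lam=1.0, ridge=1e-3):
    """Minimise ||Xw - y||^2 + lam * sum_{i<j} S_ij (w_i - w_j)^2 + ridge * ||w||^2.

    X: (conditions x genes) expression matrix, y: metabolite measurements,
    S: symmetric gene-gene similarity matrix. All penalties are assumptions.
    """
    # Graph Laplacian: w^T L w equals sum_{i<j} S_ij (w_i - w_j)^2 for symmetric S.
    L = np.diag(S.sum(axis=1)) - S
    # Setting the gradient to zero gives a linear system in w.
    A = X.T @ X + lam * L + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

# Toy data: 3 conditions, 4 genes (more genes than conditions, as in the sparse data problem).
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
y = rng.normal(size=3)
go_sets = [{"a", "b"}, {"a", "b"}, {"c"}, {"c", "d"}]  # made-up GO code labels
w = fit_knowledge_driven(X, y, jaccard_similarity(go_sets), lam=10.0)
print(w)
```

Increasing `lam` pulls the weights of genes with overlapping GO codes closer together, while the small `ridge` term keeps the system solvable even when genes outnumber conditions.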
