基于稀疏特征和长期相互作用的果蝇基因表达模式注释。

Shuiwang Ji, Lei Yuan, Ying-Xin Li, Zhi-Hua Zhou, Sudhir Kumar, Jieping Ye
{"title":"基于稀疏特征和长期相互作用的果蝇基因表达模式注释。","authors":"Shuiwang Ji,&nbsp;Lei Yuan,&nbsp;Ying-Xin Li,&nbsp;Zhi-Hua Zhou,&nbsp;Sudhir Kumar,&nbsp;Jieping Ye","doi":"10.1145/1557019.1557068","DOIUrl":null,"url":null,"abstract":"<p><p>The Drosophila gene expression pattern images document the spatial and temporal dynamics of gene expression and they are valuable tools for explicating the gene functions, interaction, and networks during Drosophila embryogenesis. To provide text-based pattern searching, the images in the Berkeley Drosophila Genome Project (BDGP) study are annotated with ontology terms manually by human curators. We present a systematic approach for automating this task, because the number of images needing text descriptions is now rapidly increasing. We consider both improved feature representation and novel learning formulation to boost the annotation performance. For feature representation, we adapt the bag-of-words scheme commonly used in visual recognition problems so that the image group information in the BDGP study is retained. Moreover, images from multiple views can be integrated naturally in this representation. To reduce the quantization error caused by the bag-of-words representation, we propose an improved feature representation scheme based on the sparse learning technique. In the design of learning formulation, we propose a local regularization framework that can incorporate the correlations among terms explicitly. We further show that the resulting optimization problem admits an analytical solution. Experimental results show that the representation based on sparse learning outperforms the bag-of-words representation significantly. Results also show that incorporation of the term-term correlations improves the annotation performance consistently.</p>","PeriodicalId":74037,"journal":{"name":"KDD : proceedings. International Conference on Knowledge Discovery & Data Mining","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2009-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/1557019.1557068","citationCount":"37","resultStr":"{\"title\":\"Drosophila Gene Expression Pattern Annotation Using Sparse Features and Term-Term Interactions.\",\"authors\":\"Shuiwang Ji,&nbsp;Lei Yuan,&nbsp;Ying-Xin Li,&nbsp;Zhi-Hua Zhou,&nbsp;Sudhir Kumar,&nbsp;Jieping Ye\",\"doi\":\"10.1145/1557019.1557068\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>The Drosophila gene expression pattern images document the spatial and temporal dynamics of gene expression and they are valuable tools for explicating the gene functions, interaction, and networks during Drosophila embryogenesis. To provide text-based pattern searching, the images in the Berkeley Drosophila Genome Project (BDGP) study are annotated with ontology terms manually by human curators. We present a systematic approach for automating this task, because the number of images needing text descriptions is now rapidly increasing. We consider both improved feature representation and novel learning formulation to boost the annotation performance. For feature representation, we adapt the bag-of-words scheme commonly used in visual recognition problems so that the image group information in the BDGP study is retained. Moreover, images from multiple views can be integrated naturally in this representation. To reduce the quantization error caused by the bag-of-words representation, we propose an improved feature representation scheme based on the sparse learning technique. In the design of learning formulation, we propose a local regularization framework that can incorporate the correlations among terms explicitly. We further show that the resulting optimization problem admits an analytical solution. Experimental results show that the representation based on sparse learning outperforms the bag-of-words representation significantly. Results also show that incorporation of the term-term correlations improves the annotation performance consistently.</p>\",\"PeriodicalId\":74037,\"journal\":{\"name\":\"KDD : proceedings. International Conference on Knowledge Discovery & Data Mining\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-06-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1145/1557019.1557068\",\"citationCount\":\"37\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"KDD : proceedings. International Conference on Knowledge Discovery & Data Mining\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1557019.1557068\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"KDD : proceedings. International Conference on Knowledge Discovery & Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1557019.1557068","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 37

摘要

果蝇基因表达模式图像记录了果蝇基因表达的时空动态,是解释果蝇胚胎发生过程中基因功能、相互作用和网络的重要工具。为了提供基于文本的模式搜索,伯克利果蝇基因组计划(BDGP)研究中的图像由人类管理员手动标注本体术语。我们提出了一种系统的方法来自动化这项任务,因为需要文本描述的图像数量现在正在迅速增加。我们考虑了改进的特征表示和新的学习公式来提高标注性能。在特征表示方面,我们采用了视觉识别问题中常用的词袋方案,使BDGP研究中的图像组信息得以保留。此外,来自多个视图的图像可以自然地集成在这种表示中。为了减少词袋表示带来的量化误差,提出了一种改进的基于稀疏学习技术的特征表示方案。在学习公式的设计中,我们提出了一个局部正则化框架,可以显式地合并术语之间的相关性。我们进一步证明了所得到的优化问题有解析解。实验结果表明,基于稀疏学习的表示方法明显优于词袋表示方法。结果还表明,术语-术语相关性的结合一致地提高了标注性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Drosophila Gene Expression Pattern Annotation Using Sparse Features and Term-Term Interactions.

The Drosophila gene expression pattern images document the spatial and temporal dynamics of gene expression and they are valuable tools for explicating the gene functions, interaction, and networks during Drosophila embryogenesis. To provide text-based pattern searching, the images in the Berkeley Drosophila Genome Project (BDGP) study are annotated with ontology terms manually by human curators. We present a systematic approach for automating this task, because the number of images needing text descriptions is now rapidly increasing. We consider both improved feature representation and novel learning formulation to boost the annotation performance. For feature representation, we adapt the bag-of-words scheme commonly used in visual recognition problems so that the image group information in the BDGP study is retained. Moreover, images from multiple views can be integrated naturally in this representation. To reduce the quantization error caused by the bag-of-words representation, we propose an improved feature representation scheme based on the sparse learning technique. In the design of learning formulation, we propose a local regularization framework that can incorporate the correlations among terms explicitly. We further show that the resulting optimization problem admits an analytical solution. Experimental results show that the representation based on sparse learning outperforms the bag-of-words representation significantly. Results also show that incorporation of the term-term correlations improves the annotation performance consistently.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Predicting Age-Related Macular Degeneration Progression with Contrastive Attention and Time-Aware LSTM. MolSearch: Search-based Multi-objective Molecular Generation and Property Optimization. Deconfounding Actor-Critic Network with Policy Adaptation for Dynamic Treatment Regimes. MoCL: Data-driven Molecular Fingerprint via Knowledge-aware Contrastive Learning from Molecular Graph. Federated Adversarial Debiasing for Fair and Transferable Representations.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1