Conceptual models and databases for searching the genome

Anna Bernasconi, Pietro Pinoli
Published in: Advances in database technology : proceedings. International Conference on Extending Database Technology, volume 40 1, pages 1-4
Publication date: 2022-01-01
DOI: 10.48786/edbt.2022.57 (https://doi.org/10.48786/edbt.2022.57)

Abstract

Genomics is an extremely complex domain in terms of its concepts, their relations, and their representations in data. This tutorial introduces the use of ER models in the context of genomic systems: conceptual models are of great help for simplifying this domain and making it actionable. We review successful models presented in the literature for representing biologically relevant entities and grounding them in databases. We distinguish between conceptual models that aim to explain the domain and conceptual models that aim to support database design and heterogeneous data integration. Genomic experiments and/or sequences are described by several metadata, which specify information on the sampled organism, the technology used, and the organizational process behind the experiment. By contrast, we call data the actual regions of the genome that have been read by sequencing technologies and encoded into a machine-readable representation. First, we show how data and metadata can be modeled; then we exploit the proposed models to design search systems, visualizers, and analysis environments. Both human genomics and viral genomics are addressed, surveying several use cases and applications of broad public interest. The tutorial is relevant to the EDBT community because it demonstrates the usefulness of conceptual modeling principles within very current domains; in addition, it offers a concrete example of the use of conceptual models, setting the premises for interdisciplinary collaboration with a wider audience (possibly including life science researchers).
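
The distinction between metadata and data drawn in the abstract can be illustrated with a minimal sketch. It is not the schema proposed in the tutorial: the class names (`Metadata`, `Region`, `Sample`), their attributes, the `search` helper, and all sample values are hypothetical, chosen only to mirror the three metadata aspects the abstract names (sampled organism, technology, organizational process) and the idea of genome regions as data.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Metadata:
    """Describes an experiment (hypothetical attributes)."""
    organism: str    # the sampled organism
    technology: str  # the technology used
    project: str     # the organizational process behind the experiment


@dataclass(frozen=True)
class Region:
    """A genome region read by a sequencing technology (hypothetical layout)."""
    chrom: str
    start: int  # inclusive
    end: int    # exclusive


@dataclass
class Sample:
    """One experiment: a metadata record plus the regions it produced."""
    sample_id: str
    metadata: Metadata
    regions: list[Region] = field(default_factory=list)


def search(samples, **criteria):
    """Return the samples whose metadata matches every given attribute."""
    return [
        s for s in samples
        if all(getattr(s.metadata, key) == value for key, value in criteria.items())
    ]


# Illustrative values only; none come from the tutorial.
samples = [
    Sample("s1", Metadata("Homo sapiens", "ChIP-seq", "project-A"),
           [Region("chr1", 100, 200), Region("chr2", 50, 80)]),
    Sample("s2", Metadata("SARS-CoV-2", "Illumina", "project-B"),
           [Region("viral_genome", 0, 1000)]),
]

human_samples = search(samples, organism="Homo sapiens")
print([s.sample_id for s in human_samples])
```

In this sketch the search runs on metadata only and returns whole samples, so the regions are reached through the samples that matched. This is one simple way to let a metadata model drive a search system, under the assumptions stated above.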