{"title":"Conceptual models and databases for searching the genome","authors":"Anna Bernasconi, Pietro Pinoli","doi":"10.48786/edbt.2022.57","DOIUrl":null,"url":null,"abstract":"Genomics is an extremely complex domain, in terms of concepts, their relations, and their representations in data. This tutorial in-troduces the use of ER models in the context of genomic systems: conceptual models are of great help for simplifying this domain and making it actionable. We carry out a review of successful models presented in the literature for representing biologically-relevant entities and grounding them in databases. We draw a difference between conceptual models that aim to explain the domain and conceptual models that aim to support database design and heterogeneous data integration. Genomic experiments and/or sequences are described by several metadata, specify-ing information on the sampled organism, the used technology, and the organizational process behind the experiment. Instead, we call data the actual regions of the genome that have been read by sequencing technologies and encoded into a machine-readable representation. First, we show how data and metadata can be modeled, then we exploit the proposed models for de-signing search systems, visualizers, and analysis environments. Both domains of human genomics and viral genomics are addressed, surveying several use cases and applications of broader public interest. The tutorial is relevant to the EDBT community because it demonstrates the usefulness of conceptual models’ principles within very current domains; in addition, it offers a concrete example of conceptual models’ use, setting the premises for interdisciplinary collaboration with a greater public (possibly including life science researchers).","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"40 1","pages":"1-4"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Advances in database technology : proceedings. International Conference on Extending Database Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48786/edbt.2022.57","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Genomics is an extremely complex domain, in terms of concepts, their relations, and their representations in data. This tutorial in-troduces the use of ER models in the context of genomic systems: conceptual models are of great help for simplifying this domain and making it actionable. We carry out a review of successful models presented in the literature for representing biologically-relevant entities and grounding them in databases. We draw a difference between conceptual models that aim to explain the domain and conceptual models that aim to support database design and heterogeneous data integration. Genomic experiments and/or sequences are described by several metadata, specify-ing information on the sampled organism, the used technology, and the organizational process behind the experiment. Instead, we call data the actual regions of the genome that have been read by sequencing technologies and encoded into a machine-readable representation. First, we show how data and metadata can be modeled, then we exploit the proposed models for de-signing search systems, visualizers, and analysis environments. Both domains of human genomics and viral genomics are addressed, surveying several use cases and applications of broader public interest. The tutorial is relevant to the EDBT community because it demonstrates the usefulness of conceptual models’ principles within very current domains; in addition, it offers a concrete example of conceptual models’ use, setting the premises for interdisciplinary collaboration with a greater public (possibly including life science researchers).