PoliViews: A comprehensive and modular approach to the conceptual modeling of genomic data

Data & Knowledge Engineering (IF 2.7; Zone 3, Computer Science; JCR Q3, Computer Science, Artificial Intelligence). Pub Date: 2023-09-01. DOI: 10.1016/j.datak.2023.102201
Anna Bernasconi, Alberto García S., Stefano Ceri, Oscar Pastor
Citations: 0

Abstract


The complexity of the human genome is captured by many signals, representing, for instance, DNA variations, the expression of gene activity, or structural rearrangements of DNA; a rich set of data types and formats is used to record these signals. Conceptual models can support the description and explanation of the genome's elaborate structure and behavior. Among others, the Conceptual Schema of the Human Genome (CSG) provides a concept-oriented, top-down representation of genome behavior that is independent of data formats. The Genomic Conceptual Model (GCM) instead provides a data-oriented, bottom-up representation, targeting a well-organized, unified description of these formats. In this research, we join the two approaches to obtain PoliViews, a comprehensive model that links (1) a concepts layer, describing genome elements and their conceptual connections, with (2) a data layer, describing datasets derived from genome sequencing with specific technologies. The two layers are connected dynamically: choosing specific genomic data types in the data layer triggers the selection of a view in the concepts layer. The benefit is mutual: data records can be semantically described by high-level concepts through their links and, in turn, the continuously evolving abstract model can be extended thanks to the input provided by real datasets. PoliViews makes it possible to express queries that take a holistic conceptual perspective on the genome and are directly translated into data-oriented terms and organization. Here, we demonstrate the approach by linking two major genomic data types, namely DNA variation and gene expression. For each type, we consider different prominent data sources and describe their mapping to the corresponding view in the concepts layer, enabling intra-data-type integration. Then, leveraging the connections available in the concepts layer, we show how the distinct data types can interoperate, enabling inter-data-type integration. The PoliViews approach is illustrated through several examples of biological interest and can be further extended to any kind of genomic information.
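
The two-layer idea in the abstract can be pictured with a small Python sketch. This is an illustration under stated assumptions, not the authors' model or software: the view contents, concept labels (Gene, Variation, and so on), source names, and field names are all hypothetical. It only shows how choosing a data type could select a view, how two sources of the same type land on the same view (intra-data-type), and how two views that share a concept can be linked (inter-data-type).

```python
from dataclasses import dataclass

# Hypothetical concepts layer: each data type selects a view, i.e. a set of
# concept labels. The labels are illustrative, not taken from CSG or GCM.
VIEWS = {
    "dna_variation": {"Gene", "Chromosome", "Variation"},
    "gene_expression": {"Gene", "Transcript", "ExpressionLevel"},
}


@dataclass
class Dataset:
    """Hypothetical data-layer entry: one dataset from one source."""
    source: str
    data_type: str
    field_to_concept: dict  # source-specific field name -> concept label


def select_view(dataset):
    """Choosing a data type in the data layer selects a view in the concepts layer."""
    return VIEWS[dataset.data_type]


def unmapped_fields(dataset):
    """Fields whose target concept is missing from the selected view."""
    view = select_view(dataset)
    return sorted(f for f, c in dataset.field_to_concept.items() if c not in view)


def shared_concepts(a, b):
    """Concepts common to both selected views: candidate links across data types."""
    return select_view(a) & select_view(b)


variants_a = Dataset("source_a", "dna_variation",
                     {"gene_symbol": "Gene", "alt_allele": "Variation"})
variants_b = Dataset("source_b", "dna_variation",
                     {"hugo": "Gene", "chrom": "Chromosome"})
expression = Dataset("source_c", "gene_expression",
                     {"gene_id": "Gene", "tpm": "ExpressionLevel"})

# Intra-data-type: both variation sources resolve to the same view.
print(select_view(variants_a) == select_view(variants_b))  # True
# Inter-data-type: the two views overlap on the shared concept.
print(shared_concepts(variants_a, expression))  # {'Gene'}
```

Representing a view as a plain set keeps the overlap check a one-line intersection; a fuller model would also need relationships between concepts, which this sketch leaves out.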

Source journal: Data & Knowledge Engineering (Engineering & Technology; Computer Science: Artificial Intelligence)
CiteScore: 5.00
Self-citation rate: 0.00%
Articles published: 66
Review time: 6 months
Journal description: Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between these two related fields of interest. DKE reaches a worldwide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of these systems.