On Uncertain Probabilistic Data Modeling

Teng Lv, Ping Yan, Weimin He
Journal: International Journal of Database Theory and Application
Publication date: 2016-12-31
Publication type: Journal Article
DOI: 10.14257/ijdta.2016.9.12.17 (https://doi.org/10.14257/ijdta.2016.9.12.17)
Citations: 0

Abstract

Uncertainty in data arises from several sources, including the data itself, data mapping, and data policy. The data itself may be uncertain: for example, data from a sensor network, the Internet of Things, or Radio Frequency Identification (RFID) is often inaccurate because of device or environmental factors. With data mapping, data integrated from heterogeneous sources is commonly uncertain because of uncertain mappings, data inconsistency, missing data, and dirty data. With data policy, data is modified or hidden to comply with an organization's data privacy and confidentiality policies. Traditional data management, however, mainly deals with deterministic data that is precise and certain, and cannot process uncertain data. Modeling uncertain data is the foundation for technologies that process the data further, such as indexing, querying, searching, mapping, integrating, and mining. Probabilistic data models for relational databases, XML data, and graph data are widely used today in applications and areas such as the World Wide Web, the semantic web, sensor networks, the Internet of Things, mobile ad-hoc networks, social networks, traffic networks, biological networks, genome databases, and medical records. This paper surveys the probabilistic models of uncertain data in relational databases, XML data, and graph data, and analyzes and compares the advantages and disadvantages of each kind of model. It also discusses open topics in modeling uncertain probabilistic data, such as semantic and computational aspects, and proposes criteria for modeling uncertain data, including expressive power, complexity, efficiency, and extensibility.
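As an illustration only: the abstract does not name a specific model, but a simple relational one often discussed in the probabilistic database literature is the tuple-independent model. Each tuple carries its own probability of being present, and the probability of a query is the total probability of the possible worlds in which the query holds. The sketch below assumes that model. The sensor readings, the names READINGS, possible_worlds, query_probability, and some_sensor_hot, and the probabilities are all hypothetical and not taken from the paper.

```python
from itertools import product

# Hypothetical uncertain sensor readings (not from the paper):
# (sensor_id, reading, probability that the tuple is actually present).
READINGS = [
    ("s1", "hot", 0.9),
    ("s2", "hot", 0.5),
    ("s3", "cold", 0.8),
]


def possible_worlds(tuples):
    """Yield (world, probability) for every subset of the tuples.

    Assumes tuples are mutually independent, so a world's probability is
    the product of p for present tuples and (1 - p) for absent ones.
    """
    for mask in product([True, False], repeat=len(tuples)):
        world = []
        prob = 1.0
        for present, (sensor_id, value, p) in zip(mask, tuples):
            if present:
                world.append((sensor_id, value))
                prob *= p
            else:
                prob *= 1.0 - p
        yield world, prob


def query_probability(tuples, predicate):
    """Sum the probabilities of the worlds in which the predicate holds."""
    return sum(prob for world, prob in possible_worlds(tuples) if predicate(world))


def some_sensor_hot(world):
    """Boolean query: does at least one sensor report 'hot' in this world?"""
    return any(value == "hot" for _, value in world)


p_hot = query_probability(READINGS, some_sensor_hot)
total = sum(prob for _, prob in possible_worlds(READINGS))
```

Enumerating worlds is exponential in the number of tuples, so this is only practical for toy inputs. It is meant to make the possible-worlds semantics concrete, not to suggest an efficient evaluation strategy.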