
MLSDA '13: Latest Articles

A Hybrid Grid-based Method for Mining Arbitrary Regions-of-Interest from Trajectories
Pub Date : 2013-12-02 DOI: 10.1145/2542652.2542653
Chihiro Hio, Luke Bermingham, Guochen Cai, Kyungmi Lee, Ickjai Lee
There is an increasing need for trajectory pattern mining as the volume of available trajectory data grows at an unprecedented rate with the aid of mobile sensing. Region-of-interest mining identifies hot spots that reveal trajectory concentrations. This article introduces an efficient and effective grid-based region-of-interest mining method that is linear in the number of grid cells and is able to detect regions-of-interest of arbitrary shape. The proposed algorithm is robust, applicable to both continuous and discrete trajectories, and relatively insensitive to parameter values. Experiments show promising results that demonstrate the benefits of the proposed algorithm.
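The general idea of grid-based region-of-interest mining can be illustrated with a minimal sketch. This is not the authors' hybrid method, whose details the abstract does not give; it is a generic density-grid approach under assumed names (`mine_rois`, `cell_size`, `min_count`): points are binned into cells, cells below a density threshold are dropped, and neighbouring dense cells are merged so that regions can take arbitrary shapes.

```python
from collections import deque


def mine_rois(points, cell_size, min_count):
    """Return a list of regions; each region is a set of (col, row) grid cells.

    Hypothetical helper for illustration only, not the paper's algorithm.
    """
    # Count trajectory points per grid cell in one pass over the points.
    counts = {}
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] = counts.get(cell, 0) + 1

    # Keep only cells that hold at least min_count points.
    dense = {cell for cell, n in counts.items() if n >= min_count}

    # Merge 4-connected dense cells with breadth-first search.
    # Every dense cell is enqueued once, so this step is linear in the cells.
    regions, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        region, queue = set(), deque([start])
        while queue:
            cx, cy = queue.popleft()
            region.add((cx, cy))
            for nb in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if nb in dense and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        regions.append(region)
    return regions
```

Because regions are unions of connected cells rather than fixed rectangles or circles, an L-shaped cluster of dense cells comes out as a single region.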
Citations: 8
The MOA Data Stream Mining Tool: A Mid-Term Report
Pub Date : 2013-12-02 DOI: 10.1145/2542652.2542660
B. Pfahringer
Stream mining research has seen an impressive increase in the number of publications over the last few years. It borrows heavily from more established research fields in Machine Learning, especially from so-called online learning, as well as from time series analysis. It fuses ideas and methods from both fields and extends them in unique new ways. Stream mining needs to process potentially infinite streams of data, where the source that generates the data may change over time; in other words, the source is nonstationary. Most standard learning approaches assume a stationary data source. Data may also include categorical features, something time series analysis cannot cope with well. In addition to needing continuous adaptation, models must be able to predict at any time, and usually cannot afford to spend much time or memory on any single example. Polynomial behaviour is therefore not good enough; logarithmic complexity per example is usually a strict upper limit on computational resources. Development of the MOA (Massive Online Analysis) stream mining software suite started in 2005, and the first open source release took place in 2007. In this talk I will first very briefly present MOA’s history, then explain and discuss the challenges stream mining faces and how MOA tries to address them. Finally, I will focus on current shortcomings and suggest ways of addressing them. As this last part is the most useful one in terms of further research, I will briefly outline these points here.
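The constraints named in the abstract (predict at any time, small cost per example, a nonstationary source) can be made concrete with a toy sketch. The code below does not use MOA, which is a Java suite, and does not reproduce any MOA API; the class `FadingMajorityClassifier` and the function `prequential_accuracy` are hypothetical names. It shows a learner that forgets old evidence exponentially, evaluated in test-then-train order.

```python
class FadingMajorityClassifier:
    """Predicts the label with the largest exponentially faded count.

    Illustrative only; the fading factor is an assumed way to track drift.
    """

    def __init__(self, fade=0.9):
        self.fade = fade    # weight retained by old evidence per example
        self.counts = {}    # label -> faded count

    def predict(self):
        # A prediction is available at any time; None before any example.
        if not self.counts:
            return None
        return max(self.counts, key=self.counts.get)

    def learn_one(self, label):
        # Work per example depends only on the number of distinct labels.
        for key in self.counts:
            self.counts[key] *= self.fade
        self.counts[label] = self.counts.get(label, 0.0) + 1.0


def prequential_accuracy(model, stream):
    """Test-then-train: predict each example first, then learn from it."""
    correct = total = 0
    for label in stream:
        correct += int(model.predict() == label)
        model.learn_one(label)
        total += 1
    return correct / total if total else 0.0
```

When the stream switches from one dominant label to another, the faded counts let the new label overtake the old one after a handful of examples, whereas plain counts would need as many new examples as old ones.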
Citations: 0
Performance Analysis of Duty-Cycling Wireless Sensor Network for Train Localization
Pub Date : 2013-12-02 DOI: 10.1145/2542652.2542658
A. Javed, Haibo Zhang, Zhiyi Huang
Wireless sensor networks (WSNs) offer promising solutions for real-time object monitoring and tracking. An interesting application is train localization, in which anchor sensors are deployed along the railway track to detect the train and report in a timely manner to a gateway installed on the train. To save energy, anchor sensors operate based on an asynchronous duty-cycling protocol. The accuracy of train localization depends heavily on the availability of anchor sensors when a train passes them, which in turn depends on the duty cycle. This paper presents an analysis of energy consumption under different levels of performance compromise. We evaluate the energy consumption through simulations, and the results show that with a slight performance compromise on the number of active anchors, the lifetime of anchor sensors can be significantly extended.
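The dependence of anchor availability on the duty cycle can be illustrated with a toy model. This is not the paper's simulation setup: it assumes a single anchor that is awake for a fraction `duty_cycle` of every `period`, a uniformly random phase between anchor and train, and a fixed `contact_time` during which the train is in radio range. All names and parameters are assumptions made for this sketch.

```python
import random


def detection_probability(duty_cycle, period, contact_time):
    """Chance that the anchor is awake at some moment while the train is in range."""
    return min(1.0, duty_cycle + contact_time / period)


def simulate(duty_cycle, period, contact_time, trials=100_000, seed=0):
    """Monte Carlo check of the closed form above."""
    rng = random.Random(seed)
    awake = duty_cycle * period
    hits = 0
    for _ in range(trials):
        # The train arrives at a random offset into the anchor's cycle;
        # the anchor is awake during [0, awake) of every period.
        arrival = rng.uniform(0.0, period)
        # Detected if the train arrives during the awake window, or stays
        # in range long enough to reach the start of the next one.
        if arrival < awake or arrival + contact_time >= period:
            hits += 1
    return hits / trials
```

Under these assumptions, halving the duty cycle roughly halves the awake time (and so the radio energy) but lowers the detection probability only by the same absolute amount, which is one way to see why a small performance compromise can buy a large lifetime gain.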
Citations: 2
Clustering Household Electricity Use Profiles
Pub Date : 2013-12-02 DOI: 10.1145/2542652.2542656
John R. Williams
An attempt was made to cluster the load profiles of a sample (n ≈ 380) of New Zealand households. An extensive range of approaches was evaluated, including clustering on "features" of the data rather than on the raw data. A semi-automatic search of the problem space (cluster base, distance measure, cluster/partitioning method and k) resulted in a three-cluster (k = 3) solution with acceptable quality indices and face validity. Although a particular combination of base, distance metric and clustering method was found to work well in this case, it is the practice of searching the problem space, rather than a particular solution, that is discussed and advocated.
Citations: 11
From Association Analysis to Causal Discovery
Pub Date : 2013-12-02 DOI: 10.1145/2542652.2542659
Jiuyong Li
Association analysis is an important technique in data mining, and it has been widely used in many application areas [6]. However, associations found in data can be spurious and may not reflect the ‘true’ relationships between the variables under consideration. For example, hundreds or thousands of association rules can easily be generated even from a small data set, but most of them could be spurious and have no practical meaning [11, 21, 22]. This has hindered the application of association analysis to solving real world problems. While the development of efficient techniques for finding association patterns in data, especially in large data sets, is well underway, the problem of identifying non-spurious associations has become prominent. Causal relationships imply the real data generating mechanisms and how the outcome would change when the cause is changed, so finding them has been the ultimate goal of many scientific explorations and social studies [18]. The gold standard for causal discovery is the randomised controlled trial (RCT) [4, 16]. However, an RCT is infeasible in many real world applications, particularly in high dimensional problems with a large number of potential causes. As part of the efforts on causal discovery, statisticians have studied various methods for testing a hypothetical causal relationship based on observational data [16]. However, these methods are designed for validating a known candidate causal relationship, and they are also incapable of dealing with a large number of potential causes. Although an association between two variables does not always imply causation, it is well known that associations are indicators for causal relationships [7]. Therefore a practical approach to causal discovery in large data sets could start with association analysis of the data. A question is then whether we can filter out associations that do not have causal indications. Note that this objective is different from that of mining interesting associations [9, 20] or discovering statistically sound associations [5, 21], because interestingness criteria do not measure causality and a test of statistical significance only determines whether an association is due to random chance. We have integrated two statis-
Citations: 0
Light-weight Online Predictive Data Aggregation for Wireless Sensor Networks
Pub Date : 2013-12-02 DOI: 10.1145/2542652.2542657
Jeremiah D. Deng, Yue Zhang
Wireless Sensor Networks (WSNs) have found many practical applications in recent years. Apart from both the vast new opportunities and challenges raised by the availability of large amounts of sensory data, energy conservation remains a challenging research topic that demands intelligent solutions. Various data aggregation techniques have been proposed in the literature, but the optimal tradeoff between algorithm complexity and prediction ability remains elusive. In this paper we concentrate on employing a few light-weight time series estimation algorithms for online predictive sensing. A number of performance metrics are proposed and employed to examine the effectiveness of the scheme using real-world datasets.
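One common way to save energy with online prediction is for the sensor and the sink to run the same light-weight predictor, so that the sensor transmits only when its reading deviates from the shared prediction. The abstract does not say which estimators the authors use; the sketch below assumes a simple linear-trend predictor with smoothing, and the function name and parameters (`predictive_reporting`, `alpha`, `tolerance`) are hypothetical.

```python
def predictive_reporting(readings, alpha=0.5, tolerance=0.5):
    """Return (series reconstructed at the sink, number of transmissions).

    Assumed scheme for illustration, not the paper's algorithm.
    """
    level, slope = None, 0.0
    reconstructed, sent = [], 0
    for value in readings:
        predicted = None if level is None else level + slope
        if predicted is None or abs(value - predicted) > tolerance:
            # Prediction missed: transmit the reading and update the trend.
            if level is not None:
                slope = alpha * (value - level) + (1 - alpha) * slope
            level = value
            sent += 1
        else:
            # Prediction is close enough: stay silent. The sink advances
            # its copy of the model by the same step, so both stay in sync.
            level = predicted
        reconstructed.append(level)
    return reconstructed, sent
```

By construction the value held at the sink never differs from the true reading by more than `tolerance`, so the threshold directly trades reconstruction error against the number of radio transmissions.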
Citations: 1
Ensemble Feature Ranking for Shellfish Farm Closure Cause Identification
Pub Date : 2013-12-02 DOI: 10.1145/2542652.2542655
Ashfaqur Rahman, C. D'Este, John McCulloch
Shellfish farms must be closed if there is suspected contamination during production to avoid serious health hazards. The authorities monitor a number of environmental and water quality variables through a set of sensors to check the health of shellfish farms and to decide on the closure of the farms. The research presented in this paper aims to develop an ensemble feature ranking algorithm to identify the cause of closure. We have presented and analysed the results obtained using the proposed algorithm to demonstrate its effectiveness.
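The abstract does not describe how the individual rankings are combined. As one plausible illustration of ensemble feature ranking, the sketch below merges several best-first rankings with a Borda count; this is a standard rank-aggregation rule swapped in for illustration, not the authors' algorithm, and the function name `borda_rank` and the feature names in the usage note are hypothetical.

```python
def borda_rank(rankings):
    """Combine several best-first feature rankings into one consensus ranking."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            # The top feature of a ranking earns n - 1 points, the last earns 0.
            scores[feature] = scores.get(feature, 0) + (n - 1 - position)
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda f: (-scores[f], f))
```

For example, if three rankers order the made-up variables as `["salinity", "rainfall", "temperature"]`, `["rainfall", "salinity", "temperature"]` and `["rainfall", "temperature", "salinity"]`, the consensus places `rainfall` first because it is ranked highly by all three.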
Citations: 11
Predicting Petroleum Reservoir Properties from Downhole Sensor Data using an Ensemble Model of Neural Networks
Pub Date : 2013-12-02 DOI: 10.1145/2542652.2542654
Anifowose Fatai, J. Labadin, A. Raheem
The acquisition of huge volumes of sensor data has led to the advent of the smart field phenomenon in the petroleum industry. A lot of data is acquired during drilling and production processes through logging tools equipped with sub-surface/down-hole sensors. Reservoir modeling has advanced from the use of empirical equations, through statistical regression tools, to the present embrace of Artificial Intelligence (AI) and its hybrid techniques. Due to the high dimensionality and heterogeneity of the sensor data, the capability of conventional AI techniques has become limited, as they cannot handle more than one hypothesis at a time. Ensemble learning methods have the capability to combine several hypotheses to evolve a single ensemble solution to a problem. Despite their popular use, especially in petroleum engineering, Artificial Neural Networks (ANN) have posed a number of challenges. One such challenge is the difficulty of determining the most suitable learning algorithm for optimal model performance. To save the cost, effort and time involved in the use of trial-and-error and evolutionary methods, this paper presents an ensemble model of ANN that combines the diverse performances of seven "weak" learning algorithms to evolve an ensemble solution for the prediction of porosity and permeability of petroleum reservoirs. When compared to the individual ANN, ANN-bagging and RandomForest, the proposed model performed best. This further confirms the great opportunities for ensemble modeling in petroleum reservoir characterization and other petroleum engineering problems.
Citations: 8