DataCentric Engineering: Latest Articles


Performance and accuracy assessments of an incompressible fluid solver coupled with a deep convolutional neural network—ADDENDUM

Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

DataCentric Engineering

Pub Date : 2022-04-04 DOI: 10.1017/dce.2022.10

Ekhi Ajuria Illarramendi, M. Bauerheim, B. Cuenot

Citations: 1

Bayesian optimization with informative parametric models via sequential Monte Carlo

Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

DataCentric Engineering

Pub Date : 2022-03-08 DOI: 10.1017/dce.2022.5

Rafael Oliveira, R. Scalzo, R. Kohn, Sally Cripps, Kyle Hardman, J. Close, Nasrin Taghavi, C. Lemckert

Abstract Bayesian optimization (BO) has been a successful approach to optimize expensive functions whose prior knowledge can be specified by means of a probabilistic model. Due to their expressiveness and tractable closed-form predictive distributions, Gaussian process (GP) surrogate models have been the default go-to choice when deriving BO frameworks. However, as nonparametric models, GPs offer very little in terms of interpretability and informative power when applied to model complex physical phenomena in scientific applications. In addition, the Gaussian assumption also limits the applicability of GPs to problems where the variables of interest may highly deviate from Gaussianity. In this article, we investigate an alternative modeling framework for BO which makes use of sequential Monte Carlo (SMC) to perform Bayesian inference with parametric models. We propose a BO algorithm to take advantage of SMC’s flexible posterior representations and provide methods to compensate for bias in the approximations and reduce particle degeneracy. Experimental results on simulated engineering applications in detecting water leaks and contaminant source localization are presented showing performance improvements over GP-based BO approaches.



Citations: 0

Markov chain Monte Carlo for a hyperbolic Bayesian inverse problem in traffic flow modeling

Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

DataCentric Engineering

Pub Date : 2022-02-22 DOI: 10.1017/dce.2022.3

Jeremie Coullon, Y. Pokern

Abstract As a Bayesian approach to fitting motorway traffic flow models remains rare in the literature, we empirically explore the sampling challenges this approach offers which have to do with the strong correlations and multimodality of the posterior distribution. In particular, we provide a unified statistical model to estimate using motorway data both boundary conditions and fundamental diagram parameters in a motorway traffic flow model due to Lighthill, Whitham, and Richards known as LWR. This allows us to provide a traffic flow density estimation method that is shown to be superior to two methods found in the traffic flow literature. To sample from this challenging posterior distribution, we use a state-of-the-art gradient-free function space sampler augmented with parallel tempering.
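For readers unfamiliar with the LWR model named in this abstract, its standard form is a scalar conservation law for the traffic density. The block below is a reference sketch only: the symbols \rho (density), q (flow), and v (speed) are notation assumed here, and the Greenshields flux is a common textbook fundamental diagram shown for illustration, not necessarily the one used in the paper.

```latex
% LWR conservation law for traffic density \rho(x, t)
\frac{\partial \rho}{\partial t} + \frac{\partial q(\rho)}{\partial x} = 0,
\qquad q(\rho) = \rho \, v(\rho)

% Illustrative fundamental diagram (Greenshields), with assumed
% parameters v_{\max} (free-flow speed) and \rho_{\max} (jam density)
q(\rho) = v_{\max} \, \rho \left( 1 - \frac{\rho}{\rho_{\max}} \right)
```

In this notation, the fundamental diagram parameters to be estimated would be quantities such as v_{\max} and \rho_{\max}, and the boundary conditions are the densities at the upstream and downstream ends of the road section.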


Citations: 1

Universal Digital Twin: Land use

Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

DataCentric Engineering

Pub Date : 2022-02-10 DOI: 10.1017/dce.2021.21

J. Akroyd, Zachary S. Harper, David Soutar, Feroz Farazi, Amita Bhave, S. Mosbach, M. Kraft

Abstract This article develops an ontological description of land use and applies it to incorporate geospatial information describing land coverage into a knowledge-graph-based Universal Digital Twin. Sources of data relating to land use in the UK have been surveyed. The Crop Map of England (CROME) is produced annually by the UK Government and was identified as a valuable source of open data. Formal ontologies to represent land use and the geospatial data arising from such surveys have been developed. The ontologies have been deployed using a high-performance graph database. A customized vocabulary was developed to extend the geospatial capabilities of the graph database to support the CROME data. The integration of the CROME data into the Universal Digital Twin is demonstrated in two use cases that show the potential of the Universal Digital Twin to share data across sectors. The first use case combines data about land use with a geospatial analysis of scenarios for energy provision. The second illustrates how the Universal Digital Twin could use the land use data to support the cross-domain analysis of flood risk. Opportunities for the extension and enrichment of the ontologies, and further development of the Universal Digital Twin are discussed.



Citations: 10

Development of a digital twin operational platform using Python Flask

Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

DataCentric Engineering

Pub Date : 2022-01-31 DOI: 10.1017/dce.2022.1

M. Bonney, M. de Angelis, M. Dal Borgo, Luis Andrade, S. Beregi, N. Jamia, D. Wagg

Abstract The digital twin concept has developed as a method for extracting value from data, and is being developed as a new technique for the design and asset management of high-value engineering systems such as aircraft, energy generating plant, and wind turbines. In terms of implementation, many proprietary digital twin software solutions have been marketed in this domain. In contrast, this paper describes a recently released open-source software framework for digital twins, which provides a browser-based operational platform using Python and Flask. The new platform is intended to maximize connectivity between users and data obtained from the physical twin. This paper describes how this type of digital twin operational platform (DTOP) can be used to connect the physical twin and other Internet-of-Things devices to both users and cloud computing services. The current release of the software—DTOP-Cristallo—uses the example of a three-storey structure as the engineering asset to be managed. Within DTOP-Cristallo, specific engineering software tools have been developed for use in the digital twin, and these are used to demonstrate the concept. At this stage, the framework presented is a prototype. However, the potential for open-source digital twin software using network connectivity is a very large area for future research and development.



Citations: 8

Uniform-in-phase-space data selection with iterative normalizing flows

Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

DataCentric Engineering

Pub Date : 2021-12-28 DOI: 10.1017/dce.2023.4

M. Hassanaly, Bruce A. Perry, M. Mueller, S. Yellapantula

Abstract Improvements in computational and experimental capabilities are rapidly increasing the amount of scientific data that are routinely generated. In applications that are constrained by memory and computational intensity, excessively large datasets may hinder scientific discovery, making data reduction a critical component of data-driven methods. Datasets are growing in two directions: the number of data points and their dimensionality. Whereas dimension reduction typically aims at describing each data sample on lower-dimensional space, the focus here is on reducing the number of data points. A strategy is proposed to select data points such that they uniformly span the phase-space of the data. The algorithm proposed relies on estimating the probability map of the data and using it to construct an acceptance probability. An iterative method is used to accurately estimate the probability of the rare data points when only a small subset of the dataset is used to construct the probability map. Instead of binning the phase-space to estimate the probability map, its functional form is approximated with a normalizing flow. Therefore, the method naturally extends to high-dimensional datasets. The proposed framework is demonstrated as a viable pathway to enable data-efficient machine learning when abundant data are available.
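The selection idea described in this abstract (estimate the probability map of the data, then accept each point with a probability that favors rare regions) can be sketched in a few lines. This is a minimal one-dimensional illustration under stated assumptions, not the authors' implementation: it estimates the density with a plain histogram, which is exactly the binning step the paper replaces with a normalizing flow, and it omits the iterative refinement. The function name `uniform_select` and its parameters are hypothetical.

```python
import numpy as np


def uniform_select(data, n_bins=20, seed=0):
    """Downsample 1-D data so the kept points are roughly uniform in value.

    Stand-in for the paper's approach: the density is estimated with a
    histogram (an assumption made for brevity), not a normalizing flow.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)

    # Estimate the probability map of the data on equal-width bins.
    density, edges = np.histogram(data, bins=n_bins, density=True)

    # Look up the estimated density at each data point (interior edges only,
    # so the result is a valid bin index in 0..n_bins-1).
    bin_idx = np.digitize(data, edges[1:-1])
    point_density = density[bin_idx]

    # Acceptance probability is inversely proportional to the density,
    # scaled so that points in the rarest occupied bin are always kept.
    accept_prob = point_density.min() / point_density

    keep = rng.random(data.shape[0]) < accept_prob
    return data[keep]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    samples = rng.normal(size=50_000)
    subset = uniform_select(samples)
    print(f"kept {subset.size} of {samples.size} points")
```

Because the acceptance probability is scaled by the rarest occupied bin, the size of the reduced dataset is set by the sparsest region. A practical variant would rescale the probability to hit a target number of points.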



Citations: 2

A hierarchical Bayesian approach for calibration of stochastic material models

Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

DataCentric Engineering

Pub Date : 2021-12-17 DOI: 10.1017/dce.2021.20

Nikolaos Papadimas, T. Dodwell

Abstract This article recasts the traditional challenge of calibrating a material constitutive model into a hierarchical probabilistic framework. We consider a Bayesian framework where material parameters are assigned distributions, which are then updated given experimental data. Importantly, in a true engineering setting, we are not interested in inferring the parameters for a single experiment, but rather in inferring the model parameters over the population of possible experimental samples. In doing so, we also seek to capture the inherent variability of the material from coupon to coupon, as well as uncertainties around the repeatability of the test. In this article, we address this problem using a hierarchical Bayesian model. However, a vanilla computational approach is prohibitively expensive. Our strategy marginalizes over each individual experiment, decreasing the dimension of our inference problem to only the hyperparameters, that is, the parameters describing the population statistics of the material model. Importantly, this marginalization step requires us to derive an approximate likelihood, for which we exploit an emulator (built offline prior to sampling) and Bayesian quadrature, allowing us to capture the uncertainty in this numerical approximation. Our approach renders hierarchical Bayesian calibration of material models computationally feasible. The approach is tested in two different examples. The first is a compression test of a simple spring model using synthetic data; the second is a more complex example using real experimental data to fit a stochastic elastoplastic model for 3D-printed steel.
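The marginalization described in this abstract can be written out generically as follows. This is a sketch in assumed notation, not the paper's own: \phi denotes the hyperparameters, \theta_i the material parameters of experiment i, and y_i the data from that experiment, with N experiments in total.

```latex
% Generic hierarchical model (notation assumed for illustration)
\theta_i \mid \phi \sim p(\theta \mid \phi), \qquad
y_i \mid \theta_i \sim p(y \mid \theta_i), \qquad i = 1, \dots, N

% Marginalizing each experiment's parameters leaves a posterior
% over the hyperparameters only
p(\phi \mid y_{1:N}) \;\propto\; p(\phi) \prod_{i=1}^{N}
  \int p(y_i \mid \theta) \, p(\theta \mid \phi) \, \mathrm{d}\theta
```

Each integral is one of the per-experiment marginal likelihoods that the abstract says are approximated with an emulator and Bayesian quadrature. Sampling then takes place in the space of \phi alone rather than jointly over \phi and all \theta_i.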



Citations: 7

Materials informatics and sustainability—The case for urgency

Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

DataCentric Engineering

Pub Date : 2021-11-15 DOI: 10.1017/dce.2021.19

H. Melia, Eric S. Muckley, J. Saal

Abstract The development of transformative technologies for mitigating our global environmental and technological challenges will require significant innovation in the design, development, and manufacturing of advanced materials and chemicals. To achieve this innovation faster than what is possible by traditional human intuition-guided scientific methods, we must transition to a materials informatics-centered paradigm, in which synergies between data science, materials science, and artificial intelligence are leveraged to enable transformative, data-driven discoveries faster than ever before through the use of predictive models and digital twins. While materials informatics is experiencing rapidly increasing use across the materials and chemicals industries, broad adoption is hindered by barriers such as skill gaps, cultural resistance, and data sparsity. We discuss the importance of materials informatics for accelerating technological innovation, describe current barriers and examples of good practices, and offer suggestions for how researchers, funding agencies, and educational institutions can help accelerate the adoption of urgently needed informatics-based toolsets for science in the 21st century.



Citations: 2

Multiphase flow applications of nonintrusive reduced-order models with Gaussian process emulation

Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

DataCentric Engineering

Pub Date : 2021-11-15 DOI: 10.1017/dce.2022.19

T. Botsas, Indranil Pan, L. Mason, O. Matar

Abstract Reduced-order models (ROMs) are computationally inexpensive simplifications of high-fidelity complex ones. Such models can be found in computational fluid dynamics where they can be used to predict the characteristics of multiphase flows. In previous work, we presented a ROM analysis framework that coupled compression techniques, such as autoencoders, with Gaussian process regression in the latent space. This pairing has significant advantages over the standard encoding–decoding routine, such as the ability to interpolate or extrapolate in the initial conditions’ space, which can provide predictions even when simulation data are not available. In this work, we focus on this major advantage and show its effectiveness by performing the pipeline on three multiphase flow applications. We also extend the methodology by using deep Gaussian processes as the interpolation algorithm and compare the performance of our two variations, as well as another variation from the literature that uses long short-term memory networks, for the interpolation.


Citations: 2

Decision-theoretic inspection planning using imperfect and incomplete data

Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

DataCentric Engineering

Pub Date : 2021-11-10 DOI: 10.1017/dce.2021.18

D. Di Francesco, M. Chryssanthopoulos, M. Faber, U. Bharadwaj

Abstract Attempts to formalize inspection and monitoring strategies in industry have struggled to combine evidence from multiple sources (including subject matter expertise) in a mathematically coherent way. The perceived requirement for large amounts of data is often cited as the reason that quantitative risk-based inspection is incompatible with the sparse and imperfect information that is typically available to structural integrity engineers. Current industrial guidance is also limited in its methods of distinguishing quality of inspections, as this is typically based on simplified (qualitative) heuristics. In this paper, Bayesian multi-level (partial pooling) models are proposed as a flexible and transparent method of combining imperfect and incomplete information, to support decision-making regarding the integrity management of in-service structures. This work builds on the established theoretical framework for computing the expected value of information, by allowing for partial pooling between inspection measurements (or groups of measurements). This method is demonstrated for a simulated example of a structure with active corrosion in multiple locations, which acknowledges that the data will be associated with some precision, bias, and reliability. Quantifying the extent to which an inspection of one location can reduce uncertainty in damage models at remote locations has been shown to influence many aspects of the expected value of an inspection. These results are considered in the context of the current challenges in risk-based structural integrity management.



Citations: 8
