Pub Date: 2025-02-01, DOI: 10.1016/j.jocs.2024.102480
Diyi Liu, Weijie Du, Lin Lin, James P. Vary, Chao Yang
We present an efficient quantum circuit for block encoding a pairing Hamiltonian often studied in nuclear physics. Our block encoding scheme does not require mapping the creation and annihilation operators to the Pauli operators and representing the Hamiltonian as a linear combination of unitaries. Instead, we show how to encode the Hamiltonian directly using controlled swap operations. We analyze the gate complexity of the block encoding circuit and show that it scales polynomially with respect to the number of qubits required to represent a quantum state associated with the pairing Hamiltonian. We also show how the block encoding circuit can be combined with the quantum singular value transformation to construct an efficient quantum circuit for approximating the density of states of a pairing Hamiltonian. The techniques presented can be extended to encode more general second-quantized Hamiltonians.
Title: An efficient quantum circuit for block encoding a pairing Hamiltonian. Journal of Computational Science, vol. 85, Article 102480.
Exploring the behavior of complex industrial problems can become burdensome, especially in high-dimensional design spaces. Reduced Order Models (ROMs) aim to minimize the computational effort needed to study different design choices by exploiting already available data. In this work, we propose a methodology in which the full-order solution is replaced with a Proper Orthogonal Decomposition based ROM, enhanced by a multi-fidelity surrogate model. Multi-fidelity approaches make it possible to exploit heterogeneous information sources and consequently reduce the cost of creating the training data needed to build the ROM. To explore the capabilities of the multi-fidelity ROM, we present and discuss results and challenges for an automotive aerodynamic application, based on a geometric morphing of the DrivAer test case with multi-fidelity fluid-dynamics simulations.
Title: Integration of multi-fidelity methods in parametrized non-intrusive reduced order models for industrial applications. Authors: Fausto Dicech, Konstantinos Gkaragkounis, Lucia Parussini, Anna Spagnolo, Haysam Telib. Journal of Computational Science, vol. 85, Article 102511. Pub Date: 2025-02-01, DOI: 10.1016/j.jocs.2024.102511
Pub Date: 2025-02-01, DOI: 10.1016/j.jocs.2025.102531
Dongnian Jiang, Shuai Zhang
In complex industrial processes, incomplete datasets are common due to problems such as different sampling periods and data loss, which reduces the accuracy of industrial soft sensing models. To solve this problem, this paper proposes a missing data generation and filling method based on a conditional denoising diffusion model. First, a missing area detection method based on a binary mark array is used to locate the region of missing data, and a masking mechanism is applied to obtain the accurate location and size of the missing data. Then, the correlation between the original data and the mask matrix is learned with a multi-head self-attention mechanism, and is used as the condition for the original denoising diffusion model to ensure the accuracy of the generated data. Finally, the generated data are filled into the missing areas to construct a complete dataset, with the aim of improving the prediction accuracy of the soft sensor model. The simulation results demonstrate that the proposed imputation method performs exceptionally well in filling missing data. Compared to traditional methods, it significantly enhances the prediction accuracy of the soft sensor model, reducing the mean squared error by approximately 40 %.
Title: A filling method for missing soft measurement data based on a conditional denoising diffusion model. Journal of Computational Science, vol. 85, Article 102531.
The enhanced capabilities of autonomous underwater vehicles (AUVs) will facilitate sustainable exploration and utilization of maritime resources through improved precision in underwater mapping, resource extraction, and environmental surveillance. Enhanced navigation and communication systems will bolster the robustness and flexibility of AUVs, opening up new avenues for research and operations in demanding underwater conditions. The objective of this work is to optimize the performance of AUVs by developing navigation methodologies specifically designed for complex marine environments. To achieve this goal, this paper proposes a modified structure of the well-known metaheuristic differential evolution (DE). The proposed algorithm is called fitness-based differential evolution (FDE). By combining path planning techniques with the proposed FDE, this paper seeks to overcome obstacles such as underwater barriers, restricted communication, and limited visibility. These enhancements are anticipated to notably improve the efficacy and cognitive capabilities of AUVs. The proposed FDE algorithm is validated on nine AUV path planning case studies and compared with other metaheuristic algorithms. The comparison indicates the effectiveness of FDE in solving the AUV path planning problem.
Title: Autonomous underwater vehicle path planning using fitness-based differential evolution algorithm. Authors: Shubham Gupta, Ayush Kumar, Vinay Kumar, Shitu Singh, Sachin, Mayank Gautam. Journal of Computational Science, vol. 85, Article 102498. Pub Date: 2025-02-01, DOI: 10.1016/j.jocs.2024.102498
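For readers unfamiliar with the baseline that FDE modifies, the following is a minimal sketch of classical differential evolution (the DE/rand/1/bin variant). It is not the fitness-based modification proposed in the paper, whose details the abstract does not give. The function name, the parameter defaults (`f`, `cr`, `pop_size`), and the sphere test objective are illustrative assumptions; a path planning objective would replace `sphere`.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=200, seed=0):
    """Classical DE/rand/1/bin minimiser (illustrative; not the paper's FDE)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [objective(x) for x in pop]
    for _ in range(generations):
        next_pop, next_fit = list(pop), list(fitness)
        for i in range(pop_size):
            # Three distinct donors, none of them equal to the target index i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # forces at least one mutated component
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    value = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = pop[i][j]
                trial.append(value)
            trial_fit = objective(trial)
            if trial_fit <= fitness[i]:  # greedy one-to-one selection
                next_pop[i], next_fit[i] = trial, trial_fit
        pop, fitness = next_pop, next_fit
    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best]


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_f = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

In a path planning setting, each individual would typically encode waypoint coordinates and the objective would penalize path length and obstacle collisions; that encoding is an assumption here, not something stated in the abstract.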
Pub Date: 2025-02-01, DOI: 10.1016/j.jocs.2024.102475
Ruby, Vembu Shanthi, Higinio Ramos
This paper addresses the numerical solution of a weakly coupled system of quasilinear convection–diffusion equations with jump discontinuities in the convection and source terms. Because of the jump discontinuity in the convection term, the solution exhibits strong interior layers at the point of discontinuity. To approximate the solution of this problem, a hybrid difference technique is implemented on a Shishkin mesh. The proposed technique is proven to achieve almost second-order uniform convergence. Numerical examples are presented to validate the theoretical results.
Title: Numerical Analysis for a weakly coupled system of Singularly Perturbed Quasilinear Problem with non-smooth data. Journal of Computational Science, vol. 85, Article 102475.
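A Shishkin mesh is piecewise uniform: it places a fixed share of the grid points inside a thin region around each layer. The sketch below shows one common construction for an interior layer at a point `d` in (0, 1). The transition widths `min(d/2, sigma*eps*ln N)` and `min((1-d)/2, sigma*eps*ln N)`, the equal split of `N/4` intervals per subregion, and the default `sigma` are assumptions chosen for illustration; the paper's actual mesh may differ.

```python
import math


def shishkin_mesh_interior(n, eps, d, sigma=2.0):
    """Piecewise-uniform mesh on [0, 1] refined on both sides of x = d.

    n     : number of mesh intervals (must be divisible by 4)
    eps   : small perturbation parameter
    d     : location of the interior layer, 0 < d < 1
    sigma : constant in the transition width (assumed value)
    """
    if n % 4 != 0:
        raise ValueError("n must be divisible by 4")
    if not 0.0 < d < 1.0:
        raise ValueError("d must lie strictly inside (0, 1)")
    width = sigma * eps * math.log(n)
    tau_left = min(d / 2.0, width)
    tau_right = min((1.0 - d) / 2.0, width)
    breakpoints = [0.0, d - tau_left, d, d + tau_right, 1.0]
    quarter = n // 4
    mesh = []
    for a, b in zip(breakpoints, breakpoints[1:]):
        h = (b - a) / quarter  # uniform step inside this subregion
        mesh.extend(a + h * i for i in range(quarter))
    mesh.append(1.0)
    return mesh


mesh = shishkin_mesh_interior(64, 1e-4, 0.5)
```

For small `eps`, the two subregions adjacent to `d` have a much smaller step than the outer ones, which is what lets a difference scheme resolve the layer with an error bound independent of `eps`.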
Pub Date: 2025-01-28, DOI: 10.1016/j.jocs.2025.102530
Jiahao He, Zijun Chen, Xue Sun, Wenyuan Liu
Community search in bipartite graphs is an essential and extensively studied problem that aims to retrieve high-quality communities. A k-wing is a cohesive subgraph in which butterflies (i.e., (2, 2)-bicliques) are connected with each other. However, communities based on k-wings do not consider edge weights. Motivated by this, we investigate the problem of finding the top-r weighted k-wing communities in weighted bipartite graphs. To solve this problem, we propose two baseline algorithms, GlobalSearch and LocalSearch. The former returns results only after finding all communities, while the latter reduces the search space by using a group of subgraphs of increasing size. Inspired by LocalSearch, we propose an offline index, WNC-Index, to filter out edges that are not in the results. In addition, we prove that butterfly connectivity can be transformed into bloom connectivity, so finding k-wings can be accelerated by using blooms. Based on this, we propose an online index, BCC-Index, which improves the key steps in our algorithms. Moreover, the two indexes can be used together to speed up query processing and reduce the space cost of BCC-Index. Finally, we conduct extensive experiments on ten real-world datasets. The results demonstrate the efficiency and effectiveness of the proposed algorithms.
Title: Finding top-r weighted k-wing communities in bipartite graphs. Journal of Computational Science, vol. 85, Article 102530.
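The butterfly, the (2, 2)-biclique on which k-wings are built, can be made concrete with a small counting routine. This is a straightforward pairwise sketch for illustration only, not the indexing or search algorithms of the paper; the adjacency representation (a dict from left vertices to sets of right neighbours) and the function name are assumptions.

```python
from itertools import combinations


def count_butterflies(adj):
    """Count (2, 2)-bicliques in a bipartite graph.

    adj maps each left-side vertex to the collection of its right-side
    neighbours. Two left vertices sharing c right neighbours contribute
    c * (c - 1) / 2 butterflies.
    """
    neighbours = {u: set(vs) for u, vs in adj.items()}
    total = 0
    for u, v in combinations(neighbours, 2):
        common = len(neighbours[u] & neighbours[v])
        total += common * (common - 1) // 2
    return total


# Complete bipartite graph K_{3,3}: every left vertex sees all right vertices.
k33 = {u: {"a", "b", "c"} for u in (1, 2, 3)}
n_butterflies = count_butterflies(k33)
```

The pairwise loop is quadratic in the number of left vertices, which is fine for a toy graph; large graphs are usually handled with wedge-based counting instead.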
Pub Date: 2024-11-25, DOI: 10.1016/j.jocs.2024.102478
Vo Anh Khoa, Pham Minh Quan, Ja’Niyah Allen, Kbenesh W. Blayneh
In this paper, we introduce a novel numerical approach for approximating the Susceptible–Infectious–Recovered (SIR) model in epidemiology. Our method enhances the existing linearization procedure by incorporating a suitable relaxation term to tackle the transcendental equation of nonlinear type. Developed within the continuous framework, our relaxation method is explicit and easy to implement, relying on a sequence of linear differential equations. This approach yields accurate approximations in both discrete and analytical forms. Through rigorous analysis, we prove that, with an appropriate choice of the relaxation parameter, our numerical scheme is non-negativity-preserving; moreover, it is strongly convergent to the true solution. We also extend the applicability of our relaxation method to handle some variations of the traditional SIR model. Finally, we present numerical examples using simulated data to demonstrate the effectiveness of our proposed method.
Title: Efficient relaxation scheme for the SIR and related compartmental models. Journal of Computational Science, vol. 84, Article 102478.
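As a point of reference for the model being approximated, the standard SIR system is S' = -beta*S*I, I' = beta*S*I - gamma*I, R' = gamma*I. The sketch below integrates it with plain forward Euler. This is a baseline for orientation only and is not the relaxation scheme of the paper, which the abstract does not specify; the parameter values, the normalised population (S + I + R = 1), and the function name are assumptions.

```python
def simulate_sir(beta, gamma, s0, i0, r0, dt, steps):
    """Forward-Euler integration of the classical SIR model.

    beta  : transmission rate
    gamma : recovery rate
    dt    : time step (must be small enough to keep S and I non-negative)
    Returns a list of (S, I, R) tuples of length steps + 1.
    """
    s, i, r = s0, i0, r0
    trajectory = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
        trajectory.append((s, i, r))
    return trajectory


trajectory = simulate_sir(beta=0.3, gamma=0.1, s0=0.99, i0=0.01, r0=0.0,
                          dt=0.1, steps=1000)
```

Forward Euler preserves the total S + I + R up to rounding, but it keeps the compartments non-negative only for sufficiently small `dt`, which is one reason schemes with provable non-negativity are of interest.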
Pub Date: 2024-11-25, DOI: 10.1016/j.jocs.2024.102476
Nitha V.R., Vinod Chandra S.S.
Lung cancer, the second most prevalent and lethal cancer, is caused by aberrant and uncontrolled cell division in the lungs. Once lung cancer spreads to surrounding tissues or organs, the likelihood of recovery declines; hence, early detection is vital. Machine learning has shown significant potential in several healthcare applications. By examining various factors and trends in the data, a machine learning model can predict lung cancer risk by pinpointing those more susceptible to the illness. Among the various causes of lung cancer, pesticide exposure is a major contributor. ‘Pesticide’ refers to any chemical used in agriculture to manage pests such as weeds and insects. Numerous health hazards, including the possibility of developing cancer, have been linked to exposure to specific pesticides. Our objective is to earn the trust of medical professionals and patients, which depends on how interpretable machine learning models are in healthcare. This paper uses a public dataset from a Thai case study to predict the risk of lung cancer caused by pesticide exposure. Since the dataset was highly imbalanced, a hybrid resampling technique was used, combining the Synthetic Minority Oversampling Technique (SMOTE) and Edited Nearest Neighbor (ENN). We applied a two-stage feature selection technique combining an Extra Tree Classifier and Principal Component Analysis. An eXplainable XGBoost Classifier is developed to predict lung cancer risk based on pesticide exposure. The robustness of the model is reflected in the results, with accuracy, sensitivity, and F1-Score of 99.00%, 98.87%, and 98.57%, respectively. Two further public datasets were used to assess how well the model generalizes, and the model performed well on both. It achieved accuracy, sensitivity, and F1-Score of 99.00%, 99.00%, and 99.33% on the ‘Lung Cancer Prediction’ dataset. Trained and tested on the ‘survey lung cancer’ dataset, it obtained accuracy, sensitivity, and F1-Score of 99.00%, 99.00%, and 99.00%, respectively. The proposed model outperformed existing state-of-the-art methodologies on these quality metrics. The model is interpreted with the XAI (eXplainable Artificial Intelligence) technique SHapley Additive exPlanations (SHAP), thereby identifying the most relevant features contributing to lung cancer risk.
Title: An eXplainable machine learning framework for predicting the impact of pesticide exposure in lung cancer prognosis. Journal of Computational Science, vol. 84, Article 102476.
Pub Date: 2024-11-22, DOI: 10.1016/j.jocs.2024.102467
Dennis Christensen, Per August Jarval Moen
In Bayesian statistics, the marginal likelihood (ML) is the key ingredient needed for model comparison and model averaging. Unfortunately, estimating MLs accurately is notoriously difficult, especially for models where posterior simulation is not possible. Recently, the idea of permutation counting was introduced, which provides an estimator which can accurately estimate MLs of models for exchangeable binary responses. Such data arise in a multitude of statistical problems, including binary classification, bioassay and sensitivity testing. Permutation counting is entirely likelihood-free and works for any model from which a random sample can be generated, including nonparametric models. Here we present perms, a package implementing permutation counting. Following optimisation efforts, perms is computationally efficient and can handle large data problems. It is available as both an R package and a Python library. A broad gallery of examples illustrating its usage is provided, which includes both standard parametric binary classification and novel applications of nonparametric models, such as changepoint analysis. We also cover the details of the implementation of perms and illustrate its computational speed via a simple simulation study.
Title: perms: Likelihood-free estimation of marginal likelihoods for binary response data in Python and R. Journal of Computational Science, vol. 84, Article 102467.
This paper considers a new approach for estimating people flow in buildings from elevator trip records and the corresponding load data; the resulting model is used on the virtual sensing platform we have developed. People flow data can be used to improve elevator performance through optimal car assignments to hall calls by a group controller, and are useful for estimating occupant distributions as heat loads, allowing optimized air-conditioning control to realize energy savings. The data available from an elevator controller are insufficient for exact people flow estimation, so the problem is under-defined. Our virtual sensing platform adopts equation-based modeling and optimization-based parameter estimation, which estimates application-related parameters from available sensor data and tolerates over- or under-defined situations among sensory information; however, a better mathematical formulation is essential for accurate parameter estimation on this platform. Accordingly, we propose a new method that defines a trip-wise mathematical formulation for each elevator trip by modifying pre-defined base equations or defining additional equations. The key idea is that each elevator trip has different features, including sparsity, that are useful for improving accuracy and can be formulated as simultaneous equations that our virtual sensing platform accepts. The procedure for defining a mathematical formulation is invoked after trip data are obtained, and we refer to this procedure as “on-the-fly mathematical formulation.” The formulated trip-wise equations are combined as simultaneous equations for estimating people flow over a given period on the virtual sensing platform by mathematical optimization.
Title: On-the-fly mathematical formulation for estimating people flow from elevator load data in smart building virtual sensing platforms. Authors: Koichi Kondo, Ryosuke Ohori, Kiyotaka Matsue, Hiroyuki Aizu. Journal of Computational Science, vol. 84, Article 102488. Pub Date: 2024-11-22, DOI: 10.1016/j.jocs.2024.102488