Journal of Computational Science: Latest Articles

An efficient quantum circuit for block encoding a pairing Hamiltonian
IF 3.1 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-02-01 | DOI: 10.1016/j.jocs.2024.102480
Diyi Liu , Weijie Du , Lin Lin , James P. Vary , Chao Yang
We present an efficient quantum circuit for block encoding a pairing Hamiltonian often studied in nuclear physics. Our block encoding scheme does not require mapping the creation and annihilation operators to the Pauli operators and representing the Hamiltonian as a linear combination of unitaries. Instead, we show how to encode the Hamiltonian directly using controlled swap operations. We analyze the gate complexity of the block encoding circuit and show that it scales polynomially with respect to the number of qubits required to represent a quantum state associated with the pairing Hamiltonian. We also show how the block encoding circuit can be combined with the quantum singular value transformation to construct an efficient quantum circuit for approximating the density of states of a pairing Hamiltonian. The techniques presented can be extended to encode more general second-quantized Hamiltonians.
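For readers unfamiliar with the term, the standard definition of a block encoding can be written as follows. This is the textbook definition, not the specific circuit from the paper; the symbols $U$ (the encoding unitary), $m$ (the number of ancilla qubits), and $\alpha$ (the normalization factor) are notation assumed here and do not appear in the abstract.

```latex
U = \begin{pmatrix} H/\alpha & \ast \\ \ast & \ast \end{pmatrix},
\qquad
\left( \langle 0^{m} | \otimes I \right) U \left( | 0^{m} \rangle \otimes I \right) = \frac{H}{\alpha},
\qquad
\alpha \ge \lVert H \rVert_{2}.
```

The condition $\alpha \ge \lVert H \rVert_{2}$ is needed because every sub-block of a unitary matrix has spectral norm at most 1.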
Journal of Computational Science, Volume 85, Article 102480
Citations: 0
Integration of multi-fidelity methods in parametrized non-intrusive reduced order models for industrial applications
IF 3.1 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-02-01 | DOI: 10.1016/j.jocs.2024.102511
Fausto Dicech , Konstantinos Gkaragkounis , Lucia Parussini , Anna Spagnolo , Haysam Telib
Exploring the behavior of complex industrial problems can become burdensome, especially in high-dimensional design spaces. Reduced Order Models (ROMs) aim to minimize the computational effort needed to study different design choices by exploiting already available data. In this work, we propose a methodology where the full-order solution is replaced with a Proper Orthogonal Decomposition based ROM, enhanced by a multi-fidelity surrogate model. Multi-fidelity approaches make it possible to exploit heterogeneous information sources and consequently reduce the cost of creating the training data needed to build the ROM. To explore the multi-fidelity ROM capabilities, we present and discuss results and challenges for an automotive aerodynamic application, based on a geometric morphing of the DrivAer test case with multi-fidelity fluid-dynamics simulations.
Journal of Computational Science, Volume 85, Article 102511
Citations: 0
A filling method for missing soft measurement data based on a conditional denoising diffusion model
IF 3.1 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-02-01 | DOI: 10.1016/j.jocs.2025.102531
Dongnian Jiang, Shuai Zhang
In complex industrial processes, incomplete datasets are common due to problems such as different sampling periods and data loss, which reduces the accuracy of industrial soft sensing models. To solve this problem, this paper proposes a missing data generation and filling method based on a conditional denoising diffusion model. First, a missing area detection method based on a binary mark array is used to locate the region of missing data, and a masking mechanism is applied to obtain the accurate location and size of the missing data. Then, the correlation between the original data and the mask matrix is learned with a multi-head self-attention mechanism, and is used as the condition for the original denoising diffusion model to ensure the accuracy of the generated data. Finally, the generated data are filled into the missing areas to construct a complete dataset, with the aim of improving the prediction accuracy of the soft sensor model. The simulation results demonstrate that the proposed imputation method performs exceptionally well in filling missing data. Compared to traditional methods, it significantly enhances the prediction accuracy of the soft sensor model, reducing the mean squared error by approximately 40 %.
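The first step described in the abstract, locating missing regions with a binary array, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_mask` and `missing_regions`, the use of `None` to mark a missing reading, and the sample data are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def build_mask(series: List[Optional[float]]) -> List[int]:
    """Return 1 where a value is observed and 0 where it is missing (None)."""
    return [0 if value is None else 1 for value in series]


def missing_regions(mask: List[int]) -> List[Tuple[int, int]]:
    """Return (start_index, length) for every contiguous run of zeros."""
    regions: List[Tuple[int, int]] = []
    start = None  # index where the current missing run began, if any
    for index, flag in enumerate(mask):
        if flag == 0 and start is None:
            start = index
        elif flag == 1 and start is not None:
            regions.append((start, index - start))
            start = None
    if start is not None:  # the series ends inside a missing run
        regions.append((start, len(mask) - start))
    return regions


# Hypothetical sensor readings with two gaps.
readings = [1.0, None, None, 4.0, None]
mask = build_mask(readings)
regions = missing_regions(mask)
```

The mask and the region list carry the location and size information that a conditional generative model would then be asked to fill in.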
Journal of Computational Science, Volume 85, Article 102531
Citations: 0
Autonomous underwater vehicle path planning using fitness-based differential evolution algorithm
IF 3.1 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-02-01 | DOI: 10.1016/j.jocs.2024.102498
Shubham Gupta , Ayush Kumar , Vinay Kumar , Shitu Singh , Sachin , Mayank Gautam
The enhanced capabilities of autonomous underwater vehicles (AUVs) will facilitate sustainable exploration and utilization of maritime resources through improved precision in underwater mapping, resource extraction, and environmental surveillance. Enhanced navigation and communication systems will bolster the robustness and flexibility of AUVs, opening up new avenues for research and operations in demanding underwater conditions. The objective of this work is to optimize the performance of AUVs by developing navigation methodologies specifically designed for complex marine environments. To achieve this goal, this paper proposes a modified structure of the well-known metaheuristic called differential evolution (DE). The proposed algorithm is called the fitness-based differential evolution (FDE) algorithm. By combining path planning techniques with the proposed FDE, this paper seeks to overcome obstacles such as underwater barriers, restricted communication, and limited visibility. These enhancements are anticipated to notably improve the efficacy and cognitive capabilities of AUVs. The proposed FDE algorithm is validated on nine AUV path planning case studies and compared with other metaheuristic algorithms. The comparison indicates the effectiveness of FDE in solving the AUV path planning problem.
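As background for the metaheuristic named above, here is a minimal sketch of classic differential evolution (the DE/rand/1/bin variant). It is not the fitness-based FDE modification proposed in the paper, whose details the abstract does not give. The function names, the parameter defaults (`F`, `CR`, population size), and the sphere test function are assumptions chosen for illustration; the sphere function is a stand-in for a real AUV path cost.

```python
import random
from typing import Callable, List, Sequence, Tuple


def differential_evolution(
    objective: Callable[[Sequence[float]], float],
    bounds: List[Tuple[float, float]],
    pop_size: int = 20,
    generations: int = 200,
    F: float = 0.5,    # differential weight (assumed default)
    CR: float = 0.9,   # crossover rate (assumed default)
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimize `objective` with classic DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [objective(ind) for ind in population]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # guarantees at least one mutated coordinate
            trial = []
            for d in range(dim):
                if rng.random() < CR or d == forced:
                    value = population[a][d] + F * (population[b][d] - population[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = population[i][d]
                trial.append(value)
            trial_fitness = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_fitness <= fitness[i]:
                population[i], fitness[i] = trial, trial_fitness

    best = min(range(pop_size), key=fitness.__getitem__)
    return population[best], fitness[best]


def sphere(x: Sequence[float]) -> float:
    """Simple convex test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_point, best_value = differential_evolution(sphere, [(-5.0, 5.0)] * 2)
```

In a path planning setting, each individual would typically encode a set of waypoints and the objective would penalize path length and obstacle collisions; that encoding is outside the scope of this sketch.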
Journal of Computational Science, Volume 85, Article 102498
Citations: 0
Numerical Analysis for a weakly coupled system of Singularly Perturbed Quasilinear Problem with non-smooth data
IF 3.1 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-02-01 | DOI: 10.1016/j.jocs.2024.102475
Ruby , Vembu Shanthi , Higinio Ramos
This paper aims to solve a weakly coupled system of quasilinear convection-diffusion equations with jump discontinuities in the convection and source terms. Due to the jump discontinuity in the convection term, the solution exhibits strong interior layers at the point of discontinuity. To approximate the solution of this problem, a hybrid difference technique is implemented on a Shishkin mesh. The proposed technique is proven to be almost second-order uniformly convergent. Numerical examples are presented to validate the theoretical results.
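A Shishkin mesh is piecewise uniform: it places a fixed share of the mesh points inside a thin region around each layer. The sketch below builds such a mesh on [0, 1] for an interior layer at a point `d`, refining on both sides of it. The exact transition widths used in the paper are not given in the abstract, so the formula `2 * eps * ln(n) / theta`, the parameter `theta`, the equal split of points into four subintervals, and the function name are assumptions made for illustration.

```python
import math
from typing import List


def interior_layer_shishkin_mesh(
    n: int, eps: float, d: float = 0.5, theta: float = 1.0
) -> List[float]:
    """Piecewise-uniform mesh on [0, 1] with n intervals, refined around x = d."""
    if n < 4 or n % 4 != 0:
        raise ValueError("n must be a positive multiple of 4")
    if not 0.0 < d < 1.0:
        raise ValueError("d must lie strictly inside (0, 1)")

    # Transition widths (assumed form): never wider than half of each side.
    sigma_left = min(d / 2.0, 2.0 * eps * math.log(n) / theta)
    sigma_right = min((1.0 - d) / 2.0, 2.0 * eps * math.log(n) / theta)

    breakpoints = [0.0, d - sigma_left, d, d + sigma_right, 1.0]
    quarter = n // 4  # n/4 intervals in each of the four subintervals

    mesh: List[float] = []
    for left, right in zip(breakpoints, breakpoints[1:]):
        step = (right - left) / quarter
        mesh.extend(left + k * step for k in range(quarter))
    mesh.append(1.0)
    return mesh


mesh = interior_layer_shishkin_mesh(n=16, eps=1e-3)
```

For small `eps` the two inner subintervals are much narrower than the outer ones, so half of the mesh points resolve the layer region next to `d`.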
Journal of Computational Science, Volume 85, Article 102475
Citations: 0
Finding top-r weighted k-wing communities in bipartite graphs
IF 3.1 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-01-28 | DOI: 10.1016/j.jocs.2025.102530
Jiahao He, Zijun Chen, Xue Sun, Wenyuan Liu
Community search in bipartite graphs is an essential and extensively studied problem that aims to retrieve high-quality communities. A k-wing is a cohesive subgraph in which butterflies (i.e., (2, 2)-bicliques) are connected with each other. However, communities based on k-wings do not consider the weights of edges. Motivated by this, we investigate the problem of finding the top-r weighted k-wing communities in weighted bipartite graphs. To solve this problem, we propose two baseline algorithms, GlobalSearch and LocalSearch. The former produces results after finding all communities, while the latter aims to reduce the search space by utilizing a group of subgraphs of increasing size. Inspired by LocalSearch, we propose an offline index, WNC-Index, to filter out edges that are not in the results. In addition, we prove that butterfly connectivity can be transformed to bloom connectivity, so the search for k-wings can be accelerated by utilizing blooms. Based on this, we propose an online index, BCC-Index, which improves the key steps in our algorithms. Moreover, these two indexes can be used simultaneously to speed up the query process and reduce the space cost of BCC-Index. Finally, we conduct extensive experiments on ten real-world datasets. The results demonstrate the efficiency and effectiveness of the proposed algorithms.
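To make the butterfly notion concrete: a butterfly is a (2, 2)-biclique, that is, two vertices on one side of the bipartite graph that share two neighbors on the other side. The sketch below counts butterflies with a simple pairwise approach. It only illustrates the definition and is not one of the paper's algorithms or indexes; the function name and the adjacency-dictionary representation are assumptions.

```python
from itertools import combinations
from math import comb
from typing import Dict, Hashable, Set


def count_butterflies(adjacency: Dict[Hashable, Set[Hashable]]) -> int:
    """Count (2, 2)-bicliques given left-side vertices mapped to right-side neighbors."""
    total = 0
    for u, v in combinations(adjacency, 2):
        shared = len(adjacency[u] & adjacency[v])
        # Every pair of shared neighbors closes one butterfly with u and v.
        total += comb(shared, 2)
    return total


# Complete bipartite graphs K_{2,2}, K_{2,3}, and K_{3,3} as small checks.
k22 = {"u1": {"v1", "v2"}, "u2": {"v1", "v2"}}
k23 = {"u1": {"v1", "v2", "v3"}, "u2": {"v1", "v2", "v3"}}
k33 = {f"u{i}": {"v1", "v2", "v3"} for i in range(3)}
no_butterfly = {"u1": {"v1"}, "u2": {"v1"}}
```

This pairwise scan costs time quadratic in the number of left-side vertices, which is one reason practical systems rely on indexes and pruning instead.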
Journal of Computational Science, Volume 85, Article 102530
Citations: 0
Efficient relaxation scheme for the SIR and related compartmental models
IF 3.1 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-11-25 | DOI: 10.1016/j.jocs.2024.102478
Vo Anh Khoa , Pham Minh Quan , Ja’Niyah Allen , Kbenesh W. Blayneh
In this paper, we introduce a novel numerical approach for approximating the Susceptible–Infectious–Recovered (SIR) model in epidemiology. Our method enhances the existing linearization procedure by incorporating a suitable relaxation term to tackle the transcendental equation of nonlinear type. Developed within the continuous framework, our relaxation method is explicit and easy to implement, relying on a sequence of linear differential equations. This approach yields accurate approximations in both discrete and analytical forms. Through rigorous analysis, we prove that, with an appropriate choice of the relaxation parameter, our numerical scheme is non-negativity-preserving; moreover, it is strongly convergent to the true solution. We also extend the applicability of our relaxation method to handle some variations of the traditional SIR model. Finally, we present numerical examples using simulated data to demonstrate the effectiveness of our proposed method.
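For context, the classical SIR model can be simulated in a few lines. The sketch below uses plain forward Euler time stepping, which is a deliberately simple stand-in and not the relaxation scheme proposed in the paper. The function name, the parameter values, and the use of population fractions (so that S + I + R = 1) are assumptions made for illustration.

```python
from typing import List, Tuple


def simulate_sir(
    beta: float, gamma: float,
    s0: float, i0: float, r0: float,
    dt: float, steps: int,
) -> List[Tuple[float, float, float]]:
    """Forward Euler integration of the SIR model in population fractions.

    dS/dt = -beta * S * I
    dI/dt =  beta * S * I - gamma * I
    dR/dt =  gamma * I
    """
    s, i, r = s0, i0, r0
    trajectory = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s, i, r = s - new_infections, i + new_infections - new_recoveries, r + new_recoveries
        trajectory.append((s, i, r))
    return trajectory


# Hypothetical parameters: transmission rate 0.3, recovery rate 0.1.
trajectory = simulate_sir(beta=0.3, gamma=0.1, s0=0.99, i0=0.01, r0=0.0, dt=0.1, steps=1000)
final_s, final_i, final_r = trajectory[-1]
```

Forward Euler keeps the compartments non-negative only when the step size is small enough (here `dt * beta` and `dt * gamma` are both well below 1), which makes a useful contrast with the unconditional non-negativity result the abstract reports for a suitable relaxation parameter.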
Journal of Computational Science, Volume 84, Article 102478
Citations: 0
An eXplainable machine learning framework for predicting the impact of pesticide exposure in lung cancer prognosis
IF 3.1 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-11-25 | DOI: 10.1016/j.jocs.2024.102476
Nitha V.R., Vinod Chandra S.S.
Lung cancer, the second most prevalent and lethal cancer, is caused by aberrant and uncontrolled cell division in the lungs. Once lung cancer spreads to surrounding tissues or organs, the likelihood of recovery declines; hence, early detection is vital. Machine learning has shown significant potential in several healthcare applications. By examining various factors and trends in the data, a machine learning model can predict lung cancer risk by pinpointing those more susceptible to the illness. Among the various causes of lung cancer, pesticide exposure is a major contributor. ‘Pesticide’ refers to any chemical used in agriculture to manage pests like weeds and insects. Numerous health hazards, including the possibility of developing cancer, have been linked to exposure to specific pesticides. Our objective is to earn the trust of medical professionals and patients by making machine learning models in healthcare interpretable. This paper implements the proposed study using a public dataset from a Thai case study to predict the risk of lung cancer caused by pesticide exposure. Since the dataset was highly imbalanced, a hybrid resampling technique was utilized, combining the Synthetic Minority Oversampling Technique (SMOTE) and Edited Nearest Neighbor (ENN). We applied a two-stage feature selection technique combining an Extra Tree Classifier and Principal Component Analysis. An eXplainable XGBoost Classifier is developed to predict lung cancer risk based on pesticide exposure. The robustness of the model is reflected in the results, with accuracy, sensitivity, and F1-Score of 99.00%, 98.87%, and 98.57%, respectively. Two further public datasets were utilized to test how well the model generalizes, and the model performed well on both. The model achieved accuracy, sensitivity, and F1-Score of 99.00%, 99.00%, and 99.33% on the ‘Lung Cancer Prediction’ dataset. On the ‘survey lung cancer’ dataset, the model obtained an accuracy, sensitivity, and F1-Score of 99.00%, 99.00%, and 99.00%, respectively. The proposed model outperformed existing state-of-the-art methodologies on these quality metrics. The XAI (eXplainable Artificial Intelligence) model is illustrated with SHapley Additive exPlanations (SHAP), thereby identifying the most relevant features contributing to lung cancer risk.
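The three quality metrics quoted in the abstract have standard definitions in terms of confusion-matrix counts. The sketch below computes them; the function name and the example counts are hypothetical and are not taken from the paper's results.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, sensitivity (recall), and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)        # share of actual positives that were found
    f1 = 2 * tp / (2 * tp + fp + fn)    # harmonic mean of precision and recall
    return {"accuracy": accuracy, "sensitivity": sensitivity, "f1": f1}


# Hypothetical counts: 90 true positives, 10 false positives,
# 10 false negatives, 90 true negatives.
metrics = classification_metrics(tp=90, fp=10, fn=10, tn=90)
```

On imbalanced data, accuracy alone can be misleading, which is why sensitivity and F1-score are reported alongside it.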
Journal of Computational Science, Volume 84, Article 102476
Citations: 0
perms: Likelihood-free estimation of marginal likelihoods for binary response data in Python and R
IF 3.1 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-11-22 | DOI: 10.1016/j.jocs.2024.102467
Dennis Christensen , Per August Jarval Moen
In Bayesian statistics, the marginal likelihood (ML) is the key ingredient needed for model comparison and model averaging. Unfortunately, estimating MLs accurately is notoriously difficult, especially for models where posterior simulation is not possible. Recently, the idea of permutation counting was introduced, which provides an estimator which can accurately estimate MLs of models for exchangeable binary responses. Such data arise in a multitude of statistical problems, including binary classification, bioassay and sensitivity testing. Permutation counting is entirely likelihood-free and works for any model from which a random sample can be generated, including nonparametric models. Here we present perms, a package implementing permutation counting. Following optimisation efforts, perms is computationally efficient and can handle large data problems. It is available as both an R package and a Python library. A broad gallery of examples illustrating its usage is provided, which includes both standard parametric binary classification and novel applications of nonparametric models, such as changepoint analysis. We also cover the details of the implementation of perms and illustrate its computational speed via a simple simulation study.
Journal of Computational Science, Volume 84, Article 102467. Published 2024-11-22.
Citations: 0
On-the-fly mathematical formulation for estimating people flow from elevator load data in smart building virtual sensing platforms
IF 3.1 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-11-22 DOI: 10.1016/j.jocs.2024.102488
Koichi Kondo, Ryosuke Ohori, Kiyotaka Matsue, Hiroyuki Aizu
This paper considers a new approach for estimating people flow in buildings from elevator trip records and the corresponding load data; the resulting model is used on the virtual sensing platform we have developed. People flow data can be used to improve elevator performance through optimal car assignments to hall calls by a group controller, and are useful for estimating occupant distributions as heat loads, allowing for optimized air-conditioning control to realize energy savings. The data available from an elevator controller are insufficient for exact people flow estimation, so the problem is under-defined. Our virtual sensing platform adopts equation-based modeling and optimization-based parameter estimation, which estimates application-related parameters from available sensor data and allows for over- or under-defined situations among sensory information. A better mathematical formulation is nevertheless essential for accurate parameter estimation on this platform. Accordingly, we propose a new method to define an elevator trip-wise mathematical formulation by modifying pre-defined base equations or defining additional equations. The key idea is that each elevator trip has different features, including sparsity, that are useful for improving accuracy and can be successfully formulated as simultaneous equations that our virtual sensing platform accepts. The procedure for defining a mathematical formulation is invoked after trip data are obtained, and we refer to this procedure as “on-the-fly mathematical formulation.” The formulated trip-wise equations are combined as simultaneous equations for estimating people flow over a given period on the virtual sensing platform by mathematical optimization.
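The idea of building equations per trip, only after the trip data are known, can be sketched as follows. This is not the authors' formulation or their platform's interface: it assumes a single one-directional trip, a fixed hypothetical weight of 70 kg per person, and a load reading taken after the car leaves each stop. The names `trip_equations` and `residuals` are invented for illustration. Unknowns are created only for floors the car actually stopped at, which is one simple way sparsity can enter.

```python
from itertools import combinations


def trip_equations(stops, load_after_kg, kg_per_person=70.0):
    """Build one load-balance equation per stop of a single trip.

    stops:          floors visited, in travel order.
    load_after_kg:  car load measured after leaving each stop.
    Unknown x[(i, j)] is the number of people riding from stop i to stop j.
    Returns the list of unknowns and a list of (coefficients, rhs) equations.
    """
    pairs = list(combinations(stops, 2))  # (origin, destination) in travel order
    position = {floor: k for k, floor in enumerate(stops)}
    equations = []
    for floor, load in zip(stops, load_after_kg):
        k = position[floor]
        # On board after leaving `floor`: boarded at or before it, alighting later.
        onboard = [p for p in pairs if position[p[0]] <= k < position[p[1]]]
        if onboard:  # the final stop has nobody on board and yields no equation
            equations.append(({p: kg_per_person for p in onboard}, load))
    return pairs, equations


def residuals(equations, x):
    """Left-hand side minus right-hand side of each equation for a candidate x."""
    return [sum(c * x[p] for p, c in coeffs.items()) - rhs
            for coeffs, rhs in equations]


# Hypothetical trip: the car stops at floors 1, 3 and 5 on the way up.
stops = [1, 3, 5]
loads = [140.0, 70.0, 0.0]
pairs, eqs = trip_equations(stops, loads)

# Two different passenger assignments that both reproduce the measured loads.
flow_a = {(1, 3): 2, (1, 5): 0, (3, 5): 1}
flow_b = {(1, 3): 1, (1, 5): 1, (3, 5): 0}
print(len(pairs), "unknowns,", len(eqs), "equations")
print(residuals(eqs, flow_a), residuals(eqs, flow_b))
```

Here three unknowns face only two equations, and two distinct assignments leave zero residual, which illustrates why the abstract calls the problem under-defined and why extra trip-specific equations or an optimization criterion are needed to pick one estimate.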
Journal of Computational Science, Volume 84, Article 102488. Published 2024-11-22.
Citations: 0