Uncertainty Quantification Using Ensemble Learning and Monte Carlo Sampling for Performance Prediction and Monitoring in Cell Culture Processes
Thanh Tung Khuat, Robert Bassett, Ellen Otte, Bogdan Gabrys
arXiv - QuanBio - Quantitative Methods, 2024-09-03, arXiv:2409.02149
Abstract
Biopharmaceutical products, particularly monoclonal antibodies (mAbs), have
gained prominence in the pharmaceutical market due to their high specificity
and efficacy. As these products are projected to constitute a substantial
portion of global pharmaceutical sales, the application of machine learning
models in mAb development and manufacturing is gaining momentum. This paper
addresses the critical need for uncertainty quantification in machine learning
predictions, particularly in scenarios with limited training data. Leveraging
ensemble learning and Monte Carlo simulations, our proposed method generates
additional input samples to make the model more robust when trained on small
training datasets. We evaluate the efficacy of our approach through two case
studies: predicting antibody concentrations in advance and monitoring glucose
concentrations in real time during bioreactor runs using Raman spectral data. Our
findings demonstrate the effectiveness of the proposed method in estimating the
uncertainty levels associated with process performance predictions and
facilitating real-time decision-making in biopharmaceutical manufacturing. This
contribution not only introduces a novel approach for uncertainty
quantification but also provides insights into overcoming challenges posed by
small training datasets in bioprocess development. The evaluation also shows
that the method addresses key challenges of uncertainty estimation in upstream
cell cultivation, which suggests it could improve process control and product
quality in biopharmaceutical manufacturing.
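The abstract names the ingredients of the method (ensemble learning plus Monte Carlo sampling of additional input samples) but not its details. The sketch below is one possible reading, not the authors' implementation. Everything in it is an assumption made for illustration: the bootstrap ensemble of one-feature linear models, the Gaussian perturbation of the input with standard deviation `input_sd`, the percentile interval, the function names, and the synthetic toy data.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on a single feature."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:  # degenerate resample: every x is identical
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def fit_bootstrap_ensemble(xs, ys, n_models, rng):
    """Train each ensemble member on a bootstrap resample of the training set."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_with_uncertainty(models, x, input_sd, n_mc, rng):
    """Monte Carlo step: perturb the input and pass each draw through every member."""
    preds = []
    for _ in range(n_mc):
        x_draw = rng.gauss(x, input_sd)  # assumed Gaussian input noise
        preds.extend(a * x_draw + b for a, b in models)
    preds.sort()
    lo = preds[int(0.025 * (len(preds) - 1))]  # empirical 2.5th percentile
    hi = preds[int(0.975 * (len(preds) - 1))]  # empirical 97.5th percentile
    return statistics.fmean(preds), lo, hi


rng = random.Random(0)
# Synthetic toy data (not from the paper): y = 2x + 1 plus noise, 12 points.
train_x = [float(i) for i in range(12)]
train_y = [2.0 * x + 1.0 + rng.gauss(0.0, 0.5) for x in train_x]

ensemble = fit_bootstrap_ensemble(train_x, train_y, n_models=50, rng=rng)
mean, lower, upper = predict_with_uncertainty(
    ensemble, x=6.0, input_sd=0.2, n_mc=200, rng=rng
)
print(f"prediction {mean:.2f}, 95% interval [{lower:.2f}, {upper:.2f}]")
```

The spread of the pooled predictions combines two sources: disagreement between ensemble members (model uncertainty from the small training set) and the effect of input noise (the Monte Carlo draws). In a real setting the linear model would be replaced by whatever regressor suits the process data, and `input_sd` would come from the measurement error of the sensor or assay.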