Data-driven discovery of chemotactic migration of bacteria via coordinate-invariant machine learning.

IF 3.3 · Zone 3 (Biology) · Q2 (Biochemical Research Methods) · BMC Bioinformatics · Pub Date: 2024-10-24 · DOI: 10.1186/s12859-024-05929-w

Yorgos M Psarellis, Seungjoon Lee, Tapomoy Bhattacharjee, Sujit S Datta, Juan M Bello-Rivas, Ioannis G Kevrekidis

BMC Bioinformatics 25(1):337. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11515320/pdf/


Abstract

Background: E. coli chemotactic motion in the presence of a chemonutrient field can be studied using, among other approaches, wet laboratory experiments or macroscale partial differential equations (PDEs). Bridging experimental measurements and chemotactic PDEs requires knowledge of the evolution of all underlying fields, initial and boundary conditions, and often necessitates strong assumptions. In this work, we propose machine learning approaches, along with ideas from the Whitney and Takens embedding theorems, to circumvent these challenges.

Results: Machine learning approaches for identifying underlying PDEs were (a) validated using simulation data from established continuum models and (b) used to infer chemotactic PDEs from experimental data. Such data-driven models were surrogates either for the entire right-hand side of the chemotactic PDE (black-box models) or, in a more targeted fashion, just for the chemotactic term (gray-box models). Furthermore, it was demonstrated that a short history of bacterial density may compensate for missing measurements of the chemonutrient concentration field. In fact, given reasonable conditions, such a short history of bacterial density measurements could even be used to infer chemonutrient concentration.

Conclusion: Data-driven PDEs are an important modeling tool when studying chemotaxis at the macroscale, as they can learn bacterial motility from various data sources, fidelities (here, computational models and experiments), or coordinate systems. The resulting data-driven PDEs can then be simulated to reproduce or predict computational or experimental bacterial density profiles independent of the coordinate system, to approximate meaningful parameters or functional terms, and possibly even to estimate the evolution of the underlying (unmeasured) chemonutrient field.
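
The idea in the Results of feeding a short history of bacterial density to a learned right-hand side can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function name `build_delay_dataset`, the choice of local features (density and its first two spatial derivatives by central differences), the forward-difference time derivative, and the ordinary least-squares fit, which merely stands in for whatever machine learning surrogate is actually used. The synthetic data come from a plain diffusion equation with a known coefficient, so the example only checks that the feature pipeline is wired correctly; it does not reproduce any chemotaxis result.

```python
import numpy as np


def build_delay_dataset(b, dt, dx, n_delays=1):
    """Assemble regression data for a learned PDE right-hand side (hypothetical helper).

    b has shape (n_times, n_points): density snapshots on a uniform 1D grid.
    Each row of X holds (b, b_x, b_xx) at the current time followed by the same
    three quantities at the n_delays previous snapshots. y is the forward
    difference estimate of the time derivative at the current time.
    """
    b = np.asarray(b, dtype=float)
    core = b[:, 1:-1]                                  # interior grid points only
    bx = (b[:, 2:] - b[:, :-2]) / (2.0 * dx)           # central first derivative
    bxx = (b[:, 2:] - 2.0 * core + b[:, :-2]) / dx**2  # central second derivative
    local = np.stack([core, bx, bxx], axis=-1)         # (n_times, n_interior, 3)
    bt = (core[1:] - core[:-1]) / dt                   # (n_times - 1, n_interior)

    # Usable time indices: enough history behind, one snapshot ahead for bt.
    first, last = n_delays, b.shape[0] - 1
    blocks = [local[first - d:last - d] for d in range(n_delays + 1)]
    n_features = 3 * (n_delays + 1)
    X = np.concatenate(blocks, axis=-1).reshape(-1, n_features)
    y = bt[first:last].reshape(-1)
    return X, y


# Synthetic check (assumption: pure diffusion b_t = D * b_xx, exact solution).
D_true = 0.5
x = np.linspace(0.0, 2.0 * np.pi, 101)
t = 1e-3 * np.arange(50)
dx, dt = x[1] - x[0], t[1] - t[0]
T, Xg = np.meshgrid(t, x, indexing="ij")
b = (2.0
     + np.exp(-D_true * T) * np.sin(Xg)
     + 0.5 * np.exp(-4.0 * D_true * T) * np.sin(2.0 * Xg))

# No history: a linear fit on (b, b_x, b_xx) should put its weight on b_xx.
X0, y0 = build_delay_dataset(b, dt, dx, n_delays=0)
w, *_ = np.linalg.lstsq(X0, y0, rcond=None)
print("fitted coefficients on (b, b_x, b_xx):", w)

# Two delays: the same targets, now paired with a short density history.
X2, y2 = build_delay_dataset(b, dt, dx, n_delays=2)
print("feature matrix with two delays:", X2.shape)
```

In a setting where the chemonutrient field is unmeasured, the delayed columns of `X2` would be the inputs that let a nonlinear regressor compensate for the missing field; replacing the least-squares step with such a regressor is left open here.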


Source journal

BMC Bioinformatics (Biology: Biochemical Research Methods)

CiteScore: 5.70

Self-citation rate: 3.30%

Articles published: 506

Review time: 4.3 months

Journal description: BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology. BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.