Data-driven discovery of chemotactic migration of bacteria via coordinate-invariant machine learning.

IF 2.9 | Zone 3 (Biology) | Q2 BIOCHEMICAL RESEARCH METHODS | BMC Bioinformatics | Pub Date: 2024-10-24 | DOI: 10.1186/s12859-024-05929-w
Yorgos M Psarellis, Seungjoon Lee, Tapomoy Bhattacharjee, Sujit S Datta, Juan M Bello-Rivas, Ioannis G Kevrekidis

Abstract

Background: E. coli chemotactic motion in the presence of a chemonutrient field can be studied using, among other approaches, wet laboratory experiments or macroscale partial differential equations (PDEs). Bridging experimental measurements and chemotactic PDEs requires knowledge of the evolution of all underlying fields and of the initial and boundary conditions, and often necessitates strong assumptions. In this work, we propose machine learning approaches, along with ideas from the Whitney and Takens embedding theorems, to circumvent these challenges.

Results: Machine learning approaches for identifying underlying PDEs were (a) validated using simulation data from established continuum models and (b) used to infer chemotactic PDEs from experimental data. Such data-driven models were surrogates either for the entire right-hand side of the chemotactic PDE (black box models) or, in a more targeted fashion, just for the chemotactic term (gray box models). Furthermore, it was demonstrated that a short history of bacterial density may compensate for missing measurements of the chemonutrient concentration field. In fact, given reasonable conditions, such a short history of bacterial density measurements could even be used to infer the chemonutrient concentration.
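
The black box idea with a short density history can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`spatial_features`, `build_dataset`, `fit_linear_surrogate`) are hypothetical, the data are assumed to lie on a uniform 1D grid, and a linear least-squares fit stands in for whatever machine learning model is actually used. The features are the bacterial density and its spatial derivatives at the current time and at a few earlier times; the target is the time derivative of the density.

```python
import numpy as np

def spatial_features(b, dx):
    """Return b, b_x, b_xx for snapshots b[t, x] using finite differences."""
    b_x = np.gradient(b, dx, axis=1)
    b_xx = np.gradient(b_x, dx, axis=1)
    return b, b_x, b_xx

def build_dataset(b, dx, dt, n_delays=1):
    """Assemble (features, targets) for a surrogate of the PDE right-hand side.

    Features at time t: density and its spatial derivatives at t and at
    n_delays earlier snapshots (the short history that stands in for the
    unmeasured chemonutrient field). Target: forward-difference estimate of b_t.
    """
    n_t = b.shape[0]
    cols = []
    for k in range(n_delays + 1):
        # Snapshots delayed by k steps, aligned with times n_delays .. n_t - 2.
        snap = b[n_delays - k : n_t - 1 - k]
        cols.extend(f.ravel() for f in spatial_features(snap, dx))
    X = np.column_stack(cols)
    y = ((b[n_delays + 1 :] - b[n_delays:-1]) / dt).ravel()
    return X, y

def fit_linear_surrogate(X, y):
    """Least-squares stand-in for the learned model (assumption, for illustration)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef
```

With `n_delays=0` and data generated by pure diffusion, the fitted coefficient on the second-derivative column should approximate the diffusion coefficient. A gray box variant would keep a known diffusion term fixed and fit only the remaining (chemotactic) part of the target.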

Conclusion: Data-driven PDEs are an important modeling tool when studying chemotaxis at the macroscale, as they can learn bacterial motility from various data sources, fidelities (here, computational models and experiments), or coordinate systems. The resulting data-driven PDEs can then be simulated to reproduce or predict computational or experimental bacterial density profile data independent of the coordinate system, approximate meaningful parameters or functional terms, and even possibly estimate the underlying (unmeasured) chemonutrient field evolution.
