BioBricks.ai: A Versioned Data Registry for Life Sciences Data Assets
Yifan Gao, Zakariyya Mughal, Jose A. Jaramillo-Villegas, Marie Corradi, Alexandre Borrel, Ben Lieberman, Suliman Sharif, John Shaffer, Karamarie Fecho, Ajay Chatrath, Alexandra Maertens, Marc A. T. Teunis, Nicole Kleinstreuer, Thomas Hartung, Thomas Luechtefeld
Researchers in biomedical research, public health, and the life sciences often spend weeks or months discovering, accessing, curating, and integrating data from disparate sources, significantly delaying the onset of actual analysis and innovation. Instead of countless developers creating redundant and inconsistent data pipelines, BioBricks.ai offers a centralized data repository and a suite of developer-friendly tools to simplify access to scientific data. Currently, BioBricks.ai delivers over ninety biological and chemical datasets. It provides a package manager-like system for installing and managing dependencies on data sources. Each 'brick' is a Data Version Control git repository that supports an updateable pipeline for extraction, transformation, and loading data into the BioBricks.ai backend at https://biobricks.ai. Use cases include accelerating data science workflows and facilitating the creation of novel data assets by integrating multiple datasets into unified, harmonized resources. In conclusion, BioBricks.ai offers an opportunity to accelerate access and use of public data through a single open platform.
{"title":"BioBricks.ai: A Versioned Data Registry for Life Sciences Data Assets","authors":"Yifan Gao, Zakariyya Mughal, Jose A. Jaramillo-Villegas, Marie Corradi, Alexandre Borrel, Ben Lieberman, Suliman Sharif, John Shaffer, Karamarie Fecho, Ajay Chatrath, Alexandra Maertens, Marc A. T. Teunis, Nicole Kleinstreuer, Thomas Hartung, Thomas Luechtefeld","doi":"arxiv-2408.17320","DOIUrl":"https://doi.org/arxiv-2408.17320","url":null,"abstract":"Researchers in biomedical research, public health, and the life sciences\u0000often spend weeks or months discovering, accessing, curating, and integrating\u0000data from disparate sources, significantly delaying the onset of actual\u0000analysis and innovation. Instead of countless developers creating redundant and\u0000inconsistent data pipelines, BioBricks.ai offers a centralized data repository\u0000and a suite of developer-friendly tools to simplify access to scientific data.\u0000Currently, BioBricks.ai delivers over ninety biological and chemical datasets.\u0000It provides a package manager-like system for installing and managing\u0000dependencies on data sources. Each 'brick' is a Data Version Control git\u0000repository that supports an updateable pipeline for extraction, transformation,\u0000and loading data into the BioBricks.ai backend at https://biobricks.ai. Use\u0000cases include accelerating data science workflows and facilitating the creation\u0000of novel data assets by integrating multiple datasets into unified, harmonized\u0000resources. In conclusion, BioBricks.ai offers an opportunity to accelerate\u0000access and use of public data through a single open platform.","PeriodicalId":501266,"journal":{"name":"arXiv - QuanBio - Quantitative Methods","volume":"66 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142213324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A note on promotion time cure models with a new biological consideration
Zhi Zhao, Fatih Kızılaslan
We introduce a generalized promotion time cure model motivated by a new biological consideration. The new approach is flexible enough to model heterogeneous survival data, in particular to address intra-sample heterogeneity.
{"title":"A note on promotion time cure models with a new biological consideration","authors":"Zhi Zhao, Fatih Kızılaslan","doi":"arxiv-2408.17188","DOIUrl":"https://doi.org/arxiv-2408.17188","url":null,"abstract":"We introduce a generalized promotion time cure model motivated by a new\u0000biological consideration. The new approach is flexible to model heterogeneous\u0000survival data, in particular for addressing intra-sample heterogeneity.","PeriodicalId":501266,"journal":{"name":"arXiv - QuanBio - Quantitative Methods","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142213322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Uncertainty Quantification of Antibody Measurements: Physical Principles and Implications for Standardization
Paul N. Patrone, Lili Wang, Sheng Lin-Gibson, Anthony J. Kearsley
Harmonizing serology measurements is critical for identifying reference materials that permit standardization and comparison of results across different diagnostic platforms. However, the theoretical foundations of such tasks have yet to be fully explored in the context of antibody thermodynamics and uncertainty quantification (UQ). This has restricted the usefulness of currently deployed standards and limited the scope of materials considered viable reference materials. To address these problems, we develop rigorous theories of antibody normalization and harmonization, and formulate a probabilistic framework for defining correlates of protection. We begin by proposing a mathematical definition of harmonization equipped with the structure needed to quantify uncertainty associated with the choice of standard, assay, etc. We then show how a thermodynamic description of serology measurements (i) relates this structure to the Gibbs free energy of antibody binding and thereby (ii) induces a regression analysis that directly harmonizes measurements. We supplement this with a novel, optimization-based normalization (not harmonization!) method that checks for consistency between reference and sample dilution curves. Lastly, we relate these analyses to uncertainty propagation techniques to estimate correlates of protection. A key result of these analyses is that, under physically reasonable conditions, the choice of reference material does not increase the uncertainty associated with harmonization or correlates of protection. We provide examples and validate the main ideas in the context of an interlaboratory study that lays the foundation for using monoclonal antibodies as a reference for SARS-CoV-2 serology measurements.
{"title":"Uncertainty Quantification of Antibody Measurements: Physical Principles and Implications for Standardization","authors":"Paul N. Patrone, Lili Wang, Sheng Lin-Gibson, Anthony J. Kearsley","doi":"arxiv-2409.00191","DOIUrl":"https://doi.org/arxiv-2409.00191","url":null,"abstract":"Harmonizing serology measurements is critical for identifying reference\u0000materials that permit standardization and comparison of results across\u0000different diagnostic platforms. However, the theoretical foundations of such\u0000tasks have yet to be fully explored in the context of antibody thermodynamics\u0000and uncertainty quantification (UQ). This has restricted the usefulness of\u0000standards currently deployed and limited the scope of materials considered as\u0000viable reference material. To address these problems, we develop rigorous\u0000theories of antibody normalization and harmonization, as well as formulate a\u0000probabilistic framework for defining correlates of protection. We begin by\u0000proposing a mathematical definition of harmonization equipped with structure\u0000needed to quantify uncertainty associated with the choice of standard, assay,\u0000etc. We then show how a thermodynamic description of serology measurements (i)\u0000relates this structure to the Gibbs free-energy of antibody binding, and\u0000thereby (ii) induces a regression analysis that directly harmonizes\u0000measurements. We supplement this with a novel, optimization-based normalization\u0000(not harmonization!) method that checks for consistency between reference and\u0000sample dilution curves. Last, we relate these analyses to uncertainty\u0000propagation techniques to estimate correlates of protection. A key result of\u0000these analyses is that under physically reasonable conditions, the choice of\u0000reference material does not increase uncertainty associated with harmonization\u0000or correlates of protection. We provide examples and validate main ideas in the\u0000context of an interlab study that lays the foundation for using monoclonal\u0000antibodies as a reference for SARS-CoV-2 serology measurements.","PeriodicalId":501266,"journal":{"name":"arXiv - QuanBio - Quantitative Methods","volume":"15 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142213321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Coalitions of AI-based Methods Predict 15-Year Risks of Breast Cancer Metastasis Using Real-World Clinical Data with AUC up to 0.9
Xia Jiang, Yijun Zhou, Alan Wells, Adam Brufsky
Breast cancer is one of the two cancers responsible for the most deaths in women, with about 42,000 deaths each year in the US. That over 300,000 breast cancers are newly diagnosed each year suggests that only a fraction of these cancers result in mortality. Thus, most women undergo seemingly curative treatment for localized cancers, but a significant fraction later succumb to metastatic disease, for which current treatments are only temporizing for the vast majority. Current prognostic metrics are of little actionable value for 4 of the 5 women seemingly cured after local treatment, and many women are exposed to morbid and even mortal adjuvant therapies unnecessarily, with these adjuvant therapies reducing metastatic recurrence by only a third. Thus, there is a need for better prognostics to target aggressive treatment at those who are likely to relapse and to spare those who were actually cured. While there is a plethora of molecular and tumor-marker assays in use and under development to detect recurrence early, these are time-consuming, expensive, and still often unvalidated as to actionable prognostic utility. A different approach would use large-data techniques to determine clinical and histopathological parameters that provide accurate prognostics using existing data. Herein, we report on machine learning, together with grid search and Bayesian networks, to develop algorithms that achieve an AUC of up to 0.9 in ROC analyses using only extant data. Such algorithms could be rapidly translated to clinical management, as they do not require testing beyond routine tumor evaluations.
arxiv-2408.16256 (2024-08-29). https://doi.org/arxiv-2408.16256
Rapid and accurate mosquito abundance forecasting with Aedes-AI neural networks
Adrienne C. Kinney, Roberto Barrera, Joceline Lega
We present a method to convert weather data into probabilistic forecasts of Aedes aegypti abundance. The approach, which relies on the Aedes-AI suite of neural networks, produces weekly point predictions with corresponding uncertainty estimates. Once calibrated on past trap and weather data, the model is designed to use weather forecasts to estimate future trap catches. We demonstrate that when reliable input data are used, the resulting predictions have high skill. This technique may therefore be used to supplement vector surveillance efforts or identify periods of elevated risk for vector-borne disease outbreaks.
{"title":"Rapid and accurate mosquito abundance forecasting with Aedes-AI neural networks","authors":"Adrienne C. Kinney, Roberto Barrera, Joceline Lega","doi":"arxiv-2408.16152","DOIUrl":"https://doi.org/arxiv-2408.16152","url":null,"abstract":"We present a method to convert weather data into probabilistic forecasts of\u0000Aedes aegypti abundance. The approach, which relies on the Aedes-AI suite of\u0000neural networks, produces weekly point predictions with corresponding\u0000uncertainty estimates. Once calibrated on past trap and weather data, the model\u0000is designed to use weather forecasts to estimate future trap catches. We\u0000demonstrate that when reliable input data are used, the resulting predictions\u0000have high skill. This technique may therefore be used to supplement vector\u0000surveillance efforts or identify periods of elevated risk for vector-borne\u0000disease outbreaks.","PeriodicalId":501266,"journal":{"name":"arXiv - QuanBio - Quantitative Methods","volume":"36 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142226811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Q-MRS: A Deep Learning Framework for Quantitative Magnetic Resonance Spectra Analysis
Christopher J. Wu, Lawrence S. Kegeles, Jia Guo
Magnetic resonance spectroscopy (MRS) is an established technique for studying tissue metabolism, particularly in central nervous system disorders. While powerful and versatile, MRS is often limited by challenges associated with data quality, processing, and quantification. Existing MRS quantification methods face difficulties in balancing model complexity and reproducibility during spectral modeling, often falling into the trap of either oversimplification or over-parameterization. To address these limitations, this study introduces a deep learning (DL) framework that employs transfer learning, in which the model is pre-trained on simulated datasets before it undergoes fine-tuning on in vivo data. The proposed framework showed promising performance when applied to the Philips dataset from the BIG GABA repository and represents an exciting advancement in MRS data analysis.
{"title":"Q-MRS: A Deep Learning Framework for Quantitative Magnetic Resonance Spectra Analysis","authors":"Christopher J. Wu, Lawrence S. Kegeles, Jia Guo","doi":"arxiv-2408.15999","DOIUrl":"https://doi.org/arxiv-2408.15999","url":null,"abstract":"Magnetic resonance spectroscopy (MRS) is an established technique for\u0000studying tissue metabolism, particularly in central nervous system disorders.\u0000While powerful and versatile, MRS is often limited by challenges associated\u0000with data quality, processing, and quantification. Existing MRS quantification\u0000methods face difficulties in balancing model complexity and reproducibility\u0000during spectral modeling, often falling into the trap of either\u0000oversimplification or over-parameterization. To address these limitations, this\u0000study introduces a deep learning (DL) framework that employs transfer learning,\u0000in which the model is pre-trained on simulated datasets before it undergoes\u0000fine-tuning on in vivo data. The proposed framework showed promising\u0000performance when applied to the Philips dataset from the BIG GABA repository\u0000and represents an exciting advancement in MRS data analysis.","PeriodicalId":501266,"journal":{"name":"arXiv - QuanBio - Quantitative Methods","volume":"94 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142213331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Generating Binary Species Range Maps
Filip Dorm, Christian Lange, Scott Loarie, Oisin Mac Aodha
Accurately predicting the geographic ranges of species is crucial for assisting conservation efforts. Traditionally, range maps were manually created by experts. However, species distribution models (SDMs) and, more recently, deep learning-based variants offer a potential automated alternative. Deep learning-based SDMs generate a continuous probability representing the predicted presence of a species at a given location, which must be binarized by setting per-species thresholds to obtain binary range maps. However, selecting appropriate per-species thresholds to binarize these predictions is non-trivial as different species can require distinct thresholds. In this work, we evaluate different approaches for automatically identifying the best thresholds for binarizing range maps using presence-only data. This includes approaches that require the generation of additional pseudo-absence data, along with ones that only require presence data. We also propose an extension of an existing presence-only technique that is more robust to outliers. We perform a detailed evaluation of different thresholding techniques on the tasks of binary range estimation and large-scale fine-grained visual classification, and we demonstrate improved performance over existing pseudo-absence free approaches using our method.
{"title":"Generating Binary Species Range Maps","authors":"Filip Dorm, Christian Lange, Scott Loarie, Oisin Mac Aodha","doi":"arxiv-2408.15956","DOIUrl":"https://doi.org/arxiv-2408.15956","url":null,"abstract":"Accurately predicting the geographic ranges of species is crucial for\u0000assisting conservation efforts. Traditionally, range maps were manually created\u0000by experts. However, species distribution models (SDMs) and, more recently,\u0000deep learning-based variants offer a potential automated alternative. Deep\u0000learning-based SDMs generate a continuous probability representing the\u0000predicted presence of a species at a given location, which must be binarized by\u0000setting per-species thresholds to obtain binary range maps. However, selecting\u0000appropriate per-species thresholds to binarize these predictions is non-trivial\u0000as different species can require distinct thresholds. In this work, we evaluate\u0000different approaches for automatically identifying the best thresholds for\u0000binarizing range maps using presence-only data. This includes approaches that\u0000require the generation of additional pseudo-absence data, along with ones that\u0000only require presence data. We also propose an extension of an existing\u0000presence-only technique that is more robust to outliers. We perform a detailed\u0000evaluation of different thresholding techniques on the tasks of binary range\u0000estimation and large-scale fine-grained visual classification, and we\u0000demonstrate improved performance over existing pseudo-absence free approaches\u0000using our method.","PeriodicalId":501266,"journal":{"name":"arXiv - QuanBio - Quantitative Methods","volume":"111 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142213333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep Learning to Predict Late-Onset Breast Cancer Metastasis: the Single Hyperparameter Grid Search (SHGS) Strategy for Meta Tuning Concerning Deep Feed-forward Neural Network
Yijun Zhou, Om Arora-Jain, Xia Jiang
While machine learning has advanced in medicine, its widespread use in clinical applications, especially in predicting breast cancer metastasis, is still limited. We have been dedicated to constructing a DFNN model to predict breast cancer metastasis n years in advance. However, the challenge lies in efficiently identifying optimal hyperparameter values through grid search, given the constraints of time and resources. Issues such as the infinite possibilities for continuous hyperparameters like L1 and L2, as well as the time-consuming and costly process, further complicate the task. To address these challenges, we developed the Single Hyperparameter Grid Search (SHGS) strategy, which serves as a preselection method before grid search. Our experiments with SHGS applied to DFNN models for breast cancer metastasis prediction focus on analyzing eight target hyperparameters: epochs, batch size, dropout, L1, L2, learning rate, decay, and momentum. We created three figures, each depicting the experimental results obtained from the three LSM-I-10-Plus-year datasets. These figures illustrate the relationship between model performance and the target hyperparameter values. For each hyperparameter, we analyzed whether changes in this hyperparameter affect model performance, examined whether there were specific patterns, and explored how to choose values for that hyperparameter. Our experimental findings reveal that the optimal value of a hyperparameter not only depends on the dataset but is also significantly influenced by the settings of other hyperparameters. Additionally, our experiments suggest reduced ranges of values for each target hyperparameter, which may be helpful for low-budget grid search. This approach provides prior experience and a foundation for subsequent grid searches to enhance model performance.
arxiv-2408.15498 (2024-08-28). https://doi.org/arxiv-2408.15498
A reaction network model of microscale liquid-liquid phase separation reveals effects of spatial dimension
Jinyoung Kim, Sean D. Lawley, Jinsu Kim
Proteins can form droplets via liquid-liquid phase separation (LLPS) in cells. Recent experiments demonstrate that LLPS is qualitatively different on two-dimensional (2d) surfaces compared to three-dimensional (3d) solutions. In this paper, we use mathematical modeling to investigate the causes of the discrepancies between LLPS in 2d versus 3d. We model the number of proteins and droplets inducing LLPS by continuous-time Markov chains and use chemical reaction network theory to analyze the model. To reflect the influence of spatial dimension, droplet formation and dissociation rates are determined using the first hitting times of diffusing proteins. We first show that our stochastic model reproduces the appropriate phase diagram and is consistent with the relevant thermodynamic constraints. After further analyzing the model, we find that it predicts that the spatial dimension induces qualitatively different features of LLPS, consistent with recent experiments. While it has been claimed that the differences between 2d and 3d LLPS stem mainly from different diffusion coefficients, our analysis is independent of the proteins' diffusion coefficients since we use the stationary model behavior. Therefore, our results give new hypotheses about how spatial dimension affects LLPS.
{"title":"A reaction network model of microscale liquid-liquid phase separation reveals effects of spatial dimension","authors":"Jinyoung Kim, Sean D. Lawley, Jinsu Kim","doi":"arxiv-2408.15303","DOIUrl":"https://doi.org/arxiv-2408.15303","url":null,"abstract":"Proteins can form droplets via liquid-liquid phase separation (LLPS) in\u0000cells. Recent experiments demonstrate that LLPS is qualitatively different on\u0000two-dimensional (2d) surfaces compared to three-dimensional (3d) solutions. In\u0000this paper, we use mathematical modeling to investigate the causes of the\u0000discrepancies between LLPS in 2d versus 3d. We model the number of proteins and\u0000droplets inducing LLPS by continuous-time Markov chains and use chemical\u0000reaction network theory to analyze the model. To reflect the influence of space\u0000dimension, droplet formation and dissociation rates are determined using the\u0000first hitting times of diffusing proteins. We first show that our stochastic\u0000model reproduces the appropriate phase diagram and is consistent with the\u0000relevant thermodynamic constraints. After further analyzing the model, we find\u0000that it predicts that the space dimension induces qualitatively different\u0000features of LLPS which are consistent with recent experiments. While it has\u0000been claimed that the differences between 2d and 3d LLPS stems mainly from\u0000different diffusion coefficients, our analysis is independent of the diffusion\u0000coefficients of the proteins since we use the stationary model behavior.\u0000Therefore, our results give new hypotheses about how space dimension affects\u0000LLPS.","PeriodicalId":501266,"journal":{"name":"arXiv - QuanBio - Quantitative Methods","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142213330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A novel method to separate circadian from non-circadian masking effects in order to enhance daily circadian timing and amplitude estimation from core body temperature
Phuc D Nguyen, Claire Dunbar, Hannah Scott, Bastien Lechat, Jack Manners, Gorica Micic, Nicole Lovato, Amy C Reynolds, Leon Lack, Robert Adams, Danny Eckert, Andrew Vakulin, Peter G Catcheside
Circadian disruption contributes to adverse effects on sleep, performance, and health. One accepted method to track continuous daily changes in circadian timing is to measure core body temperature (CBT) and establish the daily, circadian-related CBT minimum time (Tmin). This method typically applies cosine-model fits to measured CBT data, which may not adequately account for substantial wake metabolic activity and sleep effects on CBT that confound and mask circadian effects, and thus estimates of the circadian-related Tmin. This study introduced a novel physiology-grounded analytic approach to separate circadian from non-circadian effects on CBT, which we compared against traditional cosine-based methods. The dataset comprised 33 healthy participants attending a 39-hour in-laboratory study with an initial overnight sleep followed by an extended wake period. CBT data were collected at 30-second intervals via ingestible capsules. Our design captured CBT during both the baseline sleep period and the extended wake period (without sleep) and allowed us to model the influence of circadian and non-circadian effects of sleep, wake, and activity on CBT using physiology-guided generalized additive models. Model fits and estimated Tmin inferred from extended wake without sleep were compared with traditional cosine-based model fits. Compared to the traditional cosine model, the new model exhibited superior fits to CBT (Pearson R 0.90 [95% CI 0.83-0.96] versus 0.81 [0.55-0.93]). The difference between estimated and measured circadian Tmin, derived from the day without sleep, was smaller with our method (0.2 [-0.5, 0.3] hours) than with previous methods (1.4 [1.1, 1.7] hours). This new method provides superior demasking of non-circadian influences compared to traditional cosine methods, including removal of a sleep-related bias towards an earlier estimate of circadian Tmin.
{"title":"A novel method to separate circadian from non-circadian masking effects in order to enhance daily circadian timing and amplitude estimation from core body temperature","authors":"Phuc D Nguyen, Claire Dunbar, Hannah Scott, Bastien Lechat, Jack Manners, Gorica Micic, Nicole Lovato, Amy C Reynolds, Leon Lack, Robert Adams, Danny Eckert, Andrew Vakulin, Peter G Catcheside","doi":"arxiv-2408.15295","DOIUrl":"https://doi.org/arxiv-2408.15295","url":null,"abstract":"Circadian disruption contributes to adverse effects on sleep, performance,\u0000and health. One accepted method to track continuous daily changes in circadian\u0000timing is to measure core body temperature (CBT), and establish daily,\u0000circadian-related CBT minimum time (Tmin). This method typically applies\u0000cosine-model fits to measured CBT data, which may not adequately account for\u0000substantial wake metabolic activity and sleep effects on CBT that confound and\u0000mask circadian effects, and thus estimates of the circadian-related Tmin. This\u0000study introduced a novel physiology-grounded analytic approach to separate\u0000circadian from non-circadian effects on CBT, which we compared against\u0000traditional cosine-based methods. The dataset comprised 33 healthy participants\u0000attending a 39-hour in-laboratory study with an initial overnight sleep\u0000followed by an extended wake period. CBT data were collected at 30-second\u0000intervals via ingestible capsules. Our design captured CBT during both the\u0000baseline sleep period and during extended wake period (without sleep) and\u0000allowed us to model the influence of circadian and non-circadian effects of\u0000sleep, wake, and activity on CBT using physiology-guided generalized additive\u0000models. Model fits and estimated Tmin inferred from extended wake without sleep\u0000were compared with traditional cosine-based models fits. Compared to the\u0000traditional cosine model, the new model exhibited superior fits to CBT (Pearson\u0000R 0.90 [95%CI; [0.83 - 0.96] versus 0.81 [0.55-0.93]). The difference between\u0000estimated vs measured circadian Tmin, derived from the day without sleep, was\u0000better fit with our method (0.2 [-0.5,0.3] hours) versus previous methods (1.4\u0000[1.1 to 1.7] hours). This new method provides superior demasking of\u0000non-circadian influences compared to traditional cosine methods, including the\u0000removal of a sleep-related bias towards an earlier estimate of circadian Tmin.","PeriodicalId":501266,"journal":{"name":"arXiv - QuanBio - Quantitative Methods","volume":"275 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142213353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}