Pub Date: 2026-02-22; DOI: 10.1101/2025.04.26.650787
Ling-Hong Hung, Niharika Nasam, Chris Biju, Wes Lloyd, Ka Yee Yeung
Single-cell RNA sequencing (scRNA-seq) has become a routine method for measuring cell activities. Processing large scRNA-seq datasets requires high-performance computing resources. Cloud computing offers on-demand capacity without major investment in infrastructure. Serverless computing adds cost efficiency: users pay only for the resources they actually use, and no server capacity needs to be pre-allocated or set up in advance. We present a novel and generalizable methodology that uses serverless cloud computing to accelerate computationally intensive workflows. We create an on-demand "supercomputer" using rapidly deployable cloud serverless functions as automatically provisioned computation units. We tested this methodology by optimizing a scRNA-seq workflow with cloud serverless functions on two publicly available peripheral blood mononuclear cell (PBMC) datasets. In addition, we demonstrate our approach using data generated by the NIH MorPhiC program, where we process a 450 GB human scRNA-seq dataset across 86 cell lines designed to study the temporal impact of perturbations on pancreatic differentiation. We compared the total execution time of the serverless scRNA-seq workflow with that of the traditional workflow without serverless functions, and demonstrate a major speedup for large scRNA-seq datasets.
Title: Single cell RNA sequencing data processing using cloud-based serverless computing. Journal: bioRxiv: the preprint server for biology. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12934634/pdf/
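The fan-out idea in this abstract, where many short-lived functions each process one slice of the data and the partial results are merged, can be illustrated with a minimal local sketch. This is not the authors' implementation: a thread pool stands in for cloud serverless invocations, and the function names, the toy reads, and the per-barcode counting task are all assumptions chosen for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into at most n_chunks contiguous, roughly equal parts."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_barcodes(chunk):
    """Stand-in for one serverless invocation: count reads per cell barcode."""
    return Counter(barcode for barcode, _sequence in chunk)


def run_fan_out(reads, n_workers=4):
    """Fan chunks out to workers, then merge the partial counts."""
    chunks = split_into_chunks(reads, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_barcodes, chunks))
    merged = Counter()
    for partial in partial_counts:
        merged.update(partial)
    return merged


# Hypothetical toy reads: (cell barcode, sequence) pairs.
toy_reads = [("AAAC", "ACGT"), ("AAAC", "TTGA"), ("CCGT", "GGCA"),
             ("CCGT", "ACGT"), ("CCGT", "TTTT"), ("GGTA", "CAGT")]
barcode_counts = run_fan_out(toy_reads, n_workers=3)
```

In a real deployment each call to `count_barcodes` would presumably be replaced by a remote function invocation, with chunks exchanged through object storage rather than memory. The merge step works unchanged because per-chunk counts are additive.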
Pub Date: 2026-02-22; DOI: 10.64898/2026.02.21.707194
Robert W Cross, Declan D Pigeaud, Victoriya Borisevich, Krystle N Agans, Mack B Harrison, Rachel O'Toole, Abhishek N Prasad, Thomas W Geisbert
There are no approved medical countermeasures against Nipah virus (NiV), which causes regular outbreaks in humans and animals in South and Southeast Asia, with human mortality rates ranging from 40% to more than 90%. Recently, 4'-fluorouridine (4'-FlU; EIDD-2749), an orally available ribonucleoside analog, was shown to protect guinea pigs and nonhuman primates from lethal challenge with Lassa virus and to have in vitro antiviral activity against NiV. Here, we assessed the postexposure protective efficacy of 4'-FlU in a lethal hamster model of NiV infection. Daily treatment with 4'-FlU beginning 3 days after exposure to NiV resulted in complete protection from lethal infection. Our findings support the further development of 4'-FlU as a therapy for NiV disease.
Title: Oral 4'fluorouridine provides postexposure protection against lethal Nipah virus infection. Journal: bioRxiv: the preprint server for biology. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12934678/pdf/
Pub Date: 2026-02-22; DOI: 10.64898/2026.02.20.707016
Ela Iwaszkiewicz-Eggebrecht, Emma Granqvist, Karol H Nowak, Catalina Valdivia, Mateusz Buczek, Amrita Srivathsan, Emily Hartop, Andreia Miraldo, Tomas Roslin, Ayco J M Tack, Piotr Łukasik, Rudolf Meier, Fredrik Ronquist
1. DNA metabarcoding (high-throughput sequencing of barcode regions from bulk samples) has become a key tool for insect biodiversity assessment. Yet, how methodological choices affect the accuracy of metabarcoding data remains insufficiently explored. In this paper, we ask: (1) How does the lysis method (non-destructive lysis vs. destructive homogenization) affect community recovery? (2) How comprehensively does metabarcoding capture species richness? (3) To what extent can spike-ins improve abundance estimates? (4) How accurately can species abundances be estimated?
2. We evaluated the accuracy of insect metabarcoding using 4,749 bulk samples from a large-scale biodiversity survey subjected to mild lysis. Of these samples, 856 were also homogenized, allowing a systematic comparison of the effect of alternative treatments. To potentially improve abundance estimates, we added six biological spike-ins (i.e., foreign insects) to all samples, and two synthetic spike-ins (artificial DNA fragments) to the homogenization treatment. In addition, we established the contents of 15 samples by individually barcoding all specimens, enabling direct assessment of occurrence and abundance estimates.
3. Our results revealed consistent differences between destructive and non-destructive treatments. While both methods reliably detected the majority of species, small and soft-bodied taxa were more often recovered after mild lysis than after homogenization, while the reverse was true for heavily sclerotized, hairy, and large taxa. Using biological spike-ins for calibration reduced the variance in read numbers per specimen considerably, especially in homogenized samples, while synthetic spike-ins were less effective. In a Bayesian analysis, where species data were matched to the best-fitting spike-in calibration curve, accurate abundance estimates (±1 individual) were obtained for 72.9% of species occurrences.
4. Our results show that it is possible to obtain reasonably accurate abundance estimates from metabarcoding data, and that mild lysis and homogenization result in different taxon-specific biases in terms of occurrence data, with neither method outperforming the other. Accuracy is improved by homogenization rather than mild lysis of samples, and by the use of biological rather than synthetic spike-ins. Together, these findings provide a major step towards robust, quantitative biodiversity monitoring using DNA metabarcoding.
Title: Accuracy of occurrence and abundance estimates from insect metabarcoding. Journal: bioRxiv: the preprint server for biology. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12934785/pdf/
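The basic logic of spike-in calibration, using taxa with known specimen counts to convert read numbers into individuals, can be shown with a deliberately simple sketch. It is not the Bayesian analysis described in the abstract: it assumes a single proportional reads-per-specimen ratio per sample, and every function name and number below is hypothetical.

```python
from statistics import mean


def reads_per_specimen(spike_in_reads, spike_in_specimens):
    """Average number of reads produced by one spike-in specimen in a sample."""
    ratios = [reads / specimens
              for reads, specimens in zip(spike_in_reads, spike_in_specimens)
              if specimens > 0]
    if not ratios:
        raise ValueError("at least one spike-in with a known count is required")
    return mean(ratios)


def estimate_abundance(species_reads, calibration):
    """Convert a detected species' read count into an estimated number of individuals."""
    # A detected species is represented by at least one individual.
    return max(1, round(species_reads / calibration))


# Hypothetical sample: three spike-in taxa with known specimen counts.
spike_reads = [1200, 400, 800]
spike_specimens = [3, 1, 2]
calibration = reads_per_specimen(spike_reads, spike_specimens)
estimated = estimate_abundance(2050, calibration)
```

A proportional ratio ignores taxon-specific biases such as body size and sclerotization, which the abstract reports as important. That is presumably why the authors match each species to the best-fitting calibration curve rather than using a single average.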
Pub Date: 2026-02-22; DOI: 10.64898/2026.02.20.707039
Alex N Popinga, Jack Forman, Dmitri Svetlov, Huy Vo, Brian Munsky
Biological data is prone to both intrinsic and extrinsic noise and variability between experimental replicas. That same stochasticity and heterogeneity can carry information about underlying biochemical mechanisms but, if not incorporated in modeling and probabilistic inference, can also bias parameter estimates and misguide predictions and, subsequently, experiment design. Mechanistic inference typically requires lengthy simulations (e.g., the Stochastic Simulation Algorithm (SSA)); approximations to chemical master equation (CME) solutions that lack rigorous error tracking; or deterministic averaging that lacks the complexity necessary to reflect the data. We introduce the Stochastic System Identification Toolkit (SSIT), a fast, flexible, and open-source software package available on GitHub that makes use of MATLAB's efficient and diverse computational architecture. The SSIT is designed for building, simulating, and solving chemical reaction models using ODEs, moments, SSA, Finite State Projection truncations of the CME, or hybrid methods; sensitivity analysis and Fisher information quantification; parameter fitting using likelihood- or Bayesian-based methods; handling of experimental noise and measurement errors using probabilistic distortion operators; and sequential experiment design that empowers users to save time and resources while gaining the most information possible out of their data. The SSIT also offers advanced modeling tools, including model reduction methods for increased efficiency and joint fitting of models and datasets with overlapping reactions/parameters. To facilitate ease and speed of use, the SSIT provides a graphical user interface and ready-made, adaptable pipelines that can be run in the background from the command line or on high-performance computing clusters. We demonstrate features of the SSIT on two experimental datasets: the first consists of published mRNA count data that reflect Saccharomyces cerevisiae yeast cell response to osmotic shock using single-cell single-molecule fluorescence in situ hybridization; the second consists of single-cell RNA sequencing measurements of 151 activating genes in breast cancer cells following treatment with dexamethasone.
Author summary: We present the Stochastic System Identification Toolkit (SSIT) to model, fit, and predict any data that can be interpreted as changing populations or counts through time, including but not limited to single-cell experiments, economics, epidemiology, ecology, sociology, agriculture, and biotechnology. The SSIT was constructed particularly for stochastic modeling, which is important for systems whose states may experience significant fluctuations from mean behavior, thus affecting the inference of the underlying rate parameters and predictions of subsequent behavior. The SSIT provides statistical inference tools for parameter estimation; sensitivity analysis and information calculation; handling of distortions to
Title: The Stochastic System Identification Toolkit (SSIT) to model, fit, predict, and design experiments. Journal: bioRxiv: the preprint server for biology. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12934706/pdf/
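For readers unfamiliar with the Stochastic Simulation Algorithm named in this abstract, the following is a generic Gillespie sketch for a birth-death model of mRNA counts. It does not use or reproduce the SSIT, which is a MATLAB package; the function name, rate values, and model are assumptions chosen only to show how SSA draws waiting times and selects reactions.

```python
import random


def gillespie_birth_death(k_birth, k_decay, x0, t_end, seed=0):
    """Simulate mRNA counts under constant production and first-order decay."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        birth_rate = k_birth          # propensity of: (nothing) -> mRNA
        decay_rate = k_decay * x      # propensity of: mRNA -> (nothing)
        total_rate = birth_rate + decay_rate
        if total_rate <= 0:
            break  # no reaction can fire
        t += rng.expovariate(total_rate)  # waiting time to the next reaction
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * total_rate < birth_rate:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


# Hypothetical rates: the stationary mean count is k_birth / k_decay = 10.
times, counts = gillespie_birth_death(k_birth=10.0, k_decay=1.0, x0=0, t_end=20.0)
```

Each trajectory is one random sample, so estimating a distribution this way requires many runs. That cost is the "lengthy simulations" trade-off the abstract contrasts with Finite State Projection solutions of the CME.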
Pub Date: 2026-02-22; DOI: 10.64898/2025.12.02.691839
Daniel de Castro Assumpcao, Emma Sofia Vinokour, Madeline Marie Mills, Shiqi Liang, Carolyn Elaine Mills, Aline Carvalho da Costa, Nolan Warren Kennedy, Danielle Tullman-Ercek
MS2 virus-like particles (VLPs) are widely used as protein nanocages for cargo encapsulation, yet in vitro disassembly/reassembly protocols remain poorly standardized, and reassembly yields are reported inconsistently. As a result, the same experiments reported in the literature produce widely divergent yields, limiting reproducibility and cross-study comparability. Here, we introduce a cargo-specific, quantitative framework for standardized MS2 VLP reassembly yield determination. We evaluate commonly used disassembly and post-disassembly processing methods and identify practical trade-offs between protein recovery, accessibility, and reproducibility. Reassembly yield is quantified using size exclusion chromatography calibrated against purified VLP standards, enabling robust, cargo-specific yield measurement. Using this framework, we apply a full factorial design of experiments to quantify the individual and combined effects of coat protein concentration, ionic strength, buffer pH, and molecular crowding on reassembly yield. The resulting statistical model explains more than 99% of the explainable variance, and its linear fit to the experimental data indicates that optimal reassembly conditions extend beyond those tested to date. Protein concentration and ionic strength dominate reassembly yield, whereas pH and osmolyte concentration contribute more modestly within the tested ranges. Finally, we propose practical guidelines for standardized MS2 VLP disassembly, reassembly, and yield reporting, defining a transferable operating envelope for MS2 VLP reconstruction. While demonstrated here using a single nucleic acid cargo (tr-DNA), the framework is readily extensible to alternative cargos and coat protein variants.
Title: Process for Standardizing and Assessing the Parameters Governing MS2 Virus-Like Particle Reassembly around Nucleic Acid Cargo. Journal: bioRxiv: the preprint server for biology. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12714026/pdf/
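A full factorial design, as mentioned in this abstract, tests every combination of the chosen factor levels. The sketch below enumerates such a design for the four factors the abstract names. The two levels per factor, their values, and the variable names are hypothetical, since the abstract does not state them.

```python
from itertools import product

# Hypothetical factor levels; the abstract names the factors but not the levels.
factors = {
    "coat_protein_uM": [10, 50],
    "ionic_strength_mM": [50, 200],
    "buffer_pH": [6.5, 7.5],
    "osmolyte_M": [0.0, 0.5],
}


def full_factorial(factor_levels):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factor_levels)
    return [dict(zip(names, combo))
            for combo in product(*(factor_levels[name] for name in names))]


runs = full_factorial(factors)
```

With two levels for each of four factors this gives 2^4 = 16 runs. Because every combination is present, main effects and interactions between factors can be estimated from the same set of experiments.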
Pub Date: 2026-02-22; DOI: 10.64898/2026.02.20.707040
Adele M H Seelke, Christina L Hung, Sabrina L Mederos, Sophia Rogers, Tiffany Lam, Lauren A Meckler, Karen L Bales
Prairie voles (Microtus ochrogaster) are highly social rodents that have become a valuable animal model for studying social attachment, pair bonding, parental care, and the neurobiological mechanisms underlying social behavior. In recent years, due in part to the publication of the prairie vole genome and deeper mechanistic understanding of their social behavior, prairie voles have become a more popular research model, especially for translational research. However, generating reliable and reproducible findings requires effective colony management, including thoughtful breeding strategies, consistent husbandry practices, and clear documentation. In this paper, we describe the demographic history of and husbandry techniques employed in our prairie vole breeding colony at UC Davis from 2004 to 2020. Well-organized and transparent colony management allows for the preservation of informative behavioral traits in prairie voles and strengthens the impact of the prairie vole model across behavioral and biomedical science.
Title: A Demographic History of a Prairie Vole (Microtus Ochrogaster) Breeding Colony (2004-2020). Journal: bioRxiv: the preprint server for biology. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12934815/pdf/
Pub Date: 2026-02-22; DOI: 10.64898/2026.02.21.706835
Vivek S Peche, Sebastian Kenny, Tae Gun Kang, Brian Coventry, Tian Mi, Inna Goreshnik, Mariana Garcia Sanchez, Reid Martin, Macey Smith, Dionne Vafeados, Rahul S Kathayat, Yu Kaiwen, Zuo-Fei Yuan, Long Wu, Anthony High, Andrew Nemecek, Elizabeth Wickmann, Adeleye Adeshakin, Francesca Ferrara, Robert E Throm, Taosheng Chen, Benjamin Youngblood, David Baker, Stephen Gottschalk
Gene editing has been used to enhance CAR T-cell function by disrupting negative regulators but has limitations. Here we show that de novo-designed generated targeted degraders (bioPROTACs) provide an alternative approach. Expression of bioPROTACs in CAR T-cells targeting DNMT3A, a key regulator of T-cell exhaustion, phenocopied gene knockout. Our reversible, non-gene editing approach provides a tunable strategy to reprogram T-cell fate which should be broadly applicable for next-generation cell therapies.
Title: Reprogramming CAR T-Cells with designed bioPROTACs. Journal: bioRxiv: the preprint server for biology. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12934639/pdf/
Pub Date: 2026-02-22; DOI: 10.64898/2026.02.21.706873
Mehagan S Hopkins, Thomas C Terwilliger, Pavel V Afonine, Helen M Ginn, James M Holton
We report the discovery of a new class of local minima that has severely limited the accuracy of macromolecular models. Termed density misfit barrier traps, these minima explain much of the poor fit between macromolecular models and experimental data relative to that of smaller molecules: not just high R factors, but distorted chemical geometry. We postulated that proteins exist as an ensemble of conformations that each have good geometry, but refinement algorithms have been unable to converge to them due to a tangling phenomenon arising from these traps. To demonstrate, a synthetic ground truth data set was generated, consisting of a 2-member ensemble with excellent geometry. A series of starting models, each trapped in increasingly difficult local minima, were prepared, a unified validation score defined, and an open Challenge issued. This Challenge inspired algorithms for escaping such traps, and new programs have been released that are expected to substantially improve the accuracy of macromolecular ensemble models.
Synopsis: A synthetic 2-member conformational ensemble of a small protein and corresponding electron density data was generated to demonstrate how topological local minima hinder simultaneous agreement with density data and chemical geometry restraints in conventional structure refinement.
{"title":"The Untangle Challenge for accurate ensemble models.","authors":"Mehagan S Hopkins, Thomas C Terwilliger, Pavel V Afonine, Helen M Ginn, James M Holton","doi":"10.64898/2026.02.21.706873","DOIUrl":"10.64898/2026.02.21.706873","url":null,"abstract":"<p><p>We report the discovery of a new class of local minima that has severely limited the accuracy of macromolecular models. Termed density misfit barrier traps, these minima explain much of the poor fit between macromolecular models and experimental data relative to that of smaller molecules: not just high R factors, but distorted chemical geometry. We postulated that proteins exist as an ensemble of conformations that each have good geometry, but refinement algorithms have been unable to converge to them due to a tangling phenomenon arising from these traps. To demonstrate, a synthetic ground truth data set was generated, consisting of a 2-member ensemble with excellent geometry. A series of starting models, each trapped in increasingly difficult local minima, were prepared, a unified validation score defined, and an open Challenge issued. 
This Challenge inspired algorithms for escaping such traps, and new programs have been released that are expected to substantially improve the accuracy of macromolecular ensemble models.</p><p><strong>Synopsis: </strong>A synthetic 2-member conformational ensemble of a small protein and corresponding electron density data was generated to demonstrate how topological local minima hinder simultaneous agreement with density data and chemical geometry restraints in conventional structure refinement.</p>","PeriodicalId":519960,"journal":{"name":"bioRxiv : the preprint server for biology","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12934704/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147314363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-02-22 DOI: 10.64898/2026.02.21.707202
Zhijun Huang, Wei Cui, Adam Klaiss, Gerd P Pfeifer
Human SMCHD1 (Structural Maintenance of Chromosomes Flexible Hinge Domain Containing 1) is a chromatin architectural protein linked to heterochromatin repression. Loss of function mutations of SMCHD1 cause facioscapulohumeral muscular dystrophy type 2 (FSHD2) through activation of the DUX4 homeobox transcription factor gene. However, it is unknown how SMCHD1 may regulate myogenic transcription independently of DUX4. Here, we show that SMCHD1 safeguards enhancer organization within the three-dimensional (3D) genome in human myoblasts. Loss of SMCHD1 leads to widespread gains in chromatin accessibility, aberrant transcription and a global redistribution of the myogenic transcription factor MYOD1. Integrative analyses of histone modifications, chromatin accessibility, Hi-C looping, and activity-by-contact enhancer-gene modeling reveal that SMCHD1 loss rewires the landscape of clustered enhancers and promotes the emergence of a new MYOD1-related network of enhancer elements, termed MYOD1 enhancer nexuses. These structures are marked by increased enhancer-enhancer connectivity, increased local 3D chromatin interactions, and coordinated activation of genes likely relevant for FSHD pathology. Together, our findings identify SMCHD1 as a key architectural constraint that suppresses hyperactive enhancer networks, thereby preserving transcriptional homeostasis in myoblasts.
{"title":"SMCHD1 loss re-wires MYOD1 enhancer nexuses and chromatin accessibility landscapes in muscle cells.","authors":"Zhijun Huang, Wei Cui, Adam Klaiss, Gerd P Pfeifer","doi":"10.64898/2026.02.21.707202","DOIUrl":"10.64898/2026.02.21.707202","url":null,"abstract":"<p><p>Human SMCHD1 (Structural Maintenance of Chromosomes Flexible Hinge Domain Containing 1) is a chromatin architectural protein linked to heterochromatin repression. Loss of function mutations of SMCHD1 cause facioscapulohumeral muscular dystrophy type 2 (FSHD2) through activation of the <i>DUX4</i> homeobox transcription factor gene. However, it is unknown how SMCHD1 may regulate myogenic transcription independently of DUX4. Here, we show that SMCHD1 safeguards enhancer organization within the three-dimensional (3D) genome in human myoblasts. Loss of SMCHD1 leads to widespread gains in chromatin accessibility, aberrant transcription and a global redistribution of the myogenic transcription factor MYOD1. Integrative analyses of histone modifications, chromatin accessibility, Hi-C looping, and activity-by-contact enhancer-gene modeling reveal that SMCHD1 loss rewires the landscape of clustered enhancers and promotes the emergence of a new MYOD1-related network of enhancer elements, termed MYOD1 enhancer nexuses. These structures are marked by increased enhancer-enhancer connectivity, increased local 3D chromatin interactions, and coordinated activation of genes likely relevant for FSHD pathology. 
Together, our findings identify SMCHD1 as a key architectural constraint that suppresses hyperactive enhancer networks, thereby preserving transcriptional homeostasis in myoblasts.</p>","PeriodicalId":519960,"journal":{"name":"bioRxiv : the preprint server for biology","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12934784/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147314405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-02-21 DOI: 10.64898/2026.02.20.707097
Van N T La, Noa Lahav, Mario Rodriguez, Randy Diaz-Tapia, Briana McGovern, Jared Benjamin, Haim Barr, Kris M White, Lulu Kang, John D Chodera, David D L Minh
Compounds that bind to the Middle East Respiratory Syndrome Coronavirus (MERS-CoV) main protease (MPro) often produce biphasic concentration-response curves (CRCs) in biochemical assays; low concentrations activate the enzyme and high concentrations inhibit it. This biphasic behavior complicates data analysis. Here, we compare three approaches to data analysis: fitting the Hill equation to the activation phase, fitting it to the inhibition phase, and fitting an enzyme kinetics model that incorporates dimerization and ligand binding to the complete CRC. In the last case, cellular efficacy is predicted by extrapolating the model to high enzyme concentrations. For compounds in our drug lead series, all three procedures yield inhibitory concentrations that are correlated with live-virus antiviral assays. The last procedure provides the most accurate forecast of cellular efficacy rank. These data analysis procedures may be valuable for antiviral drug discovery against MERS-CoV MPro and other enzymes with similar kinetics.
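As a rough illustration of what fitting the Hill equation to the inhibition phase can involve, the sketch below estimates an IC50 and Hill slope from normalized activity data. It is not the authors' procedure: the abstract does not describe their fitting method, so the linearized (Hill-plot) least-squares fit, the function names, the normalization to activity between 0 and 1, and the synthetic concentrations are all assumptions made for this example.

```python
import math


def hill_inhibition(conc, ic50, slope):
    """Fraction of activity remaining at inhibitor concentration `conc`.

    Assumes activity is normalized so that it is 1 with no inhibitor
    and tends to 0 at saturating inhibitor.
    """
    return 1.0 / (1.0 + (conc / ic50) ** slope)


def fit_hill_linearized(concs, responses):
    """Estimate (ic50, slope) from the Hill plot.

    Rearranging the Hill equation gives
        log((1 - y) / y) = slope * log(c) - slope * log(ic50),
    so an ordinary least-squares line through (log c, logit) yields both.
    """
    xs, ys = [], []
    for c, y in zip(concs, responses):
        if 0.0 < y < 1.0:  # the transform is undefined at exactly 0 or 1
            xs.append(math.log(c))
            ys.append(math.log((1.0 - y) / y))
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points with 0 < response < 1")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ic50 = math.exp(-intercept / slope)
    return ic50, slope


# Hypothetical, noise-free inhibition-phase data (arbitrary concentration units).
concs = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0]
responses = [hill_inhibition(c, ic50=2.0, slope=1.5) for c in concs]

ic50_est, slope_est = fit_hill_linearized(concs, responses)
print(f"IC50 = {ic50_est:.3f}, Hill slope = {slope_est:.3f}")
```

With real, noisy data a nonlinear least-squares fit on the untransformed responses is usually preferable, because the logarithmic transform amplifies noise near 0 and 1. A biphasic curve would also require restricting the fit to the concentrations above the activation peak, which this sketch does not attempt.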
{"title":"Linking biochemical and cellular efficacy of MERS coronavirus main protease inhibitors.","authors":"Van N T La, Noa Lahav, Mario Rodriguez, Randy Diaz-Tapia, Briana McGovern, Jared Benjamin, Haim Barr, Kris M White, Lulu Kang, John D Chodera, David D L Minh","doi":"10.64898/2026.02.20.707097","DOIUrl":"10.64898/2026.02.20.707097","url":null,"abstract":"<p><p>Compounds that bind to the Middle East Respiratory Syndrome Coronavirus (MERS-CoV) main protease (MPro) often produce biphasic concentration-response curves (CRCs) in biochemical assays; low concentrations activate the enzyme and high concentrations inhibit it. This biphasic behavior complicates data analysis. Here, we compare three approaches to data analysis: fitting the Hill equation to the activation phase, fitting it to the inhibition phase, and fitting an enzyme kinetics model that incorporates dimerization and ligand binding to the complete CRC. In the latter case, cellular efficacy is predicted by extrapolating the model to high enzyme concentrations. For compounds in our drug lead series, all three procedures yield inhibitory concentrations that are correlated with live-virus antiviral assays. The latter procedure provides the most accurate forecast of cellular efficacy rank. 
These data analysis procedures may be valuable for antiviral drug discovery against MERS-CoV MPro and other enzymes with similar kinetics.</p>","PeriodicalId":519960,"journal":{"name":"bioRxiv : the preprint server for biology","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12934682/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147314338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}