Pub Date: 2026-01-28 | eCollection Date: 2026-01-01 | DOI: 10.1093/biomethods/bpaf094
Jialai She, Gil Alterovitz
Latent factor models are valuable in bioinformatics for accounting for unmeasured variation alongside observed covariates. Yet many methods struggle to separate known effects from latent structure and to handle losses beyond standard regression. We present a unified framework that augments row and column predictors with a low-rank latent component, jointly modeling measured effects and residual variation. To remove ambiguity in estimating observed and latent effects, we impose a carefully designed set of orthogonality constraints on the coefficient and latent factor matrices, relative to the spans of the predictor matrices. These constraints ensure identifiability, yield a decomposition in which the latent term captures only variation unexplained by the covariates, and improve interpretability. An efficient algorithm handles general non-quadratic losses via surrogates with monotone descent. Each iteration updates the latent term by truncated singular value decomposition of a doubly projected residual and refines coefficients by projections. The number of latent factors is selected by applying an elbow rule to a degrees-of-freedom-adjusted information criterion. A parametric bootstrap provides valid inference on feature-outcome associations under the regularized low-rank structure. Applied to real pharmacogenomic data, the method recovers biologically coherent gene-drug associations missed by standard factor models, such as the EGFR-inhibitor link, highlights novel candidates with plausible mechanisms, and reveals gene programs aligned with compound modes of action, including a latent unfolded-protein-response module affecting drug sensitivity. These results support the framework's utility for precision oncology, yielding stronger biomarkers for patient stratification and deeper insight into drug resistance mechanisms.
SOLVE: A structured orthogonal latent variable framework for disentangling confounding in matrix data. Biology Methods and Protocols, 11(1): bpaf094. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12848822/pdf/
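The latent-term update described above — a truncated SVD of a doubly projected residual — can be illustrated with a minimal numerical sketch under a squared-error loss. All matrix names, dimensions, and the rank below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q, m, r = 60, 40, 3, 2, 2

X = rng.normal(size=(n, q))   # row covariates
Z = rng.normal(size=(p, m))   # column covariates

def proj(M):
    """Orthogonal projector onto the column span of M."""
    Q, _ = np.linalg.qr(M)
    return Q @ Q.T

P_X, P_Z = proj(X), proj(Z)

# simulate data: row-covariate effects plus a rank-r latent term
A = rng.normal(size=(q, p))
L_true = rng.normal(size=(n, r)) @ rng.normal(size=(r, p))
Y = X @ A + L_true + 0.1 * rng.normal(size=(n, p))

# one iteration of the latent update under squared-error loss:
# truncated SVD of the doubly projected residual
A_hat = np.linalg.lstsq(X, Y, rcond=None)[0]
R = (np.eye(n) - P_X) @ (Y - X @ A_hat) @ (np.eye(p) - P_Z)
U, s, Vt = np.linalg.svd(R, full_matrices=False)
L_hat = U[:, :r] @ np.diag(s[:r]) @ Vt[:r]

# the latent estimate is orthogonal to both covariate spans by construction,
# so it captures only variation unexplained by the covariates
assert np.allclose(P_X @ L_hat, 0, atol=1e-8)
assert np.allclose(L_hat @ P_Z, 0, atol=1e-8)
```

The two assertions mirror the identifiability constraints: projecting the latent estimate onto either covariate span returns zero, which is what removes the ambiguity between observed and latent effects.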
Pub Date: 2026-01-13 | eCollection Date: 2026-01-01 | DOI: 10.1093/biomethods/bpag001
Fraser G Ferens, Cassandra C Taber, Jeffrey J Eo, Michael Ohh
Pacak-Zhuang syndrome is an emerging pseudohypoxic disorder that causes defined but varied manifestations of neuroendocrine tumours with or without polycythemia, or exclusively polycythemia. This disease is caused by mutations in the EPAS1 gene, which encodes one of three hypoxia-inducible factor (HIF) α subunits, HIF2α. As new mutations in this gene are observed in individuals exhibiting the manifestations of Pacak-Zhuang syndrome, there is a need to distinguish bona fide disease-causing mutations from benign mutations, which could have a valuable impact on the direction of patient care. We recently showed that mutation-driven reductions in the affinity of prolyl-hydroxylase 2 (PHD2) for HIF2α are at the root of the mechanism underlying Pacak-Zhuang syndrome. The determination of affinity was accomplished using microscale thermophoresis (MST). Here, we describe a detailed protocol for the assessment of binding affinities between HIF2α peptides or the entire oxygen-dependent degradation domains of HIFα proteins and PHD2 using MST, and propose that this method can be used to assess the potential pathogenicity of novel mutations in HIF2α.
Assessment of HIF2α mutational pathogenicity using microscale thermophoresis. Biology Methods and Protocols, 11(1): bpag001. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12867580/pdf/
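As a companion to the protocol, estimating an affinity from an MST titration can be sketched by fitting a simple 1:1 binding isotherm to the dose-response signal. MST analyses often use a target-depletion (quadratic) model instead; the concentrations, baselines, and Kd below are invented for illustration, not values from the study.

```python
import numpy as np
from scipy.optimize import curve_fit

def binding_isotherm(conc, kd, f_unbound, f_bound):
    """Simple 1:1 binding model: the signal interpolates between the
    unbound and bound baselines as the fraction bound increases."""
    frac = conc / (kd + conc)
    return f_unbound + (f_bound - f_unbound) * frac

# hypothetical 12-point serial dilution of the titrant (µM)
conc = 50.0 / 2 ** np.arange(12)

# simulate a noisy titration with a true Kd of 2.5 µM
rng = np.random.default_rng(1)
signal = binding_isotherm(conc, kd=2.5, f_unbound=800.0, f_bound=850.0)
signal += rng.normal(scale=1.0, size=conc.size)

popt, pcov = curve_fit(binding_isotherm, conc, signal,
                       p0=(1.0, signal.min(), signal.max()))
kd_hat = popt[0]
print(f"estimated Kd ≈ {kd_hat:.2f} µM")  # recovers a value near the simulated 2.5 µM
```

Comparing fitted Kd values between wild-type and mutant constructs is the kind of readout the protocol uses to flag affinity-reducing, potentially pathogenic mutations.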
Pub Date: 2025-12-17 | eCollection Date: 2025-01-01 | DOI: 10.1093/biomethods/bpaf091
Kathryn Whitehead, Sarah Planchak, Trinity Williams, Julia Xia, Soeun Park, Alejandra Hernandez Moyers, Shreyas Shah, Lloyd Bwanali, Anubhav Tripathi
Cell2Read is a novel automated method for complete integration of cell lysis and sample preparation for next-generation sequencing (NGS). It optimizes diffusion kinetics and complex thermal geometries to allow for effective use down to low cell inputs. This enables DNA analysis from a low cellular input, whether for in vitro analysis or for diagnostic applications from dissociated tumor biopsies. We demonstrate that the system can process input cell suspensions as low as 1500 cells without compromising sequencing integrity. We also demonstrate the breadth of the protocol in its ability to repeatably process many cell types, including HepG2, Caov3, HEY A8, OVCAR 8, MDA-MB-231, and human primary ovarian epithelial cells. The workflow integrates and fully automates cell lysis, DNA extraction, and library preparation into a single platform, offering high sensitivity and reproducibility. Our results show that the system yields consistent DNA quantities (≥10 ng) with high sequencing quality, even at low cell inputs, with alignment rates exceeding 95% for inputs of 3125 cells or greater. The automated method's sequencing performance was comparable to manual protocols, with no significant differences in quality scores or GC bias across processing methods. We also demonstrated effective, unbiased sequencing of heterogeneous cell suspensions through comprehensive testing of spiked concentrations of cancerous cells mixed with non-cancerous ovarian cells. Sequencing output showed DNA representation of cancer markers proportional to the input concentration of cancer cells. The Cell2Read workflow offers a technically validated, scalable solution that expands accessibility to genomic analysis and supports reproducible, high-quality sequencing from low-input human samples. This robustness across a range of cell types makes Cell2Read an ideal solution for sequencing applications, including oncology research and clinical diagnostics.
Cell2Read: an automated workflow to generate sequencing-ready DNA libraries from human cell suspensions. Biology Methods and Protocols, 10(1): bpaf091. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12744386/pdf/
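The proportional-representation check described above can be sketched as a linear fit of observed cancer-marker read fraction against the spiked cancer-cell fraction: a slope near 1, an intercept near 0, and a correlation close to 1 support unbiased representation. The spike-in fractions and marker fractions below are hypothetical numbers for illustration, not the study's data.

```python
import numpy as np

# hypothetical spike-in series: fraction of cancerous cells in the input mix
spike_frac = np.array([0.0, 0.1, 0.25, 0.5, 0.75, 1.0])
# hypothetical observed fraction of reads carrying a cancer-marker allele
marker_frac = np.array([0.01, 0.09, 0.26, 0.49, 0.73, 0.97])

# least-squares line and Pearson correlation between input and output fractions
slope, intercept = np.polyfit(spike_frac, marker_frac, 1)
r = np.corrcoef(spike_frac, marker_frac)[0, 1]
print(f"slope={slope:.2f} intercept={intercept:.2f} r={r:.3f}")
```

With these illustrative numbers the fit lies close to the identity line, which is the pattern one would expect from a workflow that sequences heterogeneous suspensions without bias.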
Pub Date: 2025-12-14 | eCollection Date: 2026-01-01 | DOI: 10.1093/biomethods/bpaf093
Vikas N Vattipally, Patrick Kramer, Nada Abouelseoud, Isha Yeleswarapu, A Daniel Davidar, Joseph M Dardick, Ali Bydon, Timothy F Witham, Daniel Lubelski, Kathryn Rosenblatt, Judy Huang, Chetan Bettegowda, Frederick Sieber, Esther S Oh, Sridevi V Sarma, Ozan Akca, Nicholas Theodore, Tej D Azad
Postoperative delirium (POD) is a complication of surgery in older adults associated with adverse outcomes. Current screening methods demonstrate poor interrater reliability, and conventional electroencephalography (EEG)-based screening requires intensive setup. Point-of-care (POC) EEG technology offers a rapid and objective alternative that may capture neurophysiological signatures of delirium risk. When combined with baseline and perioperative variables, POC EEG may enable the prediction of POD before clinical manifestation. In this study, we aim to develop a POD prediction model using POC EEG as well as explore secondary outcomes such as longer-term cognitive impairment and postoperative pain. This is a prospective cohort study enrolling older adults (≥60 years) undergoing elective non-cranial inpatient surgery at two academic hospitals. The target cohort size is 150 participants, determined by an events-per-parameter approach. All participants undergo baseline cognitive testing and pain assessment using the Montreal Cognitive Assessment (MoCA) and Numeric Rating Scale. The primary outcome is POD, while secondary outcomes include follow-up MoCA scores and postoperative pain scores. POD is assessed immediately after surgery and every 12 h during the admission with the 4AT tool. Perioperative EEG is acquired using the Ceribell EEG system (Ceribell, Inc.) across standardized preoperative, intraoperative, and postoperative phases. EEG features such as spectral power, alpha/delta ratio, and burst suppression ratio are analyzed in relation to outcomes. Predictive models will be developed using regularized logistic regression with nested feature sets, and model performance will be evaluated. This study evaluates whether POC EEG can accurately predict POD in older adults undergoing elective surgery, as well as longer-term cognitive impairment and postoperative pain. 
This approach could enable early identification of high-risk patients and facilitate targeted preventive strategies. By generating a validated risk model, multimodal exploratory analyses, and openly available datasets, this work aims to advance the practical management of perioperative outcomes.
Point-of-care electroencephalography for prediction of postoperative delirium in older adults undergoing elective surgery: protocol for a prospective cohort study. Biology Methods and Protocols, 11(1): bpaf093. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12798540/pdf/
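An EEG feature such as the alpha/delta ratio can be computed from Welch spectral estimates by summing power in each band. This sketch uses synthetic single-channel data; the sampling rate, band limits, and window length are assumptions for illustration, not the study's processing pipeline.

```python
import numpy as np
from scipy.signal import welch

def band_power(freqs, psd, lo, hi):
    """Sum PSD bins in [lo, hi) Hz (the uniform bin width cancels in a ratio)."""
    mask = (freqs >= lo) & (freqs < hi)
    return psd[mask].sum()

fs = 250.0                      # assumed sampling rate, Hz
t = np.arange(0, 30, 1 / fs)    # 30 s of synthetic single-channel EEG
rng = np.random.default_rng(2)
eeg = (np.sin(2 * np.pi * 10 * t)          # alpha component at 10 Hz
       + 0.5 * np.sin(2 * np.pi * 2 * t)   # delta component at 2 Hz
       + 0.3 * rng.normal(size=t.size))    # broadband noise

# Welch PSD with 4-s segments, then the alpha (8-13 Hz) / delta (1-4 Hz) ratio
freqs, psd = welch(eeg, fs=fs, nperseg=int(4 * fs))
adr = band_power(freqs, psd, 8, 13) / band_power(freqs, psd, 1, 4)
print(f"alpha/delta ratio ≈ {adr:.2f}")
```

Features computed this way per perioperative phase are the kind of inputs the protocol feeds into its regularized logistic regression models.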
Pub Date: 2025-12-13 | eCollection Date: 2026-01-01 | DOI: 10.1093/biomethods/bpaf089
Rosa Aghdam, Shan Shan, Richard Lankau, Claudia Solís-Lemus
Microbiome research faces two central challenges, namely constructing reliable networks, where nodes represent microbial taxa and edges represent their associations, and identifying significant disease-associated taxa. To address the first challenge, we developed CMIMN, a novel R package that applies a Bayesian network framework based on conditional mutual information to infer microbial interaction networks. To further enhance reliability, we construct a consensus microbiome network by integrating results from CMIMN and three widely used methods: Sparse Inverse Covariance Estimation for Ecological Association Inference (SPIEC-EASI), Semi-Parametric Rank-based correlation and partial correlation Estimation (SPRING), and Sparse Correlations for Compositional Data (SPARCC). This consensus approach, which overlays and weights edges shared across methods, reduces inconsistencies and provides a more biologically meaningful view of microbial relationships. To address the second challenge, we designed a multi-method feature selection framework that combines machine learning with network-based strategies. Our machine learning pipeline applies distinct algorithms and identifies key taxa based on their consistent importance across models. Complementing this, we employ two network-based strategies that prioritize taxa based on centrality differences between networks constructed from healthy samples and disease-affected samples, as well as a composite scoring system that ranks nodes using integrated network metrics. We applied CMIMN to soil microbiome data from potato fields affected by common scab disease. 
Notably, we identified Bacteroidota, WPS-2, and Proteobacteria at the Phylum level; Actinobacteria, AD3, Bacilli, Anaerolineae, and Ktedonobacteria at the Class level; and C0119, Defluviicoccales, Bacteroidales, and Ktedonobacterales at the Order level as key taxa associated with disease status.
A hybrid framework for disease biomarker discovery in microbiome research combining Bayesian networks, machine learning, and network-based methods. Biology Methods and Protocols, 11(1): bpaf089. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12791661/pdf/
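The edge-overlay step of the consensus network can be sketched as counting, for each taxon pair, how many methods report the edge, and keeping edges with sufficient support. The edge sets and support threshold below are illustrative stand-ins, not actual output of CMIMN or the other methods.

```python
# hypothetical edge sets (taxon pairs) reported by four inference methods
edges = {
    "CMIMN":      {("A", "B"), ("B", "C"), ("C", "D")},
    "SPIEC-EASI": {("A", "B"), ("B", "C")},
    "SPRING":     {("A", "B"), ("C", "D")},
    "SPARCC":     {("A", "B"), ("B", "D")},
}

# weight each edge by the number of methods that report it
support = {}
for method, es in edges.items():
    for e in es:
        key = tuple(sorted(e))          # treat edges as undirected
        support[key] = support.get(key, 0) + 1

# keep edges supported by at least half of the methods (assumed threshold)
min_support = len(edges) / 2
consensus = {e: w for e, w in support.items() if w >= min_support}
print(consensus)   # ("A", "B") carries the largest weight: all four methods report it
```

Weighting edges this way down-ranks associations reported by a single method, which is how the consensus reduces inconsistencies between inference approaches.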
Pub Date: 2025-12-12 | eCollection Date: 2026-01-01 | DOI: 10.1093/biomethods/bpaf092
Karina I Shakhgeldyan, Vladislav Y Rublev, Nikita S Kuksin, Boris I Geltser, Regina L Pak
Postoperative atrial fibrillation (PoAF) is a common complication after coronary artery bypass grafting (CABG). Despite its association with increased risk of ischemic stroke, bleeding, acute renal failure, and mortality, there is still no ideal predictive tool with proper clinical interpretability. A retrospective single-center cohort study enrolled 1305 electronic medical records of patients undergoing elective isolated CABG. PoAF was identified in 280 (21.5%) patients. Prognostic models with continuous variables were developed using multivariate logistic regression (MLR), random forest, and eXtreme gradient boosting (XGB) methods. Predictors were dichotomized via grid search for optimal cut-off points, centroid calculation, and Shapley additive explanations (SHAP). For multilevel categorization, we proposed using combinations of the threshold values identified during dichotomization, as well as ranking cut-off thresholds by MLR weighting coefficients (the multimetric categorization method). Based on multistage selection, nine PoAF predictors were identified and validated. After categorization, prognostic models with continuous and multilevel categorical variables were developed. The best XGB model employing continuous predictors demonstrated an AUC of 0.76. Models whose predictors were derived using the multimetric categorization approach showed comparable predictive performance (AUC = 0.758). The main advantage of models with multilevel predictor categorization was their superior explainability and clinical interpretability in predicting PoAF. Multilevel predictor categorization represents a promising tool for improving the explainability of PoAF risk estimates.
Using the developed prognostic models, we demonstrated that the proposed categorization procedures ensure both high predictive accuracy and transparency of the resulting clinical conclusions.
Multilevel predictors categorization for post-CABG atrial fibrillation prediction. Biology Methods and Protocols, 11(1): bpaf092. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12791823/pdf/
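Dichotomization via grid search over cut-off points can be sketched by scanning the observed values of a predictor for the threshold that maximizes a separation criterion. Youden's J is used here as one plausible criterion; the predictor, outcome, and data below are illustrative stand-ins for the authors' multimetric procedure.

```python
import numpy as np

def best_cutoff(x, y):
    """Grid search over observed values of x for the threshold maximizing
    Youden's J = sensitivity + specificity - 1 against a binary outcome y."""
    best_t, best_j = None, -np.inf
    for t in np.unique(x):
        pred = (x >= t).astype(int)
        tp = np.sum((pred == 1) & (y == 1))
        fn = np.sum((pred == 0) & (y == 1))
        tn = np.sum((pred == 0) & (y == 0))
        fp = np.sum((pred == 1) & (y == 0))
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

# hypothetical cohort: age as a continuous predictor of PoAF (1 = PoAF)
rng = np.random.default_rng(3)
age = np.concatenate([rng.normal(60, 8, 200), rng.normal(72, 8, 60)])
poaf = np.concatenate([np.zeros(200, dtype=int), np.ones(60, dtype=int)])

t, j = best_cutoff(age, poaf)
print(f"cut-off ≈ {t:.1f} years, Youden J = {j:.2f}")
```

Replacing a continuous predictor with "above/below the cut-off" (or several ranked cut-offs for multilevel categorization) is what makes the resulting model's decision rules readable to clinicians.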
Pub Date: 2025-12-04 | eCollection Date: 2025-01-01 | DOI: 10.1093/biomethods/bpaf084
Weronika Klecel, Hadley Rahael, Samantha A Brooks
DeepLabCut has transformed behavioral and locomotor research by enabling markerless pose estimation through deep learning. Despite its broad adoption across species and behaviors, quantitative kinematic analyses have remained limited by noisy outputs and the computational expertise required for refinement. To address this issue, we introduce refineDLC, a comprehensive post-processing pipeline that streamlines the conversion of noisy DeepLabCut outputs into robust, analytically reliable kinematic data. The pipeline incorporates essential cleaning steps, including inversion of the y-coordinates for intuitive spatial interpretation, removal of zero-value frames, and exclusion of irrelevant body part labels. It further applies dual-stage filtering based on likelihood scores and positional changes, enhancing data accuracy and consistency. Multiple interpolation strategies manage missing values while maintaining data continuity and integrity. We evaluated refineDLC using two datasets: controlled locomotion in cattle and field-recorded trotting horses. Across both contexts, the pipeline substantially improved data quality and interpretability, reducing variability, eliminating false-positive labeling errors, and transforming noisy trajectories into physiologically meaningful kinematic patterns. Outputs were reliable and analysis-ready regardless of recording conditions or species. By simplifying the transformation from raw DeepLabCut outputs to meaningful kinematic insights, refineDLC expands accessibility for researchers, particularly those with limited programming expertise, enabling precise quantitative analyses at scale. Future developments may incorporate adaptive filtering algorithms and real-time quality assessments, further optimizing performance and automation. These enhancements will extend the pipeline's applicability to precision phenotyping, behavioral ecology, animal science, and conservation biology.
{"title":"refineDLC: An advanced post-processing pipeline for DeepLabCut outputs.","authors":"Weronika Klecel, Hadley Rahael, Samantha A Brooks","doi":"10.1093/biomethods/bpaf084","DOIUrl":"10.1093/biomethods/bpaf084","url":null,"abstract":"<p><p>DeepLabCut has transformed behavioral and locomotor research by enabling markerless pose estimation through deep learning. Despite its broad adoption across species and behaviors, quantitative kinematic analyses remained limited by noisy outputs and the computational expertise required for refinement. To address this issue, we introduce refineDLC, a comprehensive post-processing pipeline that streamlines the conversion of noisy DeepLabCut outputs into robust, analytically reliable kinematic data. The pipeline incorporates essential cleaning steps, including inversion of the y-coordinates for intuitive spatial interpretation, removal of zero-value frames, and exclusion of irrelevant body part labels. It further applies dual-stage filtering based on likelihood scores and positional changes, enhancing data accuracy and consistency. Multiple interpolation strategies manage missing values while maintaining data continuity and integrity. We evaluated refineDLC using two datasets: controlled locomotion in cattle and field-recorded trotting horses. Across both contexts, the pipeline substantially improved data quality and interpretability, reducing variability, eliminating false-positive labeling errors, and transforming noisy trajectories into physiologically meaningful kinematic patterns. Outputs were reliable and analysis-ready regardless of recording conditions or species. By simplifying the transformation from raw DeepLabCut outputs to meaningful kinematic insights, refineDLC expands accessibility for researchers, particularly those with limited programming expertise, enabling precise quantitative analyses at scale. 
Future developments may incorporate adaptive filtering algorithms and real-time quality assessments, further optimizing performance and automation. These enhancements will extend the pipeline's applicability to precision phenotyping, behavioral ecology, animal science, and conservation biology.</p>","PeriodicalId":36528,"journal":{"name":"Biology Methods and Protocols","volume":"10 1","pages":"bpaf084"},"PeriodicalIF":1.3,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12744387/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145858107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
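The cleaning steps the refineDLC abstract lists (zero-frame removal, y-inversion, likelihood and positional-change filtering, interpolation) can be sketched in a few lines of pandas. This is a minimal illustration, not refineDLC's actual code; the column names (`x`, `y`, `likelihood`), thresholds, and the function name `clean_track` are assumptions modeled on DeepLabCut-style output.

```python
import numpy as np
import pandas as pd

def clean_track(df, likelihood_min=0.9, max_jump=50.0):
    """Clean one body-part track with x, y, likelihood columns.

    Mirrors the steps described above: drop zero-value frames, invert y,
    filter by likelihood, filter by frame-to-frame jump, interpolate gaps.
    """
    out = df.copy()
    # zero-value frames are detection failures -> treat as missing
    out.loc[(out["x"] == 0) & (out["y"] == 0), ["x", "y"]] = np.nan
    out["y"] = -out["y"]  # invert y so "up" is positive
    # stage 1: likelihood filter (NaN likelihoods are dropped too)
    low = ~(out["likelihood"] >= likelihood_min)
    out.loc[low, ["x", "y"]] = np.nan
    # stage 2: positional-change filter on Euclidean frame-to-frame jumps
    jump = np.sqrt(out["x"].diff() ** 2 + out["y"].diff() ** 2)
    out.loc[jump > max_jump, ["x", "y"]] = np.nan
    # fill gaps by linear interpolation, preserving series length
    out[["x", "y"]] = out[["x", "y"]].interpolate(limit_direction="both")
    return out
```

A frame flagged by either stage becomes missing and is then re-estimated from its neighbors, so the output keeps the original frame count, which is what makes it "analysis-ready" for stride-level kinematics.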
Pub Date : 2025-12-01eCollection Date: 2025-01-01DOI: 10.1093/biomethods/bpaf086
Ruslan V Pustovit, Yugeesh R Lankadeva, Ming S Soh, Sam F Berkovic, Christopher A Reid, Clive N May
The pathophysiology of seizures is complex and could contribute to a range of morbidities including sudden unexpected death in epilepsy (SUDEP). A better understanding of seizure-induced pathophysiology can lead to the development of targeted interventions. Here, we describe the development and characterization of a novel large mammalian model of convulsive seizures in non-anaesthetized sheep induced by pentylenetetrazol (PTZ), one of the most widely used proconvulsant drugs in epilepsy research. A dose of intravenous PTZ that reliably induced a reproducible and consistent level of seizure in non-anaesthetized sheep was determined. Convulsive seizures went through a relatively predictable sequence, similar to that seen in other animal models of epilepsy. A species-specific seizure severity scale, based on the Racine scale widely used in epilepsy research, was designed to provide a user-friendly scoring system for PTZ-induced seizures in sheep. We demonstrated that convulsive seizures caused substantial increases in mean arterial pressure and heart rate. The translational value of this large animal model can be further enhanced when combined with other translational tools such as quantitative systems physiology and pharmacology, potential biomarker testing, and experimental preclinical trials of potential prophylactic treatments.
{"title":"Development and characterization of a pentylenetetrazol-induced convulsive seizure model in non-anaesthetized sheep.","authors":"Ruslan V Pustovit, Yugeesh R Lankadeva, Ming S Soh, Sam F Berkovic, Christopher A Reid, Clive N May","doi":"10.1093/biomethods/bpaf086","DOIUrl":"10.1093/biomethods/bpaf086","url":null,"abstract":"<p><p>The pathophysiology of seizures is complex and could contribute to a range of morbidities including sudden unexpected death in epilepsy (SUDEP). A better understanding of seizure-induced pathophysiology can lead to the development of targeted interventions. Here, we describe the development and characterization of a novel large mammalian model of convulsive seizures in non-anaesthetized sheep induced by pentylenetetrazol (PTZ), one of the most widely used proconvulsant drugs in epilepsy research. A dose of intravenous PTZ that reliably induced a reproducible and consistent level of seizure in non-anaesthetized sheep was determined. Convulsive seizures went through a relatively predictable sequence, similar to that seen in other animal models of epilepsy. A species-specific seizure severity scale, based on the Racine scale widely used in epilepsy research, was designed to provide a user-friendly scoring system for PTZ-induced seizures in sheep. We demonstrated that convulsive seizures caused substantial increases in mean arterial pressure and heart rate. The translational value of this large animal model can be further enhanced when combined with other translational tools such as quantitative systems physiology and pharmacology, potential biomarker testing, and experimental preclinical trials of potential prophylactic treatments. 
An advanced animal model, such as described in this study, provides a unique opportunity for comprehensive physiological monitoring of neural and systemic pathways activated by interictal and ictal activity and can contribute to the development of preventive therapies for seizures.</p>","PeriodicalId":36528,"journal":{"name":"Biology Methods and Protocols","volume":"10 1","pages":"bpaf086"},"PeriodicalIF":1.3,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12674773/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145678895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-26eCollection Date: 2025-01-01DOI: 10.1093/biomethods/bpaf078
Malte Willmes, Anders Varmann Aamodt, Børge Solli Andreassen, Lina Victoria Tuddenham Haug, Enghild Steinkjer, Gunnel M Østborg, Gitte Løkeberg, Peder Fiske, Geir R Brandt, Terje Mikalsen, Arne Siversten, Magnus Moustache, June Larsen Ydsti, Bjørn Florø-Larsen
Escaped farmed salmon are a major concern for wild Atlantic salmon (Salmo salar) stocks in Norway. Fish scale analysis is a well-established method for distinguishing farmed from wild fish, but the process is labor- and time-intensive. Deep learning has recently been shown to automate this task with high accuracy, though typically on relatively small and geographically limited datasets. Here we train and validate a new convolutional neural network on nearly 90 000 scale images from two national archives, encompassing heterogeneous imaging protocols, hundreds of rivers, and time series extending back to the 1930s. The model achieved an F1 score of 0.95 on a large, independent test set, with predictions closely matching both genetic reference samples and known farmed-origin scales.
{"title":"Identifying escaped farmed salmon from fish scales using deep learning.","authors":"Malte Willmes, Anders Varmann Aamodt, Børge Solli Andreassen, Lina Victoria Tuddenham Haug, Enghild Steinkjer, Gunnel M Østborg, Gitte Løkeberg, Peder Fiske, Geir R Brandt, Terje Mikalsen, Arne Siversten, Magnus Moustache, June Larsen Ydsti, Bjørn Florø-Larsen","doi":"10.1093/biomethods/bpaf078","DOIUrl":"10.1093/biomethods/bpaf078","url":null,"abstract":"<p><p>Escaped farmed salmon are a major concern for wild Atlantic salmon (<i>Salmo salar</i>) stocks in Norway. Fish scale analysis is a well-established method for distinguishing farmed from wild fish, but the process is labor and time intensive. Deep learning has recently been shown to automate this task with high accuracy, though typically on relatively small and geographically limited datasets. Here we train and validate a new convolutional neural network on nearly 90 000 scale images from two national archives, encompassing heterogeneous imaging protocols, hundreds of rivers, and time series extending back to the 1930s. The model achieved an F1 score of 0.95 on a large, independent test set, with predictions closely matching both genetic reference samples and known farmed-origin scales. 
By developing and testing this new model on a large and diverse dataset, we demonstrate that deep learning generalizes robustly across ecological and methodological contexts, supporting its use as a validated, large-scale tool for monitoring escaped farmed salmon.</p>","PeriodicalId":36528,"journal":{"name":"Biology Methods and Protocols","volume":"10 1","pages":"bpaf078"},"PeriodicalIF":1.3,"publicationDate":"2025-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12647055/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145640650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
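For readers interpreting the reported F1 of 0.95: F1 is the harmonic mean of precision and recall on the positive class (here, farmed origin). A minimal self-contained sketch of the metric, on synthetic labels rather than the paper's data:

```python
def f1_score(y_true, y_pred, positive="farmed"):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because F1 ignores true negatives, it is a stricter summary than accuracy when farmed scales are a small minority of samples, which is the relevant monitoring scenario.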
Pub Date : 2025-11-21eCollection Date: 2025-01-01DOI: 10.1093/biomethods/bpaf088
Ravi Shankar, Fiona Devi, Xu Qian
The integration of computational methods with traditional qualitative research has emerged as a transformative paradigm in healthcare research. Computational Grounded Theory (CGT) combines the interpretive depth of grounded theory with computational techniques including machine learning and natural language processing. This systematic review examines the application of CGT in healthcare research through analysis of eight studies demonstrating the method's utility across diverse contexts. Following a systematic search across five databases and PRISMA-aligned screening, eight papers applying CGT in healthcare were analyzed. Studies spanned COVID-19 risk perception, medical AI adoption, mental health interventions, diabetes management, women's health technology, online health communities, and social welfare systems, employing computational techniques including Latent Dirichlet Allocation (LDA), sentiment analysis, word embeddings, and deep learning algorithms. Results demonstrate CGT's capacity for analyzing large-scale textual data (100 000+ documents) while maintaining theoretical depth, with consistent reports of enhanced analytical capacity, latent pattern identification, and novel theoretical insights. However, challenges include technical complexity, interpretation validity, resource requirements, and the need for interdisciplinary expertise. CGT represents a promising methodological innovation for healthcare research, particularly valuable for understanding complex healthcare phenomena, patient experiences, and technology adoption, though the small sample size (8 of 892 screened articles) reflects its nascent application and limits generalizability. 
Future research should focus on standardizing methodological procedures, developing best practices, expanding applications, and addressing accessibility barriers.
{"title":"A systematic review of the application of computational grounded theory method in healthcare research.","authors":"Ravi Shankar, Fiona Devi, Xu Qian","doi":"10.1093/biomethods/bpaf088","DOIUrl":"10.1093/biomethods/bpaf088","url":null,"abstract":"<p><p>The integration of computational methods with traditional qualitative research has emerged as a transformative paradigm in healthcare research. Computational Grounded Theory (CGT) combines the interpretive depth of grounded theory with computational techniques including machine learning and natural language processing. This systematic review examines CGT application in healthcare research through analysis of eight studies demonstrating the method's utility across diverse contexts. Following systematic search across five databases and PRISMA-aligned screening, eight papers applying CGT in healthcare were analyzed. Studies spanned COVID-19 risk perception, medical AI adoption, mental health interventions, diabetes management, women's health technology, online health communities, and social welfare systems, employing computational techniques including Latent Dirichlet Allocation (LDA), sentiment analysis, word embeddings, and deep learning algorithms. Results demonstrate CGT's capacity for analyzing large-scale textual data (100 000+ documents) while maintaining theoretical depth, with consistent reports of enhanced analytical capacity, latent pattern identification, and novel theoretical insights. However, challenges include technical complexity, interpretation validity, resource requirements, and need for interdisciplinary expertise. CGT represents a promising methodological innovation for healthcare research, particularly for understanding complex phenomena, patient experiences, and technology adoption, though the small sample size (8 of 892 screened articles) reflects its nascent application and limits generalizability. 
Future research should focus on standardizing methodological procedures, developing best practices, expanding applications, and addressing accessibility barriers.</p>","PeriodicalId":36528,"journal":{"name":"Biology Methods and Protocols","volume":"10 1","pages":"bpaf088"},"PeriodicalIF":1.3,"publicationDate":"2025-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12744390/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145858116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}