
Biology Methods and Protocols: Latest Articles

SOLVE: A structured orthogonal latent variable framework for disentangling confounding in matrix data.
IF 1.3 Q3 BIOCHEMICAL RESEARCH METHODS Pub Date : 2026-01-28 eCollection Date: 2026-01-01 DOI: 10.1093/biomethods/bpaf094
Jialai She, Gil Alterovitz

Latent factor models are valuable in bioinformatics for accounting for unmeasured variation alongside observed covariates. Yet many methods struggle to separate known effects from latent structure and to handle losses beyond standard regression. We present a unified framework that augments row and column predictors with a low-rank latent component, jointly modeling measured effects and residual variation. To remove ambiguity in estimating observed and latent effects, we impose a carefully designed set of orthogonality constraints on the coefficient and latent factor matrices, relative to the spans of the predictor matrices. These constraints ensure identifiability, yield a decomposition in which the latent term captures only variation unexplained by the covariates, and improve interpretability. An efficient algorithm handles general non-quadratic losses via surrogates with monotone descent. Each iteration updates the latent term by truncated singular value decomposition of a doubly projected residual and refines coefficients by projections. The number of latent factors is selected by applying an elbow rule to a degrees-of-freedom-adjusted information criterion. A parametric bootstrap provides valid inference on feature-outcome associations under the regularized low-rank structure. Applied to real pharmacogenomic data, the method recovers biologically coherent gene-drug associations missed by standard factor models, such as the EGFR-inhibitor link, highlights novel candidates with plausible mechanisms, and reveals gene programs aligned with compound modes of action, including a latent unfolded-protein-response module affecting drug sensitivity. These results support the framework's utility for precision oncology, yielding stronger biomarkers for patient stratification and deeper insight into drug resistance mechanisms.
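The latent update described in the abstract (a truncated singular value decomposition of a doubly projected residual) can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes the quadratic-loss case only, assumes the predictor matrices have full column rank, and the names `complement_projector`, `latent_update`, `X`, `Z`, and `R` are hypothetical.

```python
import numpy as np

def complement_projector(A):
    """Projector onto the orthogonal complement of the column span of A
    (assumes A has full column rank)."""
    Q, _ = np.linalg.qr(A)            # orthonormal basis for span(A)
    return np.eye(A.shape[0]) - Q @ Q.T

def latent_update(R, X, Z, rank):
    """Rank-`rank` approximation of the residual R after removing the
    row-predictor span (X) on the left and the column-predictor span (Z)
    on the right."""
    P_row = complement_projector(X)   # n x n
    P_col = complement_projector(Z)   # m x m
    R_proj = P_row @ R @ P_col        # doubly projected residual
    U, s, Vt = np.linalg.svd(R_proj, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

# Hypothetical toy data: 8 rows, 6 columns, 2 row predictors, 1 column predictor.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))
Z = rng.normal(size=(6, 1))
R = rng.normal(size=(8, 6))
L = latent_update(R, X, Z, rank=2)
```

In this sketch the returned latent term satisfies `X.T @ L ≈ 0` and `L @ Z ≈ 0`, which is one way to read the abstract's statement that the latent term captures only variation unexplained by the covariates.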

Citations: 0
Assessment of HIF2α mutational pathogenicity using microscale thermophoresis.
IF 1.3 Q3 BIOCHEMICAL RESEARCH METHODS Pub Date : 2026-01-13 eCollection Date: 2026-01-01 DOI: 10.1093/biomethods/bpag001
Fraser G Ferens, Cassandra C Taber, Jeffrey J Eo, Michael Ohh

Pacak-Zhuang syndrome is an emerging pseudohypoxic disorder that causes defined but varied manifestations: neuroendocrine tumours with or without polycythemia, or polycythemia alone. This disease is caused by mutations in the EPAS1 gene, which encodes one of three hypoxia-inducible factor (HIF) α subunits, HIF2α. As new mutations in this gene are observed in individuals exhibiting the manifestations of Pacak-Zhuang syndrome, there is a need to distinguish bona fide disease-causing mutations from benign mutations, which could have a valuable impact on the direction of patient care. We recently showed that mutation-induced reductions in the affinity of prolyl-hydroxylase 2 (PHD2) for HIF2α are at the root of the mechanism underlying Pacak-Zhuang syndrome. Affinity was determined using microscale thermophoresis (MST). Here, we describe a detailed protocol for the assessment of binding affinities between HIF2α peptides or the entire oxygen-dependent degradation domains of HIFα proteins and PHD2 using MST and propose that this method can be used to assess the potential pathogenicity of novel mutations in HIF2α.
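As a rough illustration of how a dissociation constant can be estimated from a titration series such as an MST experiment, the sketch below uses the standard 1:1 binding model. It is not the protocol's analysis procedure: it assumes the normalised MST response maps linearly onto fraction bound, uses noise-free synthetic data, and replaces nonlinear regression with a simple grid search. The names `fraction_bound` and `fit_kd` and all concentrations are hypothetical.

```python
import math

def fraction_bound(ligand, target, kd):
    """Fraction of target bound under 1:1 binding.
    All concentrations must share the same unit."""
    s = ligand + target + kd
    return (s - math.sqrt(s * s - 4.0 * ligand * target)) / (2.0 * target)

def fit_kd(ligand_concs, fractions, target, kd_grid):
    """Return the Kd on the grid that minimises the sum of squared errors."""
    def sse(kd):
        return sum((fraction_bound(c, target, kd) - f) ** 2
                   for c, f in zip(ligand_concs, fractions))
    return min(kd_grid, key=sse)

# Hypothetical, noise-free 16-point two-fold dilution series.
target = 0.05                                    # fixed labelled partner
true_kd = 1.0
ligand_concs = [100.0 / 2 ** i for i in range(16)]
fractions = [fraction_bound(c, target, true_kd) for c in ligand_concs]

kd_grid = [10 ** (e / 20.0) for e in range(-60, 61)]   # 1e-3 to 1e3, log-spaced
estimated_kd = fit_kd(ligand_concs, fractions, target, kd_grid)
```

Under this model, a mutation that weakens binding appears as a larger estimated Kd for the mutant than for the wild-type sequence.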

Citations: 0
Cell2Read: an automated workflow to generate sequencing-ready DNA libraries from human cell suspensions.
IF 1.3 Q3 BIOCHEMICAL RESEARCH METHODS Pub Date : 2025-12-17 eCollection Date: 2025-01-01 DOI: 10.1093/biomethods/bpaf091
Kathryn Whitehead, Sarah Planchak, Trinity Williams, Julia Xia, Soeun Park, Alejandra Hernandez Moyers, Shreyas Shah, Lloyd Bwanali, Anubhav Tripathi

Cell2Read is a novel automated method that fully integrates cell lysis and sample preparation for next-generation sequencing (NGS). It optimizes diffusion kinetics and complex thermal geometries to remain effective at low cell inputs. This allows DNA analysis from a low cellular input, whether for in vitro analysis or for diagnostic applications using dissociated tumor biopsies. We demonstrate that the system can process input suspensions containing as few as 1500 cells without compromising sequencing integrity. We also demonstrate the breadth of the protocol in its ability to repeatably process many cell types, including HepG2, Caov3, HEY A8, OVCAR 8, MDA-MB-231, and Human Primary Ovarian Epithelial Cells. The workflow integrates cell lysis, DNA extraction, and library preparation into a single, fully automated platform, offering high sensitivity and reproducibility. Our results show that the system yields consistent DNA quantities (≥10 ng) with high sequencing quality, even at low cell inputs, with alignment rates exceeding 95% for inputs of 3125 cells or greater. The automated method's sequencing performance was comparable to manual protocols, with no significant differences in quality scores or GC bias across processing methods. We also demonstrated effective, unbiased sequencing of heterogeneous cell suspensions through comprehensive testing of non-cancerous ovarian cells spiked with varying concentrations of cancerous cells. In the sequencing output, the DNA representation of cancer markers was proportional to the concentration of input cancer cells. The Cell2Read workflow offers a technically validated, scalable solution that expands accessibility to genomic analysis and supports reproducible, high-quality sequencing from low-input human samples. This robustness across a range of cell types makes Cell2Read an ideal solution for sequencing applications, including oncology research and clinical diagnostics.

Citations: 0
Point-of-care electroencephalography for prediction of postoperative delirium in older adults undergoing elective surgery: protocol for a prospective cohort study.
IF 1.3 Q3 BIOCHEMICAL RESEARCH METHODS Pub Date : 2025-12-14 eCollection Date: 2026-01-01 DOI: 10.1093/biomethods/bpaf093
Vikas N Vattipally, Patrick Kramer, Nada Abouelseoud, Isha Yeleswarapu, A Daniel Davidar, Joseph M Dardick, Ali Bydon, Timothy F Witham, Daniel Lubelski, Kathryn Rosenblatt, Judy Huang, Chetan Bettegowda, Frederick Sieber, Esther S Oh, Sridevi V Sarma, Ozan Akca, Nicholas Theodore, Tej D Azad

Postoperative delirium (POD) is a complication of surgery in older adults associated with adverse outcomes. Current screening methods demonstrate poor interrater reliability, and conventional electroencephalography (EEG)-based screening requires intensive setup. Point-of-care (POC) EEG technology offers a rapid and objective alternative that may capture neurophysiological signatures of delirium risk. When combined with baseline and perioperative variables, POC EEG may enable the prediction of POD before clinical manifestation. In this study, we aim to develop a POD prediction model using POC EEG as well as explore secondary outcomes such as longer-term cognitive impairment and postoperative pain. This is a prospective cohort study enrolling older adults (≥60 years) undergoing elective non-cranial inpatient surgery at two academic hospitals. The target cohort size is 150 participants, determined by an events-per-parameter approach. All participants undergo baseline cognitive testing and pain assessment using the Montreal Cognitive Assessment (MoCA) and Numeric Rating Scale. The primary outcome is POD, while secondary outcomes include follow-up MoCA scores and postoperative pain scores. POD is assessed immediately after surgery and every 12 h during the admission with the 4AT tool. Perioperative EEG is acquired using the Ceribell EEG system (Ceribell, Inc.) across standardized preoperative, intraoperative, and postoperative phases. EEG features such as spectral power, alpha/delta ratio, and burst suppression ratio are analyzed in relation to outcomes. Predictive models will be developed using regularized logistic regression with nested feature sets, and model performance will be evaluated. This study evaluates whether POC EEG can accurately predict POD in older adults undergoing elective surgery, as well as longer-term cognitive impairment and postoperative pain. This approach could enable early identification of high-risk patients and facilitate targeted preventive strategies. By generating a validated risk model, multimodal exploratory analyses, and openly available datasets, this work aims to advance the practical management of perioperative outcomes.
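One of the EEG features named in the abstract, the alpha/delta ratio, can be sketched as a ratio of band powers from a simple periodogram. The band edges (delta 1 to 4 Hz, alpha 8 to 13 Hz), the sampling rate, and the synthetic signal are assumptions chosen for illustration, and `band_power` and `alpha_delta_ratio` are hypothetical names. The study's actual feature extraction is not described in the abstract.

```python
import numpy as np

def band_power(signal, fs, low, high):
    """Sum of periodogram power in the half-open band [low, high) Hz."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal - np.mean(signal))) ** 2
    mask = (freqs >= low) & (freqs < high)
    return power[mask].sum()

def alpha_delta_ratio(signal, fs, alpha=(8.0, 13.0), delta=(1.0, 4.0)):
    """Ratio of alpha-band power to delta-band power."""
    return band_power(signal, fs, *alpha) / band_power(signal, fs, *delta)

# Synthetic 10 s trace: a 10 Hz (alpha) component with twice the amplitude
# of a 2 Hz (delta) component, so the power ratio should be about 4.
fs = 250.0
t = np.arange(2500) / fs
signal = 2.0 * np.sin(2 * np.pi * 10.0 * t) + 1.0 * np.sin(2 * np.pi * 2.0 * t)
adr = alpha_delta_ratio(signal, fs)
```

A real recording would normally call for windowing and averaging (for example Welch's method) instead of a single raw periodogram.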

Citations: 0
A hybrid framework for disease biomarker discovery in microbiome research combining Bayesian networks, machine learning, and network-based methods.
IF 1.3 Q3 BIOCHEMICAL RESEARCH METHODS Pub Date : 2025-12-13 eCollection Date: 2026-01-01 DOI: 10.1093/biomethods/bpaf089
Rosa Aghdam, Shan Shan, Richard Lankau, Claudia Solís-Lemus

Microbiome research faces two central challenges, namely constructing reliable networks, where nodes represent microbial taxa and edges represent their associations, and identifying significant disease-associated taxa. To address the first challenge, we developed CMIMN, a novel R package that applies a Bayesian network framework based on conditional mutual information to infer microbial interaction networks. To further enhance reliability, we construct a consensus microbiome network by integrating results from CMIMN and three widely used methods, including Sparse Inverse Covariance Estimation for Ecological Association Inference (SPIEC-EASI), Semi-Parametric Rank-based correlation and partial correlation Estimation (SPRING), and Sparse Correlations for Compositional Data (SPARCC). This consensus approach, which overlays and weights edges shared across methods, reduces inconsistencies and provides a more biologically meaningful view of microbial relationships. To address the second challenge, we designed a multi-method feature selection framework that combines machine learning with network-based strategies. Our machine learning pipeline applies distinct algorithms and identifies key taxa based on their consistent importance across models. Complementing this, we employ two network-based strategies that prioritize taxa based on centrality differences between networks constructed from healthy samples and disease-affected samples, as well as a composite scoring system that ranks nodes using integrated network metrics. We applied CMIMN to soil microbiome data from potato fields affected by common scab disease. Bootstrap analysis confirmed the robustness of CMIMN, and the consensus network further improved stability and interpretability. The multi-method framework enhances confidence in identifying soil microbial taxa associated with potato disease. Notably, we identified Bacteroidota, WPS-2, and Proteobacteria at the Phylum level; Actinobacteria, AD3, Bacilli, Anaerolineae, and Ktedonobacteria at the Class level; and C0119, Defluviicoccales, Bacteroidales, and Ktedonobacterales at the Order level as key taxa associated with disease status.
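The consensus step described in the abstract (overlaying edges from several inference methods and weighting those that are shared) could look roughly like the sketch below. The weighting rule (fraction of methods reporting an edge), the `min_support` threshold, the function name `consensus_edges`, and the toy edge lists are all assumptions. The abstract does not specify how the package weights edges.

```python
from collections import Counter

def consensus_edges(networks, min_support=2):
    """Overlay undirected edge lists from several methods.

    Returns {edge: weight}, where weight is the fraction of methods that
    reported the edge; edges seen by fewer than `min_support` methods are dropped.
    """
    counts = Counter()
    for edges in networks.values():
        # Sort endpoints so (a, b) and (b, a) count as the same undirected edge;
        # the set ensures each method votes at most once per edge.
        counts.update({tuple(sorted(edge)) for edge in edges})
    n_methods = len(networks)
    return {edge: c / n_methods for edge, c in counts.items() if c >= min_support}

# Hypothetical edge lists; the taxon names are placeholders.
networks = {
    "CMIMN": [("taxonA", "taxonB"), ("taxonB", "taxonC")],
    "SPIEC-EASI": [("taxonB", "taxonA"), ("taxonC", "taxonD")],
    "SPRING": [("taxonA", "taxonB"), ("taxonB", "taxonC")],
    "SPARCC": [("taxonA", "taxonB")],
}
consensus = consensus_edges(networks)
```

Here the edge between `taxonA` and `taxonB` is supported by all four methods and receives weight 1.0, while the edge reported by a single method is discarded.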

Citations: 0
Multilevel predictors categorization for post-CABG atrial fibrillation prediction.
IF 1.3 Q3 BIOCHEMICAL RESEARCH METHODS Pub Date : 2025-12-12 eCollection Date: 2026-01-01 DOI: 10.1093/biomethods/bpaf092
Karina I Shakhgeldyan, Vladislav Y Rublev, Nikita S Kuksin, Boris I Geltser, Regina L Pak

Postoperative atrial fibrillation (PoAF) is a common complication after coronary artery bypass grafting (CABG). Despite its association with increased risk of ischemic stroke, bleeding, acute renal failure, and mortality, there is still no ideal predictive tool with proper clinical interpretability. A retrospective single-center cohort study analyzed 1305 electronic medical records of patients who underwent elective isolated CABG. PoAF was identified in 280 (21.5%) patients. Prognostic models with continuous variables were developed using multivariate logistic regression (MLR), random forest, and eXtreme gradient boosting (XGB) methods. Predictors were dichotomized via grid search for optimal cut-off points, centroid calculation, and Shapley additive explanation (SHAP). For multilevel categorization, we proposed combining the threshold values identified during dichotomization and ranking cut-off thresholds by MLR weighting coefficients (the multimetric categorization method). Based on multistage selection, nine PoAF predictors were identified and validated. After categorization, prognostic models with continuous and multilevel categorical variables were developed. The best XGB model employing continuous predictors achieved an AUC of 0.76. Models whose predictors were derived with the multimetric categorization approach showed comparable predictive performance (AUC = 0.758). The main advantage of models with multilevel predictor categorization was their superior explainability and clinical interpretability in predicting PoAF. Multilevel predictor categorization is a promising tool for improving the explainability of predictive estimates of PoAF development. Using the developed prognostic models, it was demonstrated that the categorization procedures proposed by the authors ensure both high predictive accuracy and transparency of the generated clinical conclusions.
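The first dichotomization route mentioned in the abstract, a grid search for an optimal cut-off, can be sketched as follows. The abstract does not say which criterion was optimised, so Youden's J (sensitivity + specificity − 1) is used here purely as an assumption. The function names and the toy predictor values and labels are hypothetical and are not study data.

```python
def youden_j(values, labels, cutoff):
    """Youden's J when 'value >= cutoff' is treated as a positive prediction."""
    tp = sum(1 for v, y in zip(values, labels) if v >= cutoff and y == 1)
    fn = sum(1 for v, y in zip(values, labels) if v < cutoff and y == 1)
    tn = sum(1 for v, y in zip(values, labels) if v < cutoff and y == 0)
    fp = sum(1 for v, y in zip(values, labels) if v >= cutoff and y == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0

def best_cutoff(values, labels):
    """Search every observed value as a candidate cut-off and keep the best."""
    candidates = sorted(set(values))
    return max(candidates, key=lambda c: youden_j(values, labels, c))

# Hypothetical continuous predictor and binary outcome (1 = event).
values = [52, 55, 58, 61, 63, 66, 70, 74]
labels = [0, 0, 1, 0, 1, 1, 1, 1]
cutoff = best_cutoff(values, labels)
```

On this toy data the search selects a cut-off of 63. Repeating the search with other criteria would yield the additional thresholds that a multilevel scheme combines.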

Citations: 0
refineDLC: An advanced post-processing pipeline for DeepLabCut outputs.
IF 1.3 Q3 BIOCHEMICAL RESEARCH METHODS Pub Date : 2025-12-04 eCollection Date: 2025-01-01 DOI: 10.1093/biomethods/bpaf084
Weronika Klecel, Hadley Rahael, Samantha A Brooks

DeepLabCut has transformed behavioral and locomotor research by enabling markerless pose estimation through deep learning. Despite its broad adoption across species and behaviors, quantitative kinematic analyses remain limited by noisy outputs and the computational expertise required for refinement. To address this issue, we introduce refineDLC, a comprehensive post-processing pipeline that streamlines the conversion of noisy DeepLabCut outputs into robust, analytically reliable kinematic data. The pipeline incorporates essential cleaning steps, including inversion of the y-coordinates for intuitive spatial interpretation, removal of zero-value frames, and exclusion of irrelevant body part labels. It further applies dual-stage filtering based on likelihood scores and positional changes, enhancing data accuracy and consistency. Multiple interpolation strategies manage missing values while maintaining data continuity and integrity. We evaluated refineDLC using two datasets: controlled locomotion in cattle and field-recorded trotting horses. Across both contexts, the pipeline substantially improved data quality and interpretability, reducing variability, eliminating false-positive labeling errors, and transforming noisy trajectories into physiologically meaningful kinematic patterns. Outputs were reliable and analysis-ready regardless of recording conditions or species. By simplifying the transformation from raw DeepLabCut outputs to meaningful kinematic insights, refineDLC expands accessibility for researchers, particularly those with limited programming expertise, enabling precise quantitative analyses at scale. Future developments may incorporate adaptive filtering algorithms and real-time quality assessments, further optimizing performance and automation. These enhancements will extend the pipeline's applicability to precision phenotyping, behavioral ecology, animal science, and conservation biology.
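The cleaning steps listed in the abstract (y-inversion, zero-frame removal, likelihood and positional-change filtering, interpolation) can be sketched for a single body part as follows. This is not the refineDLC code or its API: the function names, the thresholds `min_likelihood` and `max_jump`, and the choice of negating y and of linear interpolation are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Frame = Tuple[float, float, float]      # (x, y, likelihood) for one body part
Point = Optional[Tuple[float, float]]   # cleaned (x, y), or None if rejected


def clean_track(frames: List[Frame],
                min_likelihood: float = 0.6,   # hypothetical threshold
                max_jump: float = 50.0) -> List[Point]:  # hypothetical, in pixels
    """Invert y, drop zero-value frames, then filter by likelihood and by jump size."""
    cleaned: List[Point] = []
    last_kept: Point = None
    for x, y, likelihood in frames:
        # Stage 0/1: zero-value frames and low-likelihood points are rejected.
        if (x == 0 and y == 0) or likelihood < min_likelihood:
            cleaned.append(None)
            continue
        point = (x, -y)  # assumed inversion: negate y so that "up" is positive
        # Stage 2: reject points that jump too far from the last accepted point.
        if last_kept is not None:
            dx, dy = point[0] - last_kept[0], point[1] - last_kept[1]
            if (dx * dx + dy * dy) ** 0.5 > max_jump:
                cleaned.append(None)
                continue
        cleaned.append(point)
        last_kept = point
    return cleaned


def interpolate_linear(track: List[Point]) -> List[Point]:
    """Fill interior gaps linearly; leading and trailing gaps stay None."""
    filled = list(track)
    known = [i for i, p in enumerate(track) if p is not None]
    for left, right in zip(known, known[1:]):
        (x0, y0), (x1, y1) = track[left], track[right]
        for i in range(left + 1, right):
            t = (i - left) / (right - left)
            filled[i] = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    return filled


# Made-up frames: one zero-value frame, one implausible jump, one low-likelihood point.
raw = [(10.0, 20.0, 0.90), (0.0, 0.0, 0.00), (12.0, 24.0, 0.95),
       (500.0, 500.0, 0.99), (14.0, 28.0, 0.20), (16.0, 32.0, 0.90)]
cleaned = clean_track(raw)
filled = interpolate_linear(cleaned)
print(cleaned)
print(filled)
```

The jump filter here compares each point with the last accepted one regardless of how many frames lie between them, which is a simplification; a per-frame velocity limit would scale `max_jump` by the gap length.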

Citations: 0
Development and characterization of a pentylenetetrazol-induced convulsive seizure model in non-anaesthetized sheep.
IF 1.3 Q3 BIOCHEMICAL RESEARCH METHODS Pub Date : 2025-12-01 eCollection Date: 2025-01-01 DOI: 10.1093/biomethods/bpaf086
Ruslan V Pustovit, Yugeesh R Lankadeva, Ming S Soh, Sam F Berkovic, Christopher A Reid, Clive N May

The pathophysiology of seizures is complex and could contribute to a range of morbidities, including sudden unexpected death in epilepsy (SUDEP). A better understanding of seizure-induced pathophysiology can lead to the development of targeted interventions. Here, we describe the development and characterization of a novel large mammalian model of convulsive seizures in non-anaesthetized sheep induced by pentylenetetrazol (PTZ), one of the most widely used proconvulsant drugs in epilepsy research. A dose of intravenous PTZ that reliably induced a reproducible and consistent level of seizure in non-anaesthetized sheep was determined. Convulsive seizures went through a relatively predictable sequence, similar to that seen in other animal models of epilepsy. A species-specific seizure severity scale, based on Racine's scale, which is widely used in epilepsy research, was designed to provide a user-friendly scoring system for PTZ-induced seizures in sheep. We demonstrated that convulsive seizures caused substantial increases in mean arterial pressure and heart rate. The translational value of this large animal model can be further enhanced when combined with other translational tools such as quantitative systems physiology and pharmacology, potential biomarker testing and experimental preclinical trials of potential prophylactic treatments. An advanced animal model, such as the one described in this study, provides a unique opportunity for comprehensive physiological monitoring of neural and systemic pathways activated by interictal and ictal activity and can contribute to the development of preventive therapies for seizures.

Citations: 0
Identifying escaped farmed salmon from fish scales using deep learning.
IF 1.3 Q3 BIOCHEMICAL RESEARCH METHODS Pub Date : 2025-11-26 eCollection Date: 2025-01-01 DOI: 10.1093/biomethods/bpaf078
Malte Willmes, Anders Varmann Aamodt, Børge Solli Andreassen, Lina Victoria Tuddenham Haug, Enghild Steinkjer, Gunnel M Østborg, Gitte Løkeberg, Peder Fiske, Geir R Brandt, Terje Mikalsen, Arne Siversten, Magnus Moustache, June Larsen Ydsti, Bjørn Florø-Larsen

Escaped farmed salmon are a major concern for wild Atlantic salmon (Salmo salar) stocks in Norway. Fish scale analysis is a well-established method for distinguishing farmed from wild fish, but the process is labor- and time-intensive. Deep learning has recently been shown to automate this task with high accuracy, though typically on relatively small and geographically limited datasets. Here, we train and validate a new convolutional neural network on nearly 90 000 scale images from two national archives, encompassing heterogeneous imaging protocols, hundreds of rivers, and time series extending back to the 1930s. The model achieved an F1 score of 0.95 on a large, independent test set, with predictions closely matching both genetic reference samples and known farmed-origin scales. By developing and testing this new model on a large and diverse dataset, we demonstrate that deep learning generalizes robustly across ecological and methodological contexts, supporting its use as a validated, large-scale tool for monitoring escaped farmed salmon.
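For readers unfamiliar with the metric quoted above, the F1 score is the harmonic mean of precision and recall, which reduces to 2TP / (2TP + FP + FN). The short sketch below computes it from confusion-matrix counts; the function name and the counts are illustrative only and are not the study's data.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined here as 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Made-up counts, e.g. treating "farmed" as the positive class.
print(f1_score(tp=8, fp=2, fn=2))
```

Because true negatives do not enter the formula, the score depends on which class is treated as positive, which matters when farmed and wild scales are unevenly represented.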

Citations: 0
A systematic review of the application of computational grounded theory method in healthcare research.
IF 1.3 Q3 BIOCHEMICAL RESEARCH METHODS Pub Date : 2025-11-21 eCollection Date: 2025-01-01 DOI: 10.1093/biomethods/bpaf088
Ravi Shankar, Fiona Devi, Xu Qian

The integration of computational methods with traditional qualitative research has emerged as a transformative paradigm in healthcare research. Computational Grounded Theory (CGT) combines the interpretive depth of grounded theory with computational techniques including machine learning and natural language processing. This systematic review examines CGT application in healthcare research through analysis of eight studies demonstrating the method's utility across diverse contexts. Following a systematic search across five databases and PRISMA-aligned screening, eight papers applying CGT in healthcare were analyzed. Studies spanned COVID-19 risk perception, medical AI adoption, mental health interventions, diabetes management, women's health technology, online health communities, and social welfare systems, employing computational techniques including Latent Dirichlet Allocation (LDA), sentiment analysis, word embeddings, and deep learning algorithms. Results demonstrate CGT's capacity for analyzing large-scale textual data (100 000+ documents) while maintaining theoretical depth, with consistent reports of enhanced analytical capacity, latent pattern identification, and novel theoretical insights. However, challenges include technical complexity, interpretation validity, resource requirements, and the need for interdisciplinary expertise. CGT represents a promising methodological innovation for healthcare research, particularly valuable for understanding complex healthcare phenomena, patient experiences, and technology adoption, though the small sample size (8 of 892 screened articles) reflects its nascent application in healthcare and limits generalizability. Future research should focus on standardizing methodological procedures, developing best practices, expanding applications, and addressing accessibility barriers.

Citations: 0