Pub Date: 2026-03-03 DOI: 10.64898/2026.03.01.26347356
Peyton L Coleman, Jeffrey Annis, Hiral Master, Daniel E Gustavson, Lide Han, Evan Brittain, Douglas M Ruderfer
Background: As sleep data from wearable devices become increasingly available in health research, there are new opportunities to understand sleep regulation behaviors as modifiable risk factors for disease. At this scale (tens of thousands of people and millions of day-level observations), it is challenging to prioritize and interpret sleep behaviors while maintaining biological relevance and modifiability. In this work, we address this challenge by proposing a framework that interprets Fitbit data through a well-known neurobiological model of sleep regulation, the two-process model.
Methods: We use data from the All of Us Research Program, a national biobank with passively collected Fitbit data for 32,292 people across 15,754,893 total days. We map Fitbit behaviors (b) to either circadian (C) or homeostatic (S) processes. Using iterative exploratory factor analysis to obtain weights, the Fitbit C_b and S_b behaviors are then weighted at the level of each day to create C_b and S_b scores.
Findings: C_b and S_b scores were found to align with expected real-world relationships with age, seasonality, shift work, and napping. C_b and S_b scores were interpreted in relation to depression, where it was found that S_b scores are highly associated with likelihood of diagnosis (OR = 1.5, p < 2e-16) while C_b and S_b scores are equally associated with severity (S_b score β = 0.2, C_b score β = 0.21, p < 2e-16).
Interpretation: C_b and S_b scores support longitudinal interpretation (e.g., changes in S_b around treatment), aggregation (e.g., differences in C_b between two groups), and actionable modification (e.g., reduce naps to improve poor S_b). Overall, our behavior scores allow for interpretation of wearables sleep data and can be utilized across many disease contexts to better understand how sleep influences health.
Funding: This work was supported by NIH training grant T32GM145734 and NIH R21HL172038.
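The day-level weighting described in the Methods can be illustrated with a minimal sketch. It assumes behavior values have already been standardized and that weights (for example, factor loadings) are in hand. The function name, behavior names, and weights below are hypothetical placeholders, not values or code from the study.

```python
def day_score(behaviors, weights):
    """Weighted sum of standardized behavior values for one day.

    behaviors: dict mapping behavior name -> standardized value for that day
    weights:   dict mapping behavior name -> weight (e.g. a factor loading)
    All names and numbers passed in here are illustrative assumptions.
    """
    missing = set(weights) - set(behaviors)
    if missing:
        raise ValueError(f"missing behaviors: {sorted(missing)}")
    return sum(weights[name] * behaviors[name] for name in weights)


# Hypothetical homeostatic (S) behaviors and weights for a single day.
s_weights = {"nap_minutes": 0.5, "sleep_duration": -0.25}
s_day = {"nap_minutes": 2.0, "sleep_duration": -1.0}

# 0.5 * 2.0 + (-0.25) * (-1.0)
s_b_score = day_score(s_day, s_weights)
```

Keeping the score as a plain weighted sum per day is what would make the longitudinal and group-level comparisons mentioned in the Interpretation straightforward, since daily scores can be averaged or differenced directly.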
Title: Making sleep behaviors interpretable: adapting the two-process model of sleep regulation to longitudinal Fitbit sleep and activity behaviors for health insights.
Journal: medRxiv: the preprint server for health sciences. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13004143/pdf/
Pub Date: 2026-03-03 DOI: 10.64898/2026.03.02.26347449
Jeremy M Lawrence, Sophie Breunig, Lukas S Schaffer, Alexander Sheppard, Katerina Zorina-Lichtenwalter, Andrew D Grotzinger
Major depression (MD) is a disorder class that exhibits substantial phenotypic and clinical heterogeneity, yet many large-scale molecular genetic investigations treat MD as a unitary outcome. Here, we applied Genomic Structural Equation Modeling (Genomic SEM) to characterize the genetic variation in two clinically relevant MD subtypes, childhood-onset (child-onset) and treatment-resistant MD, that are independent of the field-standard GWAS of MD in all its forms. In addition, we fit a complementary "boosting" model that leveraged shared signal across the subtype and general MD GWAS to increase power for subtype biological discovery. At the genome-wide level, more than half of the common-variant liability for child-onset and treatment-resistant MD was unique relative to the general MD GWAS, indicating substantial subtype-specific genetic architecture. Unique components of both subtypes showed robust associations with genetic liability for schizophrenia and bipolar disorder, and the child-onset specific component exhibited genome-wide overlap with early developmental outcomes, including autism spectrum disorder and childhood intelligence. Transcriptome-wide analyses implicated upregulation of SMIM19 in liability specific to child-onset MD, while stratified functional enrichment highlighted gene sets involved in limbic and frontal brain systems for the boosted child-onset component. Together, these findings demonstrate that MD contains biologically distinct subtypes that exhibit etiological divergences more akin to separate disorders than subtypes within a concrete diagnostic framework. We find that stratifying MD by biologically distinguishable subtypes may be crucial for enhancing biological discovery and elucidating etiological pathways in molecular genetic studies of depression.
Title: Genetic Signal Augmentation of Childhood-Onset and Treatment-Resistant Major Depression Reveals Distinct Biological Disorders.
Journal: medRxiv: the preprint server for health sciences. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13004145/pdf/
Pub Date: 2026-03-03 DOI: 10.64898/2026.03.02.26347392
Patricio Solis-Urra, Marcos Olvera-Rojas, Yolanda García-Rivero, Xuemei Zeng, Yijun Chen, Anuradha Sehrawat, Mahnaz Shekari, Lauren E Oberlin, Kirk I Erickson, Thomas K Karikari, Manuel Gómez-Río, Francisco B Ortega, Irene Esteban-Cornejo
We examined whether a 24-week resistance training program influenced brain amyloid-β (Aβ) and Alzheimer's Disease (AD)-related blood-based biomarkers. Ninety cognitively normal, physically inactive older adults aged 65-80 years were randomly allocated to a 24-week resistance training program (three ∼60-min supervised sessions/week) or a wait-list control group. Primary analyses assessed exercise-induced changes in brain Aβ (Centiloid values) and plasma ptau217/Aβ1-42 IPMS ratio. Secondary analyses examined ptau217/Aβ42 SIMOA ratio, ptau217, ptau181 and Aβ42/40, as well as potential interactions with sex, age, education, apolipoprotein ε4 (APOE4) status, amyloid PET-positivity, and comorbidities. The intervention produced no significant differences in brain Aβ or AD-related blood-based biomarkers (p > 0.05) compared to the control group. However, the ptau217/Aβ1-42 IPMS ratio showed a small, non-significant increase in the control group (SMD = 0.162; 95% CI: -0.159 to 0.483) while remaining stable in the exercise group (SMD = 0.01; 95% CI: -0.291 to 0.310), with a similar trend for ptau217/Aβ42 SIMOA. Moderator analyses indicated differential responses by amyloid PET-positivity and APOE4 status on brain Aβ (p for interaction < 0.05), with increases observed in APOE4 carriers and amyloid PET-positive individuals in the control group, whereas levels decreased in those allocated to the exercise intervention. The specificity observed within our subgroups suggests that resistance exercise may serve as a targeted intervention to modulate AD pathophysiology, raising new questions regarding its broader role in delaying the disease in vulnerable populations.
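As a reading aid for the reported standardized mean differences, the sketch below checks whether each 95% confidence interval excludes zero and backs out an approximate standard error. It assumes the intervals are symmetric normal-approximation intervals (z = 1.96), which the abstract does not state; the function names are illustrative.

```python
def se_from_ci(lower, upper, z=1.96):
    """Approximate standard error implied by a symmetric CI.

    Assumes a normal-approximation interval: estimate +/- z * SE.
    """
    return (upper - lower) / (2 * z)


def ci_excludes_zero(lower, upper):
    """True when the whole interval lies on one side of zero."""
    return lower > 0 or upper < 0


# 95% CIs for the ptau217/Abeta1-42 IPMS ratio SMDs quoted in the abstract.
intervals = {
    "control": (-0.159, 0.483),
    "exercise": (-0.291, 0.310),
}

summary = {
    group: {
        "se": se_from_ci(lower, upper),
        "excludes_zero": ci_excludes_zero(lower, upper),
    }
    for group, (lower, upper) in intervals.items()
}
```

Both intervals span zero, which is consistent with the abstract describing the within-group changes as non-significant.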
Title: Effects of a 24-week resistance exercise program on brain amyloid and Alzheimer's disease blood-based biomarkers: the AGUEDA randomized controlled trial.
Journal: medRxiv: the preprint server for health sciences. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13004095/pdf/
Pub Date: 2026-03-03 DOI: 10.64898/2026.03.02.26347451
Silvia Miramontes, Erin L Ferguson, Scott Zimmerman, Evan Phelps, Boris Oskotsky, Tomiko T Oskotsky, John A Capra, Elena Tsoy, Marina Sirota, M Maria Glymour
Background and objectives: Progression from mild cognitive impairment (MCI) to Alzheimer's Disease and Related Dementias (AD/ADRD) varies widely across individuals, yet the mechanisms underlying this heterogeneity remain unclear. Identifying clinical and social determinants influencing this transition could enable earlier intervention. While cardiovascular and social risk factors are established contributors to dementia incidence, their role in progression from MCI to dementia may differ. Few studies using real world clinical data have evaluated these potential determinants of MCI progression.
Methods: Using electronic health records (EHR) from patients with incident MCI at UCSF Health (2010-2024), we evaluated cardiovascular (blood pressure [BP], body mass index [BMI], and type II diabetes) and social (marital status, language preference, race/ethnicity, and neighborhood disadvantage) risk factors for rate of progression from MCI to AD/ADRD. Covariate-adjusted Cox proportional hazards models estimated hazard ratios for incident AD/ADRD, with evaluation of interactions by sex.
Results: Among 6,529 patients, higher systolic BP was associated with AD/ADRD incidence (HR per 10 mmHg: 1.09; 95% CI: 1.05-1.14). BMI was inversely associated with incidence in both males (HR: 0.94; 95% CI: 0.92-0.97) and females (HR: 0.98; 95% CI: 0.96-0.99). Compared to married individuals, widowed patients had a higher hazard of progression (HR: 1.15; 95% CI: 1.00-1.32). Spanish-speaking (HR: 1.38; 95% CI: 1.04-1.81), Chinese-speaking (HR: 1.19; 95% CI: 1.00-1.42), and "Other non-English" speaking patients (HR: 1.24; 95% CI: 1.03-1.51) had a higher hazard of progression compared to English speakers. Latinx (HR: 1.22; 95% CI: 1.01-1.48) and Asian patients (HR: 1.14; 95% CI: 1.00-1.30; p = 0.04) also had higher hazards of progression compared to White patients. Neighborhood disadvantage was not significantly associated with disease progression.
Discussion: Cardiovascular and social factors independently influence dementia progression, with some sex-specific patterns. Integrating clinical and social indicators highlights the potential of EHR data to identify high-risk patients earlier in the care continuum and support equitable dementia prevention.
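To make the "HR per 10 mmHg" figure easier to interpret, the sketch below rescales a hazard ratio to a different increment. It assumes the covariate enters the Cox model as a linear term on the log-hazard scale, so the hazard ratio for k units is the per-unit ratio raised to the power k; whether systolic BP was modeled this way in the study is an assumption, and the function name is illustrative.

```python
import math


def rescale_hazard_ratio(hr, from_units, to_units):
    """Convert a hazard ratio reported per `from_units` to one per `to_units`.

    Relies on log-linearity: log(HR) scales in proportion to the increment.
    """
    return math.exp(math.log(hr) * to_units / from_units)


# Reported estimate and CI bounds for systolic BP, per 10 mmHg.
reported = {"hr": 1.09, "ci_lower": 1.05, "ci_upper": 1.14}

# The same association expressed per 20 mmHg (each value is squared).
per_20_mmhg = {
    key: rescale_hazard_ratio(value, from_units=10, to_units=20)
    for key, value in reported.items()
}
```

Under that assumption, a 20 mmHg difference corresponds to a hazard ratio of about 1.19 (1.09 squared), with the interval bounds rescaled the same way.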
Title: Social and Cardiovascular Risk Factors as Predictors of the Progression from Mild Cognitive Impairment to Dementia in a Large EHR Database.
Journal: medRxiv: the preprint server for health sciences. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12976900/pdf/
Computational growth and remodeling (G&R) models have been extensively used to investigate abdominal aortic aneurysm (AAA) progression and to support clinical decision-making. However, the development of robust predictive models is often limited by the scarcity of large-scale longitudinal imaging datasets. In this study, we propose a physics-based G&R framework to simulate AAA shape evolution and generate a virtual cohort of aneurysms, thereby addressing data limitations and enabling integration with data-driven machine learning approaches for growth prediction. The proposed arterial G&R model incorporates key mechanisms influencing aneurysm progression, including elastin degradation and stress-mediated collagen production. A modified elastin degradation formulation was introduced to generate realistic aneurysm geometries exhibiting clinically relevant features such as asymmetry and tortuosity. By systematically varying parameters governing elastin damage and collagen production, 200 distinct G&R simulations were performed to produce a diverse set of AAA geometries. The dataset was further expanded using kriging-based spatial interpolation to construct a large in silico cohort. The synthetic dataset, combined with longitudinal imaging data from 25 patients, was used to train and validate four machine learning models: Deep Belief Network (DBN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). A two-step training strategy was adopted to predict maximum aneurysm diameter and growth rate based on prior geometric characteristics. The LSTM model achieved the highest performance for maximum diameter prediction (R² = 0.92), while the RNN demonstrated strong overall performance (R² = 0.90 for maximum diameter and 0.89 for growth rate). The DBN and GRU models also showed competitive predictive capability. Overall, this study demonstrates that integrating physics-based G&R simulations with machine learning enables accurate prediction of AAA growth and maximum diameter. The proposed framework provides a scalable strategy for augmenting limited clinical datasets and offers a promising tool to support personalized risk assessment and treatment planning.
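The R² values quoted above summarize how closely predictions track observations. A minimal sketch of the usual coefficient of determination follows; the abstract does not state which R² definition was used, so this is an assumption, and the diameters below are made-up illustrative numbers rather than study data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1 - ss_res / ss_tot


# Hypothetical maximum diameters in mm (illustration only).
observed_mm = [40.0, 45.0, 50.0, 55.0]
predicted_mm = [41.0, 44.0, 51.0, 54.0]

# SS_res = 4, SS_tot = 125, so R^2 = 1 - 4/125
r2 = r_squared(observed_mm, predicted_mm)
```

An R² near 1 means the residual error is small relative to the spread of the observed values, so a model can score well on a cohort with widely varying diameters even when individual errors are not negligible.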
Title: Physics-Based Growth and Remodeling Modeling for Virtual Abdominal Aortic Aneurysm Evolution and Growth Prediction.
Authors: Faeze Jahani, Zhenxiang Jiang, Malikeh Nabaei, Seungik Baek
Pub Date: 2026-03-03 DOI: 10.64898/2026.02.26.26347026
Journal: medRxiv: the preprint server for health sciences. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12976916/pdf/
Pub Date: 2026-03-03 DOI: 10.64898/2026.03.02.26347436
Leela Shah, Elizabeth M Planalp, Ryan McDonald, Caitlin J Regner, Sreevalli Atluru, Andrew L Alexander, Pilar N Ossorio, Julie Poehlmann, Douglas C Dean
Importance: Prenatal cannabis exposure is increasing in prevalence, yet its associations with early brain development, particularly how the timing and frequency of exposure across gestation relate to neonatal brain structure, remain insufficiently understood. Clarifying these associations is essential for informing early risk identification and guiding perinatal care.
Objective: To examine associations between patterns of maternal prenatal cannabis exposure, including exposure presence, gestational timing, and frequency of exposure, and neonatal brain structure and microstructure during the first month of life.
Design, setting, and participants: This cohort study included 1,782 mother-infant dyads (221 with prenatal cannabis exposure [PCE]) from the HEALthy Brain and Child Development Study. Mother-reported prenatal cannabis exposure was assessed using the validated Timeline Follow-back method. Infants underwent natural-sleep magnetic resonance imaging, including T2-weighted structural imaging and diffusion imaging, within the first month of life.
Main outcomes and measures: Associations between prenatal cannabis exposure and regional T2-weighted volumes and diffusion white matter microstructure metrics were examined by (1) exposure presence, (2) gestational timing of exposure, and (3) frequency of exposure within exposed infants.
Results: Any prenatal cannabis exposure was associated with brain volume differences in cerebellar and subcortical limbic regions, including smaller amygdala, thalamic, and cerebellar vermis volumes and larger caudate, hippocampal, and cerebellar cortex volumes. Timing-specific analyses revealed divergent patterns: first trimester exposure was associated with smaller volumes in select regions, whereas exposure that continued into the third trimester was associated with larger volumes in overlapping structures, with additional subcortical volumetric differences observed. White matter microstructure alterations were observed only among infants with exposure that continued into the third trimester. Within the exposed subgroup, higher frequency of cannabis exposure was associated with larger cerebral white matter volumes and microstructural differences in white matter regions.
Conclusions and relevance: In infants with maternal prenatal cannabis exposure, we observed timing- and frequency-dependent differences in brain development within the first month of life. These findings underscore the importance of considering not only the presence of exposure, but also when and how much cannabis is used during pregnancy to support targeted prenatal counseling and early developmental monitoring for exposed infants.
Key points: Question: Is prenatal cannabis exposure associated with brain development in the first month of life? Findings: In a cohort of 1,782 mother-infant dyads, prenatal c
Title: Associations of Prenatal Cannabis Exposure and Neonatal Brain Development in the HBCD Cohort. Journal: medRxiv : the preprint server for health sciences. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12976915/pdf/
Pub Date : 2026-03-03 DOI: 10.64898/2026.03.02.26347453
Jesús Ernesto Martínez-Luna, María Fernanda Suárez-Velázquez, Mario Cesar Torres-Chávez, Guillermo C Cardoso-Saldaña, Juan Reyes-Barrera, Jaime Berumen-Campos, Pablo Kuri-Morales, Roberto Tapia-Conyer, Jesus Alegre-Díaz, Carlos A Fermín-Martínez, Jacqueline A Seiglie, Omar Yaxmehen Bello-Chavolla, Neftali Eduardo Antonio-Villa
Background: Visceral adipose tissue (VAT) has been associated with cardiovascular disease (CVD) mortality. However, the comparative performance of VAT-related clinical surrogates remains poorly characterized.
Objectives: To evaluate the performance of seven VAT-related clinical surrogates for predicting CVD and cause-specific CVD mortality.
Methods: We analyzed data from the Mexico City Prospective Cohort, a population-based prospective cohort study with baseline recruitment between 1998 and 2004 and ongoing mortality follow-up. CVD mortality included deaths from cardiac, stroke-related, and other vascular causes. Seven VAT-related surrogates (METS-VF, CVAI, EVA, DAAT, LAAP, VAI, and DAI) were estimated using clinical, biochemical, and anthropometric data at baseline. Associations with outcomes were evaluated using Cox regression models to estimate adjusted hazard ratios (aHRs). Discrimination was assessed with Harrell's C-statistic (Cs) and receiver operating characteristic (ROC) curves at a fixed 10-year time point, and calibration with slope plots.
Results: In a subsample of 102,385 participants (median age: 47 years; 67% female), 4,068 (3.97%) died from any CVD cause. METS-VF (Cs: 0.722; aHR: 1.17, 95% CI: 1.12-1.23), EVA (Cs: 0.72; 1.14, 1.12-1.23), CVAI (Cs: 0.70; 1.13, 1.09-1.18), and DAAT (Cs: 0.626; 1.13, 1.09-1.18) were positively associated with CVD mortality and showed the highest predictive capacity among the surrogates. Adding METS-VF to a CVD risk score among individuals classified as intermediate risk improved discrimination for CVD mortality.
Conclusions: In this large cohort of Mexican adults, four VAT-related clinical surrogates, particularly METS-VF, demonstrated good discriminatory performance for long-term CVD mortality. These indices could help to identify individuals with high VAT accumulation and high CVD risk in resource-limited settings.
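As an illustration of the discrimination metric named in the Methods, the sketch below computes Harrell's C-statistic from first principles. It is a minimal teaching example, not the study's analysis code: the function name `harrell_c` and the toy follow-up data are assumptions, pairs with tied follow-up times are simply skipped, and a real analysis would normally rely on an established survival-analysis library.

```python
from itertools import combinations


def harrell_c(times, events, scores):
    """Harrell's concordance index for right-censored data.

    times  : follow-up time for each participant
    events : 1 if the event (e.g. CVD death) was observed, 0 if censored
    scores : risk score; a higher score should mean an earlier event
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is usable only if the shorter time ended in an event
        # and the two times are not tied (simplification, see lead-in).
        if times[a] == times[b] or not events[a]:
            continue
        usable += 1
        if scores[a] > scores[b]:
            concordant += 1.0   # higher risk failed first: concordant
        elif scores[a] == scores[b]:
            concordant += 0.5   # tied scores count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical toy data: four participants, one censored at t=15.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.6, 0.7, 0.1]
print(harrell_c(toy_times, toy_events, toy_scores))
```

A value of 0.5 corresponds to chance-level ordering and 1.0 to perfect ordering, which is the scale on which the reported Cs values are read.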
Title: Predictive performance of seven clinical surrogates of visceral adipose tissue for cardiovascular mortality: A sub-analysis of 102,385 adults from the Mexico City Prospective Study. Journal: medRxiv : the preprint server for health sciences. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12976897/pdf/
Pub Date : 2026-03-03 DOI: 10.64898/2026.03.02.26347469
Chao Yan, Wu-Chen Su, Yi Xin, Monika E Grabowska, Vern E Kerchberger, Victor A Borza, Jinlian Wang, Liwei Wang, Rui Li, Jacob Lynn, Alyson L Dickson, Cathy Shyr, QiPing Feng, Charles M Stein, Kai Wang, Peter J Embi, Bradley A Malin, Hongfang Liu, Wei-Qi Wei
Rare diseases affect over 300 million people worldwide, yet patients often endure years-long diagnostic delays that limit timely intervention and trial opportunities. Computational rare disease recognition (RDR) remains constrained by knowledge resources that are often incomplete, heterogeneous, and dependent on extensive multi-disciplinary expert curation that cannot scale. Large language models (LLMs) applied directly for end-to-end diagnosis or disease discrimination face similar knowledge bottlenecks while also raising concerns around cost, reproducibility, and data governance. Here, we introduce GEN-KnowRD, a knowledge-layer-first framework that leverages LLMs to generate schema-guided rare disease profiles, systematically assesses their quality, and constructs a computable knowledge base (PheMAP-RD) for local deployment. GEN-KnowRD integrates this knowledge into lightweight inference pipelines for both general-purpose disease screening and specialized early discrimination from longitudinal electronic health records. Across six public benchmarks for general-purpose screening (9,290 patients spanning 798 rare diseases), GEN-KnowRD significantly improves disease ranking compared to a state-of-the-art, HPO-centered diagnostic framework (up to 345.8% improvement in top-1 success), advanced end-to-end LLM reasoning (up to 129.1% improvement), and a variant of GEN-KnowRD instantiated with expert-curated knowledge rather than LLM-generated profiles. In two real-world cohorts for early diagnosis of idiopathic pulmonary fibrosis (511 patients) as a use case, GEN-KnowRD also demonstrates robust discrimination performance gains, supporting effective RDR during the pre-diagnostic window. 
These findings demonstrate that repositioning LLMs from diagnostic reasoning to the knowledge layer (decoupling knowledge construction from patient-level inference) yields stronger RDR, while providing scalable, continuously updatable, and reusable infrastructure for diagnosis, screening, and clinical research across the rare disease landscape.
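To make the ranking metric concrete, the following sketch shows how top-k success and a percentage improvement over a baseline could be computed. It assumes the reported percentages are relative improvements, which the abstract does not state explicitly; the function names and the toy disease labels are hypothetical and do not come from GEN-KnowRD.

```python
def top_k_success(ranked_lists, truths, k=1):
    """Fraction of patients whose true diagnosis appears in the top k."""
    hits = sum(
        1 for ranked, truth in zip(ranked_lists, truths) if truth in ranked[:k]
    )
    return hits / len(truths)


def relative_improvement_pct(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Hypothetical ranked candidate lists for four patients.
ranked = [["A", "B"], ["B", "A"], ["C", "A"], ["A", "C"]]
truths = ["A", "A", "C", "C"]

top1 = top_k_success(ranked, truths, k=1)
top2 = top_k_success(ranked, truths, k=2)
print(top1, top2, relative_improvement_pct(top2, top1))
```

Under this reading, a method that lifts top-1 success from 10% to 44.58% would be described as a 345.8% improvement, so the headline figure depends strongly on how low the baseline is.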
Title: GEN-KnowRD: Reframing AI for Rare Disease Recognition. Journal: medRxiv : the preprint server for health sciences. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13004138/pdf/
Pub Date : 2026-03-03 DOI: 10.64898/2026.03.01.26347369
Anna Constantino-Pettit, Cassandra Trammel, Arpana Agrawal, Christopher Smyser, Ebony Carter, Ryan Bogdan, Cynthia Rogers
Objective: Cannabis use during pregnancy is increasing; associations with neonatal growth may be confounded by nicotine. We evaluated prenatal cannabis exposure (PreCE) and neonatal outcomes in a prospective cohort with biochemical control for nicotine exposure.
Methods: In the Cannabis Use During Early Life and Development (CUDDEL) study, pregnant women with a lifetime history of cannabis use were classified as PreCE if they self-reported use or had urine THC-COOH positivity at any trimester (n=297) and as unexposed if they reported no use and tested negative (n=151). Linear regression and modified Poisson models estimated associations with birthweight and small for gestational age (SGA; <10th and <5th percentiles), adjusting for sociodemographic factors, gestational age, maternal age and BMI, and urinary cotinine. Analyses were stratified by cannabis use frequency (>weekly vs <monthly) and cotinine status.
Results: Participants (N=448; 18-41 years; 85.3% non-Hispanic Black) had lower birthweight with PreCE in adjusted models (Beta=-0.08; padj=0.041). High-frequency PreCE was associated with lower birthweight compared with unexposed pregnancies (Beta=-0.13; padj=0.03), whereas low-frequency PreCE was not. Cotinine-positive PreCE showed the greatest birthweight reduction versus unexposed (Beta=-0.20; padj<0.001). PreCE was also associated with higher likelihood of SGA <5th percentile; risk was highest in PreCE+Nicotine compared with both unexposed and PreCE-Nicotine groups.
Conclusions: Prenatal cannabis exposure was associated with reduced birthweight and SGA in this cohort. Nicotine co-exposure intensified these associations, yet effects persisted without cotinine, supporting cannabis as an independent perinatal risk factor and emphasizing the value of cotinine assessment in populations where blunt use or secondhand exposure is common.
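Modified Poisson models are typically used to report risk ratios for a binary outcome such as SGA. The sketch below is a deliberately simpler stand-in: a crude, unadjusted risk ratio with a log-scale 95% confidence interval from a 2x2 table. It is not the adjusted model used in the study, and the counts in the usage line are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def risk_ratio(exposed_cases, exposed_total,
               unexposed_cases, unexposed_total, z=1.96):
    """Crude risk ratio with a Wald confidence interval on the log scale."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 20 of 100 exposed vs 10 of 100 unexposed with the outcome.
print(risk_ratio(20, 100, 10, 100))
```

A crude ratio like this cannot separate cannabis from nicotine effects, which is why the study adjusts for urinary cotinine and stratifies by cotinine status.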
Title: Associations Between Prenatal Cannabis Exposure and Birth Outcomes: Results from a Prospective Cohort Study. Journal: medRxiv : the preprint server for health sciences. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13004135/pdf/
Pub Date : 2026-03-03 DOI: 10.64898/2026.03.01.26347258
Chris Sebastian, Mengxin Yu, Jin Jin
Polygenic risk scores (PRSs) have emerged as a valuable tool for genetic risk prediction and stratification in human diseases. Over the past decade, extensive methodological efforts have focused on improving the predictive power of PRS, leading to the development of numerous methods for PRS construction. Benchmarking these methods has thus become essential for guiding future PRS applications. While studies have benchmarked subsets of these methods on specific phenotypes and cohorts, the resulting evidence remains fragmented, and no work has comprehensively assessed the relative performance of the various PRS methods. In this study, we addressed this gap by systematically constructing a PRS method benchmarking database synthesizing published results from 2009 to 2025. We applied a spectral ranking inference framework with uncertainty quantification to rank 14 PRS methods that had been adequately compared against each other in the literature. We constructed rankings using two complementary sources: original method-development studies and applications/benchmarking studies. While the highest-ranked methods (LDpred2 and AnnoPred) and the lowest-ranked method (C+T) were consistently identified from both sources, the relative ordering of most methods showed moderate variability. We further constructed phenotype-specific rankings, providing more detailed insights into the robustness and phenotype-specific strengths of individual methods. Collectively, the overall and phenotype-specific rankings of the PRS methods, along with the curated benchmarking data from the literature, provide a dynamic and practical reference database that can be continually updated with emerging PRS methods and published benchmarking results to guide future PRS applications.
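For readers unfamiliar with spectral ranking, the sketch below implements one well-known variant, Rank Centrality, which scores items by the stationary distribution of a random walk built from pairwise comparison outcomes. Whether the authors' framework uses this exact construction is an assumption; the sketch omits the uncertainty quantification mentioned in the abstract, and the win counts are hypothetical rather than taken from the benchmarking database.

```python
def rank_centrality(wins, iters=1000):
    """Spectral scores from a pairwise win matrix.

    wins[i][j] = number of comparisons in which method i outperformed j.
    Returns a list of scores summing to 1; higher means stronger.
    """
    n = len(wins)
    # Largest number of distinct opponents any method was compared with.
    d_max = max(
        sum(1 for j in range(n) if j != i and wins[i][j] + wins[j][i] > 0)
        for i in range(n)
    )
    # Random walk: from i, move to j in proportion to how often j beat i.
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = wins[i][j] + wins[j][i]
            if i != j and total > 0:
                P[i][j] = (wins[j][i] / total) / d_max
        P[i][i] = 1.0 - sum(P[i])  # remaining mass stays at i
    # Power iteration towards the stationary distribution.
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


# Hypothetical head-to-head counts for three methods (0, 1, 2).
toy_wins = [
    [0, 8, 9],
    [2, 0, 7],
    [1, 3, 0],
]
scores = rank_centrality(toy_wins)
print(sorted(range(3), key=lambda i: -scores[i]))
```

The appeal of this family of methods for literature-derived benchmarks is that it only needs pairwise outcomes, so methods that were never compared directly can still be ordered through shared opponents.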
Title: Constructing a Literature-Derived Database for Benchmarking Polygenic Risk Score Construction Methods with Spectral Ranking Inferences. Journal: medRxiv : the preprint server for health sciences. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12976906/pdf/