Pub Date : 2026-02-02 DOI: 10.64898/2026.01.30.26344972
Praveen Kumar, Fernando Alarid-Escudero, Tanvi V Chiddarwar, David Ulises Garibay-Treviño, Krishna Roy Chowdhury, Prince Peprah, Bruce L Jacobs, Karen M Kuntz, Stella K Kang, Thomas A Trikalinos, Hawre Jalal
Purpose: Bladder cancer is associated with significant morbidity and mortality in the US, with 85,000 new cases and 17,400 deaths expected in 2025. Black patients are more likely than White patients to be diagnosed with bladder cancer at advanced stages, as are female patients compared with male patients. Using a simulation model, we examine whether differences in diagnosis rates by race and sex can explain the observed variability in stage at diagnosis, and we project the outcomes of potential improvements in diagnosis.
Methods: We developed a state transition model for bladder cancer to simulate four cohorts based on sex (males, females) and race (Blacks, Whites) from birth through various health states, including disease-free, preclinical stages (0a/0is - IV), clinical stages (0a/0is - IV), and death (bladder cancer or other cause death). Parameters related to disease onset, progression, and diagnosis were estimated by calibrating the model to race- and sex-specific incidence rates by age, and stage distribution at diagnosis for cases diagnosed between 2015 and 2019 in SEER 17 registry areas. We conducted a scenario analysis to examine the impact of differences in diagnosis rates on stage distribution and life expectancy, assuming that Black males (or females) and White females had diagnosis rates similar to those of White males.
Results: The calibrated model attributes the differences in stage distribution to lower diagnosis rates in White females (hazard ratio [HR] = 0.95, 95% credible interval [CI]: 0.92 - 0.96), Black males (HR = 0.80, 95% CI: 0.75 - 0.81), and Black females (HR = 0.56, 95% CI: 0.53 - 0.58), relative to White males. If diagnosis rates for all demographic groups were similar to those of White males, the expected life span of a 65-year-old bladder cancer patient would increase by 0.2 years for White females (from 13.8 to 13.9 years), 0.6 years for Black males (from 10.6 to 11.1 years), and 1.9 years for Black females (from 10.5 to 12.4 years).
Conclusions: Differences in diagnosis rates of bladder cancer by race and sex explain the observed differences in stage distribution at diagnosis. Targeted interventions aimed at improving diagnosis rates have the potential to substantially improve survival for patients with bladder cancer.
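To make the scenario analysis concrete, here is a minimal Python sketch of a cohort state-transition model in which a diagnosis hazard ratio scales the rates of moving from preclinical to clinical states. It is an illustration under stated assumptions, not the authors' calibrated model: it collapses the stages into "early" and "late", ignores death, and every baseline rate is a hypothetical placeholder. Only the hazard ratios passed in at the end (1.0 and 0.56) come from the abstract.

```python
import math


def rate_to_prob(rate, dt=1.0):
    """Convert a constant hazard rate into a per-cycle transition probability."""
    return 1.0 - math.exp(-rate * dt)


def early_share_at_diagnosis(diagnosis_hr, cycles=60):
    """Share of diagnosed cases that are found at the early stage.

    All baseline rates below are hypothetical, chosen only for illustration.
    """
    onset = 0.002                   # disease-free -> preclinical early
    progression = 0.30              # preclinical early -> preclinical late
    dx_early = 0.20 * diagnosis_hr  # preclinical early -> clinical early
    dx_late = 0.60 * diagnosis_hr   # preclinical late -> clinical late

    healthy, pre_early, pre_late = 1.0, 0.0, 0.0
    diagnosed_early, diagnosed_late = 0.0, 0.0

    for _ in range(cycles):
        new_onset = healthy * rate_to_prob(onset)
        # Progression and diagnosis compete for exits from the early state.
        exit_early = pre_early * rate_to_prob(progression + dx_early)
        to_dx_early = exit_early * dx_early / (progression + dx_early)
        to_late = exit_early - to_dx_early
        to_dx_late = pre_late * rate_to_prob(dx_late)

        healthy -= new_onset
        pre_early += new_onset - exit_early
        pre_late += to_late - to_dx_late
        diagnosed_early += to_dx_early
        diagnosed_late += to_dx_late

    return diagnosed_early / (diagnosed_early + diagnosed_late)


reference = early_share_at_diagnosis(1.0)       # reference group
lower_rate = early_share_at_diagnosis(0.56)     # HR reported for Black females
print(round(reference, 3), round(lower_rate, 3))
```

In this toy setup a lower diagnosis hazard ratio leaves more time for progression before detection, so a smaller share of cases is diagnosed early. That is the mechanism the scenario analysis varies.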
{"title":"Differences in Bladder Cancer Diagnosis by Demographic Factors: A Simulation Modeling Analysis.","authors":"Praveen Kumar, Fernando Alarid-Escudero, Tanvi V Chiddarwar, David Ulises Garibay-Treviño, Krishna Roy Chowdhury, Prince Peprah, Bruce L Jacobs, Karen M Kuntz, Stella K Kang, Thomas A Trikalinos, Hawre Jalal","doi":"10.64898/2026.01.30.26344972","DOIUrl":"https://doi.org/10.64898/2026.01.30.26344972","url":null,"PeriodicalId":94281,"journal":{"name":"medRxiv : the preprint server for health sciences","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12889898/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146168694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-02-02 DOI: 10.64898/2026.01.30.26344887
Jillian L McKee, Sarah M Ruggiero, Kristin Cunningham, JoeyLynn Coyne, Ian McSalley, Michael C Kaufman, Bintou Bane, Torrey Chisari, Jonathan Toib, Carlyn Glatts, Sarah Tefft, Julie M Orlando, Viveknarayanan Padmanabhan, Alexander K Gonzalez, Alicia Harrison, Charlene Woo, Stephanie A Zbikowski, Rency Dhaduk, Johanna Mercurio, Macie McCarthy, Jan H Magielski, Zachary Grinspan, Megan Abbott, Juliet Knowles, Hsiao-Tuan Chao, Katherine Xiong, Elizabeth Berry-Kravis, Sepideh Tabarestani, J Michael Graglia, Kathryn Helde, Virginie McNamar, Charlene Son Rigby, James Goss, Scott Demarest, Andrea Miele, Benjamin Prosser, Michael J Boland, Samuel R Pierce, Ingo Helbig
Objective: STXBP1-Related Disorders (STXBP1-RD) and SYNGAP1-Related Disorder (SYNGAP1-RD) are two common genetic synaptopathies, leading to epilepsy, developmental delay, and intellectual disability. Both STXBP1-RD and SYNGAP1-RD are potential targets for disease-modifying therapies, but there is limited information in the literature describing the natural history of either disorder, which impedes outcome selection for future clinical trials. The objective of this study is to develop a framework to better define and outline the clinical spectrum and longitudinal trajectories of STXBP1-RD and SYNGAP1-RD natural history, including development, behavior, seizure histories, and electrophysiology.
Methods: Here, we describe a protocol, regulatory structure, and supportive preliminary data for multi-center, prospective natural history studies of STXBP1-RD (STARR) and SYNGAP1-RD (ProMMiS). The protocols incorporate gold-standard clinician-assessed outcome measures including the Bayley Scales of Infant and Toddler Development 4th edition, Gross Motor Function Measure-66, and fine motor domains of the Peabody Developmental Motor Scales 3rd Edition, parent reported outcome measures (PROMs), epilepsy histories, and biomarker exploration. To date, the study has enrolled 164 individuals with STXBP1-RD and 159 with SYNGAP1-RD, with ongoing longitudinal assessments every 6 months in a subset of approximately 200 total individuals across both disorders.
Results: Our data support that existing developmental measures are feasible, informative, and show minimal floor or ceiling effects. Furthermore, we demonstrate that medical record-based seizure history reconstruction reveals unique epilepsy trajectories while minimizing burden to families. We observe disease-specific patterns of developmental performance and distinct longitudinal seizure dynamics, highlighting the need for data generation in a gene/disorder-specific manner for clinical trial readiness.
Significance: In summary, we present a feasible natural history protocol with prospective data for two complex neurodevelopmental disorders with natural histories that have previously been incompletely characterized, within a regulatory framework that will support the use of these data to expedite clinical trial development.
Key points: STXBP1-Related Disorder (STXBP1-RD) and SYNGAP1-Related Disorder (SYNGAP1-RD) are two common genetic causes of epilepsy, developmental delay, and intellectual disability. STXBP1-RD and SYNGAP1-RD are potential targets for drug and gene therapy, but there is limited information in the literature describing the natural history of either disorder which impedes the development of possible therapeutics. The
{"title":"A Prospective Natural History Study Protocol for Clinical Trial Readiness in Synaptic Disorders.","authors":"Jillian L McKee, Sarah M Ruggiero, Kristin Cunningham, JoeyLynn Coyne, Ian McSalley, Michael C Kaufman, Bintou Bane, Torrey Chisari, Jonathan Toib, Carlyn Glatts, Sarah Tefft, Julie M Orlando, Viveknarayanan Padmanabhan, Alexander K Gonzalez, Alicia Harrison, Charlene Woo, Stephanie A Zbikowski, Rency Dhaduk, Johanna Mercurio, Macie McCarthy, Jan H Magielski, Zachary Grinspan, Megan Abbott, Juliet Knowles, Hsiao-Tuan Chao, Katherine Xiong, Elizabeth Berry-Kravis, Sepideh Tabarestani, J Michael Graglia, Kathryn Helde, Virginie McNamar, Charlene Son Rigby, James Goss, Scott Demarest, Andrea Miele, Benjamin Prosser, Michael J Boland, Samuel R Pierce, Ingo Helbig","doi":"10.64898/2026.01.30.26344887","DOIUrl":"https://doi.org/10.64898/2026.01.30.26344887","url":null,"PeriodicalId":94281,"journal":{"name":"medRxiv : the preprint server for health sciences","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12889760/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146168712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-02-02 DOI: 10.64898/2026.01.24.26344530
Marzieh Khani, Sheila N Yeboah, Catalina Cerquera-Cleves, Alexandra Kedmi, Bernabe I Bustos, Spencer M Grant, Suleyman Can Akerman, Fulya Akçimen, Paul Suhwan Lee, Paula Reyes-Pérez, Lara M Lange, Hampton Leonard, Mathew J Koretsky, Mary B Makarious, Zachary Schneider, Caroline Jonson, Pin-Shiuan Chen, Yi Wen Tay, Jeffrey D Rothstein, Chin-Hsien Lin, Shen-Yang Lim, Christine Klein, Kalpana Merchant, Niccolò E Mencacci, Dimitri Krainc, Mark R Cookson, Andrew Singleton, Sara Bandres-Ciga
SORL1, the gene encoding the SORLA protein, has arisen as a potential therapeutic target for Alzheimer's disease (AD). Studies suggest that restoring SORLA function or its trafficking pathways, particularly the SORLA-retromer recycling system, may offer a promising strategy to slow or halt AD progression. While both rare and common SORL1 variants have been associated with increased AD risk, recent evidence suggests a potential involvement of SORL1 in other neurodegenerative conditions. This study assessed the contribution of SORL1 genetic variation to the risk of AD, related dementias (RD), and Parkinson's disease (PD) using data from six large-scale biobanks, comprising 15,043 AD, 9,943 RD, and 42,763 PD cases, along with 111,969 controls across 11 ancestries. We identified 53 potentially disease-related SORL1 variants (CADD score > 20, MAC ≥ 2, annotated as protein-altering or splicing, and with the mutated allele present only in cases), including 41 novel and 12 previously reported variants. Three were found across multiple ancestries. Overall, 13 variants were found in AD-related cohorts, 5 in RD cohorts, and 35 in PD cohorts. Association analysis identified 10 nominally significant variants associated with AD and 5 with PD. The replication of multiple SORL1 variants across neurodegenerative diseases and ancestrally diverse populations underscores its potential broad genetic contribution to neurodegeneration and reinforces its relevance across distinct clinical phenotypes. Gene-based burden analysis did not reveal any significant cumulative effect of SORL1 variants in the populations tested. A family-based analysis identified a rare predicted-damaging variant in two East Asian families (11:121478242:G:A, p.R176Q) and two variants in two families of European ancestry (11:121514222:A:C, p.N371T; 11:121545392:G:A, p.V672M) that show some evidence of segregation in PD families. Although these variants were slightly more frequent in unrelated PD cases vs. controls, none of them showed statistically significant enrichment in PD, likely due to their very low frequency. Overall, our results extend the understanding of SORL1 beyond AD, suggesting a broader role in neurodegeneration and emphasizing the need for diverse population studies when evaluating genetic risk.
{"title":"Is <i>SORL1</i> a common genetic target across neurodegenerative diseases?: A multi-ancestry biobank scale assessment.","authors":"Marzieh Khani, Sheila N Yeboah, Catalina Cerquera-Cleves, Alexandra Kedmi, Bernabe I Bustos, Spencer M Grant, Suleyman Can Akerman, Fulya Akçimen, Paul Suhwan Lee, Paula Reyes-Pérez, Lara M Lange, Hampton Leonard, Mathew J Koretsky, Mary B Makarious, Zachary Schneider, Caroline Jonson, Pin-Shiuan Chen, Yi Wen Tay, Jeffrey D Rothstein, Chin-Hsien Lin, Shen-Yang Lim, Christine Klein, Kalpana Merchant, Niccolò E Mencacci, Dimitri Krainc, Mark R Cookson, Andrew Singleton, Sara Bandres-Ciga","doi":"10.64898/2026.01.24.26344530","DOIUrl":"10.64898/2026.01.24.26344530","url":null,"PeriodicalId":94281,"journal":{"name":"medRxiv : the preprint server for health sciences","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12870714/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146128227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-02-02 DOI: 10.64898/2026.01.30.26345233
Hyunmi Choi, Jose Gutierrez, Tian Wang, Minghua Liu, Cheng-Shiun Leu, Sylwia Misiewicz, Jiying Han, Natalie Bello, Mary L Biggs, Emily M Briceño, Adam M Brickman, James F Burke, Ligong Chen, Lisandro D Colantonio, Stefany P Diaz Andino, Mitchell S V Elkind, Annette L Fitzpatrick, Christopher Gonzalez Corona, Alden L Gross, Lei Huang, Emily L Johnson, W Craig Johnson, Deborah A Levine, W T Longstreth, Sofia Pelagalli Maia, Richard P Mayeux, Brian C Petersen, Oluwadamilola Obalana, Dolly Reyes-Dumeyer, Tatjana Rundek, Danurys Sanchez, Steven J Shea, Kevin Strobino, Carolyn W Zhu, Evan L Thacker
Objectives: With the expected demographic shift toward those ≥65 years of age in the United States, late-onset epilepsy (LOE) poses a significant public health issue, yet it has been historically understudied. We are undertaking an effort in the Epilepsy-Cog study to pool individual participant data from six US-based prospective cohort studies. In this paper, we outline the process for ascertaining epilepsy and for harmonizing and pooling individual participant data across the six cohorts.
Methods: The Epilepsy-Cog study includes individual participant data from six US-based longitudinal cohort studies: ARIC, CHS, MESA, NOMAS, REGARDS, and WHICAP. In all cohorts except NOMAS, prevalent and incident epilepsy were ascertained using Medicare claims-based algorithms. In NOMAS, epilepsy cases were identified through cohort-based reporting and medical record review. To perform cross-cohort harmonization of variables, we used the lowest common denominator approach, assigning response categories or value levels in common across all cohorts.
Results: From a total of 68,544 participants across six cohorts, 43,753 participants met eligibility criteria for Epilepsy-Cog. Among them, we identified 551 (1.3%) participants with prevalent epilepsy and 1,500 (3.4%) participants with incident epilepsy. We have harmonized demographic characteristics, health behaviors, vascular risk factors (VRFs), one genetic variable, medication use, subjective health status measures, incident events, and cause-of-death variables.
Conclusion: The Epilepsy-Cog pooled cohort of 43,753 participants with and without epilepsy, combined with harmonized demographic, VRF, and event data, offers a unique resource to yield new insights into LOE.
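The "lowest common denominator" harmonization described in the Methods can be sketched in Python as a per-cohort mapping onto the coarsest set of categories that every cohort can supply. The cohort labels, the smoking variable, and its codings below are hypothetical. They are not taken from ARIC, CHS, MESA, NOMAS, REGARDS, or WHICAP.

```python
# Hypothetical codings: two cohorts record smoking at different granularity.
# Each map sends a cohort-specific value to a level shared by all cohorts.
SMOKING_MAPS = {
    "cohort_a": {"never": "never", "former": "ever", "current": "ever"},
    "cohort_b": {"no": "never", "yes": "ever"},
}


def harmonize_smoking(records):
    """Return pooled records with a common 'smoking' level.

    Values that have no mapping are set to None instead of being guessed.
    """
    pooled = []
    for record in records:
        mapping = SMOKING_MAPS[record["cohort"]]
        pooled.append({
            "cohort": record["cohort"],
            "id": record["id"],
            "smoking": mapping.get(record["smoking"]),
        })
    return pooled


records = [
    {"cohort": "cohort_a", "id": 1, "smoking": "former"},
    {"cohort": "cohort_a", "id": 2, "smoking": "never"},
    {"cohort": "cohort_b", "id": 3, "smoking": "yes"},
    {"cohort": "cohort_b", "id": 4, "smoking": "unknown"},
]
pooled = harmonize_smoking(records)
print([row["smoking"] for row in pooled])  # ['ever', 'never', 'ever', None]
```

The trade-off is visible in the first record: the former/current distinction available in one cohort is discarded so that the pooled variable means the same thing everywhere.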
{"title":"The Epilepsy-Cog study: methods to establish a harmonized study of late-onset epilepsy in a meta-cohort of six population-based cohorts in the United States.","authors":"Hyunmi Choi, Jose Gutierrez, Tian Wang, Minghua Liu, Cheng-Shiun Leu, Sylwia Misiewicz, Jiying Han, Natalie Bello, Mary L Biggs, Emily M Briceño, Adam M Brickman, James F Burke, Ligong Chen, Lisandro D Colantonio, Stefany P Diaz Andino, Mitchell S V Elkind, Annette L Fitzpatrick, Christopher Gonzalez Corona, Alden L Gross, Lei Huang, Emily L Johnson, W Craig Johnson, Deborah A Levine, W T Longstreth, Sofia Pelagalli Maia, Richard P Mayeux, Brian C Petersen, Oluwadamilola Obalana, Dolly Reyes-Dumeyer, Tatjana Rundek, Danurys Sanchez, Steven J Shea, Kevin Strobino, Carolyn W Zhu, Evan L Thacker","doi":"10.64898/2026.01.30.26345233","DOIUrl":"https://doi.org/10.64898/2026.01.30.26345233","url":null,"PeriodicalId":94281,"journal":{"name":"medRxiv : the preprint server for health sciences","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12889749/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146168695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-02-02 DOI: 10.64898/2026.01.30.26345239
Syed-Amad Hussain, Daniel I Jackson, Samanvith Thotapalli, Marissa B McClellan, Madeleine Stanco, Grace Varney, Sterling Gleeson, Florencia Nugroho, William Leever, Eric Fosler-Lussier, Emre Sezgin
Health-Related Social Needs (HRSNs) significantly impact health outcomes, yet traditional care often fails to address them effectively. While conversational agents offer scalable support, their deployment is hindered by privacy risks and a lack of specialized training data for clinical applications. Synthetic data generation offers a solution to address this gap; standard pipelines often prompt LLMs using structured user personas, comprising demographics, constraints, and goals, to emulate dialogues. However, current methods relying on coarse demographic attributes often yield generic or stereotyped personas that lack real-world nuance. To improve the realism of synthetic data, we introduce Socially Grounded Exemplars (SGEs), which translate abstract persona attributes into granular, conversational descriptors. We implemented a two-stage pipeline using GPT-4o to generate SGEs, which then grounded synthetic dialogue generation under various prompting strategies. We evaluated the approach using automatic diversity metrics (Vendi Score) and blinded pairwise preference ratings by community behavioral health specialists (CBHS). Validation confirmed the feasibility of input generation, with GPT-4o achieving an 85% term acceptability rate for SGEs. In conversation generation, dynamic SGEs significantly improved lexical diversity, achieving a Vendi Score of 289.41 compared to 252.36 for the control baseline. CBHS ranked the model combining dynamic SGEs with implicit name-based cueing highest (Bradley-Terry Score: 0.753), surpassing both the SGE-only model (0.663) and the explicit demographics model (0.348). Raters favored the name-augmented model for "Specificity & Natural Authenticity" (30.0%), while explicit demographic labeling reduced perceived authenticity. We show SGEs leverage LLM parametric knowledge to produce diverse synthetic data, surpassing the limitations of rigid demographic ontologies. 
Our findings indicate that implicit cueing through names yields more authentic representations than explicit labeling, reducing the risk of stereotyped outputs. This framework supports the creation of privacy-preserving conversational datasets that inform tasks (e.g., evaluation, agentic workflows, and model distillation) in sensitive healthcare contexts.
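For readers unfamiliar with Bradley-Terry scores, the Python sketch below shows one standard way to turn pairwise preference counts into per-model strengths, using the classic iterative (minorization-maximization) update. The preference counts are invented for illustration, and the model labels are shorthand. The abstract does not state how its scores were fitted or normalized, so the numbers here are not comparable to the reported 0.753, 0.663, and 0.348.

```python
def bradley_terry(wins, iterations=200):
    """Fit Bradley-Terry strengths from pairwise preference counts.

    wins[(a, b)] is the number of times a was preferred over b.
    Strengths are normalized to sum to 1.
    """
    items = sorted({name for pair in wins for name in pair})
    strength = {name: 1.0 / len(items) for name in items}

    for _ in range(iterations):
        updated = {}
        for i in items:
            total_wins = sum(count for (a, _b), count in wins.items() if a == i)
            denom = 0.0
            for j in items:
                if j == i:
                    continue
                comparisons = wins.get((i, j), 0) + wins.get((j, i), 0)
                if comparisons:
                    denom += comparisons / (strength[i] + strength[j])
            updated[i] = total_wins / denom if denom else strength[i]
        norm = sum(updated.values())
        strength = {name: value / norm for name, value in updated.items()}
    return strength


# Hypothetical rater preferences among three prompting strategies.
wins = {
    ("sge_names", "sge_only"): 12, ("sge_only", "sge_names"): 8,
    ("sge_only", "explicit"): 14, ("explicit", "sge_only"): 6,
    ("sge_names", "explicit"): 16, ("explicit", "sge_names"): 4,
}
scores = bradley_terry(wins)
print(sorted(scores, key=scores.get, reverse=True))
```

Under the model, the probability that one strategy is preferred over another is its strength divided by the sum of the two strengths, so the fitted values summarize all pairwise comparisons on a single scale.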
{"title":"Socially Grounded Exemplars Improve Synthetic Conversations for Health-Related Social Needs Navigation.","authors":"Syed-Amad Hussain, Daniel I Jackson, Samanvith Thotapalli, Marissa B McClellan, Madeleine Stanco, Grace Varney, Sterling Gleeson, Florencia Nugroho, William Leever, Eric Fosler-Lussier, Emre Sezgin","doi":"10.64898/2026.01.30.26345239","DOIUrl":"https://doi.org/10.64898/2026.01.30.26345239","url":null,"PeriodicalId":94281,"journal":{"name":"medRxiv : the preprint server for health sciences","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12889791/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146168719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-02-02 DOI: 10.64898/2026.01.30.25342358
Charles Hadley King, Rebekah Barrick, Miguel Almalvez, Kirsten Blanco, Ivan De Dios, Vincent A Fusaro, Emmanuèle Délot, Chris Donohue, Seth Berger, Changrui Xiao, Eric Vilain, Jonathan LoTempio
Across biomedical research and care, many conversations transmit information with profound practical, ethical, and legal consequences. The process of informed consent, where individuals decide to join a study or accept clinical care, is perhaps the most consequential, yet it is also complex, labor-intensive, and variable across sites. Existing platforms for information transmission in the informed consent context largely reproduce static documents and lack reproducibility or auditability, while generative chatbots offer flexibility at the cost of stochasticity, hallucination, and regulatory risk. We present Kauro, an open-source, graph-based chatbot that encodes scripted conversations as version-controlled JavaScript Object Notation (JSON) structures, enabling deterministic traversal (i.e., paths through the graph), complete audit logging, and IRB-verifiable oversight. Its modular separation of client, server, and script ensures portability across institutions. By operationalizing constraint rather than flexibility, Kauro reframes the deployment of machine intelligence in biomedical communication around reproducibility and auditability, offering a scalable platform that generalizes to any domain where conversations demand safety, precision, and trust.
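The idea of a scripted conversation stored as a JSON graph and walked deterministically can be illustrated with a minimal sketch. The schema below (`start`, `nodes`, `text`, `options`), the node names, and the `traverse` helper are hypothetical and chosen only for illustration; they are not Kauro's actual script format or API.

```python
import json

# Hypothetical script format for illustration only; NOT Kauro's real schema.
SCRIPT_JSON = """
{
  "start": "intro",
  "nodes": {
    "intro": {"text": "This study collects health data. Continue?",
              "options": {"yes": "risks", "no": "end_decline"}},
    "risks": {"text": "Do you agree to take part?",
              "options": {"agree": "end_consent", "decline": "end_decline"}},
    "end_consent": {"text": "Your consent has been recorded.", "options": {}},
    "end_decline": {"text": "You have not been enrolled.", "options": {}}
  }
}
"""


def traverse(script, choices):
    """Walk the graph deterministically and return (final_node_id, audit_log)."""
    node_id = script["start"]
    audit_log = []
    for choice in choices:
        options = script["nodes"][node_id]["options"]
        if choice not in options:
            # Only scripted edges are allowed; anything else is rejected.
            raise ValueError(f"choice {choice!r} not allowed at node {node_id!r}")
        next_id = options[choice]
        # Every transition is recorded so the path can be audited later.
        audit_log.append({"node": node_id, "choice": choice, "next": next_id})
        node_id = next_id
    return node_id, audit_log


script = json.loads(SCRIPT_JSON)
final_node, log = traverse(script, ["yes", "agree"])
print(final_node)
for entry in log:
    print(entry)
```

Because the next node depends only on the current node and the chosen option, the same sequence of choices always yields the same path and the same log, which is the property that makes such a script reviewable in advance.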
Title: Kauro, a graph-based chatbot for high-fidelity information transmission conversations. Journal: medRxiv : the preprint server for health sciences. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12889763/pdf/
Pub Date : 2026-02-02 DOI: 10.64898/2026.01.30.26345235
Shayne S-H Lin, Alicia Milam, Andrew M Kiselica, Stephen L Aita, Mubarick Saeed, Troy Webber, Steven Paul Woods, Nicholas C Borgogna, Keenan A Walker, Vidyulata Kamath, Kristina Visscher, Charles F Murchison, David S Geldmacher, Erik D Roberson, Benjamin D Hill, Victor A Del Bene
Objective: To assess intra-individual cognitive variability (IICV) in relation to Alzheimer's Disease (AD) biomarkers.
Methods: The sample included 879 adults from the National Alzheimer's Coordinating Center, aged 50 and above, with a complete neuropsychological evaluation and AD biomarker data available (64% cognitively intact; 36% cognitively impaired). We conducted a series of moderated regression models in which AD biomarkers, neurocognitive status, and their interaction predicted IICV. IICV measures included demographically adjusted normed scores for the intraindividual standard deviation (iSD) and coefficient of variation (CoV). AD biomarkers included cerebrospinal fluid (CSF) measures of Aβ1-42, phosphorylated tau 181 (p-Tau181), and total tau (t-Tau), as well as amyloid positron emission tomography (PET; with both continuous centiloid values and a dichotomous variable).
Results: Increased AD biomarker burden was associated with increased IICV among cognitively impaired individuals (correlational strength ranging from .206 to .391 for iSD and from .149 to .460 for CoV) but not among the cognitively intact group (correlational strength ranging from .008 to .085 for iSD and from .016 to .085 for CoV). The pattern of results held even after controlling for demographic factors and was comparable in magnitude to the association between AD biomarkers and mean cognitive performance.
Conclusions: Increases in measures of amyloid, soluble tau, and neurodegeneration are associated with increased IICV among cognitively impaired older adults. The findings underscore the potential of IICV as a sensitive outcome measure in the AD clinical disease phase. Future studies should replicate findings longitudinally and in more diverse samples.
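The two IICV indices named in the Methods can be sketched for a single person's test scores. This is a minimal illustration assuming the usual definitions (iSD as the standard deviation of one individual's scores across tests, CoV as iSD divided by that individual's mean score); the function names and example scores are hypothetical, and the demographic adjustment and norming used in the study are not reproduced here.

```python
from statistics import mean, stdev


def intraindividual_sd(scores):
    """iSD: sample standard deviation of one person's scores across tests."""
    return stdev(scores)


def coefficient_of_variation(scores):
    """CoV: iSD divided by the person's own mean score."""
    m = mean(scores)
    if m == 0:
        raise ValueError("CoV is undefined when the mean score is zero")
    return stdev(scores) / m


# Hypothetical T-scores (population mean 50) for one person on three tests.
scores = [40.0, 50.0, 60.0]
print(intraindividual_sd(scores))        # 10.0
print(coefficient_of_variation(scores))  # 0.2
```

Dividing by the mean makes CoV sensitive to overall performance level, which is one reason the two indices can show different associations with biomarkers.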
Title: Linking Cognitive Variability and Alzheimers Disease Biomarkers by Neurocognitive Status. Journal: medRxiv : the preprint server for health sciences. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12889766/pdf/
Pub Date : 2026-02-02 DOI: 10.1101/2025.08.28.25334653
Jo-Ying Hung, Chih-Yuan Hsu, Pei-Fang Su, Yu Shyr
A surrogate endpoint is a biomarker that is reasonably likely to predict clinical benefit and is used as a substitute for a direct measure of clinical benefit under the Food and Drug Administration (FDA) Accelerated Approval pathway. According to FDA guidelines, a valid surrogate endpoint must meet two associations: I-association (the association between the surrogate and true endpoints, such as disease response and overall survival) and T-association (the association between treatment effects on both endpoints, such as odds ratio and hazard ratio). I-association is commonly evaluated, but T-association is often overlooked due to the lack of appropriate statistical methods. Failure to satisfy T-association precludes a biomarker from supporting accelerated approval. To address this gap, we propose a new method to rigorously assess T-association in accordance with FDA guidelines. This method assumes that treatment effects on the surrogate and true endpoints follow a bivariate normal distribution, accounting for both within-study and between-study variances. The key evaluation metric is the correlation coefficient, which quantifies the relationship between treatment effects on both endpoints. Model parameters, including this correlation, are estimated using maximum likelihood, restricted maximum likelihood, and a Bayesian approach. We demonstrate the method using both simulated and real-world data. The method will serve as the statistical foundation that aligns with FDA guidelines and supports future accelerated approvals. The R package to implement the proposed method is available at https://github.com/jybelindahung/T-association .
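One common way to write a bivariate normal model with within-study and between-study variances is the two-level form below. It is a sketch under stated assumptions, not necessarily the authors' exact parameterization: the symbols $\hat\theta_{S,i}$ and $\hat\theta_{T,i}$ (estimated treatment effects on the surrogate and true endpoints in study $i$), $\Sigma_i$ (within-study covariance), $\tau_S^2$ and $\tau_T^2$ (between-study variances), and $\rho$ (the correlation that quantifies T-association) are introduced here for illustration.

```latex
\begin{aligned}
\begin{pmatrix}\hat\theta_{S,i}\\ \hat\theta_{T,i}\end{pmatrix}
\Bigm|
\begin{pmatrix}\theta_{S,i}\\ \theta_{T,i}\end{pmatrix}
&\sim N\!\left(\begin{pmatrix}\theta_{S,i}\\ \theta_{T,i}\end{pmatrix},\ \Sigma_i\right), \\
\begin{pmatrix}\theta_{S,i}\\ \theta_{T,i}\end{pmatrix}
&\sim N\!\left(\begin{pmatrix}\mu_S\\ \mu_T\end{pmatrix},\
\begin{pmatrix}\tau_S^2 & \rho\,\tau_S\tau_T\\ \rho\,\tau_S\tau_T & \tau_T^2\end{pmatrix}\right).
\end{aligned}
```

Under this form, the observed effects in study $i$ are marginally bivariate normal with mean $(\mu_S, \mu_T)^\top$ and covariance equal to $\Sigma_i$ plus the between-study covariance matrix, which is the likelihood that maximum likelihood, restricted maximum likelihood, or a Bayesian approach would target.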
Title: Evaluation Methods for T-association of a Surrogate Endpoint. Journal: medRxiv : the preprint server for health sciences. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12889774/pdf/
Pub Date : 2026-02-02 DOI: 10.64898/2026.01.28.26344948
Shujian Zheng, Joshua M Mitchell, Jasmine Chong, Jennifer Canniff, Michael J Johnson, Maheshwor Thapa, Elizabeth Aiken, Shabir Madhi, Adriana Weinberg, Shuzhao Li
Pregnant women with HIV control viral replication with antiretrovirals and give birth to HIV-exposed uninfected (HEU) infants. These children, however, exhibit increased morbidity and mortality due to severe infections, as well as cognitive and growth abnormalities. In this study, we performed high-resolution, untargeted metabolomics on 123 HIV-exposed mother-baby pairs and 117 control pairs without HIV. High concentrations of the antiretroviral efavirenz and its metabolites were detected in maternal blood and cord blood. The metabolomic differences between HEU participants and controls reflect perturbed pathways of steroids, tryptophan, and bile acids, and they largely consisted of metabolites that were correlated with efavirenz concentrations within the HEU group. The results suggest a major contribution of the drug to the abnormal biochemical profile of HEU infants born to mothers treated with efavirenz.
Title: Distinct biochemical phenotypes of HIV exposed infants driven by antiviral medication. Journal: medRxiv : the preprint server for health sciences. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12889808/pdf/
Pub Date : 2026-02-02 DOI: 10.64898/2026.01.31.26345168
Jian Cui, Colleen M Roark, Nerea Domínguez-Pinilla, Pilar Nozal Aranda, Begoña Losada, Pilar Zamarrón, Jacob Lorenzo-Morales, José Miguel Rubio Muñoz, Megan M Dobrose, Ana Van den Rym, Luis M Allende, Catherine Shelton, Dante E Reyna, Janet G Markle, Santiago Rodríguez de Córdoba, Margarita Lopez-Trascasa, Rebeca Pérez de Diego, C Henrique Serezani, Mariana X Byndloss, Isabel de Fuentes Corripio, Luis Ignacio González-Granado, Ruben Martinez-Barricarte
Background: Primary amoebic meningoencephalitis (PAM) is a rapidly progressive and often fatal central nervous system infection caused by Naegleria fowleri. Despite widespread environmental exposure to this free-living amoeba, clinical disease is rare, suggesting that it requires not only exposure to the amoeba but also a host vulnerability. Yet, the immune mechanisms controlling protection vs. susceptibility to N. fowleri remain poorly understood.
Methods: We conducted comprehensive clinical, immunological, and genetic investigations in one of the few survivors of PAM. We performed high-dimensional immune profiling using Cytometry by Time-Of-Flight (CyTOF) to assess immune cell composition and activation state. We employed whole-exome sequencing (WES) to identify rare genetic variants that affect host responses. Functional immune assays were used to assess serum-mediated amoebicidal activity in vitro and to characterize key host defense pathways.
Results: A previously healthy pediatric patient was diagnosed with PAM. Contrary to other cases, her clinical course lasted for more than 2 months before she recovered with miltefosine treatment. Immunologic evaluation showed this patient had normal numbers and frequencies of major lymphoid and myeloid immune cells. WES revealed a homozygous deletion in the complement component 2 (C2) gene, resulting in a complete absence of circulating C2 protein and abolishing classical complement pathway activity. Normal human serum induced complement-mediated lysis of N. fowleri trophozoites in vitro, whereas complement-depleted normal human serum and serum from our patient both failed to deposit membrane attack complex (MAC) or kill N. fowleri. MAC deposition and amoebicidal activity were restored by supplementing the patient's serum with purified human C2 protein.
Conclusion: Our study demonstrates that PAM can be caused by a monogenic inborn error of immunity (IEI) and that the complement system is critical for human immunity against Naegleria fowleri.
Title: Primary Amoebic Meningoencephalitis caused by Complement C2 Deficiency. Journal: medRxiv : the preprint server for health sciences. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12889776/pdf/