Background: Environmentally sensitive pathogens exhibit ecological and evolutionary responses to climate change that result in the emergence and global expansion of well-adapted variants. It is imperative to understand the mechanisms that facilitate pathogen emergence and expansion, as well as the drivers behind the mechanisms, to understand and prepare for future pandemic expansions.
Objective: The unique, rapid, global expansion of a clonal complex of Vibrio parahaemolyticus (a marine bacterium causing gastroenteritis infections) named Vibrio parahaemolyticus sequence type 3 (VpST3) provides an opportunity to explore the eco-evolutionary drivers of pathogen expansion.
Methods: The global expansion of VpST3 was reconstructed using VpST3 genomes, which were then classified using metrics that characterize the stages of this expansion process and indicate the stages of emergence and establishment. We used machine learning, specifically a random forest classifier, to test a range of ecological and evolutionary drivers for their potential in predicting VpST3 expansion dynamics.
Results: We identified a range of evolutionary features, including mutations in the core genome and accessory gene presence, associated with expansion dynamics. A range of random forest classifier approaches were tested to predict expansion classification metrics for each genome. The highest predictive accuracies (ranging from 0.722 to 0.967) were achieved for models using a combined eco-evolutionary approach. While population structure and the difference between introduced and established isolates could be predicted to a high accuracy, our model reported multiple false positives when predicting the success of an introduced isolate, suggesting potential limiting factors not represented in our eco-evolutionary features. Regional models produced for 2 countries reporting the most VpST3 genomes had varying success, reflecting the impacts of class imbalance.
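To illustrate how class imbalance can distort a reported accuracy, the following minimal sketch compares plain accuracy with balanced accuracy (the mean of per-class recall). The labels, class names, and counts are hypothetical and are not taken from the study, and the abstract does not state which metric the authors used beyond "accuracy".

```python
from collections import defaultdict

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class carries equal weight."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)

# Hypothetical, imbalanced toy labels: 18 "established" vs 2 "introduced".
y_true = ["established"] * 18 + ["introduced"] * 2
# A degenerate model that always predicts the majority class.
y_pred = ["established"] * 20

acc = accuracy(y_true, y_pred)
bal = balanced_accuracy(y_true, y_pred)
print(f"accuracy={acc:.2f}, balanced accuracy={bal:.2f}")
```

In this toy case the majority-class predictor looks strong on plain accuracy while recovering none of the minority class, which is the kind of effect a regional model with few genomes in one class could show.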
Conclusions: These novel insights into evolutionary features and ecological conditions related to the stages of VpST3 expansion showcase the potential of machine learning models using genomic data and will contribute to the future understanding of the eco-evolutionary pathways of climate-sensitive pathogens.
Background: An increasing body of literature highlights the integration of machine learning with genomic data in psychiatry, particularly for complex mental health disorders such as schizophrenia. These advanced techniques offer promising potential for uncovering various facets of these disorders. A comprehensive review of the current applications of machine learning in conjunction with genomic data within this context can significantly enhance our understanding of the current state of research and its future directions.
Objective: This study aims to conduct a systematic scoping review of the use of machine learning algorithms with genomic data in the field of schizophrenia.
Methods: To conduct a systematic scoping review, a search was performed in the electronic databases MEDLINE, Web of Science, PsycNet (PsycINFO), and Google Scholar from 2013 to 2024. Studies at the intersection of schizophrenia, genomic data, and machine learning were evaluated.
Results: The literature search identified 2437 eligible articles after removing duplicates. Following abstract screening, 143 full-text articles were assessed, and 121 were subsequently excluded. Therefore, 21 studies were thoroughly assessed. Various machine learning algorithms were used in the identified studies, with support vector machines being the most common. The studies notably used genomic data to predict schizophrenia, identify schizophrenia features, discover drugs, classify schizophrenia amongst other mental health disorders, and predict the quality of life of patients.
Conclusions: Several high-quality studies were identified. Yet, the application of machine learning with genomic data in the context of schizophrenia remains limited. Future research is essential to further evaluate the portability of these models and to explore their potential clinical applications.
The integration of chatbots in oncology underscores the pressing need for human-centered artificial intelligence (AI) that addresses patient and family concerns with empathy and precision. Human-centered AI emphasizes ethical principles, empathy, and user-centric approaches, ensuring technology aligns with human values and needs. This review critically examines the ethical implications of using large language models (LLMs) like GPT-3 and GPT-4 (OpenAI) in oncology chatbots. It examines how these models replicate human-like language patterns, impacting the design of ethical AI systems. The paper identifies key strategies for ethically developing oncology chatbots, focusing on potential biases arising from extensive datasets and neural networks. Specific datasets, such as those sourced from predominantly Western medical literature and patient interactions, may introduce biases by overrepresenting certain demographic groups. Moreover, the training methodologies of LLMs, including fine-tuning processes, can exacerbate these biases, leading to outputs that may disproportionately favor affluent or Western populations while neglecting marginalized communities. By providing examples of biased outputs in oncology chatbots, the review highlights the ethical challenges LLMs present and the need for mitigation strategies. The study emphasizes integrating human-centric values into AI to mitigate these biases, ultimately advocating for the development of oncology chatbots that are aligned with ethical principles and capable of serving diverse patient populations equitably.
Background: Despite growing interest in the clinical translation of polygenic risk scores (PRSs), it remains uncertain to what extent genomic information can enhance the prediction of psychiatric outcomes beyond the data collected during clinical visits alone.
Objective: This study aimed to assess the clinical utility of incorporating PRSs into a suicide risk prediction model trained on electronic health records (EHRs) and patient-reported surveys among patients admitted to the emergency department.
Methods: Study participants were recruited from the psychiatric emergency department at Massachusetts General Hospital. There were 333 adult patients of European ancestry who had high-quality genotype data available through their participation in the Mass General Brigham Biobank. Multiple neuropsychiatric PRSs were added to a previously validated suicide prediction model in a prospective cohort enrolled between February 4, 2015, and March 13, 2017. Data analysis was performed from July 11, 2022, to August 31, 2023. Suicide attempt was defined using diagnostic codes from longitudinal EHRs combined with 6-month follow-up surveys. The clinical risk score for suicide attempt was calculated from an ensemble model trained using an EHR-based suicide risk score and a brief survey, and it was subsequently used to define the baseline model. We generated PRSs for depression, bipolar disorder, schizophrenia, suicide attempt, and externalizing traits using a Bayesian polygenic scoring method for European ancestry participants. Model performance was evaluated using area under the receiver operating characteristic curve (AUC), area under the precision-recall curve, and positive predictive values.
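As a minimal sketch of two of the evaluation metrics named above, the code below computes the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and the positive predictive value at a chosen threshold. The labels, scores, and the threshold of 0.5 are hypothetical; the study's actual software and thresholds are not given in the abstract.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a random positive > score of a random negative).

    Ties between a positive and a negative count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def positive_predictive_value(labels, scores, threshold):
    """Share of cases flagged at or above the threshold that are true positives."""
    flagged = [y for y, s in zip(labels, scores) if s >= threshold]
    if not flagged:
        raise ValueError("no case reaches the threshold")
    return sum(flagged) / len(flagged)

# Hypothetical outcomes (1 = event within follow-up) and model risk scores.
labels = [0, 0, 1, 0, 1, 0, 0, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05, 0.60, 0.70]

auc = roc_auc(labels, scores)
ppv = positive_predictive_value(labels, scores, threshold=0.5)
print(f"AUC={auc:.3f}, PPV at 0.5={ppv:.3f}")
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based formulation gives the same value more efficiently on larger data.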
Results: Of the 333 patients (n=178, 53.5% male; mean age 36.8, SD 13.6 years; n=333, 100% non-Hispanic and n=324, 97.3% self-reported White), 28 (8.4%) had a suicide attempt within 6 months. Adding either the schizophrenia PRS or all PRSs to the baseline model resulted in the numerically highest discrimination (AUC 0.86, 95% CI 0.73-0.99) compared to the baseline model (AUC 0.84, 95% CI 0.70-0.98). However, the improvement in model performance was not statistically significant.
Conclusions: In this study, incorporating genomic information into clinical prediction models for suicide attempt did not improve patient risk stratification. Larger studies that include more diverse participants are required to validate whether the inclusion of psychiatric PRSs in clinical prediction models can enhance the stratification of patients at risk of suicide attempts.
Background: Chromosomal abnormalities are genetic disorders caused by chromosome errors, leading to developmental delays, birth defects, and miscarriages. Currently, invasive procedures such as amniocentesis or chorionic villus sampling are mostly used, which carry a risk of miscarriage. This has led to the need for a noninvasive and innovative approach to detect and prevent chromosomal abnormalities during pregnancy.
Objective: This review aims to describe and appraise the potential of internet-based abnormal chromosomal preventive measures as a noninvasive approach to detecting and preventing chromosomal abnormalities during pregnancy.
Methods: A thorough review of existing literature and research on chromosomal abnormalities and noninvasive approaches to prenatal diagnosis and therapy was conducted. Electronic databases such as PubMed, Google Scholar, ScienceDirect, CENTRAL, CINAHL, Embase, OVID MEDLINE, OVID PsycINFO, Scopus, ACM, and IEEE Xplore were searched for relevant studies and articles published in the last 5 years. The keywords used included chromosomal abnormalities, prenatal diagnosis, noninvasive, internet-based, and diagnosis.
Results: The review of literature revealed that internet-based abnormal chromosomal diagnosis is a potential noninvasive approach to detecting and preventing chromosomal abnormalities during pregnancy. This innovative approach involves the use of advanced technology, including high-resolution ultrasound, cell-free DNA testing, and bioinformatics, to analyze fetal DNA from maternal blood samples. It allows early detection of chromosomal abnormalities, enabling timely interventions and treatment to prevent adverse outcomes. Furthermore, with the advancement of technology, internet-based abnormal chromosomal diagnosis has emerged as a safe alternative with benefits including its cost-effectiveness, increased accessibility and convenience, potential for earlier detection and intervention, and ethical considerations.
Conclusions: Internet-based abnormal chromosomal diagnosis has the potential to revolutionize prenatal care by offering a safe and noninvasive alternative to invasive procedures. It has the potential to improve the detection of chromosomal abnormalities, leading to better pregnancy outcomes and reduced risk of miscarriage. Further research and development in this field is needed to make this approach more accessible and affordable for pregnant women.
Background: The rapid evolution of SARS-CoV-2 imposed a huge challenge on disease control. Immune evasion caused by genetic variations of the SARS-CoV-2 spike protein's immunogenic epitopes affects the efficiency of monoclonal antibody-based therapy of COVID-19. Therefore, a rapid method is needed to evaluate the efficacy of the available monoclonal antibodies against the new emerging variants or potential novel variants.
Objective: The aim of this study is to develop a rapid computational method to evaluate the neutralization power of anti-SARS-CoV-2 monoclonal antibodies against new SARS-CoV-2 variants and other potential new mutations.
Methods: The amino acid sequences of the extracellular domain of the spike proteins of the severe acute respiratory syndrome coronavirus (GenBank accession number YP_009825051.1) and SARS-CoV-2 (GenBank accession number YP_009724390.1) were used to create computational 3D models for the native spike proteins. Specific mutations were introduced to the curated sequence to generate the different variant spike models. The neutralization potential of sotrovimab (S309) against these variants was evaluated based on its molecular interactions and Gibbs free energy in comparison to a reference model after molecular replacement of the reference receptor-binding domain with the variant's receptor-binding domain.
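For readers less familiar with how a Gibbs free energy of binding relates to affinity, the sketch below applies the standard thermodynamic relation ΔG = RT ln(Kd) to compare a variant against a reference. The free energy values, the temperature of 298.15 K, and the function names are assumptions chosen for illustration; the abstract does not report how the free energies were computed or which values were obtained.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def kd_from_delta_g(delta_g_kcal, temp_k=298.15):
    """Dissociation constant (mol/L) from binding free energy: dG = RT ln(Kd)."""
    return math.exp(delta_g_kcal / (R_KCAL * temp_k))

def fold_change_in_kd(dg_variant, dg_reference, temp_k=298.15):
    """Kd(variant) / Kd(reference); a value above 1 means weaker binding."""
    return math.exp((dg_variant - dg_reference) / (R_KCAL * temp_k))

# Hypothetical binding free energies in kcal/mol (more negative = tighter binding).
dg_reference = -11.0
dg_variant = -9.5

kd_ref = kd_from_delta_g(dg_reference)
kd_var = kd_from_delta_g(dg_variant)
fold = fold_change_in_kd(dg_variant, dg_reference)
print(f"Kd reference={kd_ref:.2e} M, Kd variant={kd_var:.2e} M, fold change={fold:.1f}")
```

Under this relation, a less negative free energy for a variant complex corresponds to a larger dissociation constant, that is, a lower binding affinity than the reference.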
Results: Our results show a loss in the binding affinity of the neutralizing antibody S309 with both SARS-CoV and SARS-CoV-2. The binding affinity of S309 was greater to the Alpha, Beta, Gamma, and Kappa variants than to the original Wuhan strain of SARS-CoV-2. However, S309 showed a substantially decreased binding affinity to the Delta and Omicron variants. Based on the mutational profile of Omicron subvariants, our data describe the effect of the G339H and G339D mutations and their role in escaping antibody neutralization, which is in line with published clinical reports.
Conclusions: This method is rapid, applicable, and of interest to adapt the use of therapeutic antibodies to the treatment of emerging variants. It could be applied to antibody-based treatment of other viral infections.
[This corrects the article DOI: 10.2196/43906.]
Background: Carcinoma of unknown primary (CUP) is a subset of metastatic cancers in which the primary tissue source of the cancer cells remains unidentified. CUP is the eighth most common malignancy worldwide, accounting for up to 5% of all malignancies. Representing an exceptionally aggressive metastatic cancer, the median survival is approximately 3 to 6 months. The tissue in which cancer arises plays a key role in our understanding of sensitivities to various forms of cell death. Thus, the lack of knowledge on the tissue of origin (TOO) makes it difficult to devise tailored and effective treatments for patients with CUP. Developing quick and clinically implementable methods to identify the TOO of the primary site is crucial in treating patients with CUP. Noncoding RNAs may hold potential for origin identification and provide a robust route to clinical implementation due to their resistance against chemical degradation.
Objective: This study aims to investigate the potential of microRNAs, a subset of noncoding RNAs, as highly accurate biomarkers for detecting the TOO through data-driven, machine learning approaches for metastatic cancers.
Methods: We used microRNA expression data from The Cancer Genome Atlas data set and assessed various machine learning approaches, from simple classifiers to deep learning approaches. As a test of our classifiers, we evaluated the accuracy on a separate set of 194 primary tumor samples from the Sequence Read Archive. We used permutation feature importance to determine the potential microRNA biomarkers and assessed them with principal component analysis and t-distributed stochastic neighbor embedding visualizations.
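Permutation feature importance can be sketched as follows: train a model once, then shuffle one feature column at a time and record the mean drop in accuracy. The nearest-centroid classifier, the toy data, and the labels `tissue_a` and `tissue_b` are hypothetical stand-ins chosen to keep the example dependency-free; they are not the classifiers or data used in the study.

```python
import random

def fit_centroids(X, y):
    """Train a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        total = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            total[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict(centroids, row):
    """Return the class whose centroid is nearest (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[c]))
    return min(centroids, key=dist)

def accuracy(centroids, X, y):
    """Fraction of rows assigned to their true class."""
    return sum(predict(centroids, r) == t for r, t in zip(X, y)) / len(y)

def permutation_importance(centroids, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the data with only column j replaced by its shuffled copy.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(centroids, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X = [[float(i % 3), float(i)] for i in range(10)]
X += [[10.0 + i % 3, float(i)] for i in range(10)]
y = ["tissue_a"] * 10 + ["tissue_b"] * 10

model = fit_centroids(X, y)
importances = permutation_importance(model, X, y)
print("baseline accuracy:", accuracy(model, X, y))
print("importance per feature:", importances)
```

Because the model is not retrained after shuffling, a feature the model ignores shows no drop in accuracy, while an informative feature shows a clear drop; ranking features by that drop is how candidate biomarkers would be shortlisted in this kind of analysis.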
Results: Our results show that it is possible to design robust classifiers to detect the TOO for metastatic samples on The Cancer Genome Atlas data set, with an accuracy of up to 97% (351/362), which may be used in situations of CUP. Our findings show that deep learning techniques enhance prediction accuracy. We progressed from an initial prediction accuracy of 62.5% (226/362) with decision trees to 93.2% (337/362) with logistic regression, finally achieving 97% (351/362) accuracy using deep learning on metastatic samples. On the Sequence Read Archive validation set, a lower accuracy of 41.2% (77/188) was achieved by the decision tree, while deep learning achieved a higher accuracy of 80.4% (151/188). Notably, our feature importance analysis showed the top 3 most important features for predicting TOO to be microRNA-10b, microRNA-205, and microRNA-196b, which aligns with previous work.
Conclusions: Our findings highlight the potential of using machine learning techniques to devise accurate tests for detecting TOO for CUP. Since microRNAs are carried throughout the body via extracellular vesicles secreted from cells, they may serve as key biomarkers for liquid biopsy due to their presence in