JMIR bioinformatics and biotechnology: latest articles

Secure Comparisons of Single Nucleotide Polymorphisms Using Secure Multiparty Computation: Method Development.

JMIR bioinformatics and biotechnology

Pub Date : 2023-07-18 DOI: 10.2196/44700

Andrew Woods, Skyler T Kramer, Dong Xu, Wei Jiang

Background: While genomic variations can provide valuable information for health care and ancestry, the privacy of individual genomic data must be protected. Thus, a secure environment is desirable for a human DNA database such that the total data are queryable but not directly accessible to involved parties (eg, data hosts and hospitals) and that the query results are learned only by the user or authorized party.

Objective: In this study, we provide efficient and secure computations on panels of single nucleotide polymorphisms (SNPs) from genomic sequences as computed under the following set operations: union, intersection, set difference, and symmetric difference.

Methods: Using these operations, we can compute similarity metrics, such as the Jaccard similarity, which could allow querying a DNA database to find the same person and genetic relatives securely. We analyzed various security paradigms and report metrics for the protocols under several security assumptions: semihonest, malicious with an honest majority, and malicious with a dishonest majority.

Results: We show that our methods can be used practically on realistically sized data. Specifically, we can compute the Jaccard similarity of two genomes when considering sets of SNPs, each with 400,000 SNPs, in 2.16 seconds with the assumption of a malicious adversary in an honest majority and 0.36 seconds under a semihonest model.

Conclusions: Our methods may help adopt trusted environments for hosting individual genomic data with end-to-end data security.
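For orientation, the set operations and the Jaccard similarity named in the abstract can be sketched in plaintext Python. This is only a non-secure reference computation, not the authors' multiparty protocol; the SNP identifiers and the function names `snp_set_operations` and `jaccard_similarity` are assumptions made for illustration.

```python
def snp_set_operations(a, b):
    """Return the four set operations named in the abstract for two SNP panels."""
    return {
        "union": a | b,
        "intersection": a & b,
        "difference": a - b,
        "symmetric_difference": a ^ b,
    }


def jaccard_similarity(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets."""
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty panels count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical SNP panels; each element encodes an identifier and an allele.
panel_x = {"rs0001:A", "rs0002:G", "rs0003:T", "rs0004:C"}
panel_y = {"rs0002:G", "rs0003:T", "rs0005:A"}

ops = snp_set_operations(panel_x, panel_y)
similarity = jaccard_similarity(panel_x, panel_y)
print(similarity)
```

In a secure setting the same quantities would be computed on secret-shared or encrypted inputs, so that neither party sees the other's panel; the plaintext version is useful mainly as a correctness check.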


PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11135223/pdf/

Citations: 0

Mutations of SARS-CoV-2 Structural Proteins in the Alpha, Beta, Gamma, and Delta Variants: Bioinformatics Analysis.

JMIR bioinformatics and biotechnology

Pub Date : 2023-07-14 eCollection Date: 2023-01-01 DOI: 10.2196/43906

Saima Rehman Khetran, Roma Mustafa

Background: COVID-19 and Middle East Respiratory Syndrome are two pandemic respiratory diseases caused by coronavirus species. The novel disease COVID-19 caused by SARS-CoV-2 was first reported in Wuhan, Hubei Province, China, in December 2019, and became a pandemic within 2-3 months, affecting social and economic platforms worldwide. Despite the rapid development of vaccines, there have been obstacles to their distribution, including a lack of fundamental resources, poor immunization, and manual vaccine replication. Several variants of the original Wuhan strain have emerged in the last 3 years, which can pose a further challenge for control and vaccine development.

Objective: The aim of this study was to comprehensively analyze mutations in SARS-CoV-2 variants of concern (VoCs) using a bioinformatics approach toward identifying novel mutations that may be helpful in developing new vaccines by targeting these sites.

Methods: Reference sequences of the SARS-CoV-2 spike (YP_009724390) and nucleocapsid (YP_009724397) proteins were compared to retrieved sequences of isolates of four VoCs from 14 countries for mutational and evolutionary analyses. Multiple sequence alignment was performed and phylogenetic trees were constructed by the neighbor-joining method with 1000 bootstrap replicates using MEGA (version 6). Mutations in amino acid sequences were analyzed using the MultAlin online tool (version 5.4.1).

Results: Among the four VoCs, a total of 143 nonsynonymous mutations and 8 deletions were identified in the spike and nucleocapsid proteins. Multiple sequence alignment and amino acid substitution analysis revealed new mutations, including G72W, M2101I, L139F, 209-211 deletion, G212S, P199L, P67S, I292T, and substitutions with unknown amino acid replacement, reported in Egypt (MW533289), the United Kingdom (MT906649), and other regions. The variants B.1.1.7 (Alpha variant) and B.1.617.2 (Delta variant), characterized by higher transmissibility and lethality, harbored the amino acid substitutions D614G, R203K, and G204R with higher prevalence rates in most sequences. Phylogenetic analysis among the novel SARS-CoV-2 variant proteins and some previously reported β-coronavirus proteins indicated that either the evolutionary clade was weakly supported or not supported at all by the β-coronavirus species.

Conclusions: This study could contribute toward gaining a better understanding of the basic nature of SARS-CoV-2 and its four major variants. The numerous novel mutations detected could also provide a better understanding of VoCs and help in identifying suitable mutations for vaccine targets. Moreover, these data offer evidence for new types of mutations in VoCs, which will provide insight into the epidemiology of SARS-CoV-2.
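The substitution notation used in the Results (for example D614G: reference residue, position, variant residue) can be derived from a pairwise alignment as sketched below. This is a minimal illustration, not the MEGA or MultAlin workflow the authors used; the function name `call_mutations` and the toy sequences are assumptions.

```python
def call_mutations(reference, variant, gap="-"):
    """Compare two aligned protein sequences of equal length.

    Returns (substitutions, deletions): substitutions are strings such as
    "D3G"; deletions are 1-based positions in the ungapped reference.
    """
    if len(reference) != len(variant):
        raise ValueError("aligned sequences must have the same length")

    substitutions = []
    deletions = []
    position = 0  # 1-based residue index in the ungapped reference
    for ref_aa, var_aa in zip(reference, variant):
        if ref_aa == gap:
            # Insertion relative to the reference; not reported in this sketch.
            continue
        position += 1
        if var_aa == gap:
            deletions.append(position)
        elif var_aa != ref_aa:
            substitutions.append(f"{ref_aa}{position}{var_aa}")
    return substitutions, deletions


# Toy aligned sequences (hypothetical, not taken from the study).
subs, dels = call_mutations("MKDGLPT", "MKGGL-T")
print(subs, dels)
```

Positions are counted along the ungapped reference so that the labels stay comparable across isolates aligned to the same reference protein.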


PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10353769/pdf/

Citations: 0

Introducing JMIR Bioinformatics and Biotechnology: A Platform for Interdisciplinary Collaboration and Cutting-Edge Research.

JMIR bioinformatics and biotechnology

Pub Date : 2023-06-12 DOI: 10.2196/48631

Ece Dilber Gamsiz Uzun

JMIR Bioinformatics and Biotechnology supports interdisciplinary research and welcomes contributions that push the boundaries of bioinformatics, genomics, artificial intelligence, and pathology informatics.


Citations: 0

Genomic Insights Into the Evolution and Demographic History of the SARS-CoV-2 Omicron Variant: Population Genomics Approach.

JMIR bioinformatics and biotechnology

Pub Date : 2023-06-12 eCollection Date: 2023-01-01 DOI: 10.2196/40673

Kritika M Garg, Vinita Lamba, Balaji Chattopadhyay

Background: A thorough understanding of the patterns of genetic subdivision in a pathogen can provide crucial information that is necessary to prevent disease spread. For SARS-CoV-2, the availability of millions of genomes makes this task analytically challenging, and traditional methods for understanding genetic subdivision often fail.

Objective: The aim of our study was to use population genomics methods to identify the subtle subdivisions and demographic history of the Omicron variant, in addition to those captured by the Pango lineage.

Methods: We used a combination of an evolutionary network approach and multivariate statistical protocols to understand the subdivision and spread of the Omicron variant. We identified subdivisions within the BA.1 and BA.2 lineages and further identified the mutations associated with each cluster. We further characterized the overall genomic diversity of the Omicron variant and assessed the selection pressure for each of the genetic clusters identified.

Results: We observed concordant results, using two different methods to understand genetic subdivision. The overall pattern of subdivision in the Omicron variant was in broad agreement with the Pango lineage definition. Further, 1 cluster of the BA.1 lineage and 3 clusters of the BA.2 lineage revealed statistically significant signatures of selection or demographic expansion (Tajima's D < -2), suggesting the role of microevolutionary processes in the spread of the virus.

Conclusions: We provide an easy framework for assessing the genetic structure and demographic history of SARS-CoV-2, which can be particularly useful for understanding the local history of the virus. We identified important mutations that are advantageous to some lineages of Omicron and aid in the transmission of the virus. This is crucial information for policy makers, as preventive measures can be designed to mitigate further spread based on a holistic understanding of the variability of the virus and the evolutionary processes aiding its spread.
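The Tajima's D statistic cited in the Results compares two estimates of genetic diversity: the mean number of pairwise differences and the number of segregating sites scaled by a harmonic sum. A minimal sketch of the standard formula (Tajima, 1989) follows. It assumes gap-free, unambiguous, aligned sequences and is not the authors' pipeline; the function name `tajimas_d` and the toy alignments are illustrative.

```python
from collections import Counter
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for aligned sequences; returns None if no site segregates."""
    n = len(sequences)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    n_pairs = n * (n - 1) / 2
    seg_sites = 0
    pi = 0.0  # mean number of pairwise differences
    for column in zip(*sequences):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            differing_pairs = (n * n - sum(c * c for c in counts.values())) / 2
            pi += differing_pairs / n_pairs
    if seg_sites == 0:
        return None

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy alignments: rare variants only versus intermediate-frequency variants.
d_singletons = tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"])
d_balanced = tajimas_d(["AA", "AA", "TT", "TT"])
print(d_singletons, d_balanced)
```

An excess of rare variants pushes D below zero, which is the pattern associated with a selective sweep or demographic expansion; an excess of intermediate-frequency variants pushes it above zero.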


PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10331448/pdf/

Citations: 0

Decision of the Optimal Rank of a Nonnegative Matrix Factorization Model for Gene Expression Data Sets Utilizing the Unit Invariant Knee Method: Development and Evaluation of the Elbow Method for Rank Selection.

JMIR bioinformatics and biotechnology

Pub Date : 2023-06-06 DOI: 10.2196/43665

Emine Guven

Background: There is a great need to develop a computational approach to analyze and exploit the information contained in gene expression data. The recent utilization of nonnegative matrix factorization (NMF) in computational biology has demonstrated the capability to derive essential details from large amounts of data, in particular gene expression microarrays. A common problem in NMF is finding the proper rank (r), that is, the number of factors in the reduced representation, but no agreement exists on which technique is most appropriate to utilize for this purpose. Thus, various techniques have been suggested to select the optimal value of the factorization rank (r).

Objective: In this work, a new metric for rank selection is proposed based on the elbow method, which was methodically compared against the cophenetic metric.

Methods: To decide the optimum number rank (r), this study focused on the unit invariant knee (UIK) method of the NMF on gene expression data sets. Since the UIK method requires an extremum distance estimator that is eventually employed for inflection and identification of a knee point, the proposed method finds the first inflection point of the curvature of the residual sum of squares of the proposed algorithms using the UIK method on gene expression data sets as a target matrix.

Results: Computation was conducted for the UIK task using gene expression data of acute lymphoblastic leukemia and acute myeloid leukemia samples. Consequently, the distinct results of NMF were subjected to comparison on different algorithms. The proposed UIK method is easy to perform, fast, free of a priori rank value input, and does not require initial parameters that significantly influence the model's functionality.

Conclusions: This study demonstrates that the elbow method provides a credible prediction for both gene expression data and for precisely estimating simulated mutational processes data with known dimensions. The proposed UIK method is faster than conventional methods, including metrics utilizing the consensus matrix as a criterion for rank selection, while achieving significantly better computational efficiency without visual inspection of the curves. Finally, the suggested rank tuning method based on the elbow method for gene expression data is arguably theoretically superior to the cophenetic measure.
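The idea of picking a rank at the elbow of a residual-sum-of-squares (RSS) curve can be illustrated with a simple geometric heuristic: after rescaling both axes to [0, 1], choose the point farthest from the straight line joining the first and last points. This is a simplified stand-in and not the UIK estimator evaluated in the paper; the function name `knee_by_chord_distance` and the RSS values are hypothetical.

```python
from math import hypot


def knee_by_chord_distance(ranks, rss):
    """Return the rank whose (rank, RSS) point lies farthest from the chord
    joining the first and last points of the curve, after rescaling both
    axes to [0, 1] so the result does not depend on units."""
    if len(ranks) != len(rss) or len(ranks) < 3:
        raise ValueError("need at least 3 matching (rank, RSS) points")
    x_span = ranks[-1] - ranks[0]
    y_min, y_max = min(rss), max(rss)
    if x_span == 0 or y_max == y_min:
        raise ValueError("curve is degenerate")

    xs = [(r - ranks[0]) / x_span for r in ranks]
    ys = [(v - y_min) / (y_max - y_min) for v in rss]
    x0, y0, x1, y1 = xs[0], ys[0], xs[-1], ys[-1]
    chord_length = hypot(x1 - x0, y1 - y0)

    def distance(point):
        x, y = point
        return abs((x1 - x0) * (y0 - y) - (x0 - x) * (y1 - y0)) / chord_length

    best_index = max(range(len(ranks)), key=lambda i: distance((xs[i], ys[i])))
    return ranks[best_index]


# Hypothetical RSS values for NMF fits at ranks 1..8 (not from the study).
candidate_ranks = [1, 2, 3, 4, 5, 6, 7, 8]
rss_values = [100.0, 60.0, 30.0, 12.0, 10.0, 9.0, 8.5, 8.0]
knee = knee_by_chord_distance(candidate_ranks, rss_values)
print(knee)
```

Rescaling the axes before measuring distances is what makes the choice independent of the units of the RSS, which is the same motivation behind a unit-invariant knee.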


PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11135234/pdf/

Citations: 0

The Identification of Potential Drugs for Dengue Hemorrhagic Fever: Network-Based Drug Reprofiling Study.

JMIR bioinformatics and biotechnology

Pub Date : 2023-05-09 DOI: 10.2196/37306

Praveenkumar Kochuthakidiyel Suresh, Gnanasoundari Sekar, Kavya Mallady, Wan Suriana Wan Ab Rahman, Wan Nazatul Shima Shahidan, Gokulakannan Venkatesan

Background: Dengue fever can progress to dengue hemorrhagic fever (DHF), a more serious and occasionally fatal form of the disease. Indicators of serious disease arise about the time the fever begins to reduce (typically 3 to 7 days following symptom onset). There are currently no effective antivirals available. Drug repurposing is an emerging drug discovery process for rapidly developing effective DHF therapies. Through network pharmacology modeling, several US Food and Drug Administration (FDA)-approved medications have already been researched for various viral outbreaks.

Objective: We aimed to identify potentially repurposable drugs for DHF among existing FDA-approved drugs for viral attacks, symptoms of viral fevers, and DHF.

Methods: Using target identification databases (GeneCards and DrugBank), we identified human-DHF virus interacting genes and drug targets against these genes. We determined hub genes and potential drugs with a network-based analysis. We performed functional enrichment and network analyses to identify pathways, protein-protein interactions, tissues where the gene expression was high, and disease-gene associations.

Results: Analyzing virus-host interactions and therapeutic targets in the human genome network revealed 45 repurposable medicines. Hub network analysis of host-virus-drug associations suggested that aspirin, captopril, and rilonacept might efficiently treat DHF. Gene enrichment analysis supported these findings. According to a Mayo Clinic report, using aspirin in the treatment of dengue fever may increase the risk of bleeding complications, but several studies from around the world suggest that thrombosis is associated with DHF. The human interactome contains the genes prostaglandin-endoperoxide synthase 2 (PTGS2), angiotensin converting enzyme (ACE), and coagulation factor II, thrombin (F2), which have been documented to have a role in the pathogenesis of disease progression in DHF. Our analysis of most of the drugs targeting these genes showed that the hub gene module (human-virus-drug) was highly enriched in tissues associated with the immune system (P=7.29 × 10^-24) and human umbilical vein endothelial cells (P=1.83 × 10^-20); this group of tissues acts as an anticoagulant barrier between the vessel walls and blood. KEGG analysis showed an association with genes linked to cancer (P=1.13 × 10^-14) and the advanced glycation end products-receptor for advanced glycation end products signaling pathway in diabetic complications (P=3.52 × 10^-14), which indicates that DHF patients with diabetes and cancer are at risk of higher pathogenicity. Thus, gene-targeting medications may play a significant part in limiting or worsening the condition of DHF patients.

Conclusions: Aspirin is not usually prescribed for dengue fever because of bleeding complications, but it has been reported that aspirin at lower doses is beneficial in the management of diseases involving thrombosis. Drug repurposing is an emerging field in which clinical validation and dosage identification are required before a drug is prescribed. Further retrospective and collaborative international trials are essential for understanding the pathogenesis of this condition.
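The network-based step described in the Methods (finding hub genes and the drugs that target them) can be sketched with plain degree centrality. The graph below is a toy example with placeholder node names; it is not the study's interaction data, and the helpers `degree_by_node`, `top_hubs`, and `drugs_targeting` are hypothetical names for this illustration.

```python
from collections import defaultdict


def degree_by_node(edges):
    """Count how many edges touch each node of an undirected network."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


def top_hubs(edges, candidates, k):
    """Return the k candidate nodes with the highest degree (ties by name)."""
    degree = degree_by_node(edges)
    ranked = sorted(candidates, key=lambda node: (-degree.get(node, 0), node))
    return ranked[:k]


def drugs_targeting(edges, hubs, drugs):
    """Return the drugs that share an edge with at least one hub gene."""
    hub_set, drug_set = set(hubs), set(drugs)
    hits = set()
    for a, b in edges:
        if a in drug_set and b in hub_set:
            hits.add(a)
        elif b in drug_set and a in hub_set:
            hits.add(b)
    return sorted(hits)


# Toy virus-host-drug network with placeholder names (purely illustrative).
genes = ["GENE_A", "GENE_B", "GENE_C"]
drugs = ["drug_1", "drug_2", "drug_3"]
edges = [
    ("viral_protein_1", "GENE_A"),
    ("viral_protein_1", "GENE_B"),
    ("viral_protein_2", "GENE_A"),
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("drug_1", "GENE_A"),
    ("drug_2", "GENE_B"),
    ("drug_3", "GENE_C"),
]

hubs = top_hubs(edges, genes, k=2)
candidate_drugs = drugs_targeting(edges, hubs, drugs)
print(hubs, candidate_drugs)
```

Degree is only one of several centrality measures used for hub detection; it is chosen here because it needs no external library and makes the ranking easy to verify by hand.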

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11135182/pdf/

Citations: 0

The Differentially Expressed Genes Responsible for the Development of T Helper 9 Cells From T Helper 2 Cells in Various Disease States: Immuno-Interactomics Study.

JMIR bioinformatics and biotechnology

Pub Date : 2023-02-23 DOI: 10.2196/42421

Manoj Khokhar, Purvi Purohit, Ashita Gadwal, Sojit Tomo, Nitin Kumar Bajpai, Ravindra Shukla

Background: T helper (Th) 9 cells are a novel subset of Th cells that develop independently from Th2 cells and are characterized by the secretion of interleukin (IL)-9. Studies have suggested the involvement of Th9 cells in various diseases such as allergic and pulmonary diseases (eg, asthma, chronic obstructive airway disease, chronic rhinosinusitis, nasal polyps, and pulmonary hypoplasia), metabolic diseases (eg, acute leukemia, myelocytic leukemia, breast cancer, lung cancer, melanoma, pancreatic cancer), neuropsychiatric disorders (eg, Alzheimer disease), autoimmune diseases (eg, Graves disease, Crohn disease, colitis, psoriasis, systemic lupus erythematosus, systemic scleroderma, rheumatoid arthritis, multiple sclerosis, inflammatory bowel disease, atopic dermatitis, eczema), and infectious diseases (eg, tuberculosis, hepatitis). However, there is a dearth of information on their involvement in other metabolic, neuropsychiatric, and infectious diseases.

Objective: This study aims to identify significant differentially altered genes in the conversion of Th2 to Th9 cells, and their regulating microRNAs (miRs) from publicly available Gene Expression Omnibus data sets of the mouse model using in silico analysis to unravel various pathogenic pathways involved in disease processes.

Methods: Using differentially expressed genes (DEGs) identified from 2 publicly available data sets (GSE99166 and GSE123501) we performed functional enrichment and network analyses to identify pathways, protein-protein interactions, miR-messenger RNA associations, and disease-gene associations related to significant differentially altered genes implicated in the conversion of Th2 to Th9 cells.

Results: We extracted 260 common downregulated, 236 common upregulated, and 634 common DEGs from the expression profiles of data sets GSE99166 and GSE123501. Codifferentially expressed ILs, cytokines, receptors, and transcription factors (TFs) were enriched in 7 crucial Kyoto Encyclopedia of Genes and Genomes pathways and Gene Ontology. We constructed the protein-protein interaction network and predicted the top regulatory miRs involved in the Th2 to Th9 differentiation pathways. We also identified various metabolic, allergic and pulmonary, neuropsychiatric, autoimmune, and infectious diseases as well as carcinomas where the differentiation of Th2 to Th9 may play a crucial role.
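The "common" DEG counts above can be read as set operations on the two per-data-set DEG lists. The following is a minimal sketch of that idea, assuming each data set has already been reduced to a mapping from gene symbol to log2 fold change; the function name and the toy values are hypothetical and are not taken from the study.

```python
def common_degs(degs_a, degs_b):
    """Intersect two DEG tables (gene -> log2 fold change).

    Returns the genes shared by both tables, plus the subsets that are
    upregulated in both and downregulated in both.
    """
    shared = set(degs_a) & set(degs_b)
    up = {g for g in shared if degs_a[g] > 0 and degs_b[g] > 0}
    down = {g for g in shared if degs_a[g] < 0 and degs_b[g] < 0}
    return shared, up, down


# Toy fold changes for illustration only (not values from GSE99166 or GSE123501).
dataset_1 = {"Il9": 2.1, "Il4": -1.3, "Gata3": -0.8, "Foxp3": 1.0}
dataset_2 = {"Il9": 1.7, "Il4": -0.9, "Gata3": 0.5, "Jun": 1.2}

shared, up, down = common_degs(dataset_1, dataset_2)
print(sorted(shared), sorted(up), sorted(down))
```

Under this reading, a gene that changes in opposite directions in the two data sets counts as a common DEG but as neither commonly upregulated nor commonly downregulated, which is one plausible reason the total can exceed the sum of the two directional counts.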

Conclusions: This study identified hitherto unexplored possible associations between Th9 and disease states. Some important ILs, including CCL1 (chemokine [C-C motif] ligand 1), CCL20 (chemokine [C-C motif] ligand 20), IL-13, IL-4, IL-12A, and IL-9; receptors, including IL-12RB1, IL-4RA (interleukin 9 receptor alpha), CD53 (cluster of differentiation 53), CD6 (cluster of differentiation 6), CD5 (cluster of differentiation 5), CD83 (cluster of differentiation 83), CD197 (cluster of differentiation 197), IL-1RL1 (interleukin 1 receptor-like 1), CD101 (cluster of differentiation 101), CD96 (cluster of differentiation 96), CD72 (cluster of differentiation 72), CD7 (cluster of differentiation 7), CD152 (cytotoxic T lymphocyte-associated protein 4), CD38 (cluster of differentiation 38), CX3CR1 (chemokine [C-X3-C motif] receptor 1), CTLA2A (cytotoxic T lymphocyte-associated protein 2 alpha), CTLA 28, and CD196 (cluster of differentiation 196); and TFs, including FOXP3 (forkhead box P3), IRF8 (interferon regulatory factor 8), FOXP2 (forkhead box P2), RORA (RAR-related orphan receptor alpha), AHR (aryl hydrocarbon receptor), MAF (avian musculoaponeurotic fibrosarcoma oncogene homolog), SMAD6 (SMAD family member 6), JUN (Jun proto-oncogene), JAK2 (Janus kinase 2), EP300 (E1A binding protein p300), ATF6 (activating transcription factor 6), BTAF1 (B-TFIID TATA-box binding protein associated factor 1), BAFT (basic leucine zipper transcription factor), NOTCH1 (neurogenic locus notch homolog protein 1), GATA3 (GATA binding protein 3), SATB1 (special AT-rich sequence binding protein 1), and PPARG (peroxisome proliferator-activated receptor gamma), were able to identify significantly differentially altered genes in the conversion of Th2 to Th9 cells. We identified some common miRs that could target the DEGs. The scarcity of studies on the role of Th9 in metabolic diseases highlights a gap in the field. Our study provides a rationale for exploring the role of Th9 in various metabolic disorders, such as diabetes mellitus, diabetic nephropathy, hypertension, ischemic stroke, steatohepatitis, liver fibrosis, obesity, adenocarcinoma, glioblastoma and glioma, malignant neoplasm of the stomach, melanoma, neuroblastoma, osteosarcoma, pancreatic carcinoma, prostate carcinoma, and stomach carcinoma.


Citations: 0

SARS-CoV-2 Omicron Variant Genomic Sequences and Their Epidemiological Correlates Regarding the End of the Pandemic: In Silico Analysis.

JMIR bioinformatics and biotechnology

Pub Date : 2023-01-10 eCollection Date: 2023-01-01 DOI: 10.2196/42700

Ashutosh Kumar, Adil Asghar, Himanshu N Singh, Muneeb A Faiq, Sujeet Kumar, Ravi K Narayan, Gopichand Kumar, Prakhar Dwivedi, Chetan Sahni, Rakesh K Jha, Maheswari Kulandhasamy, Pranav Prasoon, Kishore Sesham, Kamla Kant, Sada N Pandey

Background: Emergence of the new SARS-CoV-2 variant B.1.1.529 worried health policy makers worldwide due to a large number of mutations in its genomic sequence, especially in the spike protein region. The World Health Organization (WHO) designated this variant as a global variant of concern (VOC), which was named "Omicron." Following Omicron's emergence, a surge of new COVID-19 cases was reported globally, primarily in South Africa.

Objective: The aim of this study was to understand whether Omicron had an epidemiological advantage over existing variants.

Methods: We performed an in silico analysis of the complete genomic sequences of Omicron available on the Global Initiative on Sharing All Influenza Data (GISAID) database to analyze the functional impact of the mutations present in this variant on virus-host interactions in terms of viral transmissibility, virulence/lethality, and immune escape. In addition, we performed a correlation analysis of the relative proportion of the genomic sequences of specific SARS-CoV-2 variants (in the period from October 1 to November 29, 2021) with matched epidemiological data (new COVID-19 cases and deaths) from South Africa.

Results: Compared with the current list of global VOCs/variants of interest (VOIs), as per the WHO, Omicron bears more sequence variation, specifically in the spike protein and host receptor-binding motif (RBM). Omicron showed the closest nucleotide and protein sequence homology with the Alpha variant for the complete sequence and the RBM. The mutations were found to be primarily condensed in the spike region (n=28-48) of the virus. Further mutational analysis showed enrichment for the mutations decreasing binding affinity to angiotensin-converting enzyme 2 receptor and receptor-binding domain protein expression, and for increasing the propensity of immune escape. An inverse correlation of Omicron with the Delta variant was noted (r=-0.99, P<.001; 95% CI -0.99 to -0.97) in the sequences reported from South Africa postemergence of the new variant, subsequently showing a decrease. There was a steep rise in new COVID-19 cases in parallel with the increase in the proportion of Omicron isolates since the report of the first case (74%-100%). By contrast, the incidence of new deaths did not increase (r=-0.04, P>.05; 95% CI -0.52 to 0.58).
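The correlation results above (for example, r=-0.99 between the Omicron and Delta proportions) can be illustrated with a small calculation. The abstract does not state which coefficient was used; the sketch below assumes a Pearson correlation, and the proportions are made-up toy values, not the South African data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy weekly proportions of sequenced isolates (hypothetical values).
omicron = [0.0, 0.1, 0.4, 0.8, 1.0]
delta = [1.0, 0.9, 0.6, 0.2, 0.0]

r = pearson_r(omicron, delta)
print(round(r, 2))
```

Because the toy Delta proportions are exactly one minus the Omicron proportions, the coefficient here is -1 up to floating-point error; real surveillance data with other variants present would give a value closer to, but not exactly, -1.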

Conclusions: In silico analysis of viral genomic sequences suggests that the Omicron variant has more remarkable immune-escape ability than existing VOCs/VOIs, including Delta, but lower virulence/lethality than other reported variants. The higher power for immune escape for Omicron was a likely reason for the resurgence in COVID-19 cases and its rapid rise as the globally dominant strain. Being more infectious but less lethal than the existing variants, Omicron could have plausibly led to widespread unnoticed new, repeated, and vaccine breakthrough infections, raising the population-level immune barrier against the emergence of new lethal variants. The Omicron variant could thus pave the way for the end of the pandemic.


Citations: 0

Mutational Patterns Observed in SARS-CoV-2 Genomes Sampled From Successive Epochs Delimited by Major Public Health Events in Ontario, Canada: Genomic Surveillance Study.

JMIR bioinformatics and biotechnology

Pub Date : 2022-12-22 DOI: 10.2196/42243

David Chen, Gurjit S Randhawa, Maximillian Pm Soltysiak, Camila Pe de Souza, Lila Kari, Shiva M Singh, Kathleen A Hill

Background: The emergence of SARS-CoV-2 variants with mutations associated with increased transmissibility and virulence is a public health concern in Ontario, Canada. Characterizing how the mutational patterns of the SARS-CoV-2 genome have changed over time can shed light on the driving factors, including selection for increased fitness and host immune response, that may contribute to the emergence of novel variants. Moreover, the study of SARS-CoV-2 in the microcosm of Ontario, Canada can reveal how different province-specific public health policies over time may be associated with observed mutational patterns as a model system.

Objective: This study aimed to perform a comprehensive analysis of single base substitution (SBS) types, counts, and genomic locations observed in SARS-CoV-2 genomic sequences sampled in Ontario, Canada. Comparisons of mutational patterns were conducted between sequences sampled during 4 different epochs delimited by major public health events to track the evolution of the SARS-CoV-2 mutational landscape over 2 years.

Methods: In total, 24,244 SARS-CoV-2 genomic sequences and associated metadata sampled in Ontario, Canada from January 1, 2020, to December 31, 2021, were retrieved from the Global Initiative on Sharing All Influenza Data database. Sequences were assigned to 4 epochs delimited by major public health events based on the sampling date. SBSs from each SARS-CoV-2 sequence were identified relative to the MN996528.1 reference genome. Catalogues of SBS types and counts were generated to estimate the impact of selection in each open reading frame, and identify mutation clusters. The estimation of mutational fitness over time was performed using the Augur pipeline.

Results: The biases in SBS types and proportions observed support previous reports of host antiviral defense activity involving the SARS-CoV-2 genome. There was an increase in U>C substitutions associated with adenosine deaminase acting on RNA (ADAR) activity uniquely observed during Epoch 4. The burden of novel SBSs observed in SARS-CoV-2 genomic sequences was the greatest in Epoch 2 (median 5), followed by Epoch 3 (median 4). Clusters of SBSs were observed in the spike protein open reading frame, ORF1a, and ORF3a. The high proportion of nonsynonymous SBSs and increasing dN/dS metric (ratio of nonsynonymous to synonymous mutations in a given open reading frame) to above 1 in Epoch 4 indicate positive selection of the spike protein open reading frame.
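The dN/dS idea mentioned above rests on classifying each substitution in an open reading frame as synonymous or nonsynonymous. The sketch below shows that classification with the standard genetic code and then takes the raw count ratio described in the parenthetical. It is an illustration under stated assumptions, not the authors' pipeline: the function names and the toy sequence are hypothetical, the sequence uses the DNA alphabet (T rather than U), and a full dN/dS estimate would additionally normalize by the numbers of nonsynonymous and synonymous sites.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_sbs(orf, pos, alt):
    """Classify a single base substitution at 0-based position `pos` of a coding sequence."""
    start = pos - pos % 3
    ref_codon = orf[start:start + 3]
    offset = pos % 3
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    same_aa = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same_aa else "nonsynonymous"


def nonsyn_to_syn_ratio(orf, substitutions):
    """Raw ratio of nonsynonymous to synonymous substitution counts (no site normalization)."""
    calls = [classify_sbs(orf, pos, alt) for pos, alt in substitutions]
    n, s = calls.count("nonsynonymous"), calls.count("synonymous")
    return n / s if s else float("inf")


# Toy ORF (Met-Ala-Lys) and hypothetical substitutions as (position, new base).
orf = "ATGGCTAAA"
subs = [(5, "C"), (3, "A"), (8, "G"), (6, "G"), (1, "C")]
ratio = nonsyn_to_syn_ratio(orf, subs)
print(ratio)
```

A ratio above 1 means nonsynonymous changes outnumber synonymous ones, which is the pattern the abstract interprets as positive selection in the spike open reading frame.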

Conclusions: Quantitative analysis of the mutational patterns of the SARS-CoV-2 genome in the microcosm of Ontario, Canada within early consecutive epochs of the pandemic tracked the mutational dynamics in the context of public health events that instigate significant shifts in selection and mutagenesis. Continued genomic surveillance of emergent variants will be useful for the design of public health policies in response to the evolving COVID-19 pandemic.


Citations: 0

The Utilization of Heart Rate Variability for Autonomic Nervous System Assessment in Healthy Pregnant Women: Systematic Review.

JMIR bioinformatics and biotechnology

Pub Date : 2022-11-17 DOI: 10.2196/36791

Zahra Sharifiheris, Amir Rahmani, Joseph Onwuka, Miriam Bender

Background: The autonomic nervous system (ANS) plays a central role in pregnancy-induced adaptations, and failure in the required adaptations is associated with adverse neonatal and maternal outcomes. Mapping maternal ANS function in healthy pregnancy may help to understand ANS function.

Objective: This study aimed to systematically review studies on the use of heart rate variability (HRV) monitoring to measure ANS function during pregnancy and determine whether specific HRV patterns representing normal ANS function have been identified during pregnancy.

Methods: The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline was used to guide the systematic review. The CINAHL, PubMed, SCOPUS, and Web of Science databases were searched to comprehensively identify articles without a time span limitation. Studies were included if they assessed HRV in healthy pregnant individuals at least once during pregnancy or labor, with or without a comparison group (eg, complicated pregnancy). Quality assessment of the included literature was performed using the National Heart, Lung, and Blood Institute (NHLBI) tool. A narrative synthesis approach was used for data extraction and analysis, as the articles were heterogeneous in scope, approaches, methods, and variables assessed, which precluded a traditional meta-analysis.

Results: After full screening, 8 studies met the inclusion criteria. In 88% (7/8) of the studies, HRV was measured using electrocardiogram and operationalized in 3 different ways: linear frequency domain (FD), linear time domain (TD), and nonlinear methods. FD was measured in all (8/8), TD in 75% (6/8), and nonlinear methods in 25% (2/8) of the studies. The assessment duration varied from 5 minutes to 24 hours. TD indexes and most of the FD indexes decreased from the first to the third trimesters in the majority (5/7, 71%) of the studies. Of the FD indexes, low frequency (LF [nu]) and the LF/high frequency (HF) ratio showed an ascending trend from early to late pregnancy, indicating an increase in sympathetic activity toward the end of the pregnancy.
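To make the time-domain and frequency-domain indexes above concrete, the sketch below computes two widely used time-domain measures (SDNN and RMSSD) from a series of RR intervals, plus the normalized low-frequency power and the LF/HF ratio from band powers. The abstract does not name the specific TD indexes the reviewed studies used, so SDNN and RMSSD are assumptions chosen for illustration, LF (nu) is taken as LF/(LF+HF) x 100 (one common convention), and all numbers are toy values.

```python
import math


def sdnn(rr_ms):
    """Sample standard deviation of RR intervals (ms)."""
    n = len(rr_ms)
    mean = sum(rr_ms) / n
    return math.sqrt(sum((x - mean) ** 2 for x in rr_ms) / (n - 1))


def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def lf_nu(lf_power, hf_power):
    """Low-frequency power in normalized units, assuming LF / (LF + HF) * 100."""
    return 100 * lf_power / (lf_power + hf_power)


def lf_hf_ratio(lf_power, hf_power):
    """Ratio of low-frequency to high-frequency power."""
    return lf_power / hf_power


# Toy RR intervals in milliseconds and toy band powers in ms^2 (hypothetical).
rr = [800, 810, 790, 800]
print(sdnn(rr), rmssd(rr), lf_nu(600, 400), lf_hf_ratio(600, 400))
```

In this framing, the reported trend corresponds to SDNN-like and RMSSD-like values falling across trimesters while LF (nu) and LF/HF rise.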

Conclusions: We identified 3 HRV operationalization methods along with potentially indicative HRV patterns. However, we found no justification for the selection of measurement tools, measurement time frames, and operationalization methods, which threaten the generalizability and reliability of pattern findings. More research is needed to determine the criteria and methods for determining HRV patterns corresponding to ANS functioning in healthy pregnant persons.



Citations: 0