Choice of Metric Divergence in Genome Sequence Comparison
Authors: Soumen Ghosh, Jayanta Pal, Bansibadan Maji, Carlo Cattani, Dilip Kumar Bhattacharya
Journal: The Protein Journal, vol. 43, no. 2, pp. 259-273
Publication date: 2024-03-16
DOI: 10.1007/s10930-024-10189-x
URL: https://link.springer.com/article/10.1007/s10930-024-10189-x
Impact factor: 1.9; JCR: Q4 (Biochemistry & Molecular Biology)
Citations: 0
Abstract
The paper introduces a novel probability descriptor for genome sequence comparison, based on a generalized form of the Jensen-Shannon divergence. The divergence belongs to a one-parameter family whose parameter takes fractional values up to a maximum of one half. Using this divergence as a distance measure, a distance matrix is computed for the new probability descriptor, and phylogenetic trees are built from it by the neighbor-joining method. The parameter is first set to one half for various species. To assess the impact of parameter variation, trees are then drawn at different parameter values (one half, one fourth, one eighth). The measurement scale decreases as the parameter value increases, and higher similarity accuracy corresponds to lower scale values, so the highest accuracy is obtained at the maximum parameter value of one half. Comparative analyses against previous methods, evaluated by Symmetric Distance (SD) values and rationalized perception, consistently favor the present approach. The results at the maximum parameter value are the most accurate, which supports the method's efficacy relative to earlier approaches.
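The pipeline the abstract describes (probability descriptor, parameterized divergence, distance matrix) can be illustrated with a minimal sketch. Everything specific in it is an assumption, since the abstract gives neither the descriptor nor the exact divergence formula: normalized k-mer frequencies stand in for the paper's descriptor, the weighted Jensen-Shannon divergence H(αP + (1−α)Q) − αH(P) − (1−α)H(Q) with 0 < α ≤ 1/2 stands in for its one-parameter family, and the two argument orders are averaged to obtain a symmetric matrix. The function names are hypothetical, and this stand-in should not be expected to reproduce the scale behavior reported in the abstract.

```python
import math
from collections import Counter
from itertools import product


def kmer_distribution(seq, k=2, alphabet="ACGT"):
    """Relative k-mer frequencies (a stand-in descriptor, not the paper's)."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)  # ignores k-mers with other symbols
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[m] / total for m in kmers]


def entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def weighted_jsd(p, q, alpha=0.5):
    """Weighted Jensen-Shannon divergence (assumed form of the family).

    alpha = 0.5 gives the ordinary Jensen-Shannon divergence.
    """
    if not 0 < alpha <= 0.5:
        raise ValueError("alpha must lie in (0, 0.5]")
    mix = [alpha * a + (1 - alpha) * b for a, b in zip(p, q)]
    return entropy(mix) - alpha * entropy(p) - (1 - alpha) * entropy(q)


def distance_matrix(seqs, alpha=0.5, k=2):
    """Pairwise distances; averaging both orders is an assumption made here
    because the weighted form is asymmetric when alpha != 0.5."""
    dists = [kmer_distribution(s, k) for s in seqs]
    n = len(dists)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            v = 0.5 * (weighted_jsd(dists[i], dists[j], alpha)
                       + weighted_jsd(dists[j], dists[i], alpha))
            d[i][j] = d[j][i] = max(v, 0.0)  # clamp tiny negative rounding
    return d


if __name__ == "__main__":
    toy = ["ACGTACGTAC", "ACGTTCGTAC", "GGGGCCCCGG"]  # made-up sequences
    for alpha in (0.5, 0.25, 0.125):
        print(alpha, distance_matrix(toy, alpha=alpha))
```

The resulting matrix is the kind of input a neighbor-joining implementation expects; tree construction itself is left out of the sketch.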
About the journal:
The Protein Journal (formerly the Journal of Protein Chemistry) publishes original research work on all aspects of proteins and peptides. These include studies concerned with covalent or three-dimensional structure determination (X-ray, NMR, cryoEM, EPR/ESR, optical methods, etc.), computational aspects of protein structure and function, protein folding and misfolding, assembly, genetics, evolution, proteomics, molecular biology, protein engineering, protein nanotechnology, protein purification and analysis and peptide synthesis, as well as the elucidation and interpretation of the molecular bases of biological activities of proteins and peptides. We accept original research papers, reviews, mini-reviews, hypotheses, opinion papers, and letters to the editor.