Functional and Structural Characterization of Pathogenicity of Human Arginine-Histidine Variants
Pub Date: 2025-12-20 DOI: 10.1142/s2737416526400041
Nirav Modha, Emil Alexov
Missense variants that change arginine to histidine or histidine to arginine (R>H; H>R) preserve the positive charge yet alter pH-dependent behavior near neutrality, producing mutation type-specific but context-dependent effects on protein function that can be pathogenic. To reveal the factors underlying pathogenicity, we assembled high-confidence human R>H and H>R variants from ClinVar annotated as pathogenic or benign. For both R>H and H>R, pathogenic variants were strongly enriched in cores and ordered regions, whereas benign variants were found on surfaces and in coils. Secondary structure analysis showed mutation type specificity: R>H pathogenic variants were enriched in helices, while H>R pathogenic variants were enriched in β-strands. Regarding the pH optimum of activity, most R>H and H>R variants fell in physiological or near-physiological pH ranges, but R>H benign variants were more frequent in the neutral/physiological pH bin, whereas H>R pathogenic variants were overrepresented in that same bin. The last observation is consistent with histidine's pKa being tunable near the physiological range, while arginine's side chain introduces a permanent positive charge, so an H>R substitution eliminates the wild-type pH dependence. Functional analyses showed that pathogenic variants are overrepresented in binding/interface-heavy proteins (e.g., transcription factors) and in selected enzymatic classes (e.g., oxidoreductases, ion channels, transporters, ligases). Interestingly, the vast majority of proteins in our dataset carried either R>H or H>R mutations, but not both. Very few proteins harbored both variant types, and in those the two variants typically shared the same annotation, either both pathogenic or both benign.
Journal of Computational Biophysics and Chemistry. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12858162/pdf/
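The pH argument in the abstract above can be made concrete with the Henderson-Hasselbalch relation. The sketch below is illustrative only and is not taken from the paper: the side-chain pKa values (about 6.5 for histidine, about 12.5 for arginine) are textbook approximations chosen here as assumptions, and real pKa values in proteins shift with the local environment.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch: fraction of a basic group in its protonated (charged) form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Assumed, approximate side-chain pKa values (not from the paper).
PKA_HIS = 6.5
PKA_ARG = 12.5

if __name__ == "__main__":
    # Compare the charged fraction of each side chain across a pH range around neutrality.
    for ph in (6.0, 6.5, 7.0, 7.4, 8.0):
        his = fraction_protonated(ph, PKA_HIS)
        arg = fraction_protonated(ph, PKA_ARG)
        print(f"pH {ph:.1f}: His charged {his:.2f}, Arg charged {arg:.5f}")
```

Under these assumed values the charged fraction of histidine changes substantially between pH 6 and 8, while arginine stays essentially fully charged, which is the contrast the abstract draws for H>R substitutions.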
PKAD-R: curated, redesigned and expanded database of experimental pKa values in proteins
Pub Date: 2025-11-01 Epub Date: 2025-04-24 DOI: 10.1142/s2737416525500164
Ada Y Chen, Shailesh K Panday, Kaoru Ri, Emil Alexov, Bernard R Brooks, Ana Damjanovic
Knowing the pKa values of ionizable protein residues is critical for understanding fundamental protein properties such as structure, function and interactions. We present a new version of PKAD, named PKAD-R, a curated database of experimentally determined protein pKa values. The database builds upon its predecessors, PKAD and PKAD-2, with significant updates and improvements through: (1) careful data curation, to remove incorrect entries and consolidate redundant entries by offering alternative structures and pKa values for each unique residue; (2) database redesign, to enhance usability by adding information such as protein and species names, detailed notes, and sequence identity; and (3) database expansion, through identification of 214 new (128 non-redundant) pKa entries from the literature. The database currently contains 877 unique pKa entries for wild-type structures and 147 for mutant structures, and we aim to keep updating it with new entries. The PKAD-R database is available as a stand-alone downloadable file and through web servers. It is designed to provide both a set of pKa entries for unique residues suitable for machine learning applications and modularity, by providing alternative pKa values and structures so that the user can decide which entries to include.
Journal of Computational Biophysics and Chemistry, vol. 24, no. 9, pp. 1189-1203. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12520281/pdf/
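The idea of one entry per unique residue with alternative structures and pKa values attached can be sketched as a simple grouping step. This is a hypothetical illustration, not the PKAD-R schema or code: the field names (`protein`, `chain`, `residue`, `position`, `structure`, `pka`) and the toy records are assumptions made up for the example.

```python
from collections import defaultdict


def consolidate(entries):
    """Group measurements by residue identity, keeping every alternative structure/pKa pair."""
    grouped = defaultdict(list)
    for entry in entries:
        # Hypothetical key: what makes a residue "unique" is an assumption here.
        key = (entry["protein"], entry["chain"], entry["residue"], entry["position"])
        grouped[key].append({"structure": entry["structure"], "pka": entry["pka"]})
    return dict(grouped)


# Made-up placeholder records, not real database values.
TOY_ENTRIES = [
    {"protein": "toy_protein", "chain": "A", "residue": "HIS", "position": 12,
     "structure": "toy_structure_1", "pka": 6.4},
    {"protein": "toy_protein", "chain": "A", "residue": "HIS", "position": 12,
     "structure": "toy_structure_2", "pka": 6.6},
    {"protein": "toy_protein", "chain": "A", "residue": "ASP", "position": 30,
     "structure": "toy_structure_1", "pka": 3.9},
]

if __name__ == "__main__":
    for key, alternatives in consolidate(TOY_ENTRIES).items():
        print(key, alternatives)
```

A user who wants one value per residue for machine learning could then pick a single alternative per key, while keeping the full list available for other analyses.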
Mayer-Homology Learning Prediction of Protein-Ligand Binding Affinities
Pub Date: 2025-03-01 Epub Date: 2024-11-06 DOI: 10.1142/s2737416524500613
Hongsong Feng, Li Shen, Jian Liu, Guo-Wei Wei
Artificial intelligence-assisted drug design is revolutionizing the pharmaceutical industry. Effective molecular features are crucial for accurate machine learning predictions, and advanced mathematics plays a key role in designing these features. Persistent homology theory, which equips topological invariants with persistence, provides valuable insights into molecular structures. Standard homology theory is based on a differential rule for the boundary operator, which satisfies $d^2 = 0$. Our recent work extended this rule by employing Mayer homology with generalized differentials that satisfy $d^N = 0$ for $N \geq 2$, leading to the development of persistent Mayer homology (PMH) theory and richer topological information across various scales. In this study, we use PMH to create a novel multiscale topological vectorization for molecular representation, offering valuable tools for descriptive and predictive analyses of molecular data and for machine learning prediction. Specifically, benchmark tests on established protein-ligand datasets, including PDBbind-v2007, PDBbind-v2013, and PDBbind-v2016, demonstrate the superior performance of our Mayer homology models in predicting protein-ligand binding affinities.
Journal of Computational Biophysics and Chemistry, vol. 24, no. 2, pp. 253-266. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12463301/pdf/
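The generalized rule in the abstract above, a differential whose N-th power vanishes, can be checked numerically on a single simplex. One standard way to build such an operator is to weight each face map by a power of a primitive N-th root of unity; whether this matches the authors' exact definition is an assumption here, and the function names are hypothetical. For N = 2 the weights reduce to the usual alternating signs.

```python
import cmath
from collections import defaultdict


def n_boundary(chain, n_order):
    """Apply d = sum_i q**i * (drop vertex i), with q a primitive n_order-th root of unity."""
    q = cmath.exp(2j * cmath.pi / n_order)
    out = defaultdict(complex)
    for simplex, coeff in chain.items():
        if len(simplex) == 1:
            continue  # vertices are sent to zero (no augmentation)
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]
            out[face] += coeff * q ** i
    return dict(out)


def apply_power(chain, n_order, times):
    """Apply the generalized boundary operator `times` times."""
    for _ in range(times):
        chain = n_boundary(chain, n_order)
    return chain


def max_abs_coeff(chain):
    """Largest coefficient magnitude in a chain (0.0 for the empty chain)."""
    return max((abs(c) for c in chain.values()), default=0.0)


if __name__ == "__main__":
    simplex = {(0, 1, 2, 3, 4): 1.0 + 0j}
    for n_order in (2, 3, 4):
        residual = max_abs_coeff(apply_power(simplex, n_order, n_order))
        print(f"N={n_order}: largest coefficient of d^N is {residual:.2e}")
```

The residuals are zero only up to floating-point rounding, since the root of unity is represented approximately. Lower powers of the operator, such as d^2 when N = 3, are generally nonzero, which is what allows extra information beyond the standard case.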
Graph-Based Bidirectional Transformer Decision Threshold Adjustment Algorithm for Class-Imbalanced Molecular Data
Pub Date: 2024-12-01 Epub Date: 2024-09-19 DOI: 10.1142/s2737416524500479
Nicole Hayes, Ekaterina Merkurjev, Guo-Wei Wei
Data sets with imbalanced class sizes, where one class is much smaller than the others, occur very often in many applications, including those with biological foundations such as disease diagnosis and drug discovery. It is therefore extremely important to be able to identify data elements of classes of various sizes, as a failure to do so can result in heavy costs. Nonetheless, many data classification procedures do not perform well on imbalanced data sets because they often fail to detect elements belonging to underrepresented classes. In this work, we propose the BTDT-MBO algorithm, which incorporates Merriman-Bence-Osher (MBO) approaches and a bidirectional transformer, as well as distance correlation and decision threshold adjustments, for data classification tasks on highly imbalanced molecular data sets, where the sizes of the classes vary greatly. The proposed technique not only adjusts the classification threshold of the MBO algorithm to help deal with class imbalance, but also uses a bidirectional transformer procedure based on an attention mechanism for self-supervised learning. In addition, the model uses distance correlation as a weight function for the similarity graph-based framework on which the adjusted MBO algorithm operates. The proposed method is validated on six molecular data sets and compared with other related techniques. The computational experiments show that the proposed technique is superior to competing approaches even at a high class imbalance ratio.
Journal of Computational Biophysics and Chemistry, vol. 23, no. 10, pp. 1339-1358. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12467357/pdf/
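The decision-threshold adjustment mentioned in the abstract above can be illustrated in isolation. The sketch below is a generic toy example, not the authors' BTDT-MBO algorithm: the scores, labels, and the two thresholds are made-up values, and the helper names are hypothetical. It only shows why lowering the threshold can recover minority-class elements that a default cutoff of 0.5 misses.

```python
def classify(scores, threshold):
    """Assign class 1 (minority) when the score reaches the threshold, else class 0."""
    return [1 if s >= threshold else 0 for s in scores]


def recall(labels, preds):
    """Fraction of true minority-class elements that were detected."""
    positives = sum(labels)
    hits = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    return hits / positives if positives else 0.0


def false_positives(labels, preds):
    """Number of majority-class elements wrongly flagged as minority."""
    return sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)


# Made-up toy data: 8 majority-class and 2 minority-class elements.
LABELS = [0] * 8 + [1] * 2
SCORES = [0.05, 0.10, 0.15, 0.20, 0.20, 0.25, 0.30, 0.45, 0.35, 0.60]

if __name__ == "__main__":
    for threshold in (0.5, 0.33):
        preds = classify(SCORES, threshold)
        print(f"threshold {threshold}: recall {recall(LABELS, preds):.2f}, "
              f"false positives {false_positives(LABELS, preds)}")
```

On this toy data the lower threshold detects both minority elements at the cost of one false positive, which is the usual trade-off a threshold adjustment has to balance.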
The Juxtaposition of Allosteric and Catalytic Inhibition in PLK1: Tradeoff for Chemotherapy and Thermodynamic Profiles of KBJK557 and BI 6727
Pub Date: 2023-12-15 DOI: 10.1142/s2737416523500680
Ernest Oduro-Kwateng, Ali H. Rabbad, M. E. Soliman
Journal of Computational Biophysics and Chemistry.
Computational approach to identify the Key Genes for Invasive Lobular Carcinoma (ILC) Diagnosis and Therapies
Pub Date: 2023-12-15 DOI: 10.1142/s2737416523500692
S. Anitha, S. Nandhini, D. Premnath, M. Indiraleka
Journal of Computational Biophysics and Chemistry.
Molecular Dynamics Study on the Binding Characteristics and Transport Mechanism of Polysaccharides with Different Molecular Weights in Camellia Oleifera Abel
Pub Date: 2023-12-08 DOI: 10.1142/s2737416523500679
Jihang Zhai, Fangfang Fan, Chaojie Wang, Zhiyang Zhang, Sheng Zhang, Yuan Zhao
Journal of Computational Biophysics and Chemistry.
Fragment-Based Protein Structure Prediction, Where Are We Now?
Pub Date: 2023-12-08 DOI: 10.1142/s2737416523300018
Qudsia Noor, Raheem Kayode, Rizwan Riaz, Areeba Siddiqui, Aiza Hassan Mirza, Abdul Rauf Siddiqi
Journal of Computational Biophysics and Chemistry.
Computational studies of multi-target directed ligands against acetylcholinesterase, butyrylcholinesterase and amyloid beta as potential anti-Alzheimer's agents
Pub Date: 2023-12-01 DOI: 10.1142/s2737416523500667
Neha Maurya, Mareechika Gaddam, Abha Sharma
Journal of Computational Biophysics and Chemistry.
Antifibromyalgic activity of Phytomolecule Niranthin: In-Vivo analysis, Molecular docking, Dynamics and DFT
Pub Date: 2023-12-01 DOI: 10.1142/s2737416523500655
A. Chopade, Vikram H. Potdar, Suraj N. Mali, Susmita Yadav, Anima Pandey, Chin-Hung Lai, Essa M. Saied, Oberdan Oliveira Ferreira, M. D. de Oliveira, S. S. Gurav, Eloísa Helena de Aguiar Andrade
Journal of Computational Biophysics and Chemistry.