
Latest articles from Database: The Journal of Biological Databases and Curation

Correction to: GymnoTOA-db: a database and application to optimize functional annotation in gymnosperms.
IF 3.6; Zone 4 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-07-08; DOI: 10.1093/database/baaf041
Citations: 0
CAS: enhancing implicit constrained data augmentation with semantic enrichment for biomedical relation extraction and beyond.
IF 3.4; Zone 4 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-07-03; DOI: 10.1093/database/baaf025
Fang-Yi Su, Gia-Han Ngo, Ben Phan, Jung-Hsien Chiang

Biomedical relation extraction often involves datasets with implicit constraints, where structural, syntactic, or semantic rules must be strictly preserved to maintain data integrity. Traditional data augmentation techniques struggle in these scenarios, as they risk violating domain-specific constraints. To address these challenges, we propose CAS (Constrained Augmentation and Semantic-Quality), a novel framework designed for constrained datasets. CAS employs large language models to generate diverse data variations while adhering to predefined rules, and it integrates the SemQ Filter. This self-evaluation mechanism ensures the quality and consistency of augmented data by filtering out noisy or semantically incongruent samples. Although CAS is primarily designed for biomedical relation extraction, its versatile design extends its applicability to tasks with implicit constraints, such as code completion, mathematical reasoning, and information retrieval. Through extensive experiments across multiple domains, CAS demonstrates its ability to enhance model performance by maintaining structural fidelity and semantic accuracy in augmented data. These results highlight the potential of CAS not only in advancing biomedical NLP research but also in addressing data augmentation challenges in diverse constrained-task settings within natural language processing. Database URL: https://github.com/ngogiahan149/CAS.
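The generate-then-filter idea in this abstract can be illustrated with a small Python sketch. It is not the CAS implementation: the constraint used here (every entity mention must survive verbatim), the `toy_score` function, and the threshold are all assumptions made for illustration, and the scorer merely stands in for the LLM-based SemQ Filter the abstract describes.

```python
from typing import Callable, Iterable, List


def keeps_entities(candidate: str, entities: Iterable[str]) -> bool:
    """Constraint check (assumed): every entity mention must survive verbatim."""
    return all(entity in candidate for entity in entities)


def filter_augmented(
    candidates: List[str],
    entities: List[str],
    quality_score: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Keep candidates that satisfy the constraint and score at or above threshold."""
    return [
        c for c in candidates
        if keeps_entities(c, entities) and quality_score(c) >= threshold
    ]


def toy_score(sentence: str) -> float:
    # Hypothetical placeholder for a semantic-quality scorer:
    # it only checks whether a relation verb is still present.
    return 1.0 if ("inhibits" in sentence or "blocks" in sentence) else 0.0


entities = ["aspirin", "COX-1"]
candidates = [
    "aspirin irreversibly inhibits COX-1",  # keeps entities and relation
    "the drug inhibits COX-1",              # drops the entity 'aspirin'
    "aspirin and COX-1 were measured",      # keeps entities, loses relation
]
kept = filter_augmented(candidates, entities, toy_score)
print(kept)
```

Separating the hard constraint check from the soft quality score keeps rule violations from being traded off against fluency; a real scorer would replace `toy_score`.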

Citations: 0
Protein Sequence Analysis landscape: A Systematic Review of Task Types, Databases, Datasets, Word Embeddings Methods, and Language Models.
IF 3.4; Zone 4 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-05-30; DOI: 10.1093/database/baaf027
Muhammad Nabeel Asim, Tayyaba Asif, Faiza Hassan, Andreas Dengel

Protein sequence analysis examines the order of amino acids within protein sequences to unlock a wealth of knowledge about biological processes and genetic disorders. It helps forecast disease susceptibility by finding unique protein signatures, or biomarkers, that are linked to particular disease states. Protein sequence analysis through wet-lab experiments is expensive, time-consuming, and error-prone. To facilitate large-scale proteomics sequence analysis, the biological community is striving to use AI to move from wet-lab experiments to computer-aided applications. However, proteomics and AI are two distinct fields, and developing AI-driven protein sequence analysis applications requires knowledge of both. Various review articles have been written to bridge the gap between the two fields, but they focus on a few individual tasks or specific applications rather than providing a comprehensive overview of the wide range of tasks and applications. To meet the need for a comprehensive review that presents a holistic view of this wide array of tasks and applications, this manuscript makes several contributions. It bridges the gap between proteomics and AI by presenting a comprehensive array of AI-driven applications for 63 distinct protein sequence analysis tasks. It equips AI researchers with the biological foundations of these 63 tasks. It supports the development of AI-driven protein sequence analysis applications by providing comprehensive details of 68 protein databases. It presents a rich data landscape, encompassing 627 benchmark datasets for the 63 tasks. It highlights the use of 25 unique word embedding methods and 13 language models in AI-driven protein sequence analysis applications. It accelerates the development of AI-driven applications by reporting current state-of-the-art performance across the 63 tasks.

Citations: 0
Enhancing biomedical relation extraction through data-centric and preprocessing-robust ensemble learning approach.
IF 3.4; Zone 4 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-05-22; DOI: 10.1093/database/baae127
Wilailack Meesawad, Jen-Chieh Han, Chun-Yu Hsueh, Yu Zhang, Hsi-Chuan Hung, Richard Tzong-Han Tsai

This paper describes our biomedical relation extraction system, built for the BioCreative VIII challenge Track 1 (the BioRED Track), which focuses on relation extraction from biomedical literature. Our system employs an ensemble learning method, leveraging the PubTator API in conjunction with multiple pretrained Bidirectional Encoder Representations from Transformers (BERT) models. Various preprocessing inputs are incorporated, encompassing prompt questions, entity ID pairs, and co-occurrence contexts. To enhance model comprehension, special tokens and boundary tags are added. Specifically, we use PubMedBERT alongside the Max Rule ensemble learning mechanism to combine outputs from diverse classifiers. Our results surpass the established benchmark score, thereby providing a robust benchmark for evaluating performance on this task. Moreover, our study introduces and demonstrates the effectiveness of a data-centric approach, emphasizing the importance of prioritizing high-quality data instances for model performance and robustness.
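The Max Rule mentioned in this abstract is a standard classifier-combination scheme: for each class, take the highest probability that any ensemble member assigns to it, then predict the class with the largest combined score. The Python sketch below shows that rule in isolation. It is not the authors' code, and the label names and probabilities are invented for illustration.

```python
from typing import Dict, List


def max_rule(member_probs: List[Dict[str, float]]) -> str:
    """Combine per-classifier probability maps with the max rule."""
    if not member_probs:
        raise ValueError("need at least one classifier output")
    combined: Dict[str, float] = {}
    for probs in member_probs:
        for label, p in probs.items():
            # Keep the highest score any classifier gives this label.
            combined[label] = max(combined.get(label, 0.0), p)
    # Predict the label with the largest combined score.
    return max(combined, key=lambda label: combined[label])


# Hypothetical outputs of three classifiers for one entity pair.
outputs = [
    {"Association": 0.55, "Positive_Correlation": 0.30, "None": 0.15},
    {"Association": 0.20, "Positive_Correlation": 0.70, "None": 0.10},
    {"Association": 0.45, "Positive_Correlation": 0.40, "None": 0.15},
]
prediction = max_rule(outputs)
print(prediction)  # Positive_Correlation
```

In this example a majority vote over each classifier's top label would pick "Association", while the max rule follows the single most confident classifier. That trade-off is the main design consideration when choosing between the two.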

Citations: 0
An exploratory study combining Virtual Reality and Semantic Web for life science research using Graph2VR.
IF 3.4; Zone 4 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-05-20; DOI: 10.1093/database/baaf008
Alexander J Kellmann, Sander van den Hoek, Max Postema, W T Kars Maassen, Brenda S Hijmans, Marije A van der Geest, K Joeri van der Velde, Esther J van Enckevort, Morris A Swertz

We previously described Graph2VR, a prototype that enables researchers to use virtual reality (VR) to explore and navigate through Linked Data graphs using SPARQL queries (see https://doi.org/10.1093/database/baae008). Here we evaluate the use of Graph2VR in three realistic life science use cases. The first use case visualizes metadata from large-scale multi-center cohort studies across Europe and Canada via the EUCAN Connect catalogue. The second use case involves a set of genomic data from synthetic rare disease patients, which was processed through the Variant Interpretation Pipeline and then converted into Resource Description Framework (RDF) for visualization. The third use case involves enriching a graph with additional information, in this case enriching the Dutch Anatomical Therapeutic Chemical code Ontology with the DrugID from DrugBank. These examples collectively showcase Graph2VR's potential for data exploration and enrichment, as well as some of its limitations. We conclude that the endless three-dimensional space provided by VR indeed shows much potential for the navigation of very large knowledge graphs, and we provide recommendations for data preparation and VR tooling moving forward. Database URL: https://doi.org/10.1093/database/baaf008.

Citations: 0
GenDiS3 database: census on the prevalence of protein domain superfamilies of known structure in the entire sequence database.
IF 3.4; Zone 4 (Biology); Q1, Mathematical & Computational Biology; Pub Date: 2025-05-09; DOI: 10.1093/database/baaf035
Sarthak Joshi, Shailendu Mohapatra, Dhwani Kumar, Adwait Joshi, Meenakshi Iyer, Ramanathan Sowdhamini

Despite the vast amount of sequence data available, a significant disparity exists between the number of protein sequences identified and the relatively few structures that have been resolved. This disparity highlights the challenge in structural biology of bridging the gap between sequence information and 3D structural data, and the need for robust databases capable of linking distant homologs to known structures. Studies have indicated that the number of structural folds is limited, despite the vast diversity of proteins. Hence, computational tools can enhance our ability to classify protein sequences well before their structures are determined or their functions are characterized, thereby bridging the gap between sequence and structural data. GenDiS (Genomic Distribution of Superfamilies) is a repository with information on the genomic distribution of protein domain superfamilies, built through a one-time computational exercise that searches for trusted homologs of protein domains of known structure against the vast sequence database. We have updated this database using advanced bioinformatics tools, including DELTA-BLAST (domain enhanced lookup time accelerated BLAST) for initial detection of hits and HMMSCAN for validation, significantly improving the accuracy of domain identification. Using these tools, over 151 million sequence homologs for 2060 superfamilies [SCOPe (Structural Classification of Proteins extended)] were identified, and 116 million of them were validated as true positives. Through a case study on glycolysis-related enzymes, variations in the domain architectures of these enzymes are explored, revealing evolutionary changes and functional diversity among these essential proteins. We present another case, the LOG gene, in which one can find significant mutations across the evolutionary lineage. The GenDiS3 database and the associated tools, available at https://caps.ncbs.res.in/gendis3/, offer a powerful resource for researchers in functional annotation and evolutionary studies. Database URL: https://caps.ncbs.res.in/gendis3/.

Citations: 0
GenDiS3 database: census on the prevalence of protein domain superfamilies of known structure in the entire sequence database. GenDiS3数据库:对整个序列数据库中已知结构的蛋白质结构域超家族的流行情况进行普查。
IF 3.4 4区 生物学 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-05-09 DOI: 10.1093/database/baaf035
Sarthak Joshi, Shailendu Mohapatra, Dhwani Kumar, Adwait Joshi, Meenakshi Iyer, Ramanathan Sowdhamini

Despite the vast amount of sequence data available, a significant disparity exists between the number of protein sequences identified and the relatively few structures that have been resolved. This disparity highlights the challenge in structural biology of bridging the gap between sequence information and 3D structural data, and the need for robust databases capable of linking distant homologs to known structures. Studies have indicated that there are a limited number of structural folds, despite the vast diversity of proteins. Hence, computational tools can enhance our ability to classify protein sequences well before their structures are determined or their functions are characterized, thereby bridging the gap between sequence and structural data. GenDiS (Genomic Distribution of Superfamilies) is a repository with information on the genomic distribution of protein domain superfamilies, built through a one-time computational exercise that searches the vast sequence database for trusted homologs of protein domains of known structure. We have updated this database using advanced bioinformatics tools, including DELTA-BLAST (domain enhanced lookup time accelerated BLAST) for initial detection of hits and HMMSCAN for validation, significantly improving the accuracy of domain identification. Using these tools, over 151 million sequence homologs for 2060 superfamilies [SCOPe (Structural Classification of Proteins extended)] were identified, and 116 million of them were validated as true positives. Through a case study on glycolysis-related enzymes, variations in the domain architectures of these enzymes are explored, revealing evolutionary changes and functional diversity among these essential proteins. We present a second case, the LOG gene, in which significant mutations can be traced across the evolutionary lineage. The GenDiS3 database and its associated tools, available at https://caps.ncbs.res.in/gendis3/, offer a powerful resource for researchers in functional annotation and evolutionary studies. Database URL: https://caps.ncbs.res.in/gendis3/.

Citations: 0
CancerPPD2: an updated repository of anticancer peptides and proteins.
IF 3.4 Zone 4, Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-05-07 DOI: 10.1093/database/baaf030
Milind Chauhan, Amisha Gupta, Ritu Tomer, Gajendra P S Raghava

CancerPPD2 (http://webs.iiitd.edu.in/raghava/cancerppd2/) is an updated version of CancerPPD, developed to maintain comprehensive information about anticancer peptides and proteins. It contains 6521 entries; each provides detailed information about an anticancer peptide or protein, including the origin of the peptide, cancer cell line, type of cancer, peptide sequence, and structure. These anticancer peptides have been tested against 392 types of cancer cell lines and 28 types of cancer-associated tissues. In addition to natural anticancer peptides, CancerPPD2 contains 781 entries for chemically modified and 3018 entries for N-/C-terminus modified anticancer peptides. A few entries are also linked to 47 clinical studies and cross-referenced to UniProt, DrugBank, and ThPDB2. Where possible, entries are also linked to clinical trials. On average, CancerPPD2 contains around 85% more information than its previous version, CancerPPD. The structures of these anticancer peptides and proteins were either obtained from the Protein Data Bank (PDB) or predicted using PEPstrMOD, I-TASSER, and AlphaFold. A wide range of tools has been integrated into CancerPPD2 for data retrieval and similarity searches. Additionally, we integrated a REST API into this repository to facilitate automatic, programmatic data retrieval. Database URL: https://webs.iiitd.edu.in/raghava/cancerppd2/api/rest.html.

Citations: 0
A longitudinal analysis of function annotations of the human proteome reveals consistently high biases.
IF 3.4 Zone 4, Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-05-07 DOI: 10.1093/database/baaf036
An Phan, Parnal Joshi, Claus Kadelka, Iddo Friedberg

The resources required to study gene function are limited, especially when considering the number of genes in the human genome and the complexity of their function. Therefore, genes are prioritized for experimental studies based on many different considerations, including, but not limited to, perceived biomedical importance, such as disease-associated genes, or the understanding of biological processes, such as cell signalling pathways. At the same time, most genes are not studied or are under-characterized, which hampers our understanding of their function and potential effects on human health and wellness. Understanding function annotation disparity is a necessary first step toward understanding how much functional knowledge is gained from the human genome, and toward guidelines for targeting future studies of human genes more effectively. Here, we present a comprehensive longitudinal analysis of the human proteome utilizing data analysis tools from economics and information theory. Specifically, we view the human proteome as a population of proteins within a knowledge economy: we treat the quantified knowledge of a protein's function as the analogue of wealth and examine the distribution of information across the proteins in the proteome in the same manner that the distribution of wealth is studied in societies. Our results show a highly skewed distribution of information about human proteins over the last decade, in which the inequality in the annotations given to the proteins remains high. Additionally, we examine the correlation between the knowledge about protein function as captured in databases and the interest in proteins as reflected by mentions in the scientific literature. We show a large gap between knowledge and interest and dissect the factors leading to this gap. In conclusion, our study shows that research efforts should be redirected to less studied proteins to mitigate the disparity among human proteins both in databases and literature.
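
The abstract does not say which inequality measure the authors used. A common choice in economics is the Gini coefficient, and the sketch below assumes it purely to illustrate what "studying annotation knowledge like wealth" can look like. The function name `gini` and the sample annotation counts are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def gini(values: Sequence[float]) -> float:
    """Gini coefficient of non-negative values.

    0.0 means every protein carries the same amount of annotation;
    values approaching 1.0 mean annotation is concentrated in a few proteins.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # no data or no annotation at all: treat as perfectly equal
    # Rank-weighted formula with ascending ranks i = 1..n:
    #   G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    # Hypothetical annotation counts per protein (illustrative only).
    evenly_annotated = [5, 5, 5, 5]
    skewed = [0, 0, 0, 10]
    print(gini(evenly_annotated))  # 0.0
    print(gini(skewed))            # 0.75
```

Computing such a coefficient on each yearly snapshot of an annotation database would give one number per year, which is one way a "consistently high" inequality could be tracked over time.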

Citations: 0