
Genomics & informatics: latest articles

AMP-CapsNet: a multi-view feature fusion approach for antimicrobial peptide prediction using capsule networks.
Pub Date : 2026-02-07 DOI: 10.1186/s44342-026-00067-6
Ali Ghulam, Mujeebu Rehman, Huma Fida, Pei-Yu Zhao, Ramsha Noroze, Ye-Chen Qi, Xiao-Long Yu

Antimicrobial peptides (AMPs) are found in both intracellular and extracellular settings, and they are of growing interest as antibiotic-resistant bacteria become a bigger problem. In medical laboratories, AMPs have shown notable antibacterial effectiveness in treating diabetic foot infections and related issues. New medication development frequently targets AMPs, which are components of the adaptive immune system. This research employs deep learning to identify antibiotic activity. Numerous computational methods have been established to detect antimicrobial peptides with deep learning algorithms. We introduce a novel deep learning approach, antimicrobial peptide prediction using a capsule neural network (AMP-CapsNet), to predict AMPs precisely, and we evaluate it against other deep learning and baseline models. The prediction models are built with capsule neural networks, a type of next-generation neural network. Amino acid composition (AAC) and dipeptide composition (DPC) were used as feature-encoding methods. Every model underwent independent cross-validation and external testing. The results indicate that AMP-CapsNet surpassed its counterparts, achieving an accuracy of 97.29% and an AUC of 98.91% on the test set with DPC features, compared with an accuracy of 84.42% with AAC features. Consequently, the technique we advocate is anticipated to enhance the accuracy of antimicrobial peptide predictions in the future. By helping to produce potent peptides for medication development and application, this study advances deep learning-based AMP drug discovery approaches. This finding has important implications for how biological data are processed and for computational pharmacology.
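
The two feature encodings named in the abstract, AAC and DPC, have standard definitions, so a minimal sketch may help show what the classifier receives as input. The function names `aac` and `dpc` and the example peptide are illustrative assumptions; the abstract does not describe the authors' actual preprocessing.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order that defines the feature layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(seq: str) -> list[float]:
    """Amino acid composition: fraction of each of the 20 residues (20 values)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # Non-standard residues are ignored, so the fractions may sum to less than 1.
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dpc(seq: str) -> list[float]:
    """Dipeptide composition: fraction of each adjacent residue pair (400 values)."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    # Overlapping pairs: a peptide of length n yields n - 1 dipeptides.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts: dict[str, int] = {}
    for pair in pairs:
        counts[pair] = counts.get(pair, 0) + 1
    return [counts.get(a + b, 0) / len(pairs)
            for a, b in product(AMINO_ACIDS, repeat=2)]


if __name__ == "__main__":
    peptide = "ACDA"  # hypothetical toy peptide, not taken from the paper
    print(len(aac(peptide)), len(dpc(peptide)))
```

DPC keeps local order information (which residue follows which) that AAC discards, which is one plausible reason the two encodings can give different accuracies.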

Citations: 0
Mitochondrial transfer in cancer: mechanisms, immune evasion, and therapeutic opportunities.
Pub Date : 2026-01-13 DOI: 10.1186/s44342-025-00064-1
Hye In Ka, Hyun Goo Woo

Intercellular mitochondrial transfer (MT) is emerging as a transformative communication axis in cancer biology. Intact mitochondria or mitochondrial components can be exchanged between tumor cells, stromal elements, and immune cells via tunneling nanotubes, extracellular vesicles, cell fusion, or phagocytic uptake. This organelle exchange enables metabolic adaptation by restoring OXPHOS (oxidative phosphorylation), increasing ATP production, and enhancing survival in hostile environments. Conversely, tumor cells also hijack mitochondria from cytotoxic lymphocytes thereby undermining immune function and contributing to immune escape and tumor progression. These converging metabolic exchanges fuel immune evasion, metastatic potential, and resistance to chemotherapy, radiation, and immunotherapy. Cutting-edge tracing tools, including mitochondrial reporter proteins and single-cell mitochondrial genome lineage mapping, have uncovered MT events both in vitro and in vivo. Therapeutic strategies designed to block mitochondrial trafficking, inhibit nanotube formation or vesicle uptake, or enhance immune cell mitochondrial resilience hold promise for tumor sensitization and restoration of antitumor immunity. A deeper understanding of MT provides novel insight into cancer metabolism and intercellular communication, offering a foundation for future therapeutic innovation and potential clinical application as both a biomarker and a therapeutic target.

Citations: 0
Prognostic impact of the lipid metabolism gene AGPAT4 in the tumor immune microenvironment of thyroid cancer.
Pub Date : 2026-01-10 DOI: 10.1186/s44342-025-00065-0
Ying Zhu, Wenbo Xu, Xuejing Bai, Yanyuan Qiao, Dan Ye

Background: Thyroid cancer (THCA) is a common malignant tumor of the endocrine system, and significant clinical challenges remain in its diagnosis and prognostic evaluation. This study aims to elucidate the role of AGPAT4 in thyroid cancer by investigating its expression, involvement in metabolic pathways, and potential as a prognostic biomarker.

Methods: We analyzed data from 512 thyroid cancer patients and 279 controls, performed differential expression and gene expression correlation analyses of AGPAT4 in thyroid cancer, and carried out protein-protein interaction (PPI) network and functional enrichment analyses of AGPAT4 and its differentially expressed genes (DEGs). The Kruskal-Wallis test and receiver operating characteristic (ROC) curve analysis were used to investigate the correlation between AGPAT4 expression and clinicopathological characteristics as well as its diagnostic efficacy. Cox regression analysis and Kaplan-Meier analysis were employed to evaluate its prognostic value. Additionally, single-sample gene set enrichment analysis (ssGSEA) was utilized to explore the association between AGPAT4 expression and the level of immune infiltration in the tumor microenvironment.

Results: Our findings revealed that AGPAT4 was significantly downregulated in thyroid cancer (THCA) tissues (P < 0.001), suggesting a potential tumor-suppressive role of AGPAT4 in thyroid cancer. AGPAT4 exhibited robust efficacy in distinguishing tumor tissues from normal tissues, with an area under the receiver operating characteristic curve (AUC) of 0.973. Furthermore, AGPAT4 expression levels were significantly correlated with pathological stage and survival rate (P < 0.05). Kaplan-Meier survival analysis showed that patients with high AGPAT4 expression had a better progression-free interval (PFI) (HR = 0.45, P = 0.007). Protein-protein interaction (PPI) network and functional enrichment analyses revealed that AGPAT4 is involved in key pathways associated with thyroid cancer progression. Immune infiltration analysis suggested an association between AGPAT4 expression and immune responses in the tumor microenvironment.

Conclusion: AGPAT4 holds promise as a potential biomarker for the differential diagnosis and prognostic assessment of thyroid cancer, thereby providing a possible reference for the further exploration of therapeutic strategies against this disease.
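
The AUC reported in the Results can be read as the probability that a randomly chosen tumor sample ranks above a randomly chosen normal sample. The sketch below computes it in that rank-based form. It is an illustration only: the abstract does not say which software was used, the function name `auc` is an assumption, and the expression values are invented toy numbers. Because AGPAT4 is reported as downregulated in tumors, the score is the negated expression.

```python
def auc(pos_scores, neg_scores):
    """Rank-based AUC: P(pos > neg) + 0.5 * P(pos == neg) over all pairs."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical expression values (arbitrary units), NOT data from the study.
tumor_expr = [1.0, 1.5, 2.0, 3.5]
normal_expr = [3.0, 4.0, 4.5]

# Lower expression suggests tumor, so negate to make "higher score = tumor".
tumor_scores = [-x for x in tumor_expr]
normal_scores = [-x for x in normal_expr]

if __name__ == "__main__":
    print(auc(tumor_scores, normal_scores))
```

An AUC of 0.5 would mean the marker separates the groups no better than chance, and 1.0 would mean perfect separation.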

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12879322/pdf/
Citations: 0
Peptide-based therapeutics targeting the SLC39A14-PIWIL2 fusion in hepatocellular carcinoma.
Pub Date : 2025-12-20 DOI: 10.1186/s44342-025-00060-5
Masaud Shah, Sung Ung Moon, Ji-Hye Choi, Min Jae Kim, Hyun Goo Woo

Fusion genes are key oncogenic drivers in various cancers; however, their role in hepatocellular carcinoma (HCC) remains underexplored. Here, we analyzed RNA-seq data from 68 HCC patients and identified several fusion products, among which SLC39A14-PIWIL2 stood out as a putative driver. Functional assays revealed that the promoter of SLC39A14 potentially drives the overexpression of a truncated PIWIL2 protein (tPIWIL2), which retains its oncogenic MID and PIWI domains, in liver tissues. Both the wild-type PIWIL2 and tPIWIL2 were found to interact with the oncogenic partners HDAC3 and NME2 through these domains, as demonstrated by structural modeling and molecular dynamics simulations. To disrupt these interactions, we designed novel decoy peptides that potentially compete with both HDAC3 and NME2, effectively inhibiting PIWIL2-driven tumor activity in Huh7, HepG2, SNU449, and SNU398 HCC cell lines. Among the tested candidates, NEP1 markedly suppressed PIWIL2-driven oncogenic activity, and its co-administration with 5-fluorouracil (5-FU) significantly reduced PIWIL2-induced chemoresistance, thereby enhancing therapeutic efficacy. Collectively, these findings establish SLC39A14-PIWIL2 as a novel oncogenic fusion in HCC and highlight fusion protein-targeted peptide therapeutics as a promising avenue for precision treatment in HCC.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12720462/pdf/
Citations: 0
Metagenomic analysis of microbiome spatial dynamics in urban river confluence affected by city wastewater.
Pub Date : 2025-12-04 DOI: 10.1186/s44342-025-00054-3
Nahid Parwin, Sangita Dixit, Sriansh Das, Rajesh Kumar Sahoo, Enketeswara Subudhi

Background: Environmental pollutants have a profound impact on microbial dynamics. This study examines the influence of anthropogenic activity on the shift in bacterial diversity in the catchment area of the River Kathajodi, compared with upstream and downstream sites, using a metagenomic approach for the first time in this river.

Methods: Water samples were collected from upstream, catchment, and downstream locations and transported at 4°C to the laboratory for DNA extraction, library preparation, sequencing, and physicochemical analysis employing inductively coupled plasma. The extracted DNA was sequenced on the Illumina HiSeq platform and analyzed through MG-RAST for taxonomic and functional classification using KEGG and COG annotations. Statistical diversity analysis, including rarefaction curves, alpha- and beta-diversity indices, and Venn diagrams, provided insights into microbial composition and community variations across sites.

Results: The catchment (CM), highly contaminated with metals, fecal matter, and other organic pollutants, showed a significant abundance of pollution-indicator members of the phylum Bacteroidetes (29.82%), which could be attributed to their high metabolic capability to degrade these pollutants. The pristine upstream (US) exhibited an abundance of Shewanella (25.04%), Pseudomonas (17.35%), and Synechococcus (5.62%). The CM, influenced by high anthropogenic activity, showed higher abundances of Flavobacterium (5.20%), Arcobacter (4.05%), and Bacteroides (3.88%). In contrast, downstream (DS), with fewer anthropogenic activities, displayed higher abundances of Aeromonas (4.40%), Acidovorax (0.52%), and Acidimicrobium (0.32%). The highest bacterial diversity, observed in CM, could be due to the influence of the physicochemical properties of city waste effluent. In the Venn diagram, 73 OTUs at the genus level were common to all three sites, which indicates that the native microflora of the river water niche remains unaffected irrespective of the temporary changes in the vicinity. The functional profiling through KEGG and COG revealed that CM was enriched in carbohydrate metabolism (12.11%), while DS exhibited higher contributions to amino acid metabolism, along with the highest relative abundance of general function prediction (R) (12.89%), all indicative of stress adaptation and metabolic flexibility under polluted conditions. The clean upstream is home to oxygen-loving helpful bacteria, the catchment supports nutrient-hungry and sewage-linked microbes, and the downstream is dominated by metal-tolerant and possibly harmful bacteria, showing the clear impact of human activities along the river.

Conclusions: The marked shift in bacterial diversity between US, CM, and DS regions highlights the ecological consequences of anthropogenic impact. These findings emphasize the need for effective environmental management to safeguard water quality and prevent undesirable health issues.

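
Two of the analyses mentioned in this abstract have compact definitions: an alpha-diversity index summarizes how evenly taxa are distributed within one site, and the Venn diagram core is the set of taxa present at every site. The sketch below uses the Shannon index as one common alpha-diversity measure. This is an assumption, since the abstract does not say which indices were computed, and the counts and the function names `shannon_index` and `shared_taxa` are invented for illustration (only the genus names come from the abstract).

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def shared_taxa(*sites):
    """Taxa observed (count > 0) at every site, i.e. the Venn diagram core."""
    present = [{taxon for taxon, c in site.items() if c > 0} for site in sites]
    return set.intersection(*present)


# Hypothetical read counts per genus, NOT data from the study.
upstream = {"Shewanella": 50, "Pseudomonas": 35, "Synechococcus": 15}
catchment = {"Flavobacterium": 30, "Arcobacter": 25, "Bacteroides": 25,
             "Pseudomonas": 20}
downstream = {"Aeromonas": 60, "Acidovorax": 10, "Pseudomonas": 30}

if __name__ == "__main__":
    for name, site in [("US", upstream), ("CM", catchment), ("DS", downstream)]:
        print(name, round(shannon_index(list(site.values())), 3))
    print(shared_taxa(upstream, catchment, downstream))
```

A higher Shannon value indicates a richer and more even community, which is the sense in which one site can be said to have the highest bacterial diversity.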
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12676890/pdf/
Citations: 0
Towards a transparent and reproducible AI-assisted research paper writing.
Pub Date : 2025-12-02 DOI: 10.1186/s44342-025-00057-0
Jeongbin Park

Artificial intelligence (AI)-assisted scientific writing is now a common practice in academic publishing, yet concerns persist regarding the authenticity and reproducibility of AI-generated content. While AI tools offer significant advantages, particularly for non-native English speakers who face substantial linguistic barriers in scientific communication, the risk of AI hallucinations and fabricated citations threatens the integrity of scholarly discourse. Journals often require disclosure of the entire AI prompt rather than meaningful intellectual contributions, but this is becoming increasingly impractical as AI prompts are getting longer and more complex. In this paper, I argue that transparency in AI-assisted writing should focus on capturing the author's core research perspective and section-specific key points-the foundational elements that drive meaningful scientific communication. To address this challenge, I developed a web-based tool that implements a human-in-the-loop approach requiring authors to define their research perspective and create detailed outlines with key points before any AI text generation occurs. The tool mitigates AI hallucination by only allowing the use of user-provided citations and generating transparency reports documenting the key elements used for text generation. I validated this approach by writing this paper using the tool itself, demonstrating how the transparency reporting method works in practice. This methodology ensures that AI serves as a linguistic tool rather than a content generator, preserving scientific integrity while democratizing access to high-quality academic writing across linguistic and cultural boundaries.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12670809/pdf/
Citations: 0
SPICEiST: subcellular RNA pattern enhances cell clustering of imaging-based spatial transcriptomics.
Pub Date : 2025-12-02 DOI: 10.1186/s44342-025-00056-1
Sungwoo Bae, Yuchang Seong, Dongjoo Lee, Hongyoon Choi

Background: Imaging-based spatial transcriptomics (ST) enables the quantification of gene expression at single-cell resolution while preserving spatial context, but its utility is limited by small gene panels and challenges in accurate cell segmentation. To address these limitations, we present a graph autoencoder framework that integrates subcellular transcript distribution patterns with cell-level gene expression profiles for enhanced cell clustering in imaging-based ST (SPICEiST).

Results: The clustering performance of SPICEiST was systematically evaluated across several cancer datasets and gene panel sizes. The results demonstrate that SPICEiST consistently outperforms conventional cell-level gene expression-based methods in distinguishing subtle differences in cell states, as measured by the number of cell clusters and by clustering indices such as the Calinski-Harabasz index (CHI) and the Davies-Bouldin index (DBI). Moreover, the findings indicate that SPICEiST can further enhance performance, even with advancements in cell segmentation, particularly for datasets with small gene panels. Overall, the improvements in CHI and DBI were more pronounced in datasets with small gene panels of around 300 genes than in those with large panels containing over a thousand genes. Notably, SPICEiST also reveals more spatially intermixed and less compartmentalized cell clusters, a characteristic that better reflects the complex and heterogeneous nature of tumor microenvironments. This effect was especially evident in the datasets with large panels.

Conclusions: These findings highlight the value of leveraging subcellular transcript patterns to overcome the inherent limitations of imaging-based ST, particularly for small gene panels, and may provide new insights into tumor heterogeneity.
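
The abstract above reports clustering quality through CHI and DBI, which in clustering work usually denote the Calinski-Harabasz index (higher is better) and the Davies-Bouldin index (lower is better). The following is a minimal sketch of how these two indices are computed, assuming Euclidean distance and a toy 2-D dataset; the function names and the data are illustrative assumptions and are not taken from the SPICEiST paper or its code.

```python
import math
from collections import defaultdict


def centroid(points):
    """Mean coordinate of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def group_by_label(points, labels):
    """Collect points into a dict keyed by cluster label."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    return clusters


def calinski_harabasz(points, labels):
    """Between-cluster dispersion over within-cluster dispersion,
    each scaled by its degrees of freedom. Higher is better."""
    clusters = group_by_label(points, labels)
    n, k = len(points), len(clusters)
    overall = centroid(points)
    between = within = 0.0
    for members in clusters.values():
        c = centroid(members)
        between += len(members) * math.dist(c, overall) ** 2
        within += sum(math.dist(p, c) ** 2 for p in members)
    return (between / (k - 1)) / (within / (n - k))


def davies_bouldin(points, labels):
    """Average, over clusters, of the worst (scatter_i + scatter_j) /
    centroid-distance ratio to any other cluster. Lower is better."""
    clusters = group_by_label(points, labels)
    cents = {lab: centroid(m) for lab, m in clusters.items()}
    scatter = {
        lab: sum(math.dist(p, cents[lab]) for p in m) / len(m)
        for lab, m in clusters.items()
    }
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in cents
            if j != i
        )
        for i in cents
    ]
    return sum(worst) / len(worst)


# Toy data (hypothetical): two tight groups far apart along the x axis.
cells = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good_labels = [0, 0, 1, 1]  # follows the real separation
bad_labels = [0, 1, 0, 1]   # cuts across it

good_chi = calinski_harabasz(cells, good_labels)
bad_chi = calinski_harabasz(cells, bad_labels)
good_dbi = davies_bouldin(cells, good_labels)
bad_dbi = davies_bouldin(cells, bad_labels)
print(good_chi, bad_chi, good_dbi, bad_dbi)
```

On this toy layout the labeling that follows the true separation scores a much higher CHI and a much lower DBI than the labeling that cuts across it, which is the direction of improvement the abstract describes.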

Journal : Genomics & informatics Volume : 23 (1) Pages : 23 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12670746/pdf/
Citations: 0
Metagenomic insights into microbial community alterations and co-occurrence networks in infective endocarditis.
Pub Date : 2025-12-02 DOI: 10.1186/s44342-025-00059-y
Zahra Abedi, Mohammad Ali Sheikh Beig Goharrizi, Amirreza Abbasi, Negar Sadat Soleimani Zakeri, Helia Jangi

Background: Infective endocarditis (IE) is a serious infection of the heart valves, and standard culture methods often miss the bacteria responsible, especially in culture-negative cases. To address this, we used 16S rRNA gene-based next-generation sequencing (NGS) on heart valve tissue. This approach allowed us to map out the bacterial communities present and evaluate their potential role in IE.

Result: We identified six key bacterial genera (Enterococcus, Streptococcus, Coxiella, Staphylococcus, Haemophilus, and Cutibacterium) plus three specific species: Streptococcus troglodytae, Haemophilus parainfluenzae, and Coxiella burnetii. Our co-occurrence analysis showed that these bacteria tend to exist independently within infected valve tissue, with no significant correlations between them.

Conclusion: We detected bacterial taxa including Cutibacterium and Streptococcus troglodytae, although S. troglodytae is rarely associated with IE and Cutibacterium comprises low-abundance bacteria not typically linked to this condition. These findings demonstrate the value of NGS in identifying pathogens that standard culture methods may overlook. As these results are based on computational analyses, further laboratory validation is required. Incorporating NGS into diagnostic protocols may enhance pathogen detection in culture-negative IE and support more targeted treatment and prevention strategies.

Journal : Genomics & informatics Volume : 23 (1) Pages : 25 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12670860/pdf/
Citations: 0
A curation system of rice trait ontology with reliable interoperation by LLM and PubAnnotation.
Pub Date : 2025-12-02 DOI: 10.1186/s44342-025-00058-z
Javeed Muhammad Ahmad, Yawen Liu, Jin-Dong Kim, Xinzhi Yao, Pierre Larmande, Jingbo Xia

Background: Ontology frameworks are essential for organizing complex biological knowledge, such as genes, phenotypes, and pathways, and for ensuring consistent data annotation and retrieval. In biological research, ontologies like the Gene Ontology (GO) and crop-specific trait ontologies (TO) for Oryza sativa (rice) standardize terminology across studies, supporting cross-study comparison and hypothesis generation. However, ontology annotations usually rely on expert manual review of the literature, a process that is accurate but time-consuming, labor-intensive, and difficult to scale as biological data grows. Manual approaches are also prone to inconsistencies and errors. The emergence of large language models (LLMs) such as ChatGPT, DeepSeek, and KIMI, along with curated databases like Rice-Alterome and PubAnnotation, offers new opportunities for semi-automated ontology curation. This study explores how these technologies can be integrated to develop an efficient literature-based curation system for rice trait ontology.

Methods: We developed a curation system that integrates Rice-Alterome, a comprehensive database of rice genomic variations, mutations, and sentence-level literature evidence linked to GO and TO terms, with PubAnnotation, an open-source platform for collaborative text annotation. LLMs (DeepSeek and KIMI) were integrated via APIs to automate the extraction, annotation, and validation of trait-related information through prompt engineering. The system was evaluated through use cases designed to demonstrate its performance and functionality compared to manual curation.

Results: The proposed system substantially enhanced the retrieval and organization of literature evidence compared to manual methods. The integrated platform, available through a dedicated website, connects Rice-Alterome, PubAnnotation, and LLMs to streamline ontology curation and evidence discovery. This framework reduces the time domain experts need to locate and validate relevant information and provides interactive tools for users to add, merge, or refine trait annotations. The LLM-driven prompt-based querying also improved the identification of implicit or missing information that may be overlooked during manual curation.

Conclusions: Integrating LLMs with Rice-Alterome and PubAnnotation offers a promising solution for automating rice trait ontology curation. This approach accelerates evidence collection and enhances data consistency and accessibility. Future extensions of this framework will target additional crops such as wheat and maize and focus on refining LLM-based retrieval and annotation mechanisms for broader agricultural genomics applications.

Journal : Genomics & informatics Volume : 23 (1) Pages : 24 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12670841/pdf/
Citations: 0
From pixels to cell types: a comprehensive review of computational methods for spatial transcriptomics deconvolution.
Pub Date : 2025-10-31 DOI: 10.1186/s44342-025-00055-2
Jahanzeb Saqib, Junil Kim

Spatial transcriptomics technologies have significantly enhanced the analysis of gene expression profiles by retaining the spatial information of intact tissue sections, enabling a deeper understanding of tissue structures and cellular relationships. However, most platforms have limited resolution, and many capture spots contain mixed signals from multiple cells. Resolving these mixtures requires deconvolution, a set of computational steps that infer the underlying cellular composition. Over the last few years, a range of algorithms has been proposed to address this problem, each employing distinct computational principles and processing paradigms. This review presents a comprehensive analysis of twenty such algorithms, focusing on their methodological foundations. We contrast their underlying computational algorithms, modeling methods, and data processing pipelines, and how they handle external references, noise, and sparsity in the data. By drawing out the conceptual and technical foundations of each algorithm, we aim to give researchers a complete, hands-on grasp of the computational landscape of spatial transcriptomics deconvolution. The review serves as a methodological handbook that supports a deep understanding of current deconvolution methods, the development of novel strategies, and the selection and application of existing tools in different biological contexts.
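
To make the idea of deconvolution concrete, the sketch below treats one capture spot as a non-negative mixture of reference cell-type signatures and recovers the mixing proportions by projected gradient descent on a least-squares objective. This is a generic illustration under stated assumptions, not one of the twenty algorithms the review covers: the function name `deconvolve_spot`, the signature values, and the two cell types are all hypothetical.

```python
def deconvolve_spot(signatures, spot, n_iter=500):
    """Estimate cell-type proportions for a single spot.

    signatures: dict mapping cell type -> per-gene reference expression
    spot:       per-gene expression observed at the spot
    Minimizes 0.5 * ||S w - y||^2 subject to w >= 0, then rescales w
    so that the weights sum to 1.
    """
    types = list(signatures)
    n_genes = len(spot)
    weights = [0.0] * len(types)

    # The squared Frobenius norm of S bounds the largest eigenvalue of
    # S^T S, so 1 / that value is a safe (conservative) step size.
    step = 1.0 / sum(v * v for t in types for v in signatures[t])

    for _ in range(n_iter):
        # Residual between the current reconstruction and the spot.
        residual = [
            sum(weights[k] * signatures[t][g] for k, t in enumerate(types))
            - spot[g]
            for g in range(n_genes)
        ]
        # Gradient step per cell type, clipped at zero (the projection).
        weights = [
            max(
                0.0,
                weights[k]
                - step * sum(signatures[t][g] * residual[g] for g in range(n_genes)),
            )
            for k, t in enumerate(types)
        ]

    total = sum(weights)
    if total == 0.0:
        return {t: 0.0 for t in types}
    return {t: w / total for t, w in zip(types, weights)}


# Hypothetical reference signatures over four genes.
reference = {
    "type_A": [10.0, 0.0, 5.0, 1.0],
    "type_B": [0.0, 8.0, 5.0, 1.0],
}
# A synthetic spot built as 70% type_A plus 30% type_B.
mixed_spot = [
    0.7 * a + 0.3 * b for a, b in zip(reference["type_A"], reference["type_B"])
]

proportions = deconvolve_spot(reference, mixed_spot)
print(proportions)
```

Because the synthetic spot is an exact, noise-free mixture, the estimated proportions converge to the 0.7 and 0.3 used to build it. Real data add noise, sparsity, and mismatched references, which is where the methods compared in the review differ.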

Journal : Genomics & informatics Volume : 23 (1) Pages : 22 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12577344/pdf/
Citations: 0