
Analytical Biochemistry: latest articles

3DCOOR-Kace: A 3-d spatial coordinates representation method for lysine acetylation site identification
IF 2.5 | Zone 4, Biology | Q2, Biochemical Research Methods | Pub Date: 2026-01-01 | Epub Date: 2025-10-16 | DOI: 10.1016/j.ab.2025.115994 | Vol. 708, Article 115994
Lichao Zhang, Xue Wang, Liang Kong
Lysine acetylation (Kace) is an important post-translational modification. Although structural information has been shown to be key to improving model effectiveness, it is difficult to obtain in bulk because of experimental limitations. In this study, we propose a spatial coordinate representation that uses a second-order tensor, built from a defined property sequence, to address these limitations and to describe amino acid positions in 3-D space. Based on the proposed coordinates, we construct optimal complex networks and extract network-derived features. Compared with an existing network construction method, protein contact networks (PCN), these features achieve superior performance, indicating that the proposed spatial coordinates can effectively capture global biological information. We also propose a computational model, 3DCOOR-Kace, which fuses sequence and structure information using DenseNet and a Squeeze-and-Excitation layer. 3DCOOR-Kace achieved a Matthews correlation coefficient (MCC) of 0.7358. On the independent test set, its MCC is 0.4261 higher than that of MusiteDeep and 0.1660 higher than that of TransPTM, which demonstrates that 3DCOOR-Kace effectively integrates structure and sequence information to improve Kace site identification. Because the 3-D spatial coordinate representation gives site positions directly, without biological experiments, it could ease the experimental limitation and be convenient for computational methods and biological function research.
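The abstract reports model quality as MCC. As a reminder of what that number measures, here is a minimal sketch of the standard Matthews correlation coefficient computed from binary confusion-matrix counts. The function name and the example counts are assumptions made for illustration; they are not taken from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention when one
    row or column of the confusion matrix is empty).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not the paper's data).
    print(round(matthews_corrcoef(tp=90, tn=85, fp=15, fn=10), 4))
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which is why it is often preferred to accuracy when acetylated and non-acetylated sites are imbalanced.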
Citations: 0
Continuous assay for the dNTP triphosphohydrolase of activated SAMHD1
IF 2.5 | Zone 4, Biology | Q2, Biochemical Research Methods | Pub Date: 2026-01-01 | Epub Date: 2025-09-01 | DOI: 10.1016/j.ab.2025.115966 | Vol. 708, Article 115966
Roozbeh Eskandari, Daniel P. Groom, Ryo Tamura, Vern L. Schramm
Sterile alpha motif and histidine-aspartate domain-containing protein 1 (SAMHD1) is the only member of the triphosphoric monoester hydrolase family in humans (dNTP + H2O → dN + PPPi). The dNTPase activity of SAMHD1 inhibits DNA synthesis, resulting in cell-cycle arrest and restricting viral replication. The complex allosteric regulation of SAMHD1, together with a reaction that lacks a direct spectroscopic signal, makes kinetic analysis and inhibitor discovery challenging. We describe a continuous assay for monitoring SAMHD1 phosphatase activity in its activated physiological state. The assay uses a sequential assembly to generate the active tetrameric form of the enzyme. Two phosphatases convert inorganic triphosphate (PPPi) to inorganic phosphate (Pi). The released Pi reacts with 7-methyl-6-thioguanosine in the presence of purine nucleoside phosphorylase to provide a sensitive continuous spectrophotometric assay. The assay is suitable for 96-microwell plate formats and provides a continuous measurement of SAMHD1 activity. The assay is benchmarked with inhibitors of SAMHD1. With a Z-prime value > 0.90, the assay can be used for high-throughput screening of SAMHD1 inhibitors and for characterizing the allosteric or catalytic activity of new inhibitors.
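The Z-prime value quoted above is a screening-quality statistic. A minimal sketch of the usual definition, Z' = 1 - 3(σp + σn)/|μp - μn|, follows; it assumes sample standard deviations, and the well readings are made-up numbers, not data from the paper.

```python
from statistics import mean, stdev


def z_prime(positive, negative):
    """Z-prime factor from positive- and negative-control well readings.

    Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|
    Sample standard deviations are used here; that choice is an assumption.
    """
    mu_p, mu_n = mean(positive), mean(negative)
    if mu_p == mu_n:
        raise ValueError("control means are identical; Z' is undefined")
    return 1.0 - 3.0 * (stdev(positive) + stdev(negative)) / abs(mu_p - mu_n)


if __name__ == "__main__":
    # Hypothetical plate readings (arbitrary units), for illustration only.
    uninhibited = [100.0, 102.0, 98.0, 100.0]
    fully_inhibited = [10.0, 11.0, 9.0, 10.0]
    print(round(z_prime(uninhibited, fully_inhibited), 3))
```

Values approaching 1 mean the two control populations are well separated relative to their scatter, which is the property a high-throughput screen needs.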
Citations: 0
Understanding the mode of action of arsenic(III) on the fungus Neofusicoccum parvum: Target protein identification
IF 2.5 | Zone 4, Biology | Q2, Biochemical Research Methods | Pub Date: 2026-01-01 | Epub Date: 2025-10-08 | DOI: 10.1016/j.ab.2025.115988 | Vol. 708, Article 115988
Andréa Engel, Florence Ferrari, Maude Meyer, Jean-Marc Strub, Martin Spichty, Christophe Bertsch, Christine Schaeffer-Reiss, Céline Tarnus, Sébastien Albrecht, Mary-Lorène Goddard
Before it was banned, sodium arsenite was the only fungicide able to control fungi associated with grapevine trunk diseases. However, its mode of action has not yet been fully elucidated. This study focuses on the identification of arsenic(III)-binding proteins using an arsenic-based fluorescent probe and arsenic-based affinity chromatography. To the best of our knowledge, this is the first comparative study of these techniques to demonstrate their complementarity. The proteins identified were mainly cysteine-rich, and the majority are involved in the infection process, in particular in plant cell wall degradation, host-pathogen interaction, adhesion and pathogenicity. These proteins could therefore be relevant targets for the development of new ways to control grapevine trunk diseases.
Citations: 0
Revisiting phenylketonuria: Do high brain glycine levels caused by chronic hyperphenylalanemia contribute to brain dysfunction by modulating D-serine levels and NMDA receptor activity?
IF 2.5 | Zone 4, Biology | Q2, Biochemical Research Methods | Pub Date: 2026-01-01 | Epub Date: 2025-10-14 | DOI: 10.1016/j.ab.2025.115992 | Vol. 708, Article 115992
Gerald A. Dienel
Phenylketonuria (PKU) is an inborn error of metabolism owing to deficits in phenylalanine hydroxylase (PAH) activity. PKU children acquire irreversible brain damage if newborns are not identified and treated with a phenylalanine-restricted diet. In spite of decades of research, the mechanisms underlying PKU brain dysfunction are not adequately understood. Competition of phenylalanine with large neutral amino acids (LNAAs) for carrier-mediated uptake into brain, causing lower brain LNAA levels and reduced neurotransmitter synthesis from tyrosine and tryptophan, is a long-favored mechanism for brain dysfunction. Here, glycine is hypothesized to contribute to phenylalanine-evoked brain disorders. All PKU animal models exhibit elevated brain glycine levels similar to mouse models of nonketotic hyperglycinemia. Glycine is synthesized from L-serine; it is a co-agonist of N-methyl-D-aspartate receptors (NMDARs) and an inhibitory neurotransmitter. L-Serine is synthesized from glucose in astrocytes, exported to neurons, and converted by serine racemase to D-serine, an NMDAR co-agonist. Increased glycine level enhances its inhibition of serine racemase and reduces levels of D-serine. L-Serine-glycine-D-serine interactions can be linked to PAH deficits because elevated brain phenylalanine concentration causes its metabolism by minor pathways to generate phenyllactate. If phenyllactate and L-serine synthesis are coupled via transaminase and redox reactions, the stoichiometry is 1:1. These findings support the following hypothesis: (i) phenylalanine disrupts glycine and D-serine homeostasis during brain maturation, irreversibly altering neuronal development and circuit formation, and (ii) high glycine and low D-serine levels in PKU adults contribute to cognitive and behavioral dysfunction. Suggested new directions for future studies of PKU focus on glycine neurotoxicity.
Citations: 0
Mapping transposon insertion sites within bacterial genomes by direct Sanger sequencing
IF 2.5 | Zone 4, Biology | Q2, Biochemical Research Methods | Pub Date: 2026-01-01 | Epub Date: 2025-09-10 | DOI: 10.1016/j.ab.2025.115971 | Vol. 708, Article 115971
Scott W. Herke, Linda M. Heffernan, William N. Beavers, Basel H. Abuaita
Microbial biomedical research frequently involves mutagenesis by insertion of transposons into the genome. Currently, transposon insertion locations are typically elucidated by Next Generation Sequencing or by Sanger sequencing of PCR products. Here, transposons were located by direct Sanger sequencing of bacterial genomic DNA from both Salmonella enterica (Gram-negative) and Staphylococcus aureus (Gram-positive) cultures. DNA was prepared by shearing to a modal size of ∼2 kb followed by purification by paramagnetic beads. Sequencing reactions involved relatively minor modifications of standard protocols (e.g., extra sequencing polymerase, 75–100 PCR cycles); completed reactions were cleaned by ethanol-EDTA precipitation. Reads were generated on the ABI 3130xl Genetic Analyzer using 50-cm capillary arrays and a run protocol modified for extra sample injection time. Good quality reads of ∼500–800 nt were routinely generated; BLAST results returned nearly 100% matches to genomes in the NCBI database. As implemented, the optimized protocol (post-DNA extraction) could be performed within an 8-h workday (with sequencing results the following day) for ∼$10 (USD) per sequencing reaction. Although this method was developed to locate transposons inserted into bacterial genomes, it seems likely that it can be extended to generate sequence data from even native single-copy genes from small genomes (e.g., <5 Mb).
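One downstream step implied by the abstract is separating the genomic part of a read from the transposon part before a BLAST search. The sketch below assumes the sequencing primer sits inside the transposon and reads outward across the junction, so everything after the known transposon end is genomic flank. The function name, the `min_flank` threshold, and the toy sequences are hypothetical and do not come from the paper.

```python
from typing import Optional


def genomic_flank(read: str, transposon_end: str, min_flank: int = 20) -> Optional[str]:
    """Return the sequence that follows the transposon end in a read.

    Returns None when the transposon end is not found or when the remaining
    flank is shorter than min_flank bases (too short to map reliably).
    """
    read = read.upper()
    transposon_end = transposon_end.upper()
    idx = read.find(transposon_end)
    if idx == -1:
        return None
    flank = read[idx + len(transposon_end):]
    return flank if len(flank) >= min_flank else None


if __name__ == "__main__":
    # Toy sequences, for illustration only.
    toy_read = "acgt" + "TTGACA" + "G" * 25
    print(genomic_flank(toy_read, "TTGACA"))
```

The returned flank is what would be submitted to BLAST; the first base of the flank marks the insertion junction on the matched genome.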
Citations: 0
Differential binding of copper and zinc to a TDP-43 RNA recognition motif decapeptide and disulfide formation at residues C173/5 revealed by ESI-MS/MS
IF 2.5 | Zone 4, Biology | Q2, Biochemical Research Methods | Pub Date: 2025-12-13 | DOI: 10.1016/j.ab.2025.116031 | Vol. 710, Article 116031
Josephine Esposto, Naomi L. Stock, Robert J. Huber, Sanela Martic
Copper (Cu) and zinc (Zn) metal ions play important roles in the proper functioning and localization of neurological proteins, such as transactive response DNA-binding protein 43 (TDP-43), which is linked to amyotrophic lateral sclerosis (ALS). Previous experimental and computational studies have identified putative Zn-binding regions within the RNA recognition motif 1 (RRM1) of TDP-43. However, Cu-binding interactions have been less explored despite their redox activity in regulating thiol (C173/175) conversion to disulfide within the RRM1 domain, influencing protein structure and function. Herein, the structural characterization and fragmentation pattern analysis of a TDP-43 decapeptide (166-HMIDGRWCDC-175), within RRM1, coordinated to Cu(II) and Zn(II) ions using electrospray ionization tandem mass spectrometry (ESI-MS/MS) was conducted under non-denaturing conditions. Higher-energy collision dissociation (HCD) fragmentation analysis identified that Cu(II) prefers His/Met residues, while Zn(II) was weakly coordinated to various binding sites in the peptide, specifically His, Met, Glu, Cys, Trp and Asp residues. Computational modeling using a metal ion binding server (MIB2) confirmed the binding sites and coordination sphere of metal-peptide complexes. No significant coordination to C173 and C175 was observed with Cu or Zn, as identified by using a double Cys mutant peptide. A complete thiol-to-disulfide conversion was observed in the presence of Cu(II)/(I) only, which was confirmed by the comparison of a preformed intramolecular disulfide peptide. Overall, unique differential coordination environments were observed for each metal ion with the peptide. The study provides new insights into metal ion interactions with TDP-43 RRM1 peptide, leading to a greater understanding of metal homeostasis in TDP-43 protein biochemistry and neurodegeneration.
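The residue numbering in the abstract can be checked directly from the decapeptide sequence 166-HMIDGRWCDC-175. The short sketch below maps each residue to its position in the protein and also gives the mass change expected when two thiols form one disulfide (loss of two hydrogen atoms). The helper names are illustrative; the hydrogen mass is the standard monoisotopic value.

```python
PEPTIDE = "HMIDGRWCDC"        # decapeptide quoted in the abstract
FIRST_RESIDUE = 166           # protein numbering of the first residue
HYDROGEN_MONOISOTOPIC = 1.007825  # Da, monoisotopic mass of a hydrogen atom


def residue_numbers(sequence: str, first: int, residue: str) -> list:
    """Protein-numbered positions of a one-letter residue within a peptide."""
    return [first + i for i, aa in enumerate(sequence) if aa == residue]


def disulfide_mz_shift(charge: int) -> float:
    """m/z change on thiol-to-disulfide conversion (two H lost) at a charge state."""
    return -2 * HYDROGEN_MONOISOTOPIC / charge


if __name__ == "__main__":
    print(residue_numbers(PEPTIDE, FIRST_RESIDUE, "C"))  # [173, 175]
    print(disulfide_mz_shift(charge=1))
```

The two cysteines fall at positions 173 and 175, matching the C173/175 pair named in the abstract, and an intramolecular disulfide lowers the neutral monoisotopic mass by about 2.016 Da.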
Citations: 0
Single-Entity Dual-Emissive MOF Platform for Reliable Ratiometric Point-of-Care Detection of Amoxicillin Residues
IF 2.5 | Zone 4, Biology | Q2, Biochemical Research Methods | Pub Date: 2025-12-03 | DOI: 10.1016/j.ab.2025.116028 | Vol. 710, Article 116028
Sameera Sh. Mohammed Ameen, Fiasal K. Algethami, Khalid M. Omer, Idrees B. Qader, Hemn A. Qader
In fluorescence-based sensing, self-referencing ratiometric analysis offers a significant advantage over external referencing by integrating both the probe and reference signals within a single material, rather than relying on two separate components. This intrinsic approach eliminates the need for additional reference dyes or materials, which can introduce inconsistencies due to variations in concentration, uneven dispersion, or environmental instability. Self-referencing materials provide a built-in correction mechanism, enhancing detection accuracy, reliability, and reproducibility while minimizing background interference. Despite their advantages, the design and synthesis of dual-emitting metal-organic frameworks (MOFs) with self-referencing capabilities remain rare and challenging. In this study, we introduce a novel Eu-based MOF with intrinsic dual-state, dual-emission properties, exhibiting distinct blue and red fluorescence in both liquid and solid states. The blue emission arises from the coordination-induced emission of the free, non-emissive ligand within the MOF structure. Interestingly, the red emission is selectively quenched by amoxicillin (AMX), while the blue fluorescence remains unaffected. This unique dual-emission feature enables a ratiometric sensing platform without requiring an external reference, ensuring greater stability, accuracy, and ease of use. With a linear detection range of 8.0–218 μM and a limit of detection of 0.354 μM, this Eu-MOF offers a robust and selective AMX sensing strategy. Additionally, a smartphone-assisted visual detection method using RGB analysis via the Color Grab App was developed, enabling portable and on-site quantification. This self-referencing Eu-MOF is inherently stable and recyclable and provides consistent signals, making it highly effective for pharmaceutical applications.
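A ratiometric readout of this kind is usually turned into a concentration through a linear calibration of the intensity ratio. The sketch below assumes the ratio I_red/I_blue is linear in AMX concentration and that the limit of detection follows the common 3σ/slope convention; the abstract does not state either choice, and the calibration points are synthetic.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def limit_of_detection(sd_blank: float, slope: float) -> float:
    """LOD by the 3-sigma convention (assumed here, not stated in the abstract)."""
    return 3.0 * sd_blank / abs(slope)


def concentration_from_ratio(ratio: float, slope: float, intercept: float) -> float:
    """Invert the calibration line to estimate concentration from a measured ratio."""
    return (ratio - intercept) / slope


if __name__ == "__main__":
    # Synthetic calibration: the red/blue ratio falls as AMX quenches the red band.
    conc_um = [8.0, 50.0, 100.0, 150.0, 218.0]
    ratios = [2.0 - 0.005 * c for c in conc_um]
    slope, intercept = linear_fit(conc_um, ratios)
    print(slope, intercept)
    print(concentration_from_ratio(1.5, slope, intercept))
```

Because both bands come from the same particle, fluctuations in probe amount or excitation intensity scale the numerator and denominator together and largely cancel in the ratio.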
Citations: 0
HCNS: A deep learning model for identifying essential proteins based on hypergraph convolution and sequence features
IF 2.5; Zone 4 (Biology); Q2 Biochemical Research Methods; Pub Date: 2025-12-01; Epub Date: 2025-08-07; DOI: 10.1016/j.ab.2025.115949
Jialong Tian, Pengli Lu, Huining Sha
Accurate identification of essential proteins is crucial in biomedical research. Traditional methods rely on Protein–Protein Interaction (PPI) networks and limited biological feature data, often neglecting the critical role of protein amino acid sequences. To address this, we propose the HCNS model, which integrates the Hypergraph Convolutional Network (HGCN) module, the Seq-CNN-MB-NAG feature extraction module, and the Multi-Layer Perceptron (MLP) recognition module, significantly enhancing the accuracy of essential protein identification. The HGCN module constructs a hypergraph from the PPI network and protein complex data, capturing the complex structural relationships between proteins. The Seq-CNN-MB-NAG module utilizes Convolutional Neural Networks (CNN), Multi-Head Self-Attention (MHSA), Bidirectional Long Short-Term Memory (Bi-LSTM), and NAG Transformer to process protein amino acid sequences and extract sequence features. The MLP module then fuses these two sets of features for precise recognition. Experimental results show that the HCNS model outperforms existing methods, achieving an accuracy of 93.38%, with an Area Under the Curve (AUC) of 98.33% and an Area Under the Precision–Recall Curve (AUPR) of 97.16%, demonstrating its potential in essential protein identification.
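The hypergraph construction mentioned in this abstract can be illustrated with a minimal sketch. It builds a node-by-hyperedge incidence matrix in which each PPI edge is a two-node hyperedge and each protein complex is one larger hyperedge. The protein names, the complex membership, and the decision to treat PPI edges as two-node hyperedges are assumptions made for illustration; the abstract does not specify how HCNS builds its hypergraph.

```python
def build_incidence(proteins, ppi_edges, complexes):
    """Return a |V| x |E| 0/1 incidence matrix as a list of rows.

    Assumption: every PPI edge and every protein complex becomes one hyperedge.
    """
    index = {p: i for i, p in enumerate(proteins)}
    hyperedges = [set(e) for e in ppi_edges] + [set(c) for c in complexes]
    H = [[0] * len(hyperedges) for _ in proteins]
    for j, edge in enumerate(hyperedges):
        for p in edge:
            H[index[p]][j] = 1  # protein p belongs to hyperedge j
    return H


def node_degrees(H):
    """Number of hyperedges each protein participates in."""
    return [sum(row) for row in H]


# Hypothetical toy data, not taken from the paper.
proteins = ["P1", "P2", "P3", "P4"]
ppi_edges = [("P1", "P2"), ("P2", "P3")]
complexes = [{"P1", "P3", "P4"}]

H = build_incidence(proteins, ppi_edges, complexes)
degrees = node_degrees(H)
```

An incidence matrix of this kind is the usual input to a hypergraph convolution layer, which normalizes it by node and hyperedge degrees before propagating features.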
Citations: 0
ITGB3 as a promising non-invasive biomarker for type 2 diabetes and diabetic nephropathy
IF 2.5; Zone 4 (Biology); Q2 Biochemical Research Methods; Pub Date: 2025-12-01; Epub Date: 2025-08-25; DOI: 10.1016/j.ab.2025.115965
Seyed Amirhossein Hosseini, Parisa Ajorlou, Hasti Haddadian, Shahla Sohrabipour
The prevalence of Type 2 diabetes mellitus (T2DM) is increasing worldwide and represents a major risk factor for the development of diabetic nephropathy (DN), a severe microvascular complication. Chronic hyperglycemia activates inflammatory and fibrotic signaling pathways, which contribute to kidney damage. Integrins, as transmembrane adhesion receptors, play pivotal roles in regulating inflammation, immune cell trafficking, and insulin resistance. This research focused on identifying non-invasive biomarkers for T2DM and DN using PBMCs. Differentially expressed genes related to diabetes were identified through the analysis of multiple datasets retrieved from the Gene Expression Omnibus, including GSE95849, GSE9006, GSE25724, and GSE159984. ITGB3 was identified as a common gene across these datasets, and its expression in DN was further examined using the GSE142025 dataset. Real-time PCR analysis of PBMC samples revealed a significant upregulation of ITGB3 expression in individuals with DN and T2DM compared to healthy controls. The TF2DNA and miRNASNPv3 databases identified 10 transcription factors and 10 variants of ITGB3 involved in 60 miRNA interactions. Additionally, the DGIdb database revealed 15 drugs potentially regulating ITGB3 expression. These findings underscore the importance of integrin-related pathways in diabetes and suggest ITGB3 as a promising target for future research and therapeutic development.
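The step of finding a gene common to several differential-expression lists amounts to a set intersection, sketched below. The dataset keys and gene names are hypothetical placeholders, not the contents of the GEO datasets named in the abstract, and the function name `common_genes` is an assumption for illustration.

```python
from functools import reduce


def common_genes(deg_lists):
    """Return the genes present in every differential-expression list."""
    gene_sets = [set(genes) for genes in deg_lists.values()]
    if not gene_sets:
        return set()
    return reduce(set.intersection, gene_sets)


# Hypothetical placeholder lists; real lists would come from each dataset's analysis.
deg_lists = {
    "dataset_1": ["GENE_A", "GENE_B", "GENE_C"],
    "dataset_2": ["GENE_B", "GENE_C", "GENE_D"],
    "dataset_3": ["GENE_B", "GENE_E"],
}

shared = common_genes(deg_lists)
```

Any gene that survives the intersection is a candidate for follow-up validation, which in the study was done by real-time PCR.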
Citations: 0
PreRBP: Interpretable deep learning for RNA-protein binding site prediction with attention mechanism
IF 2.5; Zone 4 (Biology); Q2 Biochemical Research Methods; Pub Date: 2025-12-01; Epub Date: 2025-09-04; DOI: 10.1016/j.ab.2025.115968
Huixian Chen, Yun Zuo, Xiangrong Liu, Xiangxiang Zeng, Zhaohong Deng, Jiasong Wu
In the complex process of gene expression and regulation, RNA-binding proteins play a pivotal role. Accurate prediction of RNA-protein binding sites can help researchers better understand RNA-binding proteins and their related mechanisms. Prediction techniques based on machine learning algorithms are both cost-effective and efficient in identifying these binding sites. However, currently available machine learning methods have some shortcomings: the input features of the models consider only RNA sequence features, and most of the datasets suffer from class imbalance. To address these issues, this study first uses 27 publicly available RNA-protein binding site datasets to construct a benchmark dataset. Then, we use RNAshapes and EDeN to obtain the secondary structure of RNA. A higher-order encoding method is used to extract the key information hidden in the RNA sequences and structures. To address the class imbalance in the dataset, this study uses four undersampling algorithms (random undersampling, NearMiss, ENN, and one-sided selection) to remove redundant negative samples. Finally, based on a Convolutional Neural Network and a Bidirectional Long Short-Term Memory network, this study constructs the PreRBP model to predict RNA-protein binding sites.
The experimental results show that the model used in this study has an average AUC of 0.88, which is higher than other existing RNA-protein binding site prediction methods. Also, for the convenience of prediction, an online predictor is developed in this study. The predictor and experimental codes are available at https://github.com/B12-Comet/RBPPrediction.
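Of the four undersampling strategies named in this abstract, random undersampling is the simplest, and a minimal sketch of it follows. It keeps every positive sample and draws a random subset of negatives. The function name, the `ratio` and `seed` parameters, and the toy sequences are assumptions for illustration and are not taken from the PreRBP code.

```python
import random


def random_undersample(samples, labels, ratio=1.0, seed=0):
    """Keep all positives (label 1) and a random subset of negatives (label 0).

    ratio is the desired number of negatives per positive (assumed parameter).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(neg), int(round(ratio * len(pos))))
    kept_neg = rng.sample(neg, n_keep)  # sampling without replacement
    keep = sorted(pos + kept_neg)       # preserve the original ordering
    return [samples[i] for i in keep], [labels[i] for i in keep]


# Hypothetical toy data: two binding-site sequences and four negatives.
sequences = ["AUGC", "GGCA", "UUAG", "CCGU", "AGGU", "UCAA"]
labels = [1, 0, 0, 0, 1, 0]

balanced_x, balanced_y = random_undersample(sequences, labels)
```

NearMiss, ENN, and one-sided selection differ in that they choose which negatives to drop using distances to neighbouring samples rather than uniformly at random.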
Citations: 0