Mingbo Peng, Tianjing Wang, Yujie Li, Zheng Zhang, Cuihong Wan
Molecular & Cellular Proteomics, article 100860, published 2024-10-16
DOI: https://doi.org/10.1016/j.mcpro.2024.100860
Impact factor: 6.1; JCR: Q1 (Biochemical Research Methods)
Citations: 0
Abstract
Mapping Start Codons of Small Open Reading Frames by N-Terminomics Approach.
sORF-encoded peptides (SEPs) are proteins of fewer than 100 amino acids encoded by small open reading frames (sORFs), and they play important roles in various life activities. Analysis of known SEPs showed that the use of non-canonical initiation codons is more common among SEPs. However, current analysis of SEP sequences relies mainly on bioinformatics prediction, and most predictions use AUG as the start site, which may not be entirely correct for SEPs. Chemical labeling was used to systematically analyze the N-terminal sequences of SEPs and accurately define their start sites. By comparison, we found that dimethylation and guanidinylation are more efficient than acetylation. ACN precipitation and heating precipitation performed better in SEP enrichment. As an N-terminal peptide enrichment material, Hexadhexaldehyde was superior to CNBr-activated agarose and NHS-activated agarose. Combining these methods, we identified 128 SEPs with 131 N-terminal sequences. Two-thirds of these are novel N-terminal sequences, and most of them start between the 11th and 31st amino acids of the original sequence. Some of the novel N-termini were produced by proteolysis or signal peptide removal. The annotated start sites of some SEPs were corrected to non-AUG start codons. One novel start codon was validated using GFP-tag vectors. These results demonstrate that chemical labeling approaches are useful for identifying the start codons of sORFs and the true N-termini of their encoded peptides, which helps to better characterize SEPs.
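The step from an observed N-terminal peptide to a candidate start codon can be illustrated with a small sketch. This is not the authors' pipeline: the function names, the toy sequences, and the definition of a "start-like" codon (AUG or a codon differing from AUG at one position) are assumptions made for illustration. The sketch also assumes that the initiator methionine is retained and that the initiating codon is decoded as Met whatever its sequence; it does not handle methionine excision, proteolysis, or signal peptide removal.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_start_like(codon):
    """Assumed rule: AUG, or a codon with exactly one mismatch to AUG."""
    return sum(a != b for a, b in zip(codon, "AUG")) <= 1


def map_n_terminus(mrna, peptide):
    """Return (nucleotide_offset, codon) candidates for the start of `peptide`.

    Hypothetical helper: scans all three forward frames of `mrna` and reports
    start-like codons that are followed in frame by the remainder of the peptide.
    """
    mrna = mrna.upper().replace("T", "U")
    hits = []
    for frame in range(3):
        codons = [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]
        protein = "".join(CODON_TABLE.get(c, "X") for c in codons)
        for i, codon in enumerate(codons):
            if not is_start_like(codon):
                continue
            # The initiating codon is assumed to be read as Met, so only the
            # residues after it are compared with the table translation.
            if peptide[0] == "M" and protein[i + 1:i + len(peptide)] == peptide[1:]:
                hits.append((frame + 3 * i, codon))
    return hits


# Toy transcript: GCC CUG GCU AAA GGU UAA, with observed N-terminus "MAKG".
print(map_n_terminus("GCCCUGGCUAAAGGUUAA", "MAKG"))
```

In this toy case the only codon consistent with the peptide is the CUG at nucleotide offset 3, so a start site annotated at a downstream or upstream AUG would be reassigned to a non-AUG codon. Real data would also require checking the labeled N-terminal mass shift and alternative explanations such as proteolytic processing.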
About the journal:
The mission of MCP is to foster the development and applications of proteomics in both basic and translational research. MCP will publish manuscripts that report significant new biological or clinical discoveries underpinned by proteomic observations across all kingdoms of life. Manuscripts must define the biological roles played by the proteins investigated or their mechanisms of action.
The journal also emphasizes articles that describe innovative new computational methods and technological advancements that will enable future discoveries. Manuscripts describing such approaches do not have to include a solution to a biological problem, but must demonstrate that the technology works as described, is reproducible and is appropriate to uncover yet unknown protein/proteome function or properties using relevant model systems or publicly available data.
Scope:
-Fundamental studies in biology, including integrative "omics" studies, that provide mechanistic insights
-Novel experimental and computational technologies
-Proteogenomic data integration and analysis that enable greater understanding of physiology and disease processes
-Pathway and network analyses of signaling that focus on the roles of post-translational modifications
-Studies of proteome dynamics and quality controls, and their roles in disease
-Studies of evolutionary processes affecting proteome dynamics, quality and regulation
-Chemical proteomics, including mechanisms of drug action
-Proteomics of the immune system and antigen presentation/recognition
-Microbiome proteomics, host-microbe and host-pathogen interactions, and their roles in health and disease
-Clinical and translational studies of human diseases
-Metabolomics to understand functional connections between genes, proteins and phenotypes