Identifying Enhancers and Their Strength Based on PCWM Feature by A Two-Layer Predictor
Huan Yang, Shunfang Wang
The Fifth International Conference on Biological Information and Biomedical Engineering, 2021-07-20
DOI: 10.1145/3469678.3469707 (https://doi.org/10.1145/3469678.3469707)
Abstract
Enhancers are short regions of DNA that can be bound by proteins; once bound, they strengthen gene transcription. Identifying enhancers with traditional biological experiments is time-consuming and expensive, so computational methods are increasingly applied to gene identification. This study makes two contributions. First, it proposes a new feature, PCWM, which combines the normalized frequency of k-tuple nucleotides in a DNA sequence, used as a weight, with the physicochemical properties of those k-tuples to obtain DNA sequence features. Second, it proposes a two-layer model that processes these sequence features to predict whether a sequence is an enhancer and how strong it is. On an independent test set, the new feature improves prediction accuracy for both tasks, reaching 77.0% for enhancer identification and 69.5% for enhancer strength. It outperforms two classical feature methods and improves on the results reported in the existing literature. The method is an effective supplement to existing research methods.
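The abstract does not give the PCWM formula, so the following is only a minimal sketch of the general idea it describes: normalized k-tuple frequencies used as weights on per-k-tuple property values, followed by a two-layer decision. Every name here (`kmer_frequencies`, `weighted_property_features`, `two_layer_predict`, `TOY_PROPERTIES`) is hypothetical, the weighted sum is an assumed form rather than the authors' definition, and the property table holds simple composition counts as stand-ins, not real physicochemical values.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k):
    """Normalized frequency of each k-tuple in seq (sliding window, step 1)."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    total = len(kmers)
    return {kmer: n / total for kmer, n in Counter(kmers).items()}


def weighted_property_features(seq, k, properties):
    """One feature per property: sum of frequency * property value over k-tuples.

    ASSUMPTION: this weighted sum is an illustrative guess, not the paper's PCWM.
    k-tuples missing from the property table (e.g. containing 'N') are skipped.
    """
    n_props = len(next(iter(properties.values())))
    features = [0.0] * n_props
    for kmer, freq in kmer_frequencies(seq, k).items():
        if kmer not in properties:
            continue
        for j, value in enumerate(properties[kmer]):
            features[j] += freq * value
    return features


def two_layer_predict(features, is_enhancer, is_strong):
    """Layer 1 decides enhancer vs. non-enhancer; layer 2 decides strength."""
    if not is_enhancer(features):
        return "non-enhancer"
    return "strong enhancer" if is_strong(features) else "weak enhancer"


# Stand-in property table for dinucleotides: (G/C count, purine count).
# These are NOT physicochemical measurements; they only make the sketch runnable.
TOY_PROPERTIES = {
    "".join(p): (
        float(sum(base in "GC" for base in p)),
        float(sum(base in "AG" for base in p)),
    )
    for p in product("ACGT", repeat=2)
}

if __name__ == "__main__":
    feats = weighted_property_features("ACGTGGCA", 2, TOY_PROPERTIES)
    # The two callables stand in for trained classifiers; thresholds are arbitrary.
    label = two_layer_predict(feats, lambda f: f[0] > 1.0, lambda f: f[1] > 1.2)
    print(feats, label)
```

In practice the two callables would be replaced by trained classifiers, one fitted on enhancer versus non-enhancer sequences and one on strong versus weak enhancers, and the property table by published physicochemical values for the chosen k.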