Lysine acetylation (Kace) is an important post-translational modification. Although structural information has proven to be key to improving model effectiveness, it is difficult to obtain in bulk because of experimental limitations. In this study, we propose a spatial coordinate representation that uses a second-order tensor, built according to a defined property sequence, to address these limitations and to depict amino acid positions in 3-D space. Based on the proposed coordinates, we construct optimal complex networks and extract network-derived features. Compared with an existing network construction method, the protein contact network (PCN), these features achieve superior performance, demonstrating that the proposed spatial coordinates can effectively capture global biological information. We further propose a computational model, named 3DCOOR-Kace, that fuses sequence and structure information using DenseNet and a Squeeze-and-Excitation layer. 3DCOOR-Kace achieved a Matthews correlation coefficient (MCC) of 0.7358. On the independent testing set, its MCC is 0.4261 higher than that of MusiteDeep and 0.1660 higher than that of TransPTM, which demonstrates that 3DCOOR-Kace effectively integrates structure and sequence information to improve Kace site identification. Because the 3-D spatial coordinate representation gives site positions directly, without biological experiments, it addresses the experimental limitation and is convenient for both computational methods and biological function research.
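For readers unfamiliar with the reported metric, the MCC can be sketched as follows. This is a minimal illustration of the standard binary-classification definition, not the authors' evaluation code; the function name `mcc` and the confusion-matrix counts in the usage line are hypothetical and are not taken from the study.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    tp/tn/fp/fn are true positives, true negatives, false positives and
    false negatives. Returns a value in [-1, 1]; by common convention,
    0.0 is returned when the denominator is zero.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts, chosen only to illustrate the calculation.
example_score = mcc(tp=90, tn=85, fp=15, fn=10)
print(round(example_score, 4))
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, so it remains informative when acetylated and non-acetylated sites are imbalanced.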
