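Classification of Long Non-Coding RNAs Between Early and Late Stage of Liver Cancers From Non-coding RNA Profiles Using Machine-Learning Approach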

Bioinformatics and Biology Insights (IF 2.3; JCR Q3, BIOCHEMICAL RESEARCH METHODS). Pub Date: 2024-06-05. eCollection Date: 2024-01-01. DOI: 10.1177/11779322241258586
Songtham Anuntakarun, Jakkrit Khamjerm, Pisit Tangkijvanich, Natthaya Chuaypen
Citations: 0

Abstract

Long non-coding RNAs (lncRNAs), which are RNA sequences longer than 200 nucleotides, play a crucial role in regulating gene expression and the biological processes associated with cancer development and progression. Liver cancer is a major cause of cancer-related mortality worldwide, notably in Thailand. Although machine learning has been used extensively to analyze RNA-sequencing data, its use for identifying potential lncRNA biomarkers for cancer, and for liver cancer in particular, remains comparatively limited. In this study, our objective was to identify candidate lncRNAs in liver cancer. We used an lncRNA expression data set from patients with liver cancer, comprising 40,699 lncRNAs sourced from the CancerLivER database. Various feature selection methods and machine-learning approaches were used to identify the candidate lncRNAs. A random forest model built on features extracted from the database achieved an area under the curve (AUC) of 0.840 for classifying early-stage (stage 1) versus late-stage (stages 2, 3, and 4) liver cancer. Five of 23 significant lncRNAs (WAC-AS1, MAPKAPK5-AS1, ARRDC1-AS1, AC133528.2, and RP11-1094M14.11) were differentially expressed between the early and late stages of liver cancer. Based on the Gene Expression Profiling Interactive Analysis (GEPIA) database, higher expression of WAC-AS1, MAPKAPK5-AS1, and ARRDC1-AS1 was associated with shorter overall survival. In conclusion, the classification model could predict the early and late stages of liver cancer from the signature expression of lncRNA genes. The identified lncRNAs might be used as early diagnostic and prognostic biomarkers for patients with liver cancer.
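
The workflow summarized in the abstract (feature selection, then a random forest scored by AUC on an early versus late split) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic stand-ins for the CancerLivER expression matrix, the selector (`SelectKBest` with an ANOVA F-test), the 70/30 split, and the forest settings are all choices made here, and `stage_to_label` and `fit_and_score` are hypothetical helper names. The default of 23 selected features only echoes the count of significant lncRNAs mentioned in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline


def stage_to_label(stage: int) -> int:
    """Binary target as described in the abstract: 0 = early (stage 1), 1 = late (stages 2-4)."""
    if stage not in (1, 2, 3, 4):
        raise ValueError(f"unexpected stage: {stage}")
    return 0 if stage == 1 else 1


def fit_and_score(X, y, n_features=23, seed=0):
    """Hypothetical helper: select features, fit a random forest, return (model, test AUC)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    # Keeping the selector inside the pipeline means it is fitted on the
    # training split only, so the held-out samples do not leak into selection.
    model = Pipeline([
        ("select", SelectKBest(score_func=f_classif, k=n_features)),
        ("forest", RandomForestClassifier(n_estimators=200, random_state=seed)),
    ])
    model.fit(X_train, y_train)
    # Column 1 of predict_proba is the probability of label 1 ("late").
    prob_late = model.predict_proba(X_test)[:, 1]
    return model, roc_auc_score(y_test, prob_late)


# Synthetic stand-in for an expression matrix (samples x lncRNAs); assumption only.
X, y = make_classification(
    n_samples=300, n_features=500, n_informative=20, n_redundant=0,
    n_clusters_per_class=1, class_sep=1.5, random_state=0,
)
model, auc = fit_and_score(X, y)
print(f"AUC on synthetic data: {auc:.3f}")
```

With real data, `X` would hold per-patient lncRNA expression values and `y` would come from applying `stage_to_label` to each patient's recorded stage; the AUC printed here reflects only the synthetic data and says nothing about the paper's reported 0.840.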

Source journal
Bioinformatics and Biology Insights (category: BIOCHEMICAL RESEARCH METHODS)
CiteScore: 6.80
Self-citation rate: 1.70%
Articles published: 36
Review time: 8 weeks
Journal description: Bioinformatics and Biology Insights is an open access, peer-reviewed journal that considers articles on bioinformatics methods and their applications which must pertain to biological insights. All papers should be easily amenable to biologists and as such help bridge the gap between theories and applications.
Latest articles in this journal
Regulatory Element Analysis and Comparative Genomics Study of Heavy Metal-Resistant Genes in the Complete Genome of Cupriavidus gilardii CR3.
Haplotypic Distribution of SARS-CoV-2 Variants in Cases of Intradomiciliary Infection in the State of Rondônia, Western Amazon.
The TWW Growth Model and Its Application in the Analysis of Quantitative Polymerase Chain Reaction.
Unlocking Benzosampangine's Potential: A Computational Approach to Investigating, Its Role as a PD-L1 Inhibitor in Tumor Immune Evasion via Molecular Docking, Dynamic Simulation, and ADMET Profiling.
Drug Repositioning for Scorpion Envenomation Treatment Through Dual Inhibition of Chlorotoxin and Leiurotoxin.