Novel prediction models for genotoxicity based on biomarker genes in human HepaRG™ cells
Anouck Thienpont, Stefaan Verhulst, Leo A Van Grunsven, Vera Rogiers, Tamara Vanhaecke, Birgit Mertens
ALTEX - Alternatives to Animal Experimentation, vol. 40, no. 2, pp. 271-286. Published 2023-01-01. Journal article.
DOI: 10.14573/altex.2206201 (https://doi.org/10.14573/altex.2206201)
Impact factor: 4.5 (JCR Q2, Medicine, Research & Experimental)
Citations: 1
Abstract
Transcriptomics-based biomarkers are promising new approach methodologies (NAMs) for identifying the molecular events underlying the genotoxic mode of action of chemicals. Previously, we developed the GENOMARK biomarker, consisting of 84 genes selected based on whole-genome DNA microarray profiles of 24 (non-)genotoxic reference chemicals covering different modes of action in metabolically competent human HepaRG™ cells. In the present study, new prediction models for genotoxicity were developed based on an extended reference dataset of 38 chemicals, including existing as well as newly generated gene expression data. Both unsupervised and supervised machine learning algorithms were used. Because unsupervised machine learning did not clearly distinguish between groups, the evaluation focused on two supervised machine learning algorithms: support vector machine (SVM) and random forest (RF). More specifically, the predictive accuracy of the two models was compared, their sensitivity to outliers in one or more biomarker genes was assessed, and their prediction performance for 10 misleading positive chemicals tested at their IC10 concentration was determined. In addition, the applicability of both prediction models to a publicly available gene expression dataset, generated with RNA-sequencing, was investigated. Overall, the RF and SVM models were complementary in their classification of chemicals for genotoxicity. To facilitate data analysis, an online application was developed that combines the outcomes of both prediction models. This research demonstrates that the combination of gene expression data with supervised machine learning algorithms can contribute to the ongoing paradigm shift towards a more human-relevant in vitro genotoxicity testing strategy without the use of experimental animals.
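The abstract does not say how the online application combines the SVM and RF outcomes, so the following is only a minimal sketch under a stated assumption: a simple consensus rule in which a chemical receives a class only when both models agree and is otherwise flagged as equivocal. The function names, class labels, chemical names, and calls are all hypothetical illustrations, not data or code from the study.

```python
from typing import Dict, List

GENOTOXIC = "genotoxic"
NON_GENOTOXIC = "non-genotoxic"
EQUIVOCAL = "equivocal"  # assumed label for disagreement between the two models


def combine_calls(svm_call: str, rf_call: str) -> str:
    """Assumed consensus rule: keep the call if both models agree, else 'equivocal'."""
    for call in (svm_call, rf_call):
        if call not in (GENOTOXIC, NON_GENOTOXIC):
            raise ValueError(f"unexpected call: {call!r}")
    return svm_call if svm_call == rf_call else EQUIVOCAL


def accuracy(predicted: List[str], truth: List[str]) -> float:
    """Fraction of chemicals whose predicted class matches the reference class."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Made-up placeholder data, not results from the paper.
truth: Dict[str, str] = {
    "chem_A": GENOTOXIC,
    "chem_B": NON_GENOTOXIC,
    "chem_C": GENOTOXIC,
    "chem_D": NON_GENOTOXIC,
}
svm_calls: Dict[str, str] = {
    "chem_A": GENOTOXIC,
    "chem_B": NON_GENOTOXIC,
    "chem_C": NON_GENOTOXIC,  # SVM misses this one
    "chem_D": NON_GENOTOXIC,
}
rf_calls: Dict[str, str] = {
    "chem_A": GENOTOXIC,
    "chem_B": NON_GENOTOXIC,
    "chem_C": GENOTOXIC,
    "chem_D": GENOTOXIC,  # RF misses this one
}

names = sorted(truth)
reference = [truth[n] for n in names]
svm_accuracy = accuracy([svm_calls[n] for n in names], reference)
rf_accuracy = accuracy([rf_calls[n] for n in names], reference)
combined = {n: combine_calls(svm_calls[n], rf_calls[n]) for n in names}

print(f"SVM accuracy: {svm_accuracy:.2f}, RF accuracy: {rf_accuracy:.2f}")
for name in names:
    print(f"{name}: {combined[name]}")
```

In this toy data each model errs on a different chemical, so the consensus rule flags both disagreements as equivocal instead of committing to either model's call. That is one way two complementary classifiers could be reported together; other rules (for example, averaging class probabilities) would be equally plausible.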
About the journal:
ALTEX publishes original articles, short communications, reviews, as well as news and comments and meeting reports. Manuscripts submitted to ALTEX are evaluated by two expert reviewers. The evaluation takes into account the scientific merit of a manuscript and its contribution to animal welfare and the 3R principle.