Binary Classification of the Endocrine Disrupting Chemicals by Artificial Neural Networks
Zahir Aghayev, George F Walker, Funda Iseri, Moustafa Ali, Adam T Szafran, Fabio Stossi, Michael A Mancini, Efstratios N Pistikopoulos, Burcu Beykal
ESCAPE. European Symposium on Computer Aided Process Engineering, vol. 52, pp. 2631-2636, 2023 (Epub 2023-07-18)
DOI: https://doi.org/10.1016/b978-0-443-15274-0.50418-2
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10413412/pdf/
Abstract
We develop a machine learning framework that integrates high-content/high-throughput image analysis with artificial neural networks (ANNs) to separate chemical compounds according to their estrogen receptor activity. Natural and man-made chemicals can disrupt the endocrine system by interfering with hormone action in people and wildlife. Although numerous studies have shed new light on the mechanisms through which these compounds interfere with various hormone receptors, comprehensively evaluating the endocrine-disrupting potential of all existing chemicals and their mixtures by purely in vitro or in vivo approaches remains very challenging. Machine learning can speed up the evaluation of chemical toxicity by learning the underlying patterns in experimental biological activity data. Motivated by this, we train and test ANN classifiers that model the activity of estrogen receptor-α agonists and antagonists at the single-cell level using high-throughput/high-content microscopy descriptors. Our framework preprocesses the experimental data by cleaning, scaling, and feature engineering; only the middle 50% of the values from each sample with detectable receptor-DNA binding are retained in the dataset. Principal component analysis is also applied to reduce the effect of experimental noise, and the projected features are used to build the classification models. The results show that our ANN-based nonlinear data-driven framework classifies the benchmark agonist and antagonist chemicals with 98.41% accuracy.
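The "middle 50%" filtering and the scaling step can be illustrated with a minimal sketch. The abstract does not specify the implementation, so everything here is an assumption: the function names `middle_half` and `standardize` are hypothetical, the filter is applied to one descriptor at a time, "middle 50%" is read as the values between the first and third quartiles, and scaling is read as z-scoring. The PCA and ANN stages are not shown.

```python
import statistics


def middle_half(values):
    """Keep the values between the first and third quartiles (inclusive).

    Assumption: this is one plausible reading of "the middle 50% of the
    values from each sample"; the quartile convention is our choice.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [v for v in values if q1 <= v <= q3]


def standardize(values):
    """Scale values to zero mean and unit population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant descriptor carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical single-cell measurements of one descriptor, with an outlier.
cells = [1, 2, 3, 4, 5, 6, 7, 8, 100]
kept = middle_half(cells)    # the outlier and the low tail are dropped
scaled = standardize(kept)   # zero-mean, unit-variance values
```

Trimming to the interquartile range before scaling keeps extreme single-cell measurements, such as the value 100 above, from dominating the mean and standard deviation used for scaling.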