Artificial intelligence in the life sciences: latest articles

Fiscore package: Effective protein structural data visualisation and exploration
Pub Date : 2021-12-01 DOI: 10.1016/j.ailsci.2021.100016
Auste Kanapeckaite

The lack of bioinformatics tools for quickly assessing protein conformational and topological features motivated the creation of an integrative and user-friendly R package. Moreover, the Fiscore package implements a pipeline for Gaussian mixture modelling, making such machine learning methods readily accessible to non-experts. This is especially important since probabilistic machine learning techniques can aid the interpretation of complex biological phenomena when it is necessary to elucidate structural features that might play a role in protein function. Thus, Fiscore builds on a mathematical formulation of protein physicochemical properties that can aid in drug discovery, target evaluation, or relational database building. In addition, the package provides interactive environments to explore various features of interest. Finally, one of the goals of this package was to engage structural bioinformaticians and develop more robust, free R tools that could help researchers who do not necessarily specialise in this field. The Fiscore package (v.0.1.3) is distributed free of charge via CRAN and GitHub.
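
To make the Gaussian mixture modelling step mentioned in this abstract more concrete, the following is a minimal sketch of expectation-maximisation for a one-dimensional mixture. It is not the Fiscore implementation (Fiscore is an R package, and its internals are not described here); the function name `fit_gmm_1d`, the quantile-based initialisation and the variance floor are all assumptions made for illustration.

```python
import math


def fit_gmm_1d(values, k=2, iterations=200):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative only)."""
    data = sorted(values)
    n = len(data)
    # Assumed initialisation: means at evenly spaced quantiles of the data.
    means = [data[int((j + 0.5) / k * n)] for j in range(k)]
    centre = sum(data) / n
    spread = max(sum((v - centre) ** 2 for v in data) / n, 1e-6)
    variances = [spread] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in data:
            dens = [
                w * math.exp(-((v - m) ** 2) / (2 * s)) / math.sqrt(2 * math.pi * s)
                for w, m, s in zip(weights, means, variances)
            ]
            total = sum(dens)
            if total == 0.0:
                resp.append([1.0 / k] * k)  # fall back to equal shares
            else:
                resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, data)) / nj
            var = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances
```

Calling `fit_gmm_1d` on a list of per-residue scores with two well-separated groups should return two components centred on those groups; in practice an established library would normally be preferred over hand-written EM.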

Volume 1, Article 100016
Citations: 0
AutoOmics: New multimodal approach for multi-omics research
Pub Date : 2021-12-01 DOI: 10.1016/j.ailsci.2021.100012
Chi Xu, Denghui Liu, Lei Zhang, Zhimeng Xu, Wenjun He, Hualiang Jiang, Mingyue Zheng, Nan Qiao

Deep learning is very promising for solving problems in omics research, such as genomics, epigenomics, proteomics, and metabolomics. The design of the neural network architecture is very important when modeling omics data for different scientific problems. The residual fully-connected neural network (RFCN) was proposed to provide better neural network architectures for modeling omics data. The next challenge for omics research is how to integrate information from different omics data using deep learning, so that information from different molecular system levels can be combined to predict the target. In this paper, we present a novel multi-omics integration approach named AutoOmics that can efficiently integrate information from different omics data and achieve better accuracy than previous approaches. We evaluated our method on four different tasks: drug repositioning, target gene prediction, breast cancer subtyping and cancer type prediction; all four tasks achieved state-of-the-art performance.

Volume 1, Article 100012
Citations: 1
Identification of bile salt export pump inhibitors using machine learning: Predictive safety from an industry perspective
Pub Date : 2021-12-01 DOI: 10.1016/j.ailsci.2021.100027
Raquel Rodríguez-Pérez, Grégori Gerebtzoff

Bile salt export pump (BSEP) is a transporter that moves bile salts from hepatocytes into bile canaliculi. BSEP inhibition can result in the toxic accumulation of bile salts in the liver, which has been identified as a risk factor for drug-induced liver injury (DILI). Since DILI is a frequent cause of drug withdrawals from the market or failures in drug development, in vitro BSEP activity is measured with the [3H]taurocholate uptake assay, and a half-maximal inhibitory concentration (IC50) higher than 30 µM is advised. Herein, a machine learning classification model was developed to accurately detect BSEP inhibitors and help prioritize in vitro testing. Regression models for the numerical prediction of IC50 values were also generated. Classification and regression models for BSEP inhibition have been evaluated in realistic settings, which is critical prior to ML-based decision making in drug discovery programs. This work illustrates how predictive safety can help in early toxicity risk assessment and compound prioritization by leveraging Novartis historical experimental data.
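
The 30 µM guidance in this abstract suggests how measured IC50 values could be turned into binary labels for a classifier, and how predictions could then order compounds for in vitro testing. The sketch below is only an illustration under stated assumptions: the function names, the compound IDs, and the choice to count a value exactly at the threshold as an inhibitor are hypothetical and not taken from the article.

```python
# Threshold taken from the abstract: an IC50 above 30 µM is advised.
IC50_THRESHOLD_UM = 30.0


def is_bsep_inhibitor(ic50_um, threshold_um=IC50_THRESHOLD_UM):
    """Label a compound as a BSEP inhibitor when its IC50 is at or below the threshold.

    Counting a value exactly at the threshold as an inhibitor is an assumption.
    """
    return ic50_um <= threshold_um


def rank_for_testing(predicted_probabilities):
    """Order compound IDs from highest to lowest predicted inhibition probability."""
    return sorted(
        predicted_probabilities,
        key=predicted_probabilities.get,
        reverse=True,
    )
```

For example, `rank_for_testing({"cpd_1": 0.2, "cpd_2": 0.9})` places `cpd_2` first, so the compound the model considers most likely to inhibit BSEP would be assayed earlier.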

Volume 1, Article 100027
Citations: 7
Computational prediction of frequent hitters in target-based and cell-based assays
Pub Date : 2021-12-01 DOI: 10.1016/j.ailsci.2021.100007
Conrad Stork, Neann Mathai, Johannes Kirchmair

Compounds interfering with high-throughput screening (HTS) assay technologies (also known as “badly behaving compounds”, “bad actors”, “nuisance compounds” or “PAINS”) pose a major challenge to early-stage drug discovery. Many of these problematic compounds are “frequent hitters”, and we have recently published a set of machine learning models (“Hit Dexter 2.0”) for flagging such compounds.

Here we present a new generation of machine learning models derived from a large, manually curated and annotated data set. For the first time, these models cover cell-based assays in addition to target-based assays. Our experiments show that cell-based assays indeed behave differently from target-based assays with respect to hit rates and frequent hitters, and that dedicated models are required to produce meaningful predictions. In addition to these extensions and refinements, we explored a variety of additional modeling setups, including the combination of four machine learning classifiers (i.e. k-nearest neighbors (KNN), extra trees, random forest and multilayer perceptron) with four sets of descriptors (Morgan2 fingerprints, Morgan3 fingerprints, MACCS keys and 2D physicochemical property descriptors).

Testing on holdout data, as well as on data sets of “dark chemical matter” (i.e. compounds that have been extensively tested in biological assays but have never shown activity) and known bad actors, shows that the multilayer perceptron classifiers in combination with Morgan2 fingerprints outperform other setups in most cases. The best multilayer perceptron classifiers obtained Matthews correlation coefficients of up to 0.648 on holdout data. These models are available via a free web service.
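
For readers unfamiliar with the Matthews correlation coefficient reported in this abstract, the sketch below computes it from the four cells of a binary confusion matrix. The function name and the convention of returning 0.0 when the denominator is zero are choices made here for illustration; the counts used in any call are hypothetical and not data from the article.

```python
import math


def matthews_corrcoef_from_counts(tp, fp, tn, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention assumed here: undefined cases are reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

The value ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (perfect agreement), which makes it a useful summary when frequent hitters and non-hitters are imbalanced.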

Volume 1, Article 100007
Citations: 2
Introducing artificial intelligence in the life sciences
Pub Date : 2021-12-01 DOI: 10.1016/j.ailsci.2021.100001
Mingyue Zheng, Carolina Horta Andrade, Jürgen Bajorath
Volume 1, Article 100001
Citations: 1
An in silico pipeline for the discovery of multitarget ligands: A case study for epi-polypharmacology based on DNMT1/HDAC2 inhibition
Pub Date : 2021-12-01 DOI: 10.1016/j.ailsci.2021.100008
Fernando D. Prieto-Martínez, Eli Fernández-de Gortari, José L. Medina-Franco, L. Michel Espinoza-Fonseca

The search for novel therapeutic compounds remains an overwhelming task owing to the time-consuming and expensive nature of the drug development process and its low success rates. Traditional methodologies that rely on the one drug-one target paradigm have proven insufficient for the treatment of multifactorial diseases, leading to a shift to multitarget approaches. In this emerging paradigm, molecules with off-target and promiscuous interactions may result in preferred therapies. In this study, we developed a general pipeline combining machine learning algorithms and a deep generator network to train a dual inhibitor classifier capable of identifying putative pharmacophoric traits. As a case study, we focused on dual inhibitors targeting DNA methyltransferase 1 (DNMT1) and histone deacetylase 2 (HDAC2), two enzymes that play a central role in epigenetic regulation. We used this approach to identify dual inhibitors from a novel large natural product database in the public domain. We used docking and atomistic simulations as complementary approaches to establish the ligand-interaction profiles between the best hits and DNMT1/HDAC2. By combining ligand- and structure-based approaches, we discovered two promising novel scaffolds that can be used to simultaneously target both DNMT1 and HDAC2. We conclude that the proposed pipeline is flexible and adaptable, has the predictive capabilities of similar or derivative methods, and is readily applicable to the discovery of small molecules targeting many other therapeutically relevant proteins.

Volume 1, Article 100008
Citations: 3
Chemistry-centric explanation of machine learning models
Pub Date : 2021-12-01 DOI: 10.1016/j.ailsci.2021.100009
Raquel Rodríguez-Pérez, Jürgen Bajorath
Volume 1, Article 100009
Citations: 9
Novel computational models offer alternatives to animal testing for assessing eye irritation and corrosion potential of chemicals
Pub Date : 2021-12-01 DOI: 10.1016/j.ailsci.2021.100028
Arthur C. Silva, Joyce V.V.B. Borba, Vinicius M. Alves, Steven U.S. Hall, Nicholas Furnham, Nicole Kleinstreuer, Eugene Muratov, Alexander Tropsha, Carolina Horta Andrade

Eye irritation and corrosion are fundamental considerations in developing chemicals to be used in or near the eye, from cleaning products to ophthalmic solutions. Unfortunately, animal testing is currently the standard method to identify compounds that cause eye irritation or corrosion. Yet, there is growing pressure on the part of regulatory agencies both in the USA and abroad to develop New Approach Methodologies (NAMs) that help reduce the need for animal testing and address the unmet need to modernize the safety evaluation of chemical hazards. To further the development and application of computational NAMs in chemical safety assessment, in this study we collected the largest expertly curated dataset of compounds tested for eye irritation and corrosion, and used these data to build and validate binary and multi-classification Quantitative Structure-Activity Relationship (QSAR) models that can reliably assess the eye irritation/corrosion potential of novel untested compounds. QSAR models were generated with Random Forest (RF) and Multi-Descriptor Read Across (MuDRA) machine learning (ML) methods, and validated using a 5-fold external cross-validation protocol. These models demonstrated high balanced accuracy (CCR of 0.68–0.88), sensitivity (SE of 0.61–0.84), positive predictive value (PPV of 0.65–0.90), specificity (SP of 0.56–0.91), and negative predictive value (NPV of 0.68–0.85). Overall, MuDRA models outperformed RF models and were applied to predict the irritation/corrosion potential of compounds from the Inactive Ingredient Database, which contains components present in FDA-approved drug products, and from the Cosmetic Ingredient Database, the European Commission source of information on cosmetic substances. All models built and validated in this study are publicly available at the STopTox web portal (https://stoptox.mml.unc.edu/). These models can be employed as reliable tools for identifying potential eye irritant/corrosive compounds.
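
The five statistics quoted in this abstract (CCR, SE, SP, PPV, NPV) all follow from a binary confusion matrix, with balanced accuracy taken as the mean of sensitivity and specificity. The sketch below shows those standard definitions; the function names and the use of NaN for undefined ratios are assumptions made for illustration, and any counts passed in are hypothetical rather than results from the study.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the ratio is undefined."""
    return numerator / denominator if denominator else float("nan")


def binary_classification_metrics(tp, fp, tn, fn):
    """Compute SE, SP, CCR, PPV and NPV from confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)  # SE: fraction of true irritants found
    specificity = _ratio(tn, tn + fp)  # SP: fraction of non-irritants cleared
    return {
        "SE": sensitivity,
        "SP": specificity,
        "CCR": (sensitivity + specificity) / 2,  # balanced accuracy
        "PPV": _ratio(tp, tp + fp),
        "NPV": _ratio(tn, tn + fn),
    }
```

In a 5-fold external cross-validation, a function like this would typically be applied to the held-out predictions of each fold and the results summarised across folds.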

Volume 1, Article 100028
Citations: 6
Current status of active learning for drug discovery
Pub Date : 2021-12-01 DOI: 10.1016/j.ailsci.2021.100023
Jie Yu, Xutong Li, Mingyue Zheng

Active learning (AL) has been widely used in drug discovery and design in recent years. In this viewpoint, we briefly summarize applications of AL for drug discovery and propose two potential limitations of research in this field.
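
As background for readers new to active learning, the sketch below illustrates uncertainty sampling, one common query strategy in which the next compounds sent for labelling are those whose predicted probability is closest to 0.5. The abstract does not say which strategies the viewpoint covers, so this is a generic illustration; the function name and compound IDs are hypothetical.

```python
def select_most_uncertain(predicted_probabilities, batch_size):
    """Pick the compounds whose predicted activity probability is nearest 0.5.

    predicted_probabilities maps a compound ID to a model's probability of
    activity; ties are broken by compound ID to keep the result deterministic.
    """
    ranked = sorted(
        predicted_probabilities.items(),
        key=lambda item: (abs(item[1] - 0.5), item[0]),
    )
    return [compound_id for compound_id, _ in ranked[:batch_size]]
```

In an active learning loop, the selected compounds would be tested experimentally, added to the training set, and the model refitted before the next selection round.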

Volume 1, Article 100023
Citations: 7
Machine learning in agriculture domain: A state-of-art survey
Pub Date : 2021-12-01 DOI: 10.1016/j.ailsci.2021.100010
Vishal Meshram, Kailas Patil, Vidula Meshram, Dinesh Hanchate, S.D. Ramkteke

Food is a basic human need that is met through farming. Agriculture not only fulfills basic human needs but is also a source of employment worldwide. In developing countries such as India, agriculture is considered a backbone of the economy and a source of employment; it contributes 15.4% of India's GDP. Agricultural activities are broadly categorized into three major areas: pre-harvesting, harvesting, and post-harvesting. Advances in machine learning have helped improve gains in agriculture. Machine learning helps farmers minimize farming losses by providing rich recommendations and insights about crops. This paper presents an extensive survey of the latest machine learning applications in agriculture that address problems in the three areas of pre-harvesting, harvesting, and post-harvesting. Applying machine learning in agriculture allows more efficient and precise farming, with less manpower and high-quality production.

Artificial intelligence in the life sciences, vol. 1, Article 100010. PDF: https://www.sciencedirect.com/science/article/pii/S2667318521000106/pdfft?md5=d2887b03e3cdff4a52c5bc0462338732&pid=1-s2.0-S2667318521000106-main.pdf
Citations: 107