piCRISPR: Physically informed deep learning models for CRISPR/Cas9 off-target cleavage prediction

Florian Störtz, Jeffrey K. Mak, Peter Minary
Artificial intelligence in the life sciences, published 2023-05-15. DOI: 10.1016/j.ailsci.2023.100075. URL: https://www.sciencedirect.com/science/article/pii/S2667318523000193

Abstract

CRISPR/Cas programmable nuclease systems have become ubiquitous in gene editing. As development progresses, in vivo therapeutic gene editing is increasingly within reach, yet it remains limited by possible adverse side effects from unwanted edits. Recent years have thus seen continuous development of off-target prediction algorithms trained on in vitro cleavage assay data obtained from immortalised cell lines. It has been shown that, in contrast to experimental epigenetic features, computed physically informed features remain underutilised despite correlating considerably more strongly with cleavage activity. Here, we implement state-of-the-art deep learning algorithms and feature encodings for off-target prediction, with emphasis on physically informed features that capture the biological environment of the cleavage site; we therefore term our approach piCRISPR. Features were derived from the large, diverse crisprSQL off-target cleavage dataset. Our best-performing models highlight the importance of sequence context and chromatin accessibility for cleavage prediction and compare favourably with literature-standard prediction performance. We further show that our novel, environmentally sensitive features are crucial to accurate prediction on sequence-identical locus pairs, making them highly relevant for clinical guide design. The source code and trained models are available, ready to use, at github.com/florianst/picrispr.
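
The point about sequence-identical locus pairs can be illustrated with a small sketch. This is not the piCRISPR implementation; it assumes a generic one-hot encoding of the guide and target with a mismatch mask, and appends a single hypothetical environmental feature (here imagined as a chromatin accessibility score). The function names, sequences, and feature values are all illustrative assumptions.

```python
import numpy as np

BASES = "ACGT"


def one_hot(seq: str) -> np.ndarray:
    """Return a (len(seq), 4) one-hot matrix for an A/C/G/T sequence."""
    idx = [BASES.index(b) for b in seq.upper()]
    out = np.zeros((len(seq), 4), dtype=np.float32)
    out[np.arange(len(seq)), idx] = 1.0
    return out


def encode_pair(guide: str, target: str, env_features) -> np.ndarray:
    """Hypothetical encoding: guide one-hot, target one-hot, mismatch mask,
    followed by locus-specific environmental features."""
    if len(guide) != len(target):
        raise ValueError("guide and target must have equal length")
    g, t = one_hot(guide), one_hot(target)
    mismatch = np.array(
        [a != b for a, b in zip(guide.upper(), target.upper())],
        dtype=np.float32,
    )
    seq_part = np.concatenate([g.ravel(), t.ravel(), mismatch])
    return np.concatenate([seq_part, np.asarray(env_features, dtype=np.float32)])


# Illustrative 23-nt guide and an off-target with one mismatch (assumed values).
guide = "GAGTCCGAGCAGAAGAAGAAGGG"
target = "GAGTCCAAGCAGAAGAAGAAGGG"

# Two loci with the same target sequence but different (made-up) accessibility.
locus_a = encode_pair(guide, target, env_features=[0.9])
locus_b = encode_pair(guide, target, env_features=[0.1])

seq_len = 9 * len(guide)  # 4 + 4 one-hot channels plus 1 mismatch flag per position
```

In this sketch the first `seq_len` entries of `locus_a` and `locus_b` are identical, so a model that sees only sequence would necessarily assign both loci the same score; only the appended environmental feature lets the two be distinguished.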
