
Journal of bioinformatics and systems biology : Open access: Latest articles

Abstract 259: Comparison of Illumina NovaSeq 6000 and MGISEQ-2000 in profiling xenograft models
Pub Date : 2021-07-01 DOI: 10.1158/1538-7445.AM2021-259
W. Qian, Chen Xiaobo, H. Li, Sheng Guo
Citations: 2
Abstract 191: A probabilistic analysis of somatic mutations indicates individual survival outcome classes with AUC near 1.00 for all tested cancer-drug combinations from TCGA and 4 immune checkpoint studies (all having ≥ 20 patients and an outcome ratio < 6)
Pub Date : 2021-07-01 DOI: 10.1158/1538-7445.AM2021-191
J. Friedman
A new computational method for predicting cancer treatment outcomes from somatic mutation data was tested. Using this method, treatment success or failure for 78 different cancer-drug combinations (74 from TCGA and 4 from published immune checkpoint inhibitor studies) could be "predicted" for each patient with nearly perfect accuracy (AUC values from ROC curves at or just below 1.000) based solely on individual patients' somatic mutation information. Predictions worked for all examined cancer-drug combinations with information available for > 20 patients and with a treatment SUCCESS to FAILURE ratio between 1/6 and 6. Calculations disregarded outcome information about the patient for whom an outcome was being predicted, but so far only when calculating that patient's own classification measure. More elaborate, independent calculations are being developed to eliminate the remnants of outcome information from one patient in the classification measures calculated for other predicted patients; these newer, more detailed calculations are ongoing. The methods avoid any (1) fitting of parameters to outcome or data, (2) use of linear algebraic methods, (3) determination of scale factor values, and (4) use of some typically inaccurate types of experimentally estimated probability values. Instead, they use (1) more accurate metastatistics about an accurately determined type of probability value (the probability that the observed frequency of mutation for a gene differs from random in either the responder or the non-responder population, each considered separately) and (2) an analysis of some underlying causes of modeling bias, examining how sensitive the identification of non-random mutation frequencies is to changes due to single patients. Statistics entailing extrapolation to an infinite sampling limit were avoided in favor of statistics more applicable to small finite samples.
When one patient with a "known" outcome was deliberately varied in a systematic, non-random way, critical statistics exhibited consistent changes that differed depending on whether the varied patient belonged to the HIT or MISS outcome class, and these changes remained consistent with outcome class when patients of "unknown" outcome were varied in a similar way. The analysis provided a quantitative mathematical explanation for why FLAG genes had appeared often in many GWAS and suggested that the mutational burden measure often used as a marker in checkpoint inhibitor studies might suffer from similar complications. Prospective studies are being planned. Citation Format: Jonathan Malcolm Friedman. A probabilistic analysis of somatic mutations indicates individual survival outcome classes with AUC near 1.00 for all tested cancer-drug combinations from TCGA and 4 immune checkpoint studies (all having ≥ 20 patients and an outcome ratio < 6)
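For readers unfamiliar with the metric quoted in Abstract 191, the AUC of a ROC curve equals the probability that a randomly chosen patient from the success class receives a higher classification score than a randomly chosen patient from the failure class, with ties counting one half. The sketch below illustrates only that definition. It is not the author's method, the helper name `roc_auc` is hypothetical, and the scores and labels are invented for illustration.

```python
def roc_auc(scores, labels):
    """AUC by pairwise comparison: Mann-Whitney U divided by n_pos * n_neg."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each outcome class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # success case correctly ranked above failure case
            elif p == n:
                wins += 0.5      # ties count one half
    return wins / (len(pos) * len(neg))


# Invented classification measures for six patients (1 = success, 0 = failure).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
```

An AUC of exactly 1.0 requires every success case to outrank every failure case, which is why any outcome information leaking into the scores matters so much when interpreting values this close to 1.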
Citations: 0
Abstract 175: Pathway modeling to translate the 27-gene immuno-oncology algorithm into bladder cancer
Pub Date : 2021-07-01 DOI: 10.1158/1538-7445.AM2021-175
R. Seitz, T. Nielsen, B. Schweitzer, D. Hout, D. Ross
Background: The 27-gene immuno-oncology (IO) algorithm has demonstrated an association with immune checkpoint inhibitor (ICI) response in TNBC, NSCLC, and metastatic urothelial carcinoma (mUC). The algorithm can be run on data generated either from a qPCR assay or from analysis of whole-transcriptome RNA-seq data. It integrates gene expression information from infiltrating inflammatory cells with signatures from surrounding stroma and tumor cells to classify cases into likely responders versus non-responders. We hypothesized that, because the algorithm derives its biologic signature from the tumor immune microenvironment (TIME), the classification function and thresholds might translate to other solid tissue types based upon biologic separation of inflammatory phenotypes. Methods: Using NSCLC and breast cancer datasets from TCGA, we identified 939 genes that comprise the Mesenchymal (M), Mesenchymal Stem-like (MSL), and Immunomodulatory (IM) gene expression patterns centered around a previously described 101-gene signature (Ring, 2016). We applied this 939-gene set to 433 bladder samples from TCGA (UC) and k-means clustered the genes based upon each of the three centroids. Clinical cases were also organized by k-means clustering (k=3). Pathway analysis was performed (GSEA, UCSD/Broad). We assessed classification of UC cases by looking at enrichment of inflammatory pathways into the IM cluster compared to mesenchymal pathways into the M or MSL clusters. The threshold for responder classification using the 27-gene IO algorithm previously established in TNBC was assessed by quantitating the fraction of cases enriched into the IM cluster (potential responders) as opposed to the M or MSL clusters (potential non-responders). Results: The 939 genes centered around the 101-gene signature encoded twenty different physiologic pathways. Ten of these pathways included at least one of the genes from the 27-gene IO algorithm.
Significant enrichment of inflammatory cell pathways was seen in the IM cluster, as opposed to mesenchymal and reactive fibroblast pathways, which were enriched in the M and MSL clusters. Pathways containing therapeutic targets designed to overcome resistance to ICIs were enriched in the MSL gene expression centroid. The 27-gene IO algorithm threshold applied to the TCGA samples classified 79% as responders in the IM cluster, as opposed to 16% in the M and MSL clusters. Discussion: These results support the hypothesis that gene expression signatures discerning TIME physiology associated with ICI response are tissue agnostic and relevant in multiple solid tissue types. The dramatic enrichment of responders into the IM cluster using previously established thresholds is consistent with appropriate biologic classification of the cases and supports utilizing the 27-gene IO algorithm and established threshold for association with ICI response in treated mUC cohorts. Citation Format: Robert S. Seitz, Tyler J. Nielsen, Brock L. Schweitzer, David R. Hout, Douglas T. Ross. Pathway modeling to translate the 27-gene immuno-oncology algorithm into bladder cancer [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 175.
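The clustering and enrichment steps in Abstract 175 can be pictured with a toy example: cluster cases with k-means (k=3), then report the fraction of cases in each cluster that a separate classifier calls responders. This is a minimal sketch under stated assumptions, not the authors' pipeline. The points are invented two-dimensional values rather than TCGA expression data, the responder calls are invented, and starting the centroids at the first k points is a simplification chosen to keep the run deterministic.

```python
import math


def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm; centroids start at the first k points (assumption)."""
    centroids = [list(p) for p in points[:k]]
    assign = None
    for _ in range(iters):
        # Assignment step: index of the nearest centroid by Euclidean distance.
        new_assign = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assign == assign:
            break  # assignments are stable, so the centroids are too
        assign = new_assign
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids


# Invented per-case coordinates forming three well-separated groups.
points = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0),
          (0.2, 0.1), (5.1, 4.9), (9.8, 0.2),
          (0.1, 0.3), (4.9, 5.2), (10.1, 0.1)]
assign, centroids = kmeans(points, k=3)

# Invented responder calls (1 = responder) from a separate, hypothetical classifier.
calls = [1, 0, 0, 1, 1, 0, 1, 0, 0]
responder_fraction = {
    c: sum(call for call, a in zip(calls, assign) if a == c) / assign.count(c)
    for c in range(3)
}
```

Comparing `responder_fraction` across clusters is the toy analogue of the 79% versus 16% comparison reported in the abstract.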
Citations: 0
Abstract 265: Evaluating variation in drug efficacy endpoints in a syngeneic mouse model (CT26.WT) under immune checkpoint blockade
Pub Date : 2021-07-01 DOI: 10.1158/1538-7445.AM2021-265
B. Mao, Sheng Guo, D. Ouyang, H. Li
Citations: 0
Abstract 239: Integrated computational image analysis of cellular and acellular tissue components as a method for detailed tumor tissue mapping and structural patterns recognition
Pub Date : 2021-07-01 DOI: 10.1158/1538-7445.AM2021-239
G. Vasiukov, Tatiana Novitskaya, M. Senosain, A. Menshikh, A. Zijlstra, S. Novitskiy, P. Massion
The tumor microenvironment (TME) is an integrated system that affects cancer cell behavior and contributes directly to disease outcome. A systemic approach to the analysis of the TME should uncover its complexity and facilitate the discovery of mechanisms orchestrating tumor development and metastasis. Multiplex fluorescence tissue staining followed by spatial analysis of tumor tissue architecture can provide insights into pivotal interactions between cellular and acellular components of the TME. The extracellular matrix (ECM) is represented mainly by collagen deposition. A number of reports indicate that the ECM's contribution to the state of the TME depends not only on the amount of accumulated collagen but also on the geometrical features and spatial orientation of its fibers. These characteristics of collagen fibers contribute directly to the physical and mechanical properties of tissue and can change tumor growth and metastasis. Current methods of computational tissue image analysis assess cellular and acellular components separately. The goal of the current work was to develop a new computational tool that performs integrated, spatially dependent analysis of the fibrous and cellular components of tumor tissue to achieve detailed tumor tissue mapping and structural pattern recognition. To pursue this goal, we generated images of human lung adenocarcinoma tissue characterized by indolent and aggressive behavior. We performed multiplex immunofluorescence staining for the following markers: CD3 (marker of T-lymphocytes), PanCytokeratin (marker of epithelial/tumor cells), collagen hybridizing peptide (3Helix; marker of collagen), and DAPI (nuclear counterstain). To develop the image analysis pipeline, we used KNIME, an open-source analytical platform with a graphical interface, in which we built a modular workflow. For ECM analysis, we integrated code written in Python into a KNIME node.
Segmentation of collagen fibers was performed using skeletonization, with subsequent calculation of the geometrical properties (length, alignment, width) and orientation of each fiber. Data collected from single-cell analysis and ECM architecture assessment were combined and forwarded to downstream spatial analysis, where cell-to-cell and cell-to-ECM distances were computed and neighborhood analysis was performed. We demonstrated that tumor cells in aggressive adenocarcinoma samples were co-localized with a smaller number of collagen fibers. In addition, those fibers were shorter than in the indolent group. Correlation analysis revealed a positive correlation between the length of collagen fibers and the number of tumor cells in the indolent group, but we did not observe this phenomenon in indolent group. The developed computational method adds dimensionality to tissue image analysis and can reveal underrecognized structural patterns of the tumor microenvironment. Citation Format: Georgii Vasiukov, Tatiana Novitskaya, Maria-Fernanda Senosain, Anna Menshikh, Andries Zijlstra, Sergey Novitskiy, Pierre Massion. Integrated computational image analysis of cellular and acellular tissue components as a method for detailed tumor tissue mapping and structural patterns recognition [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 239.
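The per-fiber measurements and cell-to-ECM distances described in Abstract 239 can be sketched as follows, assuming skeletonization has already produced, for each fiber, an ordered list of (x, y) points. This is an illustrative sketch, not the authors' KNIME/Python code. The function names and coordinates are hypothetical, orientation is taken from the chord between a fiber's endpoints, and distance is measured to the nearest skeleton point rather than to the nearest point on a segment.

```python
import math


def fiber_length(path):
    """Sum of segment lengths along an ordered list of (x, y) skeleton points."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def fiber_orientation(path):
    """Angle in degrees, in [0, 180), of the chord joining the fiber's endpoints."""
    (x0, y0), (x1, y1) = path[0], path[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0


def nearest_fiber_distance(cell, fibers):
    """Smallest distance from a cell centroid to any skeleton point of any fiber."""
    return min(math.dist(cell, pt) for path in fibers for pt in path)


# Invented skeleton paths (pixel coordinates) for two fibers.
fibers = [
    [(0, 0), (3, 4), (6, 8)],   # diagonal fiber
    [(10, 0), (10, 5)],         # vertical fiber
]
lengths = [fiber_length(f) for f in fibers]
angles = [fiber_orientation(f) for f in fibers]

# Invented centroid of one segmented tumor cell.
distance = nearest_fiber_distance((10, 2), fibers)
```

Taking the angle modulo 180 treats a fiber as undirected, so a fiber traced from either end reports the same orientation.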
Citations: 0
Abstract 197: MONE: A construction for interpreting deep learning features in pathology slides
Pub Date : 2021-07-01 DOI: 10.1158/1538-7445.AM2021-197
Ali Foroughi pour, Jonghanne Park, Jeffrey H. Chuang
Deep learning has become a popular tool for analyzing hematoxylin and eosin (H&E) [...] 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 197.
Citations: 0
Abstract 162: Cholesterol metabolism gene expression and prostate cancer-specific outcomes in radiotherapy-treated patients
Pub Date : 2021-07-01 DOI: 10.1158/1538-7445.AM2021-162
S. Winter, S. Halliday, Konrad H. Stopsack, S. Osman, A. Hounsell, G. Prue, S. Jain, E. Allott
Citations: 0
Abstract 160: LASSO-based protein signatures for survival prediction in human cancer cohorts
Pub Date : 2021-07-01 DOI: 10.1158/1538-7445.AM2021-160
Mariam M. Konaté, Ming-Chung Li, L. McShane, Yingdong Zhao
Background: Large-scale multi-omics data characterizing human tumors are increasingly available and can be leveraged to develop a deeper understanding of biological processes and predict clinical outcomes. Reverse-phase protein array (RPPA) is a high-throughput, antibody-based method that provides a more direct assessment of cellular activity compared to DNA and RNA sequencing, which generate data that do not always correlate with protein expression. Multiple studies have demonstrated the prognostic value of RPPA data. Some of these studies have used pathway-driven approaches, relying on prior knowledge from the literature to group proteins into biological pathways, to develop prognostic signatures or predictors of treatment response. Methods: We obtained normalized RPPA data for up to 258 total, cleaved, acetylated, or phosphorylated protein species from The Cancer Proteome Atlas (TCPA). Starting from a published RPPA-based seven-protein signature of receptor tyrosine kinase (RTK) pathway activity in the form of an unweighted sum of the seven protein measurements, shown to have prognostic value in a 445-patient renal clear cell carcinoma cohort (TCGA-KIRC), we demonstrated that strong stratification of patients into high and low risk groups can be achieved by using a statistical approach—LASSO regression—with no a priori biological knowledge, to select from the 233 proteins and optimally combine their RPPA measurements into a weighted risk score. Method performance was assessed using two unbiased approaches: 1) 10 iterations of 3-fold cross-validation for unbiased estimation of hazard ratio and difference in 5-year survival (by Kaplan-Meier method) between predictor-defined high and low risk groups; and 2) a permutation test to evaluate the statistical significance of the cross-validated log-rank statistic. 
Results: For the first evaluation approach, the median hazard ratio between high and low risk groups across the held-out folds in the cross-validation based on the 7-protein RTK score was 2.4, compared to 3.3 when using the risk score derived by LASSO applied to the training data folds. Furthermore, the median difference in overall survival probability at 5 years based on the LASSO-derived risk score was 32.8%, compared to 25.2% when using the 7-protein RTK score. The permutation test p values were 5.0e-4 for both the RTK pathway-driven and the LASSO data-driven approaches. Finally, we demonstrated the applicability and performance of our approach for overall survival prediction in additional TCGA cohorts; namely, ovarian serous cystadenocarcinoma (TCGA-OVCA), sarcoma (TCGA-SARC), and cutaneous melanoma (TCGA-SKCM). Conclusions: The data-driven nature of our LASSO-based approach makes it versatile and particularly well-suited for the discovery of unexplored protein/disease associations that could aid in therapeutic discovery. Citation Format: Mariam M. Konate, Ming-Chung Li, Lisa McShane, Yingdong Zhao. LASSO-based protein signatures for survival prediction in human cancer cohorts [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 160.
Citations: 0
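The evaluation in Abstract 160 rests on two ideas: a weighted risk score that splits patients into high and low risk groups, and a permutation test for the significance of the group difference. The following is a minimal sketch of both, not the authors' pipeline. The weights, the toy numbers, and the function names are hypothetical, and the absolute difference in group means stands in for the cross-validated log-rank statistic used in the abstract.

```python
import random
import statistics


def risk_scores(measurements, weights):
    """Weighted sum of protein measurements per patient (weights are hypothetical)."""
    return [sum(w * x for w, x in zip(weights, row)) for row in measurements]


def split_by_median(scores):
    """Mark a patient as high risk (True) when the score exceeds the cohort median."""
    cutoff = statistics.median(scores)
    return [s > cutoff for s in scores]


def permutation_p_value(values, labels, n_perm=2000, seed=0):
    """Permutation p-value for the absolute difference in group means.

    Stand-in statistic: the abstract permutes a log-rank statistic instead.
    Both groups must be non-empty; shuffling keeps the group sizes fixed.
    """
    def stat(lbls):
        high = [v for v, flag in zip(values, lbls) if flag]
        low = [v for v, flag in zip(values, lbls) if not flag]
        return abs(statistics.mean(high) - statistics.mean(low))

    rng = random.Random(seed)
    observed = stat(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between labels and values
        if stat(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: four patients, two proteins each (illustrative only).
    scores = risk_scores([[1.0, 0.2], [2.0, 0.1], [3.5, 0.9], [4.0, 1.5]], [0.5, 1.0])
    groups = split_by_median(scores)
    outcome = [1.0, 2.0, 8.0, 9.0]  # hypothetical outcome values
    print(groups, permutation_p_value(outcome, groups, n_perm=200))
```

With the add-one correction, the smallest p-value such a test can report is 1/(n_perm + 1), so the number of permutations sets the resolution of the result.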
Abstract 208: Development of Evidence Statement curation algorithms to aid cancer variant interpretation
Pub Date : 2021-07-01 DOI: 10.1158/1538-7445.AM2021-208
J. Saliba, Lana M. Sheta, Kilannin Krysiak, Arpad M. Danos, Alex R Marr, Erica K. Barnell, Shahil P. Pema, Wan-Hsin Lin, P. Terraf, Joshua F. McMichael, C. Grisdale, Shruti Rao, S. Kiwala, Adam C. Coffman, A. Wagner, O. Griffith, M. Griffith
The Clinical Interpretation of Variants in Cancer (CIViC) knowledgebase (civicdb.org) is an open-access, centralized hub for structured, community-curated and expertly moderated relationships between genomic variants and cancer. Evidence is curated from peer-reviewed, published literature and is classified into one of five Types: Predisposing, Diagnostic, Prognostic, Predictive (therapeutic), or Functional. The robustness of the Evidence is conveyed through the assignment of Levels: the first three are derived from patient studies (Validated, Clinical, Case Study); Preclinical is generated from in vivo or in vitro data; and Inferential describes indirect associations. Each Evidence Item requires an Evidence Statement written in the curator's own words summarizing the source's results regarding the variant's clinical impact. Collaborations with groups like ClinGen have generated a significant influx of new curators, increasing the demand for detailed principles regarding data prioritization in the Evidence Statement in order to streamline the curation process. The curation community would benefit from simpler, visual guides through the complex decisions needed to appropriately and consistently curate Evidence Items. We are devoting significant effort to continue the development of straightforward Evidence curation algorithms (decision trees), similar to those used in clinical molecular testing labs, to aid CIViC curators. Previously published guidelines on development of these statements are the basis of our Evidence algorithms. Obvious inflection points for curators are clearly identified with specific details noted for each to optimize decision efficiency. As the predominant Evidence Type comprising 57% of all CIViC submissions, 58% of referenced patient trials, and 92% of Preclinical submissions, Predictive Evidence is the initial focus of our pilot guidelines with Diagnostic and Prognostic to follow.
Within the Predictive Evidence Type, clinical trials, case studies, and preclinical Levels each require vastly different Evidence Statement details and ultimately the creation of three separate, uniquely modeled algorithms. The implementation of these algorithms will assist in streamlining both curation and the expert review process. Notably, a template is not being created, as the preservation of curator style and voice is important to maintain the community feel of the database. To ensure the highest level of clarity, our team is utilizing specific novice and experienced curators to assist with the development process. As these algorithms pass the pilot phase, they are being tested as curator training tools. Ultimately, these guidelines will be used to encourage independence in curators and to enhance the Evidence already contained in CIViC. Citation Format: Jason Saliba, Lana Sheta, Kilannin Krysiak, Arpad Danos, Alex Marr, Erica Barnell, Shahil Pema, Wan-Hsin Lin, Panieh Terraf, Joshua F. McMichael, Cameron J. Grisdale, Shruti Rao, Susanna Kiwala, Adam Coffman, Alex Wagner, Obi L. Griffith, Malachi Griffith. Development of Evidence Statement curation algorithms to aid cancer variant interpretation [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 208.
Citations: 0
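Abstract 208 describes routing a curator to one of three separate decision trees inside the Predictive Evidence Type. A rough sketch of that first routing step might look like the code below. The Type and Level names come from the abstract; the guide identifiers, the function name, and the mapping of "clinical trials" to the Clinical Level are assumptions made for illustration and are not part of CIViC.

```python
from typing import Optional

# Evidence Types and Levels as listed in the abstract.
EVIDENCE_TYPES = {"Predisposing", "Diagnostic", "Prognostic", "Predictive", "Functional"}
EVIDENCE_LEVELS = {"Validated", "Clinical", "Case Study", "Preclinical", "Inferential"}

# Hypothetical identifiers for the three Predictive guides.
PREDICTIVE_GUIDES = {
    "Clinical": "predictive-clinical-trial",
    "Case Study": "predictive-case-study",
    "Preclinical": "predictive-preclinical",
}


def select_guide(evidence_type: str, level: str) -> Optional[str]:
    """Return the guide for an Evidence Item, or None when no guide is described."""
    if evidence_type not in EVIDENCE_TYPES:
        raise ValueError(f"unknown Evidence Type: {evidence_type!r}")
    if level not in EVIDENCE_LEVELS:
        raise ValueError(f"unknown Evidence Level: {level!r}")
    if evidence_type != "Predictive":
        # The abstract lists Diagnostic and Prognostic only as future work.
        return None
    return PREDICTIVE_GUIDES.get(level)


if __name__ == "__main__":
    print(select_guide("Predictive", "Case Study"))
    print(select_guide("Prognostic", "Clinical"))
```

Returning None rather than raising for uncovered combinations keeps the distinction between invalid input and a valid Evidence Item for which no guide has been described yet.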
Abstract 199: Blossom AI: A novel drug discovery app for the prediction of hotspots on multiplex protein protein interaction complexes using random forest algorithms
Pub Date : 2021-07-01 DOI: 10.1158/1538-7445.AM2021-199
Stephanie Zhang, Minsoo Kang
Protein-protein interactions (PPIs) form the backbone of signal transduction pathways in diverse physiological processes, mediating the transmission and regulation of oncogenic signals essential to cellular proliferation and survival, thus representing a potential new class of drug targets for anticancer therapeutic discovery. However, several challenges face the targeting of PPIs, including large PPI interface areas, a lack of deep pockets, the presence of noncontiguous binding sites, and a general lack of natural ligands. The presence of hot spots (small subsets of amino acid residues that contribute significantly to free binding energy) makes PPIs amenable to small molecule perturbations, playing essential roles in the stability of protein binding. Effectively identifying which specific interface residues of protein-protein complexes form the hot spots is critical for understanding the principles of protein interactions and has broad application prospects in protein design and drug development. This project presents Blossom AI, a novel, user-friendly mobile app developed in Xcode and Core ML that uses random forest decision tree algorithms (RF) to computationally predict the presence of hotspots on protein complexes within seconds, aiding the design of small molecule and peptide drugs that target protein-protein interactions, particularly for anticancer therapy. Leveraging features such as solvent accessible surface area (ASA), blocks substitution matrix, physicochemical properties (hydrophobicity, polarity, polarizability, propensities), position specific scoring matrix (PSSM) and solvent exposure, the RF is trained on a dataset of 313 mutated interface residues (133 hotspot residues and 180 non-hotspot residues) from over 60 protein complexes to produce a training accuracy of 88.75%, validation accuracy of 92.86%, specificity of 87.18%, sensitivity of 75.38%, PPV of 94.23%, and NPV of 86.61%.
Blossom is high speed, low cost, and user-friendly, with significantly improved accuracy over the standard of alanine scanning mutagenesis. Citation Format: Stephanie Zhang, Minsoo Kang. Blossom AI: A novel drug discovery app for the prediction of hotspots on multiplex protein protein interaction complexes using random forest algorithms [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 199.
Citations: 1
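Abstract 199 reports specificity, sensitivity, PPV, and NPV for a binary hotspot classifier. As a reminder of how those figures relate to a confusion matrix, here is a small sketch using the standard definitions. The function name is hypothetical, and the counts in the usage lines are illustrative only; they are not taken from the abstract.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: hotspot residues predicted as hotspot / non-hotspot
    tn/fp: non-hotspot residues predicted as non-hotspot / hotspot
    """
    denominators = {
        "sensitivity": tp + fn,   # all true hotspots
        "specificity": tn + fp,   # all true non-hotspots
        "ppv": tp + fp,           # all predicted hotspots
        "npv": tn + fn,           # all predicted non-hotspots
        "accuracy": tp + fp + tn + fn,
    }
    if any(d == 0 for d in denominators.values()):
        raise ValueError("each metric needs a non-zero denominator")
    return {
        "sensitivity": tp / denominators["sensitivity"],
        "specificity": tn / denominators["specificity"],
        "ppv": tp / denominators["ppv"],
        "npv": tn / denominators["npv"],
        "accuracy": (tp + tn) / denominators["accuracy"],
    }


if __name__ == "__main__":
    # Illustrative counts only.
    for name, value in diagnostic_metrics(tp=30, fp=10, tn=50, fn=10).items():
        print(f"{name}: {value:.4f}")
```

Because PPV and NPV depend on the ratio of hotspot to non-hotspot residues in the evaluated set, they are best read alongside the class balance of that set.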