Pub Date : 2021-07-01DOI: 10.1158/1538-7445.AM2021-259
W. Qian, Chen Xiaobo, H. Li, Sheng Guo
{"title":"Abstract 259: Comparison of Illumina NovaSeq 6000 and MGISEQ-2000 in profiling xenograft models","authors":"W. Qian, Chen Xiaobo, H. Li, Sheng Guo","doi":"10.1158/1538-7445.AM2021-259","DOIUrl":"https://doi.org/10.1158/1538-7445.AM2021-259","url":null,"abstract":"","PeriodicalId":73617,"journal":{"name":"Journal of bioinformatics and systems biology : Open access","volume":"181 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85004100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-07-01DOI: 10.1158/1538-7445.AM2021-191
J. Friedman
A new computational method to predict cancer treatment outcomes from somatic mutation data was tested. Using this method, treatment success or failure for 78 different cancer-drug combinations (74 from TCGA and 4 from published immune checkpoint inhibitor studies) could be "predicted" for each patient with nearly perfect accuracy (AUC values from ROC curves at 1.000 or just below) based solely on individual patients' somatic mutation information. Predictions worked for all examined cancer-drug combinations with information available for > 20 patients and with a treatment SUCCESS to FAILURE ratio between 1/6 and 6. Calculations disregarded outcome information about the patient for whom an outcome was being predicted, but so far only when calculating that patient's own classification measure. More elaborate, independent calculations are being developed to eliminate the remnants of outcome information from one patient in classification measures calculated for other predicted patients; these calculations are ongoing. The methods avoid any (1) fitting of parameters to outcome or data, (2) use of linear algebraic methods, (3) determinations of scale factor values, and (4) use of some typically inaccurate types of experimentally estimated probability values. Instead, they use (1) more accurate metastatistics about an accurately determined type of probability value (the probability that the observed frequency of mutation for a gene differs from random in either the responder or the non-responder population, considered separately) and (2) an analysis of some underlying causes of modeling bias, examining how sensitive the identification of non-random mutation frequencies is to changes due to single patients. Statistics entailing extrapolation to an infinite sampling limit were avoided in favor of statistics more applicable to small finite samples.
When one patient with a "known" outcome was deliberately varied in a systematic, non-random way, critical statistics exhibited consistent changes that differed depending on whether the varied patient belonged to the HIT or MISS outcome class, and these changes remained consistent with outcome class when patients of "unknown" outcome were varied in a similar way. The analysis provided a quantitative mathematical explanation for why FLAG genes had appeared often in many GWAS and suggested that the mutational burden measure often used as a marker in checkpoint inhibitor studies might suffer from similar complications. Prospective studies are being planned. Citation Format: Jonathan Malcolm Friedman. A probabilistic analysis of somatic mutations indicates individual survival outcome classes with AUC near 1.00 for all tested cancer-drug combinations from TCGA and 4 immune checkpoint studies (all having ≥ 20 patients and an outcome ratio < 6)
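The evaluation logic described in this abstract (an inclusion rule on cohort size and outcome ratio, scoring each patient without that patient's own outcome, and summarizing with ROC AUC) can be illustrated with a minimal Python sketch. The scoring rule below, a comparison against class means of a single made-up feature, is a hypothetical stand-in and not the abstract's probabilistic method; the names `roc_auc`, `eligible`, and `leave_one_out_scores` are assumptions introduced for illustration. The sketch also does not address the residual leakage of outcome information between patients that the abstract itself flags.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient in each outcome class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def eligible(n_success, n_failure):
    """Inclusion rule as worded in the abstract: > 20 patients, SUCCESS/FAILURE ratio in [1/6, 6]."""
    if n_success + n_failure <= 20 or n_failure == 0:
        return False
    return 1 / 6 <= n_success / n_failure <= 6


def leave_one_out_scores(features, labels):
    """Score each patient from class means computed WITHOUT that patient (hypothetical rule)."""
    scores = []
    for i, x in enumerate(features):
        rest = [(f, y) for j, (f, y) in enumerate(zip(features, labels)) if j != i]
        hit = [f for f, y in rest if y == 1]
        miss = [f for f, y in rest if y == 0]
        mean_hit = sum(hit) / len(hit)
        mean_miss = sum(miss) / len(miss)
        # Positive score: the held-out patient sits closer to the HIT mean than to the MISS mean.
        scores.append(abs(x - mean_miss) - abs(x - mean_hit))
    return scores


# Toy data (invented): one feature per patient, 1 = HIT, 0 = MISS.
features = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(leave_one_out_scores(features, labels), labels)
```

Holding out the scored patient is only the first of the two safeguards the abstract describes; a fuller check would also recompute every other patient's measure without the held-out outcome.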
{"title":"Abstract 191: A probabilistic analysis of somatic mutations indicates individual survival outcome classes with AUC near 1.00 for all tested cancer-drug combinations from TCGA and 4 immune checkpoint studies (all having ≥ 20 patients and an outcome ratio < 6)","authors":"J. Friedman","doi":"10.1158/1538-7445.AM2021-191","DOIUrl":"https://doi.org/10.1158/1538-7445.AM2021-191","url":null,"PeriodicalId":73617,"journal":{"name":"Journal of bioinformatics and systems biology : Open access","volume":"07 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85977738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-07-01DOI: 10.1158/1538-7445.AM2021-175
R. Seitz, T. Nielsen, B. Schweitzer, D. Hout, D. Ross
Background The 27-gene immuno-oncology (IO) algorithm has demonstrated an association with immune checkpoint inhibitor (ICI) response in TNBC, NSCLC, and metastatic urothelial carcinoma (mUC). The algorithm can be run on data generated from either a qPCR assay or from analysis of whole transcriptome RNA-seq data. It integrates gene expression information from infiltrating inflammatory cells with signatures from surrounding stroma and tumor cells to classify cases into likely responders versus non-responders. We hypothesized that because the algorithm derives its biologic signature from the tumor immune microenvironment (TIME), the classification function and thresholds might translate to other solid tissue types based upon biologic separation of inflammatory phenotypes. Methods Using NSCLC and breast cancer datasets from TCGA, we identified 939 genes that comprise the Mesenchymal (M), Mesenchymal Stem-like (MSL), and Immunomodulatory (IM) gene expression patterns centered around a previously described 101-gene signature (Ring, 2016). We applied this 939-gene set to 433 bladder samples from TCGA (UC) and k-means clustered the genes based upon each of the three centroids. Clinical cases were also organized by k-means clustering (k=3). Pathway analysis was performed (GSEA, UCSD/Broad). We assessed classification of UC cases by looking at enrichment of inflammatory pathways into the IM cluster compared to mesenchymal pathways into the M or MSL clusters. The threshold for responder classification using the 27-gene IO algorithm previously established in TNBC was assessed by quantitating the fraction of cases enriched into the IM cluster (potential responders) as opposed to the M or MSL clusters (potential non-responders). Results The 939 genes centered around the 101-gene signature encoded twenty different physiologic pathways. Ten of these pathways included at least one of the genes from the 27-gene IO algorithm.
Significant enrichment of inflammatory cell pathways was seen in the IM cluster, as opposed to mesenchymal and reactive fibroblast pathways, which were enriched in the M and MSL clusters. Pathways containing therapeutic targets designed to overcome resistance to ICIs were enriched in the MSL gene expression centroid. The 27-gene IO algorithm threshold applied to the TCGA samples classified 79% as responders in the IM cluster as opposed to 16% in the M and MSL clusters. Discussion These results support the hypothesis that gene expression signatures discerning TIME physiology associated with ICI response are tissue agnostic and relevant in multiple solid tissue types. The dramatic enrichment of responders into the IM cluster using previously established thresholds is consistent with appropriate biologic classification of the cases and supports utilizing the 27-gene IO algorithm and established threshold for association with ICI response in treated mUC cohorts. Citation Format: Robert S. Seitz, Tyler J. Nielsen, Brock L. Schweitzer, David R. Hout, Douglas T. Ross. Pathway modeling to translate the 27-gene immuno-oncology algorithm into bladder cancer [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 175.
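The clustering step in this abstract (k-means with k=3, seeded from three expression centroids, followed by counting the fraction of cases above a fixed threshold in each cluster) might be sketched as follows. This is a plain-Python illustration of Lloyd's algorithm on invented two-dimensional points, not the authors' pipeline; the function names, the toy data, the 0.5 threshold, and the convention that a score at or above the threshold means "responder" are all assumptions.

```python
import math


def assign(points, centroids):
    """Index of the nearest centroid (Euclidean distance) for each point."""
    return [min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points]


def kmeans(points, init_centroids, n_iter=50):
    """Lloyd's algorithm started from caller-supplied centroids."""
    centroids = [list(c) for c in init_centroids]
    labels = assign(points, centroids)
    for _ in range(n_iter):
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
        new_labels = assign(points, centroids)
        if new_labels == labels:  # converged: assignments no longer change
            break
        labels = new_labels
    return labels, centroids


def responder_fraction(scores, labels, cluster, threshold):
    """Fraction of cases in `cluster` whose score is at or above `threshold`."""
    in_cluster = [s for s, lab in zip(scores, labels) if lab == cluster]
    return sum(s >= threshold for s in in_cluster) / len(in_cluster)


# Toy data (invented): nine cases in a 2-D expression space, three seed centroids.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
seeds = [(0, 0), (10, 10), (20, 0)]
labels, centroids = kmeans(points, seeds)

# Hypothetical per-case algorithm scores and threshold.
scores = [0.1, 0.2, 0.9, 0.8, 0.7, 0.9, 0.1, 0.3, 0.2]
frac_cluster_1 = responder_fraction(scores, labels, cluster=1, threshold=0.5)
frac_cluster_0 = responder_fraction(scores, labels, cluster=0, threshold=0.5)
```

Seeding from known centroids rather than random starts makes the cluster indices interpretable in advance, which is what allows a statement such as "79% in the IM cluster" to be computed per named cluster.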
{"title":"Abstract 175: Pathway modeling to translate the 27-gene immuno-oncology algorithm into bladder cancer","authors":"R. Seitz, T. Nielsen, B. Schweitzer, D. Hout, D. Ross","doi":"10.1158/1538-7445.AM2021-175","DOIUrl":"https://doi.org/10.1158/1538-7445.AM2021-175","url":null,"PeriodicalId":73617,"journal":{"name":"Journal of bioinformatics and systems biology : Open access","volume":"63 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76843667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-07-01DOI: 10.1158/1538-7445.AM2021-265
B. Mao, Sheng Guo, D. Ouyang, H. Li
{"title":"Abstract 265: Evaluating variation in drug efficacy endpoints in a syngeneic mouse model (CT26.WT) under immune checkpoint blockade","authors":"B. Mao, Sheng Guo, D. Ouyang, H. Li","doi":"10.1158/1538-7445.AM2021-265","DOIUrl":"https://doi.org/10.1158/1538-7445.AM2021-265","url":null,"abstract":"","PeriodicalId":73617,"journal":{"name":"Journal of bioinformatics and systems biology : Open access","volume":"123 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76199337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-07-01DOI: 10.1158/1538-7445.AM2021-239
G. Vasiukov, Tatiana Novitskaya, M. Senosain, A. Menshikh, A. Zijlstra, S. Novitskiy, P. Massion
The tumor microenvironment (TME) is an integrated system that affects cancer cell behavior and contributes directly to disease outcome. A systemic approach to the analysis of the TME should uncover its complexity and facilitate discovery of the mechanisms orchestrating tumor development and metastasis. Multiplex fluorescence tissue staining followed by spatial analysis of tumor tissue architecture can provide insights into pivotal interactions between the cellular and acellular components of the TME. Extracellular matrix (ECM) is represented mainly by collagen deposition. A number of reports indicate that the contribution of ECM to the state of the TME depends not only on the amount of accumulated collagen but also on the geometrical features and spatial orientation of its fibers. These characteristics of collagen fibers contribute directly to the physical and mechanical properties of tissue and can change tumor growth and metastasis. Current methods of computational tissue image analysis assess cellular and acellular components separately. The goal of the current work was to develop a new computational tool that performs integrated analysis of the fibrous and cellular components of tumor tissue in a spatially dependent manner to achieve detailed tumor tissue mapping and structural pattern recognition. To pursue this goal, we generated images of human lung adenocarcinoma tissue characterized by indolent and aggressive behavior. We performed multiplex immunofluorescence staining for the following markers: CD3 (marker of T-lymphocytes), PanCytokeratin (marker of epithelial/tumor cells), collagen hybridizing peptide (3Helix; marker of collagen), and DAPI (nuclear counterstain). To develop the image analysis pipeline, we used KNIME, an open-source analytical platform with a graphical interface, in which we built a modular workflow. For ECM analysis, we integrated code written in Python into a KNIME node.
Segmentation of collagen fibers was performed using skeletonization with subsequent calculation of the geometrical properties (length, alignment, width) and orientation of each fiber. Data collected from single-cell analysis and ECM architecture assessment were combined and forwarded to downstream spatial analysis, where cell-to-cell and cell-to-ECM distances were computed and neighborhood analysis was performed. We demonstrated that tumor cells in aggressive adenocarcinoma samples were co-localized with a smaller number of collagen fibers. In addition, those fibers were shorter than in the indolent group. Correlation analysis revealed a positive correlation between the length of collagen fibers and the number of tumor cells in the indolent group, but we did not observe this phenomenon in the aggressive group. The developed computational method provides additional dimensionality to tissue image analysis and can reveal underrecognized structural patterns of the tumor microenvironment. Citation Format: Georgii Vasiukov, Tatiana Novitskaya, Maria-Fernanda Senosain, Anna Menshikh, Andries Zijlstra, Sergey Novitskiy, Pierre Massion. Integrated computational image analysis of cellular and acellular tissue components as a method for detailed tumor tissue mapping and structural patterns recognition
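Once a fiber has been reduced to a skeleton, per-fiber geometry and cell-to-ECM distances of the kind this abstract describes can be computed from the ordered skeleton coordinates. The Python sketch below assumes each fiber is already available as an ordered list of (x, y) points; it does not perform skeletonization itself. The function names are hypothetical, and "straightness" (end-to-end distance over path length) is one possible stand-in for a per-fiber shape measure, not necessarily the alignment metric the authors used.

```python
import math


def fiber_length(path):
    """Sum of Euclidean distances between consecutive skeleton points."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def fiber_orientation_deg(path):
    """End-to-end angle folded into [0, 180), since a fiber has no direction."""
    (x0, y0), (x1, y1) = path[0], path[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0


def fiber_straightness(path):
    """End-to-end distance divided by path length: 1.0 for a straight fiber."""
    length = fiber_length(path)
    return math.dist(path[0], path[-1]) / length if length else 0.0


def nearest_fiber_distance(cell, fibers):
    """Distance from a cell centroid to the closest skeleton point of any fiber."""
    return min(math.dist(cell, p) for path in fibers for p in path)


# Toy data (invented): one bent fiber, one vertical fiber, one cell centroid.
bent = [(0, 0), (3, 0), (3, 4)]
vertical = [(10, 5), (10, 0)]
cell = (0, 0)
distance_to_ecm = nearest_fiber_distance(cell, [[(3, 4), (6, 8)], vertical])
```

Folding the angle into [0, 180) means a fiber traced top-to-bottom and the same fiber traced bottom-to-top report the same orientation, which matters when orientations are pooled across fibers.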
{"title":"Abstract 239: Integrated computational image analysis of cellular and acellular tissue components as a method for detailed tumor tissue mapping and structural patterns recognition","authors":"G. Vasiukov, Tatiana Novitskaya, M. Senosain, A. Menshikh, A. Zijlstra, S. Novitskiy, P. Massion","doi":"10.1158/1538-7445.AM2021-239","DOIUrl":"https://doi.org/10.1158/1538-7445.AM2021-239","url":null,"PeriodicalId":73617,"journal":{"name":"Journal of bioinformatics and systems biology : Open access","volume":"73 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76237555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-07-01DOI: 10.1158/1538-7445.AM2021-197
Ali Foroughi pour, Jonghanne Park, Jeffrey H. Chuang
Deep learning has become a popular tool for analyzing hematoxylin and eosin (H&E) [...] 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 197.
{"title":"Abstract 197: MONE: A construction for interpreting deep learning features in pathology slides","authors":"Ali Foroughi pour, Jonghanne Park, Jeffrey H. Chuang","doi":"10.1158/1538-7445.AM2021-197","DOIUrl":"https://doi.org/10.1158/1538-7445.AM2021-197","url":null,"PeriodicalId":73617,"journal":{"name":"Journal of bioinformatics and systems biology : Open access","volume":"82 2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83537626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-07-01DOI: 10.1158/1538-7445.AM2021-162
S. Winter, S. Halliday, Konrad H. Stopsack, S. Osman, A. Hounsell, G. Prue, S. Jain, E. Allott
{"title":"Abstract 162: Cholesterol metabolism gene expression and prostate cancer-specific outcomes in radiotherapy-treated patients","authors":"S. Winter, S. Halliday, Konrad H. Stopsack, S. Osman, A. Hounsell, G. Prue, S. Jain, E. Allott","doi":"10.1158/1538-7445.AM2021-162","DOIUrl":"https://doi.org/10.1158/1538-7445.AM2021-162","url":null,"abstract":"","PeriodicalId":73617,"journal":{"name":"Journal of bioinformatics and systems biology : Open access","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90660757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-07-01DOI: 10.1158/1538-7445.AM2021-160
Mariam M. Konaté, Ming-Chung Li, L. McShane, Yingdong Zhao
Background: Large-scale multi-omics data characterizing human tumors are increasingly available and can be leveraged to develop a deeper understanding of biological processes and predict clinical outcomes. Reverse-phase protein array (RPPA) is a high-throughput, antibody-based method that provides a more direct assessment of cellular activity compared to DNA and RNA sequencing, which generate data that do not always correlate with protein expression. Multiple studies have demonstrated the prognostic value of RPPA data. Some of these studies have used pathway-driven approaches, relying on prior knowledge from the literature to group proteins into biological pathways, to develop prognostic signatures or predictors of treatment response. Methods: We obtained normalized RPPA data for up to 258 total, cleaved, acetylated, or phosphorylated protein species from The Cancer Proteome Atlas (TCPA). Starting from a published RPPA-based seven-protein signature of receptor tyrosine kinase (RTK) pathway activity in the form of an unweighted sum of the seven protein measurements, shown to have prognostic value in a 445-patient renal clear cell carcinoma cohort (TCGA-KIRC), we demonstrated that strong stratification of patients into high and low risk groups can be achieved by using a statistical approach—LASSO regression—with no a priori biological knowledge, to select from the 233 proteins and optimally combine their RPPA measurements into a weighted risk score. Method performance was assessed using two unbiased approaches: 1) 10 iterations of 3-fold cross-validation for unbiased estimation of hazard ratio and difference in 5-year survival (by Kaplan-Meier method) between predictor-defined high and low risk groups; and 2) a permutation test to evaluate the statistical significance of the cross-validated log-rank statistic. 
Results: For the first evaluation approach, the median hazard ratio between high and low risk groups across the held-out folds in the cross-validation based on the 7-protein RTK score was 2.4, compared to 3.3 when using the risk score derived by LASSO applied to the training data folds. Furthermore, the median difference in overall survival probability at 5 years based on the LASSO-derived risk score was 32.8%, compared to 25.2% when using the 7-protein RTK score. The permutation test p values were 5.0e-4 for both the RTK pathway-driven and the LASSO data-driven approaches. Finally, we demonstrated the applicability and performance of our approach for overall survival prediction in additional TCGA cohorts; namely, ovarian serous cystadenocarcinoma (TCGA-OVCA), sarcoma (TCGA-SARC), and cutaneous melanoma (TCGA-SKCM). Conclusions: The data-driven nature of our LASSO-based approach makes it versatile and particularly well-suited for the discovery of unexplored protein/disease associations that could aid in therapeutic discovery. Citation Format: Mariam M. Konate, Ming-Chung Li, Lisa McShane, Yingdong Zhao. LASSO-based protein signatures for survival prediction in human cancer cohorts [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 160.
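The second evaluation approach in this abstract, a permutation test on a log-rank statistic comparing high and low risk groups, can be illustrated with a small Python sketch. It is deliberately simplified: it permutes the risk-group labels directly, whereas the abstract's test is applied to the cross-validated log-rank statistic, which would require repeating the whole cross-validation procedure for each permutation. The function names, the number of permutations, the seed, and the toy survival data are all assumptions.

```python
import random


def logrank_chi2(times, events, groups):
    """Two-group log-rank chi-square statistic (groups coded 0/1, events coded 0/1)."""
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        n = len(at_risk)          # subjects still at risk at time t
        n1 = sum(at_risk)         # of those, how many are in group 1
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def permutation_p(times, events, groups, n_perm=200, seed=0):
    """Share of label permutations whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = logrank_chi2(times, events, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if logrank_chi2(times, events, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data (invented): group 1 = high risk, all six deaths observed.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
groups = [1, 1, 1, 0, 0, 0]
chi2 = logrank_chi2(times, events, groups)
p_value = permutation_p(times, events, groups)
```

With the add-one correction the smallest reportable p value is 1 / (n_perm + 1), so the number of permutations sets the resolution of the test.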
Pub Date : 2021-07-01 DOI: 10.1158/1538-7445.AM2021-208
J. Saliba, Lana M. Sheta, Kilannin Krysiak, Arpad M. Danos, Alex R Marr, Erica K. Barnell, Shahil P. Pema, Wan-Hsin Lin, P. Terraf, Joshua F. McMichael, C. Grisdale, Shruti Rao, S. Kiwala, Adam C. Coffman, A. Wagner, O. Griffith, M. Griffith
The Clinical Interpretation of Variants in Cancer (CIViC) knowledgebase (civicdb.org) is an open access, centralized hub for structured, community-curated, and expertly moderated relationships between genomic variants and cancer. Evidence is curated from peer-reviewed, published literature and is classified into one of five Types: Predisposing, Diagnostic, Prognostic, Predictive (therapeutic), or Functional. The robustness of the Evidence is conveyed through the assignment of Levels: the first three are derived from patient studies (Validated, Clinical, Case Study), Preclinical is generated from in vivo or in vitro data, and Inferential describes indirect associations. Each Evidence Item requires an Evidence Statement written in the curator's own words summarizing the source's results regarding the variant's clinical impact. Collaborations with groups like ClinGen have generated a significant influx of new curators, increasing the demand for detailed principles regarding data prioritization in the Evidence Statement in order to streamline the curation process. The curation community would benefit from simpler, visual guides through the complex decisions needed to appropriately and consistently curate Evidence Items. To aid CIViC curators, we are devoting significant effort to the continued development of straightforward Evidence curation algorithms (decision trees) similar to those used in clinical molecular testing labs. Previously published guidelines on development of these statements are the basis of our Evidence algorithms. Obvious inflection points for curators are clearly identified, with specific details noted for each to optimize decision efficiency. As the predominant Evidence Type, comprising 57% of all CIViC submissions, 58% of referenced patient trials, and 92% of Preclinical submissions, Predictive Evidence is the initial focus of our pilot guidelines, with Diagnostic and Prognostic to follow. 
Within the Predictive Evidence Type, clinical trials, case studies, and preclinical Levels each require vastly different Evidence Statement details and ultimately the creation of three separate, uniquely modeled algorithms. The implementation of these algorithms will assist in streamlining both curation and the expert review process. Notably, a template is not being created, as the preservation of curator style and voice is important to maintain the community feel of the database. To ensure the highest level of clarity, our team is utilizing specific novice and experienced curators to assist with the development process. As these algorithms pass the pilot phase, they are being tested as curator training tools. Ultimately, these guidelines will be used to encourage independence in curators and to enhance the Evidence already contained in CIViC. Citation Format: Jason Saliba, Lana Sheta, Kilannin Krysiak, Arpad Danos, Alex Marr, Erica Barnell, Shahil Pema, Wan-Hsin Lin, Panieh Terraf, Joshua F. McMichael, Cameron J. Grisdale, Shruti Rao, Susanna Kiwala, Adam Coffman, Alex Wagner, Obi L. Griffith, Malachi Griffith. Development of Evidence Statement curation algorithms to aid cancer variant interpretation [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 208.
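The routing described in Abstract 208, where Predictive Evidence is sent to one of three separate curation algorithms depending on its Level, might be modeled along these lines. This is a hypothetical sketch, not CIViC's data model or API: the class names, the `select_algorithm` function, and the algorithm labels are all invented for illustration, and mapping the Clinical Level onto the "clinical trials" algorithm is an assumption. The abstract does not say how Validated or Inferential items are handled, so they return `None` here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EvidenceType(Enum):
    PREDISPOSING = "Predisposing"
    DIAGNOSTIC = "Diagnostic"
    PROGNOSTIC = "Prognostic"
    PREDICTIVE = "Predictive"
    FUNCTIONAL = "Functional"


class EvidenceLevel(Enum):
    VALIDATED = "Validated"
    CLINICAL = "Clinical"
    CASE_STUDY = "Case Study"
    PRECLINICAL = "Preclinical"
    INFERENTIAL = "Inferential"


@dataclass(frozen=True)
class EvidenceItem:
    evidence_type: EvidenceType
    level: EvidenceLevel
    statement: str  # free text in the curator's own words, not a template


# Hypothetical labels for the three Predictive algorithms named in the abstract.
_PREDICTIVE_ALGORITHMS = {
    EvidenceLevel.CLINICAL: "clinical-trial algorithm",
    EvidenceLevel.CASE_STUDY: "case-study algorithm",
    EvidenceLevel.PRECLINICAL: "preclinical algorithm",
}


def select_algorithm(item: EvidenceItem) -> Optional[str]:
    """Return the decision tree to follow, or None if the pilot does not cover it."""
    if item.evidence_type is not EvidenceType.PREDICTIVE:
        return None  # Diagnostic and Prognostic are planned to follow later
    return _PREDICTIVE_ALGORITHMS.get(item.level)
```

Keeping `statement` as free text reflects the abstract's point that the algorithms guide what to include without imposing a template on curator style.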
Pub Date : 2021-07-01 DOI: 10.1158/1538-7445.AM2021-199
Stephanie Zhang, Minsoo Kang
Protein-protein interactions (PPIs) form the backbone of signal transduction pathways in diverse physiological processes, mediating the transmission and regulation of oncogenic signals essential to cellular proliferation and survival, and thus represent a potential new class of drug targets for anticancer therapeutic discovery. However, targeting PPIs faces several challenges, including large PPI interface areas, a lack of deep pockets, the presence of noncontiguous binding sites, and a general lack of natural ligands. Hot spots, small subsets of amino acid residues that contribute significantly to the binding free energy and play essential roles in the stability of protein binding, make PPIs amenable to small-molecule perturbation. Effectively identifying which specific interface residues of protein-protein complexes form the hot spots is critical for understanding the principles of protein interactions and has broad application prospects in protein design and drug development. This project presents Blossom AI, a novel, user-friendly mobile app developed with Xcode and Core ML that uses a random forest (RF) of decision trees to computationally predict the presence of hot spots on protein complexes within seconds, aiding the design of small-molecule and peptide drugs that target protein-protein interactions, particularly for anticancer therapy. Leveraging features such as solvent accessible surface area (ASA), the blocks substitution matrix (BLOSUM), physicochemical properties (hydrophobicity, polarity, polarizability, propensities), the position specific scoring matrix (PSSM), and solvent exposure, the RF was trained on a dataset of 313 mutated interface residues (133 hot spot residues and 180 non-hot spot residues) from over 60 protein complexes and achieved a training accuracy of 88.75%, validation accuracy of 92.86%, specificity of 87.18%, sensitivity of 75.38%, PPV of 94.23%, and NPV of 86.61%. 
Blossom is high-speed, low-cost, and user-friendly, with significantly improved accuracy over the standard of alanine scanning mutagenesis. Citation Format: Stephanie Zhang, Minsoo Kang. Blossom AI: A novel drug discovery app for the prediction of hotspots on multiplex protein protein interaction complexes using random forest algorithms [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 199.
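For readers comparing the figures reported in Abstract 199, the sketch below shows how accuracy, sensitivity, specificity, PPV, and NPV are conventionally derived from a binary confusion matrix, with hot spot residues as the positive class. The function name and the counts in the usage note are hypothetical; the abstract does not give the underlying confusion matrix, so its percentages are not reproduced here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp: hot spots predicted as hot spots      fp: non-hot spots predicted as hot spots
    tn: non-hot spots predicted as such       fn: hot spots predicted as non-hot spots
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # share of true hot spots recovered
        "specificity": tn / (tn + fp),  # share of non-hot spots correctly rejected
        "ppv": tp / (tp + fp),          # share of predicted hot spots that are real
        "npv": tn / (tn + fn),          # share of predicted non-hot spots that are real
    }
```

With made-up counts of `tp=40, fp=5, tn=45, fn=10`, the function gives an accuracy of 0.85, a sensitivity of 0.8, and a specificity of 0.9. A high PPV alongside a lower sensitivity, as in the abstract, would mean that predicted hot spots are usually correct while some true hot spots are missed.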