In Silico Pipeline to Identify Tumor-Specific Antigens for Cancer Immunotherapy Using Exome Sequencing Data
Diego Morazán-Fernández, Javier Mora, Jose Arturo Molina-Mora
Phenomics (Cham, Switzerland), 3(2), 130-137. Published 2022-12-08. DOI: 10.1007/s43657-022-00084-9
Abstract
Tumor-specific antigens, or neoantigens, are peptides that are expressed only in cancer cells and not in healthy cells. Some of these molecules can induce an immune response, and their use in immunotherapeutic strategies based on cancer vaccines has therefore been extensively explored. Studies based on these approaches have been driven by current high-throughput DNA sequencing technologies. However, there is no universal or straightforward bioinformatic protocol for discovering neoantigens from DNA sequencing data. We therefore propose a bioinformatic protocol to detect tumor-specific antigens associated with single nucleotide variants (SNVs), or "mutations", in tumoral tissues. For this purpose, we used publicly available data to build our model, including exome sequencing data from colorectal cancer and healthy cells obtained from a single case, as well as frequent human leukocyte antigen (HLA) class I alleles in a specific population. HLA data from the Costa Rican Central Valley population were selected as an example. The strategy included three main steps: (1) pre-processing of sequencing data; (2) variant calling analysis to detect tumor-specific SNVs in comparison with healthy tissue; and (3) prediction and characterization of peptides (protein fragments, the tumor-specific antigens) derived from the variants, in the context of their affinity for frequent alleles of the selected population. In our model data, we found 28 non-silent SNVs in 17 genes on chromosome 1. The protocol yielded 23 strong-binder peptides derived from these SNVs for HLA class I alleles that are frequent in the Costa Rican population. Although the analyses were performed as an example to implement the pipeline, to our knowledge, this is the first study of an in silico cancer vaccine using DNA sequencing data in the context of the HLA alleles. 
It is concluded that the standardized protocol was not only able to identify neoantigens in a specific but also provides a complete pipeline for the eventual design of cancer vaccines using the best bioinformatic practices.
Supplementary information: The online version contains supplementary material available at 10.1007/s43657-022-00084-9.
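
Steps (2) and (3) of the strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Variant` record, the function names, the toy sequence, and the 8 to 11 residue window lengths (a range commonly used for HLA class I peptides) are all assumptions made for illustration, and the binding-affinity prediction itself, which would be done by a dedicated predictor, is not shown.

```python
from __future__ import annotations

from typing import Iterator, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal SNV record: chromosome, position, ref and alt base."""
    chrom: str
    pos: int
    ref: str
    alt: str


def tumor_specific(tumor: set[Variant], normal: set[Variant]) -> set[Variant]:
    """Keep variants called in the tumor sample but absent from the matched normal."""
    return tumor - normal


def mutant_peptides(
    protein: str,
    index: int,
    new_residue: str,
    lengths: tuple[int, ...] = (8, 9, 10, 11),  # assumed class I peptide lengths
) -> Iterator[str]:
    """Yield every window of the mutated protein that covers the substituted residue.

    `index` is the 0-based position of the amino acid changed by a non-silent SNV.
    """
    mutated = protein[:index] + new_residue + protein[index + 1:]
    for k in lengths:
        first = max(0, index - k + 1)          # leftmost window still covering index
        last = min(index, len(mutated) - k)    # rightmost window that fits the protein
        for start in range(first, last + 1):
            yield mutated[start:start + k]


if __name__ == "__main__":
    tumor = {Variant("1", 100, "A", "G"), Variant("1", 200, "C", "T")}
    normal = {Variant("1", 200, "C", "T")}
    print(tumor_specific(tumor, normal))

    # Toy protein (not real data): substitute residue 10 and list 9-mers around it.
    for peptide in mutant_peptides("ACDEFGHIKLMNPQRSTVWY", 10, "W", lengths=(9,)):
        print(peptide)
```

Each candidate peptide produced this way would then be scored against the frequent HLA class I alleles of the chosen population, keeping only those predicted to bind strongly.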