{"title":"GCOC: A Genome Classifier-On-Chip based on Similarity Search Content Addressable Memory.","authors":"Yuval Harary, Paz Snapir, Shir Siman Tov, Chen Kruphman, Eyal Rechef, Zuher Jahshan, Esteban Garzon, Leonid Yavits","doi":"10.1109/TBCAS.2024.3449788","DOIUrl":null,"url":null,"abstract":"<p><p>GCOC is a genome classification system-on-chip (SoC) that classifies genomes by k-mer matching, an approach that divides a DNA query sequence into a set of short DNA fragments of size k, which are searched in a reference genome database, with the underlying assumption that sequenced DNA reads of the same organism (or its close variants) share most of such k-mers. At the core of GCOC is a similarity, or approximate search-capable Content Addressable Memory (SAS-CAM), which in addition to exact match, also supports approximate, or Hamming distance tolerant search. Classification operation is controlled by an embedded RISC-V processor. GCOC classification platform was designed and manufactured in a commercial 65nm process. We conduct a thorough analysis of GCOC classification efficiency as well as its performance, silicon area, and power consumption using silicon measurements. GCOC classifies 769.2K short DNA reads/sec. The silicon area of GCOC SoC is 3.12mm<sup>2</sup> and its power consumption is 1.27mW. We envision GCOC deployed as a field (for example at points of care) portable classifier where the classification is required to be real-time, easy to operate and energy efficient.</p>","PeriodicalId":94031,"journal":{"name":"IEEE transactions on biomedical circuits and systems","volume":"PP ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on biomedical circuits and systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TBCAS.2024.3449788","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
GCOC is a genome classification system-on-chip (SoC) that classifies genomes by k-mer matching, an approach that divides a DNA query sequence into a set of short DNA fragments of size k, which are searched in a reference genome database, with the underlying assumption that sequenced DNA reads of the same organism (or its close variants) share most of such k-mers. At the core of GCOC is a similarity, or approximate search-capable Content Addressable Memory (SAS-CAM), which in addition to exact match, also supports approximate, or Hamming distance tolerant search. Classification operation is controlled by an embedded RISC-V processor. GCOC classification platform was designed and manufactured in a commercial 65nm process. We conduct a thorough analysis of GCOC classification efficiency as well as its performance, silicon area, and power consumption using silicon measurements. GCOC classifies 769.2K short DNA reads/sec. The silicon area of GCOC SoC is 3.12mm2 and its power consumption is 1.27mW. We envision GCOC deployed as a field (for example at points of care) portable classifier where the classification is required to be real-time, easy to operate and energy efficient.