Implementation of a Hamming distance-like genomic quantum classifier using inner products on ibmqx2 and ibmq_16_melbourne.

Quantum Machine Intelligence (IF 4.4, JCR Q2, Computer Science, Artificial Intelligence). Pub Date: 2020-01-01. Epub Date: 2020-07-17. DOI: 10.1007/s42484-020-00017-7

Kunal Kathuria, Aakrosh Ratan, Michael McConnell, Stefan Bekiranov

Quantum Machine Intelligence, volume 2, issue 1, pages 1-26.

Citations: 13

Abstract

Motivated by the problem of classifying individuals with a disease versus controls using a functional genomic attribute as input, we present relatively efficient general purpose inner product-based kernel classifiers to classify the test sample as normal or disease. We encode each training sample as a string of 1s (presence) and 0s (absence) representing the attribute's existence across ordered physical blocks of the subdivided genome. Having binary-valued features allows for highly efficient data encoding in the computational basis for classifiers relying on binary operations. Given that a natural distance between binary strings is Hamming distance, which shares properties with bit-string inner products, our two classifiers apply different inner product measures for classification. The active inner product (AIP) is a direct dot product-based classifier whereas the symmetric inner product (SIP) classifies upon scoring correspondingly matching genomic attributes. SIP is a strongly Hamming distance-based classifier generally applicable to binary attribute-matching problems whereas AIP has general applications as a simple dot product-based classifier. The classifiers implement an inner product between $N = 2^n$ dimensional test and train vectors using $n$ Fredkin gates while the training sets are respectively entangled with the class-label qubit, without use of an ancilla. Moreover, each training class can be composed of an arbitrary number $m$ of samples that can be classically summed into one input string to effectively execute all test-train inner products simultaneously. Thus, our circuits require the same number of qubits for any number of training samples and are $O(\log N)$ in gate complexity after the states are prepared. Our classifiers were implemented on ibmqx2 (IBM-Q-team 2019b) and ibmq_16_melbourne (IBM-Q-team 2019a). The latter allowed encoding of 64 training features across the genome.

Source journal

Quantum Machine Intelligence Multiple-

CiteScore

7.60

Self-citation rate

4.20%

Number of publications