Quantum logic gate synthesis as a Markov decision process
M. Sohaib Alam, Noah F. Berthusen, Peter P. Orth
npj Quantum Information, published 2023-10-25. DOI: 10.1038/s41534-023-00766-w (https://doi.org/10.1038/s41534-023-00766-w)
Reinforcement learning has witnessed recent applications to a variety of tasks in quantum programming. The underlying assumption is that those tasks could be modeled as Markov decision processes (MDPs). Here, we investigate the feasibility of this assumption by exploring its consequences for single-qubit quantum state preparation and gate compilation. By forming discrete MDPs, we solve for the optimal policy exactly through policy iteration. We find optimal paths that correspond to the shortest possible sequence of gates to prepare a state or compile a gate, up to some target accuracy. Our method works in both the absence and presence of noise and compares favorably to other quantum compilation methods, such as the Ross–Selinger algorithm. This work provides theoretical insight into why reinforcement learning may be successfully used to find optimally short gate sequences in quantum programming.
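As a rough illustration of the idea in the abstract, the sketch below runs policy iteration on a toy discrete MDP for single-qubit state preparation. It is not the authors' construction. The state space (the six single-qubit stabilizer states), the gate set (H and S), the reward of 1 for reaching the target, and the discount factor `gamma=0.9` are all assumptions chosen to keep the example small and exactly closed. The names `step`, `policy_iteration`, and `shortest_sequence` are hypothetical. Because a state that is k gates from the target has value gamma^(k-1), maximizing the value is the same as minimizing the gate count in this toy setting.

```python
import numpy as np

R = 1 / np.sqrt(2)
# Assumed toy state space: the six single-qubit stabilizer states,
# which H and S permute among themselves up to a global phase.
STATES = {
    "0": np.array([1, 0], dtype=complex),
    "1": np.array([0, 1], dtype=complex),
    "+": np.array([R, R], dtype=complex),
    "-": np.array([R, -R], dtype=complex),
    "+i": np.array([R, 1j * R], dtype=complex),
    "-i": np.array([R, -1j * R], dtype=complex),
}
# Assumed action set: Hadamard and phase gate.
GATES = {
    "H": np.array([[R, R], [R, -R]], dtype=complex),
    "S": np.array([[1, 0], [0, 1j]], dtype=complex),
}
NAMES = list(STATES)
ACTIONS = list(GATES)
IDX = {name: i for i, name in enumerate(NAMES)}


def step(state_name, gate_name):
    """Label of the state reached by applying the gate (global phase ignored)."""
    out = GATES[gate_name] @ STATES[state_name]
    for name, vec in STATES.items():
        if abs(np.vdot(vec, out)) > 1 - 1e-9:
            return name
    raise ValueError("state left the discrete set")


# Deterministic transition table: NXT[s, a] is the index of the next state.
NXT = np.array([[IDX[step(s, a)] for a in ACTIONS] for s in NAMES])


def policy_iteration(target, gamma=0.9):
    """Return an optimal action index per state for reaching `target`."""
    n, t = len(NAMES), IDX[target]
    policy = np.zeros(n, dtype=int)  # start with "always apply H"
    while True:
        # Policy evaluation: solve (I - gamma * P) v = r exactly.
        P, r = np.zeros((n, n)), np.zeros(n)
        for s in range(n):
            if s == t:
                continue  # the target is terminal, so its value stays 0
            s2 = NXT[s, policy[s]]
            P[s, s2] = 1.0
            r[s] = 1.0 if s2 == t else 0.0
        v = np.linalg.solve(np.eye(n) - gamma * P, r)
        # Policy improvement: switch only where an action is strictly better.
        q = (NXT == t).astype(float) + gamma * v[NXT]
        improved = q.max(axis=1) > q[np.arange(n), policy] + 1e-12
        improved[t] = False
        if not improved.any():
            return policy
        policy = np.where(improved, q.argmax(axis=1), policy)


def shortest_sequence(start, target, gamma=0.9):
    """Follow the optimal policy from `start` and list the gates applied."""
    policy = policy_iteration(target, gamma)
    seq, state = [], IDX[start]
    while state != IDX[target]:
        if len(seq) > len(NAMES):
            raise RuntimeError("target not reachable")
        a = policy[state]
        seq.append(ACTIONS[a])
        state = NXT[state, a]
    return seq
```

For example, `shortest_sequence("0", "+i")` yields the gate list H then S. In this toy model every gate maps a listed state exactly onto another listed state, so the transitions are deterministic. A discretization of the full Bloch sphere, as the abstract describes, would generally need a more careful treatment of transitions and target accuracy than this sketch provides.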
About the journal:
The scope of npj Quantum Information spans across all relevant disciplines, fields, approaches and levels and so considers outstanding work ranging from fundamental research to applications and technologies.