GANzzle++: Generative approaches for jigsaw puzzle solving as local to global assignment in latent spatial representations
Authors: Davide Talon, Alessio Del Bue, Stuart James
Journal: Pattern Recognition Letters, volume 187, pages 35-41 (Journal Article)
Published: 2024-11-19
DOI: 10.1016/j.patrec.2024.11.010
URL: https://www.sciencedirect.com/science/article/pii/S0167865524003179
Impact factor: 3.9 (JCR Q2, Computer Science, Artificial Intelligence)
Jigsaw puzzles are a popular pastime that humans solve with ease, even when there are many pieces. Computationally, however, solving a jigsaw is a combinatorial problem: the space of possible solutions is exponential in the number of pieces, which makes pairwise approaches intractable. In contrast to the classical local matching of piece pairs based on edge heuristics, we estimate an approximate solution image, i.e., a mental image, of the puzzle and use it to guide the placement of pieces, treating the task as a piece-to-global assignment problem. To recover the solution image from unordered pieces, we consider conditioned generation approaches, including Generative Adversarial Network (GAN) models, Slot Attention (SA) and Vision Transformers (ViT). Given the generated solution representation, we cast jigsaw solving as a 1-to-1 assignment matching problem using Hungarian attention, which places each piece at its corresponding position in the global solution estimate. Results show that the newly proposed GANzzle-SA and GANzzle-VIT benefit from an early fusion strategy in which pieces are jointly compressed and gathered for global structure recovery. A single deep learning model generalizes to puzzles of different sizes and improves performance by a large margin. Evaluated on PuzzleCelebA and PuzzleWikiArts, our approaches bridge the gap between deep learning strategies and classic optimization-based puzzle solvers.
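The 1-to-1 piece-to-position assignment described in the abstract can be illustrated with a small sketch. This is not the paper's Hungarian attention module or its learned latent features: it is a plain Hungarian-algorithm solver applied to toy data. The 2-D feature vectors, the squared-distance cost, and every name below (`SLOTS`, `PIECES`, `build_cost`, `hungarian`) are assumptions made for illustration only.

```python
# Minimal sketch: assign shuffled pieces to positions of an estimated solution.
# All data and names here are hypothetical; the paper's features are learned.

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_cost(pieces, slots):
    """cost[i][j] = mismatch between piece i and slot j of the solution estimate."""
    return [[squared_distance(p, s) for s in slots] for p in pieces]


def hungarian(cost):
    """Return assignment[i] = column chosen for row i, minimizing total cost.

    Standard O(n^3) potentials-based Hungarian algorithm; requires
    len(cost) <= len(cost[0]).
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (m + 1)   # column potentials
    p = [0] * (m + 1)     # p[j] = row currently matched to column j (1-based)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path that was found.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    return assignment


# Toy "solution estimate": one feature vector per position of a 2x2 puzzle.
SLOTS = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Shuffled pieces whose features are noisy copies of the slot features.
PIECES = [[0.9, 1.1], [0.1, -0.1], [0.05, 0.95], [1.1, 0.1]]

if __name__ == "__main__":
    assignment = hungarian(build_cost(PIECES, SLOTS))
    for piece_idx, slot_idx in enumerate(assignment):
        print(f"piece {piece_idx} -> position {slot_idx}")
```

Because every piece is matched against the global estimate rather than against other pieces, the cost matrix has only one entry per piece-position pair, and the solver runs in polynomial time. In practice a library routine such as SciPy's `scipy.optimize.linear_sum_assignment` could replace the hand-written solver.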
About the journal:
Pattern Recognition Letters aims at rapid publication of concise articles of broad interest in pattern recognition.
Subject areas include all the current fields of interest represented by the Technical Committees of the International Association for Pattern Recognition, and other developing themes involving learning and recognition.