BoolSurf: Boolean Operations on Surfaces
Marzia Riso, Giacomo Nazzaro, E. Puppo, Alec Jacobson, Qingnan Zhou, F. Pellacini
We port Boolean set operations between 2D shapes to surfaces of any genus, with any number of open boundaries. We combine shapes bounded by sets of freely intersecting loops, consisting of geodesic lines and cubic Bézier splines lying on a surface. We compute the arrangement of shapes directly on the surface and assign integer labels to the cells of such arrangement. Differently from the Euclidean case, some arrangements on a manifold may be inconsistent. We detect inconsistent arrangements and help the user to resolve them. Also, we extend to the manifold setting recent work on Boundary-Sampled Halfspaces, thus supporting operations more general than standard Booleans, which are well defined on inconsistent arrangements, too. Our implementation discretizes the input shapes into polylines at an arbitrary resolution, independent of the level of resolution of the underlying mesh. We resolve the arrangement inside each triangle of the mesh independently and combine the results to reconstruct both the boundaries and the interior of each cell in the arrangement. We reconstruct the control points of curves bounding cells, in order to free the result from discretization and provide an output in vector format. We support interactive usage, editing shapes consisting of up to 100k line segments on meshes of up to 1M triangles.
{"title":"BoolSurf: Boolean Operations on Surfaces","authors":"Marzia Riso, Giacomo Nazzaro, E. Puppo, Alec Jacobson, Qingnan Zhou, F. Pellacini","doi":"10.1145/3550454.3555466","DOIUrl":"https://doi.org/10.1145/3550454.3555466","url":null,"abstract":"We port Boolean set operations between 2D shapes to surfaces of any genus, with any number of open boundaries. We combine shapes bounded by sets of freely intersecting loops, consisting of geodesic lines and cubic Bézier splines lying on a surface. We compute the arrangement of shapes directly on the surface and assign integer labels to the cells of such arrangement. Differently from the Euclidean case, some arrangements on a manifold may be inconsistent. We detect inconsistent arrangements and help the user to resolve them. Also, we extend to the manifold setting recent work on Boundary-Sampled Halfspaces, thus supporting operations more general than standard Booleans, which are well defined on inconsistent arrangements, too. Our implementation discretizes the input shapes into polylines at an arbitrary resolution, independent of the level of resolution of the underlying mesh. We resolve the arrangement inside each triangle of the mesh independently and combine the results to reconstruct both the boundaries and the interior of each cell in the arrangement. We reconstruct the control points of curves bounding cells, in order to free the result from discretization and provide an output in vector format. We support interactive usage, editing shapes consisting up to 100k line segments on meshes of up to 1M triangles.","PeriodicalId":7121,"journal":{"name":"ACM Trans. Graph.","volume":"7 1","pages":"1 - 13"},"PeriodicalIF":0.0,"publicationDate":"2022-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74455810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SkinMixer: Blending 3D Animated Models
S. Nuvoli, N. Pietroni, Paolo Cignoni, R. Scateni, M. Tarini
We propose a novel technique to compose new 3D animated models, such as videogame characters, by combining pieces from existing ones. Our method works on production-ready rigged, skinned, and animated 3D models to reassemble new ones. We exploit mix-and-match operations on the skeletons to trigger the automatic creation of a new mesh, linked to the new skeleton by a set of skinning weights and complete with a set of animations. The resulting model preserves the quality of the input meshings (which can be quad-dominant and semi-regular), skinning weights (inducing believable deformation), and animations, featuring coherent movements of the new skeleton. Our method enables content creators to reuse valuable, carefully designed assets by assembling new ready-to-use characters while preserving most of the hand-crafted subtleties of models authored by digital artists. As shown in the accompanying video, it allows for drastically cutting the time needed to obtain the final result.
{"title":"SkinMixer: Blending 3D Animated Models","authors":"S. Nuvoli, N. Pietroni, Paolo Cignoni, R. Scateni, M. Tarini","doi":"10.1145/3550454.3555503","DOIUrl":"https://doi.org/10.1145/3550454.3555503","url":null,"abstract":"We propose a novel technique to compose new 3D animated models, such as videogame characters, by combining pieces from existing ones. Our method works on production-ready rigged, skinned, and animated 3D models to reassemble new ones. We exploit mix-and-match operations on the skeletons to trigger the automatic creation of a new mesh, linked to the new skeleton by a set of skinning weights and complete with a set of animations. The resulting model preserves the quality of the input meshings (which can be quad-dominant and semi-regular), skinning weights (inducing believable deformation), and animations, featuring coherent movements of the new skeleton. Our method enables content creators to reuse valuable, carefully designed assets by assembling new ready-to-use characters while preserving most of the hand-crafted subtleties of models authored by digital artists. As shown in the accompanying video, it allows for drastically cutting the time needed to obtain the final result.","PeriodicalId":7121,"journal":{"name":"ACM Trans. Graph.","volume":"13 1","pages":"1 - 15"},"PeriodicalIF":0.0,"publicationDate":"2022-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74580144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
LuisaRender: A High-Performance Rendering Framework with Layered and Unified Interfaces on Stream Architectures
Shao-feng Zheng, Zhiqian Zhou, Xin Chen, Difei Yan, Chuyan Zhang, Yuefeng Geng, Yan Gu, Kun Xu
The advancements in hardware have drawn more attention than ever to high-quality offline rendering with modern stream processors, in both industry and research. However, graphics APIs are fragmented and existing shading languages lack high-level constructs such as polymorphism, which adds complexity to developing and maintaining cross-platform high-performance renderers. We present LuisaRender, a high-performance rendering framework for modern stream-architecture hardware. Our main contribution is an expressive C++-embedded DSL for kernel programming with JIT code generation and compilation. We also implement a unified runtime layer with resource wrappers and an optimized Monte Carlo renderer. Experiments on test scenes show that LuisaRender achieves much higher performance than existing research renderers on modern graphics hardware, e.g., 5--11× faster than PBRT-v4 and 4--16× faster than Mitsuba 3.
{"title":"LuisaRender: A High-Performance Rendering Framework with Layered and Unified Interfaces on Stream Architectures","authors":"Shao-feng Zheng, Zhiqian Zhou, Xin Chen, Difei Yan, Chuyan Zhang, Yuefeng Geng, Yan Gu, Kun Xu","doi":"10.1145/3550454.3555463","DOIUrl":"https://doi.org/10.1145/3550454.3555463","url":null,"abstract":"The advancements in hardware have drawn more attention than ever to high-quality offline rendering with modern stream processors, both in the industry and in research fields. However, the graphics APIs are fragmented and existing shading languages lack high-level constructs such as polymorphism, which adds complexity to developing and maintaining cross-platform high-performance renderers. We present LuisaRender1, a high-performance rendering framework for modern stream-architecture hardware. Our main contribution is an expressive C++-embedded DSL for kernel programming with JIT code generation and compilation. We also implement a unified runtime layer with resource wrappers and an optimized Monte Carlo renderer. Experiments on test scenes show that LuisaRender achieves much higher performance than existing research renderers on modern graphics hardware, e.g., 5--11× faster than PBRT-v4 and 4--16× faster than Mitsuba 3.","PeriodicalId":7121,"journal":{"name":"ACM Trans. Graph.","volume":"60 1","pages":"1 - 19"},"PeriodicalIF":0.0,"publicationDate":"2022-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74055386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
QuadStream: A Quad-Based Scene Streaming Architecture for Novel Viewpoint Reconstruction
Jozef Hladky, Michael Stengel, Nicholas Vining, B. Kerbl, H. Seidel, M. Steinberger
Streaming rendered 3D content over a network to a thin client device, such as a phone or a VR/AR headset, brings high-fidelity graphics to platforms where it would not normally be possible due to thermal, power, or cost constraints. Streamed 3D content must be transmitted with a representation that is robust to both latency and potential network dropouts. Transmitting a video stream and reprojecting to correct for changing viewpoints fails in the presence of disocclusion events; streaming scene geometry and performing high-quality rendering on the client is not possible on limited-power mobile GPUs. To balance the competing goals of disocclusion robustness and minimal client workload, we introduce QuadStream, a new streaming content representation that reduces motion-to-photon latency by allowing clients to efficiently render novel views without artifacts caused by disocclusion events. Motivated by traditional macroblock approaches to video codec design, we decompose the scene seen from positions in a view cell into a series of quad proxies, or view-aligned quads from multiple views. By operating on a rasterized G-Buffer, our approach is independent of the representation used for the scene itself; the resulting QuadStream is an approximate geometric representation of the scene that can be reconstructed by a thin client to render both the current view and nearby adjacent views. Our technical contributions are an efficient parallel quad generation, merging, and packing strategy for proxy views covering potential client movement in a scene; a packing and encoding strategy that allows masked quads with depth information to be transmitted as a frame-coherent stream; and an efficient rendering approach for rendering our QuadStream representation into entirely novel views on thin clients. We show that our approach achieves superior quality compared both to video data streaming methods, and to geometry-based streaming.
{"title":"QuadStream: A Quad-Based Scene Streaming Architecture for Novel Viewpoint Reconstruction","authors":"Jozef Hladky, Michael Stengel, Nicholas Vining, B. Kerbl, H. Seidel, M. Steinberger","doi":"10.1145/3550454.3555524","DOIUrl":"https://doi.org/10.1145/3550454.3555524","url":null,"abstract":"Streaming rendered 3D content over a network to a thin client device, such as a phone or a VR/AR headset, brings high-fidelity graphics to platforms where it would not normally possible due to thermal, power, or cost constraints. Streamed 3D content must be transmitted with a representation that is both robust to latency and potential network dropouts. Transmitting a video stream and reprojecting to correct for changing viewpoints fails in the presence of disocclusion events; streaming scene geometry and performing high-quality rendering on the client is not possible on limited-power mobile GPUs. To balance the competing goals of disocclusion robustness and minimal client workload, we introduce QuadStream, a new streaming content representation that reduces motion-to-photon latency by allowing clients to efficiently render novel views without artifacts caused by disocclusion events. Motivated by traditional macroblock approaches to video codec design, we decompose the scene seen from positions in a view cell into a series of quad proxies, or view-aligned quads from multiple views. By operating on a rasterized G-Buffer, our approach is independent of the representation used for the scene itself; the resulting QuadStream is an approximate geometric representation of the scene that can be reconstructed by a thin client to render both the current view and nearby adjacent views. Our technical contributions are an efficient parallel quad generation, merging, and packing strategy for proxy views covering potential client movement in a scene; a packing and encoding strategy that allows masked quads with depth information to be transmitted as a frame-coherent stream; and an efficient rendering approach for rendering our QuadStream representation into entirely novel views on thin clients. We show that our approach achieves superior quality compared both to video data streaming methods, and to geometry-based streaming.","PeriodicalId":7121,"journal":{"name":"ACM Trans. Graph.","volume":"81 1","pages":"1 - 13"},"PeriodicalIF":0.0,"publicationDate":"2022-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81634420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PopStage: The Generation of Stage Cross-Editing Video Based on Spatio-Temporal Matching
Dawon Lee, Jung Eun Yoo, Kyungmin Cho, Bumki Kim, Gyeonghun Im, Jun-yong Noh
StageMix is a mixed video that is created by concatenating the segments from various performance videos of an identical song in a visually smooth manner by matching the main subject's silhouette presented in the frame. We introduce PopStage, which allows users to generate a StageMix automatically. PopStage is designed based on the StageMix Editing Guideline that we established by interviewing creators as well as observing their workflows. PopStage consists of two main steps: finding an editing path and generating a transition effect at a transition point. Using a reward function that favors visual connection and the optimality of transition timing across the videos, we obtain the optimal path that maximizes the sum of rewards through dynamic programming. Given the optimal path, PopStage then aligns the silhouettes of the main subject from the transitioning video pair to enhance the visual connection at the transition point. The virtual camera view is next optimized to remove the black areas that are often created due to the transformation needed for silhouette alignment, while reducing pixel loss. In this process, we enforce the view to be the maximum size while maintaining the temporal continuity across the frames. Experimental results show that PopStage can generate a StageMix of a similar quality to those produced by professional creators in a highly reduced production time.
{"title":"PopStage: The Generation of Stage Cross-Editing Video Based on Spatio-Temporal Matching","authors":"Dawon Lee, Jung Eun Yoo, Kyungmin Cho, Bumki Kim, Gyeonghun Im, Jun-yong Noh","doi":"10.1145/3550454.3555467","DOIUrl":"https://doi.org/10.1145/3550454.3555467","url":null,"abstract":"StageMix is a mixed video that is created by concatenating the segments from various performance videos of an identical song in a visually smooth manner by matching the main subject's silhouette presented in the frame. We introduce PopStage, which allows users to generate a StageMix automatically. PopStage is designed based on the StageMix Editing Guideline that we established by interviewing creators as well as observing their workflows. PopStage consists of two main steps: finding an editing path and generating a transition effect at a transition point. Using a reward function that favors visual connection and the optimality of transition timing across the videos, we obtain the optimal path that maximizes the sum of rewards through dynamic programming. Given the optimal path, PopStage then aligns the silhouettes of the main subject from the transitioning video pair to enhance the visual connection at the transition point. The virtual camera view is next optimized to remove the black areas that are often created due to the transformation needed for silhouette alignment, while reducing pixel loss. In this process, we enforce the view to be the maximum size while maintaining the temporal continuity across the frames. Experimental results show that PopStage can generate a StageMix of a similar quality to those produced by professional creators in a highly reduced production time.","PeriodicalId":7121,"journal":{"name":"ACM Trans. Graph.","volume":"25 1","pages":"1 - 13"},"PeriodicalIF":0.0,"publicationDate":"2022-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75877979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
VToonify: Controllable High-Resolution Portrait Video Style Transfer
Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy
Generating high-quality artistic portrait videos is an important and desirable task in computer graphics and vision. Although a series of successful portrait image toonification models built upon the powerful StyleGAN have been proposed, these image-oriented methods have obvious limitations when applied to videos, such as the fixed frame size, the requirement of face alignment, missing non-facial details and temporal inconsistency. In this work, we investigate the challenging task of controllable high-resolution portrait video style transfer by introducing a novel VToonify framework. Specifically, VToonify leverages the mid- and high-resolution layers of StyleGAN to render high-quality artistic portraits based on the multi-scale content features extracted by an encoder to better preserve the frame details. The resulting fully convolutional architecture accepts non-aligned faces in videos of variable size as input, contributing to complete face regions with natural motions in the output. Our framework is compatible with existing StyleGAN-based image toonification models to extend them to video toonification, and inherits appealing features of these models for flexible style control on color and intensity. This work presents two instantiations of VToonify built upon Toonify and DualStyleGAN for collection-based and exemplar-based portrait video style transfer, respectively. Extensive experimental results demonstrate the effectiveness of our proposed VToonify framework over existing methods in generating high-quality and temporally-coherent artistic portrait videos with flexible style controls.
{"title":"VToonify: Controllable High-Resolution Portrait Video Style Transfer","authors":"Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy","doi":"10.48550/arXiv.2209.11224","DOIUrl":"https://doi.org/10.48550/arXiv.2209.11224","url":null,"abstract":"Generating high-quality artistic portrait videos is an important and desirable task in computer graphics and vision. Although a series of successful portrait image toonification models built upon the powerful StyleGAN have been proposed, these image-oriented methods have obvious limitations when applied to videos, such as the fixed frame size, the requirement of face alignment, missing non-facial details and temporal inconsistency. In this work, we investigate the challenging controllable high-resolution portrait video style transfer by introducing a novel VToonify framework. Specifically, VToonify leverages the mid- and high-resolution layers of StyleGAN to render high-quality artistic portraits based on the multi-scale content features extracted by an encoder to better preserve the frame details. The resulting fully convolutional architecture accepts non-aligned faces in videos of variable size as input, contributing to complete face regions with natural motions in the output. Our framework is compatible with existing StyleGAN-based image toonification models to extend them to video toonification, and inherits appealing features of these models for flexible style control on color and intensity. This work presents two instantiations of VToonify built upon Toonify and DualStyleGAN for collection-based and exemplar-based portrait video style transfer, respectively. Extensive experimental results demonstrate the effectiveness of our proposed VToonify framework over existing methods in generating high-quality and temporally-coherent artistic portrait videos with flexible style controls.","PeriodicalId":7121,"journal":{"name":"ACM Trans. Graph.","volume":"03 1","pages":"203:1-203:15"},"PeriodicalIF":0.0,"publicationDate":"2022-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86099233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Loki: a unified multiphysics simulation framework for production
Steve Lesser, A. Stomakhin, Gilles Daviet, J. Wretborn, Johan Edholm, Noh-Hoon Lee, Eston Schweickart, Xiao Zhai, S. Flynn, Andrew Moffat
We introduce Loki, a new framework for robust simulation of fluid, rigid, and deformable objects with non-compromising fidelity on any single element, and capabilities for coupling and representation transitions across multiple elements. Loki adapts multiple best-in-class solvers into a unified framework driven by a declarative state machine where users declare 'what' is simulated but not 'when,' so an automatic scheduling system takes care of mixing any combination of objects. This leads to intuitive setups for coupled simulations such as hair in the wind or objects transitioning from one representation to another, for example bulk water FLIP particles to SPH spray particles to volumetric mist. We also provide a consistent treatment for components used in several domains, such as unified collision and attachment constraints across 1D, 2D, 3D deforming and rigid objects. Distribution over MPI, custom linear equation solvers, and aggressive application of sparse techniques keep performance within production requirements. We demonstrate a variety of solvers within the framework and their interactions, including FLIP-style liquids, spatially adaptive volumetric fluids, SPH, MPM, and mesh-based solids, including but not limited to discrete elastic rods, elastons, and FEM with state-of-the-art constitutive models. Our framework has proven powerful and intuitive enough for voluntary artist adoption and has delivered creature and FX simulations for multiple major movie productions in the preceding four years.
{"title":"Loki: a unified multiphysics simulation framework for production","authors":"Steve Lesser, A. Stomakhin, Gilles Daviet, J. Wretborn, Johan Edholm, Noh-Hoon Lee, Eston Schweickart, Xiao Zhai, S. Flynn, Andrew Moffat","doi":"10.1145/3528223.3530058","DOIUrl":"https://doi.org/10.1145/3528223.3530058","url":null,"abstract":"We introduce Loki, a new framework for robust simulation of fluid, rigid, and deformable objects with non-compromising fidelity on any single element, and capabilities for coupling and representation transitions across multiple elements. Loki adapts multiple best-in-class solvers into a unified framework driven by a declarative state machine where users declare 'what' is simulated but not 'when,' so an automatic scheduling system takes care of mixing any combination of objects. This leads to intuitive setups for coupled simulations such as hair in the wind or objects transitioning from one representation to another, for example bulk water FLIP particles to SPH spray particles to volumetric mist. We also provide a consistent treatment for components used in several domains, such as unified collision and attachment constraints across 1D, 2D, 3D deforming and rigid objects. Distribution over MPI, custom linear equation solvers, and aggressive application of sparse techniques keep performance within production requirements. We demonstrate a variety of solvers within the framework and their interactions, including FLIPstyle liquids, spatially adaptive volumetric fluids, SPH, MPM, and mesh-based solids, including but not limited to discrete elastic rods, elastons, and FEM with state-of-the-art constitutive models. Our framework has proven powerful and intuitive enough for voluntary artist adoption and has delivered creature and FX simulations for multiple major movie productions in the preceding four years.","PeriodicalId":7121,"journal":{"name":"ACM Trans. Graph.","volume":"29 1","pages":"1 - 20"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81903376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions
Difan Liu, Sandesh Shetty, T. Hinz, Matthew Fisher, Richard Zhang, Taesung Park, E. Kalogerakis
We present ASSET, a neural architecture for automatically modifying an input high-resolution image according to a user's edits on its semantic segmentation map. Our architecture is based on a transformer with a novel attention mechanism. Our key idea is to sparsify the transformer's attention matrix at high resolutions, guided by dense attention extracted at lower image resolutions. While previous attention mechanisms are computationally too expensive for handling high-resolution images or are overly constrained within specific image regions hampering long-range interactions, our novel attention mechanism is both computationally efficient and effective. Our sparsified attention mechanism is able to capture long-range interactions and context, leading to synthesizing interesting phenomena in scenes, such as reflections of landscapes onto water or flora consistent with the rest of the landscape, that were not possible to generate reliably with previous convnets and transformer approaches. We present qualitative and quantitative results, along with user studies, demonstrating the effectiveness of our method.
{"title":"ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions","authors":"Difan Liu, Sandesh Shetty, T. Hinz, Matthew Fisher, Richard Zhang, Taesung Park, E. Kalogerakis","doi":"10.48550/arXiv.2205.12231","DOIUrl":"https://doi.org/10.48550/arXiv.2205.12231","url":null,"abstract":"We present ASSET, a neural architecture for automatically modifying an input high-resolution image according to a user's edits on its semantic segmentation map. Our architecture is based on a transformer with a novel attention mechanism. Our key idea is to sparsify the transformer's attention matrix at high resolutions, guided by dense attention extracted at lower image resolutions. While previous attention mechanisms are computationally too expensive for handling high-resolution images or are overly constrained within specific image regions hampering long-range interactions, our novel attention mechanism is both computationally efficient and effective. Our sparsified attention mechanism is able to capture long-range interactions and context, leading to synthesizing interesting phenomena in scenes, such as reflections of landscapes onto water or flora consistent with the rest of the landscape, that were not possible to generate reliably with previous convnets and transformer approaches. We present qualitative and quantitative results, along with user studies, demonstrating the effectiveness of our method.","PeriodicalId":7121,"journal":{"name":"ACM Trans. Graph.","volume":"2 1","pages":"74:1-74:12"},"PeriodicalIF":0.0,"publicationDate":"2022-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86026223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
WallPlan: synthesizing floorplans by learning to generate wall graphs
Jiahui Sun, Wenming Wu, Ligang Liu, Wenjie Min, Gaofeng Zhang, Liping Zheng
Floorplan generation has drawn widespread interest in the community. Recent learning-based methods for generating realistic floorplans have made significant progress, while a complex heuristic post-processing is still necessary to obtain desired results. In this paper, we propose a novel wall-oriented method, called WallPlan, for automatically and efficiently generating plausible floorplans from various design constraints. We pioneer the representation of the floorplan as a wall graph with room labels and consider the floorplan generation as a graph generation. Given the boundary as input, we first initialize the boundary with windows predicted by WinNet. Then a graph generation network GraphNet and semantics prediction network LabelNet are coupled to generate the wall graph progressively by imitating graph traversal. WallPlan can be applied for practical architectural designs, especially the wall-based constraints. We conduct ablation experiments, qualitative evaluations, quantitative comparisons, and perceptual studies to evaluate our method's feasibility, efficacy, and versatility. Intensive experiments demonstrate our method requires no post-processing, producing higher quality floorplans than state-of-the-art techniques.
{"title":"WallPlan: synthesizing floorplans by learning to generate wall graphs","authors":"Jiahui Sun, Wenming Wu, Ligang Liu, Wenjie Min, Gaofeng Zhang, Liping Zheng","doi":"10.1145/3528223.3530135","DOIUrl":"https://doi.org/10.1145/3528223.3530135","url":null,"abstract":"Floorplan generation has drawn widespread interest in the community. Re- cent learning-based methods for generating realistic floorplans have made significant progress while a complex heuristic post-processing is still neces- sary to obtain desired results. In this paper, we propose a novel wall-oriented method, called WallPlan , for automatically and efficiently generating plausi- blefloorplansfromvariousdesignconstraints.Wepioneertherepresentation ofthefloorplanasawallgraphwithroomlabelsandconsiderthefloorplangenerationasagraphgeneration.Giventheboundaryasinput,wefirst initializetheboundarywithwindowspredictedbyWinNet.ThenagraphgenerationnetworkGraphNetandsemanticspredictionnetworkLabelNet arecoupledtogeneratethewallgraphprogressivelybyimitatinggraphtra-versal. WallPlan can be applied for practical architectural designs, especially the wall-based constraints. We conduct ablation experiments, qualitative evaluations, quantitative comparisons, and perceptual studies to evaluate our method’s feasibility, efficacy, and versatility. Intensive experiments demon- strate our method requires no post-processing, producing higher quality floorplans than state-of-the-art techniques.","PeriodicalId":7121,"journal":{"name":"ACM Trans. Graph.","volume":"97 1","pages":"92:1-92:14"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85944373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
stelaCSF: a unified model of contrast sensitivity as the function of spatio-temporal frequency, eccentricity, luminance and area
Rafał K. Mantiuk, Alexandre Chapiro
The existing CSFs typically account for a subset of relevant dimensions describing a stimulus, limiting the use of such functions to either static or foveal content but not both. In this paper, we propose a unified CSF, stelaCSF, which accounts for all major dimensions of the stimulus: spatial and temporal frequency, eccentricity, luminance, and area. To model the 5-dimensional space of contrast sensitivity, we combined data from 11 papers, each of which studied a subset of this space. While previously proposed CSFs were fitted to a single dataset, stelaCSF can predict the data from all these studies using the same set of parameters. The predictions are accurate in the entire domain, including low frequencies. In addition, stelaCSF relies on psychophysical models and experimental evidence to explain the major interactions between the 5 dimensions of the CSF. We demonstrate the utility of our new CSF in a flicker detection metric and in foveated rendering.
{"title":"stelaCSF: a unified model of contrast sensitivity as the function of spatio-temporal frequency, eccentricity, luminance and area","authors":"Rafał K. Mantiuk, Alexandre Chapiro, Alexandre Chapiro","doi":"10.1145/3528223.3530115","DOIUrl":"https://doi.org/10.1145/3528223.3530115","url":null,"abstract":"contrast many The existing CSFs typically account for a subset of relevant dimensions describing a stimulus, limiting the use of such functions to either static or foveal content but not both. In this paper, we propose a unified CSF, stelaCSF, which accounts for all major dimensions of the stimulus: spatial and temporal frequency, eccentricity, luminance, and area. To model the 5- dimensional space of contrast sensitivity, we combined data from 11 papers, each of which studied a subset of this space. While previously proposed CSFs were fitted to a single dataset, stelaCSF can predict the data from all these studies using the same set of parameters. The predictions are accurate in the entire domain, including low frequencies. In addition, stelaCSF relies on psychophysical models and experimental evidence to explain the major interactions between the 5 dimensions of the CSF. We demonstrate the utility of our new CSF in a flicker detection metric and in foveated rendering.","PeriodicalId":7121,"journal":{"name":"ACM Trans. Graph.","volume":"52 1","pages":"145:1-145:16"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84864902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}