Sketch-based content generation offers flexible controllability, making it a promising narrative avenue in film production. Directors often visualize their imagination by crafting storyboards using sketches and textual descriptions for each shot. However, current video generation methods suffer from three-dimensional inconsistencies, with notable artifacts during large motions or camera pans around scenes. A suitable solution is to directly generate 4D scenes, enabling the generation of consistent dynamic three-dimensional scenes. We define the Sketch-2-4D problem, aiming to enhance controllability and consistency in this context. We propose a novel Control Score Distillation Sampling (SDS-C) for sketch-based 4D scene generation, providing precise control over scene dynamics. We further design Spatial Consistency Modules and Temporal Consistency Modules to tackle the spatial and temporal inconsistencies introduced by sketch-based control, respectively. Extensive experiments demonstrate the effectiveness of our approach.
Feature lines play a pivotal role in the reconstruction of CAD models. However, a robust explicit reconstruction algorithm capable of recovering sharp features from noisy, non-uniform point clouds is still lacking. In this paper, we propose a feature-preserving CAD model surface reconstruction algorithm, named FACE. The algorithm first preprocesses the point cloud through denoising and resampling steps, producing a high-quality point cloud that is free of noise and uniformly distributed. Then, it employs discrete optimal transport to detect feature regions and subsequently generates dense points along potential feature lines to enhance features. Finally, the advancing-front surface reconstruction method, guided by normal vector directions, is applied to reconstruct the enhanced point cloud. Extensive experiments demonstrate that, for contaminated point clouds, the algorithm excels not only at reconstructing straight edges and corner points but also at handling curved edges and surfaces, surpassing existing methods.
Mesh-based image vectorization techniques have been studied for a long time, mostly owing to their compactness and flexibility in capturing image features. However, existing methods often produce relatively dense meshes, especially when applied to images with high-frequency details or textures. We present a novel method that automatically vectorizes an image into a sparse collection of Coons patches whose size adapts to image features. To balance the number of patches against the accuracy of feature alignment, we generate the layout from a harmonic cross field constrained by image features. We support T-junctions, which keep the number of patches low and ensure local adaptation to feature density, naturally complemented by a varying mesh-color resolution over the patches. Our experimental results demonstrate the utility, accuracy, and sparsity of our method.
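For readers unfamiliar with the primitive involved, a bilinearly blended Coons patch fills the interior of four boundary curves by blending the two ruled surfaces between opposite sides and subtracting the bilinear interpolation of the corners. A minimal evaluator (standard construction, not code from the paper):

```python
import numpy as np

def coons_patch(c0, c1, d0, d1, u, v):
    """Evaluate a bilinearly blended Coons patch at (u, v) in [0,1]^2.
    c0/c1: bottom/top boundary curves, functions of u;
    d0/d1: left/right boundary curves, functions of v.
    The curves must agree at the four shared corners."""
    p00, p10 = np.asarray(c0(0.0)), np.asarray(c0(1.0))
    p01, p11 = np.asarray(c1(0.0)), np.asarray(c1(1.0))
    # Sum of the two ruled surfaces between opposite boundary pairs...
    ruled = ((1 - v) * np.asarray(c0(u)) + v * np.asarray(c1(u))
             + (1 - u) * np.asarray(d0(v)) + u * np.asarray(d1(v)))
    # ...minus the bilinear interpolation of the corners (counted twice).
    bilinear = ((1 - u) * (1 - v) * p00 + u * (1 - v) * p10
                + (1 - u) * v * p01 + u * v * p11)
    return ruled - bilinear

# Straight boundaries of the unit square reproduce the identity map.
c0 = lambda u: (u, 0.0); c1 = lambda u: (u, 1.0)
d0 = lambda v: (0.0, v); d1 = lambda v: (1.0, v)
p = coons_patch(c0, c1, d0, d1, 0.3, 0.7)
```

The patch interpolates all four boundary curves exactly, which is what makes a layout of such patches suitable for feature-aligned vectorization.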
The restoration of digital images has practical significance because degradation of digital image data on the internet is common. State-of-the-art image restoration methods usually employ end-to-end trained networks. However, we argue that a network trained on diverse image pairs is not optimal for restoring line drawings, which have extensive plain backgrounds. We propose a line-drawing restoration framework that takes a restoration neural network as its backbone and processes an input degraded line drawing in two steps. First, a mask-predicting network predicts a line mask indicating the likely locations of foreground and background in the potential original line drawing. Next, we feed the degraded line drawing, together with the predicted line mask, into the backbone restoration network. The conventional loss for the backbone restoration network is replaced with a masked Mean Square Error (MSE) loss. We evaluate our framework on two classical image restoration tasks, JPEG restoration and super-resolution, and experiments demonstrate that it achieves better quantitative and visual results in most cases.
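The abstract does not give the exact form of the masked MSE. One plausible reading, sketched below with numpy, is a per-pixel weighted MSE in which the predicted line mask up-weights foreground (line) pixels relative to the plain background; the weights `fg_weight` and `bg_weight` are illustrative assumptions, not values from the paper:

```python
import numpy as np

def masked_mse(pred, target, mask, fg_weight=1.0, bg_weight=0.1):
    """Weighted MSE: squared error on pixels where mask indicates a line
    counts with fg_weight, background pixels with bg_weight."""
    w = np.where(mask > 0.5, fg_weight, bg_weight)
    return float((w * (pred - target) ** 2).sum() / w.sum())

# The same per-pixel error is penalized more on a line than on background.
target = np.zeros((2, 2))
mask = np.array([[1.0, 0.0], [0.0, 0.0]])
err_on_line = target.copy(); err_on_line[0, 0] = 1.0
err_on_bg = target.copy(); err_on_bg[0, 1] = 1.0
loss_line = masked_mse(err_on_line, target, mask)
loss_bg = masked_mse(err_on_bg, target, mask)
```

In a training setting the same weighting would be applied to framework tensors rather than numpy arrays, with the mask coming from the mask-predicting network.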

