Image anomaly detection is a popular task in computer vision that is widely used in industrial fields. Previous works addressing this problem often train CNN-based models (e.g., auto-encoders, GANs) to reconstruct masked parts of input images and compute the difference between the input and the reconstructed image. However, convolutional operations are good at extracting local features, which makes it difficult to identify larger image anomalies. To this end, we propose a transformer architecture based on mutual attention for image anomaly detection. This architecture captures long-range dependencies and fuses local features with global features to facilitate better image anomaly detection. We evaluated our method extensively on several benchmarks; experimental results show that it improves detection capability by 3.1% and localization capability by 1.0% compared with state-of-the-art reconstruction-based methods.
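To make the fusion idea concrete, below is a minimal PyTorch sketch of how local CNN tokens and global transformer tokens might exchange information through cross-attention. The module name, dimensions, and residual layout are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class MutualAttentionFusion(nn.Module):
    # Illustrative cross-attention between local (CNN) and global
    # (transformer) token sequences, each of shape (B, N, dim).
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.local_q = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.global_q = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm_l = nn.LayerNorm(dim)
        self.norm_g = nn.LayerNorm(dim)

    def forward(self, local_tok, global_tok):
        # Each branch queries the other, so local features absorb global
        # context and global features absorb local detail.
        l2g, _ = self.local_q(local_tok, global_tok, global_tok)
        g2l, _ = self.global_q(global_tok, local_tok, local_tok)
        return self.norm_l(local_tok + l2g), self.norm_g(global_tok + g2l)

# Toy usage: 196 patch tokens with 256 channels each.
local_tok = torch.randn(2, 196, 256)
global_tok = torch.randn(2, 196, 256)
fused_local, fused_global = MutualAttentionFusion()(local_tok, global_tok)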
In this study, we propose view interpolation networks to reproduce the changes in brightness of an object's surface that depend on the viewing direction, which is important for reproducing the material appearance of a real object. We use an original and a modified version of U-Net for image transformation. The networks were trained to generate images from viewpoints intermediate between those of four cameras placed at the corners of a square. We conducted an experiment with three different combinations of methods and training data formats, and found that it is best to input the coordinates of the target viewpoint together with the four camera images and to use images from random viewpoints as the training data.
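As an illustration of the best-performing input format (viewpoint coordinates alongside the four corner images), here is a PyTorch sketch of how such an input tensor could be assembled and fed to a network. The channel layout, the make_input helper, and the tiny stand-in for the U-Net are assumptions for illustration only.

import torch
import torch.nn as nn

def make_input(corner_imgs, target_uv):
    # Stack the four corner-camera images and broadcast the target
    # viewpoint coordinates (u, v) in [0, 1]^2 as two extra channels.
    b, _, _, h, w = corner_imgs.shape            # (B, 4, 3, H, W)
    imgs = corner_imgs.reshape(b, 12, h, w)      # 4 RGB images -> 12 channels
    coords = target_uv.view(b, 2, 1, 1).expand(b, 2, h, w)
    return torch.cat([imgs, coords], dim=1)      # (B, 14, H, W)

net = nn.Sequential(  # stand-in for the U-Net: 14 input channels, RGB out
    nn.Conv2d(14, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
)
x = make_input(torch.rand(1, 4, 3, 64, 64), torch.tensor([[0.3, 0.7]]))
pred = net(x)  # image rendered at the intermediate viewpoint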
Owing to the rapid development of deep networks, single image deraining has achieved significant progress. Various architectures have been designed to remove rain recursively or directly, and most rain streaks can be removed by existing deraining methods. However, many of them lose details during deraining, resulting in visual artifacts. To resolve this loss-of-detail issue, we propose a novel unrolling rain-guided detail recovery network (URDRN) for single image deraining, based on the observation that the most degraded areas of the background image tend to be the most rain-corrupted regions. Furthermore, to address the problem that most existing deep-learning-based methods trivialize the observation model and simply learn an end-to-end mapping, the proposed URDRN unrolls single image deraining into two subproblems: rain extraction and detail recovery. Specifically, a context aggregation attention network is first introduced to effectively extract rain streaks; a rain attention map is then generated as an indicator to guide the detail-recovery process. With the guidance of the rain attention map, a simple encoder-decoder model is sufficient for the detail-recovery sub-network to recover the lost details. Experiments on several well-known benchmark datasets show that the proposed approach achieves competitive performance in comparison with other state-of-the-art methods.
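The two-subproblem unrolling can be sketched in a few lines of PyTorch: a rain-extraction branch produces a rain attention map, which then guides a simple encoder-decoder detail-recovery branch. The stand-in layers below are placeholders under assumed shapes, not the paper's context aggregation attention network.

import torch
import torch.nn as nn

class URDRNSketch(nn.Module):
    # Two-stage unrolling: extract rain, then recover details where the
    # rain attention map indicates the background was most degraded.
    def __init__(self, ch=16):
        super().__init__()
        self.rain_net = nn.Sequential(    # stand-in for the rain-extraction
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),  # sub-network
            nn.Conv2d(ch, 3, 3, padding=1))
        self.detail_net = nn.Sequential(  # stand-in encoder-decoder
            nn.Conv2d(4, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 3, 3, padding=1))

    def forward(self, rainy):
        rain = self.rain_net(rainy)                       # estimated streaks
        attn = torch.sigmoid(rain.mean(1, keepdim=True))  # rain attention map
        coarse = rainy - rain                             # coarse rain-free image
        detail = self.detail_net(torch.cat([coarse, attn], dim=1))
        return coarse + detail                            # final derained image

out = URDRNSketch()(torch.rand(1, 3, 64, 64))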
Judging how visually appealing an image is can be a complicated and subjective task. This strongly motivates a machine learning model that automatically evaluates image aesthetics in line with the judgments of the general public. Although deep learning methods have successfully learned good visual features from images, correctly assessing image aesthetic quality remains challenging for deep learning. To tackle this, we propose a novel multi-view convolutional neural network that assesses image aesthetics by analyzing image color composition and space formation (IAACS). Specifically, from different views of an image, including its key color components with their contributions, its space formation, and the image itself, our network extracts the corresponding features through our proposed feature extraction module (FET) and an ImageNet-pretrained classification model. By fusing the extracted features, our network produces an accurate prediction of the image's aesthetic score distribution. Experimental results show that our model achieves superior performance.
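To illustrate the multi-view idea, the PyTorch sketch below extracts features from three assumed views (the image itself, a key-color view, and a space-formation view) and fuses them into a 10-bin score distribution. The branch design and bin count are illustrative assumptions, not the IAACS architecture itself.

import torch
import torch.nn as nn

class MultiViewAesthetics(nn.Module):
    # Fuse features from three views of an image and predict a
    # distribution over 10 aesthetic score bins.
    def __init__(self, feat=128):
        super().__init__()
        def branch(c):
            return nn.Sequential(
                nn.Conv2d(c, feat, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.image_branch = branch(3)   # the image itself
        self.color_branch = branch(3)   # key-color composition view
        self.space_branch = branch(1)   # space-formation (layout) view
        self.head = nn.Linear(3 * feat, 10)

    def forward(self, img, color_view, space_view):
        f = torch.cat([self.image_branch(img),
                       self.color_branch(color_view),
                       self.space_branch(space_view)], dim=1)
        return torch.softmax(self.head(f), dim=1)  # score distribution

dist = MultiViewAesthetics()(torch.rand(1, 3, 64, 64),
                             torch.rand(1, 3, 64, 64),
                             torch.rand(1, 1, 64, 64))  # (1, 10), sums to 1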
Video anomaly detection has long been a hot topic and is attracting increasing attention. Many existing methods for video anomaly detection process the entire video rather than considering only the significant context. This paper proposes a novel video anomaly detection method named COVAD, which focuses on the regions of interest in the video instead of the entire frame. Our proposed COVAD method is based on a convolutional auto-encoder with a coordinate attention mechanism, which can effectively capture meaningful objects in the video and the dependencies between them. Built on an existing memory-guided video frame prediction network, our algorithm can more effectively predict the future motion and appearance of objects in the video. Our algorithm obtained better experimental results on multiple datasets, outperforming the baseline models considered in our analysis. At the same time, we improved a visual test that can provide pixel-level anomaly explanations.
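For readers unfamiliar with coordinate attention, the PyTorch sketch below follows the commonly published formulation: the feature map is pooled along each spatial axis separately, and the resulting direction-aware weights gate the input. This is a generic illustration of the mechanism, not COVAD's exact module or training setup.

import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    # Pool along H and W separately, then gate the feature map with
    # direction-aware attention weights.
    def __init__(self, ch, red=8):
        super().__init__()
        mid = max(ch // red, 4)
        self.conv1 = nn.Conv2d(ch, mid, 1)
        self.act = nn.ReLU()
        self.conv_h = nn.Conv2d(mid, ch, 1)
        self.conv_w = nn.Conv2d(mid, ch, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        ph = x.mean(3, keepdim=True)                       # (B, C, H, 1)
        pw = x.mean(2, keepdim=True).permute(0, 1, 3, 2)   # (B, C, W, 1)
        y = self.act(self.conv1(torch.cat([ph, pw], dim=2)))
        yh, yw = y.split([h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(yh))                      # (B, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(yw.permute(0, 1, 3, 2)))  # (B, C, 1, W)
        return x * a_h * a_w

out = CoordinateAttention(32)(torch.rand(1, 32, 16, 16))  # same shape as input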
Due to limitations in the working principles of 3D scanning equipment, point clouds obtained by 3D scanning are usually sparse and unevenly distributed. In this paper, we propose a new generative adversarial network (GAN) for point cloud upsampling, which is extended from PU-GAN. Its core idea is to replace the traditional self-attention (SA) module with an implicit Laplacian offset attention (OA) module, and to aggregate adjacency features using a multi-scale offset attention (MSOA) module, which adaptively adjusts the receptive field to learn various structural features. Finally, residual links are added to form our residual multi-scale offset attention (RMSOA) module, which utilizes multi-scale structural relationships to generate finer details. Extensive experiments show that our method outperforms existing methods and that our model is highly robust.
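As a rough illustration, the following PyTorch sketch implements offset attention in the spirit of the Point Cloud Transformer: the Laplacian-like offset between the input features and their self-attention output is transformed and added back residually. The layer choices here are simplified assumptions, not the RMSOA module itself.

import torch
import torch.nn as nn

class OffsetAttention(nn.Module):
    # Self-attention whose output is the offset x - attn(x), akin to a
    # graph Laplacian, transformed and added back as a residual.
    def __init__(self, dim=64):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.lbr = nn.Sequential(nn.Linear(dim, dim),
                                 nn.LayerNorm(dim), nn.ReLU())

    def forward(self, x):              # x: (B, N, dim) point features
        a, _ = self.attn(x, x, x)
        return x + self.lbr(x - a)     # Laplacian-like offset + residual

feats = torch.rand(2, 1024, 64)        # features for 1024 points
out = OffsetAttention()(feats)         # shape (2, 1024, 64)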
A lack of social activities, often for physical reasons, can make the elderly feel lonely and prone to depression. With the spread of COVID-19, it has become difficult for the elderly to maintain even these few social activities, leaving them lonelier still. The metaverse is a virtual world that mirrors reality. It allows the elderly to escape the constraints of reality and engage in social activities stably and continuously, providing new ideas for alleviating loneliness among the elderly. Based on an analysis of the needs of the elderly, this study proposes a virtual social center framework for the elderly, and a prototype system was designed according to the framework. The elderly can socialize in virtual reality using metaverse-related technologies and human-computer interaction tools. Additionally, a test was conducted jointly with the chief physician of the geriatric rehabilitation department of a tertiary hospital. The results demonstrate that the mental state of the elderly who used the virtual social center was significantly better than that of the elderly who did not. Thus, virtual social centers can alleviate loneliness and depression in older adults as the global epidemic becomes normalized and populations age. Hence, they are worth promoting.
Developments in new-generation information technology have enabled Digital Twins to reshape the physical world into a virtual digital space and provide technical support for constructing the Metaverse. Metaverse objects can be at the micro-, meso-, or macroscale. The Metaverse is a complex collection of solid, liquid, gaseous, plasma, and other uncertain states. Additionally, the Metaverse integrates tangible objects with social relations, such as interpersonal relations (friends, partners, and family) and societal relations (ethics, morality, and law). This review introduces several principles and laws, such as the broken windows theory, the small-world phenomenon, survivor bias, and herd behavior, for constructing a Digital Twins model of social relations. From multiple perspectives, this article then reviews mappings of tangible and intangible real-world objects to the Metaverse using the Digital Twins model.