In the original publication of this article [1], Figs. 3 and 4 were not clear enough. They have been resized and are shown below.
With the rapid development of deep learning technology, behavior recognition based on video streams has made great progress in recent years. However, some problems remain to be solved: (1) to improve behavior recognition performance, models have tended to become deeper, wider, and more complex, which introduces new problems such as reduced real-time performance; (2) some actions in existing datasets are so similar that they are difficult to distinguish. To solve these problems, the ResNet34-3DRes18 model, a lightweight and efficient two-dimensional (2D) and three-dimensional (3D) fused model, is constructed in this study. The model uses a 2D convolutional neural network (2DCNN) to obtain the feature maps of input images and a 3D convolutional neural network (3DCNN) to process the temporal relationships between frames, which allows the model to exploit the advantages of 3DCNN in video temporal modeling while reducing model complexity. Compared with state-of-the-art models, this method has shown excellent performance at a faster speed. Furthermore, to distinguish between similar motions in the datasets, an attention gate mechanism is added, and a Res34-SE-IM-Net attention recognition model is constructed. The Res34-SE-IM-Net achieved 71.85%, 92.196%, and 36.5% top-1 accuracy on the test sets of the HMDB51, UCF101, and Something-Something v1 datasets, respectively. (Under top-1 accuracy, the predicted label is the class with the largest value in the model's output probability vector; the classification is correct if this label matches the target label of the motion.)
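The top-1 criterion defined at the end of the abstract above can be sketched in a few lines. This is only an illustration of the metric, not the authors' evaluation code; the function name `top1_accuracy` and the sample scores are hypothetical.

```python
def top1_accuracy(probabilities, targets):
    """Fraction of samples whose highest-probability class equals the target."""
    if len(probabilities) != len(targets):
        raise ValueError("probabilities and targets must have the same length")
    if not targets:
        raise ValueError("at least one sample is required")
    correct = 0
    for probs, target in zip(probabilities, targets):
        # The predicted label is the index of the largest output probability.
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if predicted == target:
            correct += 1
    return correct / len(targets)


# Three made-up samples over four classes (illustrative values only).
scores = [
    [0.1, 0.7, 0.1, 0.1],
    [0.6, 0.2, 0.1, 0.1],
    [0.2, 0.2, 0.5, 0.1],
]
labels = [1, 0, 3]
accuracy = top1_accuracy(scores, labels)
```

Here the first two samples are classified correctly and the third is not, so two of three predictions count as correct.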
This review investigates recent developments in heterogeneous object modeling for additive manufacturing (AM), as well as general problems and widespread solutions in the modeling methods of heterogeneous objects. Prevalent heterogeneous object representations are categorized according to the expression or data structure they employ, and the state of the art in process planning for AM is reviewed through different solutions for part orientation, slicing methods, and path planning strategies. Finally, some evident problems and possible future directions of investigation are discussed.
In a positron emission tomography (PET) scanner, the time-of-flight (TOF) information gives a rough event position along the line-of-response (LOR). Using TOF information in PET image reconstruction can reduce image noise. State-of-the-art TOF PET image reconstruction uses iterative algorithms. This study introduces an analytic TOF PET algorithm that focuses on three-dimensional (3D) reconstruction. The proposed algorithm takes the form of backprojection filtering: the backprojection is performed first using a time-resolution profile function, and a 3D filter is then applied to the backprojected image. For list-mode data, the backprojection is carried out event by event, and a weighting function determined by the timing resolution is applied along the projection LOR. Computer simulations are carried out to verify the feasibility of the proposed algorithm.
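The event-by-event weighting along the LOR described above can be illustrated with a minimal one-dimensional sketch. It assumes a Gaussian time-resolution profile specified by its full width at half maximum, which the abstract does not state, and it covers only the weighted backprojection along a single LOR, not the 3D filter. The names `tof_weights` and `backproject_events` are hypothetical.

```python
import math

C_MM_PER_PS = 0.299792458  # speed of light in millimetres per picosecond


def tof_weights(lor_length_mm, dt_ps, fwhm_ps, n_samples):
    """Gaussian TOF weights at n_samples points spread evenly along one LOR.

    Positions are offsets from the LOR midpoint. dt_ps is t2 - t1, so a
    positive value places the event closer to detector 1 (positive offset).
    The Gaussian profile is an assumption made for this sketch.
    """
    if n_samples < 2:
        raise ValueError("n_samples must be at least 2")
    centre = C_MM_PER_PS * dt_ps / 2.0
    # Convert the timing FWHM to a spatial standard deviation along the LOR.
    fwhm_mm = C_MM_PER_PS * fwhm_ps / 2.0
    sigma = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    step = lor_length_mm / (n_samples - 1)
    positions = [-lor_length_mm / 2.0 + i * step for i in range(n_samples)]
    weights = [math.exp(-0.5 * ((x - centre) / sigma) ** 2) for x in positions]
    total = sum(weights)
    if total == 0.0:
        return positions, weights  # event lies far outside the sampled LOR
    return positions, [w / total for w in weights]


def backproject_events(time_differences_ps, lor_length_mm, fwhm_ps, n_samples):
    """Accumulate the weighted contribution of each list-mode event in turn."""
    profile = [0.0] * n_samples
    for dt_ps in time_differences_ps:
        _, weights = tof_weights(lor_length_mm, dt_ps, fwhm_ps, n_samples)
        profile = [p + w for p, w in zip(profile, weights)]
    return profile


# One event with a 200 ps time difference on a 600 mm LOR sampled every 1 mm.
positions, weights = tof_weights(600.0, 200.0, 400.0, 601)
peak_position = positions[weights.index(max(weights))]
profile = backproject_events([200.0, 200.0, -100.0], 600.0, 400.0, 601)
```

A 200 ps time difference corresponds to an offset of about 30 mm from the midpoint, so the weights peak at the sample nearest that position.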
A hybrid image allows multiple image interpretations to be modulated by the viewing distance. Originally, a hybrid image was constructed by combining the low spatial frequencies of one image with the high spatial frequencies of another. The original hybrid image synthesis was limited to source images of similar shape with aligned edges, e.g., faces with different expressions, to produce an effective double image interpretation. In our previous work, we proposed a noise-inserted method for synthesizing a hybrid image from images with dissimilar shapes or from unaligned images. In this work, we propose a novel method for adding an image to be seen from a middle viewing distance. The middle-frequency (MF) image is extracted by a special bandpass filter, which generates ringing while extracting only the specified frequency bands. With this method, the MF image should be perceived as a meaningless pattern when viewed from far away or close up. A parameter tuning experiment was performed to determine suitable cutoff frequencies for designing the filter for the MF image. We found that ringing of a suitable size could be used to make the MF image less noticeable when seen from far away.
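The original two-image construction mentioned at the start of the abstract above (low frequencies of one image plus high frequencies of another) can be sketched as follows. This is not the proposed MF method or its ringing bandpass filter. It assumes Gaussian filters and grayscale images stored as nested lists, and the function names and sigma values are hypothetical.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about three sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(image, sigma):
    """Separable Gaussian low-pass filter with edge pixels replicated."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    rows, cols = len(image), len(image[0])

    def clamp(i, n):
        return min(max(i, 0), n - 1)

    # Filter along rows first, then along columns.
    horizontal = [
        [sum(kernel[k + radius] * image[r][clamp(c + k, cols)]
             for k in range(-radius, radius + 1)) for c in range(cols)]
        for r in range(rows)
    ]
    return [
        [sum(kernel[k + radius] * horizontal[clamp(r + k, rows)][c]
             for k in range(-radius, radius + 1)) for c in range(cols)]
        for r in range(rows)
    ]


def hybrid_image(far_image, near_image, sigma_low, sigma_high):
    """Low-pass of far_image plus high-pass (image minus its blur) of near_image."""
    low = blur(far_image, sigma_low)
    near_low = blur(near_image, sigma_high)
    return [
        [low[r][c] + near_image[r][c] - near_low[r][c] for c in range(len(low[0]))]
        for r in range(len(low))
    ]


# Synthetic inputs: a smooth gradient (seen from afar) and a checkerboard (seen up close).
size = 16
gradient = [[c / (size - 1) for c in range(size)] for _ in range(size)]
checker = [[float((r + c) % 2) for c in range(size)] for r in range(size)]
result = hybrid_image(gradient, checker, sigma_low=2.0, sigma_high=1.0)
```

The output is not clipped to a display range, so a real pipeline would rescale or clamp the values before showing the image.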
Exploration of artworks is enjoyable but often time consuming. For example, it is not always easy to discover one's favorite types among unknown paintings. Nor is it always easy to find lesser-known paintings that look similar to works created by famous artists. This paper presents a painting image browser that assists the exploratory discovery of paintings that interest the user. The browser applies a new multidimensional data visualization technique that highlights particular ranges of particular numeric values, based on association rules, to suggest cues for finding favorite painting images. This study assumes that a large number of painting images are provided and that categorical information (e.g., artist name, year of creation) is assigned to the images. The system first calculates the feature values of the images as a preprocessing step. The browser then visualizes the multidimensional feature values as a heatmap and highlights association rules discovered from the relationships between the feature values and the categorical information. This mechanism enables users to explore favorite painting images or painting images that look similar to famous paintings. Our case study and user evaluation demonstrate the effectiveness of the presented image browser.
Based on patient computerized tomography data, we segmented a region containing an intracranial hematoma using the threshold method and reconstructed a 3D model of the hematoma. To improve the efficiency and accuracy of identifying puncture points, a point-cloud search algorithm based on modified adaptive weighted particle swarm optimization is proposed and used for optimal external axis extraction. According to the characteristics of the multitube drainage tube and the clinical needs of puncture for intracranial hematoma removal, the proposed algorithm can provide an optimal route for a drainage tube for the hematoma, the precise position of the puncture point, and preoperative planning information, which offer considerable guidance for clinicians.
To tackle challenges such as interference and poor accuracy in indoor positioning systems, a novel scheme based on ultra-wide bandwidth (UWB) technology is proposed. First, we illustrate a method for measuring the distance between two UWB devices. Then, a Taylor series expansion algorithm is developed to determine the coordinates of the mobile node using the locations of the anchor nodes and the distance between them. Simulation results show that the observation error under our strategy is within 15 cm, outperforming existing algorithms. The final experimental data from a hardware system built mainly around the STM32 and DW1000 also confirm the performance of the proposed scheme.
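A Taylor series expansion approach to range-based positioning can be sketched as an iterative linearized least-squares solve. This is a generic 2D illustration rather than the authors' algorithm. It assumes that the measured distances are ranges from the mobile node to each anchor, and the function name `taylor_locate`, the anchor layout, and the starting guess are hypothetical.

```python
import math


def taylor_locate(anchors, distances, initial, iterations=20, tol=1e-9):
    """Refine a 2D position estimate from anchor positions and measured ranges.

    Each iteration linearizes the range equations around the current estimate
    (first-order Taylor expansion) and solves the 2x2 normal equations.
    """
    x, y = initial
    for _ in range(iterations):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for (ax, ay), d in zip(anchors, distances):
            r = math.hypot(x - ax, y - ay)
            if r == 0.0:
                continue  # gradient is undefined exactly at an anchor
            gx, gy = (x - ax) / r, (y - ay) / r  # partial derivatives of r
            residual = d - r
            a11 += gx * gx
            a12 += gx * gy
            a22 += gy * gy
            b1 += gx * residual
            b2 += gy * residual
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-12:
            raise ValueError("anchor geometry is degenerate")
        dx = (a22 * b1 - a12 * b2) / det
        dy = (a11 * b2 - a12 * b1) / det
        x, y = x + dx, y + dy
        if math.hypot(dx, dy) < tol:
            break
    return x, y


# Four anchors at the corners of a 10 m square; noise-free ranges to (3, 4).
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
true_position = (3.0, 4.0)
ranges = [math.hypot(true_position[0] - ax, true_position[1] - ay) for ax, ay in anchors]
estimate = taylor_locate(anchors, ranges, initial=(5.0, 5.0))
```

With noise-free ranges the estimate converges to the true position. With real UWB measurements the result depends on the ranging noise and on how close the starting guess is to the true position.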
Texture features have played an essential role in the field of medical imaging for computer-aided diagnosis. The gray-level co-occurrence matrix (GLCM)-based texture descriptor has emerged as one of the most successful feature sets for these applications. This study aims to increase the potential of these features by introducing multi-scale analysis into the construction of the GLCM texture descriptor. We first introduce a new parameter, stride, to explore the definition of GLCM. We then propose three multi-scaling GLCM models according to its three parameters: (1) a learning model by multiple displacements, (2) a learning model by multiple strides (LMS), and (3) a learning model by multiple angles. These models increase the texture information by introducing more texture patterns and mitigate the direction sparsity and dense sampling problems present in the traditional Haralick model. To further analyze the three parameters, we test the three models by performing classification on a dataset of 63 large polyp masses obtained from computed tomography colonoscopy, consisting of 32 adenocarcinomas and 31 benign adenomas. Finally, the proposed methods are compared with several typical GLCM texture descriptors and one deep learning model. LMS obtains the highest performance, raising the area under the receiver operating characteristic curve to 0.9450 with a standard deviation of 0.0285, which is a significant improvement.
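For readers unfamiliar with the GLCM, the conventional construction for a single displacement can be sketched as below. The stride parameter introduced in the study is not defined in the abstract, so it is deliberately left out. The function names and the small test image are illustrative only.

```python
def glcm(image, levels, dx, dy, symmetric=True):
    """Count co-occurrences of gray levels at offset (dx, dy).

    image is a nested list of integer gray levels in range(levels);
    dx is the column offset and dy the row offset of the neighbor pixel.
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                matrix[i][j] += 1
                if symmetric:
                    matrix[j][i] += 1  # also count the reverse direction
    return matrix


def contrast(matrix):
    """Haralick contrast: squared gray-level difference weighted by probability."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("empty co-occurrence matrix")
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


# A 4x4 image with four gray levels; horizontal neighbors (distance 1, angle 0).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
horizontal = glcm(image, levels=4, dx=1, dy=0)
horizontal_contrast = contrast(horizontal)
```

Other angles correspond to other offsets, for example `dx=0, dy=1` for vertical neighbors, and larger displacements to larger offsets in the same direction.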