In this paper we present a new conceptual model of trajectories that accounts for semantic and indoor space information and supports the design and implementation of context-aware mobility data mining and statistical analytics methods. Motivated by a compelling museum case study, and by what we perceive as a gap in indoor trajectory research, we combine aspects of state-of-the-art semantic outdoor trajectory models with a semantically enabled hierarchical symbolic representation of indoor space that abides by OGC's IndoorGML standard. We discuss modeling issues that have been overlooked so far and illustrate them with a real-world case study concerning the Louvre Museum, in an effort to provide a pragmatic view of what the proposed model represents and how. We also present experimental results based on the Louvre's visiting data, showcasing how state-of-the-art mining algorithms can be applied to trajectory data represented according to the proposed model, and outline their advantages and limitations. Finally, we provide a formal outline of a new sequential pattern mining algorithm and show how it can be used to extract interesting trajectory patterns.
Spatiotemporal event sequences (STESs) are ordered series of event types whose instances frequently follow each other in time and are located close to one another. An STES is a type of spatiotemporal frequent pattern, discovered from moving region objects whose polygon-based locations continuously evolve over time. Previous studies on STES mining require significance and prevalence thresholds for the discovery, and suitable values for these are usually unknown to domain experts. The quality of the discovered sequences is of great importance to the domain experts who use these algorithms. We introduce a novel algorithm that finds the most relevant STESs without threshold values. We tested the relevance and performance of our threshold-free algorithm with a case study on solar event metadata and compared the results with those of previous STES mining algorithms.
Fast and precise querying over a given set of trajectory points is an important problem in trajectory querying. Typically, the database holds massive trajectory data while a query set contains only a few points, which makes high query performance challenging. Current trajectory query methods commonly use tree-based index structures and signature-based methods to classify, simplify, and filter trajectories in order to improve performance. However, the unstructured nature and spatiotemporal heterogeneity of trajectory sequences lead these methods to a high degree of spatial overlap, frequent I/O, and high memory occupation. Thus, they are not suitable for time-critical tasks on trajectory big data. In this paper, a trajectory query method based on the Bloom Filter is developed. Based on gridded space and geocoding, the query of spatial trajectory sequences (tracks) is transformed into a text-string query. The geospace is regularly divided by a geographic grid, and each cell is assigned an independent geocode, converting the high-dimensional irregular spatial trajectory query into a one-dimensional string query. The point in each cell is regarded as a signature, which is mapped to the bit-array of the Bloom Filter. This conversion effectively eliminates the high degree of overlap and the instability of query performance. Meanwhile, the independent coding ensures the uniqueness of whole tracks. With this method, no additional I/O on the raw trajectory data is needed when a track is queried. Compared to the original data, the memory occupied by this method is negligible. Based on Beijing taxi and Shenzhen bus trajectory data, an experiment using this method was conducted, with random queries issued under a variety of boundary conditions.
The results verified that, compared to the R*-tree index, the performance and stability of our method improved by a factor of 2000 to 4000 on datasets of one million to tens of millions of trajectory records. Moreover, the Bloom Filter-based query method is hardly affected by grid size, original data size, or track length. With such a time advantage, our method is suitable for time-critical spatial computation tasks, such as anti-terrorism, public safety, and epidemic prevention and control.
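The grid-geocode-to-Bloom-Filter idea described in the last abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `cell_code` geocoding scheme, the 0.01-degree cell size, the choice of one filter per track, the SHA-256 double hashing, and all function names are assumptions made here for illustration only.

```python
import hashlib
import math


class BloomFilter:
    """Plain Bloom filter over strings, backed by a bytearray bit-array."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive all bit positions from one SHA-256 digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def cell_code(lon, lat, cell_deg=0.01):
    """Hypothetical geocode: column and row of the regular grid cell."""
    col = math.floor((lon + 180.0) / cell_deg)
    row = math.floor((lat + 90.0) / cell_deg)
    return f"{col}_{row}"


def build_index(tracks, cell_deg=0.01):
    """Assumption: one Bloom filter per track, keyed by track id."""
    index = {}
    for track_id, points in tracks.items():
        bf = BloomFilter()
        for lon, lat in points:
            bf.add(cell_code(lon, lat, cell_deg))
        index[track_id] = bf
    return index


def query(index, points, cell_deg=0.01):
    """Return ids of tracks whose filters contain every queried cell.

    Bloom filters never give false negatives but may give false
    positives, so the result is a candidate set, not an exact answer.
    """
    codes = {cell_code(lon, lat, cell_deg) for lon, lat in points}
    return [tid for tid, bf in index.items() if all(c in bf for c in codes)]


if __name__ == "__main__":
    # Made-up coordinates, used only to exercise the sketch.
    tracks = {
        "taxi_1": [(116.397, 39.908), (116.407, 39.918)],
        "bus_7": [(114.057, 22.543)],
    }
    index = build_index(tracks)
    print(query(index, [(116.3975, 39.9085)]))
```

In this sketch the query touches only the in-memory bit-arrays and never the raw points, which mirrors the abstract's claim that no additional I/O on the raw trajectory data is required. A larger cell size makes more points share a geocode and so coarsens the match.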