Persistent memory (PM) promises byte-addressability, large capacity, and durability. Main memory systems, such as key-value stores and in-memory databases, benefit from these features of PM. Because hash indexes are widely used in main memory systems, a number of research efforts have been made to provide persistent hashing with high average performance. However, existing persistent hashing schemes still exhibit suboptimal tail performance, in terms of both tail throughput and tail latency. In this paper, we analyze the major sources of suboptimal tail performance arising from the key design issues of persistent hashing. We identify the global hash structure and concurrency control as the remaining explorable design spaces for improving tail performance. We propose Directory-sharing Multi-level Extendible Hashing (Dalea) for PM. Dalea introduces ancestor-link-based extendible hashing and fine-grained transient locking to address the two main sources of degraded tail performance: rehashing and locking. The evaluation results show that, compared with the state-of-the-art persistent hashing scheme Dash, Dalea increases tail throughput by 4.1x and reduces tail latency by 5.4x. Moreover, to provide design guidelines for improving tail performance, we adopt Dalea as a testbed to quantify the individual impacts of four factors on tail performance: fine-grained rehashing, transient locking, memory pre-allocation, and fingerprinting.
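To make the rehashing point concrete, the sketch below (ours, not Dalea's actual code) shows the classic extendible-hashing split that directory-based designs build on: when a bucket overflows, only that bucket's entries move, so no operation ever pays for a full-table rehash. Comments mark where a PM design such as Dalea would keep its fine-grained transient locks; all names here are hypothetical.

```python
BUCKET_CAP = 4  # illustrative; real PM designs size buckets to cache lines

class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth
        self.items = {}
        # A PM design like Dalea would also keep a per-bucket lock here, held
        # only in DRAM (transient): after a crash it is simply reinitialized,
        # so no persisted lock state ever needs repair.

class ExtendibleHash:
    """Single-threaded sketch: a split rehashes one bucket, never the table."""

    def __init__(self):
        self.global_depth = 1
        self.directory = [Bucket(1), Bucket(1)]

    def _index(self, key):
        return hash(key) & ((1 << self.global_depth) - 1)

    def put(self, key, value):
        bucket = self.directory[self._index(key)]
        if key in bucket.items or len(bucket.items) < BUCKET_CAP:
            bucket.items[key] = value
            return
        self._split(bucket)   # localized rehash: only this bucket's entries move
        self.put(key, value)  # retry after the split

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            self.directory = self.directory * 2  # double directory: copies
            self.global_depth += 1               # pointers, not buckets
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth)
        bit = 1 << (bucket.local_depth - 1)      # new distinguishing bit
        for i, b in enumerate(self.directory):
            if b is bucket and (i & bit):
                self.directory[i] = sibling
        moved = {k: v for k, v in bucket.items.items() if hash(k) & bit}
        for k in moved:
            del bucket.items[k]
        sibling.items = moved

    def get(self, key):
        return self.directory[self._index(key)].items.get(key)
```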
Inspired by the concept of content-addressable retrieval from cognitive science, we propose a novel fragment-based Chinese named entity recognition (NER) model augmented with a lexicon-based memory, in which both character-level and word-level features are combined to generate better feature representations for possible entity names. Observing that the boundary information of entity names is particularly useful for locating them and classifying them into pre-defined categories, position-dependent features, such as prefixes and suffixes, are introduced for NER tasks in the form of distributed representations. The lexicon-based memory is built to help generate such position-dependent features and to deal with the problem of out-of-vocabulary words. Experimental results show that the proposed model, called LEMON, achieves state-of-the-art performance, with an increase in F1-score of up to 3.2% over the previous best models on four widely used NER datasets.
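As a rough illustration of the lexicon-based memory idea, the following sketch (our simplification, not LEMON's architecture) collects, for each character, the lexicon words that begin, end, or pass through it; these position-dependent sets are what would be embedded as prefix/suffix features. The toy lexicon and all names are ours.

```python
from collections import defaultdict

def lexicon_memory_features(sentence, lexicon, max_word_len=4):
    """For each character, gather lexicon words that start (prefix), end
    (suffix), or pass through it -- a simplified stand-in for the
    position-dependent features stored in a lexicon-based memory."""
    feats = [defaultdict(list) for _ in sentence]
    n = len(sentence)
    for i in range(n):
        for j in range(i + 1, min(n, i + max_word_len) + 1):
            word = sentence[i:j]
            if word in lexicon:
                feats[i]["prefix"].append(word)      # word begins here
                feats[j - 1]["suffix"].append(word)  # word ends here
                for k in range(i + 1, j - 1):
                    feats[k]["inside"].append(word)  # character is interior
    return feats

# Hypothetical usage with a toy lexicon:
lexicon = {"南京", "南京市", "长江", "长江大桥", "大桥"}
sentence = "南京市长江大桥"
for ch, f in zip(sentence, lexicon_memory_features(sentence, lexicon)):
    print(ch, dict(f))
```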
Tensors are a popular programming interface for developing artificial intelligence (AI) algorithms. Layout refers to the order in which tensor data are placed in memory; it affects performance through data locality, so each deep neural network library adopts its own layout convention. Since AI applications can use arbitrary layouts, and existing AI systems do not provide programming abstractions that shield developers from the layout conventions of libraries, operator developers need to write a large amount of layout-related code, which reduces the efficiency of integrating new libraries or developing new operators. Furthermore, developers place layout-conversion operations inside operators to cope with uncertainty about input layouts, thus losing opportunities for layout optimization. Based on the idea of polymorphism, we propose a layout-agnostic virtual tensor programming interface, the VTensor framework, which enables developers to write new operators without concern for the underlying physical layout of tensors. In addition, the VTensor framework performs global layout inference at runtime to transparently resolve the required layouts of virtual tensors, and applies runtime layout-oriented optimizations to globally minimize the number of layout transformation operations. Experimental results demonstrate that with VTensor, developers can avoid writing layout-dependent code. Compared with TensorFlow, for the 16 operations used in 12 popular networks, VTensor reduces the lines of code (LOC) required to write a new operation by 47.82% on average and improves overall performance by 18.65% on average.
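The following toy sketch (our invention, not VTensor's actual API) illustrates the layout-agnostic idea: an operator touches only logical data, while physical layouts such as NCHW or NHWC are materialized lazily and cached, so a conversion happens at most once and only where a consumer demands it.

```python
import numpy as np

class VTensorSketch:
    """Toy layout-agnostic tensor: logical data is fixed, the physical
    layout (e.g., NCHW vs NHWC) is resolved lazily on demand."""

    def __init__(self, data, layout):
        self._views = {layout: data}  # cache one physical copy per layout

    def any_view(self):
        return next(iter(self._views.items()))

    def in_layout(self, layout):
        if layout not in self._views:
            src_layout, src = self.any_view()
            perm = [src_layout.index(ax) for ax in layout]
            self._views[layout] = np.ascontiguousarray(src.transpose(perm))
        return self._views[layout]

# An operator written against VTensorSketch never mentions a layout:
def relu(vt):
    # reuse whichever physical view already exists -> zero conversions
    layout, data = vt.any_view()
    return VTensorSketch(np.maximum(data, 0), layout)

x = VTensorSketch(np.random.rand(1, 3, 8, 8).astype(np.float32), "NCHW")
y = relu(x)
print(y.in_layout("NHWC").shape)  # converted once, on demand: (1, 8, 8, 3)
```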
Logs contain runtime information for both systems and users. As many of them use natural language, a typical log-based analysis needs to parse logs into a structured format first. Existing parsing approaches often take two steps: the first is to find similar words (tokens) or sentences, and the second is to extract log templates by replacing the differing tokens with variable placeholders. However, we observe that most parsers concentrate on precisely grouping similar tokens or logs but lack a well-designed template extraction process, which leads to inconsistent accuracy on particular datasets. The root cause is the ambiguous definition of variable placeholders and of similar templates; the consequences include the abuse of variable placeholders, incorrectly divided templates, and an excessive number of templates over time. In this paper, we propose Cognition, an online log parsing approach. It first redefines variable placeholders via a strict lower bound to avoid ambiguity. Then, it applies our template correction technique to merge and absorb similar templates, eliminating the interference of commonly used parameters and thus keeping the number of templates stable. Evaluation on 16 public datasets shows that Cognition achieves better accuracy and consistency than the state-of-the-art approaches, while saving up to 52.1% of time cost on average compared with the others.
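To illustrate the template extraction step that the paper argues is usually under-designed, here is a deliberately simplified parser sketch (ours; Cognition's strict lower bound for placeholders is richer than this heuristic): same-length log lines are merged token by token, with mismatching tokens replaced by a placeholder, so similar templates get absorbed instead of accumulating.

```python
import re

PLACEHOLDER = "<*>"

def merge_templates(t1, t2):
    """Merge two same-length templates by replacing mismatched tokens with
    a variable placeholder -- a toy stand-in for a template correction step."""
    a, b = t1.split(), t2.split()
    if len(a) != len(b):
        return None  # different token counts: keep as separate templates
    return " ".join(x if x == y else PLACEHOLDER for x, y in zip(a, b))

def parse(logs):
    templates = []
    for line in logs:
        line = re.sub(r"\b\d+\b", PLACEHOLDER, line)  # obvious numerics first
        for i, t in enumerate(templates):
            merged = merge_templates(t, line)
            # Crude guard against placeholder abuse: reject a merge that would
            # turn more than half of the tokens into placeholders.
            if merged and merged.split().count(PLACEHOLDER) <= len(merged.split()) // 2:
                templates[i] = merged  # absorb: template count stays bounded
                break
        else:
            templates.append(line)
    return templates

logs = [
    "Connection from host-a closed after 532 ms",
    "Connection from host-b closed after 17 ms",
]
print(parse(logs))  # ['Connection from <*> closed after <*> ms']
```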
In multiagent systems, agents usually do not have complete information about the whole system, which makes the analysis of such systems hard. The incompleteness of information is normally modelled by means of accessibility relations, and the schedulers consistent with such relations are called uniform. In this paper, we consider probabilistic multiagent systems with accessibility relations and focus on the model checking problem with respect to probabilistic epistemic temporal logic, which can specify both temporal and epistemic properties. Although the problem is undecidable in general, we show that it becomes decidable when restricted to memoryless uniform schedulers. We then present two algorithms for this case: one reduces the model checking problem to a mixed integer non-linear programming (MINLP) problem, which can then be solved by Satisfiability Modulo Theories (SMT) solvers; the other is an approximate algorithm based on the upper confidence bounds applied to trees (UCT) algorithm, which can return a result whenever queried. Both algorithms have been implemented in an existing model checker and validated through experiments. The experimental results show the efficiency and extensibility of these algorithms, with the UCT-based algorithm outperforming the MINLP-based one in most cases.
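The MINLP reduction can be pictured on a toy example. The sketch below (our own two-state MDP, not a model from the paper) encodes the Bellman equation under a memoryless scheduler in the Z3 SMT solver; the product of scheduler variables and probability variables is precisely what makes the program nonlinear rather than a plain LP.

```python
from z3 import Real, Solver, sat

# Toy MDP (ours): s0 initial, s1 goal, s2 absorbing-bad.
# Action a from s0: 0.8 -> s1, 0.2 -> s0.  Action b: 0.3 -> s1, 0.7 -> s2.
p0 = Real("p0")                   # P(reach goal | start in s0)
sa, sb = Real("sa"), Real("sb")   # memoryless scheduler: Pr(a|s0), Pr(b|s0)

slv = Solver()
slv.add(sa >= 0, sb >= 0, sa + sb == 1)  # the choice is a distribution
slv.add(p0 >= 0, p0 <= 1)
# Uniformity would additionally force indistinguishable states to share a
# distribution; with a single decision state here that constraint is trivial.
# Bellman equation: scheduler vars multiply probability vars -> nonlinear.
slv.add(p0 == sa * (0.8 + 0.2 * p0) + sb * 0.3)
slv.add(p0 >= 0.9)  # query: is P(reach goal) >= 0.9 achievable?

if slv.check() == sat:
    m = slv.model()
    print("scheduler found:", m[sa], m[sb], "p0 =", m[p0])
```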