Absent Subsequences in Words
Maria Kosche, Tore Koß, Florin Manea, Stefan Siemer
Fundamenta Informaticae, published 2023-10-26. DOI: 10.3233/fi-222159 (https://doi.org/10.3233/fi-222159)
Abstract
An absent factor of a string w is a string u which does not occur as a contiguous substring (a.k.a. factor) inside w. We extend this well-studied notion and define absent subsequences: a string u is an absent subsequence of a string w if u does not occur as a subsequence (a.k.a. scattered factor) inside w. Of particular interest to us are minimal absent subsequences, i.e., absent subsequences all of whose proper subsequences occur in w, and shortest absent subsequences, i.e., absent subsequences of minimal length. We show a series of combinatorial and algorithmic results regarding these two notions. For instance: we give combinatorial characterisations of the sets of minimal and, respectively, shortest absent subsequences in a word, as well as compact representations of these sets; we show how to test efficiently whether a string is a shortest or minimal absent subsequence in a word, and we give efficient algorithms computing the lexicographically smallest absent subsequence of each kind; also, we show how a data structure for answering shortest-absent-subsequence queries for the factors of a given string can be computed efficiently.
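The definitions above can be illustrated with a minimal Python sketch. It is not the paper's algorithms or data structures, which the abstract does not spell out; all function names are hypothetical. The sketch assumes a non-empty alphabet passed in explicitly. For the length of a shortest absent subsequence it uses the standard greedy observation that this length is one more than the number of consecutive blocks of w that each contain every letter of the alphabet.

```python
def is_subsequence(u: str, w: str) -> bool:
    """Return True if u occurs in w as a (scattered) subsequence."""
    it = iter(w)
    # Each membership test consumes the iterator up to the matched letter,
    # so the letters of u are matched left to right in w.
    return all(ch in it for ch in u)


def is_minimal_absent_subsequence(u: str, w: str) -> bool:
    """u is absent from w, but every proper subsequence of u occurs in w."""
    if is_subsequence(u, w):
        return False
    # Checking the strings obtained by deleting one letter is enough:
    # any proper subsequence of u is a subsequence of one of them.
    return all(is_subsequence(u[:i] + u[i + 1:], w) for i in range(len(u)))


def shortest_absent_length(w: str, alphabet: str) -> int:
    """Length of a shortest absent subsequence of w over a non-empty alphabet.

    Assumption of this sketch: greedily split w into blocks that each
    contain every letter of the alphabet; the answer is the number of
    complete blocks plus one.
    """
    sigma = set(alphabet)
    seen = set()
    blocks = 0
    for ch in w:
        if ch in sigma:
            seen.add(ch)
            if len(seen) == len(sigma):
                blocks += 1
                seen.clear()
    return blocks + 1


def is_shortest_absent_subsequence(u: str, w: str, alphabet: str) -> bool:
    """u is over the alphabet, is absent from w, and has the minimal length."""
    return (
        set(u) <= set(alphabet)
        and len(u) == shortest_absent_length(w, alphabet)
        and not is_subsequence(u, w)
    )
```

For example, with w = "abcab" over the alphabet {a, b, c}, the string "cc" is a shortest absent subsequence, while "aaa" is a minimal absent subsequence that is not shortest, and "cca" is absent but not minimal because its subsequence "cc" is already absent.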
About the journal:
Fundamenta Informaticae is an international journal publishing original research results in all areas of theoretical computer science. Papers are encouraged that contribute:
solutions, by mathematical methods, to problems emerging in computer science,
solutions to mathematical problems inspired by computer science.
Topics of interest include (but are not restricted to):
theory of computing,
complexity theory,
algorithms and data structures,
computational aspects of combinatorics and graph theory,
programming language theory,
theoretical aspects of programming languages,
computer-aided verification,
computer science logic,
database theory,
logic programming,
automated deduction,
formal languages and automata theory,
concurrency and distributed computing,
cryptography and security,
theoretical issues in artificial intelligence,
machine learning,
pattern recognition,
algorithmic game theory,
bioinformatics and computational biology,
quantum computing,
probabilistic methods,
algebraic and categorical methods.