We show how to obtain, via a unified framework provided by logic and automata theory, many classical results of Brillhart and Morton on Rudin-Shapiro sums. The techniques also facilitate easy proofs for new results.
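As a small illustration of the objects involved, the following sketch computes the Rudin-Shapiro sequence and its partial sums by direct enumeration. It is not the logic-and-automata method of the abstract. The function names are chosen here for illustration, and the inequality checked at the end, sqrt(3n/5) < s(n) < sqrt(6n) for n >= 1, is the form in which the Brillhart-Morton bound is usually quoted.

```python
def rudin_shapiro(n):
    """r(n) = (-1)^e, where e counts (possibly overlapping) occurrences of 11 in the binary expansion of n."""
    # n & (n >> 1) has a 1 bit exactly where two adjacent bits of n are both 1.
    return -1 if bin(n & (n >> 1)).count("1") % 2 else 1


def rudin_shapiro_sums(limit):
    """Return the list [s(0), ..., s(limit)], where s(n) = r(0) + ... + r(n)."""
    sums, total = [], 0
    for i in range(limit + 1):
        total += rudin_shapiro(i)
        sums.append(total)
    return sums


def satisfies_bound(n, s):
    """Check 3n/5 < s^2 < 6n using integer arithmetic only."""
    return 3 * n < 5 * s * s and s * s < 6 * n


sums = rudin_shapiro_sums(1023)
bound_holds = all(satisfies_bound(n, sums[n]) for n in range(1, 1024))
```

Such a brute-force check only confirms the inequality on a finite range. A proof for all n is what the cited results provide.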
We study the following natural variant of the budgeted maximum coverage problem: We are given a budget B and a hypergraph [Formula: see text], where each vertex has a non-negative cost and a non-negative profit. The goal is to select a set of hyperedges [Formula: see text] such that the total cost of the vertices covered by T is at most B and the total profit of all covered vertices is maximized. This is a natural generalization of the maximum coverage problem. Our interest in this problem stems from its application to bid optimization in sponsored search auctions. It is easily seen that this problem is at least as hard as budgeted maximum coverage (where the costs are associated with the selected hyperedges instead of the covered vertices). This implies [Formula: see text]-inapproximability for any [Formula: see text]. Furthermore, standard greedy approaches do not yield constant factor approximations for our variant of the problem. In fact, through a reduction from Densest k-Subgraph, it can be established that our problem is inapproximable up to a constant factor, conditional on the exponential time hypothesis. Our main results are as follows: (i.) We obtain a [Formula: see text]-approximation algorithm for graphs. (ii.) We derive a fully polynomial-time approximation scheme (FPTAS) if the incidence graph of the hypergraph is a forest (i.e., the hypergraph is Berge-acyclic). We extend this result to incidence graphs with a fixed-size feedback hyperedge node set. (iii.) We give a [Formula: see text]-approximation algorithm for all [Formula: see text], where d is the maximum vertex degree.
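The problem statement above can be made concrete with an exhaustive search over sets of hyperedges. This is a minimal sketch for toy instances only: it runs in exponential time and is unrelated to the approximation algorithms of the abstract. The function name, the input encoding (hyperedges as sets of vertices, costs and profits as dictionaries) and the example instance are assumptions made here.

```python
from itertools import combinations


def best_selection(hyperedges, cost, profit, budget):
    """Return (profit, chosen hyperedge indices) maximizing covered profit
    subject to: total cost of the COVERED VERTICES is at most the budget."""
    best_profit, best_choice = 0, ()
    indices = range(len(hyperedges))
    for size in range(len(hyperedges) + 1):
        for choice in combinations(indices, size):
            # Vertices covered by the selected hyperedges (each counted once).
            covered = set().union(*(hyperedges[i] for i in choice))
            total_cost = sum(cost[v] for v in covered)
            total_profit = sum(profit[v] for v in covered)
            if total_cost <= budget and total_profit > best_profit:
                best_profit, best_choice = total_profit, choice
    return best_profit, best_choice


# Hypothetical toy instance: a path on four vertices, viewed as a hypergraph.
edges = [{1, 2}, {2, 3}, {3, 4}]
cost = {1: 1, 2: 2, 3: 2, 4: 3}
profit = {1: 1, 2: 5, 3: 4, 4: 10}
result_b5 = best_selection(edges, cost, profit, budget=5)
result_b4 = best_selection(edges, cost, profit, budget=4)
```

Note that selecting the first two hyperedges together costs 5, not 7, because the shared vertex 2 is paid for only once. This is the feature that distinguishes the variant from budgeted maximum coverage with costs on hyperedges.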
An elastic-degenerate (ED) string is a sequence of n finite sets of strings of total length N, introduced to represent a set of related DNA sequences, also known as a pangenome. The ED string matching (EDSM) problem consists in reporting all occurrences of a pattern of length m in an ED text. The EDSM problem has recently received some attention by the combinatorial pattern matching community, culminating in an \(\tilde{\mathcal{O}}(nm^{\omega-1})+\mathcal{O}(N)\)-time algorithm [Bernardini et al., SIAM J. Comput. 2022], where \(\omega\) denotes the matrix multiplication exponent and the \(\tilde{\mathcal{O}}(\cdot)\) notation suppresses polylog factors. In the k-EDSM problem, the approximate version of EDSM, we are asked to report all pattern occurrences with at most k errors. k-EDSM can be solved in \(\mathcal{O}(k^2mG+kN)\) time, under edit distance, or \(\mathcal{O}(kmG+kN)\) time, under Hamming distance, where G denotes the total number of strings in the ED text [Bernardini et al., Theor. Comput. Sci. 2020]. Unfortunately, G is only bounded by N, and so even for \(k=1\), the existing algorithms run in \(\Omega(mN)\) time in the worst case. In this paper we make progress in this direction. We show that 1-EDSM can be solved in \(\mathcal{O}((nm^2+N)\log m)\) or \(\mathcal{O}(nm^3+N)\) time under edit distance. For the decision version of the problem, we present a faster \(\mathcal{O}(nm^2\sqrt{\log m}+N\log\log m)\)-time algorithm. We also show that 1-EDSM can be solved in \(\mathcal{O}(nm^2+N\log m)\) time under Hamming distance. Our algorithms for edit distance rely on non-trivial reductions from 1-EDSM to special instances of classic computational geometry problems (2d rectangle stabbing or 2d range emptiness), which we show how to solve efficiently. In order to obtain an even faster algorithm for Hamming distance, we rely on employing and adapting the k-errata trees for indexing with errors [Cole et al., STOC 2004].
This is an extended version of a paper presented at LATIN 2022.
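To make the decision version of (k-)EDSM under Hamming distance concrete, the sketch below expands a toy ED text into every string it represents and scans each one. This is a naive, exponential-time baseline and is not one of the algorithms of the paper. The function names and the example ED text are assumptions introduced here.

```python
from itertools import product


def expansions(ed_text):
    """Yield every string obtained by picking one string from each set, in order."""
    for choice in product(*ed_text):
        yield "".join(choice)


def occurs_exact(ed_text, pattern):
    """Decision version of EDSM: does the pattern occur in some expansion?"""
    return any(pattern in s for s in expansions(ed_text))


def occurs_hamming(ed_text, pattern, k=1):
    """Decision version of k-EDSM under Hamming distance (at most k mismatches)."""
    m = len(pattern)
    for s in expansions(ed_text):
        for i in range(len(s) - m + 1):
            mismatches = sum(a != b for a, b in zip(s[i:i + m], pattern))
            if mismatches <= k:
                return True
    return False


# Hypothetical ED text with n = 3 sets; the empty string models a deletion.
ed_text = [["A"], ["C", "TG", ""], ["GT"]]
all_strings = sorted(expansions(ed_text))
```

The number of expansions is the product of the set sizes, which is why algorithms whose running time depends on n, m and N rather than on the number of expansions are of interest.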
First studied by Kempa and Prezza in 2018 as the unifying idea behind text compression algorithms, string attractors have become a compelling object of theoretical research within the community of combinatorics on words. In this context, they have been studied for several families of finite and infinite words. In this paper, we focus on string attractors of prefixes of particular automatic infinite words (including the famous period-doubling and k-bonacci words) related to simple-Parry numbers. For a subfamily of these words, we describe string attractors of optimal size, while for the rest of them, we provide nearly optimal-size ones. Such a contribution is of particular interest, since in general finding smallest string attractors is NP-hard. This extends our previous work published at the international conference WORDS 2023.
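For readers unfamiliar with the notion, a string attractor of a word w is a set of positions such that every factor of w has at least one occurrence covering one of these positions. The sketch below checks this definition directly and finds a smallest attractor by exhaustive search. It uses 0-based positions, the function names are assumptions made here, and the search is exponential, in line with the NP-hardness mentioned above.

```python
from itertools import combinations


def is_string_attractor(word, positions):
    """Check that every factor of word has an occurrence covering a chosen position."""
    chosen = set(positions)
    n = len(word)
    for length in range(1, n + 1):
        seen, covered = set(), set()
        for start in range(n - length + 1):
            factor = word[start:start + length]
            seen.add(factor)
            # This occurrence spans positions start .. start + length - 1.
            if any(start <= p < start + length for p in chosen):
                covered.add(factor)
        if seen != covered:
            return False
    return True


def smallest_string_attractor(word):
    """Return a smallest attractor (as a tuple of positions) by brute force."""
    for size in range(1, len(word) + 1):
        for candidate in combinations(range(len(word)), size):
            if is_string_attractor(word, candidate):
                return candidate
    return ()


# "abaa" is a prefix of the period-doubling word (morphism a -> ab, b -> aa).
attractor = smallest_string_attractor("abaa")
```

In this example the positions {0, 1} fail because the factor aa occurs only at positions 2 and 3, whereas {1, 2} covers an occurrence of every factor.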
Jumping automata are finite automata that read their input in a non-consecutive manner, disregarding the order of the letters in the word. We introduce and study jumping automata over infinite words. Unlike the setting of finite words, which has been well studied, for infinite words it is not clear how words can be reordered. To this end, we consider three semantics: automata that read the infinite word in some order so that no letter is overlooked, automata that can permute the word in windows of a given size k, and automata that can permute the word in windows of an existentially-quantified bound. We study expressiveness, closure properties and algorithmic properties of these models.
It is known that the set of solutions of any constant-free three-variable word equation can be represented using parametric words, and the number of numerical parameters and the level of nesting in these parametric words is at most logarithmic with respect to the length of the equation. We show that this result can be significantly improved in the case of unbalanced equations, that is, equations where at least one variable has a different number of occurrences on the left-hand side and on the right-hand side. More specifically, it is sufficient to have two numerical parameters and one level of nesting in this case. We also discuss the possibility of proving a similar result for balanced equations in the future.
A classical result by Myerson (Math. Oper. Res. 6(1), 58-73, 1981) gives a characterization of an optimal auction for any given distribution of valuations of the bidders. We consider the situation where the distribution is not explicitly given but can be observed in a sample of auction results from the same distribution. A seminal paper by Morgenstern and Roughgarden (Adv. Neural Inf. Process. Syst. 28, 2015) proposes to learn a near-optimal auction from the hypothesis class of t-level auctions. They prove a bound on the sample complexity, i.e., the function \(f(\varepsilon,\delta)\) of required samples to guarantee a certain level of precision \((1-\varepsilon)\) with a probability of at least \((1-\delta)\), for the general single-parameter case and a tighter bound for the very restricted matroid case. We show a new bound for the case of independence systems, which widely generalize matroids and contain several important combinatorial optimization problems. This bound of \(\tilde{O}\left(\nicefrac{H^2n^4}{\varepsilon^3}\right)\) falls neatly between those known for the general and the matroid case. The class of independence systems contains several well-known NP-hard problems such as knapsack. Therefore, the allocation itself might in practice be limited to \(\alpha\)-approximate solutions. In a second result we show that an approximation algorithm can be used without compromising the sample complexity. Also, the precision is affected only mildly, resulting in a factor of \(\alpha\cdot(1-\varepsilon)\).
We define a new class of ternary sequences that are 2-balanced. These sequences are obtained by colouring of Sturmian sequences. We show that the class contains sequences of any given letter frequencies. We provide an upper bound on factor and abelian complexity of these sequences. Using the interpretation by rectangle exchange transformation, we prove that for almost all triples of letter frequencies, the upper bound on factor and abelian complexity is reached. The bound on factor complexity is given using a number-theoretical function which we compute explicitly for a class of parameters.
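The balance property can be checked on finite words by brute force. A word is C-balanced if, for every letter a and every two factors u, v of equal length, the numbers of occurrences of a in u and in v differ by at most C. The sketch below computes the smallest such C for a finite word. It does not implement the colouring construction of the abstract, which is not specified here. The function names are assumptions, and the Fibonacci word is used only as a standard example of a Sturmian (hence 1-balanced) sequence.

```python
def balance(word):
    """Smallest C such that the finite word is C-balanced with respect to its own factors."""
    n = len(word)
    worst = 0
    for length in range(1, n + 1):
        for letter in set(word):
            # Occurrences of the letter in every factor of the given length.
            counts = [word[i:i + length].count(letter) for i in range(n - length + 1)]
            worst = max(worst, max(counts) - min(counts))
    return worst


def fibonacci_prefix(n):
    """Prefix of length n of the Fibonacci word, the fixed point of a -> ab, b -> a."""
    word = "a"
    while len(word) < n:
        word = "".join("ab" if c == "a" else "a" for c in word)
    return word[:n]


fib = fibonacci_prefix(34)
fib_balance = balance(fib)
```

A finite check of this kind considers only the factors that occur in the given prefix, so it can refute C-balancedness of an infinite sequence but cannot establish it.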
This paper studies obstructions to preservation of return sets by episturmian morphisms. We show, by way of an explicit construction, that infinitely many obstructions exist. This generalizes and improves an earlier result about Sturmian morphisms.

