Specific data compression techniques, formalized by the concept of coresets, have proved to be powerful for many optimization problems. While tightly controlling the approximation error, coresets can speed up computations significantly and hence allow algorithms to be extended to much larger problem sizes. The present paper deals with a weight-balanced clustering problem and is specifically motivated by an application in materials science where a voxel-based image is to be processed into a diagram representation. Here, the class of desired coresets is naturally confined to those that can be viewed as lowering the resolution of the input data. While one might expect such resolution coresets to be inferior to unrestricted coresets, we prove bounds for resolution coresets that improve on known bounds in the relevant dimensions and also lead to significantly faster algorithms in practice.
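To make the idea of "lowering the resolution" concrete, the following is a minimal sketch, not the construction analyzed in the paper. It assumes the input is a set of weighted voxels on an integer grid and merges every cubic cell of side `block` into a single point placed at the weighted centroid and carrying the cell's total weight. The function name `resolution_coreset` and the parameter `block` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def resolution_coreset(voxels, block):
    """Merge each block x block x block cell of voxels into one weighted point.

    voxels: dict mapping integer (x, y, z) coordinates to a nonnegative weight.
    block:  side length of a coarse cell, in voxels (illustrative parameter).
    Returns a list of (centroid, total_weight) pairs, one per non-empty cell.
    """
    weight = defaultdict(float)                     # total weight per coarse cell
    moment = defaultdict(lambda: [0.0, 0.0, 0.0])   # weighted coordinate sums

    for (x, y, z), w in voxels.items():
        cell = (x // block, y // block, z // block)
        weight[cell] += w
        m = moment[cell]
        m[0] += w * x
        m[1] += w * y
        m[2] += w * z

    coarse = []
    for cell in sorted(weight):
        total = weight[cell]
        if total > 0:  # skip cells that carry no weight
            centroid = tuple(c / total for c in moment[cell])
            coarse.append((centroid, total))
    return coarse


# A 4 x 4 x 4 image with unit weights, coarsened by a factor of 2 per axis.
voxels = {(x, y, z): 1.0 for x in range(4) for y in range(4) for z in range(4)}
coarse = resolution_coreset(voxels, block=2)
```

By construction this coarsening preserves the total weight and the weighted mean of the data; whether it also controls the clustering cost to a given approximation error is exactly what the coreset bounds are about, and the sketch makes no claim in that direction.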
We seek to extract a small number of representative scenarios from large panel data that are consistent with sample moments. Of the two novel algorithms we propose, the first identifies scenarios that have not been observed before and comes with a scenario-based representation of covariance matrices. The second selects important data points from states of the world that have already been realized and that are consistent with higher-order sample moment information. Both algorithms are efficient to compute and lend themselves to consistent scenario-based modeling and multi-dimensional numerical integration that can be used for interpretable decision-making under uncertainty. Extensive numerical benchmarking studies and an application in portfolio optimization favor the proposed algorithms.
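As an illustration of what it means for a small scenario set to be consistent with sample moments, the sketch below builds 2d equally weighted scenarios that reproduce a given mean vector and covariance matrix exactly. This is a standard symmetric construction based on a Cholesky factor and is an assumption made for illustration; it is not one of the two algorithms proposed in the paper. The names `cholesky` and `symmetric_scenarios` are hypothetical.

```python
import math


def cholesky(cov):
    """Lower-triangular L with L L^T = cov; cov must be symmetric positive definite."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def symmetric_scenarios(mean, cov):
    """Return 2d scenarios and probabilities matching the given mean and covariance.

    Scenario pairs are mean +/- sqrt(d) * (i-th column of L), each with
    probability 1 / (2d). Symmetry gives the mean; the scaling gives
    sum_s p_s (x_s - mean)(x_s - mean)^T = L L^T = cov.
    """
    d = len(mean)
    L = cholesky(cov)
    scale = math.sqrt(d)
    scenarios = []
    for i in range(d):
        column = [L[r][i] for r in range(d)]
        for sign in (1.0, -1.0):
            scenarios.append([mean[r] + sign * scale * column[r] for r in range(d)])
    probabilities = [1.0 / (2 * d)] * (2 * d)
    return scenarios, probabilities


# Two assets with illustrative (made-up) moments.
mean = [1.0, 2.0]
cov = [[4.0, 1.0], [1.0, 3.0]]
scenarios, probabilities = symmetric_scenarios(mean, cov)
```

Such a construction matches only the first two moments and generally produces points that were never observed; matching higher-order moments, or restricting scenarios to realized data points, requires a different approach.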