To overcome the limitations of static feature extraction and inefficient context modeling in existing learned image compression, this paper proposes an image compression algorithm that integrates a Depth-aware Adaptive Transformation (DAT) framework with a Multi-reference Dynamic Entropy Model (MDEM). A proposed Multi-scale Capacity-aware Feature Enhancer (MCFE) module is adaptively embedded into the network to strengthen feature extraction. The DAT architecture combines a variational autoencoder framework with the MCFE to increase the density of the latent representations. Furthermore, an improved soft-threshold sparse attention mechanism is combined with a multi-context model, using adaptive weights to eliminate spatial redundancy in the latent representations across local, non-local, and global dimensions, while a channel context is introduced to capture inter-channel dependencies. Building on this, the MDEM fuses the side information provided by the DAT with the spatial and channel context information and employs a channel-wise autoregressive model to achieve precise entropy probability estimation, which improves compression performance. Evaluated on the Kodak, Tecnick, and CLIC (Challenge on Learned Image Compression) Professional Validation datasets, the proposed method achieves BD-rate (Bjøntegaard Delta rate) gains of , , and , respectively, compared to the VTM (Versatile Video Coding Test Model) 17.0 benchmark. The proposed algorithm thus overcomes the limitations of fixed-context and static feature extraction strategies, enabling precise probability estimation and superior compression performance through dynamic resource allocation and multi-dimensional contextual modeling.
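The soft-threshold sparse attention mentioned above prunes weak spatial correlations before the attention weights are normalized, so each position attends only to a sparse set of references. A minimal sketch of the thresholding step is shown below; the function names, the fallback behavior, and the threshold value `tau` are illustrative assumptions, not the paper's exact formulation (in practice the threshold is typically learned and applied to tensors of attention scores):

```python
import math

def soft_threshold(scores, tau):
    """Shrink each score toward zero; entries with |s| <= tau become exactly 0."""
    return [math.copysign(max(abs(s) - tau, 0.0), s) for s in scores]

def sparse_attention_weights(scores, tau):
    """Soft-threshold the raw similarity scores, then renormalize the survivors.

    Small (noisy) similarities are zeroed out, so the resulting weights are
    sparse instead of a dense softmax over all positions.
    """
    shrunk = soft_threshold(scores, tau)
    kept = [max(s, 0.0) for s in shrunk]   # keep only non-negative responses
    total = sum(kept)
    if total == 0.0:                       # nothing survives: fall back to uniform
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in kept]

# Example: the weak similarities (0.1 and 0.2) are pruned entirely.
w = sparse_attention_weights([0.9, 0.1, 0.6, 0.2], tau=0.3)
print([round(x, 3) for x in w])  # → [0.667, 0.0, 0.333, 0.0]
```

The hard zeros produced by the threshold are what distinguish this from an ordinary softmax, whose weights are strictly positive everywhere.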