Compute-in-memory (CIM) enables efficient deep neural network (DNN) implementation, but suffers from the area and energy overhead of analog-to-digital converters (ADCs) and from crossbar arrays with limited cell precision. Low-precision ADCs mitigate these overheads but introduce partial-sum quantization errors that degrade accuracy. Limited cell precision imposes low-bit weight constraints that further challenge network accuracy. Although prior work has focused on fine-grained partial-sum quantization to reduce ADC resolution, weight granularity, which is crucial for achieving high accuracy, remains underexplored. With low-precision cells, weight decomposition is commonly employed to represent signed weights, but conventional zero-anchored schemes restrict the representable resolution. We address these issues by integrating unanchored weight decomposition with column-wise alignment of weight and partial-sum quantization. Our method improves accuracy without increasing dequantization overhead, simplifies training by eliminating two-stage processes, and maximizes the number of representable weight quantization levels. We also introduce an open-source CIM-oriented convolution framework that manages fine-grained weights and partial sums through novel tiling and group convolution. Experimental results demonstrate accuracy improvements of up to 4.05% over state-of-the-art methods, highlighting the effectiveness of our quantization scheme in enhancing accuracy while maintaining hardware efficiency in CIM accelerators. Our code is available at https://github.com/jiyoonkm/ColumnQuant.
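To make the zero-anchored versus unanchored contrast concrete, below is a minimal NumPy sketch of signed-weight decomposition onto unsigned crossbar cells. It compares a conventional zero-anchored split with a generic offset-shifted (unanchored) mapping whose offset contribution is removed digitally after the analog multiply-accumulate. The 2-bit cell assumption, the function names, and the digital offset-correction step are illustrative choices for exposition, not the paper's exact scheme.

```python
import numpy as np

K = 2                       # assumed cell precision: 2-bit cells
LEVELS = 2 ** K             # 4 unsigned conductance levels, 0..3

def zero_anchored(w):
    """Zero-anchored decomposition: w = w_pos - w_neg, anchored at 0.

    One component is always zero, so each of the two columns only uses
    levels 0..LEVELS-1 and the anchor pins the quantization grid to zero.
    """
    w_pos = np.clip(w, 0, LEVELS - 1)
    w_neg = np.clip(-w, 0, LEVELS - 1)
    return w_pos, w_neg     # covers -(LEVELS-1) .. LEVELS-1

def unanchored(w, offset):
    """Unanchored mapping (illustrative): shift the signed range by an
    offset so all LEVELS cell values are usable; the offset's MAC
    contribution is subtracted digitally after the column read-out.
    """
    return np.clip(w + offset, 0, LEVELS - 1)

x = np.array([1, 0, 1, 1])        # binary input vector on the word lines
w = np.array([-2, -1, 0, 1])      # signed integer weights for one column
offset = 2                        # maps -2..1 onto cell levels 0..3

col = unanchored(w, offset)
mac = int(x @ col) - offset * int(x.sum())   # digital offset correction
assert mac == int(x @ w)                     # recovers the signed dot product
```

Because the offset correction only needs the sum of the inputs, it can be computed once per column and reused, which is why such a mapping can free up the full cell range without adding per-weight dequantization work.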
