Background: Although computed tomography (CT) is widely employed in disease detection, its X-ray radiation may pose a risk to patient health. Reducing the number of projection views is a common dose-reduction strategy; however, the resulting sparse-view reconstructions often suffer from streak artifacts.
Purpose: Previous work has shown that the convolutional neural network (CNN) is proficient at extracting local features, while the Transformer is adept at capturing global information. To suppress streak artifacts in sparse-view CT, this study aims to develop a method that combines the advantages of the CNN and the Transformer.
Methods: In this paper, we propose a Multi-Attention and Dual-Branch Feature Aggregation U-shaped Transformer network (MAFA-Uformer), which consists of two branches: a CNN branch and a Transformer branch. First, through a coordinate attention mechanism, the Transformer branch captures the overall structure and orientation information, providing a global contextual understanding of the image under reconstruction. Second, the CNN branch focuses on extracting crucial local features through channel-spatial attention, enhancing detail-recognition capability. Finally, a feature fusion module effectively integrates the global information from the Transformer branch with the local features from the CNN branch.
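As a rough illustration only (not the authors' implementation, whose details are not given in this abstract), the dual-branch aggregation described above can be sketched in NumPy with heavily simplified, non-learned attention operations; all function names, shapes, and the averaging "fusion" are assumptions for the sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coordinate_attention(feat):
    """Simplified coordinate attention: pool along H and W separately,
    then reweight the feature map with direction-aware gates."""
    # feat has shape (C, H, W)
    h_pool = feat.mean(axis=2, keepdims=True)   # (C, H, 1): row-wise context
    w_pool = feat.mean(axis=1, keepdims=True)   # (C, 1, W): column-wise context
    return feat * sigmoid(h_pool) * sigmoid(w_pool)

def channel_spatial_attention(feat):
    """Simplified channel-spatial attention: a global channel gate
    followed by a per-pixel spatial gate."""
    c_gate = sigmoid(feat.mean(axis=(1, 2), keepdims=True))  # (C, 1, 1)
    gated = feat * c_gate
    s_gate = sigmoid(gated.mean(axis=0, keepdims=True))      # (1, H, W)
    return gated * s_gate

def fuse(global_feat, local_feat):
    """Feature fusion stand-in: concatenate along channels, then merge.
    A fixed average replaces what would be a learned 1x1 convolution."""
    stacked = np.concatenate([global_feat, local_feat], axis=0)  # (2C, H, W)
    c = global_feat.shape[0]
    return 0.5 * (stacked[:c] + stacked[c:])

x = np.random.rand(8, 16, 16)       # toy feature map (C, H, W)
g = coordinate_attention(x)         # Transformer-branch stand-in
l = channel_spatial_attention(x)    # CNN-branch stand-in
y = fuse(g, l)
print(y.shape)  # (8, 16, 16): fused map keeps the input shape
```

The point of the sketch is the data flow, global gating in one branch, local gating in the other, then channel-wise fusion, rather than the specific operators, which in the actual network are learned modules inside a U-shaped encoder-decoder.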
Results: Experimental results demonstrate that our method achieves outstanding performance in terms of peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and root mean square error (RMSE). Compared with Restormer, our model achieves significant improvements: PSNR increases by 0.76 dB, SSIM improves by 0.44%, and RMSE decreases by 8.55%.
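For reference, two of the three metrics reported above (PSNR and RMSE) have the standard definitions sketched below; SSIM is more involved and is usually taken from a library such as scikit-image. The toy images and the `data_range=1.0` normalization are assumptions for illustration.

```python
import numpy as np

def rmse(ref, img):
    """Root mean square error between a reference and a reconstruction."""
    return np.sqrt(np.mean((ref - img) ** 2))

def psnr(ref, img, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, data_range]."""
    return 20.0 * np.log10(data_range / rmse(ref, img))

ref = np.ones((4, 4))            # toy 4x4 reference image
rec = ref.copy()
rec[0, 0] = 0.9                  # single-pixel error of 0.1
print(round(rmse(ref, rec), 4))  # 0.025
print(round(psnr(ref, rec), 2))  # 32.04 dB
```

Higher PSNR and lower RMSE both indicate a reconstruction closer to the reference, which is the direction of the improvements reported over Restormer.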
Conclusion: Our method not only effectively suppresses artifacts but also better preserves details and features, thereby providing robust support for accurate diagnosis from CT images.