The use of artificial intelligence methods in computer-assisted diagnosis systems is critical for colorectal cancer survival analysis and prognosis. However, because single-modal data yield low prediction accuracy and multimodal data fusion methods are complex, existing research has had limited impact on colorectal cancer prognosis. To address this issue, the authors propose a multimodal cross-attention fusion (MMCAF) method for predicting colorectal cancer survival status. First, feature engineering is used to construct a feature set for each modality and to address the heterogeneity of multimodal data. Second, a three-modality fusion strategy assigns weights to single-modal and multimodal features through channel and cross-attention mechanisms. Finally, the cross-entropy loss function is minimized to classify survival status. The experimental results show that MMCAF predicts survival status with 97.73% accuracy and an area under the receiver operating characteristic curve (AUC) of 0.99. Compared with the best result among the other fusion algorithms (feature concatenation), the prediction accuracy increases by about 6 percentage points and the AUC increases by 7 percentage points. These results demonstrate MMCAF's efficacy in predicting colorectal cancer survival.
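The fusion step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the architecture, so the squeeze-and-excitation-style channel gate, the single-head scaled dot-product cross-attention, the feature dimensions, the random weights, and every name used here (`channel_attention`, `cross_attention`, `feats`, and so on) are assumptions chosen only to show how three modality feature vectors might be weighted, fused, and scored with a cross-entropy loss.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(x, w1, w2):
    # Assumed SE-style gate: rescale each feature channel of x (batch, d)
    # by a learned weight in (0, 1).
    hidden = np.maximum(x @ w1, 0.0)              # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))   # sigmoid gate, shape (batch, d)
    return x * gate

def cross_attention(query, context, wq, wk, wv):
    # query: (batch, d) from one modality.
    # context: (batch, n_ctx, d) stacked features of the other modalities.
    q = query @ wq
    k = context @ wk
    v = context @ wv
    scores = np.einsum("bd,bnd->bn", q, k) / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)            # attention over the other modalities
    return np.einsum("bn,bnd->bd", weights, v)

def cross_entropy(probs, labels):
    # Mean negative log-likelihood of the true class.
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

rng = np.random.default_rng(0)
batch, d, n_classes = 4, 8, 2                     # hypothetical sizes

# Three hypothetical modality feature sets, already engineered to a common width d.
feats = [rng.normal(size=(batch, d)) for _ in range(3)]
labels = np.array([0, 1, 1, 0])                   # hypothetical survival labels

# Random (untrained) parameters, shared across modalities for brevity.
w1, w2 = rng.normal(size=(d, d // 2)), rng.normal(size=(d // 2, d))
wq, wk, wv = (rng.normal(size=(d, d)) for _ in range(3))

fused_parts = []
for i, x in enumerate(feats):
    others = np.stack([f for j, f in enumerate(feats) if j != i], axis=1)
    fused_parts.append(channel_attention(x, w1, w2))            # single-modal weighting
    fused_parts.append(cross_attention(x, others, wq, wk, wv))  # multimodal weighting
fused = np.concatenate(fused_parts, axis=1)       # shape (batch, 6 * d)

w_cls = rng.normal(size=(fused.shape[1], n_classes)) * 0.1
probs = softmax(fused @ w_cls, axis=-1)           # predicted class probabilities
loss = cross_entropy(probs, labels)               # quantity a trainer would minimize
```

In a real model the weights would be learned by minimizing `loss` with gradient descent; here they are random, so the probabilities carry no predictive meaning and only the data flow is illustrative.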