Computed tomography (CT) is commonly used as a diagnostic and treatment-planning imaging modality in craniomaxillofacial (CMF) surgery to correct patients' bony defects. A major disadvantage of CT is that it exposes patients to harmful ionizing radiation during the exam. Magnetic resonance imaging (MRI) is considered much safer and noninvasive, and is often used to study CMF soft tissues (e.g., the temporomandibular joint and brain). However, it is extremely difficult to accurately segment CMF bony structures from MRI, since both bone and air appear dark in MRI and the images suffer from a low signal-to-noise ratio and partial volume effects. To address these issues, we propose a 3D deep-learning-based cascade framework. Specifically, a 3D fully convolutional network (FCN) is first adopted to coarsely segment the bony structures. Because the bony structures coarsely segmented by the FCN tend to be thicker than the true anatomy, a convolutional neural network (CNN) is then used for fine-grained segmentation. To enhance the discriminative ability of the CNN, we concatenate the probability maps predicted by the FCN with the original MRI and feed them together into the CNN, providing more context information for segmentation. Experimental results demonstrate good performance and the clinical feasibility of the proposed 3D deep-learning-based cascade framework.
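As an illustration of the concatenation step described above, the following minimal sketch stacks an MRI patch with FCN probability maps along a channel axis to form the CNN input. It is not our implementation: the function name `build_cnn_input`, the patch size, the channel-first layout, and the choice of two classes (bone and background) are assumptions made only for this example, and the FCN output is replaced by a softmax over random values.

```python
import numpy as np


def build_cnn_input(mri_patch, fcn_prob_maps):
    """Stack an MRI patch and FCN probability maps along a leading channel axis.

    mri_patch:     (D, H, W) intensity patch
    fcn_prob_maps: (C, D, H, W) per-class probabilities from the coarse stage
    returns:       (1 + C, D, H, W) array, MRI first, then the probability maps
    """
    mri_patch = np.asarray(mri_patch, dtype=np.float32)
    fcn_prob_maps = np.asarray(fcn_prob_maps, dtype=np.float32)
    if fcn_prob_maps.shape[1:] != mri_patch.shape:
        raise ValueError("probability maps and MRI patch must share a spatial shape")
    # Add a channel axis to the MRI patch, then concatenate along that axis.
    return np.concatenate([mri_patch[np.newaxis], fcn_prob_maps], axis=0)


# Hypothetical data: an 8x16x16 patch and two classes (assumed, not from the paper).
rng = np.random.default_rng(0)
mri = rng.random((8, 16, 16), dtype=np.float32)

# Stand-in for the coarse FCN output: a softmax over random scores per voxel.
scores = rng.standard_normal((2, 8, 16, 16))
exp = np.exp(scores - scores.max(axis=0, keepdims=True))
probs = exp / exp.sum(axis=0, keepdims=True)

cnn_input = build_cnn_input(mri, probs)
print(cnn_input.shape)  # (3, 8, 16, 16)
```

In this layout the first channel carries the original intensities and the remaining channels carry the coarse-stage probabilities, so the second-stage network sees both the image and the first-stage estimate at every voxel.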