The specialty coffee industry relies heavily on manual grading to maintain cupping quality. Because grading is performed by skilled labor, the process is subjective and costly. This study therefore evaluates the potential of computer vision and machine learning to classify green coffee beans into specialty and defective categories. Two traditional machine learning models, random forest (RF) and support vector classifier (SVC), and three deep learning models, a custom lightweight convolutional neural network (CNN), MobileNetV2, and MobileNetV3, were evaluated on this task. Model performance was assessed using accuracy, precision, recall, F1-score, learning curves, Grad-CAM visualization, and precision-recall analysis. The traditional models achieved classification accuracies of 98% (RF) and 95% (SVC), while the deep learning models achieved 99.6% (custom lightweight CNN), 99.6% (MobileNetV2), and 98.7% (MobileNetV3). Inference time was also measured on a Raspberry Pi 5 to assess the feasibility of real-time deployment on low-cost edge devices. Among the traditional models, SVC was the faster, with an inference time of 0.155 ms compared with 1.226 ms for RF. The deep learning models had average inference times of 94.811 ms (custom CNN), 125.144 ms (MobileNetV2), and 115.86 ms (MobileNetV3), and these times were reduced significantly after the models were converted to TFLite. Based on the overall evaluation, the custom lightweight CNN outperformed the other models, combining consistent inference time, strong feature interpretability, and generalized performance.
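The average inference times reported above could be obtained with a timing loop along the following lines. This is a minimal sketch, not the measurement code used in the study: the function name `mean_latency_ms`, the warm-up count, and the `dummy_predict` stand-in are all assumptions made for illustration, and a real measurement would pass the trained model's single-sample prediction call instead.

```python
import statistics
import time
from typing import Callable, Sequence


def mean_latency_ms(predict: Callable[[object], object],
                    samples: Sequence[object],
                    warmup: int = 5) -> dict:
    """Time one predict() call per sample and summarize in milliseconds."""
    # Warm-up calls are not timed, so one-time setup cost
    # (caches, lazy initialization) does not skew the mean.
    for sample in samples[:warmup]:
        predict(sample)

    timings_ms = []
    for sample in samples:
        start = time.perf_counter()
        predict(sample)
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean_ms": statistics.mean(timings_ms),
        "stdev_ms": statistics.stdev(timings_ms) if len(timings_ms) > 1 else 0.0,
        "n": len(timings_ms),
    }


def dummy_predict(features):
    # Hypothetical stand-in for a model's prediction on one bean image.
    return sum(features) > 0


if __name__ == "__main__":
    fake_inputs = [[0.1] * 1000 for _ in range(50)]
    stats = mean_latency_ms(dummy_predict, fake_inputs)
    print(f"{stats['mean_ms']:.3f} ms +/- {stats['stdev_ms']:.3f} ms "
          f"over {stats['n']} runs")
```

Reporting the standard deviation alongside the mean gives a rough indication of how consistent the inference time is across samples, which is relevant when comparing models for real-time use on an edge device.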