Fruit size is crucial for growers as it influences consumer willingness to buy and the price of the fruit. Fruit size and growth along the seasons are two parameters that can lead to more precise orchard management favoring production sustainability. In this study, a Python-based computer vision system (CVS) for sizing apples directly on the tree was developed to ease fruit sizing tasks. The system is made of a consumer-grade depth camera and was tested at two distances among 17 timings throughout the season, in a Fuji apple orchard. The CVS exploited a specifically trained YOLOv5 detection algorithm, a circle detection algorithm, and a trigonometric approach based on depth information to size the fruits. Comparisons with standard-trained YOLOv5 models and with spherical objects were carried out. The algorithm showed good fruit detection and circle detection performance, with a sizing rate of 92%. Good correlations (r > 0.8) between estimated and actual fruit size were found. The sizing performance showed an overall mean error (mE) and RMSE of + 5.7 mm (9%) and 10 mm (15%). The best results of mE were always found at 1.0 m, compared to 1.5 m. Key factors for the presented methodology were: the fruit detectors customization; the HoughCircle parameters adaptability to object size, camera distance, and color; and the issue of field natural illumination. The study also highlighted the uncertainty of human operators in the reference data collection (5–6%) and the effect of random subsampling on the statistical analysis of fruit size estimation. Despite the high error values, the CVS shows potential for fruit sizing at the orchard scale. Future research will focus on improving and testing the CVS on a large scale, as well as investigating other image analysis methods and the ability to estimate fruit growth.