Objective: To evaluate intra-observer diagnostic reproducibility using traditional slides (TS) versus whole slide images (WSI).
Methods: TS and WSI of 1427 prostatic biopsies (107 consecutive patients) were evaluated by a single pathologist. Agreement between the two readings was assessed with Gwet's agreement coefficient (AC) and interpreted on the Landis and Koch benchmark scale.
Results: The positive/negative agreement between the readings was almost perfect (AC1 = 0.962; 95% CI [0.949, 0.974]), with a method-independent distribution of discrepancies. Among positive biopsies, 212 had an identical Gleason score (GS) on TS and WSI and 69 had a discordant GS (AC2 = 0.932; 95% CI [0.907, 0.956]). Concordant negative and positive patient classification was observed in 39 and 64 cases, respectively; 2 cases were assigned to the positive group on TS only and 2 on WSI only, corresponding to almost perfect agreement (AC1 = 0.929; 95% CI [0.860, 0.998]). ISUP Grade Group (ISUP GG) agreement was evaluated in the 60 concordantly positive cases: in 45 cases it was identical on TS and WSI; in 10 biopsies the discrepancy implied a modification of the assigned ISUP GG of ≤ 1 class and in 5 a modification of 2 classes. Gwet's agreement coefficient (95% CI [0.834, 0.962]) indicated almost perfect agreement.
Conclusions: Our data show almost perfect agreement between digital and traditional diagnostic activity in a routine setting, confirming that digital pathology can be safely introduced into routine workflows.
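
As an illustration of how the patient-level coefficient relates to the reported counts, the following minimal sketch computes the unweighted Gwet's AC1 from a square agreement table and maps the result onto the Landis and Koch scale. The function names `gwet_ac1` and `landis_koch` are hypothetical and not taken from the study; the 2×2 table is assembled from the patient counts given in the Results (39 concordant negative, 64 concordant positive, 2 positive on TS only, 2 positive on WSI only). The sketch does not compute confidence intervals or the weighted AC2 coefficient, and it is not necessarily the software used by the authors.

```python
def gwet_ac1(table):
    """Unweighted Gwet's AC1 for two readings of the same items.

    table[i][j] = number of items placed in category i by the first
    reading (here TS) and in category j by the second reading (here WSI).
    """
    q = len(table)                      # number of categories
    n = sum(sum(row) for row in table)  # number of rated items

    # Observed agreement: share of items on the diagonal.
    pa = sum(table[k][k] for k in range(q)) / n

    # Average marginal proportion of each category across both readings.
    row_share = [sum(table[k]) / n for k in range(q)]
    col_share = [sum(table[i][k] for i in range(q)) / n for k in range(q)]
    pi = [(row_share[k] + col_share[k]) / 2 for k in range(q)]

    # Chance agreement as defined for AC1.
    pe = sum(p * (1 - p) for p in pi) / (q - 1)

    return (pa - pe) / (1 - pe)


def landis_koch(value):
    """Label a coefficient using the Landis and Koch benchmark bands."""
    if value < 0:
        return "poor"
    if value <= 0.20:
        return "slight"
    if value <= 0.40:
        return "fair"
    if value <= 0.60:
        return "moderate"
    if value <= 0.80:
        return "substantial"
    return "almost perfect"


# Patient-level table from the Results: index 0 = negative, 1 = positive;
# rows are TS readings, columns are WSI readings.
patients = [[39, 2],
            [2, 64]]

ac1 = gwet_ac1(patients)
print(round(ac1, 3), landis_koch(ac1))  # 0.929 almost perfect
```

With these counts the observed agreement is 103/107 and the chance agreement is about 0.473, which gives the AC1 of 0.929 reported for patient classification.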