Purpose: The limited volume of medical training data remains one of the leading challenges in machine learning for diagnostic applications. Object detectors that identify and localize pathologies require training with a large volume of labeled images, which are often expensive and time-consuming to curate. To address this challenge, we present a method to support distant supervision of object detectors through generation of synthetic pathology-present labeled images.
Approach: Our method employs the previously proposed cyclic generative adversarial network (cycleGAN) with two key innovations: (1) use of "near-pair" pathology-present regions and pathology-absent regions from similar locations in the same subject for training and (2) the addition of a realism metric (Fréchet inception distance) to the generator loss term. We trained and tested this method with 2800 fracture-present and 2800 fracture-absent image patches from 704 unique pediatric chest radiographs. The trained model was then used to generate synthetic pathology-present images with exact knowledge of location (labels) of the pathology. These synthetic images provided an augmented training set for an object detector.
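As a minimal sketch of the second innovation, the Fréchet distance between two sets of feature vectors can be computed from their means and covariances and then added to a generator loss as a weighted term. The function names, the weights `lambda_cyc` and `lambda_fid`, and the simple additive form of the loss are assumptions made for illustration and are not taken from the method itself. The feature vectors are assumed to come from a pretrained Inception network, which is not included here.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features), assumed to be
    feature embeddings of real and synthetic patches respectively.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite covariances (up to numerical noise).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


def generator_loss(adv_loss, cycle_loss, fid_term, lambda_cyc=10.0, lambda_fid=1.0):
    """Hypothetical combined generator loss: adversarial + cycle + realism term.

    The weights are illustrative placeholders, not values from the study.
    """
    return adv_loss + lambda_cyc * cycle_loss + lambda_fid * fid_term
```

This NumPy version is not differentiable, so it only illustrates the quantity being computed. Using it inside training would require an implementation in an automatic differentiation framework.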
Results: In an observer study, four pediatric radiologists used a five-point Likert scale indicating the likelihood of a real fracture (1 = definitely not a fracture and 5 = definitely a fracture) to grade a set of real fracture-absent, real fracture-present, and synthetic fracture-present images. The real fracture-absent images scored , real fracture-present images , and synthetic fracture-present images . An object detector model (YOLOv5) trained on a mix of 500 real and 500 synthetic radiographs performed with a recall of and an score of . In comparison, when trained on only 500 real radiographs, the recall and score were and , respectively.
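For reference, recall and a general F-beta score can be computed from detection counts as sketched below. The counts in the usage lines are hypothetical, and `beta` is left as a parameter because the specific score variant is not stated in the text above. How detections are matched to ground-truth boxes (for example, by an overlap threshold) is also an assumption outside this sketch.

```python
def recall(tp, fn):
    """Fraction of true fractures that the detector found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp):
    """Fraction of detections that correspond to true fractures."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def f_beta(tp, fp, fn, beta=1.0):
    """General F-beta score; beta > 1 weights recall more than precision."""
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical counts, for illustration only (not results from the study).
tp, fp, fn = 6, 2, 4
r = recall(tp, fn)
p = precision(tp, fp)
f1 = f_beta(tp, fp, fn, beta=1.0)
f2 = f_beta(tp, fp, fn, beta=2.0)
```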
Conclusions: Our proposed method generates visually realistic pathology and improved object detector performance for the task of rib fracture detection.