Purpose: We aim to provide a robust dataset for training automated systems to detect tuberculosis bacilli in Ziehl-Neelsen stained slides. Making this dataset available addresses a critical gap: the lack of public datasets for developing and testing artificial intelligence techniques for tuberculosis diagnosis. Our rationale is the urgent need for tools that make tuberculosis diagnosis faster and more efficient, especially in resource-limited settings.
Approach: The Ziehl-Neelsen method was used to prepare 362 slides, which were read manually. Following the World Health Organization's guidelines for performing bacilloscopy for tuberculosis diagnosis, experts classified each slide as negative or positive. In addition, selected images underwent detailed annotation to mark the location of each bacillus and each cluster within the image.
Results: The database consists of three directories. The first contains all the images, grouped by slide, and records for each slide whether it is negative or, if positive, the number of crosses. The second contains the 502 images selected for training automated systems, together with the annotated position of each bacillus and the Python code used. The third contains all the image fragments (positive and negative patches) used in the training, validation, and testing stages of the models.
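To illustrate how image fragments like those in the third directory could be derived from position annotations, the following is a minimal sketch, not the dataset's actual code. The function name `extract_patches`, the 32-pixel patch size, and the `(row, col)` annotation format are all assumptions made for illustration.

```python
import numpy as np


def extract_patches(image, centers, size=32):
    """Crop a size x size patch around each (row, col) center.

    Hypothetical helper: the patch size and annotation format are
    assumptions, not taken from the dataset. Patches near the border
    are shifted inward so every patch has the same shape.
    """
    height, width = image.shape[:2]
    if height < size or width < size:
        raise ValueError("image is smaller than the patch size")
    half = size // 2
    patches = []
    for row, col in centers:
        # Clamp the top-left corner so the crop stays inside the image.
        top = min(max(row - half, 0), height - size)
        left = min(max(col - half, 0), width - size)
        patches.append(image[top:top + size, left:left + size])
    return patches


# Synthetic stand-in for a microscopy field and three annotated positions.
image = np.zeros((100, 120, 3), dtype=np.uint8)
centers = [(50, 60), (2, 3), (99, 119)]
patches = extract_patches(image, centers, size=32)
```

Negative patches could be sampled the same way from positions away from any annotated bacillus; the sampling strategy actually used is not described here.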
Conclusions: The development of this annotated image database represents a significant advancement in tuberculosis diagnosis. By providing a high-quality, accessible resource to the scientific community, we help improve existing diagnostic tools and facilitate the development of automated technologies.
