Benjamin Murray , Richard Brown , Pengcheng Ma , Eric Kerfoot , Daguang Xu , Andrew Feng , Jorge Cardoso , Sebastien Ourselin , Marc Modat
{"title":"Lazy Resampling: Fast and information preserving preprocessing for deep learning","authors":"Benjamin Murray , Richard Brown , Pengcheng Ma , Eric Kerfoot , Daguang Xu , Andrew Feng , Jorge Cardoso , Sebastien Ourselin , Marc Modat","doi":"10.1016/j.cmpb.2024.108422","DOIUrl":null,"url":null,"abstract":"<div><h3>Background and Objective:</h3><div>Preprocessing of data is a vital step for almost all deep learning workflows. In computer vision, manipulation of data intensity and spatial properties can improve network stability and can provide an important source of generalisation for deep neural networks. Models are frequently trained with preprocessing pipelines composed of many stages, but these pipelines come with a drawback; each stage that resamples the data costs time, degrades image quality, and adds bias to the output. Long pipelines can also be complex to design, especially in medical imaging, where cropping data early can cause significant artifacts.</div></div><div><h3>Methods:</h3><div>We present Lazy Resampling, a software that rephrases spatial preprocessing operations as a graphics pipeline. Rather than each transform individually modifying the data, the transforms generate transform descriptions that are composited together into a single resample operation wherever possible. This reduces pipeline execution time and, most importantly, limits signal degradation. It enables simpler pipeline design as crops and other operations become non-destructive. Lazy Resampling is designed in such a way that it provides the maximum benefit to users without requiring them to understand the underlying concepts or change the way that they build pipelines.</div></div><div><h3>Results:</h3><div>We evaluate Lazy Resampling by comparing traditional pipelines and the corresponding lazy resampling pipeline for the following tasks on Medical Segmentation Decathlon datasets. We demonstrate lower information loss in lazy pipelines vs. traditional pipelines. We demonstrate that Lazy Resampling can avoid catastrophic loss of semantic segmentation label accuracy occurring in traditional pipelines when passing labels through a pipeline and then back through the inverted pipeline. Finally, we demonstrate statistically significant improvements when training UNets for semantic segmentation.</div></div><div><h3>Conclusion:</h3><div>Lazy Resampling reduces the loss of information that occurs when running processing pipelines that traditionally have multiple resampling steps and enables researchers to build simpler pipelines by making operations such as rotation and cropping effectively non-destructive. It makes it possible to invert labels back through a pipeline without catastrophic loss of accuracy.</div><div>A reference implementation for Lazy Resampling can be found at <span><span>https://github.com/KCL-BMEIS/LazyResampling</span><svg><path></path></svg></span>. Lazy Resampling is being implemented as a core feature in MONAI, an open source python-based deep learning library for medical imaging, with a roadmap for a full integration.</div></div>","PeriodicalId":10624,"journal":{"name":"Computer methods and programs in biomedicine","volume":"257 ","pages":"Article 108422"},"PeriodicalIF":4.9000,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer methods and programs in biomedicine","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0169260724004152","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0
Abstract
Background and Objective:
Preprocessing of data is a vital step for almost all deep learning workflows. In computer vision, manipulation of data intensity and spatial properties can improve network stability and can provide an important source of generalisation for deep neural networks. Models are frequently trained with preprocessing pipelines composed of many stages, but these pipelines come with a drawback; each stage that resamples the data costs time, degrades image quality, and adds bias to the output. Long pipelines can also be complex to design, especially in medical imaging, where cropping data early can cause significant artifacts.
Methods:
We present Lazy Resampling, a software that rephrases spatial preprocessing operations as a graphics pipeline. Rather than each transform individually modifying the data, the transforms generate transform descriptions that are composited together into a single resample operation wherever possible. This reduces pipeline execution time and, most importantly, limits signal degradation. It enables simpler pipeline design as crops and other operations become non-destructive. Lazy Resampling is designed in such a way that it provides the maximum benefit to users without requiring them to understand the underlying concepts or change the way that they build pipelines.
Results:
We evaluate Lazy Resampling by comparing traditional pipelines and the corresponding lazy resampling pipeline for the following tasks on Medical Segmentation Decathlon datasets. We demonstrate lower information loss in lazy pipelines vs. traditional pipelines. We demonstrate that Lazy Resampling can avoid catastrophic loss of semantic segmentation label accuracy occurring in traditional pipelines when passing labels through a pipeline and then back through the inverted pipeline. Finally, we demonstrate statistically significant improvements when training UNets for semantic segmentation.
Conclusion:
Lazy Resampling reduces the loss of information that occurs when running processing pipelines that traditionally have multiple resampling steps and enables researchers to build simpler pipelines by making operations such as rotation and cropping effectively non-destructive. It makes it possible to invert labels back through a pipeline without catastrophic loss of accuracy.
A reference implementation for Lazy Resampling can be found at https://github.com/KCL-BMEIS/LazyResampling. Lazy Resampling is being implemented as a core feature in MONAI, an open source python-based deep learning library for medical imaging, with a roadmap for a full integration.
期刊介绍:
To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine.
Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.