Srikanth Namuduri, Prateek Mehta, Lise Barbe, Zohreh Faghihmonzavi, Stephanie Lam, steven finkbeiner, S. Bhansali
{"title":"Automated Quantification of DNA Damage Using Deep Learning and Use of Synthetic Data Generated from Basic Geometric Shapes","authors":"Srikanth Namuduri, Prateek Mehta, Lise Barbe, Zohreh Faghihmonzavi, Stephanie Lam, steven finkbeiner, S. Bhansali","doi":"10.1149/2754-2726/ad21ea","DOIUrl":null,"url":null,"abstract":"\n Comet assays are used to assess the extent of Deoxyribonucleic acid (DNA) damage, in human cells, caused by substances such as novel drugs or nano materials. Deep learning is showing promising results in automating the process of quantifying the percentage of damage using the assay images. But the lack of large datasets and imbalanced data is a challenge. In this study, synthetic comet assay images generated from simple geometric shapes were used to augment the data for training the Convolutional Neural Network. The results from the model trained using the augmented data were compared with the results from a model trained exclusively on real images. It was observed that the use of synthetic data in training not only gave a significantly better coefficient of determination (R2), but also resulted in a more robust model i.e., with less variation in R2 compared to training without synthetic data. This approach can lead to improved training while using a smaller training dataset, saving cost and effort involved in capturing additional experimental images and annotating them. Additional benefits include addressing imbalanced datasets, and data privacy concerns. Similar approaches must be explored in other low data domains to extract the same benefits.","PeriodicalId":72870,"journal":{"name":"ECS sensors plus","volume":"64 20","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ECS sensors plus","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1149/2754-2726/ad21ea","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Comet assays are used to assess the extent of Deoxyribonucleic acid (DNA) damage, in human cells, caused by substances such as novel drugs or nano materials. Deep learning is showing promising results in automating the process of quantifying the percentage of damage using the assay images. But the lack of large datasets and imbalanced data is a challenge. In this study, synthetic comet assay images generated from simple geometric shapes were used to augment the data for training the Convolutional Neural Network. The results from the model trained using the augmented data were compared with the results from a model trained exclusively on real images. It was observed that the use of synthetic data in training not only gave a significantly better coefficient of determination (R2), but also resulted in a more robust model i.e., with less variation in R2 compared to training without synthetic data. This approach can lead to improved training while using a smaller training dataset, saving cost and effort involved in capturing additional experimental images and annotating them. Additional benefits include addressing imbalanced datasets, and data privacy concerns. Similar approaches must be explored in other low data domains to extract the same benefits.