Combining imperfect automated annotations of underwater images with human annotations to obtain precise and unbiased population estimates

Jui-Han Chang, Deborah R. Hart, Burton V. Shank, Scott M. Gallager, Peter Honig, Amber D. York
Methods in Oceanography, Volume 17, Pages 169-186. Published 2016-12-01. Journal Article.
DOI: 10.1016/j.mio.2016.09.006
URL: https://www.sciencedirect.com/science/article/pii/S2211122015300219
Citations: 3

Abstract

Optical methods for surveying populations are becoming increasingly popular. These methods often produce hundreds of thousands to millions of images, far too many for human annotators to analyze manually. Computer vision software can annotate these images rapidly, but its error rates are often substantial, vary spatially, and are autocorrelated. Hence, population estimates based on raw automated counts can be seriously biased. We evaluated four estimators that combine automated annotations of all the images with manual annotations of a random sample to obtain (approximately) unbiased population estimates: the ratio, offset, and linear regression estimators, and the mean of the manual annotations alone. Each estimator was applied either globally, using all the data, or locally, using only data near the point in question to account for spatial variability and autocorrelation in error rates. We also investigated a simple stratification scheme that splits the images into two strata according to whether the automated annotator detected no targets or at least one target. The 16 methods resulting from the combination of four estimators, global or local estimation, and one or two strata were evaluated using simulations and field data. Our results indicated that the probability of a false negative is the key factor determining the best method, regardless of the probability of false positives. Stratification was the most effective way to improve the accuracy and precision of the estimates, provided the false negative rate was not too high. If the probability of false negatives is low, stratified estimation with the local ratio estimator or local regression (essentially geographically weighted regression) is best. If the probability of false negatives is high, we recommend unstratified estimation with either a simple global linear regression or the manual sample mean alone.
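
The four global estimators and the two-stratum scheme can be illustrated with a minimal sketch. The abstract does not give the formulas, so the code below assumes the textbook survey-sampling forms of the ratio, difference (offset), and regression estimators; the function names, the use of the manual mean in the zero-detection stratum, and the simulated error rates are all illustrative assumptions, not the authors' implementation. The local (spatially weighted) variants are not shown.

```python
import numpy as np

# Each estimator returns an estimated mean count per image.
# auto_all: automated counts for every image (NumPy array)
# auto_s, manual_s: automated and manual counts for the manually annotated sample

def manual_mean(auto_all, auto_s, manual_s):
    # Ignores the automated counts entirely.
    return float(np.mean(manual_s))

def ratio_estimate(auto_all, auto_s, manual_s):
    # Scales the automated mean by the manual/automated ratio in the sample.
    return float(np.mean(auto_all) * np.sum(manual_s) / np.sum(auto_s))

def offset_estimate(auto_all, auto_s, manual_s):
    # Shifts the automated mean by the mean manual-minus-automated difference.
    return float(np.mean(auto_all) + np.mean(manual_s - auto_s))

def regression_estimate(auto_all, auto_s, manual_s):
    # Fits manual ~ intercept + slope * automated on the sample,
    # then predicts at the automated mean of all images.
    slope, intercept = np.polyfit(auto_s, manual_s, 1)
    return float(intercept + slope * np.mean(auto_all))

def stratified_estimate(auto_all, sample_idx, manual_s, estimator):
    # Two strata: images with zero automated detections, and images with at least one.
    auto_all = np.asarray(auto_all, dtype=float)
    manual_s = np.asarray(manual_s, dtype=float)
    auto_s = auto_all[sample_idx]
    zero_all, zero_s = auto_all == 0, auto_s == 0
    if not zero_s.any() or zero_s.all():
        raise ValueError("the manual sample must cover both strata")
    w_zero = zero_all.mean()  # share of all images in the zero stratum
    # Assumption: the manual mean is used in the zero stratum, where the
    # automated counts carry no information (ratio and regression are undefined).
    est_zero = manual_s[zero_s].mean()
    est_pos = estimator(auto_all[~zero_all], auto_s[~zero_s], manual_s[~zero_s])
    return float(w_zero * est_zero + (1.0 - w_zero) * est_pos)

# Simulated survey (hypothetical rates): each target is missed with
# probability 0.2, and false positives are added at 0.3 per image on average.
rng = np.random.default_rng(0)
true_counts = rng.poisson(2.0, size=5000)
auto_counts = rng.binomial(true_counts, 0.8) + rng.poisson(0.3, size=5000)
idx = rng.choice(5000, size=200, replace=False)  # random manual sample
manual_counts = true_counts[idx]

print("true mean:", true_counts.mean(), "raw automated mean:", auto_counts.mean())
for est in (manual_mean, ratio_estimate, offset_estimate, regression_estimate):
    one_stratum = est(auto_counts, auto_counts[idx], manual_counts)
    two_strata = stratified_estimate(auto_counts, idx, manual_counts, est)
    print(est.__name__, round(one_stratum, 3), round(two_strata, 3))
```

In this sketch the stratified variant weights each stratum's estimate by its share of all images, which is known exactly because every image has an automated count.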
