Temporal 3D RetinaNet for fish detection
Zhou Shen, Chuong H. Nguyen
2020 Digital Image Computing: Techniques and Applications (DICTA), published 2020-11-29
DOI: https://doi.org/10.1109/DICTA51227.2020.9363372
Citations: 5
Abstract
Automatic detection and tracking of fish provides valuable information for marine life science. Deep convolutional networks have been applied with some success, but their performance is affected by challenging imaging conditions, including complex backgrounds, variation in lighting, and the low visibility of the underwater environment. Existing works, including Fast R-CNN and RetinaNet, rely on single-frame fish detection and suffer from noisy and unreliable detections. In this paper, we propose and examine two 3D deep learning networks that use temporal features to improve fish detection performance. The first, called 3D-backbone RetinaNet, is based on a 3D ResNet for temporal information and is found to perform worse than 2D RetinaNet. The second, called 3D-subnets RetinaNet, is based on a 3D regression subnet and a 3D classification subnet that extract the temporal information, and is found to perform better than 2D RetinaNet. To validate the performance of these networks, we also created a new fish dataset, which will be made publicly available along with the code for the proposed networks.
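As a rough illustration of why temporal features can help, the sketch below applies a single 3D convolution across a short stack of per-frame score maps. It is not the authors' network: the abstract gives no layer configuration, so the helper `conv3d_valid`, the toy 3 x 3 maps, and the fixed averaging kernel are all assumptions made for illustration. A trained 3D subnet would learn its kernel weights and operate on multi-channel feature maps.

```python
def conv3d_valid(clip, kernel):
    """Cross-correlate a T x H x W clip with a kT x kH x kW kernel.

    No padding, stride 1. Hypothetical helper, written with nested lists
    so that it runs without third-party libraries.
    """
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kT, kH, kW = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kT + 1):
        plane = []
        for y in range(H - kH + 1):
            row = []
            for x in range(W - kW + 1):
                acc = 0.0
                for dt in range(kT):
                    for dy in range(kH):
                        for dx in range(kW):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Made-up single-channel score maps for three consecutive frames.
# The centre pixel responds in every frame; the corner responds in one frame only.
clip = [
    [[0.0, 0.0, 0.0], [0.0, 0.9, 0.0], [0.0, 0.0, 0.0]],
    [[0.9, 0.0, 0.0], [0.0, 0.9, 0.0], [0.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [0.0, 0.9, 0.0], [0.0, 0.0, 0.0]],
]

# A 3 x 1 x 1 temporal averaging kernel (assumed; a real subnet learns its weights).
kernel = [[[1.0 / 3.0]], [[1.0 / 3.0]], [[1.0 / 3.0]]]

# The kernel spans all three frames, so the temporal axis collapses to length 1.
fused = conv3d_valid(clip, kernel)
```

In this toy case the response that persists across frames keeps its strength after fusion, while the one-frame response is reduced to a third of its value, which is the kind of suppression of unreliable single-frame detections that the abstract motivates.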